Short Bio

After a French baccalaureate in Physics and Mathematics and a B.Sc. in Psychology, Aude Oliva received two M.Sc. degrees, in Experimental Psychology and in Cognitive Science, and a Ph.D. from the Institut National Polytechnique of Grenoble, France. She joined the MIT faculty in the Department of Brain and Cognitive Sciences in 2004 and the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) in 2011. Her research has made its way into textbooks on perception, cognition, computer vision, and design, into the popular press, and into museums of art and science. Her research programs are funded by the National Science Foundation, the National Eye Institute, the Andrea Bocelli Foundation, and Google.

Research Overview

Computer vision started with the goal of building machines that can see like humans. Currently, many techniques in automatic visual understanding are inspired by how humans recognize and interact with objects and scenes, and how visual information is stored in memory. Aude Oliva’s work in Computational Perception and Cognition builds on the synergy between human and machine vision and applies it to high-level recognition problems such as understanding scenes and events, encoding time, perceiving space (using visual or auditory modalities), recognizing objects, modelling attention, eye movements, and visual memory, and predicting subjective properties of images (such as image memorability). Her research program is broadly dedicated to understanding Human Visual Intelligence, and it integrates knowledge and tools from image processing, image statistics, computer vision, computer graphics, human perception, cognition, and neuroscience (neuro-imaging methods).

Selected Projects

Scene Understanding: The Spatial Envelope Theory

Understanding visual scenes is central to our interaction with the world. Whether watching fast cuts in movie trailers or dynamically selecting routes for walking or driving, we act with visual intelligence with apparently little effort. To accomplish this feat, our research found that humans capitalize on global properties of the visual scene, building the shape of the space before the objects.

Visual Intelligence: What makes an image memorable?

When glancing at a magazine or browsing the Internet, we are continuously exposed to images. Despite this overflow of visual information, humans are extremely good at remembering thousands of pictures along with their visual details. But not all images are created equal. Artists, advertisers, and photographers are routinely challenged by the question “what makes an image memorable?” Our recent work shows that one can predict image memorability, opening a new domain of application in computer vision.

Computational Perception: Hybrid Images

Hybrid images, an original technique based on the multiscale processing of images by the human visual system, are static images with two interpretations that change as a function of viewing distance or image size. These images can be used to create compelling displays in which the observer experiences different percepts when interacting with the image.
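As an illustration only, a hybrid image can be sketched by adding the low spatial frequencies of one picture to the high spatial frequencies of another, so that the first dominates from far away and the second from close up. The sketch below assumes two grayscale images of equal size stored as NumPy arrays; the function names, the Gaussian filter, and the cutoff values `sigma_far` and `sigma_near` are hypothetical choices made for this example, not the settings used in the original work.

```python
import numpy as np


def gaussian_lowpass(image, sigma):
    """Blur a 2-D grayscale image with a Gaussian filter applied via the FFT.

    `sigma` is the standard deviation of the spatial Gaussian, in pixels.
    Borders are treated as periodic, which is a simplification.
    """
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]  # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(cols)[None, :]  # horizontal frequencies, cycles/pixel
    # Frequency response of a spatial Gaussian with standard deviation sigma.
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))


def hybrid_image(far_image, near_image, sigma_far=8.0, sigma_near=4.0):
    """Combine two equally sized grayscale images into one hybrid image.

    far_image  : seen from a distance (only its coarse structure is kept)
    near_image : seen up close (only its fine detail is kept)
    The sigma values are illustrative assumptions, not published settings.
    """
    if far_image.shape != near_image.shape:
        raise ValueError("both images must have the same shape")
    low = gaussian_lowpass(far_image, sigma_far)
    high = near_image - gaussian_lowpass(near_image, sigma_near)
    return low + high
```

Calling `hybrid_image(a, b)` on two aligned photographs returns an array whose values may fall outside the original intensity range, so it would typically be rescaled or clipped before display. Viewing the result at a small size approximates viewing it from a distance.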

Visual Intelligence: Predicting where people look

Many experiments have shown that the human visual system makes extensive use of contextual information to facilitate object search in real-world scenes. However, formally modeling these contextual influences has been challenging. Our model presents an original approach to attentional guidance by global scene information, predicting the image regions likely to be fixated by human observers performing natural search tasks in real-world scenes.

Neuroscience: Neural Organization of Objects and Scenes

Behavioral and computational studies suggest that visual scene analysis rapidly produces a rich description of both the objects and the spatial layout of surfaces in a scene. Our cognitive neuroscience work in human neuro-imaging shows that object and scene representations are distributed over a collection of high-level brain regions, with each region capitalizing on the functional properties of the object or the visual space.
