After a French baccalaureate in Physics and Mathematics and a B.Sc. in Psychology, Aude Oliva received two M.Sc. degrees, in Experimental Psychology and in Cognitive Science, and a Ph.D. from the Institut National Polytechnique de Grenoble, France. She joined the MIT faculty in the Department of Brain and Cognitive Sciences in 2004 and the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) in 2012.
Her research is cross-disciplinary, spanning human perception/cognition, computer vision, and cognitive neuroscience, and focuses on questions at the intersection of the three domains. Her work has been featured in the scientific and popular press and has made its way into textbooks on Perception, Cognition, Computer Vision, and Design, as well as into museums of Art and Science. She is the recipient of a National Science Foundation CAREER Award. Her research programs are funded by the National Science Foundation, the National Eye Institute, QCRI, Google, and Xerox.
Computer vision started with the goal of building machines that can see like humans. Currently, many techniques in automatic visual understanding are inspired by how humans recognize and interact with objects and scenes, and by how information is stored in memory. Aude Oliva’s cross-disciplinary work in Computational Perception and Cognition builds on the synergy between human and machine vision and applies it to high-level recognition problems: understanding scenes and events, perceiving space, localizing sounds, recognizing objects, modeling attention, eye movements, and visual memory, as well as predicting subjective properties of images (like image memorability). Her research program is broadly dedicated to understanding Human Visual Intelligence, and it integrates knowledge and tools from image processing, image statistics, computer vision, human perception, cognition, and neuro-imaging. Curriculum-Vitae (pdf)
Understanding visual scenes is central to our interaction with the world. Whether watching fast cuts in movie trailers or dynamically selecting routes for walking or driving, we exercise visual intelligence with apparently little effort. To accomplish this feat, humans capitalize on global properties of the visual scene, representing the shape of the space before the objects it contains.
When glancing at a magazine or browsing the Internet, we are continuously exposed to images. Despite this overflow of visual information, humans are extremely good at remembering thousands of pictures along with their visual details. But not all images are created equal. Artists, advertisers, and photographers are routinely challenged by the question “what makes an image memorable?” Our recent work shows that one can predict image memorability, opening a new domain of application in computer vision.
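As an illustration only, with synthetic data standing in for real images and human memory scores (the published work used richer global image features and support-vector regression), the prediction step can be cast as regularized regression from per-image feature vectors to memorability scores:

```python
import numpy as np

# Illustrative sketch: learn a linear map from image features to a
# per-image memorability score with ridge regression. All data below
# is synthetic; real experiments use features computed from images
# and scores measured in human memory games.
rng = np.random.default_rng(0)
n_images, n_features = 200, 32
X = rng.normal(size=(n_images, n_features))          # stand-in feature vectors
true_w = rng.normal(size=n_features)                 # hidden ground-truth weights
y = X @ true_w + 0.1 * rng.normal(size=n_images)     # stand-in memorability scores

lam = 1.0  # ridge penalty
w = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)
pred = X @ w                                         # predicted scores
```

Ranking new images by their predicted score is what opens the applications mentioned above, e.g. selecting the most memorable photograph from a set.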
Hybrid images, an original technique based on the multiscale processing of images by the human visual system, are static images with two interpretations that change as a function of viewing distance or image size. These images can be used to create compelling displays in which the observer experiences different percepts when interacting with the image.
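The construction itself is compact. A minimal sketch, assuming two aligned grayscale images as NumPy arrays (the function names and the Gaussian cutoff are illustrative choices, not a fixed recipe): keep the low spatial frequencies of the image meant to be seen from afar, keep the high spatial frequencies of the image meant to be seen up close, and sum them.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    # 1-D normalized Gaussian; radius defaults to 3 sigma.
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    # Separable Gaussian blur: convolve rows, then columns.
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out

def hybrid_image(img_far, img_near, sigma=5.0):
    # Low frequencies of img_far dominate at a distance or small size;
    # high frequencies of img_near dominate up close.
    low = blur(img_far, sigma)
    high = img_near - blur(img_near, sigma)
    return low + high
```

The cutoff `sigma` controls the viewing distance at which the percept switches: a larger `sigma` leaves more of `img_far` visible up close.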
Many experiments have shown that the human visual system makes extensive use of contextual information to facilitate object search in real-world scenes. However, the question of how to formally model contextual influences has been challenging. Our model presents an original approach to attentional guidance by global scene information, predicting the image regions likely to be fixated by human observers performing natural search tasks in real-world scenes.
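In spirit, the model modulates a local, bottom-up saliency map by a prior over likely target locations derived from global scene features. A minimal sketch, assuming both maps are already computed as same-shape NumPy arrays (the function name, the `gamma` exponent, and the toy inputs are illustrative, not the published implementation):

```python
import numpy as np

def combine_guidance(saliency, context_prior, gamma=0.3):
    # Weight bottom-up saliency by a scene-based prior over target
    # locations. The exponent gamma < 1 dampens the saliency term,
    # reflecting the finding that scene context often dominates
    # early fixations. Returns a normalized guidance map.
    s = np.clip(saliency, 1e-9, None) ** gamma
    g = s * context_prior
    return g / g.sum()
```

The location with the highest combined score (e.g. via `np.argmax`) is the model's first predicted fixation for the search task.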
Neural Organization of Object and Scene Affordances
Behavioral and computational studies suggest that visual scene analysis rapidly produces a rich description of both the objects and the spatial layout of surfaces in a scene. Our cognitive neuroscience work in human neuro-imaging shows that object and scene representations are distributed over a collection of high-level brain regions, with each region capitalizing on the functional properties of the object or the visual space.