The Computational Magic of the Ventral Stream: Sketch of a Theory (and Why Some Deep Architectures Work)

Speaker: Dr. Tomaso Poggio
Organization: Department of Brain & Cognitive Sciences, Massachusetts Institute of Technology
Location: EBII 1230
Date: April 12, 2013, 12:50 PM


The talk explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream, from V1 through V2 and V4 to IT, is to discount image transformations after learning them during development. The starting point is that a basic neural operation consists of dot products between input vectors and synaptic weights, which can be modified by learning. The theory proves that a multi-layer hierarchical architecture of dot-product modules can learn geometric transformations of images in an unsupervised way and then achieve the dual goals of invariance to global affine transformations and robustness to deformations. Having learned these transformations without supervision, such architectures are automatically invariant to transformations of a new object, which makes recognition possible from one or very few labeled examples. The theory should apply, to varying degrees, to a range of hierarchical architectures such as HMAX, convolutional networks and related feedforward models of the visual system, and it formally characterizes some of their properties.
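The core idea of the abstract, invariance obtained by pooling dot products over transformed templates, can be illustrated with a toy sketch. This is not the speaker's implementation: it assumes 1-D "images", uses cyclic shifts as a stand-in for the affine transformations discussed in the talk, and all names (`orbit`, `signature`, and so on) are hypothetical. Each module stores the shifted versions of a template, as if they had been observed during development, and then pools the dot products between a new input and those stored versions.

```python
import random


def dot(u, v):
    """Dot product between an input vector and a weight vector."""
    return sum(a * b for a, b in zip(u, v))


def cyclic_shift(v, k):
    """Shift a list to the right by k positions, wrapping around."""
    k %= len(v)
    return v[-k:] + v[:-k] if k else list(v)


def orbit(template):
    """All cyclic shifts of a template (assumed stored during 'development')."""
    return [cyclic_shift(template, k) for k in range(len(template))]


def signature(image, templates):
    """Pool dot products over each template's orbit (max and mean pooling)."""
    sig = []
    for t in templates:
        responses = [dot(image, gt) for gt in orbit(t)]
        sig.append(max(responses))
        sig.append(sum(responses) / len(responses))
    return sig


# Toy data: random templates and a random "new object" never seen before.
rng = random.Random(0)
n = 16
templates = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(4)]
image = [rng.gauss(0, 1) for _ in range(n)]
shifted = cyclic_shift(image, 5)

sig_a = signature(image, templates)
sig_b = signature(shifted, templates)

# Shifting the image only permutes the dot products within each orbit,
# so the pooled values agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(sig_a, sig_b))
print(max_diff)
```

Because a shift of the input only reorders the set of dot products with the stored orbit, any pooling function that ignores order (max, mean, a histogram) gives the same signature for the shifted input, even though the input itself was never seen during the unsupervised stage.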


Tomaso A. Poggio is the Eugene McDermott Professor in the Dept. of Brain & Cognitive Sciences at MIT and a member of both the Computer Science and Artificial Intelligence Laboratory and the McGovern Institute. He is an honorary member of the Neuroscience Research Program, a member of the American Academy of Arts and Sciences, a Founding Fellow of AAAI, and a founding member of the McGovern Institute for Brain Research. Among other honors, he received the Laurea Honoris Causa from the University of Pavia for the Volta Bicentennial, the 2003 Gabor Award, the Okawa Prize 2009, and the AAAS Fellowship. He is one of the most cited computational scientists (h-index = 116, according to Google Scholar), with contributions ranging from biophysical and behavioral studies of the visual system to computational analyses of vision and learning in humans and machines. With W. Reichardt, he quantitatively characterized the visuomotor control system in the fly. With D. Marr, he introduced the seminal idea of levels of analysis in computational neuroscience. He introduced regularization as a mathematical framework for approaching the ill-posed problems of vision and the key problem of learning from data. He has contributed to the early development of the theory of learning, in particular by introducing concepts such as RBFs, supervised learning in RKHSs, and stability as a necessary and sufficient condition for generalization. In the last decade he has developed an influential quantitative model of visual recognition in the visual cortex. The citation for the 2009 Okawa Prize mentions his “… outstanding contributions to the establishment of computational neuroscience, and pioneering researches ranging from the biophysical and behavioral studies of the visual system to the computational analysis of vision and learning in humans and machines.”
