“Dynamics and Neural Collapse in Deep Classifiers trained with the Square Loss”
Wednesday, 25 May 2022, 13:30 - Room LUM250 (Via Luzzatti 8) - Tomaso Poggio (MIT)
Abstract
Here we consider a model of the dynamics of gradient flow under the square loss in overparametrized ReLU networks. We show that convergence to a solution with the maximum margin, which is the inverse of the product of the Frobenius norms of each layer weight matrix, is expected when normalization by a Lagrange multiplier (LM) is used together with Weight Decay (WD). We prove that SGD converges to solutions that have a bias towards 1) large margin and 2) low rank of the weight matrices. In addition, the solutions are predicted to show Neural Collapse. Non-vacuous bounds are shown for expected error based on empirical margin.
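The link between margin and the product of the layer norms can be illustrated with a small sketch. This is not code from the talk: it assumes a bias-free ReLU network with a scalar output, and the helper names (`frobenius_norm`, `forward`, `normalize`) are hypothetical. Such a network is positively homogeneous in each layer, so its output equals rho times the output of the same network with every layer rescaled to unit Frobenius norm, where rho is the product of the layer norms. Under the further assumption that training fits labels of +1 or -1 exactly, the normalized network would then have margin 1/rho on every training point.

```python
import math
import random


def frobenius_norm(W):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(w * w for row in W for w in row))


def matvec(W, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def forward(weights, x):
    """Bias-free ReLU network; the last layer is linear with one output."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    return matvec(weights[-1], h)[0]


def normalize(weights):
    """Return rho (product of layer Frobenius norms) and unit-norm layers."""
    rho = 1.0
    normalized = []
    for W in weights:
        n = frobenius_norm(W)
        rho *= n
        normalized.append([[w / n for w in row] for row in W])
    return rho, normalized


# Toy random network (illustrative sizes only): 3 -> 4 -> 4 -> 1.
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


weights = [rand_matrix(4, 3), rand_matrix(4, 4), rand_matrix(1, 4)]
x = [rng.gauss(0.0, 1.0) for _ in range(3)]

rho, normalized = normalize(weights)
f_full = forward(weights, x)      # output of the original network
f_unit = forward(normalized, x)   # output of the unit-norm network

# Positive homogeneity: f(W; x) == rho * f(V; x).
assert math.isclose(f_full, rho * f_unit, rel_tol=1e-9, abs_tol=1e-12)
```

If the original network satisfies y * f(W; x) = 1 on a training point, the identity above gives y * f(V; x) = 1 / rho, so a smaller product of norms corresponds to a larger normalized margin.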
Bio
Tomaso A. Poggio is the Eugene McDermott Professor at the Department of Brain and Cognitive Sciences; Director of the Center for Brains, Minds and Machines; a member of the Computer Science and Artificial Intelligence Laboratory at MIT; and, since 2000, a member of the faculty of the McGovern Institute for Brain Research. He received his doctorate in Theoretical Physics from the University of Genoa in 1971 and was a Wissenschaftlicher Assistant at the Max Planck Institut für Biologische Kybernetik, Tübingen, Germany, from 1972 until 1981, when he became Associate Professor at MIT. He is an honorary member of the Neuroscience Research Program, a member of the American Academy of Arts and Sciences and a Founding Fellow of AAAI. He has received several awards, including the Otto-Hahn-Medaille Award of the Max-Planck-Society, the Max Planck Research Award (with M. Fahle) from the Alexander von Humboldt Foundation, the MIT 50K Entrepreneurship Competition Award, the Laurea Honoris Causa from the University of Pavia in 2000 (Volta Bicentennial), the 2003 Gabor Award, the 2009 Okawa Prize, the American Association for the Advancement of Science (AAAS) Fellowship (2009) and the Swartz Prize for Theoretical and Computational Neuroscience in 2014. He is one of the most cited computational neuroscientists (with an h-index greater than 100, based on Google Scholar). A former Corporate Fellow of Thinking Machines Corporation and a former director of PHZ Capital Partners, Inc., he is a director of Mobileye and was involved in starting, or investing in, several other high-tech companies, including Arris Pharmaceutical, nFX, Imagen, Digital Persona and DeepMind. Among his PhD students and post-docs are some of today's leaders in the science and engineering of intelligence, from Christof Koch (President and Chief Scientific Officer, Allen Institute) to Amnon Shashua (CTO and founder, Mobileye) and Demis Hassabis (CEO and founder, DeepMind).