Università degli Studi di Padova

“The Quarks of Attention”

Wednesday, 25 May 2022, 12:30 - Room LUM250 (Via Luzzatti 8) - Pierre Baldi (University of California, Irvine)

Abstract

Attention plays a fundamental role in both natural and artificial intelligence systems. In deep learning, several attention-based neural network architectures have been proposed to tackle problems in natural language processing (NLP) and beyond, including transformer architectures which currently achieve state-of-the-art performance in NLP tasks. In this presentation we will:

  • identify and classify the most fundamental building blocks (quarks) of attention, both within and beyond the standard model of deep learning;
  • identify how these building blocks are used in all current attention-based architectures, including transformers;
  • demonstrate how transformers can effectively be applied to new problems in physics, from particle physics to astronomy; and
  • present a mathematical theory of attention capacity where, paradoxically, one of the main tools in the proofs is itself an attention mechanism.

Bio

Pierre Baldi earned MS degrees in Mathematics and Psychology from the University of Paris, and a PhD in Mathematics from the California Institute of Technology. He is currently Distinguished Professor in the Department of Computer Science, Director of the Institute for Genomics and Bioinformatics, and Associate Director of the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. The long-term focus of his research is on understanding intelligence in brains and machines. In the 1980s and 1990s he pioneered the application of deep learning to biomedical imaging (fingerprint recognition), protein structure prediction, and the game of Go. He has made multiple contributions to the theory of deep learning, and has developed and applied deep learning methods for the natural sciences. He recently published his fifth book: Deep Learning in Science, Cambridge University Press (2021). His honors include the 1993 Lew Allen Award at JPL, the 2010 E.R. Caianiello Prize for research in machine learning, and election as a Fellow of the AAAS, AAAI, IEEE, ACM, and ISCB.