Andrea Montanari received a Laurea degree in Physics in 1997 and a Ph.D. in Theoretical Physics in 2001
(both from Scuola Normale Superiore in Pisa, Italy). He was a postdoctoral fellow at Laboratoire de Physique
Théorique de l'Ecole Normale Supérieure (LPTENS), Paris, France, and the Mathematical Sciences Research Institute,
Berkeley, USA. From 2002 to 2010 he was Chargé de Recherche (with Centre National de la Recherche Scientifique, CNRS) at LPTENS.
In September 2006 he joined Stanford University as a faculty member, and since 2015 he has been Full Professor in the Departments of
Electrical Engineering and Statistics.
He was co-awarded the ACM SIGMETRICS best paper award in 2008. He received the CNRS bronze medal for theoretical physics in 2006, the National Science Foundation CAREER award in 2008, the Okawa Foundation Research Grant in 2013, and the Applied Probability Society Best Publication Award in 2015. He was an Information Theory Society distinguished lecturer for 2015-2016. In 2016 he received the James L. Massey Research & Teaching Award of the Information Theory Society for young scholars, and in 2017 was elevated to IEEE Fellow. In 2018 he was an invited sectional speaker at the International Congress of Mathematicians. He received the 2020 Le Cam prize of the French Statistical Society, and is an invited IMS Medallion lecturer for the 2020 Bernoulli-IMS World Congress.
Title: The generalization error of overparametrized models: Insights from exact asymptotics
Abstract: Deep learning models are often so complex that they achieve vanishing classification error on the training set. Despite their huge complexity, the same architectures achieve small generalization error. This phenomenon has been rationalized in terms of a so-called double descent curve. As the model complexity increases, the generalization error follows the usual U-shaped curve at the beginning, first decreasing and then peaking around the interpolation threshold (when the model achieves vanishing training error). However, it descends again as model complexity exceeds this threshold.
I will focus on the case of a fully-connected two-layer neural network, and consider its linearization around a random initial condition. I will show that many interesting phenomena can be demonstrated and mathematically understood in this simple setting. I will then describe a few open problems and directions for future research.
[Based on joint work with Song Mei, Feng Ruan, Youngtak Sohn, Jun Yan, Yiqiao Zhong]
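For readers unfamiliar with the setting, the linearization mentioned in the abstract can be written out as follows. This is a standard first-order Taylor expansion, not a formula taken from the talk; the symbols (N hidden units, activation \sigma, second-layer weights a_i, first-layer weights w_i, and initial values a_i^0, w_i^0) are notation assumed here for illustration.

```latex
% Two-layer network with N hidden units (notation assumed for illustration)
f(x; a, W) = \sum_{i=1}^{N} a_i \, \sigma(\langle w_i, x \rangle)

% First-order expansion around a random initialization (a^0, W^0)
f_{\mathrm{lin}}(x; a, W)
  = f(x; a^0, W^0)
  + \sum_{i=1}^{N} (a_i - a_i^0) \, \sigma(\langle w_i^0, x \rangle)
  + \sum_{i=1}^{N} a_i^0 \, \sigma'(\langle w_i^0, x \rangle) \, \langle w_i - w_i^0, x \rangle
```

The linearized model is linear in the parameters (a, W), so fitting it reduces to a regression problem over random features determined by the initialization.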
Emina Soljanin is a professor of Electrical and Computer Engineering at Rutgers. Before moving to Rutgers in January 2016, she was a (Distinguished) Member of Technical Staff for 21 years in various incarnations of the Mathematical Sciences Research Center of Bell Labs. Her interests and expertise are wide, currently ranging from distributed computing to quantum information science. She is an IEEE Fellow, a 2017 outstanding alumnus of the Texas A&M School of Engineering, the 2011 Padovani Lecturer, a 2016/17 Distinguished Lecturer, and 2019 President for the IEEE Information Theory Society.
Title: Diversity vs. Parallelism in Distributed Computing with Redundancy
Abstract: Distributed computing enables parallel execution of tasks that make up a large computing job. In large-scale systems, even small random fluctuations in service times (inherent to computing environments) often cause a non-negligible number of straggling tasks with long completion times. Redundancy, in the form of simple task replication and erasure coding, has emerged as a potentially powerful way to curtail the variability in service time, as it provides diversity that allows a job to be completed when only a subset of redundant tasks gets executed. Thus both redundancy and parallelism reduce the execution time, but compete for resources of the system. In situations of constrained resources (e.g., fixed number of parallel servers), increasing redundancy reduces the available level of parallelism. This talk will present the diversity vs. parallelism trade-off for some common models of task-size-dependent execution times, and show that different models operate optimally at different levels of redundancy, and thus require very different code rates.
[Joint work with Pei Peng and Phil Whiting]
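The trade-off described in the abstract above can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration and is not taken from the talk: a job is split into k tasks of size 1/k, encoded into n tasks that run on n servers, and the job finishes when any k tasks complete. The two execution-time models ("scaled" and "additive", both built on a shifted exponential) and the function names are hypothetical.

```python
import random


def task_time(size, model, rng, shift=1.0, rate=1.0):
    """Hypothetical execution-time models for a task of the given size."""
    noise = rng.expovariate(rate)
    if model == "scaled":
        # Assumption: random slowdown grows with the task size.
        return size * (shift + noise)
    if model == "additive":
        # Assumption: random delay is independent of the task size.
        return size * shift + noise
    raise ValueError(f"unknown model: {model}")


def job_time(n, k, model, rng):
    """Completion time of one job: the k-th fastest of n coded tasks."""
    times = sorted(task_time(1.0 / k, model, rng) for _ in range(n))
    return times[k - 1]


def mean_job_time(n, k, model, trials=2000, seed=0):
    """Monte Carlo estimate of the expected job completion time."""
    rng = random.Random(seed)
    return sum(job_time(n, k, model, rng) for _ in range(trials)) / trials


def best_k(n, model, trials=2000, seed=0):
    """Value of k (1..n) with the smallest estimated mean completion time."""
    return min(range(1, n + 1),
               key=lambda k: mean_job_time(n, k, model, trials, seed))


if __name__ == "__main__":
    n = 12  # fixed number of parallel servers
    for model in ("scaled", "additive"):
        k = best_k(n, model)
        print(f"{model}: best k = {k}, code rate k/n = {k / n:.2f}")
```

Small k means more redundancy (diversity) and large k means more parallelism. Under these assumed models, the additive model tends to favor a much smaller k than the scaled model, which is one simple instance of different execution-time models calling for different code rates.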
Giuseppe Caire (S '92 -- M '94 -- SM '03 -- F '05)
was born in Torino in 1965. He received the B.Sc. in Electrical Engineering from Politecnico di Torino in 1990,
the M.Sc. in Electrical Engineering from Princeton University in 1992, and the Ph.D. from Politecnico di Torino in 1994.
He was a post-doctoral research fellow with the European Space Agency (ESTEC, Noordwijk, The Netherlands) in 1994-1995,
Assistant Professor in Telecommunications at the Politecnico di Torino, Associate Professor at the University of Parma, Italy,
Professor with the Department of Mobile Communications at the Eurecom Institute, Sophia-Antipolis, France,
a Professor of Electrical Engineering with the Viterbi School of Engineering, University of Southern California, Los Angeles,
and he is currently an Alexander von Humboldt Professor with the Faculty of Electrical Engineering and Computer Science at the
Technical University of Berlin, Germany.
He received the Jack Neubauer Best System Paper Award from the IEEE Vehicular Technology Society in 2003, the IEEE Communications Society & Information Theory Society Joint Paper Award in 2004 and in 2011, the Leonard G. Abraham Prize for best IEEE JSAC paper in 2019, the Okawa Research Award in 2006, the Alexander von Humboldt Professorship in 2014, the Vodafone Innovation Prize in 2015, and an ERC Advanced Grant in 2018. Giuseppe Caire has been a Fellow of IEEE since 2005. He served on the Board of Governors of the IEEE Information Theory Society from 2004 to 2007, and as an officer from 2008 to 2013. He was President of the IEEE Information Theory Society in 2011. His main research interests are in the field of communications theory, information theory, channel and source coding with particular focus on wireless communications.
Title: Coded Caching: Past, Present, Future
Abstract: Coded caching has emerged as a powerful and elegant idea for content distribution over communication networks. Since the initial work of Maddah-Ali and Niesen, a vast set of theoretical results has been developed in the network coding and information theory community. These results range from solving more and more complicated theoretical "puzzles" (i.e., highly involved, but somewhat practically irrelevant problems) to addressing more concrete problems of practical relevance for applications. Yet, questions still remain about whether such schemes will ever be used in the real world on a vast scale. This talk provides an account of some recent exciting results including the real-world implementation of coded caching on actual wireless networks, addressing some of the residual skepticism about the feasibility and actual gains achievable by these schemes.