Andrew Gordon Wilson is faculty in the Courant Institute and Center for Data Science at NYU. Before joining NYU, he was an assistant professor at Cornell University from 2016-2019. He was a research fellow in the Machine Learning Department at Carnegie Mellon University from 2014-2016, and completed his PhD at the University of Cambridge in 2014. Andrew's interests include probabilistic modelling, scientific computing, Gaussian processes, Bayesian statistics, and loss surfaces and generalization in deep learning. His webpage is https://cims.nyu.edu/~andrewgw.

Abstract:
To answer scientific questions, and reason about data, we must build models and perform inference within those models. But how should we approach model construction and inference to make the most successful predictions? How do we represent uncertainty and prior knowledge? How flexible should our models be? Should we use a single model, or multiple different models? Should we follow a different procedure depending on how much data are available? In this talk I will present a philosophy for model construction, grounded in probability theory. I will exemplify this approach with methods that exploit loss surface geometry for scalable and practical Bayesian deep learning. I will also discuss recent results on prior specification and tempering in Bayesian deep learning. The talk will primarily be based on two references: (1) https://arxiv.org/abs/2002.08791 (2) https://arxiv.org/abs/2003.02139
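As a rough illustration of how loss-surface geometry can enable scalable Bayesian deep learning, the following toy numpy sketch mimics the SWA-Gaussian (SWAG) idea discussed in the references: average SGD iterates to obtain a solution, fit a Gaussian to the trajectory, and sample weights for Bayesian model averaging. The quadratic "loss" and all numbers are hypothetical stand-ins, not the talk's actual experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy loss surface: a noisy quadratic standing in for a deep net's training loss.
def grad(w):
    return w + 0.5 * rng.normal(size=w.shape)  # true gradient plus SGD noise

w = rng.normal(size=2) * 3.0
lr, snapshots = 0.1, []

# Phase 1: burn-in SGD; phase 2: periodically snapshot iterates along the trajectory.
for step in range(300):
    w = w - lr * grad(w)
    if step >= 100 and step % 5 == 0:
        snapshots.append(w.copy())

S = np.stack(snapshots)
mean = S.mean(axis=0)                # SWA solution: average of SGD iterates
dev = S - mean
cov_diag = (dev ** 2).mean(axis=0)   # diagonal covariance of the trajectory

# Bayesian model averaging: draw weight samples from N(mean, diag(cov_diag)).
samples = mean + np.sqrt(cov_diag) * rng.normal(size=(20, 2))
```

In a real network the snapshots are taken once per epoch late in training, and predictions are averaged over the sampled weights.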
Ofir Nachum is a Research Scientist at Google Research in Mountain View, California. His research focuses on deep reinforcement learning, including applications of convex duality as well as hierarchical and offline RL. He received his Bachelor's and Master's from MIT.

Abstract:
We review basic concepts of convex duality and summarize how this duality may be applied to a variety of reinforcement learning (RL) settings, including policy evaluation, policy optimization, and uncertainty estimation. The derivations yield a number of intriguing results and insights, such as new mechanisms for incorporating off-policy learning into RL and straightforward applications of standard concentration inequalities to long-term value estimation. Through a number of such examples, we hope to highlight the utility of convex duality as a new perspective on old problems. At the same time we hope to equip the audience with the appropriate tools to be able to further apply these same ideas in different ways and on new applications.
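To make the primal-dual flavour concrete, here is a small self-contained numpy sketch (an illustrative example, not taken from the talk) of the two classic linear-programming views of policy evaluation: the "value" (primal) view, which solves the Bellman equation, and the "occupancy measure" (dual) view, which weights rewards by discounted state visitation. Strong duality says both assign the same value to a fixed policy.

```python
import numpy as np

# Small Markov chain induced by a fixed policy: 3 states (hypothetical numbers).
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
r = np.array([1.0, 0.0, 2.0])    # expected reward per state
mu0 = np.array([1.0, 0.0, 0.0])  # initial state distribution
gamma = 0.9
I = np.eye(3)

# Primal ("value") view: V solves the Bellman equation V = r + gamma * P @ V.
V = np.linalg.solve(I - gamma * P, r)
J_primal = (1 - gamma) * mu0 @ V

# Dual ("occupancy") view: d is the normalized discounted state-visitation distribution.
d = (1 - gamma) * np.linalg.solve(I - gamma * P.T, mu0)
J_dual = d @ r

# Strong duality: both views assign the same normalized value to the policy.
assert np.isclose(J_primal, J_dual)
```

Off-policy methods in this line of work estimate the dual variable d from data rather than computing it from a known P.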
Maximilian is an Associate Professor of Economics here at Oxford. His research interests include statistical decision theory (applied to experimental design, machine learning, policy choice, and empirical Bayes methods), statistics as a social process (publication bias, etc.), the use of economic theory in econometrics, identification and causality, and economic inequality and (optimal) taxation.

Abstract:
This talk will provide a survey of several papers on the theory and practice of experimental design. I will compare different objectives (estimator precision, outcomes of participants, informing policy choice to maximize average outcomes, and informing policy choice to maximize utilitarian welfare), and their implications for experimental design. I will consider heuristic algorithms, will prove approximate optimality results for some of these algorithms, and will discuss several empirical applications.
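As one hedged illustration of the kind of heuristic algorithm such work studies, here is a minimal Thompson-sampling assignment rule for a three-arm trial with binary outcomes; the arms, success rates, priors, and sample size are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = np.array([0.3, 0.5, 0.6])  # hypothetical success rates of 3 treatments
successes = np.ones(3)                  # Beta(1, 1) priors on each arm
failures = np.ones(3)

for t in range(2000):
    # Thompson sampling: assign the next participant to the arm whose
    # posterior draw of the success rate is highest.
    draws = rng.beta(successes, failures)
    arm = int(np.argmax(draws))
    outcome = rng.random() < true_means[arm]
    successes[arm] += outcome
    failures[arm] += 1 - outcome

posterior_means = successes / (successes + failures)
best = int(np.argmax(posterior_means))  # arm the posterior currently favours
```

Different objectives lead to different rules: maximizing participant outcomes favours exploitation as above, while informing a post-experiment policy choice favours continued exploration of near-optimal arms.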
Rohin is a PhD student at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He is interested in how we can build AI systems that do what we intend them to do, rather than what we literally specify. He writes the Alignment Newsletter, a weekly publication with recent content relevant to AI alignment that has over 1700 subscribers. He has received the Tong Leong Lim Pre-Doctoral Prize, the Berkeley Fellowship, and the NSF Fellowship.

Abstract:
Deep reinforcement learning has emerged as a powerful tool to create AI systems that perform intelligent, complex behaviors -- in environments with good simulators to generate lots of data, and perfect reward functions. However, challenges arise when we try to build AI agents that assist humans. First, since the human is part of the environment, we no longer have a perfect simulator. To address this challenge, we learn a model of the human using imitation learning, and argue that due to the cooperative nature of the task, the imperfections of the learned model do not significantly affect the agent learned through deep RL. Second, the human often cannot specify a perfect reward function that captures good behavior: there are many ways in which behaviors can go wrong, and enumerating all of these is far too difficult. For this, we introduce a preference learning algorithm that can correct many such errors, simply by analyzing the state in which the agent finds itself, without requiring any extra human annotations. While there is much work yet to be done, we believe this is a significant step towards agents that can assist humans in realistic settings.
Zachary is an assistant professor at Carnegie Mellon University appointed in both the Machine Learning Department and Tepper School of Business. His research spans core machine learning methods and their social impact and addresses diverse application areas, including clinical medicine and natural language processing. Current research focuses include robustness under distribution shift, breast cancer screening, the effective and equitable allocation of organs, and the intersection of causal thinking with messy data. He is the founder of the Approximately Correct blog and the creator of Dive Into Deep Learning, an interactive open-source book drafted entirely through Jupyter notebooks. Find him on Twitter (@zacharylipton) or GitHub (@zackchase).

Abstract:
We might hope that when faced with unexpected inputs, well-designed software systems would fire off warnings. However, ML systems, which depend strongly on properties of their inputs (e.g. the i.i.d. assumption), tend to fail silently. Faced with distribution shift, we wish (i) to detect and (ii) to quantify the shift, and (iii) to correct our classifiers on the fly—when possible. This talk will describe a line of recent work on tackling distribution shift. First, I will focus on recent work on label shift, a more classic problem, where strong assumptions enable principled methods. Then I will discuss how recent tools from generative adversarial networks have been appropriated (and misappropriated) to tackle dataset shift—characterizing and (partially) repairing a foundational flaw in the method. Finally, I will discuss new work that leverages human-in-the-loop feedback to develop classifiers that take into account causal structure in text classification problems and appear (empirically) to benefit on a battery of out-of-domain evaluations.
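The label-shift thread can be sketched concretely. Under label shift, q(x|y) = p(x|y) but q(y) ≠ p(y), and a black-box classifier's source-domain confusion matrix together with its predicted-label distribution on unlabeled target data identifies the importance weights w(y) = q(y)/p(y) (the idea behind black-box shift estimation). The numbers below are hypothetical.

```python
import numpy as np

# Hypothetical source-domain joint confusion matrix: C[i, j] = p(yhat = i, y = j).
C = np.array([[0.45, 0.05],
              [0.05, 0.45]])

# Distribution of the classifier's predictions on the (unlabeled) target data.
mu_target = np.array([0.30, 0.70])

# On the target, p(yhat = i) = sum_j C[i, j] * w[j], so solve C @ w = mu_target
# for the importance weights w[j] = q(y = j) / p(y = j).
w = np.linalg.solve(C, mu_target)

# Retraining with source examples reweighted by w[y] corrects the classifier.
```

Here the source label marginal is (0.5, 0.5) and the recovered weights imply a target label marginal of (0.25, 0.75), i.e. a substantial shift toward class 1.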
Smitha is a 3rd-year PhD student at UC Berkeley. She is mainly interested in problems that have to do with figuring out what the right objective for a system should be, often relying on frameworks of causality and measurement theory in doing so. She is funded by an Open Philanthropy AI Fellowship and an NSF Graduate Research Fellowship.

Abstract:
It is often difficult to hand-specify what the correct reward function is for a task, and thus, researchers have instead aimed to learn reward functions from human behavior or feedback. The types of behavior interpreted as evidence of the reward function have expanded greatly in recent years. For example, researchers have even suggested that the reward specified by a human is merely a source of information about the true reward function. We can only imagine that there are many other types of human behavior that have yet to be formalized as evidence of the reward. How will robots make sense of all these diverse types of behavior? Our insight is that different types of behavior can be interpreted in a single unifying formalism - as a reward-rational (implicit) choice. The formalism offers a unifying lens with which to view existing work, as well as provides a recipe for interpreting new types of behavior. Furthermore, the formalism shows how we can properly interpret combinations of human feedback, in particular, it reveals that in many cases interpreting combinations of feedback requires interpreting the type of feedback the person chooses as a reward-rational choice in itself.
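A minimal toy sketch of the reward-rational choice formalism (my own illustrative example): feedback is modeled as a Boltzmann-rational choice c from a set of options, P(c | r) ∝ exp(β r(c)), and observing the choice updates a posterior over candidate reward functions.

```python
import numpy as np

beta = 2.0  # rationality coefficient: higher means a more reliably optimal chooser

# Two hypothetical candidate reward functions, each scoring three options
# the human could choose among (e.g. trajectories, corrections, comparisons).
rewards = np.array([[1.0, 0.0, 0.5],    # hypothesis r1
                    [0.0, 1.0, 0.5]])   # hypothesis r2
prior = np.array([0.5, 0.5])

def likelihood(choice, r):
    """Boltzmann-rational choice model: P(choice | r) ∝ exp(beta * r[choice])."""
    logits = beta * r
    return np.exp(logits[choice]) / np.exp(logits).sum()

# The human picks option 1: evidence for r2, which rates option 1 highest.
choice = 1
post = prior * np.array([likelihood(choice, rw) for rw in rewards])
post /= post.sum()
```

Different feedback types (demonstrations, comparisons, corrections, the specified reward itself) correspond to different choice sets fed into the same update.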
Hisham is a doctoral student at the Australian National University and CSIRO Data61, based in Sydney, Australia. His research spans optimal transport, generative models, fairness, and privacy, and more generally questions that build on the theoretical understanding of machine learning. Currently, Hisham is interning at Amazon Cambridge on the Supply Chain Optimization Team (SCOT).

Abstract:
Since the introduction of Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), the literature on generative modelling has witnessed an overwhelming resurgence. The impressive, yet elusive empirical performance of GANs has led to the rise of many GAN-VAE hybrids, with the hopes of GAN-level performance and the additional benefits of VAEs, such as an encoder for feature reduction, which is not offered by GANs. Recently, the Wasserstein Autoencoder (WAE) was proposed, achieving performance similar to that of GANs, yet it is still unclear whether the two are fundamentally different or can be further improved into a unified model. In this work, we study the $f$-GAN and WAE models and make two main discoveries. First, we find that the $f$-GAN and WAE objectives partake in a primal-dual relationship and are equivalent under some assumptions, which allows us to explicate the success of WAE. Second, the equivalence result allows us to, for the first time, prove generalization bounds for autoencoder models, a pertinent problem for theoretical analyses of generative models. Furthermore, we show that the WAE objective is related to other statistical quantities such as the $f$-divergence and, in particular, is upper bounded by the Wasserstein distance, which allows us to tap into existing efficient (regularized) optimal transport solvers. Our findings thus present the first primal-dual relationship between GANs and autoencoder models, comment on generalization abilities, and take a step towards unifying these models.
George Papamakarios is a research scientist at DeepMind London, working on probabilistic modelling, inference, and model-based reinforcement learning. Prior to joining DeepMind, he did a PhD in machine learning at the University of Edinburgh, and an MSc in advanced computing at Imperial College London. His PhD focused on density estimation and likelihood-free inference with normalizing flows, and introduced various new normalizing-flow models and likelihood-free inference methods.
Adi Shamir is an Israeli cryptographer. He is a co-inventor of the Rivest–Shamir–Adleman (RSA) algorithm, a co-inventor of the Feige–Fiat–Shamir identification scheme, one of the inventors of differential cryptanalysis, and has made numerous contributions to the fields of cryptography and computer science.
Max Jaderberg is a senior research scientist at DeepMind driving the intersection of deep learning, reinforcement learning, and multi-agent systems. His recent work includes creating the first agent to beat human professionals at StarCraft II, and creating algorithms for training teams of agents to play with humans in first-person video games. He previously co-founded Vision Factory, a computer vision startup, which was acquired by Google in 2014, and completed a PhD at the Visual Geometry Group, University of Oxford.
Florian is a PhD student in machine learning advised by Marius Kloft at Humboldt-Universität zu Berlin since 2015. He is interested in Bayesian approximate inference and works with Stephan Mandt (UCI) and Manfred Opper (TU Berlin). He did a scientific internship at Disney Research in Pittsburgh (USA) in 2017 focusing on approximate Bayesian methods applied to NLP problems. In September 2015, he received a Master's degree (M.Sc.) in mathematics from Humboldt-Universität zu Berlin.
Marc Rußwurm received his Bachelor's and Master's degrees in Geodesy and Geoinformation from the Technical University of Munich (TUM) in 2014 and 2018, where he focused on remote sensing and Earth observation. In the final years of his studies, he combined methods from computer vision and natural language processing for the analysis of multi-temporal satellite images. His work has been published in multiple machine learning and remote sensing conferences and journals, and his publication at the EarthVision 2017 workshop at the CVPR conference received the best paper award. In 2018 he continued his research as a PhD candidate at the TUM Chair of Remote Sensing Technology and gained international research experience at the Polish Space Research Centre, as a participant in the Frontier Development Lab at ESA in Rome and Oxford, and at the IRISA Obelix Lab in France.
Joel Z. Leibo is a research scientist at DeepMind, where he has worked since 2013. A computational neuroscientist by training, his recent work focuses on modeling social phenomena like the tragedy of the commons using multi-agent deep reinforcement learning techniques.
Luba Elliott is a curator and researcher specialising in artificial intelligence in the creative industries. She is currently working to educate and engage the broader public about the latest developments in AI art through talks, workshops and exhibitions at venues across the art and technology spectrum including The Photographers’ Gallery, Victoria & Albert Museum, Seoul MediaCity Biennale, Impakt Festival, Leverhulme Centre for the Future of Intelligence, NeurIPS and ECCV. Her Creative AI London meetup community includes 2,300+ members. She has advised organisations including The World Economic Forum, Google and City University on the topic and was featured on the BBC, Forbes and The Guardian. She is a member of the AI council at the British Interactive Media Association. Previously, she worked in startups and venture capital and has a degree in Modern Languages from Cambridge University.
David Balduzzi works on game theory and machine learning at DeepMind. His PhD was in representation theory and algebraic geometry at the University of Chicago. He then worked on computational neuroscience at UW-Madison, and on machine learning at the MPI for Intelligent Systems, ETH Zurich, and Victoria University of Wellington.
Simon Kohl holds a Master's in Physics from the Karlsruhe Institute of Technology (KIT), where he specialised in statistical analyses of high-energy particle decays. In 2016 he started his PhD in Computer Science jointly with the German Cancer Research Center in Heidelberg and KIT. His focus has been on automatic segmentation of prostate cancer in MRI images using deep nets. This task is interesting and hard because the delineation of prostate cancer is highly ambiguous, which results in very noisy expert annotations. In 2017/2018 he interned with DeepMind, where he worked on generative models for semantic segmentation.
Sir Timothy John Berners-Lee OM KBE FRS FREng FRSA FBCS is an English computer scientist best known as the inventor of the World Wide Web. He is a Professorial Fellow of Computer Science at the University of Oxford and a professor at the Massachusetts Institute of Technology (MIT). Berners-Lee proposed an information management system on 12 March 1989, then implemented the first successful communication between a Hypertext Transfer Protocol (HTTP) client and server via the Internet in mid-November.