Group Invited Talks

Andrew Gordon Wilson (NYU)

How do we build models that learn and generalize?
26 Oct 2020, 14:00 – 15:00, Zoom


Andrew Gordon Wilson is faculty in the Courant Institute and Center for Data Science at NYU. Before joining NYU, he was an assistant professor at Cornell University from 2016 to 2019. He was a research fellow in the Machine Learning Department at Carnegie Mellon University from 2014 to 2016, and completed his PhD at the University of Cambridge in 2014. Andrew's interests include probabilistic modelling, scientific computing, Gaussian processes, Bayesian statistics, and loss surfaces and generalization in deep learning. His webpage is


To answer scientific questions, and reason about data, we must build models and perform inference within those models. But how should we approach model construction and inference to make the most successful predictions? How do we represent uncertainty and prior knowledge? How flexible should our models be? Should we use a single model, or multiple different models? Should we follow a different procedure depending on how much data are available? In this talk I will present a philosophy for model construction, grounded in probability theory. I will exemplify this approach with methods that exploit loss surface geometry for scalable and practical Bayesian deep learning. I will also discuss recent results on prior specification and tempering in Bayesian deep learning. The talk will primarily be based on two references: (1) (2)

Ofir Nachum (Google)

Reinforcement Learning via Convex Duality
10 Sep 2020, 16:00 – 17:00, Zoom


Ofir Nachum is a Research Scientist at Google Research in Mountain View, California. His research focuses on deep reinforcement learning, including applications of convex duality as well as hierarchical and offline RL. He received his Bachelor's and Master's from MIT.


We review basic concepts of convex duality and summarize how this duality may be applied to a variety of reinforcement learning (RL) settings, including policy evaluation, policy optimization, and uncertainty estimation. The derivations yield a number of intriguing results and insights, such as new mechanisms for incorporating off-policy learning into RL and straightforward applications of standard concentration inequalities to long-term value estimation. Through a number of such examples, we hope to highlight the utility of convex duality as a new perspective on old problems. At the same time we hope to equip the audience with the appropriate tools to be able to further apply these same ideas in different ways and on new applications.

Saad Jbabdi (Nuffield Department of Clinical Neurosciences)

30 Apr 2020, 11:00 – 12:00, Zoom

Mark Woolrich (Oxford Centre for Human Brain Activity)

16 Apr 2020, 11:00 – 12:00, Zoom

Stephen Smith (FMRIB Analysis Group)

02 Apr 2020, 11:00 – 12:00, Zoom

Maximilian Kasy

What do we want? And when do we want it? Alternative objectives and their implications for experimental design.
19 Feb 2020


Maximilian is an Associate Professor of Economics here at Oxford. His research interests include statistical decision theory (applied to experimental design, machine learning, policy choice, and empirical Bayes methods), statistics as a social process (publication bias etc.), the use of economic theory in econometrics, identification and causality, and economic inequality and (optimal) taxation.


This talk will provide a survey of several papers on the theory and practice of experimental design. I will compare different objectives (estimator precision, outcomes of participants, informing policy choice to maximize average outcomes, and informing policy choice to maximize utilitarian welfare), and their implications for experimental design. I will consider heuristic algorithms, will prove approximate optimality results for some of these algorithms, and will discuss several empirical applications.

Rohin Shah

Assisting humans through deep reinforcement learning.
21 Feb 2020


Rohin is a PhD student at the Center for Human-Compatible AI (CHAI) at UC Berkeley. He is interested in how we can build AI systems that do what we intend them to do, rather than what we literally specify. He writes the Alignment Newsletter, a weekly publication with recent content relevant to AI alignment that has over 1700 subscribers. He has received the Tong Leong Lim Pre-Doctoral Prize, the Berkeley Fellowship, and the NSF Fellowship.


Deep reinforcement learning has emerged as a powerful tool to create AI systems that perform intelligent, complex behaviors -- in environments with good simulators to generate lots of data, and perfect reward functions. However, challenges arise when we try to build AI agents that assist humans. First, since the human is part of the environment, we no longer have a perfect simulator. To address this challenge, we learn a model of the human using imitation learning, and argue that due to the cooperative nature of the task, the imperfections of the learned model do not significantly affect the agent learned through deep RL. Second, the human often cannot specify a perfect reward function that captures good behavior: there are many ways in which behaviors can go wrong, and enumerating all of these is far too difficult. For this, we introduce a preference learning algorithm that can correct many such errors, simply by analyzing the state in which the agent finds itself, without requiring any extra human annotations. While there is much work yet to be done, we believe this is a significant step towards agents that can assist humans in realistic settings.

Zack Chase Lipton (CMU)

Robust Deep Learning Under Distribution Shift
22 Jan 2020, LTA


Zachary is an assistant professor at Carnegie Mellon University appointed in both the Machine Learning Department and the Tepper School of Business. His research spans core machine learning methods and their social impact and addresses diverse application areas, including clinical medicine and natural language processing. Current research focuses include robustness under distribution shift, breast cancer screening, the effective and equitable allocation of organs, and the intersection of causal thinking with messy data. He is the founder of the Approximately Correct blog and the creator of Dive Into Deep Learning, an interactive open-source book drafted entirely through Jupyter notebooks. Find him on Twitter (@zacharylipton) or GitHub (@zackchase).


We might hope that when faced with unexpected inputs, well-designed software systems would fire off warnings. However, ML systems, which depend strongly on properties of their inputs (e.g. the i.i.d. assumption), tend to fail silently. Faced with distribution shift, we wish (i) to detect and (ii) to quantify the shift, and (iii) to correct our classifiers on the fly, when possible. This talk will describe a line of recent work on tackling distribution shift. First, I will focus on recent work on label shift, a more classic problem, where strong assumptions enable principled methods. Then I will discuss how recent tools from generative adversarial networks have been appropriated (and misappropriated) to tackle dataset shift, characterizing and (partially) repairing a foundational flaw in the method. Finally, I will discuss new work that leverages human-in-the-loop feedback to develop classifiers that take into account causal structure in text classification problems and appear (empirically) to benefit on a battery of out-of-domain evaluations.

Smitha Milli

Reward-rational (implicit) choice: A unifying formalism for reward learning
22 Jan 2020


Smitha is a 3rd-year PhD student at UC Berkeley. She is mainly interested in problems that have to do with figuring out what the right objective for a system should be, often relying on frameworks of causality and measurement theory in doing so. She is funded by an Open Philanthropy AI Fellowship and an NSF Graduate Research Fellowship.


It is often difficult to hand-specify what the correct reward function is for a task, and thus researchers have instead aimed to learn reward functions from human behavior or feedback. The types of behavior interpreted as evidence of the reward function have expanded greatly in recent years. For example, researchers have even suggested that the reward specified by a human is merely a source of information about the true reward function. We can only imagine that there are many other types of human behavior that have yet to be formalized as evidence of the reward. How will robots make sense of all these diverse types of behavior? Our insight is that different types of behavior can be interpreted in a single unifying formalism: as a reward-rational (implicit) choice. The formalism offers a unifying lens with which to view existing work, and provides a recipe for interpreting new types of behavior. Furthermore, the formalism shows how we can properly interpret combinations of human feedback. In particular, it reveals that in many cases interpreting combinations of feedback requires interpreting the type of feedback the person chooses as a reward-rational choice in itself.

Hisham Husain

A Primal-Dual Link between GANs and Autoencoders
06 Nov 2019


Hisham is a doctoral student at the Australian National University and CSIRO Data61, based in Sydney, Australia. His research includes topics from Optimal Transport, Generative Models, Fairness, and Privacy and, more generally, answering questions that build on the theoretical understanding of machine learning. Currently, Hisham is interning at Amazon Cambridge, working on the Supply Chain Optimization Team (SCOT).


Since the introduction of Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), the literature on generative modelling has witnessed an overwhelming resurgence. The impressive, yet elusive empirical performance of GANs has led to the rise of many GAN-VAE hybrids, with the hopes of GAN-level performance and additional benefits of VAEs, such as an encoder for feature reduction, which is not offered by GANs. Recently, the Wasserstein Autoencoder (WAE) was proposed, achieving performance similar to that of GANs, yet it is still unclear whether the two are fundamentally different or can be further improved into a unified model. In this work, we study the $f$-GAN and WAE models and make two main discoveries. First, we find that the $f$-GAN and WAE objectives partake in a primal-dual relationship and are equivalent under some assumptions, which then allows us to explicate the success of WAE. Second, the equivalence result allows us to, for the first time, prove generalization bounds for Autoencoder models, which is a pertinent problem when it comes to theoretical analyses of generative models. Furthermore, we show that the WAE objective is related to other statistical quantities such as the $f$-divergence and, in particular, is upper bounded by the Wasserstein distance, which then allows us to tap into existing efficient (regularized) optimal transport solvers. Our findings thus present the first primal-dual relationship between GANs and Autoencoder models, comment on generalization abilities and make a step towards unifying these models.

George Papamakarios (DeepMind)

Probabilistic modelling and inference with normalizing flows
10 Oct 2019, 11:00 – 12:00, CS LTB


George Papamakarios is a research scientist at DeepMind London, working on probabilistic modelling, inference, and model-based reinforcement learning. Prior to joining DeepMind, he did a PhD in machine learning at the University of Edinburgh, and an MSc in advanced computing at Imperial College London. His PhD focused on density estimation and likelihood-free inference with normalizing flows, and introduced various new normalizing-flow models and likelihood-free inference methods.

Masashi Sugiyama (University of Tokyo)

Machine Learning from Weak Supervision: Towards Accurate Classification with Low Labeling Costs
20 Sep 2019, 15:30 – 16:35, lecture theatre in Stats

Adi Shamir (Weizmann Institute)

A Simple Explanation for the Mysterious Existence of Adversarial Examples with Small Hamming Distance
20 Sep 2019, 11:00 – 12:00, Stats large lecture theatre


Adi Shamir is an Israeli cryptographer. He is a co-inventor of the Rivest–Shamir–Adleman (RSA) algorithm, a co-inventor of the Feige–Fiat–Shamir identification scheme, one of the inventors of differential cryptanalysis, and has made numerous contributions to the fields of cryptography and computer science.

Mario Lucic (Google)

On Mutual Information Maximization for Representation Learning
03 Sep 2019, 16:00 – 17:00, LTB

Yang-Hui He (City, University of London)

Machine learning mathematical structures
04 Jul 2019, 12:00 – 13:00, Room 051

Max Jaderberg (DeepMind)

Deep reinforcement learning
27 Jun 2019, 11:00 – 12:00, Lecture Theater B


Max Jaderberg is a senior research scientist at DeepMind driving the intersection of deep learning, reinforcement learning, and multi-agent systems. His recent work includes creating the first agent to beat human professionals at StarCraft II, and creating algorithms for training teams of agents to play with humans in first-person video games. He previously co-founded Vision Factory, a computer vision startup, which was acquired by Google in 2014, and completed a PhD at the Visual Geometry Group, University of Oxford.

Florian Wenzel (Humboldt-Universität zu Berlin)

Scalable Approximate Bayesian Inference
11 Jun 2019, 11:00 – 12:00, Tony Hoare


Florian has been a PhD student in machine learning at Humboldt-Universität zu Berlin since 2015, advised by Marius Kloft. He is interested in Bayesian approximate inference and works with Stephan Mandt (UCI) and Manfred Opper (TU Berlin). He did a scientific internship at Disney Research in Pittsburgh (USA) in 2017, focusing on approximate Bayesian methods applied to NLP problems. In September 2015, he received a Master’s degree (M.Sc.) in mathematics from Humboldt-Universität zu Berlin.

Matt Fellows (Oxford CS)

Variational Inference for Reinforcement Learning
06 Jun 2019, 12:00 – 12:50, Room 441

Marc Rußwurm (TUM)

Between Earth Observation and Machine Learning
02 May 2019, 12:00 – 13:00, 051


Marc Rußwurm received his Bachelor's and Master's degrees in Geodesy and Geoinformation from the Technical University of Munich (TUM) in 2014 and 2018, where he focused on Remote Sensing and Earth Observation. In the final years of his studies, he combined methods of computer vision and natural language processing for the analysis of multi-temporal satellite images. His work was published in multiple machine learning and remote sensing conferences and journals, and his publication at the Earthvision 2017 workshop at the CVPR conference received the best paper award. In 2018 he continued his research as a Ph.D. candidate at the TUM Chair of Remote Sensing Technology and gained international research experience at the Polish Space Research Center, as a participant at the Frontier Development Lab at ESA Rome and Oxford, and at the IRISA Obelix Lab in France.

Joel Z Leibo (DeepMind)

Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research
05 Apr 2019, 12:00 – 13:00, Tony Hoare Room


Joel Z. Leibo is a research scientist at DeepMind, where he has worked since 2013. A computational neuroscientist by training, his recent work focuses on modeling social phenomena like the tragedy of the commons using multi-agent deep reinforcement learning techniques.

Marta Garnelo & Dan Rosenbaum (DeepMind)

Generative models for few-shot prediction and inference tasks
19 Mar 2019, 14:00 – 15:00, LTB

Luba Elliott (Creative AI consultant)

AI in recent art practice
15 Mar 2019, 14:00 – 15:00, LRB


Luba Elliott is a curator and researcher specialising in artificial intelligence in the creative industries. She is currently working to educate and engage the broader public about the latest developments in AI art through talks, workshops and exhibitions at venues across the art and technology spectrum including The Photographers’ Gallery, Victoria & Albert Museum, Seoul MediaCity Biennale, Impakt Festival, Leverhulme Centre for the Future of Intelligence, NeurIPS and ECCV. Her Creative AI London meetup community includes 2,300+ members. She has advised organisations including The World Economic Forum, Google and City University on the topic and was featured on the BBC, Forbes and The Guardian. She is a member of the AI council at the British Interactive Media Association. Previously, she worked in startups and venture capital and has a degree in Modern Languages from Cambridge University.

Athanasios Vlontzos (Imperial)

07 Mar 2019, 10:00 – 11:00, 441

David Balduzzi (DeepMind)

Open-ended learning in symmetric zero-sum games
01 Mar 2019, 12:00 – 13:00, Tony Hoare room


David Balduzzi works on game theory and machine learning at DeepMind. His PhD was in representation theory and algebraic geometry at the University of Chicago. He then worked on computational neuroscience at UW-Madison and machine learning at the MPI for Intelligent Systems, ETH Zurich and Victoria University in Wellington.

Reuben Binns (Oxford CS)

Fair machine learning: how can data-driven decision-making perpetuate bias, and what should be done about it?
28 Feb 2019, 12:00 – 12:45, Tony Hoare Room

Simon Kohl (Karlsruhe Institute of Technology)

Segmentation of Ambiguous Images
21 Feb 2019, 12:00 – 13:00, LT-B


Simon Kohl holds a Master's in Physics from the Karlsruhe Institute of Technology (KIT), where he specialised in statistical analyses of high-energy particle decays. In 2016 he started his PhD in Computer Science jointly with the German Cancer Research Center in Heidelberg and KIT. His focus has been on automatic segmentation of prostate cancer in MRI images using deep nets. This task is interesting and hard, as the delineation of prostate cancer is highly ambiguous, which results in very noisy expert annotations. In 2017/2018 he interned with DeepMind, where he worked on generative models for semantic segmentation.

Paul Brodersen (Department of Pharmacology)

Spiking Neural Networks
14 Feb 2019, 10:00 – 10:50, Room 441

Gabriele de Canio (ESA)

13 Feb 2019, 10:00 – 13:00, Tony Hoare Room

Pim de Haan (Amsterdam)

11 Feb 2019, 13:00 – 14:00, Tony Hoare

Taco Cohen (Qualcomm)

New Developments in Equivariant and Geometric Deep Learning
01 Feb 2019, 15:00 – 16:00, Lecture Room B

Petar Veličković (University of Cambridge)

Adversarial learning meets graphs (and why should you care?)
08 Jan 2019, 14:00 – 15:00, Lecture Theatre B

Pedro Ortega (DeepMind)

01 Nov 2018, 12:00 – 13:00, Robert Hooke Building

Sir Tim Berners-Lee (Oxford and MIT)

ML in the "new internet"
26 Oct 2018, 11:00 – 12:00, RHB room 114


Sir Timothy John Berners-Lee OM KBE FRS FREng FRSA FBCS is an English computer scientist best known as the inventor of the World Wide Web. He is a Professorial Fellow of Computer Science at the University of Oxford and a professor at the Massachusetts Institute of Technology (MIT). Berners-Lee proposed an information management system on 12 March 1989, then implemented the first successful communication between a Hypertext Transfer Protocol (HTTP) client and server via the internet in mid-November.

Mark van der Wilk (Imperial)

Learning Invariances Using the Marginal Likelihood
23 Oct 2018, 11:00 – 12:00, LTB


