Clare is a DPhil student at the University of Oxford working with Yarin Gal and Marta Kwiatkowska. She has previously worked on developing a stronger theoretical understanding of distributional reinforcement learning at Google Brain, and is broadly interested in theoretical foundations of machine learning. She obtained her undergraduate degree in mathematics and computer science at McGill University, and is a Rhodes Scholar.
We propose a new perspective on representation learning in reinforcement learning based on geometric properties of the space of value functions. We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks. Our formulation considers adapting the representation to minimize the (linear) approximation of the value function of all stationary policies for a given environment. We show that this optimization reduces to making accurate predictions regarding a special class of value functions which we call adversarial value functions (AVFs). We demonstrate that using value functions as auxiliary tasks corresponds to an expected-error relaxation of our formulation, with AVFs a natural candidate, and identify a close relationship with proto-value functions (Mahadevan, 2005). We highlight characteristics of AVFs and their usefulness as auxiliary tasks in a series of experiments on the four-room domain.
Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle
Invariance is often cited as a desirable property of machine learning systems, claimed to improve model accuracy and reduce overfitting. Empirically, invariant models often generalize better than their non-invariant counterparts. But is it possible to show that invariant models provably do so? In this paper we explore the effect of invariance on model generalization. We find strong Bayesian and frequentist motivations for enforcing invariance which leverage recent results connecting PAC-Bayes generalization bounds and the marginal likelihood. We make use of these results to perform model selection on neural networks.
Clare Lyle, Marta Kwiatkowska, Mark van der Wilk, Yarin Gal
Understanding and Improving Generalization in Deep Learning workshop, ICML, 2019
Since their introduction a year ago, distributional approaches to reinforcement learning (distributional RL) have produced strong results relative to the standard approach which models expected values (expected RL). However, aside from convergence guarantees, there have been few theoretical results investigating the reasons behind the improvements distributional RL provides. In this paper we begin the investigation into this fundamental question by analyzing the differences in the tabular, linear approximation, and non-linear approximation settings. We prove that in many realizations of the tabular and linear approximation settings, distributional RL behaves exactly the same as expected RL. In cases where the two methods behave differently, distributional RL can in fact hurt performance when it does not induce identical behaviour. We then continue with an empirical analysis comparing distributional and expected RL methods in control settings with non-linear approximators to tease apart where the improvements from distributional RL methods are coming from.
Clare Lyle, Pablo Samuel Castro, Marc G Bellemare
The applications of probably approximately correct (PAC) learning results to deep networks have historically been about as interesting as they sound. For neural networks of the scale used in practical applications, bounds involving concepts like VC dimension conclude that the algorithm will have no more than a certain error rate on the test set with probability at least zero. Recently, some work by Dziugaite and Roy, along with some folks from Columbia has managed to obtain non-vacuous generalization bounds for more realistic problems using a concept introduced by McAllester (1999) called PAC Bayes bounds. …Full post...
Clare Lyle, 09 Apr 2019