I am a Research Scientist at DeepMind, part of the Deep Learning team. In 2021, I completed my PhD (awarded with no corrections) at the University of Cambridge as a member of King's College, supervised by Prof Pietro Liò.
During my undergraduate summer holidays, I gained software engineering experience across the stack through internships at Google and Facebook. Later, I worked in various collaborative research environments: Mila, X, Relation Therapeutics and DeepMind. I have also contributed to the wider research community, co-organising the ViGIL workshop at NAACL 2021.
At Cambridge, I supervised ~60 students for 200+ hours across undergraduate courses and research projects, interviewed CS applicants, chaired women@CL and was a Cambridge Spark Teaching Fellow. I also ran Master's practicals (2018, 2019) and graph ML seminars (2020, 2021). Teaching remains a great passion: since graduating, I have delivered tutorials and supervised projects at ML summer schools.
Outside work, I love rowing, travelling, playing and recording piano and guitar, and chasing my favourite bands on tour. 🎼 I sometimes write poetry and take up cycling challenges!
PhD in Machine Learning, 2021
University of Cambridge
MPhil in Advanced Computer Science, 2017
University of Cambridge
BA in Computer Science, 2016
University of Cambridge
Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. Perceiver AR can directly attend to over a hundred thousand tokens, enabling practical long-context density estimation without the need for hand-crafted sparsity patterns or memory mechanisms. When trained on images or music, Perceiver AR generates outputs with clear long-term coherence and structure. Our architecture also obtains state-of-the-art likelihood on long-sequence benchmarks, including 64 x 64 ImageNet images and PG-19 books.
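The core mechanism is easy to sketch: queries are drawn only from the final positions of the long input, while keys and values span the entire sequence, with a mask that keeps each latent causal with respect to the input position it is aligned to. Below is a minimal, illustrative sketch of this cross-attend step in PyTorch; the names (causal_cross_attend, n_latents) are my own shorthand, and projection matrices and multiple heads are omitted for brevity; this is not the paper's released code.

```python
import torch
import torch.nn.functional as F

def causal_cross_attend(x, n_latents=16):
    """x: (batch, seq_len, d) embedded inputs (token + position embeddings).

    Queries come from the final n_latents positions; keys/values span the
    whole sequence. The mask stops latent i (aligned to input position
    seq_len - n_latents + i) from attending to any later position.
    """
    b, n, d = x.shape
    q = x[:, -n_latents:, :]                      # (b, m, d) latent queries
    scores = q @ x.transpose(1, 2) / d ** 0.5     # (b, m, n) attention logits
    key_pos = torch.arange(n, device=x.device)
    query_pos = torch.arange(n - n_latents, n, device=x.device)
    mask = key_pos[None, :] > query_pos[:, None]  # (m, n), True = masked out
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ x          # (b, m, d) latent outputs
```

The latents then pass through a stack of causally masked self-attention layers; since the number of latents is far smaller than the input length, the expensive attention over the full input happens only once.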
This thesis presents three research works that study and develop likely aspects of future intelligent agents. The first contribution centres on vision-and-language learning, introducing a challenging embodied task that shifts the focus of an existing one towards visual reasoning. By extending popular visual question answering (VQA) paradigms, I also designed several models that were evaluated on the novel dataset. This produced initial performance estimates for environment understanding through the lens of a more challenging VQA downstream task. The second work presents two ways of obtaining hierarchical representations of graph-structured data. These methods either scaled to much larger graphs than those processed by the best-performing method at the time, or incorporated theoretical properties via topological data analysis algorithms. Both approaches competed with contemporary state-of-the-art graph classification methods, even outside social domains in the second case, where the inductive bias was PageRank-driven. Finally, the third contribution delves further into relational learning, presenting a probabilistic treatment of graph representations in complex settings such as few-shot and multi-task learning and label-scarce data regimes. By adding relational inductive biases to neural processes, the resulting framework can model an entire distribution of functions that generate structured datasets. This yielded significant performance gains, especially in the aforementioned complex scenarios, with semantically accurate uncertainty estimates that drastically improved over the neural process baseline. Such a framework may eventually contribute to developing lifelong-learning systems, owing to its ability to adapt to novel tasks and distributions. (Full abstract on the thesis webpage)
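To make the third contribution's idea concrete (relational structure feeding a neural-process-style model), here is a toy sketch of a conditional neural process whose encoder passes messages along a graph over the context points. Every name and design choice below (RelationalCNP, a single message-passing round, a Gaussian decoder) is my own simplification for illustration, not the actual model from the thesis.

```python
import torch
import torch.nn as nn

class RelationalCNP(nn.Module):
    """Toy conditional neural process with a message-passing encoder.
    Illustrative only; not the thesis's framework."""
    def __init__(self, d_x=2, d_y=1, d_h=64):
        super().__init__()
        self.embed = nn.Linear(d_x + d_y, d_h)
        self.message = nn.Linear(2 * d_h, d_h)       # per-edge messages
        self.decode = nn.Linear(d_x + d_h, 2 * d_y)  # -> mean, log-variance

    def forward(self, x_ctx, y_ctx, edges, x_tgt):
        # Embed context (x, y) pairs, then one message-passing round
        # along `edges`, a (src, dst) pair of LongTensors over context nodes.
        h = torch.relu(self.embed(torch.cat([x_ctx, y_ctx], dim=-1)))
        src, dst = edges
        msgs = torch.relu(self.message(torch.cat([h[src], h[dst]], dim=-1)))
        h = h + torch.zeros_like(h).index_add_(0, dst, msgs)
        # Permutation-invariant summary conditions every target prediction.
        r = h.mean(dim=0, keepdim=True).expand(x_tgt.size(0), -1)
        mu, log_var = self.decode(torch.cat([x_tgt, r], dim=-1)).chunk(2, dim=-1)
        return mu, log_var.exp().sqrt()              # predictive mean and std
```

Such a model would be trained by maximising the Gaussian log-likelihood of target outputs; the probabilistic treatment described above additionally places a latent distribution over functions, which this sketch omits.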
Master's research projects:
Structure-aware Generation of Molecules in Protein Pockets (Pavol Drotar, 2020-21) (92/100) (presented at NeurIPS MLSB)
Machine Unlearning (Mukul Rathi, 2020-21) (91/100)
Goal-Conditioned Reinforcement Learning in the Presence of an Adversary (Carlos Purves, 2019-20) (87/100)
Representation Learning for Spatio-Temporal Graphs (Felix Opolka, 2018-19) (85/100) (presented at ICLR RLGM)
Dynamic Temporal Analysis for Graph Structured Data (Aaron Solomon, 2018-19) (presented at ICLR RLGM)
Computer Science Tripos Part II projects:
Benchmarking Graph Neural Networks using Wikipedia (Péter Mernyei, 2019-20) (Novel Applications spotlight talk at ICML GRL+)
Multimodal Relational Reasoning for Visual Question Answering (Aaron Tjandra, 2019-20)
The PlayStation Reinforcement Learning Environment (Carlos Purves, 2018-19) (80/100) (presented at NeurIPS Deep RL)
Deep Learning for Music Recommendation (Andrew Wells, 2017-18) (76/100)
Undergraduate courses for Murray Edwards, King’s, and Queens’ Colleges: AI, Databases, Discrete Mathematics, Foundations of Computer Science, Logic and Proof, Machine Learning and Real-world Data.