Cătălina Cangea

Research Scientist



I am a Research Scientist at DeepMind, part of the Deep Learning team. In 2021, I completed my PhD (passing without corrections) at the University of Cambridge, where I was supervised by Prof Pietro Liò and was a member of King’s College.

During my undergraduate summer holidays, I gained SWE experience across the stack through internships at Google and Facebook. Later, I also worked in various collaborative research environments: Mila, X, Relation Therapeutics and DeepMind. I have also contributed to the wider research community, co-organising the ViGIL workshop at NAACL 2021.

At Cambridge, I supervised ~60 students for 200+ hours across undergraduate courses and research projects, interviewed CS applicants, chaired women@CL and was a Cambridge Spark Teaching Fellow. I also held Master’s practicals (2018, 2019) and graph ML seminars (2020, 2021). Teaching remains a great passion: since graduating, I have delivered tutorials and supervised projects at ML summer schools.

Outside work, I love rowing, travelling, playing and recording piano and guitar, and chasing my favourite bands on tour. 🎼 I sometimes write poetry and take up cycling challenges!


  • Multimodal learning
  • Long-range modelling (audio, music, language)
  • Real-world applications


  • PhD in Machine Learning, 2021

    University of Cambridge

  • MPhil in Advanced Computer Science, 2017

    University of Cambridge

  • BA in Computer Science, 2016

    University of Cambridge


Proposed and mentored a project for 5 students (Exploiting domain structure for music ML tasks) at the LOGML 2022 summer school. Had lots of fun during the week and am excited about the next months, during which we will take the work further!

Along with Iulia Duță, I delivered a GNN tutorial at the EEML 2022 summer school - had fun answering all the questions in real-time and hope to attend in-person at future editions. :)

Thrilled to announce that our work on autoregressive, modality-agnostic generation with Perceiver AR got accepted to ICML 2022! A project very dear to my heart, where I finally got the chance to work on music generation. Here are some samples! See you at ICML :)

I was honoured to give a keynote talk at the Romanian AI days event. Many thanks to the organisers for inviting me and to the audience for the stimulating Q&A!

Our work on Generative Compositional Augmentations for Scene Graph Prediction led by Boris Knyazev, in collaboration with Mila and Element AI researchers, just got accepted to ICCV 2021! There are lucky instances where reviewers look at the rebuttal…

On May 25th, after a fruitful discussion with my examiners Nic Lane and Xavier Bresson, I passed my PhD viva without corrections! My thesis is titled ‘Exploiting multimodality and structure in world representations’ and will soon be publicly available. I am endlessly grateful to my supervisor Pietro Liò for all the unconditional, mindful and thoughtful support he has given me throughout the degree.

An updated and extended version of our study Deep Graph Mapper has been accepted to Frontiers in Big Data, under the topic Topology in Real-World Machine Learning and Data Analysis! Full version soon available.

It’s Feb 23rd and I finished writing my thesis! :)

Recent Publications

Exploiting multimodality and structure in world representations

This thesis presents three research works that study and develop likely aspects of future intelligent agents. The first contribution …

Deep Graph Mapper: Seeing Graphs through the Neural Lens

Graph summarisation has received much attention lately, with various works tackling the challenge of defining pooling operators on data …

Message Passing Neural Processes

Neural Processes (NPs) are powerful and flexible models able to incorporate uncertainty when representing stochastic processes, while …

Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks

We present Wiki-CS, a novel dataset derived from Wikipedia for benchmarking Graph Neural Networks. The dataset consists of nodes …



Research Scientist


Oct 2021 – Present London, UK
Part of the Deep Learning team, working on multimodal learning and generative methods for long-range sequential data. Have so far co-led several research projects, published at ICML 2022 and hosted an RS intern whose work got accepted to the Foundation Models for Decision Making Workshop (NeurIPS 2022). External mentoring at EEML 2022 (GNN tutorial) and LOGML 2022.

Research Scientist Intern


Jul 2020 – Nov 2020 Cambridge, UK (remote)
Hosted by Piotr Mirowski, in the Robotics, Embodied Agents and Lifelong learning (REAL) team.


Relation Therapeutics

Jun 2020 – Jul 2020 Cambridge, UK (remote)
Developing (graph-)ML solutions to aid in drug development and repurposing efforts.

AI Resident

X, the moonshot factory

May 2019 – Aug 2019 Mountain View, California
Worked on a challenging real-world problem - accurately tracking changes in code across different versions - using and adapting state-of-the-art ML techniques. The patent ‘Code change graph node matching with machine learning’ is now available.


Conferences and workshops

Nov 2018 – Present
Reviewer for the conferences ICML 2020 (awarded a Top Reviewer Certificate of Appreciation), NeurIPS 2020 and BMVC 2020, and for the workshops WiML 2018, RLGM 2019, LRGR 2019, GRL 2019, ViGIL 2019, GRL+ 2020 and ViGIL 2021 (also co-organising).

Research Intern


Jul 2018 – Sep 2018 Montréal, Canada
Collaboration with Aaron Courville on a visual reasoning project involving a novel benchmark and an alternative perspective on EQA-style tasks. Work published at BMVC 2019 and presented as a spotlight talk at the ViGIL NeurIPS workshop.

Machine Learning Teaching Fellow

Cambridge Spark

May 2018 – May 2021 Cambridge, UK
Teaching the Neural Networks module from the Applied Data Science London Bootcamp to industry professionals.

Admissions Interviewer

University of Cambridge

Dec 2016 – Dec 2017 Cambridge, UK
Undergraduate admissions interviews for the Computer Science Tripos, in Murray Edwards College (Dec 2016) and King’s College (Dec 2017).


University of Cambridge

Oct 2016 – Jun 2021 Cambridge, UK

Master’s research projects: Structure-aware Generation of Molecules in Protein Pockets (Pavol Drotar, 2020-21) (92/100) (presented at NeurIPS MLSB), Machine Unlearning (Mukul Rathi, 2020-21) (91/100), Goal-Conditioned Reinforcement Learning in the Presence of an Adversary (Carlos Purves, 2019-20) (87/100), Representation Learning for Spatio-Temporal Graphs (Felix Opolka, 2018-19) (85/100) (presented at ICLR RLGM), Dynamic Temporal Analysis for Graph Structured Data (Aaron Solomon, 2018-19) (presented at ICLR RLGM).

Computer Science Tripos Part II projects: Benchmarking Graph Neural Networks using Wikipedia (Péter Mernyei, 2019-20) (Novel Applications spotlight talk at ICML GRL+), Multimodal Relational Reasoning for Visual Question Answering (Aaron Tjandra, 2019-20), The PlayStation Reinforcement Learning Environment (Carlos Purves, 2018-19) (80/100) (presented at NeurIPS Deep RL), Deep Learning for Music Recommendation (Andrew Wells, 2017-18) (76/100).

Undergraduate courses for Murray Edwards, King’s, and Queens’ Colleges: AI, Databases, Discrete Mathematics, Foundations of Computer Science, Logic and Proof, Machine Learning and Real-world Data.


Software Engineer Intern


Jun 2016 – Sep 2016 London, UK
LogDevice team. I optimised client operations on a RocksDB database and implemented a new API required by another team in Facebook.

Software Engineer Intern


Jul 2015 – Sep 2015 New York, USA
iOS Product Infrastructure Team. I worked towards delivering a better experience for users of the Facebook iOS app. My project aimed to reduce the time taken to load content close to the area currently being viewed on screen, by improving the prioritization system for network requests.

STEP Intern


Jun 2014 – Sep 2014 Zurich, Switzerland
YouTube Uploads team. I added processing progress for video uploads on several YouTube pages, as the Upload page was the only one displaying this information.

Recent & Upcoming Talks

Few-shot learning on structured data - Keynote talk

An overview of why few-shot learning on structured data is important and some examples of works I have contributed to that address …

Graph generation and probabilistic methods

An overview of several graph generation and probabilistic approaches, part of the R250 MPhil course at the Cambridge CS department.

Meta-learning with Neural Processes

A reading group-style session focused around MetaFun (Xu et al. ICML 2020).

Graph Representation Learning under Uncertainty

Introducing a novel framework for learning graph representations while incorporating uncertainty modelling.

Deep Graph Mapper: Seeing Graphs through the Neural Lens (remote talk, with Cristian Bodnar)

A novel method based on topology and GNNs for graph visualisation and pooling.


Top Reviewer Certificate of Appreciation

The Top Reviewer Certificate of Appreciation acknowledges ‘excellent service as a reviewer for ICML 2020’, awarded by the Program and General chairs.

Wiseman Award

The award acknowledges those who make a commendable contribution to the work of the Department, going above and beyond the requirements of their course or project.

Travel Grant

Travel award to attend the Machine Learning for Health (ML4H) Workshop at NeurIPS 2018.

Travel Grant

Partial funding for travelling to NeurIPS 2018 and presenting my poster at the WiML workshop.

MPhil Graduation Prize

Awarded for obtaining a Distinction in the MPhil degree.

Master of Philosophy in Advanced Computer Science

Graduated with Distinction.

Bronze Medal

Won 3rd place at the Hack Cambridge MLH hackathon, as part of team facejack.

Rosemary Murray Scholarship

Awarded for obtaining a First Class result in Part II of the Computer Science Tripos.

Bachelor of Arts in Computer Science

First Class honours in final year.

Paula Browne Scholarship

Received the scholarship every year during my undergraduate degree. The scholarship was given to only one other student in my year.

Silver Medal

Awarded for obtaining 10th place at the National phase of the Olympiad during 10th grade.

Special Prize for Best Writing Style

Awarded for obtaining the highest score on the essay that was part of the written task.