Edward Lockhart
Research Areas
Authored Publications
Sort By
Olympiad-Level Formal Mathematical Reasoning with Reinforcement Learning
Thomas Hubert
Rishi Mehta
Laurent Sartran
Miklós Z. Horváth
Eric Wieser
Aja Huang
Julian Schrittwieser
Yannick Schroecker
Hussain Masoom
Ottavia Bertolli
Tom Zahavy
Amol Mandhane
Jessica Yung
Iuliya Beloshapka
Borja Ibarz
Vivek Veeriah
Lei Yu
Oliver Nash
Paul Lezeau
Salvatore Mercuri
Calle Sönne
Bhavik Mehta
Alex Davies
Daniel Zheng
Yin Li
Ingrid von Glehn
Mark Rowland
Samuel Albanie
Ameya Velingker
Simon Schmitt
Henryk Michalewski
Nicolas Sonnerat
Demis Hassabis
David Silver
Nature (2025)
Preview abstract
A long-standing goal of artificial intelligence is to build systems capable of complex reasoning in vast domains, a task epitomized by mathematics with its boundless concepts and demand for rigorous proof. Recent AI systems, often reliant on human data, typically lack the formal verification necessary to guarantee correctness. By contrast, formal languages such as Lean1 offer an interactive environment that grounds reasoning, and reinforcement learning (RL) provides a mechanism for learning in such environments. We present AlphaProof, an AlphaZero-inspired2 agent that learns to find formal proofs through RL by training on millions of auto-formalized problems. For the most difficult problems, it uses Test-Time RL, a method of generating and learning from millions of related problem variants at inference time to enable deep, problem-specific adaptation. AlphaProof substantially improves state-of-the-art results on historical mathematics competition problems. At the 2024 IMO competition, our AI system, with AlphaProof as its core reasoning engine, solved three out of the five non-geometry problems, including the competition’s most difficult problem. Combined with AlphaGeometry 23, this performance, achieved with multi-day computation, resulted in reaching a score equivalent to that of a silver medallist, marking the first time an AI system achieved any medal-level performance. Our work demonstrates that learning at scale from grounded experience produces agents with complex mathematical reasoning strategies, paving the way for a reliable AI tool in complex mathematical problem-solving.
View details
Parallel WaveNet: Fast High-Fidelity Speech Synthesis
Aäron van den Oord
Yazhe Li
Igor Babuschkin
Karen Simonyan
Koray Kavukcuoglu
George van den Driessche
Luis Carlos Cobo Rus
Florian Stimberg
Norman Casagrande
Dominik Grewe
Seb Noury
Sander Dieleman
Erich Elsen
Nal Kalchbrenner
Alexander Graves
Helen King
Thomas Walters
Demis Hassabis
NA, Google Deepmind, NA (2017)
Preview abstract
The recently-developed WaveNet architecture [27] is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system. However, because WaveNet relies on sequential generation of one audio sample at a time, it is poorly suited to today’s massively parallel computers, and therefore hard to deploy in a real-time production setting. This paper introduces Probability Density Distillation, a new method for training a parallel feed-forward network from a trained WaveNet with no significant difference in quality. The resulting system is capable of generating high-fidelity speech samples at more than 20 times faster than real-time, and is deployed online by Google Assistant, including serving multiple English and Japanese voices.
View details