Goran Žužić

Authored Publications
    Olympiad-Level Formal Mathematical Reasoning with Reinforcement Learning
    Thomas Hubert
    Rishi Mehta
    Laurent Sartran
    Miklós Z. Horváth
    Eric Wieser
    Aja Huang
    Julian Schrittwieser
    Yannick Schroecker
    Hussain Masoom
    Ottavia Bertolli
    Tom Zahavy
    Amol Mandhane
    Jessica Yung
    Iuliya Beloshapka
    Borja Ibarz
    Vivek Veeriah
    Lei Yu
    Oliver Nash
    Paul Lezeau
    Salvatore Mercuri
    Calle Sönne
    Bhavik Mehta
    Alex Davies
    Daniel Zheng
    Yin Li
    Ingrid von Glehn
    Mark Rowland
    Samuel Albanie
    Ameya Velingker
    Simon Schmitt
    Henryk Michalewski
    Nicolas Sonnerat
    Demis Hassabis
    David Silver
    Nature (2025)
    A long-standing goal of artificial intelligence is to build systems capable of complex reasoning in vast domains, a task epitomized by mathematics with its boundless concepts and demand for rigorous proof. Recent AI systems, often reliant on human data, typically lack the formal verification necessary to guarantee correctness. By contrast, formal languages such as Lean [1] offer an interactive environment that grounds reasoning, and reinforcement learning (RL) provides a mechanism for learning in such environments. We present AlphaProof, an AlphaZero-inspired [2] agent that learns to find formal proofs through RL by training on millions of auto-formalized problems. For the most difficult problems, it uses Test-Time RL, a method of generating and learning from millions of related problem variants at inference time to enable deep, problem-specific adaptation. AlphaProof substantially improves state-of-the-art results on historical mathematics competition problems. At the 2024 IMO competition, our AI system, with AlphaProof as its core reasoning engine, solved three out of the five non-geometry problems, including the competition’s most difficult problem. Combined with AlphaGeometry 2 [3], this performance, achieved with multi-day computation, resulted in a score equivalent to that of a silver medallist, marking the first time an AI system achieved any medal-level performance. Our work demonstrates that learning at scale from grounded experience produces agents with complex mathematical reasoning strategies, paving the way for a reliable AI tool in complex mathematical problem-solving.
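    As a concrete illustration of the formal environment the abstract refers to, below is a hypothetical toy statement in Lean 4 with Mathlib (it is not taken from the paper or from any IMO problem). The point is only that the proof is checked by Lean's kernel, which is what supplies the formal verification the abstract contrasts with purely human-data-driven systems.

```lean
import Mathlib

/-- Hypothetical toy goal (not from the paper): the kind of formally verified
    statement a Lean-based prover closes. Lean's kernel certifies the proof. -/
theorem toy_am_gm (a b : ℝ) : 2 * (a * b) ≤ a ^ 2 + b ^ 2 := by
  -- Follows from (a - b)^2 ≥ 0; `nlinarith` finds the nonlinear arithmetic argument.
  nlinarith [sq_nonneg (a - b)]
```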
    We combine several recent advancements to solve $(1+\varepsilon)$-transshipment and $(1+\varepsilon)$-maximum flow with a parallel algorithm with $\tilde{O}(1/\varepsilon)$ depth and $\tilde{O}(m/\varepsilon)$ work. We achieve this by developing and deploying suitable parallel linear cost approximators in conjunction with an accelerated continuous optimization framework known as the box-simplex game by Jambulapati et al. (ICALP 2022). A linear cost approximator is a linear operator that allows us to efficiently estimate the cost of the optimal solution to a given routing problem. Obtaining accelerated $\varepsilon$ dependencies for both problems requires developing a stronger multicommodity cost approximator, one where cancellations between different commodities are disallowed. For maximum flow, we observe that a recent linear cost approximator due to Agarwal et al. (SODA 2024) can be augmented with additional parallel operations to achieve the $\varepsilon^{-1}$ dependency via the box-simplex game. For transshipment, we also construct a deterministic and distributed approximator. This yields a deterministic CONGEST algorithm that requires $\tilde{O}(\varepsilon^{-1}(D + \sqrt{n}))$ rounds on general networks of hop diameter $D$ and $\tilde{O}(\varepsilon^{-1}D)$ rounds on minor-free networks to compute a $(1+\varepsilon)$-approximation.
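    For concreteness, the property an $\alpha$-approximate linear cost approximator is usually required to satisfy in this line of work (a standard formulation, not necessarily the paper's exact statement) is that of a matrix $P$ with

    $$\mathrm{OPT}(b) \;\le\; \lVert P b \rVert_1 \;\le\; \alpha \cdot \mathrm{OPT}(b) \quad \text{for every demand vector } b \text{ with } \mathbf{1}^{\top} b = 0,$$

    where $\mathrm{OPT}(b)$ denotes the minimum cost of routing the demand $b$ and $\alpha \ge 1$ is the approximation quality. Because $P$ is a linear operator, the estimate $\lVert P b \rVert_1$ is cheap to evaluate and composes well with iterative frameworks such as the box-simplex game.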
    Transshipment, also known under the names of earth mover's distance, uncapacitated min-cost flow, or Wasserstein metric, is an important and well-studied problem that asks to find a flow of minimum cost that routes a general demand vector. Adding to its importance, recent advancements in our understanding of algorithms for transshipment have led to breakthroughs for the fundamental problem of computing shortest paths. Specifically, the recent near-optimal $(1+\varepsilon)$-approximate single-source shortest path algorithms in the parallel and distributed settings crucially solve transshipment as a central step of their approach. The key property that differentiates transshipment from other similar problems like shortest path is the so-called boosting property: one can boost a (bad) approximate solution to a near-optimal $(1+\varepsilon)$-approximate solution. This conceptually reduces the problem to finding an approximate solution. However, not all approximations can be boosted: several proposed approaches have been shown to be amenable to boosting, while for a few others boosting was left as an open question. The main takeaway of our paper is that any black-box $\alpha$-approximate transshipment solver that computes a dual solution can be boosted to a $(1+\varepsilon)$-approximate solver. Moreover, we significantly simplify and decouple previous approaches to transshipment (in the sequential, parallel, and distributed settings) by showing that all of them (implicitly) obtain approximate dual solutions. Our analysis is very simple and relies only on the well-known multiplicative weights framework. Furthermore, to keep the paper completely self-contained, we provide a new (and arguably much simpler) analysis of multiplicative weights that leverages well-known optimization tools to bypass the ad-hoc calculations used in the standard analyses.
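    Since the analysis above is stated to rely only on the standard multiplicative weights framework, here is a minimal generic sketch of that primitive in Python: the textbook experts-setting update, not the paper's transshipment-specific instantiation; the names `loss_fn`, `eta`, and `n_rounds` are illustrative.

```python
import numpy as np

def multiplicative_weights(loss_fn, n_experts, n_rounds, eta=0.1):
    """Textbook multiplicative-weights update over `n_experts` experts.

    `loss_fn(p)` is assumed to return a vector of per-expert losses in [0, 1]
    for the played distribution `p`. Generic sketch only; not the paper's
    transshipment-specific instantiation.
    """
    weights = np.ones(n_experts)
    avg_play = np.zeros(n_experts)
    for _ in range(n_rounds):
        p = weights / weights.sum()       # play the normalized weight vector
        avg_play += p / n_rounds          # regret guarantees concern the averaged play
        losses = np.asarray(loss_fn(p))   # oracle/adversary reports per-expert losses
        weights *= np.exp(-eta * losses)  # exponentially down-weight lossy experts
    return avg_play
```

    With step size $\eta \approx \sqrt{\ln(n)/T}$ over $T$ rounds, the averaged play incurs at most $O(\sqrt{\ln(n)/T})$ more average loss than the best fixed expert; guarantees of this flavour are what multiplicative-weights-based boosting arguments typically lean on.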