Lasse Espeholt
Authored Publications
Boosting Search Engines with Interactive Agents
Leonard Adolphs
Michelle Chen Huebscher
Pier Giuseppe Sessa
Thomas Hofmann
Yannic Kilcher
Transactions on Machine Learning Research (2022)
Abstract
This paper presents the first successful steps in designing search agents that learn meta-strategies for iterative query refinement in information-seeking tasks. Our approach uses machine reading to guide the selection of refinement terms from aggregated search results. Agents are then empowered with simple but effective search operators to exert fine-grained and transparent control over queries and search results. We develop a novel way of generating synthetic search sessions, which leverages the power of transformer-based language models through (self-)supervised learning. We also present a reinforcement learning agent with dynamically constrained actions that learns interactive search strategies from scratch. Our search agents obtain retrieval and answer quality performance comparable to recent neural methods, using only a traditional term-based BM25 ranking function and interpretable discrete reranking and filtering actions.
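As an illustration of the kind of pipeline the abstract describes, the sketch below implements a plain BM25 scorer and a toy refinement loop that greedily appends a term drawn from the current top-ranked document to the query. The function names (`bm25_scores`, `refine_query`) and the greedy term-selection heuristic are hypothetical simplifications for illustration, not the paper's learned agent.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document (a list of tokens) against the query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def refine_query(query_terms, docs, rounds=2):
    """Toy refinement loop: per round, append the most frequent unseen
    term from the current top-ranked document (a greedy stand-in for the
    paper's learned refinement policy)."""
    query = list(query_terms)
    for _ in range(rounds):
        scores = bm25_scores(query, docs)
        top_doc = docs[max(range(len(docs)), key=scores.__getitem__)]
        candidates = [t for t in top_doc if t not in query]
        if not candidates:
            break
        query.append(Counter(candidates).most_common(1)[0][0])
    return query
```

A learned agent would replace the greedy term choice with a policy over refinement actions; the retrieval backbone (BM25) stays fixed, which is what makes the reranking and filtering actions interpretable.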
MetNet: A Neural Weather Model for Precipitation Forecasting
Casper Kaae Sønderby
Avital Oliver
Jason Hickey
Nal Kalchbrenner
Submission to journal (2020)
Abstract
Weather forecasting is a long-standing scientific challenge with direct social and economic impact. The task is suitable for deep neural networks due to vast amounts of continuously collected data and a rich spatial and temporal structure that presents long-range dependencies. We introduce MetNet, a neural network that forecasts precipitation up to 8 hours into the future at the high spatial resolution of 1 km and at the temporal resolution of 2 minutes, with a latency on the order of seconds. MetNet takes as input radar and satellite data and a forecast lead time, and produces a probabilistic precipitation map. The architecture uses axial self-attention to aggregate the global context from a large input patch corresponding to a million square kilometers. We evaluate the performance of MetNet at various precipitation thresholds and find that MetNet outperforms Numerical Weather Prediction at forecasts of up to 7 to 8 hours on the scale of the continental United States.
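The axial self-attention mentioned in the abstract can be sketched in a few lines of NumPy: attention is applied along the width axis and then along the height axis of a feature map, so each pass is linear in one spatial dimension rather than quadratic in the number of pixels. This toy version uses the feature map itself as queries, keys and values, omitting the learned projections and multiple heads a real model would have.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x):
    """Self-attention along rows, then columns, of a (H, W, C) feature map.

    Each 1-D pass costs O(H*W*(H+W)) instead of the O((H*W)^2) of full
    2-D attention, which is what makes very large input patches feasible.
    """
    h, w, c = x.shape
    # attend along the width axis (each row is a batch element)
    attn_w = softmax(x @ x.transpose(0, 2, 1) / np.sqrt(c), axis=-1)  # (H, W, W)
    x = attn_w @ x
    # attend along the height axis (each column is a batch element)
    xt = x.transpose(1, 0, 2)                                         # (W, H, C)
    attn_h = softmax(xt @ xt.transpose(0, 2, 1) / np.sqrt(c), axis=-1)
    xt = attn_h @ xt
    return xt.transpose(1, 0, 2)                                      # (H, W, C)
```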
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
Piotr Michal Stanczyk
Marcin Michalski
International Conference on Learning Representations (2020) (to appear)
Abstract
We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing modern accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state-of-the-art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated on Atari-57, DeepMind Lab and Google Research Football. We improve the state of the art on Football and are able to reach the state of the art on Atari-57 three times faster in wall-clock time. For the scenarios we consider, we achieve a 40% to 80% cost reduction for running experiments. The implementation, along with experiments, is open-sourced so that results can be reproduced and novel ideas tried out.
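A minimal sketch of the centralized-inference idea, assuming a toy linear policy: actors ship raw observations to a single server object, which batches them through one shared copy of the parameters instead of every actor holding its own model replica. Real SEED uses streaming RPCs and accelerator inference; the class below only illustrates the batching pattern.

```python
import numpy as np

class CentralInference:
    """Toy sketch of SEED-style centralized inference.

    Actors send observations here; the server batches all pending
    observations into one forward pass through a single shared policy,
    so actors need no local copy of the network.
    """

    def __init__(self, policy_weights):
        self.w = policy_weights  # the one shared copy of the parameters

    def act(self, observations):
        batch = np.stack(observations)   # (num_actors, obs_dim)
        logits = batch @ self.w          # (num_actors, num_actions)
        return logits.argmax(axis=-1)    # greedy action per actor
```

In the real system the batched forward pass runs on the accelerator and actions stream back to the actors; the cost win comes from keeping the expensive inference on cheap-per-FLOP hardware.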
Google Research Football: A Novel Reinforcement Learning Environment
Karol Kurach
Piotr Michal Stanczyk
Michał Zając
Carlos Riquelme
Damien Vincent
Marcin Michalski
Sylvain Gelly
AAAI (2019)
Abstract
Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner. We introduce the Google Research Football Environment, a new reinforcement learning environment where agents are trained to play football in an advanced, physics-based 3D simulator.
The resulting environment is challenging, easy to use and customize, and available under a permissive open-source license. We further propose three full-game scenarios of varying difficulty, the Football Benchmarks, and report baseline results for three commonly used reinforcement learning algorithms (IMPALA, PPO, and Ape-X DQN); we also provide a diverse set of simpler scenarios with the Football Academy.
Multi-task Deep Reinforcement Learning with PopArt
Matteo Hessel
Hubert Soyer
Wojciech Czarnecki
Simon Schmitt
Hado van Hasselt
DeepMind (2019) (to appear)
Abstract
The reinforcement learning community has made great strides in designing algorithms capable of exceeding human performance on specific tasks. These algorithms are mostly trained one task at a time, with each new task requiring the training of a brand-new agent instance. In this work, we investigate algorithms capable of learning to master not one but multiple sequential-decision tasks at once. We use PopArt normalisation to derive scale-invariant policy-gradient updates, and we propose an actor-critic architecture designed for multi-task learning. In combination with the IMPALA reinforcement learning architecture, this results in state-of-the-art performance on learning to play all games in a set of 57 diverse Atari games. Excitingly, our method learns a single trained policy - with a single set of weights - that exceeds median human performance across all games. To our knowledge, this is the first time a single agent surpasses human-level performance on this multi-task domain. The same approach demonstrates state-of-the-art results on a set of 30 tasks defined in the 3D reinforcement learning platform DeepMind Lab.
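The PopArt rescaling trick can be sketched directly: keep running first and second moments of the value targets, and whenever those statistics are updated, rescale the output layer's weights and bias so the unnormalized predictions are unchanged. The class below is a NumPy toy (scalar value head, hypothetical `beta` step size), not the paper's implementation.

```python
import numpy as np

class PopArtHead:
    """Sketch of PopArt: the head predicts *normalized* returns; when the
    running statistics (mu, sigma) shift, the output layer is rescaled so
    the unnormalized predictions are preserved ("Preserving Outputs
    Precisely while Adaptively Rescaling Targets")."""

    def __init__(self, dim, beta=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.mu, self.nu, self.beta = 0.0, 1.0, beta  # nu tracks E[target^2]

    def _sigma(self):
        return np.sqrt(max(self.nu - self.mu ** 2, 1e-8))

    def unnormalized(self, features):
        return self._sigma() * (features @ self.w + self.b) + self.mu

    def update_stats(self, targets):
        old_mu, old_sigma = self.mu, self._sigma()
        self.mu = (1 - self.beta) * self.mu + self.beta * np.mean(targets)
        self.nu = (1 - self.beta) * self.nu + self.beta * np.mean(targets ** 2)
        new_sigma = self._sigma()
        # rescale so unnormalized outputs are unchanged by the shift
        self.w *= old_sigma / new_sigma
        self.b = (old_sigma * self.b + old_mu - self.mu) / new_sigma
```

In the multi-task setting each task gets its own (mu, sigma) pair, which is what makes the policy-gradient updates invariant to the very different reward scales across games.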
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
Hubert Soyer
Remi Munos
Karen Simonyan
Volodymyr Mnih
Tom Ward
Yotam Doron
Vlad Firoiu
Tim Harley
Iain Robert Dunning
Shane Legg
Koray Kavukcuoglu
arXiv (2018)
Abstract
In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is to handle the increased amount of data and extended training time, which is already a problem in single task learning.
To tackle this challenging problem, we have developed a new distributed agent architecture, IMPALA (Importance Weighted Actor-Learner Architecture), that can scale to thousands of machines and achieve a throughput of 250,000 frames per second. We achieve stable learning at high throughput by combining decoupled acting and learning with a novel off-policy correction method called V-trace, which is critical for learning stability.
We demonstrate the effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a set of 30 tasks from the DeepMind Lab environment) and Atari-57 (all available Atari games in the Arcade Learning Environment). Our results show that IMPALA achieves better performance than previous agents, uses less data, and, crucially, exhibits positive transfer between tasks as a result of its multi-task approach.
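The V-trace targets can be sketched as a backward recursion over a trajectory. The function below is a NumPy toy with clipped importance ratios; in the on-policy case (all ratios equal to 1, clipping thresholds at least 1) it reduces to the usual n-step return.

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap, log_rhos,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Sketch of V-trace value targets v_s for one trajectory.

    rewards, log_rhos: shape (T,); values: V(x_s), shape (T,);
    bootstrap: V(x_T); log_rhos = log(pi/mu), the log importance ratios
    between the learner policy pi and the behaviour policy mu.
    """
    rhos = np.exp(log_rhos)
    clipped_rhos = np.minimum(rho_bar, rhos)   # rho_s, clipped at rho_bar
    clipped_cs = np.minimum(c_bar, rhos)       # c_s, clipped at c_bar
    T = len(rewards)
    next_values = np.append(values[1:], bootstrap)
    # temporal-difference terms: delta_s = rho_s (r_s + gamma V(x_{s+1}) - V(x_s))
    deltas = clipped_rhos * (rewards + gamma * next_values - values)
    # backward recursion:
    # v_s = V(x_s) + delta_s + gamma c_s (v_{s+1} - V(x_{s+1}))
    acc = 0.0
    out = np.zeros(T)
    for s in reversed(range(T)):
        acc = deltas[s] + gamma * clipped_cs[s] * acc
        out[s] = values[s] + acc
    return out
```

Clipping rho bounds the fixed point the targets converge to, while clipping c bounds the variance of the correction; decoupling the two is what lets the learner consume stale actor experience without divergence.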
Conditional Image Generation with PixelCNN Decoders
Aäron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
Alexander Graves
Advances in Neural Information Processing Systems 29, Curran Associates, Inc. (2016), pp. 4790-4798
Abstract
This work explores conditional image generation with a new image density model based on the PixelCNN architecture. The model can be conditioned on any vector, including descriptive labels or tags, or latent embeddings created by other networks. When conditioned on class labels from the ImageNet database, the model is able to generate diverse, realistic scenes representing distinct animals, objects, landscapes and structures. When conditioned on an embedding produced by a convolutional network given a single image of an unseen face, it generates a variety of new portraits of the same person with different facial expressions, poses and lighting conditions. We also show that conditional PixelCNN can serve as a powerful decoder in an image autoencoder. Additionally, the gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.
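The gated convolutional layer the abstract refers to combines a tanh branch with a sigmoid gate, and PixelCNN's autoregressive structure comes from masking the convolution kernels so no pixel sees "future" pixels in raster order. The NumPy sketch below shows both pieces in isolation (per-channel conditioning, a single 2-D kernel mask); it is an illustration, not the paper's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_activation(conv_f, conv_g, cond_f, cond_g):
    """Gated unit y = tanh(W_f*x + V_f h) * sigmoid(W_g*x + V_g h).

    conv_f / conv_g: outputs of the two convolution branches, shape (H, W, C);
    cond_f / cond_g: per-channel conditioning terms V h, shape (C,), broadcast
    over all spatial positions (the class-conditional case).
    """
    return np.tanh(conv_f + cond_f) * sigmoid(conv_g + cond_g)

def causal_mask(kernel_size, mask_type="B"):
    """PixelCNN kernel mask: rows below the centre, and positions to the
    right of the centre on the centre row, are zeroed; mask 'A' (first
    layer only) also zeroes the centre pixel itself."""
    k = kernel_size
    mask = np.ones((k, k))
    mask[k // 2, k // 2 + (mask_type == "B"):] = 0.0
    mask[k // 2 + 1:, :] = 0.0
    return mask
```

Multiplying each kernel by `causal_mask` before convolving is what turns an ordinary CNN into a valid autoregressive density model over pixels.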
Neural Machine Translation in Linear Time
Nal Kalchbrenner
Karen Simonyan
Aäron van den Oord
Alexander Graves
Koray Kavukcuoglu
arXiv (2016)
Abstract
We present a neural architecture for sequences, the ByteNet, that has two core features: it runs in time that is linear in the length of the sequences and it preserves the sequences' temporal resolution. The ByteNet is a stack of two dilated convolutional neural networks, one to encode the source and one to decode the target, where the target decoder unfolds dynamically to generate variable length outputs. We show that the ByteNet decoder attains state-of-the-art performance on character-level language modelling and outperforms recurrent neural networks. We also show that the ByteNet achieves a performance on raw character-level machine translation that approaches that of the best neural translation models that run in quadratic time. A visualization technique reveals the latent alignment structure learnt by the ByteNet.
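A causal dilated convolution, the building block of the ByteNet stack, can be sketched directly: output position t sees only inputs at t, t-d, t-2d, ..., so temporal resolution is preserved, and stacking layers with dilations 1, 2, 4, ... grows the receptive field exponentially while each layer stays linear in sequence length. The explicit loop below is for clarity only; a real implementation would use vectorized or framework convolutions.

```python
import numpy as np

def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution with dilation (sketch).

    x: input sequence, shape (T,); weights: kernel taps, shape (k,),
    where weights[i] multiplies the input i*dilation steps in the past.
    """
    T, k = len(x), len(weights)
    out = np.zeros(T)
    for t in range(T):
        for i in range(k):
            src = t - i * dilation  # strictly non-future positions
            if src >= 0:
                out[t] += weights[i] * x[src]
    return out
```

With kernel size 2 and dilations 1, 2, 4, ..., 2^(L-1), an L-layer stack covers a receptive field of 2^L positions at O(T) cost per layer, which is the source of the "linear time" claim.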
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomas Kocisky
Edward Grefenstette
Will Kay
Mustafa Suleyman
Phil Blunsom
NIPS (2015) (to appear)
Abstract
Teaching machines to read natural language documents remains an elusive challenge. Such models can be tested on their ability to answer questions posed on the contents of the documents that they have seen, but until now large scale supervised training and test datasets have been missing for such tasks. In this work we introduce a new machine reading paradigm based on large scale supervised training datasets extracted from readily available online sources. We define models for this task based on both a traditional natural language processing pipeline, and on attention based recurrent neural networks. Our results demonstrate that neural network models are able to learn to read documents and answer complex questions with minimal prior knowledge of language structure.
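The attention-based readers mentioned above share one core step: score each document token state against a query state, softmax the scores, and read out an attention-weighted document representation. The function below is a minimal dot-product version of that step on toy NumPy vectors, not the paper's exact Attentive or Impatient Reader formulation.

```python
import numpy as np

def attentive_read(doc_states, query_state):
    """Sketch of the attention step in an attention-based reader.

    doc_states: token representations, shape (T, dim);
    query_state: question representation, shape (dim,).
    Returns the attention-weighted document vector, shape (dim,).
    """
    scores = doc_states @ query_state            # relevance of each token, (T,)
    weights = np.exp(scores - scores.max())      # numerically stable softmax
    weights /= weights.sum()
    return weights @ doc_states                  # weighted document read-out
```

The answer is then predicted from this read-out vector (e.g. by scoring candidate entities against it); the attention weights themselves double as an interpretable alignment between question and document.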