Andreea Gane
Research Areas
Authored Publications
Sort By
Rethinking Attention with Performers
Valerii Likhosherstov
David Martin Dohan
Peter Hawkins
Jared Quincy Davis
Afroz Mohiuddin
Lukasz Kaiser
Adrian Weller
accepted to ICLR 2021 (oral presentation) (to appear)
Preview abstract
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention Transformers with provable accuracy, but using only linear (as opposed to quadratic) space and time complexity, without relying on any priors such as sparsity or low-rankness. To approximate softmax attention-kernels, Performers use a novel Fast Attention Via positive Orthogonal Random features approach (FAVOR+), which may be of independent interest for scalable kernel methods. FAVOR+ can be also used to efficiently model kernelizable attention mechanisms beyond softmax. This representational power is crucial to accurately compare softmax with other kernels for the first time on large-scale tasks, beyond the reach of regular Transformers, and investigate optimal attention-kernels. Performers are linear architectures fully compatible with regular Transformers and with strong theoretical guarantees: unbiased or nearly-unbiased estimation of the attention matrix, uniform convergence and low estimation variance. We tested Performers on a rich set of tasks stretching from pixel-prediction through text models to protein sequence modeling. We demonstrate competitive results with other examined efficient sparse and dense attention methods, showcasing effectiveness of the novel attention-learning paradigm leveraged by Performers.
View details
Population Based Optimization for Biological Sequence Design
Zelda Mariet
David Martin Dohan
D. Sculley
ICML 2020 (2020)
Preview abstract
The use of black-box optimization for the design of new biological sequences is an emerging research area with potentially revolutionary impact. The cost and latency of wet-lab experiments requires methods that find good sequences in few experimental rounds of large batches of sequences --- a setting that off-the-shelf black-box optimization methods are ill-equipped to handle. We find that the performance of existing methods varies drastically across optimization tasks, posing a significant obstacle to real-world applications. To improve robustness, we propose population-based optimization (PBO), which generates batches of sequences by sampling from an ensemble of methods. The number of sequences sampled from any method is proportional to the quality of sequences it previously proposed, allowing PBO to combine the strengths of individual methods while hedging against their innate brittleness. Adapting the population of methods online using evolutionary optimization further improves performance. Through extensive experiments on in-silico optimization tasks, we show that PBO outperforms any single method in its population, proposing both higher quality single sequences as well as more diverse batches. By its robustness and ability to design diverse, high-quality sequences, PBO is shown to be a new state-of-the art approach to the batched black-box optimization of biological sequences.
View details
A Comparison of Generative Models for Sequence Design
David Dohan
Ramya Deshpande
Olivier Chapelle
Babak Alipanahi
Machine Learning in Computational Biology Workshop (2019)
Preview abstract
In this paper, we compare generative models of different complexity for designing DNA and protein sequences using the Cross Entropy Method.
View details