Transforming Retrieval to a Generation task

Tom Kwiatkowski

Palak Jain

Livio Baldini Soares

2024

Abstract

Most NLP tasks, such as summarization and semantic parsing, can now be fulfilled with LLMs without any external pipelines. Retrieval remains a task that requires a separate retriever model, which makes it pipelined and cumbersome.
In this work, we explore posing retrieval as a generation task that can be completely folded into LLMs.

We draw motivation from two attributes of LLMs:
a) LLMs are knowledge warehouses. They memorize large amounts of text during pre-training, which gives them access to vast amounts of information in their parameters.
b) LLM decoding is inherently a search mechanism: it searches for a meaningful sequence through the universe of all possible output paths/sequences.

Where LLMs fall short is in attributing their generations to a trusted knowledge corpus.
In this work, we force the LLM to generate only verbatim sequences from the corpus by constraining decoding. Moreover, we can stitch together constrained and natural unconstrained generation, allowing us to blend reasoning with retrieval. This is achieved within a single decoding pass of the LLM, with no pipelined system needed.
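
The abstract does not say how the constraint is implemented. As a rough illustration only, one common way to restrict decoding to verbatim corpus sequences is a prefix tree (trie) over the corpus: at each step, only tokens that continue some corpus passage are allowed. The sketch below assumes whitespace tokenization, greedy search, and a toy word-overlap scorer standing in for the LLM's next-token scores; the names `build_trie`, `constrained_greedy_decode`, `make_overlap_scorer`, and `END` are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-passage marker (an assumption)


def build_trie(passages: Sequence[str]) -> Dict[str, dict]:
    """Build a prefix tree in which every root-to-END path is a verbatim passage."""
    root: Dict[str, dict] = {}
    for passage in passages:
        node = root
        for token in passage.split() + [END]:
            node = node.setdefault(token, {})
    return root


def constrained_greedy_decode(
    trie: Dict[str, dict],
    score: Callable[[List[str], str], float],
    max_steps: int = 50,
) -> List[str]:
    """Greedy decoding in which only tokens allowed by the trie may be emitted."""
    node, output = trie, []
    for _ in range(max_steps):
        allowed = list(node.keys())  # tokens that keep the output inside the corpus
        if not allowed:
            break
        best = max(allowed, key=lambda tok: score(output, tok))
        if best == END:
            break
        output.append(best)
        node = node[best]
    return output


def make_overlap_scorer(query: str) -> Callable[[List[str], str], float]:
    """Toy stand-in for an LLM: prefer tokens that also appear in the query."""
    query_words = set(query.lower().split())

    def score(prefix: List[str], token: str) -> float:
        return 1.0 if token.lower() in query_words else 0.0

    return score


corpus = [
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
]
trie = build_trie(corpus)
result = " ".join(constrained_greedy_decode(trie, make_overlap_scorer("Berlin capital")))
print(result)
```

In a real system the scorer would be the model's next-token distribution with disallowed tokens masked out, and the constraint could be switched on and off during decoding to interleave free-form text with retrieved spans. Those details are assumptions here, not a description of the authors' method.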
