Natural Language Processing

Natural Language Processing (NLP) research at Google focuses on algorithms that apply at scale, across languages, and across domains. Our systems are used throughout Google and shape the user experience in Search, mobile, apps, Ads, Translate, and more.

Our work spans the range of traditional NLP tasks, with general-purpose syntactic and semantic algorithms underpinning more specialized systems. We are particularly interested in algorithms that scale well and can be run efficiently in a highly distributed environment.

Our syntactic systems predict a part-of-speech tag for each word in a given sentence, as well as morphological features such as gender and number. They also label relationships between words, such as subject, object, and modification. We focus on efficient algorithms that leverage large amounts of unlabeled data, and have recently incorporated neural networks.
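As a rough illustration of the kind of output described above, the sketch below stores a part-of-speech tag, morphological features, and a labeled relationship to a head word for each token. It is not taken from any Google system: the Token class, the dependents helper, and the tag and relation names (DET, NOUN, VERB, nsubj, obj, det) are assumptions chosen for the example, and the annotations are written by hand rather than predicted.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One word of a sentence with hypothetical syntactic annotations."""
    index: int                                  # 1-based position in the sentence
    form: str                                   # surface word
    pos: str                                    # part-of-speech tag
    feats: dict = field(default_factory=dict)   # morphological features
    head: int = 0                               # index of the governing word (0 = root)
    relation: str = "root"                      # label of the link to the head


# Hand-annotated example; a real system would predict these values.
sentence = [
    Token(1, "The", "DET", {}, 2, "det"),
    Token(2, "dogs", "NOUN", {"Number": "Plur"}, 3, "nsubj"),
    Token(3, "chase", "VERB", {"Number": "Plur"}, 0, "root"),
    Token(4, "a", "DET", {}, 5, "det"),
    Token(5, "cat", "NOUN", {"Number": "Sing"}, 3, "obj"),
]


def dependents(tokens, head_index, relation):
    """Return the words attached to head_index with the given relation label."""
    return [t.form for t in tokens
            if t.head == head_index and t.relation == relation]


# The root is the token whose head is 0; its subject and object hang off it.
root = next(t for t in sentence if t.head == 0)
subjects = dependents(sentence, root.index, "nsubj")
objects = dependents(sentence, root.index, "obj")
print(root.form, subjects, objects)
```

Storing the head index and relation label on each token keeps the structure flat, which makes queries such as "what is the subject of the main verb" a simple filter over the list.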

On the semantic side, we identify entities in free text, label them with types (such as person, location, or organization), cluster mentions of those entities within and across documents (coreference resolution), and resolve the entities to the Knowledge Graph.
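The steps in this paragraph (typed mentions, clustering within and across documents, resolution to a knowledge base) can be sketched with a toy example. Everything here is an assumption made for illustration: the Mention class, the cluster_key heuristic (same type and same final word), and the toy_kb identifiers are hypothetical and do not reflect the Knowledge Graph or any production coreference method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A span of text already identified as an entity and given a type."""
    doc_id: str
    text: str
    entity_type: str


# Hand-written mentions; a real system would extract these from free text.
mentions = [
    Mention("doc1", "Ada Lovelace", "person"),
    Mention("doc1", "London", "location"),
    Mention("doc2", "Lovelace", "person"),
]


def cluster_key(mention):
    """Toy heuristic: mentions corefer if type and final word match."""
    return (mention.entity_type, mention.text.split()[-1].lower())


# Group mentions across documents by the heuristic key.
clusters = defaultdict(list)
for mention in mentions:
    clusters[cluster_key(mention)].append(mention)

# Hypothetical knowledge base keyed the same way as the clusters.
toy_kb = {
    ("person", "lovelace"): "kb:person/ada_lovelace",
    ("location", "london"): "kb:place/london",
}

# Resolve each cluster to a knowledge-base identifier (None if unknown).
resolved = {key: toy_kb.get(key) for key in clusters}

for key, members in clusters.items():
    print(resolved[key], [(m.doc_id, m.text) for m in members])
```

A string-matching key like this one fails on pronouns, nicknames, and ambiguous surnames, which is why practical coreference and entity resolution rely on context rather than surface form alone.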

Recent work has focused on incorporating multiple sources of knowledge and information to aid the analysis of text, as well as on applying frame semantics at the noun phrase, sentence, and document level.

Recent Publications

Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation

Krisztian Balog

Don Metzler

Zhen Qin

Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (2025)

VIDEOPHY-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation

Kai-Wei Chang

Hritik Bansal

Aditya Grover

Roman Goldenberg

Yonatan Bitton

Clark Peng

(2025)

Sufficient Context: A New Lens on Retrieval Augmented Generation Systems

Hailey Joren

Jianyi Zhang

Chun-Sung Ferng

Da-Cheng Juan

Ankur Taly

Cyrus Rashtchian

International Conference on Learning Representations (ICLR) (2025)

A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations

Michael Brenner

Uri Hasson

Samuel A. Nastase

Harshvardhan Gazula

Aditi Rao

Tom Sheffer

Werner Doyle

Orrin Devinsky

Aditi Singh

Adeen Flinker

Patricia Dugan

Yossi Matias

Bobbi Aubrey

Sasha Devore

Daniel Friedman

Leonard Niekerken

Catherine Kim

Mariano Schain

Haocheng Wang

Zaid Zada

Gina Choe

Avinatan Hassidim

Nature Human Behaviour (2025)

Inside-Out: Hidden Factual Knowledge in LLMs

Jonathan Herzig

Eran Ofek

Hadas Orgad

Zorik Gekhman

Idan Szpektor

Roi Reichart

Yonatan Belinkov

Eyal Ben-David

(2025)

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

Aviv Slobodkin

Hagai Taitelbaum

Yonatan Bitton

Brian Gordon

Michal Sokolik

Almog Gueta

Royi Rassin

Dani Lischinski

Idan Szpektor

(2025)
