The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation

Ankur Bapna

George Foster

Llion Jones

Macduff Hughes

Melvin Johnson

Mia Chen

Mike Schuster

Niki J. Parmar

Orhan Firat

Wolfgang Macherey

Yonghui Wu

Zhifeng Chen

ACL'18 (2018) (to appear)


Abstract

The past year has witnessed rapid advances in sequence-to-sequence (seq2seq)
modeling for Machine Translation (MT). The classic RNN-based approaches to MT
were first outperformed by the convolutional seq2seq model, which was then
outperformed by the more recent Transformer model. Each of these new
approaches consists of a fundamental architecture accompanied by a set of
modeling and training techniques that are in principle applicable to other
seq2seq architectures. In this paper, we tease apart the new architectures and
their accompanying techniques in two ways. First, we identify several key
modeling and training techniques, and apply them to the RNN architecture,
yielding a new RNMT+ model that outperforms all three fundamental
architectures on the benchmark WMT'14 English-to-French and
English-to-German tasks. Second, we analyze the properties of each
fundamental seq2seq architecture and devise new hybrid architectures intended
to combine their strengths. Our hybrid models obtain further improvements,
outperforming the RNMT+ model on both benchmark datasets.
