Global Relational Models of Source Code

Vincent Josua Hellendoorn

Petros Maniatis

Rishabh Singh

Charles Sutton

David Bieber

International Conference on Learning Representations (ICLR) (2020)


Abstract

Models of code can learn distributed representations of a program’s syntax and
semantics to predict many non-trivial properties of a program. Recent state-of-the-art models leverage highly structured representations of programs, such as
trees, graphs and paths therein (e.g., data-flow relations), which are precise and
abundantly available for code. This provides a strong inductive bias towards semantically
meaningful relations, yielding more generalizable representations than
classical sequence-based models. Unfortunately, these models primarily rely on
graph-based message passing to represent relations in code, which makes them
de facto local due to the high cost of message-passing steps, quite in contrast to
modern, global sequence-based models, such as the Transformer. In this work,
we bridge this divide between global and structured models by introducing two
new hybrid model families that are both global and incorporate structural bias:
Graph Sandwiches, which wrap traditional (gated) graph message-passing layers
in sequential message-passing layers; and Graph Relational Embedding Attention
Transformers (GREAT for short), which bias traditional Transformers with
relational information from graph edge types. By studying a popular, non-trivial
program repair task, variable-misuse identification, we explore the relative merits
of traditional and hybrid model families for code representation. Starting with a
graph-based model that already improves upon the prior state-of-the-art for this
task by 20%, we show that our proposed hybrid models improve by an additional
10–15%, while both training faster and using fewer parameters.
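
To make the idea of biasing a Transformer with edge-type information concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes one simple form of relational bias, a scalar per edge type added to the attention logits before the softmax, and the paper's exact formulation may differ. The function name `relation_biased_attention`, the `edge_bias` table, and the edge type `"data_flow"` are hypothetical names chosen for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relation_biased_attention(queries, keys, values, edges, edge_bias):
    """Single-head attention with an additive bias per graph edge type.

    queries, keys, values: one float vector per token (lists of lists).
    edges: iterable of (source_index, target_index, edge_type) triples.
    edge_bias: dict mapping edge_type to a scalar; in a real model this
        would be a learned parameter (assumption for this sketch).
    Returns (outputs, weights), where weights[i][j] is how much token i
    attends to token j.
    """
    n = len(queries)
    d = len(queries[0])

    # Accumulate the bias for every (source, target) pair that has an edge.
    bias = [[0.0] * n for _ in range(n)]
    for src, dst, edge_type in edges:
        bias[src][dst] += edge_bias[edge_type]

    outputs, weights = [], []
    for i in range(n):
        # Every token scores every other token (global attention);
        # graph edges only shift the scores, they do not mask anything.
        logits = [
            (sum(q * k for q, k in zip(queries[i], keys[j])) + bias[i][j])
            / math.sqrt(d)
            for j in range(n)
        ]
        w = softmax(logits)
        weights.append(w)
        outputs.append(
            [sum(w[j] * values[j][c] for j in range(n))
             for c in range(len(values[0]))]
        )
    return outputs, weights


# Three tokens with identical (zero) queries and keys, so any difference
# in attention comes purely from the edge bias.
zeros = [[0.0, 0.0] for _ in range(3)]
vals = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, plain = relation_biased_attention(zeros, zeros, vals, [], {})
_, biased = relation_biased_attention(
    zeros, zeros, vals, [(0, 2, "data_flow")], {"data_flow": 2.0}
)
```

In this sketch, `plain` spreads attention uniformly, while `biased` shifts token 0 toward token 2 because of the hypothetical `"data_flow"` edge. Token 0 still places nonzero weight on every token, which is the sense in which such a model stays global while carrying a structural bias.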
