CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation

Pardis Pashakhanloo
Aaditya Naik
Yuepeng Wang
Mayur Naik
ICLR (2022)

Abstract

Designing a suitable representation for code-reasoning tasks is challenging in
several respects: which kinds of program information to model, how to combine
them, and how much context to consider. We propose CodeTrek, a deep learning approach
that addresses these challenges by representing codebases as databases that conform
to rich relational schemas. The relational representation not only allows CodeTrek
to uniformly represent diverse kinds of program information, but also to leverage
program-analysis queries to derive new semantic relations, which can be readily
incorporated without further architectural engineering. CodeTrek embeds this
relational representation using a set of walks that can traverse different relations
in an unconstrained fashion, and incorporates all relevant attributes along the way.
We evaluate CodeTrek on four diverse and challenging Python tasks: variable
misuse, exception prediction, unused definition, and variable shadowing. CodeTrek
achieves an accuracy of 91%, 63%, 98%, and 94% on these tasks respectively, and
outperforms state-of-the-art neural models by 2--19 percentage points.
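To make the abstract's idea concrete, the following is a minimal sketch, not CodeTrek's actual schema or walk procedure: it represents a toy code snippet as relational tables (all relation and attribute names here are hypothetical), joins those relations into a graph, and samples walks that hop between different relations in an unconstrained fashion, as the abstract describes.

```python
import random

# Hypothetical relational tables for the snippet `x = 1; y = x + z`.
# Each relation is a list of tuples; this schema is illustrative only.
variables = [("v1", "x"), ("v2", "y"), ("v3", "z")]   # (var_id, name)
statements = [("s1", "assign"), ("s2", "assign")]     # (stmt_id, kind)
defines = [("s1", "v1"), ("s2", "v2")]                # statement defines variable
uses = [("s2", "v1"), ("s2", "v3")]                   # statement uses variable

# Join the relations into an undirected graph so that a walk can
# traverse any relation in either direction.
graph = {}

def add_edge(a, b, label):
    graph.setdefault(a, []).append((b, label))
    graph.setdefault(b, []).append((a, label))

for s, v in defines:
    add_edge(s, v, "defines")
for s, v in uses:
    add_edge(s, v, "uses")

def sample_walk(start, length, rng):
    """Sample one walk: a list of (node_id, incoming-relation-label) steps."""
    walk = [(start, None)]
    node = start
    for _ in range(length):
        node, label = rng.choice(graph[node])
        walk.append((node, label))
    return walk

# A set of walks starting from variable `x`; in CodeTrek these walks
# (with node attributes attached) would be embedded by a neural model.
rng = random.Random(0)
walks = [sample_walk("v1", 4, rng) for _ in range(3)]
```

A derived semantic relation (say, one produced by a program-analysis query) would simply be another table added alongside `defines` and `uses`, with no change to the walk sampler, which is the extensibility the abstract emphasizes.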