Finding Related Tables

Anish Das Sarma

Lujun Fang

Nitin Gupta

Alon Y. Halevy

Hongrae Lee

Fei Wu

Reynold Xin

Cong Yu

SIGMOD (2012)

Abstract

We consider the problem of finding related tables in a large corpus of
heterogeneous tables. Detecting related tables gives users a
powerful tool for enhancing their tables with additional data and
enables effective reuse of available public data. Our first
contribution is a framework that captures several types of relatedness,
including tables that are candidates for joins and tables that are
candidates for union. Our second contribution is a set of algorithms
for detecting related tables that can be either unioned or
joined. We describe a set of experiments that demonstrate that our
algorithms produce highly related tables. We also show that
we can often improve the results of table search by
pulling up tables that are ranked much lower based on their
relatedness to top-ranked tables. Finally, we describe how to scale up
our algorithms and show the results of running them on a corpus of over
a million tables extracted from Wikipedia.
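
To make the two kinds of relatedness concrete, here is a minimal sketch. It is not the paper's algorithm: the abstract does not say how relatedness is scored, so the Jaccard-based measures, the function names (`union_score`, `join_score`), and the toy tables below are all assumptions made for illustration. A table is modeled as a dict from column name to a list of values. Two tables are treated as union candidates when their column names overlap, and as join candidates when some pair of columns shares many values.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def union_score(t1, t2):
    """Assumed union measure: overlap between the two sets of column names."""
    return jaccard(t1.keys(), t2.keys())


def join_score(t1, t2):
    """Assumed join measure: best value overlap over all column pairs.

    Returns (score, column_in_t1, column_in_t2); the columns are None
    when no pair of columns shares any value.
    """
    best = (0.0, None, None)
    for c1, v1 in t1.items():
        for c2, v2 in t2.items():
            s = jaccard(v1, v2)
            if s > best[0]:
                best = (s, c1, c2)
    return best


# Hypothetical toy tables with placeholder values.
t_a = {"country": ["X", "Y"], "gdp": [1, 2]}
t_b = {"country": ["Z", "W"], "gdp": [3, 4]}
t_c = {"nation": ["X", "Y"], "capital": ["P", "Q"]}

print(union_score(t_a, t_b))  # same schema, different rows
print(join_score(t_a, t_c))   # different schema, shared key values
```

In this toy data, `t_a` and `t_b` share a schema but no rows, so they look like union candidates, while `t_a` and `t_c` share key values under different column names, so they look like join candidates. Comparing every pair of tables this way would not scale to a corpus of over a million tables, which is why the abstract treats scaling as a separate concern.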

Research Areas

Data Management