You (Will) Wu
Authored Publications
Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
Boshi Wang
Sewon Min
Xiang Deng
Luke Zettlemoyer
Huan Sun
Proc. of The 61st Annual Meeting of the Association for Computational Linguistics (2023)
Preview abstract
Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs). CoT explicitly encourages the LLM to generate intermediate rationales for solving a problem by providing a series of reasoning steps in the demonstrations. Despite its success, there is still little understanding of what makes CoT prompting effective and which aspects of the demonstrated reasoning steps contribute to its performance. In this paper, we show that CoT reasoning is possible even with invalid demonstrations: prompting with invalid reasoning steps can achieve over 80-90% of the performance obtained using CoT under various metrics, while still generating coherent lines of reasoning during inference. Further experiments show that other aspects of the rationales, such as being relevant to the query and correctly ordering the reasoning steps, are much more important for effective CoT reasoning. Overall, these findings both deepen our understanding of CoT prompting and open up new questions regarding LLMs' capability to learn to reason in context.
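As a rough illustration of what "providing a series of reasoning steps in the demonstrations" can look like, the sketch below assembles a few-shot prompt in which each demonstration carries a rationale before its answer. The `Demo` class, the `build_cot_prompt` helper, the `Q:`/`A:` layout, and the example questions are all hypothetical choices made for this sketch; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One few-shot demonstration (hypothetical structure for illustration)."""
    question: str
    rationale: str  # intermediate reasoning steps shown to the model
    answer: str


def build_cot_prompt(demos, query):
    """Join demonstrations, each with its rationale, then append the new query."""
    blocks = [
        f"Q: {d.question}\nA: {d.rationale} The answer is {d.answer}."
        for d in demos
    ]
    # The final block leaves the answer open so the model continues from here.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


# Illustrative demonstration and query (made up for this sketch).
demos = [
    Demo(
        question="Tom has 3 apples and buys 2 more. How many apples does he have?",
        rationale="Tom starts with 3 apples. He buys 2 more, so 3 + 2 = 5.",
        answer="5",
    )
]
prompt = build_cot_prompt(
    demos, "A shelf holds 4 books and 6 more are added. How many books are there?"
)
print(prompt)
```

Under this layout, an "invalid demonstration" in the abstract's sense would correspond to replacing the `rationale` string with reasoning steps that are wrong while leaving the rest of the prompt unchanged.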
Preview abstract
Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such tables, there has been tremendous progress on a variety of tasks in the area of table understanding. However, existing work generally relies on heavily-engineered task-specific features and model architectures. In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables. During pre-training, our framework learns deep contextualized representations on relational tables in a self-supervised manner. Its universal model design with pre-trained representations can be applied to a wide range of tasks with minimal task-specific fine-tuning. Specifically, we propose a structure-aware Transformer encoder to model the row-column structure, and present a new Masked Entity Recovery (MER) objective for pre-training to capture relational knowledge. We compiled a benchmark consisting of 6 different tasks for table understanding and used it to systematically evaluate TURL. We show that TURL generalizes well to all tasks and substantially outperforms existing methods in almost all instances.
TURL: Table Understanding through Representation Learning
Xiang Deng
Huan Sun
Alyssa Whitlock Lees
Cong Yu
47th International Conference on Very Large Data Bases (2021)
Preview abstract
Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such tables, various tasks on table understanding have made tremendous progress. However, existing work largely relies on heavily-engineered task-specific features and model architectures. In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables. During pre-training, it learns deep contextualized representations on relational tables in an unsupervised manner. Its universal model design along with the pre-trained representations can be applied to a wide range of tasks with minimal task-specific fine-tuning. More specifically, we propose a structure-aware Transformer encoder to model the row-column structure of relational tables, and present a new Masked Entity Recovery (MER) objective for pre-training to capture the semantics and knowledge in large-scale unlabeled data. To systematically evaluate TURL, we compile a benchmark that consists of 6 different tasks for table understanding (e.g., relation extraction, cell filling). We show that TURL generalizes well to all these tasks and substantially outperforms existing methods in most cases. Our source code, benchmark, and pre-trained models are available online to facilitate future research.
Preview abstract
We present ReasonBert, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid contexts. Unlike existing pre-training methods that only harvest learning signals from local contexts of naturally occurring texts, we propose a generalized notion of distant supervision to automatically connect multiple pieces of text and tables to create pre-training examples that require long-range reasoning. Different types of reasoning are simulated, including intersecting multiple pieces of evidence, bridging from one piece of evidence to another, and detecting unanswerable cases. We conduct a comprehensive evaluation on a variety of extractive question answering datasets ranging from single-hop to multi-hop and from text-only to table-only to hybrid that require various reasoning capabilities and show that ReasonBert achieves remarkable improvement over an array of strong baselines. Few-shot experiments further demonstrate that our pre-training method substantially improves sample efficiency.
Collocating Structured Web Tables with News Articles
Alyssa Whitlock Lees
Cong Yu
Levy de Souza Silva
Luciano Barbosa
WWW 2021 International Workshop on News Recommendation and Intelligence (2021) (to appear)
Preview abstract
With the increased frequency of critical headline updates and news information published on the web, it can be overwhelming for users to understand and verify articles in a larger context. In particular, crucial statistics or facts within a news story may be difficult to parse without comparison to other entities. Structured web tables, as found in a source such as Wikipedia, offer users an important resource for understanding the news. Displaying tables or charts alongside news documents can assist in comprehension of a document's content. However, automatically finding relevant and meaningful connections between web tables and a given news article is a challenging research problem.
We are not aware of any research directly addressing this problem;
however, one can think of the news article as a (long) keyword query and apply information retrieval, or question-answering, techniques.
Previous work in this area used embeddings of KB entities and various semantic similarity metrics for table lookup.
Our experiments show that applying these baseline approaches to this task in a straightforward way yields spurious results that are inappropriate or irrelevant for a reader.
In this paper, we build on prior efforts, focusing specifically on the task of matching Wikipedia web tables to news articles. Our contribution includes a survey of existing techniques applied to the news-to-web matching problem. From these baselines, we propose a new model that leverages recent advances in bidirectional Transformer language models along with entity-based table embeddings. Specifically, our technique contains three technical components. First, we construct a training data set built from news article follow-up queries to Wikipedia articles over a large aggregate of users. Second, we extract unique web-table-based categories from Google's Knowledge Graph that describe Wikipedia table column components. Third, we fine-tune a Bidirectional Encoder Representations from Transformers (BERT) model pre-trained on news corpus data.
Evaluated on a set created using human curators as the standard, our approach significantly outperforms the baselines.
Generating Representative Headlines for News Stories
Xiaotao Gu
Yuning Mao
Jiawei Han
Cong Yu
Daniel Finnie
Jiaqi Zhai
Nick Zukoski
The Web Conference 2020
Preview abstract
Millions of news articles are published online every day, which can be overwhelming for readers to follow. Grouping articles that report the same event into news stories is a common way of assisting readers in their news consumption. However, it remains a challenging research problem to efficiently and effectively generate a representative headline for each story. Automatic summarization of a document set has been studied for decades, while few studies have focused on generating representative headlines for a set of articles. Unlike summaries, which aim to capture the most information with the least redundancy, headlines aim to capture, in a short length, the information jointly shared by the story articles, and to exclude information that is too specific to any individual article. In this work, we study the problem of generating representative headlines for news stories. We develop a distant supervision approach to train large-scale generation models without any human annotation. This approach centers on two technical components. First, we propose a multi-level pre-training framework that incorporates massive unlabeled corpora with different quality-vs.-quantity balances at different levels. We show that models trained within this framework outperform those trained with a purely human-curated corpus. Second, we propose a novel self-voting-based article attention layer to extract salient information shared by multiple articles. We show that models that incorporate this layer are robust to potential noise in news stories and outperform existing baselines with or without noise. We can further enhance our model by incorporating human labels, and we show that our distant supervision approach significantly reduces the demand for labeled data.
Preview abstract
Modern search engines provide contextual information surrounding query entities
beyond "ten blue links" in the form of knowledge cards.
Among the various attributes displayed about entities there has been
recent interest in providing trivia due to observed engagement rates.
Obtaining such trivia at a large scale is, however, non-trivial:
hiring professional content creators is expensive and
extracting statements from the Web can result in
unreliable or uninteresting facts.
In this paper we show how fun facts can be mined from tables
on the Web to provide a large volume of reliable and interesting content.
We employ a template-based approach to generate statements that are
postprocessed by workers. We show how to bootstrap and streamline the process
for faster and cheaper task completion.
However, the content contained in these tables is dynamic.
Therefore, we address the problem of automatically maintaining templates
when tables are updated.
Preview abstract
Modern search engines increasingly incorporate tabular content,
which consists of a set of entities each augmented with a small set
of facts. The facts can be obtained from multiple sources: an entity’s
knowledge base entry, the infobox on its Wikipedia page, or its row
within a WebTable. Crucially, the informativeness of a fact depends
not only on the entity but also on the specific context (e.g., the query).
To the best of our knowledge, this paper is the first to study the
problem of contextual fact ranking: given some entities and a context
(i.e., a succinct natural language description), identify the most
informative facts for the entities collectively within the context.
We propose to contextually rank the facts by exploiting deep
learning techniques. In particular, we develop pointwise and pairwise
ranking models, using textual and statistical information for
the given entities and context derived from their sources. We enhance
the models by incorporating entity type information from
an IsA (hypernym) database. We demonstrate that our approaches
achieve better performance than state-of-the-art baselines in terms
of MAP, NDCG, and recall. We further conduct user studies for two
specific applications of contextual fact ranking—table synthesis and
table compression—and show that our models can identify more
informative facts than the baselines.
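For readers unfamiliar with NDCG, one of the ranking metrics named above, the following is a minimal sketch of a common formulation with linear gain and a log2 rank discount. The abstract does not say which NDCG variant or cutoff the paper uses, so the function names and this particular formulation are assumptions made for illustration only.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: linear gain, log2 discount by rank (assumed variant)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Graded relevance of facts in ranked order (made-up labels for illustration).
perfect_order = ndcg([3, 2, 1])
swapped_order = ndcg([1, 3, 2])
print(perfect_order, swapped_order)
```

A ranking that already places the most informative facts first scores 1.0, and any ranking that puts a less relevant fact ahead of a more relevant one scores below that.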
Knowledge Exploration using Tables on the Web
Fernando Chirigati
Cong Yu
Proceedings of the VLDB Endowment, 10 (2017), pp. 193-204
Preview abstract
The increasing popularity of mobile device usage has ushered in many features in modern search engines that help users with various information needs. One of those needs is Knowledge Exploration, where related documents are returned in response to a user query, either directly through right-hand side knowledge panels or indirectly through navigable sections underneath individual search results. Existing knowledge exploration features have relied on a combination of Knowledge Bases and query logs. In this paper, we propose Knowledge Carousels of two modalities, namely sideways and downwards, that facilitate exploration of IS-A and HAS-A relationships, respectively, with regard to an entity-seeking query, based on leveraging the large corpus of tables on the Web. This brings many technical challenges, including associating correct carousels with the search entity, selecting the best carousel from the candidates, and finding titles that best describe the carousel. We describe how we address these challenges and also experimentally demonstrate through user studies that our approach produces better result sets than baseline approaches.