Post Hoc Explanations of Language Models Can Improve Language Models

Satyapriya Krishna
Jiaqi Ma
Dylan Z Slack
Sameer Singh
Himabindu Lakkaraju
NeurIPS 2023

Abstract

Large Language Models (LLMs) have shown remarkable capabilities in performing complex tasks, excelling at in-context learning, and providing step-by-step reasoning. However, incorporating human-annotated rationales, such as Chain-of-Thought prompts, to enhance model performance faces challenges in scalability and can sometimes adversely affect performance. In this work, we present a novel approach, AMPLIFY (Advanced Model Performance Leveraging In-Context Learning with Post Hoc Explanations), which addresses these challenges by replacing human-annotated rationales with rationales generated automatically by post hoc explanation methods. Post hoc explanation techniques have gained popularity for computing attribution scores that quantify the influence of input features on model predictions, deepening our understanding of model behavior and helping pinpoint errors in complex models. We leverage these explanations to provide corrective signals to large language models, reducing prediction errors and augmenting in-context learning with automatically generated rationales. Our findings demonstrate that AMPLIFY yields performance improvements of 10-25% across a wide range of tasks, including those where prompting techniques that rely on human-annotated explanations, such as Chain-of-Thought, fall short. This highlights the potential of post hoc explanation methods as a valuable tool for enhancing the efficiency and effectiveness of large language models across tasks. Furthermore, we conduct an extensive empirical analysis examining the impact of each step of AMPLIFY, offering critical insights for refining in-context learning while addressing the limitations of methods dependent on human-annotated rationales.
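The pipeline described above can be sketched in miniature: attribution scores from a post hoc explanation method rank the input tokens, the top-ranked tokens are formatted into a rationale, and the rationale is embedded in a few-shot example. The attribution scores below are illustrative placeholders (in practice they would come from a method such as gradient-based attribution), and the rationale template and helper names are assumptions for this sketch, not the paper's exact implementation.

```python
# Sketch of AMPLIFY-style rationale construction from post hoc attributions.
# Scores and the rationale template are illustrative assumptions.

def rationale_from_attributions(tokens, scores, label, k=3):
    """Select the top-k tokens by attribution score and format a rationale."""
    top = sorted(zip(tokens, scores), key=lambda ts: ts[1], reverse=True)[:k]
    key_words = ", ".join(tok for tok, _ in top)
    return f"The key words: {key_words} are important clues to predict {label}."

def build_few_shot_example(text, tokens, scores, label, k=3):
    """Compose one few-shot prompt example: input, rationale, then label."""
    rationale = rationale_from_attributions(tokens, scores, label, k)
    return f"Input: {text}\nRationale: {rationale}\nLabel: {label}"

# Hypothetical attribution scores for a sentiment-classification input.
tokens = ["the", "movie", "was", "utterly", "delightful"]
scores = [0.01, 0.20, 0.02, 0.35, 0.80]
example = build_few_shot_example("the movie was utterly delightful",
                                 tokens, scores, "positive")
print(example)
```

Examples built this way would be concatenated into the few-shot prompt in place of human-written rationales, which is what removes the manual-annotation bottleneck the abstract describes.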