Publications
  • Article
  • Advances in Neural Information Processing Systems (NeurIPS)

Reliable Post hoc Explanations: Modeling Uncertainty in Explainability

By: Dylan Slack, Sophie Hilgard, Sameer Singh and Himabindu Lakkaraju
  • Format: Print

Abstract

As black-box explanations are increasingly being employed to establish model credibility in high-stakes settings, it is important to ensure that these explanations are accurate and reliable. However, prior work demonstrates that explanations generated by state-of-the-art techniques are inconsistent and unstable, and provide very little insight into their correctness and reliability. In addition, these methods are computationally inefficient and require significant hyper-parameter tuning. In this paper, we address the aforementioned challenges by developing a novel Bayesian framework for generating local explanations along with their associated uncertainty. We instantiate this framework to obtain Bayesian versions of LIME and KernelSHAP, which output credible intervals for the feature importances, capturing the associated uncertainty. The resulting explanations not only enable us to make concrete inferences about their quality (e.g., there is a 95% chance that the feature importance lies within the given range), but are also highly consistent and stable. We carry out a detailed theoretical analysis that leverages this uncertainty to estimate how many perturbations to sample, and how to sample, for faster convergence. This work makes the first attempt at addressing several critical issues with popular explanation methods in one shot, thereby generating consistent, stable, and reliable explanations with guarantees in a computationally efficient manner. Experimental evaluation with multiple real-world datasets and user studies demonstrates the efficacy of the proposed framework.
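For intuition, here is a minimal sketch of the kind of construction the abstract describes: fit a proximity-weighted Bayesian linear regression to a black box's predictions on perturbations of an input, then read 95% credible intervals for the feature importances off the coefficient posterior. This sketch is illustrative, not the paper's exact BayesLIME/BayesSHAP procedure; the function name, the Gaussian perturbation scale, the exponential proximity kernel, and the uninformative (Jeffreys) prior are all assumptions made here.

```python
import numpy as np
from scipy import stats

def bayes_lime_sketch(predict_fn, x, n_samples=1000, kernel_width=0.75, seed=0):
    """Hypothetical BayesLIME-style local explanation (illustrative only).

    Fits a proximity-weighted Bayesian linear regression with an
    uninformative (Jeffreys) prior to black-box predictions on Gaussian
    perturbations of x, and returns posterior means plus 95% credible
    intervals for the feature importances.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]

    # 1. Sample perturbations around the instance being explained
    #    (the 0.1 perturbation scale is an arbitrary choice for this sketch).
    Z = x + rng.normal(scale=0.1, size=(n_samples, d))
    y = np.asarray(predict_fn(Z), dtype=float)     # black-box outputs

    # 2. Weight each perturbation by proximity to x (exponential kernel,
    #    as in LIME-style local surrogates).
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / kernel_width ** 2)

    # 3. Weighted Bayesian linear regression: under a Jeffreys prior the
    #    coefficient posterior is Student-t, centered at the weighted
    #    least-squares solution.
    X = np.hstack([np.ones((n_samples, 1)), Z])    # intercept + features
    sw = np.sqrt(w)
    Xw, yw = X * sw[:, None], y * sw
    XtX_inv = np.linalg.pinv(Xw.T @ Xw)
    beta = XtX_inv @ Xw.T @ yw                     # posterior mean
    resid = yw - Xw @ beta
    dof = n_samples - X.shape[1]
    s2 = resid @ resid / dof                       # noise variance estimate

    # 4. 95% credible intervals for each coefficient (feature importance).
    se = np.sqrt(s2 * np.diag(XtX_inv))
    t = stats.t.ppf(0.975, dof)
    return beta[1:], (beta - t * se)[1:], (beta + t * se)[1:]  # drop intercept
```

The closed-form posterior is what makes the uncertainty cheap to report: wide intervals flag unstable importances, and, as the abstract notes, the paper's theoretical analysis uses this uncertainty to decide how many perturbations to sample before the intervals are tight enough to trust.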

Keywords

Black Box Explanations; Bayesian Modeling; Decision Making; Risk and Uncertainty; Information Technology

Citation

Slack, Dylan, Sophie Hilgard, Sameer Singh, and Himabindu Lakkaraju. "Reliable Post hoc Explanations: Modeling Uncertainty in Explainability." Advances in Neural Information Processing Systems (NeurIPS) 34 (2021).

About The Author

Himabindu Lakkaraju

Technology and Operations Management

More from the Authors

  • 2024 · Faculty Research
    Fair Machine Unlearning: Data Removal while Mitigating Disparities
    By: Himabindu Lakkaraju, Flavio Calmon, Jiaqi Ma and Alex Oesterling
  • 2024 · Faculty Research
    Quantifying Uncertainty in Natural Language Explanations of Large Language Models
    By: Himabindu Lakkaraju, Sree Harsha Tanneru and Chirag Agarwal
  • 2023 · Advances in Neural Information Processing Systems (NeurIPS)
    Post Hoc Explanations of Language Models Can Improve Language Models
    By: Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, Sameer Singh and Himabindu Lakkaraju