Publications
  • 2022
  • Article
  • Proceedings of the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society

Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations

By: Jessica Dai, Sohini Upadhyay, Ulrich Aivodji, Stephen Bach and Himabindu Lakkaraju
  • Format: Print | Pages: 12

Abstract

As post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to ensure that the quality of the resulting explanations is consistently high across all subgroups of a population. For instance, explanations associated with instances belonging to one group (e.g., women) should not be less accurate than those associated with other genders. In this work, we initiate the study of identifying group-based disparities in explanation quality. To this end, we first outline several key properties that contribute to explanation quality, namely fidelity (accuracy), stability, consistency, and sparsity, and discuss why and how disparities in these properties can be particularly problematic. We then propose an evaluation framework that can quantitatively measure disparities in the quality of explanations. Using this framework, we carry out an empirical analysis with three datasets, six post hoc explanation methods, and different model classes to understand if and when group-based disparities in explanation quality arise. Our results indicate that such disparities are more likely to occur when the models being explained are complex and non-linear. We also observe that certain post hoc explanation methods (e.g., Integrated Gradients, SHAP) are more likely to exhibit disparities. Our work sheds light on previously unexplored ways in which explanation methods may introduce unfairness in real-world decision making.
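To make the framework's core idea concrete, below is a minimal, illustrative sketch in Python. It is not the authors' code or released framework: it estimates just one of the quality properties listed above, fidelity, using a LIME-style local linear surrogate of a black-box model, and then compares average fidelity across two subgroups. The synthetic dataset, the random-forest model, the Gaussian perturbation scheme, and the R² fidelity proxy are all assumptions made for illustration.

```python
# A minimal sketch of the paper's core idea: compute an explanation-quality
# metric separately for each subgroup and report the gap. Everything below
# (data, model, perturbation scheme, fidelity proxy) is an illustrative
# assumption, not the authors' implementation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic data; the binary group attribute is derived from column 0
# (assumption for illustration).
X = rng.normal(size=(1000, 6))
y = (X[:, 1] + X[:, 2] ** 2 + 0.5 * X[:, 0] > 0).astype(int)
group = (X[:, 0] > 0).astype(int)

# A complex, non-linear black-box model, as in the settings where the
# paper finds disparities are most likely to arise.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def local_fidelity(x, n_samples=200, scale=0.3):
    """R^2 of a LIME-style local linear surrogate fit around instance x.

    Higher values mean the explanation (the surrogate) more faithfully
    reproduces the black-box model's behavior near x.
    """
    Z = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    p = model.predict_proba(Z)[:, 1]          # black-box outputs
    surrogate = Ridge(alpha=1.0).fit(Z, p)    # local linear explanation
    return surrogate.score(Z, p)

# Evaluate fidelity on a sample of instances, then compare by group.
fid = np.array([local_fidelity(x) for x in X[:200]])
g = group[:200]
print(f"mean fidelity, group 0: {fid[g == 0].mean():.3f}")
print(f"mean fidelity, group 1: {fid[g == 1].mean():.3f}")
print(f"fidelity gap: {abs(fid[g == 0].mean() - fid[g == 1].mean()):.3f}")
```

A gap near zero suggests comparable explanation quality across groups. The paper's framework applies the same group-wise comparison to its other quality properties (stability, consistency, and sparsity) as well.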

Keywords

Prejudice and Bias; Mathematical Methods; Research; Analytics and Data Science

Citation

Dai, Jessica, Sohini Upadhyay, Ulrich Aivodji, Stephen Bach, and Himabindu Lakkaraju. "Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations." Proceedings of the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (2022): 203–214.

About The Author

Himabindu Lakkaraju

Technology and Operations Management

More from the Authors

    • 2023
    • Faculty Research

    When Algorithms Explain Themselves: AI Adoption and Accuracy of Experts' Decisions

    By: Himabindu Lakkaraju and Chiara Farronato
    • 2022
    • Advances in Neural Information Processing Systems (NeurIPS)

    Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations

    By: Tessa Han, Suraj Srinivas and Himabindu Lakkaraju
    • 2022
    • Advances in Neural Information Processing Systems (NeurIPS)

    Efficiently Training Low-Curvature Neural Networks

    By: Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju and Francois Fleuret