Publications
  • 2022
  • Article
  • Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)

Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis.

By: Martin Pawelczyk, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay and Himabindu Lakkaraju
  • Format: Electronic

Abstract

As machine learning (ML) models become more widely deployed in high-stakes applications, counterfactual explanations have emerged as key tools for providing actionable model explanations in practice. Despite the growing popularity of counterfactual explanations, a deeper understanding of these explanations is still lacking. In this work, we systematically analyze counterfactual explanations through the lens of adversarial examples. We do so by formalizing the similarities between popular counterfactual explanation and adversarial example generation methods, identifying the conditions under which they are equivalent. We then derive upper bounds on the distances between the solutions output by counterfactual explanation and adversarial example generation methods, which we validate on several real-world data sets. By establishing these theoretical and empirical similarities between counterfactual explanations and adversarial examples, our work raises fundamental questions about the design and development of existing counterfactual explanation algorithms.
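
As an illustration only, and not the authors' method or results, the overlap described in the abstract can be seen in the simplest setting: a binary linear classifier with Euclidean (L2) distance. There, "find the closest input that receives the other label" is the same optimization problem whether it is framed as a counterfactual explanation or as an adversarial example, and its solution is a projection onto the decision boundary. The weights, bias, input, and function names below are hypothetical and chosen only for this sketch.

```python
import math


def score(w, b, x):
    """Linear decision score; the predicted class is 1 when the score is positive."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else 0


def min_l2_flip(w, b, x, overshoot=1e-3):
    """Closest point in L2 distance that receives the opposite label.

    Projects x onto the hyperplane w.x + b = 0 and steps slightly past it.
    Under the assumptions of this sketch (binary linear model, L2 cost),
    a minimum-distance counterfactual and a minimum-norm adversarial
    example are both given by this same point.
    """
    s = score(w, b, x)
    norm_sq = sum(wi * wi for wi in w)
    scale = -(1.0 + overshoot) * s / norm_sq
    return [xi + scale * wi for xi, wi in zip(x, w)]


def l2_distance(a, c):
    return math.sqrt(sum((ai - ci) ** 2 for ai, ci in zip(a, c)))


# Hypothetical model and input, for illustration only.
w, b = [3.0, 4.0], -5.0
x = [3.0, 4.0]

x_flipped = min_l2_flip(w, b, x)
print(predict(w, b, x), predict(w, b, x_flipped), l2_distance(x, x_flipped))
```

For nonlinear models or other distance measures the two problems need not coincide exactly, which is the regime the abstract's distance bounds concern.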

Keywords

Machine Learning Models; Counterfactual Explanations; Adversarial Examples; Mathematical Methods

Citation

Pawelczyk, Martin, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, and Himabindu Lakkaraju. "Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis." Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS) 25th (2022).

About The Author

Himabindu Lakkaraju

Technology and Operations Management

More from the Authors

    • 2023
    • Faculty Research

    When Algorithms Explain Themselves: AI Adoption and Accuracy of Experts' Decisions

    By: Himabindu Lakkaraju and Chiara Farronato
    • 2022
    • Advances in Neural Information Processing Systems (NeurIPS)

    Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations

    By: Tessa Han, Suraj Srinivas and Himabindu Lakkaraju
    • 2022
    • Advances in Neural Information Processing Systems (NeurIPS)

    Efficiently Training Low-Curvature Neural Networks

    By: Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju and Francois Fleuret