Publications
- January 2025
- HBS Case Collection
AI vs Human: Analyzing Acceptable Error Rates Using the Confusion Matrix
By: Tsedal Neeley and Tim Englehart
Abstract
This technical note introduces the confusion matrix as a foundational tool in artificial intelligence (AI) and large language models (LLMs) for assessing the performance of classification models, focusing on their reliability for decision-making. A confusion matrix displays true positives, true negatives, false positives, and false negatives, providing a comprehensive view of a model's predictive strengths and weaknesses. Key performance metrics—such as accuracy, precision, recall (sensitivity), and specificity—are explained in the context of AI reliability across applications like medical diagnostics, cybersecurity threat detection, and customer service automation. The note highlights that AI systems rarely achieve 100% accuracy but can approach near-perfect levels, such as 99% in advanced implementations. Additionally, the note distinguishes various reliability levels, helping leaders identify when AI systems outperform human judgment and when human oversight remains crucial. By understanding performance thresholds, leaders can make informed decisions about integrating AI solutions while balancing risks and ensuring trust in outcomes.
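The note itself is not reproduced here, but the metrics the abstract names follow directly from the four confusion-matrix counts. Below is a minimal, illustrative sketch (not taken from the note; the function name and example counts are hypothetical) showing the standard definitions of accuracy, precision, recall (sensitivity), and specificity.

```python
def confusion_matrix_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Derive standard classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy":    (tp + tn) / total,  # share of all predictions that are correct
        "precision":   tp / (tp + fp),     # of predicted positives, how many are truly positive
        "recall":      tp / (tp + fn),     # of actual positives, how many the model catches (sensitivity)
        "specificity": tn / (tn + fp),     # of actual negatives, how many the model correctly rejects
    }

# Hypothetical screening example: 90 true positives, 880 true negatives,
# 20 false positives, and 10 false negatives out of 1,000 cases.
print(confusion_matrix_metrics(tp=90, tn=880, fp=20, fn=10))
# -> accuracy 0.97, precision ~0.82, recall 0.90, specificity ~0.98
```

As the hypothetical output suggests, a model can report high overall accuracy while its precision or recall tells a different story, which is why the note frames acceptable error rates in terms of these separate metrics rather than accuracy alone.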
Citation
Neeley, Tsedal, and Tim Englehart. "AI vs Human: Analyzing Acceptable Error Rates Using the Confusion Matrix." Harvard Business School Technical Note 425-049, January 2025.