Publications
- 2024
Scaling Core Earnings Measurement with Large Language Models
By: Matthew Shaffer and Charles C.Y. Wang
Abstract
We study the application of large language models (LLMs) to the estimation of core earnings, i.e., a firm's persistent profitability from its core business activities. This construct is central to investors' assessments of economic performance and valuation. However, quantifying it requires judgment and the integration of information scattered throughout financial disclosures, contextualized with general industry knowledge. This has become increasingly difficult as financial disclosures have become more "bloated" and accounting standards have increased the non-recurring impacts on GAAP net income. LLMs, with their ability to process unstructured text, incorporate general knowledge, and mimic human reasoning, may be well-suited for this kind of task. Using the text of 10-K filings from U.S. public companies between 2000 and 2023, we employ LLMs with two prompting strategies: (i) a baseline "out of the box" approach providing only a definition of core earnings and the full 10-K, and (ii) a structured "sequential" approach, refined through experiments, instructing the model to identify unusual losses, then gains, and then tabulate and aggregate them. We evaluate the models' analyses by reviewing their stated reasoning processes and by subjecting their core earnings measures to an array of standard quantitative tests. Under the baseline approach, the LLM conflates core earnings with other financial concepts (e.g., EBITDA). However, the sequential approach yields a valid core earnings measure that outperforms GAAP net income and Compustat's OPEPS and OIADP in predicting average future earnings in most standard tests. Our findings are relevant for practitioners, showing how these models can fail and succeed in complex tasks of this nature. For researchers, we pave a path for using current and future models to generate valid, neutral, scalable measures of core earnings, rather than relying on surrogates provided by company management or standard data providers.
Overall, our findings suggest LLMs have enormous potential for lowering the costs associated with processing and analyzing the increasingly bloated financial disclosures of publicly traded companies.
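The "sequential" prompting strategy described in the abstract decomposes the task into ordered steps: identify unusual losses, then unusual gains, then tabulate and aggregate them. A minimal sketch of that idea is below; the step wording and the `call_llm` function are hypothetical placeholders (the paper's actual prompts and model interface are not given here), and `call_llm` is stubbed so the example runs standalone.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for any chat-completion client.

    Stubbed here with a canned response so the sketch is runnable;
    in practice this would call a real LLM API.
    """
    return "(model response to: " + prompt[:40] + "...)"


def sequential_core_earnings_analysis(filing_text: str) -> list[str]:
    """Run the three-step sequential prompting strategy on a 10-K.

    Each step's reply is appended to the context so later steps can
    build on earlier answers (losses -> gains -> tabulate/aggregate).
    """
    steps = [
        "Identify unusual or non-recurring LOSSES in this 10-K.",
        "Identify unusual or non-recurring GAINS in this 10-K.",
        "Tabulate the items identified above and aggregate them to "
        "adjust GAAP net income toward core earnings.",
    ]
    transcript = []
    context = filing_text
    for step in steps:
        reply = call_llm(context + "\n\n" + step)
        transcript.append(reply)
        # Carry the prompt and reply forward as context for the next step.
        context += "\n\n" + step + "\n" + reply
    return transcript


responses = sequential_core_earnings_analysis("(full 10-K text here)")
print(len(responses))  # one response per step
```

The design point this illustrates is the paper's contrast between the one-shot baseline (definition plus full filing) and a decomposition that forces the model through losses, gains, and aggregation in order.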
Citation
Shaffer, Matthew, and Charles C.Y. Wang. "Scaling Core Earnings Measurement with Large Language Models." Working Paper, November 2024.