Publications
  • 2021
  • Working Paper
  • HBS Working Paper Series

Time and the Value of Data

By: Ehsan Valavi, Joel Hestness, Newsha Ardalani and Marco Iansiti
  • Format: Print
  • Language: English
  • Pages: 43

Abstract

Managers often believe that collecting more data will continually improve the accuracy of their machine learning models. However, we argue in this paper that when data lose relevance over time, it may be optimal to collect a limited amount of recent data instead of retaining an unlimited supply of older (less relevant) data. In addition, we argue that increasing the stock of data by including older datasets may, in fact, damage the model's accuracy. As expected, the model's accuracy improves as the flow of data (defined as the data collection rate) increases; however, a higher flow requires other tradeoffs, such as refreshing or retraining machine learning models more frequently.

Using these results, we investigate how the business value created by machine learning models scales with data and when the stock of data establishes a sustainable competitive advantage. We argue that data's time-dependency weakens the barrier to entry that the stock of data creates. As a result, a competing firm equipped with a limited (yet sufficient) amount of recent data can develop more accurate models. This result, coupled with the fact that older datasets may reduce models' accuracy, suggests that the business value created does not scale with the stock of available data unless the firm offloads less relevant data from its data repository. Consequently, a firm's growth policy should balance the stock of historical data against the flow of new data.

We complement our theoretical results with an experiment that uses the simple yet widely used machine learning task known as next word prediction. We empirically measure the loss in accuracy of a next word prediction model trained on datasets from various time periods. Our empirical measurements confirm the economic significance of the value decline over time. For example, 100MB of text data, after seven years, becomes as valuable as 50MB of current data for the next word prediction task.
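
The closing figure can be read as an equivalence ratio between old and current data. The following is a minimal sketch of that arithmetic, not the authors' model: it assumes, purely for illustration, that the equivalent size decays exponentially with age, so that the seven-year figure acts as a half-life. The abstract does not state a functional form, and the names `effective_size_mb`, `effective_stock_mb`, and `half_life_years` are hypothetical.

```python
def effective_size_mb(size_mb: float, age_years: float,
                      half_life_years: float = 7.0) -> float:
    """Amount of current data (MB) treated as equivalent to aged data.

    ASSUMPTION: exponential decay with a 7-year half-life, calibrated
    only to the single data point in the abstract (100MB -> 50MB after
    seven years). The paper itself may use a different functional form.
    """
    if size_mb < 0 or age_years < 0 or half_life_years <= 0:
        raise ValueError("sizes and ages must be non-negative, half-life positive")
    return size_mb * 0.5 ** (age_years / half_life_years)


def effective_stock_mb(datasets: list[tuple[float, float]],
                       half_life_years: float = 7.0) -> float:
    """Sum the current-data equivalents of (size_mb, age_years) pairs."""
    return sum(effective_size_mb(size, age, half_life_years)
               for size, age in datasets)


if __name__ == "__main__":
    # The abstract's example: 100MB of seven-year-old text.
    print(effective_size_mb(100.0, 7.0))
    # A hypothetical stock: 100MB collected today plus 100MB from seven years ago.
    print(effective_stock_mb([(100.0, 0.0), (100.0, 7.0)]))
```

Under this assumed form, a firm's nominal stock overstates its usable stock whenever part of it is old. Note that the sketch only discounts old data; it does not capture the abstract's stronger claim that including older datasets may actively reduce accuracy.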

Keywords

Economics Of AI; Machine Learning; Non-stationarity; Perishability; Value Depreciation; Analytics and Data Science; Value

Citation

Valavi, Ehsan, Joel Hestness, Newsha Ardalani, and Marco Iansiti. "Time and the Value of Data." Harvard Business School Working Paper, No. 21-016, August 2020. (Revised November 2021.)

About The Author

Marco Iansiti

Technology and Operations Management

More from the Authors

    • March 2023
    • Faculty Research

    Moderna

    By: Marco Iansiti, Karim R. Lakhani, Hannah Mayer, Kerry Herman, Allison J. Wigen and Dave Habeeb
    • September 2022 (Revised October 2022)
    • Faculty Research

    Data Privacy in Practice at LinkedIn

    By: Iavor Bojinov, Marco Iansiti and Seth Neel
    • September 2021
    • Information Systems Research

    Network Interconnectivity and Entry into Platform Markets

    By: Feng Zhu, Xinxin Li, Ehsan Valavi and Marco Iansiti