Publications
  • November 2021
  • Article
  • Proceedings of Machine Learning Research (PMLR)

Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data

By: William Herlands, Edward McFowland III, Andrew Gordon Wilson and Daniel B. Neill
  • Format: Print

Abstract

Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions and suffer degraded performance in correlated data. We introduce methods for identifying anomalous patterns in non-iid data by combining Gaussian processes with a novel log-likelihood ratio statistic and subset scanning techniques. Our approaches are powerful, interpretable, and can integrate information across multiple streams. We illustrate their performance on numeric simulations and three open source spatiotemporal datasets of opioid overdose deaths, 311 calls, and storm reports.
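
As a rough illustration of the idea of scoring subsets under a correlated (non-iid) null model, the sketch below computes a Gaussian log-likelihood ratio for a constant mean shift on a subset and scans contiguous windows of a one-dimensional series. It is not the statistic or search procedure from the paper: the squared-exponential kernel, the mean-shift alternative, the exhaustive window search, and the names rbf_kernel, mean_shift_llr, and scan_windows are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(x, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between all pairs of 1-D inputs."""
    diff = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def mean_shift_llr(y, K_inv, mask):
    """Maximized log-likelihood ratio for a constant mean shift on `mask`.

    H0: y ~ N(0, K);  H1: y ~ N(delta * s, K), s = indicator of the subset.
    Maximizing over delta gives 0.5 * (s' K^-1 y)^2 / (s' K^-1 s).
    """
    s = mask.astype(float)
    num = s @ K_inv @ y
    den = s @ K_inv @ s
    return 0.5 * num ** 2 / den


def scan_windows(y, K, min_len=2, max_len=10):
    """Score every contiguous window and return (best_score, (start, end))."""
    K_inv = np.linalg.inv(K)
    n = len(y)
    best_score, best_window = -np.inf, None
    for start in range(n):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > n:
                break
            mask = np.zeros(n, dtype=bool)
            mask[start:end] = True
            score = mean_shift_llr(y, K_inv, mask)
            if score > best_score:
                best_score, best_window = score, (start, end)
    return best_score, best_window


# Usage on synthetic data (all values here are made up for illustration):
# draw correlated noise from the null model, then shift one window upward.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 60)
K = rbf_kernel(x, lengthscale=1.0) + 0.1 * np.eye(len(x))
y = rng.multivariate_normal(np.zeros(len(x)), K)
y[20:30] += 3.0
score, window = scan_windows(y, K)
print(f"best window: {window}, score: {score:.2f}")
```

Because the score uses the full inverse covariance rather than treating points as independent, a run of high values that the null correlation already explains is penalized relative to one it does not. The exhaustive window loop is only practical for small inputs.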

Keywords

Pattern Detection; Subset Scanning; Gaussian Processes; Mathematical Methods

Citation

Herlands, William, Edward McFowland III, Andrew Gordon Wilson, and Daniel B. Neill. "Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data." Proceedings of Machine Learning Research (PMLR) 84 (2018): 425–434. (Also presented at the 21st International Conference on Artificial Intelligence and Statistics (AISTATS), 2018.)

About The Author

Edward McFowland III

Technology and Operations Management

More from the Authors

    • 2023
    • Journal of Machine Learning Research

    Exploiting Discovered Regression Discontinuities to Debias Conditioned-on-observable Estimators

    By: Benjamin Jakubowski, Sriram Somanchi, Edward McFowland III and Daniel B. Neill
    • 2023
    • Journal of the American Statistical Association

    Estimating Causal Peer Influence in Homophilous Social Networks by Inferring Latent Locations.

    By: Edward McFowland III and Cosma Rohilla Shalizi
    • October–December 2022
    • INFORMS Journal on Data Science

    Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem

    By: Mochen Yang, Edward McFowland III, Gordon Burtch and Gediminas Adomavicius