Publications
  • Article
  • Journal of Machine Learning Research

Fast Generalized Subset Scan for Anomalous Pattern Detection

By: Edward McFowland III, Skyler Speakman and Daniel B. Neill
  • Format: Electronic
  • Pages: 29

Abstract

We propose Fast Generalized Subset Scan (FGSS), a new method for detecting anomalous patterns in general categorical data sets. We frame the pattern detection problem as a search over subsets of data records and attributes, maximizing a nonparametric scan statistic over all such subsets. We prove that the nonparametric scan statistics possess a novel property that allows for efficient optimization over the exponentially many subsets of the data without an exhaustive search, enabling FGSS to scale to massive and high-dimensional data sets. We evaluate the performance of FGSS in three real-world application domains (customs monitoring, disease surveillance, and network intrusion detection), and demonstrate that FGSS can successfully detect and characterize relevant patterns in each domain. As compared to three other recently proposed detection algorithms, FGSS substantially decreased run time and improved detection power for massive multivariate data sets.
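
The idea of maximizing a scan statistic over subsets without exhaustive search can be illustrated with a small sketch. This is not the authors' implementation, and the abstract does not specify the statistic or the property used. The sketch assumes a Higher Criticism style score, a single fixed significance threshold `alpha`, a matrix of per-record, per-attribute p-values, and that all attributes are included; the function names and the example data are hypothetical. Under these assumptions every record contributes the same number of p-values, so for any subset size the best subset consists of the records with the most p-values at or below `alpha`, and only the prefixes of one sorted list need to be scored.

```python
from math import sqrt


def higher_criticism(n_alpha, n, alpha):
    """Assumed score: excess of significant p-values over the expected count,
    scaled by the binomial standard deviation."""
    return (n_alpha - n * alpha) / sqrt(n * alpha * (1 - alpha))


def best_record_subset(pvalues, alpha):
    """Find the highest-scoring subset of records for a fixed alpha.

    pvalues: list of rows (one per record), each with the same number of
    p-values (one per attribute). Returns (score, sorted record indices).
    """
    m = len(pvalues[0])
    # Number of p-values at or below alpha in each record.
    counts = [sum(p <= alpha for p in row) for row in pvalues]
    # Priority ordering: records with the most significant p-values first.
    order = sorted(range(len(pvalues)), key=lambda i: counts[i], reverse=True)

    best_score, best_k = float("-inf"), 0
    n_alpha = 0
    # Score only the prefixes of the ordering instead of all 2^n - 1 subsets.
    for k, i in enumerate(order, start=1):
        n_alpha += counts[i]
        score = higher_criticism(n_alpha, k * m, alpha)
        if score > best_score:
            best_score, best_k = score, k
    return best_score, sorted(order[:best_k])


# Hypothetical data: 4 records, 3 attributes each.
pvalues = [
    [0.01, 0.02, 0.60],
    [0.50, 0.70, 0.90],
    [0.03, 0.04, 0.01],
    [0.40, 0.02, 0.80],
]
score, subset = best_record_subset(pvalues, alpha=0.05)
print(subset, round(score, 3))
```

With n records the sketch scores n prefixes after one sort, rather than enumerating every non-empty subset. A fuller search would also vary `alpha` and the set of attributes, which this sketch leaves out.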

Keywords

Pattern Detection; Anomaly Detection; Knowledge Discovery; Bayesian Networks; Scan Statistics; Analytics and Data Science

Citation

McFowland III, Edward, Skyler Speakman, and Daniel B. Neill. "Fast Generalized Subset Scan for Anomalous Pattern Detection." Art. 12. Journal of Machine Learning Research 14 (2013): 1533–1561.

About The Author

Edward McFowland III

Technology and Operations Management

More from the Authors

    • March 2025
    • Information and Organization

    Novice Risk Work: How Juniors Coaching Seniors on Emerging Technologies Such as Generative AI Can Lead to Learning Failures

    By: Katherine C. Kellogg, Hila Lifshitz-Assaf, Steven Randazzo, Ethan Mollick, Fabrizio Dell'Acqua, Edward McFowland III, François Candelon and Karim R. Lakhani
    • May 2024
    • Faculty Research

    Pernod Ricard: Uncorking Digital Transformation

    By: Iavor Bojinov, Edward McFowland III, François Candelon, Nikolina Jonsson and Emer Moloney
    • January 2024
    • Bioinformatics

    Subset Scanning for Multi-Trait Analysis Using GWAS Summary Statistics

    By: Rui Cao, Evan Olawsky, Edward McFowland III, Erin Marcotte, Logan Spector and Tianzhong Yang