Publications
  • May 2017
  • Other Article
  • GigaScience

Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis

By: Andrew Hill, Po-Ru Loh, Ragu B. Bharadwaj, Pascal Pons, Jingbo Shang, Eva C. Guinan, Karim R. Lakhani, Iain Kilty and Scott Jelinsky
  • Format: Electronic

Abstract

BACKGROUND: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets.

RESULTS: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project.

CONCLUSIONS: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics.
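
For readers unfamiliar with the computation being accelerated, the following is a minimal sketch of a per-variant logistic regression association test of the kind the abstract refers to. It is not the PLINK 1.07 code or the contest-derived code. The function names (`fit_logistic`, `wald_p_value`), the single-predictor model without covariates, the additive 0/1/2 genotype coding, and the toy data are all assumptions made for illustration. A genome-wide analysis repeats a fit like this once per variant, which is why the speed of this inner routine dominates the run time.

```python
import math


def fit_logistic(genotypes, phenotypes, max_iter=25, tol=1e-10):
    """Fit logit(P(case)) = b0 + b1 * genotype by Newton-Raphson.

    genotypes:  minor-allele counts (0, 1, or 2), one per subject (assumed coding).
    phenotypes: 0 for control, 1 for case, one per subject.
    Returns (b0, b1, se_b1), where se_b1 is the standard error of b1 taken
    from the information matrix of the last iteration performed.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # entries of the 2x2 information matrix
        for x, y in zip(genotypes, phenotypes):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            raise ValueError("singular information matrix (e.g. a monomorphic variant)")
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1


def wald_p_value(beta, se):
    """Two-sided p-value for the Wald statistic z = beta / se."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Toy data for one variant: 10 subjects per genotype class,
    # with 3, 5, and 7 cases among carriers of 0, 1, and 2 minor alleles.
    genotypes = [0] * 10 + [1] * 10 + [2] * 10
    phenotypes = [1] * 3 + [0] * 7 + [1] * 5 + [0] * 5 + [1] * 7 + [0] * 3
    b0, b1, se = fit_logistic(genotypes, phenotypes)
    print("log odds ratio per allele:", b1)
    print("standard error:", se)
    print("Wald p-value:", wald_p_value(b1, se))
```

Each subject contributes an `exp` evaluation and a handful of multiply-adds in every Newton iteration, so the cost per variant grows with the number of subjects and the total cost with subjects times variants. The sketch makes no attempt at the numeric or parallelization techniques the abstract credits for the speedup.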

Keywords

Crowdsourcing; Genome-wide Association Study; Logistic Regression; Open Innovation; PLINK; Collaborative Innovation and Invention

Citation

Hill, Andrew, Po-Ru Loh, Ragu B. Bharadwaj, Pascal Pons, Jingbo Shang, Eva C. Guinan, Karim R. Lakhani, Iain Kilty, and Scott Jelinsky. "Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis." GigaScience 6, no. 5 (May 2017).

About The Author

Karim R. Lakhani

Technology and Operations Management

More from the Authors

    • March 2023
    • Faculty Research

    VideaHealth: Building the AI Factory

    By: Karim R. Lakhani
    • March 2023
    • Faculty Research

    Moderna (B) Case Supplement

    By: Karim R. Lakhani, Allison J. Wigen and Dave Habeeb
    • March 2023
    • Faculty Research

    Moderna (A) Case Supplement

    By: Karim R. Lakhani, Allison J. Wigen and Dave Habeeb