Publications
- September 15, 2021
- Bioinformatics
Improving Deconvolution Methods in Biology Through Open Innovation Competitions: An Application to the Connectivity Map
By: Andrea Blasco, Ted Natoli, Michael G. Endres, Rinat A. Sergeev, Steven Randazzo, Jin Hyun Paik, N.J. Maximilian Macaluso, Rajiv Narayan, Xiaodong Lu, David Peck, Karim R. Lakhani and Aravind Subramanian
Abstract
A recurring problem in biomedical research is how to isolate signals of distinct populations (cell types, tissues, and genes) from composite measures obtained by a single analyte or sensor. Existing computational deconvolution approaches work well in many specific settings, but they might be suboptimal in more general applications. Here, we describe new methods that were obtained via an open innovation competition. The goal of the competition was to characterize the expression of 1,000 genes from 500 composite measurements, which constitutes the approach of a new assay, called L1000, used to scale-up the Connectivity Map (CMap)—a catalog of millions of perturbational gene expression profiles. The competition used a novel dataset of 2,200 profiles and attracted 294 competitors from 20 countries. The top-nine performing methods ranged from machine learning approaches (Convolutional Neural Networks and Random Forests) to more traditional ones (Gaussian Mixtures and k-means). These solutions were faster and more accurate than the benchmark and likely have applications beyond gene expression.
Citation
Blasco, Andrea, Ted Natoli, Michael G. Endres, Rinat A. Sergeev, Steven Randazzo, Jin Hyun Paik, N.J. Maximilian Macaluso, Rajiv Narayan, Xiaodong Lu, David Peck, Karim R. Lakhani, and Aravind Subramanian. "Improving Deconvolution Methods in Biology Through Open Innovation Competitions: An Application to the Connectivity Map." Bioinformatics 37, no. 18 (September 15, 2021).