Managing with Data Science - Harvard Business School MBA Program

Managing with Data Science

Course Number 1365

Professor Srikant Datar
Visiting Lecturer Alistair Croll
Fall; Q1; 1.5 credits
Project

Overview

The last few years have seen an explosion of data. Data is being collected at a staggering rate from a wide range of sources as the scale of digital activity continues to increase. Looking outward, companies have enormous amounts of data on their customers, such as what they buy, how they buy, and where they buy. Looking inward, they also have data on many of their own activities, from operations to employee engagement. However, value is not created by data alone; it is created by applying data to meet a business need.

Data science seeks to make sense of and gain insights from data. A recent McKinsey study estimates that over the next few years, the demand for managers with data skills will total 1.5 million. This course focuses on helping students develop the basic data skills needed to guide an organization towards becoming data-centric and to potentially create data products.

Course Objectives

Data science is a “team sport” whose practitioners draw on concepts from statistics, computer science, and machine learning to build predictive models that can inform decision making. The course objectives are twofold:

  • Familiarize students with the fundamentals of data science so that they can work effectively with a data science team in an organization, both to shape the “ask” and to interpret outputs.
  • Develop their understanding of data science’s implications for management and decision making in a data-rich environment.

Neither a mathematical nor a programming background is required for this course.

    Topics Covered

    Through a series of new cases, caselets, and assignments, students will learn to:

    • Shape actionable business “asks”
    • Find, evaluate, and augment data
    • Apply basic algorithms to build models (decision trees, random forests, regression, neural networks, clustering)
    • Identify limitations of models and their outputs
    • Think critically about all parts of the data science ecosystem and working processes to make effective decisions

    A new modeling platform, DataRobot, will be a vehicle for much of the learning. You can explore the platform at www.DataRobot.com. The company is headquartered in Boston and will support us during the course.

    Requirements

    The most skillful managers in a data-rich world are not data scientists themselves, but they understand the data science ecosystem and individual modeling techniques deeply enough to know their value and their limitations. Therefore, the curriculum balances a high-level overview of data science and its role in business with the basic mechanics of modeling techniques and model building. We will work through the mechanics of data science in Excel and also train on the DataRobot platform, so that students will not have to program in Python or R to work on their projects. Throughout the course, the focus will be on thinking critically about data, models, and conclusions in a managerial context.

    Part 1: The Data Science Ecosystem & Introduction to Modeling

    • Case: Moneyball: Introduction to Data Science
    • Case: Data Science at Target — Developing an “Ask” and Scoping a Modeling Effort (Part 1)
    • Case: Clustering customers at Wine Emporium — Exploring data through K-means clustering
    • Case: Predicting default at Lending Club: Building a basic model

    Part 2: Data Considerations & Modeling Techniques

    • Case: Predicting default at Lending Club: Basic prediction using decision trees and random forests
    • Case: RetailMart: Customer status — Basic prediction using regression
    • Introduction to DataRobot: Deep learning, Neural networks, Ensemble Models, Gradient Boosting
    • Case: Mandrill — Naïve Bayes, sentiment analysis

    Part 3: Interpretation and Decision Making

    • Case: Data Science at Target — Learning, Insights, and Future Actions (Part 2)
    • Data Visualization
    • Caselets on managing with data science, building data products

    Who is eligible?

    This course will be open to students from across Harvard University and its graduate schools, as well as Tufts and MIT. Sophomores, Juniors, and Seniors from the College are also welcome. The course will also accept a limited number of Advanced Leadership Institute Fellows.