Visiting Associate Professor of Business Administration
Associate Professor Yael Grushka-Cockayne's research and teaching activities focus on data science, forecasting, project management, and behavioral decision-making. Her research is published in numerous academic and professional journals, and she is a regular speaker at international conferences in the areas of decision analysis, analytics, project management, and management science. She is also an award-winning teacher, having won the Darden Morton Leadership Faculty Award in 2011, the University of Virginia's Mead-Colley Award in 2012, the Darden Outstanding Faculty Award in 2013, and the Faculty Diversity Award in 2013 and 2018. In 2015, Yael won the University of Virginia All-University Teaching Award, and she was voted MBA faculty marshal in 2016, 2017, and 2018. In 2014, Yael was named one of "21 Thought-Leader Professors" in data science.
At HBS, Yael teaches the RC TOM course. At the Darden School, Yael taught the core Decision Analysis course as well as elective courses on project management and data science. Her recent "Fundamentals of Project Planning and Management" Coursera MOOC has had over 150,000 learners enrolled from 200 countries worldwide.
Before starting her academic career, she worked in San Francisco as the marketing director of an Israeli ERP company. As an expert in project management, she has served as a consultant to international firms in the aerospace and pharma industries. She is a UVA Excellence in Diversity fellow and a member of INFORMS, the Decision Analysis Society, the Operational Research Society, and the Project Management Institute (PMI). She is an associate editor at Management Science, Operations Research, and Decision Analysis and served as secretary/treasurer of the INFORMS Decision Analysis Society from 2012 to 2017.
Education: B.Sc., Ben-Gurion University; M.Sc., London School of Economics; M.Res. and Ph.D., London Business School
We introduce an exponential smoothing model that a manager can use to forecast the demand of a new product or service. The model has five features that make it suitable for accurately forecasting product life cycles at scale. First, the trend in our model follows the density of a new distribution called the tilted-Gompertz distribution. This model can capture the wide range of skewed diffusions commonly found in practice—diffusions of innovations described as having “extra-Bass” skew. Second, its parameters can be updated via exponential smoothing; therefore, the model can react to local changes in the environment. This model is the first exponential smoothing model to incorporate a life-cycle trend. Third, the model relies on multiplicative errors, instead of the additive errors primarily used in existing models. Multiplicative errors ensure that all quantile forecasts are strictly positive. Fourth, the model includes prior distributions on its parameters. These prior distributions become regularization terms in the model and allow the manager to make accurate forecasts from the beginning of a life cycle, which is notoriously difficult. The model's skewed shape, time-varying, regularized parameters, and multiplicative errors can make its quantile forecasts more accurate than leading diffusion models, such as the Bass, gamma/shifted-Gompertz, and trapezoid models. Fifth, the model's estimation procedure is based on an efficient optimization routine, which can be used to forecast product life cycles at scale. In two empirical studies, one of search interest in social networks and the other of new computer sales, we demonstrate that our model outperforms leading diffusion models in out-of-sample forecasting. Our model's point and other quantile forecasts are more accurate. Accurate quantile forecasts at different horizons are critical to many operational decisions, such as capacity and inventory management.
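The role of multiplicative errors in keeping forecasts strictly positive can be sketched with a minimal simple-exponential-smoothing example. This is illustrative only: the paper's model also includes the tilted-Gompertz life-cycle trend and prior distributions on the parameters, both omitted here.

```python
def smooth_multiplicative(series, alpha=0.3):
    """Simple exponential smoothing with multiplicative errors:
    y_t = l_{t-1} * (1 + eps_t), and l_t = l_{t-1} * (1 + alpha * eps_t).
    The point update coincides with additive-error smoothing, but the
    error enters relative to the level, so forecast quantiles scale
    with the level instead of being shifted by a fixed amount."""
    level = series[0]
    for y in series[1:]:
        eps = y / level - 1                # relative (multiplicative) error
        level = level * (1 + alpha * eps)  # equals level + alpha * (y - level)
    return level  # one-step-ahead point forecast

def quantile_forecast(level, eps_quantile):
    """A forecast quantile under multiplicative errors. For any error
    quantile above -1 (a drop of less than 100%), the forecast stays
    strictly positive whenever the level is positive."""
    return level * (1 + eps_quantile)
```

For example, with the made-up series [100, 110, 105] and alpha = 0.3, the level ends near 103.6, and even an extreme error quantile of -0.9 yields a positive forecast, which would not be guaranteed under additive errors.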
Kenneth C. Lichtendahl Jr., Yael Grushka-Cockayne, Victor Richmond R. Jose and Robert L. Winkler
Many organizations face critical decisions that rely on forecasts of binary events. In these situations, organizations often gather forecasts from multiple experts or models and average those forecasts to produce a single aggregate forecast. Because the average forecast is known to be under-confident, methods have been proposed that create an aggregate forecast more extreme than the average forecast. But is it always appropriate to extremize the average forecast? And if not, when is it appropriate to anti-extremize (i.e., to make the aggregate forecast less extreme)? To answer these questions, we introduce a class of optimal aggregators. These aggregators are Bayesian ensembles because they follow from a Bayesian model of the underlying information experts have. Each ensemble is a generalized additive model of experts' probabilities that first transforms the experts' probabilities into their corresponding information states, then linearly combines these information states, and finally transforms the combined information states back into the probability space. Analytically, we find that these optimal aggregators do not always extremize the average forecast, and when they do, they can run counter to existing methods. On two publicly available datasets, we demonstrate that these new ensembles are easily fit to real forecast data and are more accurate than existing methods.
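The transform-combine-transform recipe can be sketched with the log-odds (logit) as the link function. This link is a hypothetical choice made for illustration: the paper derives the appropriate transformation from its Bayesian model of the experts' information, and the `a` parameter below simply mimics extremizing (`a > 1`) or anti-extremizing (`a < 1`).

```python
import math

def logit(p):
    """Map a probability to its log-odds 'information state'."""
    return math.log(p / (1 - p))

def inv_logit(z):
    """Map a combined information state back to a probability."""
    return 1 / (1 + math.exp(-z))

def logit_ensemble(probs, weights=None, a=1.0):
    """Combine expert probabilities in log-odds space: transform each
    probability, linearly combine, transform back. This is one simple
    instance of the transform-combine-transform structure."""
    n = len(probs)
    weights = weights or [1 / n] * n
    z = sum(w * logit(p) for w, p in zip(weights, probs))
    return inv_logit(a * z)

probs = [0.6, 0.7, 0.9]
simple_avg = sum(probs) / len(probs)   # 0.733...
combined = logit_ensemble(probs)       # roughly 0.76, more extreme
```

Note that even with `a = 1`, averaging in log-odds space already pushes agreeing forecasts further from 0.5 than the simple average does.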
Problem definition: In collaboration with Heathrow Airport, we develop a predictive system that generates quantile forecasts of transfer passengers’ connection times. Sampling from the distribution of individual passengers’ connection times, the system also produces quantile forecasts for the number of passengers arriving at the immigration and security areas.
Academic/Practical relevance: Airports and airlines have been challenged to improve decision-making by producing accurate forecasts in real time. Our work is the first to apply machine learning to generating real-time quantile forecasts in the airport. We focus on passengers’ connecting journeys, which have been studied by only a few researchers. Better forecasts of these journeys can help optimize passenger experience and improve airport resource deployment.
Methodology: The predictive model developed is based on a regression tree combined with copula-based simulations. We generalize the tree method to predict complete distributions, moving beyond point forecasts. To derive insights from the tree, we introduce the concept of a stable tree that can be summarized by its key variables’ splits.
Results: We identify seven key factors that impact passengers’ connection times, dividing passengers into 16 passenger segments. We find that adding correlations among the connection times of passengers arriving on the same flight can improve the forecasts of arrivals at the immigration and security areas. When compared to several benchmarks, our model is shown to be more accurate in both point forecasting and quantile forecasting.
Managerial implications: Our predictive system can produce accurate forecasts, frequently, and in real-time. With these forecasts, an airport’s operating team can make data-driven decisions, identify late connecting passengers and assist them to make their connections. The airport can also update its resourcing plans based on the prediction of passenger arrivals. Our approach can be generalized to other domains, such as rail or hospital passenger flow.
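The core idea of moving a tree from point forecasts to full distributions can be sketched as follows: treat each leaf (passenger segment) as holding the empirical distribution of its historical connection times, and read quantile forecasts off that distribution. The segment labels and minute values below are made up for illustration, and the copula step that correlates passengers on the same flight is omitted.

```python
from collections import defaultdict

def empirical_quantile(xs, q):
    """q-th empirical quantile of a sample (simple order-statistic rule)."""
    xs = sorted(xs)
    idx = min(int(q * len(xs)), len(xs) - 1)
    return xs[idx]

def leaf_quantile_forecasts(records, q=0.9):
    """Group historical connection times (in minutes) by passenger
    segment, i.e., by tree leaf, and forecast the q-quantile per leaf."""
    by_segment = defaultdict(list)
    for segment, minutes in records:
        by_segment[segment].append(minutes)
    return {s: empirical_quantile(ts, q) for s, ts in by_segment.items()}

history = [("short-haul", 45), ("short-haul", 50), ("short-haul", 80),
           ("long-haul", 60), ("long-haul", 95)]
forecasts = leaf_quantile_forecasts(history, q=0.9)
```

Sampling from each leaf's distribution, instead of reading off a single quantile, is what allows the system to simulate and aggregate individual passengers into arrival counts at immigration and security.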
Robert L. Winkler, Yael Grushka-Cockayne, Kenneth C. Lichtendahl Jr. and Victor Richmond R. Jose
The use and aggregation of probability forecasts in practice is on the rise. In this position piece, we explore some recent, and not so recent, developments concerning the use of probability forecasts in decision-making. Despite these advances, challenges still exist. We expand on some important challenges such as miscalibration, dependence among forecasters, and selecting an appropriate evaluation measure, while connecting the processes of aggregating and evaluating forecasts to decision-making. Through three important applications from the domains of meteorology, economics, and political science, we illustrate state-of-the-art usage of probability forecasts: how they are aggregated, evaluated, and communicated to stakeholders. We expect to see greater use and aggregation of probability forecasts, especially given developments in statistical modeling, machine learning, and expert forecasting; the popularity of forecasting competitions; and the increased reporting of probabilities in the media. Our vision is that increased exposure to and improved visualizations of probability forecasts will enhance the public’s understanding of probabilities and how they can contribute to better decisions.
Firms today average forecasts collected from multiple experts and models. Because of cognitive biases, strategic incentives, or the structure of machine-learning algorithms, these forecasts are often overfit to sample data and are overconfident. Little is known about the challenges associated with aggregating such forecasts. We introduce a theoretical model to examine the combined effect of overfitting and overconfidence on the average forecast. Their combined effect is that the mean and median probability forecasts are poorly calibrated with hit rates of their prediction intervals too high and too low, respectively. Consequently, we prescribe the use of a trimmed average, or trimmed opinion pool, to achieve better calibration. We identify the random forest, a leading machine-learning algorithm that pools hundreds of overfit and overconfident regression trees, as an ideal environment for trimming probabilities. Using several known data sets, we demonstrate that trimmed ensembles can significantly improve the random forest’s predictive accuracy.
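A minimal sketch of the trimmed ensemble idea, assuming a list of per-tree predictions is already in hand (the trimming fraction and the numbers are illustrative):

```python
def trimmed_ensemble(predictions, trim_frac=0.2):
    """Trimmed average of an ensemble's predictions: sort, drop the
    lowest and highest trim_frac of trees, and average the rest,
    discounting overfit trees that land far from the consensus."""
    xs = sorted(predictions)
    k = int(len(xs) * trim_frac)
    kept = xs[k:len(xs) - k] if k > 0 else xs
    return sum(kept) / len(kept)

tree_predictions = [1.0, 2.0, 3.0, 4.0, 100.0]  # one wildly overfit tree
plain = sum(tree_predictions) / len(tree_predictions)  # 22.0
trimmed = trimmed_ensemble(tree_predictions)           # 3.0
```

The single outlying tree drags the plain average far from the bulk of the ensemble, while the trimmed pool ignores it.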
From forecasting competitions to conditional value-at-risk requirements, the use of multiple quantile assessments is growing in practice. To evaluate them, we use a rule from the general class of proper scoring rules for a forecaster’s multiple quantiles of a single uncertain quantity of interest. The general rule is additive in the component scores. Each component contains a function that measures its quantile’s distance from the realization and weights its contribution to the overall score. To determine this function, we propose that the score of a group’s combined quantile should be better than that of a randomly selected forecaster’s quantile only when the forecasters bracket the realization (i.e., their quantiles do not fall on the same side of the realization). If a score satisfies this property, we say it is sensitive to bracketing. We characterize the class of proper scoring rules that is sensitive to bracketing when the decision maker uses a generalized average to combine forecasters’ quantiles. Finally, we show how weights can be set to match the payoffs in many important business contexts.
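As a concrete instance, the widely used pinball (quantile) score fits the additive template: each component measures its quantile's distance from the realization, and per-quantile weights set its contribution to the overall score. Whether a given member of the template satisfies sensitivity to bracketing depends on the choices characterized in the paper; the code below only illustrates the additive structure.

```python
def pinball_loss(quantile_forecast, tau, realization):
    """Standard pinball score for a single tau-quantile forecast
    (lower is better; it is a proper score for the tau-quantile)."""
    diff = realization - quantile_forecast
    return max(tau * diff, (tau - 1) * diff)

def multi_quantile_score(forecasts, taus, realization, weights=None):
    """Additive score over several quantiles: each component's distance
    from the realization is weighted in the overall sum."""
    weights = weights or [1.0] * len(taus)
    return sum(w * pinball_loss(q, t, realization)
               for q, t, w in zip(forecasts, taus, weights))
```

For example, quantile forecasts of 8, 10, and 12 at levels 0.25, 0.5, and 0.75 against a realization of 11 score 0.75 + 0.5 + 0.25 = 1.5 under equal weights.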
Bert De Reyck, Ioannis Fragkos, Yael Grushka-Cockayne, Casey Lichtendahl, Hammond Guerin and Andrew Kritzer
The advent of big data has created opportunities for firms to customize their products and services to unprecedented levels of granularity. Using big data to personalize an offering in real time, however, remains a major challenge. In the mobile advertising industry, once a customer enters the network, an ad-serving decision must be made in a matter of milliseconds. In this work, we describe the design and implementation of an ad-serving algorithm that incorporates machine-learning methods to make personalized ad-serving decisions within milliseconds. We developed this algorithm for Vungle Inc., one of the largest global mobile ad networks. Our approach also addresses other important issues that most ad networks face, such as user fatigue, budget restrictions, and campaign pacing. In an A/B test versus the company’s legacy algorithm, our algorithm generated a 23 percent increase in revenue per 1,000 impressions. Across the company’s network, this increase represents a $1 million increase in monthly revenue.
Managerial flexibility can have a significant impact on the value of new product development projects. We investigate how the market environment in which a firm operates influences the value and use of development flexibility. We characterize the market environment according to two dimensions, namely (i) its intensity, and (ii) its degree of innovation. We show that these two market characteristics can have a different effect on the value of flexibility. In particular, we show that more intense or innovative environments may increase or decrease the value of flexibility. For instance, we demonstrate that the option to defer a product launch is typically most valuable when there is little competition. We find, however, that under certain conditions defer options may be highly valuable in more competitive environments. We also consider the value associated with the flexibility to switch development strategies, from a focus on incremental innovations to more risky ground-breaking products. We find that such a switching option is most valuable when the market is characterized by incremental innovations and by relatively intense competition. Our insights can help firms understand how managerial flexibility should be explored, and how it might depend on the nature of the environment in which they operate.
This article examines the prediction contest as a vehicle for aggregating the opinions of a crowd of experts. After proposing a general definition distinguishing prediction contests from other mechanisms for harnessing the wisdom of crowds, we focus on point-forecasting contests—contests in which forecasters submit point forecasts with a prize going to the entry closest to the quantity of interest. We first illustrate the incentive for forecasters to submit reports that exaggerate in the direction of their private information. Whereas this exaggeration raises a forecaster's mean squared error, it increases his or her chances of winning the contest. And in contrast to conventional wisdom, this nontruthful reporting usually improves the accuracy of the resulting crowd forecast. The source of this improvement is that exaggeration shifts weight away from public information (information known to all forecasters) and by so doing helps alleviate public knowledge bias. In the context of a simple theoretical model of overlapping information and forecaster behaviors, we present closed-form expressions for the mean squared error of the crowd forecasts which will help identify the situations in which point forecasting contests will be most useful.
Many large organizations use a stage-gate process to manage new product development projects. In a typical stage-gate process project managers learn about potential ideas from research and exert effort in development while senior executives make intervening go/no-go decisions. This decentralized decision making results in an agency problem because the idea quality in early stages is unknown to the executive and the project manager must exert unobservable development effort in later stages. In light of these challenges, how should the firm structure incentives to ensure that project managers reveal relevant information and invest the appropriate effort to create value? In this study, we develop a model of adverse selection in research and moral hazard in development with a go/no-go decision at the intervening gate. Our results show that the principal's uncertainty regarding early-stage idea quality—a term we refer to as idea risk—alters the effect of late-stage development risk. The presence of idea risk can alter the incentives offered to the agent and may lead the principal to reject projects that otherwise seem favorable in terms of positive net present value. A simulation of early-stage ideas, found through search on a complex landscape, shows that the firm can mitigate the negative effects of idea risk by encouraging breadth of search and high tolerance for failure.
We introduce an alternative to the popular linear opinion pool for combining individual probability forecasts. One of the well-known problems with the linear opinion pool is that it can be poorly calibrated. It tends toward underconfidence as the crowd's diversity increases, i.e., as the variance in the individuals' means increases. To address this calibration problem, we propose the exterior-trimmed opinion pool. To form this pool, forecasts with low and high means, or cumulative distribution function (cdf) values, are trimmed away from a linear opinion pool. Exterior trimming decreases the pool's variance and improves its calibration. A linear opinion pool, however, will remain overconfident when individuals are overconfident and not very diverse. For these situations, we suggest trimming away forecasts with moderate means or cdf values. This interior trimming increases variance and reduces overconfidence. Using probability forecast data from U.S. and European Surveys of Professional Forecasters, we present empirical evidence that trimmed opinion pools can outperform the linear opinion pool.
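A minimal sketch of the two trimming schemes, applied to the forecasters' cdf values at a fixed point (the trim counts and cdf values are illustrative):

```python
def exterior_trimmed_pool(cdf_values, k):
    """Drop the k lowest and k highest cdf values, then average.
    Lowering the pool's variance counters underconfidence."""
    xs = sorted(cdf_values)
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)

def interior_trimmed_pool(cdf_values, k):
    """Drop the k most moderate (middle) cdf values, then average.
    Raising the pool's variance counters overconfidence."""
    xs = sorted(cdf_values)
    mid = len(xs) // 2
    lo, hi = mid - k // 2, mid - k // 2 + k
    kept = xs[:lo] + xs[hi:]
    return sum(kept) / len(kept)

cdfs = [0.1, 0.2, 0.5, 0.7, 0.9]  # five forecasters' cdf values at one x
```

Repeating either trim at every point x traces out the trimmed pool's full cdf.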
When several individuals are asked to forecast an uncertain quantity, they often face implicit or explicit incentives to be the most accurate. Despite the desire to elicit honest forecasts, such competition induces forecasters to report strategically and nontruthfully. The question we address is whether the competitive crowd's forecast (the average of strategic forecasts) is more accurate than the truthful crowd's forecast (the average of truthful forecasts from the same forecasters). We analyze a forecasting competition in which a prize is awarded to the forecaster whose point forecast is closest to the actual outcome. Before reporting a forecast, we assume each forecaster receives two signals: one common and one private. These signals represent the forecasters' past shared and personal experiences relevant for forecasting the uncertain quantity of interest. In a set of equilibrium results, we characterize the nature of the strategic forecasts in this game. As the correlation among the forecasters' private signals increases, the forecasters switch from using a pure to a mixed strategy. In both cases, forecasters exaggerate their private information and thereby make the competitive crowd's forecast more accurate than the truthful crowd's forecast.
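A small Monte Carlo sketch of this setup (the distributions, sample sizes, and weights below are illustrative choices, not the paper's equilibrium): each forecaster sees one common and one private signal, and exaggeration shifts weight from the shared signal toward the private one, so the shared error averages out of the crowd forecast more fully.

```python
import random

random.seed(0)

def crowd_mse(weight_private, n_forecasters=50, n_trials=2000):
    """MSE of the crowd average when each forecaster reports
    (1 - w) * common_signal + w * private_signal."""
    se = 0.0
    for _ in range(n_trials):
        x = random.gauss(0, 1)                  # quantity of interest
        common = x + random.gauss(0, 1)         # shared experience
        privates = [x + random.gauss(0, 1)      # personal experiences
                    for _ in range(n_forecasters)]
        reports = [(1 - weight_private) * common + weight_private * q
                   for q in privates]
        crowd = sum(reports) / n_forecasters
        se += (crowd - x) ** 2
    return se / n_trials

mse_truthful = crowd_mse(0.5)   # equal weight on both signals
mse_strategic = crowd_mse(0.8)  # exaggerate the private signal
```

Because the private errors average toward zero across 50 forecasters while the common error does not, the exaggerating crowd's MSE comes out well below the truthful crowd's.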
Lichtendahl, Kenneth C., Yael Grushka-Cockayne, and Phillip E. Pfeifer. "The Wisdom of Competitive Crowds." Operations Research 61, no. 6 (November–December 2013): 1383–1398. (*Finalist in the Decision Analysis Society Publication Award, 2015.)
We consider two ways to aggregate expert opinions using simple averages: averaging probabilities and averaging quantiles. We examine analytical properties of these forecasts and compare their ability to harness the wisdom of the crowd. In terms of location, the two average forecasts have the same mean. The average quantile forecast is always sharper: it has lower variance than the average probability forecast. Even when the average probability forecast is overconfident, the shape of the average quantile forecast still offers the possibility of a better forecast. Using probability forecasts for gross domestic product growth and inflation from the Survey of Professional Forecasters, we present evidence that both when the average probability forecast is overconfident and when it is underconfident, it is outperformed by the average quantile forecast. Our results show that averaging quantiles is a viable alternative and indicate some conditions under which it is likely to be more useful than averaging probabilities.
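The location and sharpness claims can be checked on a toy example with two normal expert forecasts, N(0, 1) and N(2, 1) (hypothetical numbers):

```python
from statistics import NormalDist

f1, f2 = NormalDist(0, 1), NormalDist(2, 1)

# Average-probability forecast: the equal-weight mixture of the two cdfs.
# Its mean is the average of the means, but its variance picks up the
# spread in the experts' means (law of total variance).
mix_mean = (f1.mean + f2.mean) / 2
mix_var = (f1.variance + f2.variance) / 2 \
    + ((f1.mean - mix_mean) ** 2 + (f2.mean - mix_mean) ** 2) / 2

# Average-quantile forecast: average the two quantile functions.
# For equal-variance normals each averaged quantile is 1 + z_p,
# so the result is N(1, 1): same mean, strictly smaller variance.
def avg_q(p):
    return (f1.inv_cdf(p) + f2.inv_cdf(p)) / 2
```

Here both aggregates are centered at 1, but the average-quantile forecast has variance 1 while the average-probability mixture has variance 2, illustrating why averaging quantiles is always sharper.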
We describe an integrated decision-making framework and model that we developed to aid EUROCONTROL, the European air traffic management organization, in its vital role of constructing a single unified European sky. Combining multicriteria decision analysis with large-scale optimization methods, such as integer programming and column generation using branch and price, our model facilitates the process by which the numerous European aviation stakeholders evaluate and select technological enhancements to the European air traffic management system. We consider multiple objectives and potential disagreements by stakeholders regarding the impact of proposed system enhancements and allow for different priorities for each key performance area. In an earlier paper, we described the mathematical programming model in detail. In this paper, we elaborate on the broader decision framework and supporting methodologies to help EUROCONTROL in its facilitation role. Using our model and decision framework, EUROCONTROL is currently selecting a set of enhancements to the European aviation system upon which all stakeholders have agreed.
We develop a multistakeholder, multicriteria decision-making framework for Eurocontrol, the European air traffic management organization, for evaluating and selecting operational improvements to the air traffic management system. The selected set of improvements will form the master plan of the Single European Sky initiative for harmonizing air traffic, in an effort to cope with the forecasted increase in air traffic, while maintaining safety, protecting the environment, and improving predictability and efficiency. The challenge is to select the set of enhancements such that the required performance targets are met and all key stakeholders are committed to the decisions. In this paper, we develop and implement a model to identify a preferred set of improvements to the arrival and departure procedures to and from airports. We provide an integrated approach for valuing a large number of alternatives, while considering interactions among them. The model combines quantitative and qualitative expert assessments of the possible enhancements and identifies commonalities and differences in the stakeholders' perspectives, ultimately recommending a preferred course of action. The model is currently being adopted by Eurocontrol as the formal trade-off analysis methodology supporting all enhancements' decision-making discussions throughout the construction of the master plan.
Bert De Reyck, Yael Grushka-Cockayne, Martin Lockett, Sergio Ricardo Calderini, Marcio Moura and Andrew Sloper
The ever-increasing penetration of projects as a way to organise work in many organisations necessitates effective management of multiple projects. This has resulted in a greater interest in the processes of project portfolio management (PPM), with more and more software tools being developed to assist and automate the process. Much of the early work on PPM concentrated on the management of IT projects, largely from the perspective of the management of resources and risk. Many of the recent articles have been by vendors of the software, promoting the value of the PPM process. However, the claims made in those articles are typically only supported by anecdotal evidence. In this paper, we assess whether there is a correspondence between the use of PPM processes and techniques, and improvements in the performance of projects and portfolios of projects. Based on our findings, we introduce a three-stage classification scheme of PPM adoption, and present strong correlations between (1) increasing adoption of PPM processes and a reduction in project-related problems, and (2) PPM adoption and project performance.
Saxena, A., N. McFarland, and Y. Grushka-Cockayne. "Keep It Cool." University of Virginia, Darden School of Business Case UVA-QA-0898, 2018.
Goelz, C., D. Willingham, S. Le, and Y. Grushka-Cockayne. "Getting Rich on Crypto." University of Virginia, Darden School of Business Case UVA-QA-0897, 2018.
Wilcox, R.T., Y. Grushka-Cockayne, N. King, and J. White. "UnbeLEAFable Snacks." University of Virginia, Darden School of Business Case UVA-M-0911, 2017.
Grushka-Cockayne, Y., and T. Hasegawa. "Ariake Arena Exercise." University of Virginia, Darden School of Business Case UVA-QA-0889, 2017.
Carnahan, E., H. Kim, and Y. Grushka-Cockayne. "HealthCare.gov (B)." University of Virginia, Darden School of Business Case UVA-QA-0888, 2017.
Grushka-Cockayne, Y., and K. C. Lichtendahl. "Introduction to Experience-Based Forecasting: Empirical Backtesting." University of Virginia, Darden School of Business Technical Note UVA-QA-0851, 2016.
Green, A., Y. Grushka-Cockayne, K. C. Lichtendahl, and Temple Fennell. "Opening Casino Jack." University of Virginia, Darden School of Business Case UVA-QA-0827, 2015.
Grushka-Cockayne, Y., and A. Robertson. "New Menu at Split Banana." University of Virginia, Darden School of Business Case UVA-QA-0816, 2014.
Grushka-Cockayne, Y., P. Crama, E. Tang, and A. Banerjee. "Set in Stone." University of Virginia, Darden School of Business Case UVA-QA-0814, 2014.
Sorensen, T., Y. Grushka-Cockayne, and R. Carraway. "Flex Technology." University of Virginia, Darden School of Business Case UVA-QA-0811, 2014.
Gupta, R., K. C. Lichtendahl, and Y. Grushka-Cockayne. "Ocean's Dilemma." University of Virginia, Darden School of Business Case UVA-QA-0798, 2012.
Lichtendahl, K. C., and Y. Grushka-Cockayne. "Scoring Expert Forecasts." University of Virginia, Darden School of Business Technical Note UVA-QA-0772, 2011.
Lichtendahl, K. C., and Y. Grushka-Cockayne. "Ballis's Benchmark (B)." University of Virginia, Darden School of Business Case UVA-F-1622, 2010.
Grushka-Cockayne, Y., and K. C. Lichtendahl. "Ballis's Benchmark (A)." University of Virginia, Darden School of Business Case UVA-F-1621, 2010.