Publications
- winter 2003
- Marketing Science
Massively Categorical Variables: Revealing the Information in Zip Codes
Abstract
We introduce the idea of a massively categorical variable, a variable such as zip code that takes on too many values to be treated in the standard manner, and show how to use it directly as an explanatory variable in an econometric model. In an application of this concept, we explore several issues confronted in direct marketing. To begin with, the data offered by many providers, such as Experian and Claritas, are masked through aggregation to protect consumer privacy. Although this practice creates some difficulty when trying to construct models of individual-level choice behavior, we show how to take full advantage of such data through a hierarchical Bayesian variance components (HBVC) model. The flexibility of our approach allows us to combine several sources of information, some of which may not be aggregated, in a coherent manner, and we show that the conventional modeling practice understates the uncertainty with regard to its parameter values. To give economic meaning to our results, we develop targeting strategies under an array of financial conditions and show how to determine an organization's willingness-to-pay for additional data.
Citation
Steenburgh, Thomas J., Andrew Ainslie, and Peder Hans Engebretson. "Massively Categorical Variables: Revealing the Information in Zip Codes." Marketing Science 22, no. 1 (winter 2003): 40–57.