Introduction
Machine learning (ML) is the study of methods and algorithms that learn the underlying patterns in data and generate predictions from those patterns. Academic research in marketing has traditionally focused on causal inference, a focus driven by the need to make counterfactual predictions. Will higher advertising spending, for example, increase demand? Answering this question requires an unbiased estimate of the effect of advertising on demand. Marketing practice, on the other hand, relies on the ability to make accurate predictions: which customers to target, which product configuration a customer is most likely to choose, which version of a banner ad will generate the most clicks, and what rivals’ market shares and actions are likely to be. All of these are prediction problems. They do not require causal identification; they require high out-of-sample predictive accuracy. Machine learning methods are well suited to such problems.
Machine learning approaches
Machine learning approaches differ from econometric methods in their emphasis and in the features they offer. First, machine learning approaches aim for the best out-of-sample predictions, whereas causal econometric methods aim for the best unbiased estimators. As a result, methods designed for causal inference often perform poorly at out-of-sample prediction. The best unbiased estimator does not always give the best out-of-sample prediction, because an estimator with no bias can still have high variance. In some cases, a biased estimator with lower variance predicts better on out-of-sample data. Second, machine learning techniques are built to work in settings where we have no a priori theory of how the data were generated. This distinguishes ML from econometric techniques, which are used to test a specific causal hypothesis. Third, unlike many empirical methods used in marketing, machine learning algorithms can handle a large number of variables and determine which ones to keep and which to drop. Finally, scalability is a central concern for ML approaches, and techniques such as feature selection and efficient optimization help achieve scale and efficiency. Because many of these algorithms must run in real time, scalability is becoming increasingly important for marketers.
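The point about biased estimators can be illustrated with a small simulation. The sketch below is only an illustration under assumed numbers: the customer spend figures, the noise levels, and the shrinkage weight of 0.3 are all hypothetical and not taken from any study. It predicts each customer's future spend either from that customer's own average (unbiased but noisy) or from an average pulled toward the grand mean (biased but more stable), and compares the two on held-out observations.

```python
import random
from statistics import fmean

def simulate(n_customers=2000, n_obs=3, seed=0):
    """Simulate a few noisy spend observations per customer (hypothetical numbers)."""
    rng = random.Random(seed)
    train, holdout = [], []
    for _ in range(n_customers):
        true_mean = rng.gauss(50, 5)  # customers differ only modestly
        train.append([rng.gauss(true_mean, 20) for _ in range(n_obs)])
        holdout.append([rng.gauss(true_mean, 20) for _ in range(n_obs)])
    return train, holdout

def out_of_sample_mse(predictions, holdout):
    """Mean squared error of one prediction per customer on held-out observations."""
    return fmean((y - pred) ** 2 for pred, ys in zip(predictions, holdout) for y in ys)

train, holdout = simulate()

# Unbiased estimator: each customer's own sample mean.
own_means = [fmean(obs) for obs in train]

# Biased estimator: shrink each customer's mean toward the grand mean.
grand_mean = fmean(own_means)
weight = 0.3  # assumed shrinkage weight, chosen for illustration and not tuned
shrunk = [weight * m + (1 - weight) * grand_mean for m in own_means]

mse_unbiased = out_of_sample_mse(own_means, holdout)
mse_shrunk = out_of_sample_mse(shrunk, holdout)
print(f"unbiased estimator, holdout MSE: {mse_unbiased:.1f}")
print(f"shrunk (biased) estimator, holdout MSE: {mse_shrunk:.1f}")
```

In this setup the shrunk estimator gives up some accuracy on average for each individual customer in exchange for a large reduction in variance, which is why it tends to predict held-out spend better when each customer has only a few noisy observations.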
Machine learning models fall into two broad types: supervised learning and unsupervised learning. Input data for supervised learning must include predictor (independent) variables and a target (dependent) variable whose value is to be predicted. The algorithm learns to predict the value of the target variable from the predictor variables. Supervised learning methods include decision trees, regression analysis, and neural networks. Supervised learning is used when the goal of an analysis is to predict the value of a variable. Unsupervised learning does not designate a target (dependent) variable and instead treats all variables equally. The goal here is not to predict the value of a variable but to find patterns, groups, or other ways of describing the data that help explain how the variables relate to one another. Unsupervised learning techniques include cluster analysis, factor analysis (principal component analysis), EM algorithms, and topic modelling (text analysis).
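The contrast between the two types can be made concrete with a toy example. The sketch below uses a hypothetical eight-customer data set and two deliberately simple stand-ins: a one-split decision stump for supervised learning and a basic k-means routine for cluster analysis. The function names, the data, and the starting centroids are assumptions made for illustration only.

```python
from math import dist

# Hypothetical customers: (store visits per month, monthly spend).
features = [(1, 10), (2, 12), (1, 8), (3, 15), (8, 90), (9, 110), (10, 95), (7, 100)]
purchased = [0, 0, 0, 0, 1, 1, 1, 1]  # target variable, used only by the supervised model

def fit_stump(values, labels):
    """Supervised: choose the cut-off on one predictor that best reproduces the labels."""
    cuts = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]

    def errors(threshold):
        return sum((v > threshold) != bool(y) for v, y in zip(values, labels))

    return min(candidates, key=errors)

def k_means(points, centroids, n_iter=10):
    """Unsupervised: group points around centroids without looking at any target."""
    centroids = list(centroids)
    assignment = []
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        assignment = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                      for p in points]
        # Move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment

# Supervised: learn a spend cut-off from labelled data, then predict the target.
spend = [s for _, s in features]
threshold = fit_stump(spend, purchased)
predicted = [int(s > threshold) for s in spend]

# Unsupervised: segment the same customers using the features alone.
segments = k_means(features, centroids=[features[0], features[-1]])

print("spend cut-off:", threshold, "predictions:", predicted)
print("segments:", segments)
```

The stump needs the `purchased` column to learn anything, whereas the clustering step never sees it. The features are left unscaled here for brevity, so spend dominates the distance calculation; a real analysis would normally standardize them first.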
The support vector machine (SVM) has been a prominent classification algorithm over the last 20 years, with applications in image recognition, text mining, and disease detection, thanks to its robustness and its capacity to handle large, high-dimensional data. Cui and Curry (2005) introduced SVM theory and applications to marketing. They also compare SVM’s predictive performance with that of the multinomial logit model on simulated choice data and show that SVM outperforms the multinomial logit model, especially when the data are noisy and products have a large number of attributes (i.e., high dimensionality). They further find that SVM beats the multinomial logit model by a substantial margin when predicting choices from larger choice sets. Because the first-choice prediction task becomes harder as the choice set grows, the predictive performance of both techniques declines, but the decline is significantly steeper for multinomial logit than for SVM. Another extension is the latent-class SVM model, which allows latent variables to be included within SVM. Liu and Dzyabura (2016) build on the convex-concave procedure used to estimate latent-class SVM, incorporate respondent heterogeneity, and develop an algorithm for predicting multi-taste consumer preferences. They show that their model predicts better than single-taste benchmarks.
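For readers unfamiliar with SVM, the following is a minimal sketch of a linear SVM trained by batch subgradient descent on the regularized hinge loss. It is not the estimation procedure of Cui and Curry (2005) or Liu and Dzyabura (2016). The simulated buy/pass data, the function names, and the settings `reg`, `lr`, and `epochs` are all assumptions chosen for illustration.

```python
import random

def simulate_choices(n, rng, gap=0.2):
    """Hypothetical data: buy (+1) when quality exceeds price, otherwise pass (-1)."""
    data = []
    while len(data) < n:
        price, quality = rng.uniform(-1, 1), rng.uniform(-1, 1)
        utility = quality - price
        if abs(utility) >= gap:  # keep a margin so the two classes are separable
            data.append(((price, quality), 1 if utility > 0 else -1))
    return data

def train_linear_svm(data, reg=0.01, lr=0.1, epochs=300):
    """Batch subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [reg * wj for wj in w], 0.0
        for x, y in data:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:  # only points inside or beyond the margin contribute
                grad_w[0] -= y * x[0] / n
                grad_w[1] -= y * x[1] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def accuracy(data, w, b):
    """Share of observations whose predicted sign matches the observed choice."""
    hits = sum((1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
train, test = simulate_choices(300, rng), simulate_choices(300, rng)
w, b = train_linear_svm(train)
print("weights (price, quality):", [round(wj, 2) for wj in w], "bias:", round(b, 2))
print("holdout accuracy:", accuracy(test, w, b))
```

The hinge loss ignores observations that are already classified with a margin of at least one, so the fitted boundary is determined by the observations closest to it. A multinomial logit fit, by contrast, is influenced by every observation through the likelihood.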
References
[1] Cui, D. and D. Curry. Prediction in Marketing Using the Support Vector Machine. Marketing Science, 24(4): 595–615, 2005.
[2] Liu, L. and D. Dzyabura. Capturing Multi-taste Preferences: A Machine Learning Approach. Working Paper, 2016.
[3] Dzyabura, D. and H. Yoganarasimhan. Machine Learning and Marketing. In Handbook of Marketing Analytics. Edward Elgar Publishing, 2018.