Machine learning in economics: do or don't?
Author: Jens Buurveld
The intersection of Machine Learning and econometrics has become an important research area in economics. Standard regression techniques focus on inference, where attention is mostly limited to the sign and statistical significance of coefficients, whereas the data scientist’s Machine Learning approach is better at predicting than at describing. Even so, applying Machine Learning to economics requires finding tasks for which prediction is the relevant goal. Despite the growing interest in Machine Learning, little progress has been made in understanding when it can be applied to economics, be it micro-, macro- or applied economics.
Why use Machine Learning in the first place? There is no single answer: Machine Learning is not one model but a wide range of approaches, each with its own strengths and shortcomings. It is therefore useful to look at what Machine Learning can be used for. For simplicity I will only touch upon two distinctive features of Machine Learning: dimension reduction and non-linearities.
We all know the national income identity: Y=C+I+G+NX. Based on this, if I can predict next year’s consumption, investment, government spending and net exports, I can predict next year’s GDP. But what if I have a very large dataset with 134 macroeconomic indicators? Should I try to find only the indicators that resemble the components of the national income identity and use these, or should I just wing it and use all 134 indicators in my least-squares regression? The latter would result in a very long (and unreadable) list of predictors. Machine Learning dimension-reduction approaches provide a solution: I ‘throw in’ all 134 indicators, and the underlying method shrinks the model to a smaller number of variables. The method may even force the coefficients of unimportant indicators to exactly zero, resulting in more interpretable output.
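To make the shrinkage idea concrete, here is a minimal sketch using the lasso, one common method of this kind. The article does not name a specific method, so the choice of the lasso is an assumption, as are the simulated data (134 made-up indicators, of which only the first four actually drive the outcome), the penalty value, and all function and variable names. It is written in plain Python for illustration, not as a production implementation.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return exactly 0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(columns, y, penalty, sweeps=50):
    """Cyclic coordinate descent for (1/2n)*||y - Xb||^2 + penalty*||b||_1.

    No intercept is fitted; the simulated data below are centred around zero.
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)  # equals y - Xb while all coefficients are zero
    scales = [sum(v * v for v in col) / n for col in columns]
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            old = coefs[j]
            # Correlation of column j with the residual, excluding j's own fit.
            rho = sum(v * r for v, r in zip(col, residual)) / n + scales[j] * old
            new = soft_threshold(rho, penalty) / scales[j]
            if new != old:
                delta = new - old
                residual = [r - delta * v for r, v in zip(residual, col)]
                coefs[j] = new
    return coefs


# Simulated (hypothetical) data: 134 indicators, only the first 4 matter.
rng = random.Random(0)
n_obs, n_indicators = 200, 134
true_coefs = [3.0, 2.0, 1.5, 1.0]
columns = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_indicators)]
y = [
    sum(b * columns[j][i] for j, b in enumerate(true_coefs)) + rng.gauss(0, 0.5)
    for i in range(n_obs)
]

coefs = lasso(columns, y, penalty=0.25)
selected = [j for j, b in enumerate(coefs) if b != 0.0]
print("indicators kept:", selected)
print("their coefficients:", [round(coefs[j], 2) for j in selected])
```

In this setup the penalty should set most of the 134 coefficients to exactly zero and keep the four informative indicators, with their coefficients shrunk somewhat toward zero. That shrinkage is the price paid for the shorter list of predictors, and in practice the penalty is usually chosen by cross-validation rather than fixed by hand.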
A second important feature of Machine Learning is its ability to account for non-linearities. Linear regression, as the name suggests, assumes a linear form, whereas some Machine Learning approaches drop this assumption to allow for complex non-linearities and interaction effects. Highly flexible models like these lose a large part of their interpretability, however. For this reason Machine Learning approaches are often referred to as black-box approaches: “I don’t know what is happening, I just know it works”. The question to ask then is: do I always need to know why demand for my products will go up next quarter, which could depend on a million things, or do I just want to know early enough to increase my production in time?
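The trade-off between flexibility and interpretability can be illustrated with a small simulation. This is only a sketch under stated assumptions: the U-shaped relationship, the noise level, and the choice of k-nearest-neighbours as the flexible method are all made up for illustration, and the function names are hypothetical.

```python
import random


def fit_ols(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return mean_y - slope * mean_x, slope


def knn_predict(x_train, y_train, x_new, k=5):
    """Average the outcomes of the k training points closest to x_new."""
    nearest = sorted(zip(x_train, y_train), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(target for _, target in nearest) / k


def mse(predictions, actual):
    """Mean squared prediction error."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actual)) / len(actual)


def simulate(rng, n):
    """Hypothetical U-shaped relationship: outcome = indicator^2 + noise."""
    x = [rng.uniform(-2, 2) for _ in range(n)]
    y = [v * v + rng.gauss(0, 0.2) for v in x]
    return x, y


rng = random.Random(1)
x_train, y_train = simulate(rng, 200)
x_test, y_test = simulate(rng, 100)

intercept, slope = fit_ols(x_train, y_train)
ols_mse = mse([intercept + slope * v for v in x_test], y_test)
knn_mse = mse([knn_predict(x_train, y_train, v) for v in x_test], y_test)

print("OLS slope:", round(slope, 3))
print("out-of-sample MSE, OLS:", round(ols_mse, 3))
print("out-of-sample MSE, k-NN:", round(knn_mse, 3))
```

The straight line cannot follow the U-shape, so its out-of-sample error should be far larger than that of the nearest-neighbour predictor. The linear model at least produces a slope that can be read and tested, whereas the nearest-neighbour predictor has no coefficient to interpret at all, which is the black-box property described above.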
Should Machine Learning then be applied in economics? Yes and no. As stated in the introduction, standard regression techniques focus on inference, and much of current economic research falls into this category: I want to know whether joining a union will increase my pension. For questions like these, highly flexible Machine Learning models do not necessarily perform better; they may even perform worse! But for actual prediction, your average OLS may just not cut it anymore. What to predict should then be carefully considered, and the range of possibilities is wide. For example, we can predict the duration of periods of high unemployment to implement employment strategies properly, or predict which neighborhoods are deteriorating to target municipal interventions.
There is much more to learn about the methodological properties and possibilities of Machine Learning in economics, but this requires a shift in thinking: from trying to identify a single causal relationship to asking how specific predictions can guide economic policy. So far, many academic papers have only attempted ‘simple’ tasks such as predicting inflation or recessions. Although these findings are important, predictions are perhaps better focused on specific situations, such as the unemployment example above, and linked to actual policy-making. There is light at the end of the tunnel: with the overall increase in adoption of Machine Learning techniques, it will probably not be long before we see large-scale applications of these methods in economics.