Estimating security betas via machine learning

HFRC Working Paper Series | Version 10/2021

Abstract

This paper evaluates the predictive performance of machine learning techniques in estimating time-varying betas of US stocks. Compared to established estimators, tree-based models and neural networks outperform from both a statistical and an economic perspective, with random forests performing best overall. Machine learning-based estimators provide the lowest forecast errors. Moreover, unlike traditional approaches, they lead to portfolios that are truly market-neutral ex post. The inherent model complexity varies strongly over time. The most important predictors are various historical betas as well as fundamental turnover and size signals. Allowing for interactions and nonlinear effects substantially enhances predictive performance relative to linear regressions.