2025
Journal Article
Title
Measuring time series heterogeneity for global learning
Abstract
Training forecasting models across the time series in a data set can improve accuracy over local training, where one model is trained per time series. Several studies have shown that performance can be improved further when the time series in a data set are clustered beforehand and one model is then trained per cluster. The clustering algorithms employed typically represent each time series by a set of time series features and minimize the variance of these features within the clusters. This assumes that lower variance of time series features within clusters is beneficial for learning. However, it is unclear how the variance of these time series features relates to the forecast performance of global models. We empirically evaluate this relationship for 13 time series features using Light Gradient Boosting Machines. Further, since there is a trade-off between cluster size and within-cluster variance, we evaluate the change in performance when more time series are pooled, i.e., when larger clusters are formed. We find that (i) learning across even a relatively small number of time series improves forecast accuracy by 8% to 13% over local learning, with diminishing effects when even more time series are grouped together, and (ii) the variance of some time series features is significantly positively or negatively related to forecast performance.
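The notion of within-cluster feature variance mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three features (mean, variance, lag-1 autocorrelation), the toy series, the cluster assignments, and the function names are hypothetical and are not the 13 features, the data, or the procedure used in the article. No clustering algorithm or forecasting model is included.

```python
from statistics import mean, pvariance


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation; returns 0.0 for a constant series."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    if denom == 0:
        return 0.0
    num = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, len(series)))
    return num / denom


def features(series):
    """Represent one series by a small, illustrative feature set."""
    return {
        "mean": mean(series),
        "variance": pvariance(series),
        "acf1": lag1_autocorrelation(series),
    }


def within_cluster_feature_variance(dataset, clusters):
    """For each cluster, compute the variance of every feature across its series."""
    result = {}
    for cluster_id, series_ids in clusters.items():
        feats = [features(dataset[s]) for s in series_ids]
        result[cluster_id] = {
            name: pvariance([f[name] for f in feats]) for name in feats[0]
        }
    return result


# Toy data (hypothetical): two trending series and one oscillating series.
dataset = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 3.0, 4.0, 5.0],
    "c": [10.0, 0.0, 10.0, 0.0],
}
# Hypothetical groupings: pooling everything versus a smaller, more homogeneous cluster.
clusters = {"pooled": ["a", "b", "c"], "trend_only": ["a", "b"]}

heterogeneity = within_cluster_feature_variance(dataset, clusters)
for cluster_id, feature_variances in heterogeneity.items():
    print(cluster_id, feature_variances)
```

In this toy setting, the larger pooled cluster shows higher variance in the variance and autocorrelation features than the smaller cluster, which mirrors the trade-off between cluster size and within-cluster variance described above.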
Keyword(s)