Accurately predicting sales is a challenge every retail company faces. Especially in light of the vast availability of data and of tools to evaluate it, firms at the cutting edge of forecasting technology can gain an invaluable competitive advantage. The Makridakis Competitions (also known as the “M Competitions”) are a series of forecasting tournaments led by forecasting researcher Spyros Makridakis. The competitions have been held since 1982 in five iterations; the sixth tournament’s final will be held in 2024. The M5 Competition, whose results were published in 2022, was the first M Competition to be based on real-life, hierarchical retail sales data. The 5,507 participating teams were tasked with predicting unit sales across several product categories for Walmart.

The teams’ results were compared against simple benchmark models based on exponential smoothing. Only about 12% of the teams managed to outperform all benchmarks, but those that did beat them by up to roughly 20%, whereas in previous competitions the improvement over the benchmarks peaked at around 10%. A particularly challenging aspect of M5 was forecasting days with zero demand, which motivated more advanced models such as those from the GAMLSS framework to handle overdispersion, an issue not specifically tackled in previous M Competitions.
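
For illustration, the snippet below is a minimal sketch of one such exponential smoothing benchmark applied to an intermittent (zero-heavy) demand series. The smoothing parameter, forecast horizon, and synthetic data are assumptions made for this example, not the competition’s actual benchmark configuration.

```python
# Minimal sketch of a simple exponential smoothing (SES) benchmark of the kind
# the M5 entries were compared against. Alpha, horizon, and the toy series are
# illustrative assumptions, not the competition's actual settings.
import numpy as np

def ses_forecast(y, alpha=0.1, horizon=28):
    """Simple exponential smoothing: the flat forecast is the last smoothed level."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return np.full(horizon, level)

# Toy daily unit-sales series with many zero-demand days (slow-moving SKU).
rng = np.random.default_rng(42)
sales = rng.poisson(lam=0.6, size=200)  # mostly zeros and ones

print(ses_forecast(sales, alpha=0.1, horizon=28)[:5])
```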

The most significant methodological evolution in the M5 Competition was, without a doubt, the dominance of machine learning (ML) driven models. In M4, two of the winning submissions were already ML-based, but they were built on statistical, series-specific features and were therefore not significantly more accurate than the median of four statistical models. In M5, however, all of the winning submissions leveraged “pure” ML models that significantly outperformed all statistical benchmarks. As in previous years, the findings again underline the efficacy of combining methods: a majority of the top 50 teams used hybrid approaches, with combinations of ML and statistical techniques proving particularly effective. A new facet of the M5 Competition was the success of cross-learning (training a single model on information from multiple series), which had been less successful in previous competitions because the series were only weakly correlated. External adjustments, in which information from higher aggregation levels is applied to lower-level forecasts, also improved accuracy; this had already been tried in the M2 Competition, where it failed to improve the accuracy of purely statistical models, so ML may be the key driver behind its recent success. Finally, keeping models adaptive through cross-validation (estimating how well a model will perform on new data by reusing the existing data) was a success factor in nearly all winning models.
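
To make the cross-learning and cross-validation ideas concrete, here is a minimal sketch rather than any winner’s actual pipeline: a single gradient-boosted tree model (scikit-learn’s HistGradientBoostingRegressor, standing in for the LightGBM-style models reportedly popular among top M5 entries) is trained on lag features pooled across many synthetic series and evaluated with a rolling-origin split. All data, feature choices, and parameters are illustrative assumptions.

```python
# Sketch of "cross-learning": one global gradient-boosted model trained on lag
# features pooled across many series, validated with a rolling origin.
# Synthetic data and parameters are illustrative, not the M5 winners' pipelines.
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(0)
n_series, n_days = 20, 400

# Synthetic intermittent daily sales for several items (stand-in for retail data).
frames = []
for item in range(n_series):
    demand = rng.poisson(lam=rng.uniform(0.5, 3.0), size=n_days)
    frames.append(pd.DataFrame({"item": item, "day": np.arange(n_days), "sales": demand}))
data = pd.concat(frames, ignore_index=True)

# Cross-learning: every item contributes rows to one shared training table,
# so a single model learns patterns across all series at once.
for lag in (1, 7, 28):
    data[f"lag_{lag}"] = data.groupby("item")["sales"].shift(lag)
data["dow"] = data["day"] % 7          # crude day-of-week feature
data = data.dropna()

features = ["item", "dow", "lag_1", "lag_7", "lag_28"]  # raw item id is a crude encoding
model = HistGradientBoostingRegressor(max_depth=6)

# Rolling-origin cross-validation: train on everything before a cutoff day,
# test on the next 28 days, then roll the cutoff forward.
scores = []
for cutoff in (300, 328, 356):
    train = data[data["day"] < cutoff]
    test = data[(data["day"] >= cutoff) & (data["day"] < cutoff + 28)]
    model.fit(train[features], train["sales"])
    pred = model.predict(test[features])
    rmse = float(np.sqrt(np.mean((pred - test["sales"].to_numpy()) ** 2)))
    scores.append(round(rmse, 3))
print("rolling-origin RMSEs:", scores)
```

Pooling the series is the appeal of cross-learning: patterns that are rare within a single series, such as long zero-demand stretches, can still be learned from the items that exhibit them more often.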

These advancements illustrate how forecasting methodologies are evolving as more complex data and new predictive modelling techniques become available. Overall, the findings from the M5 Competition underscore the importance of choosing the right model in retail forecasting and impressively demonstrate the competitive edge available to players at the winning end of the spectrum.

Paper: Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2022). M5 accuracy competition: Results, findings, and conclusions. International Journal of Forecasting.