MODELING RETAIL SALES USING AUTOREGRESSIVE INTEGRATED MOVING AVERAGE AND LONG SHORT-TERM MEMORY FORECASTING METHODS

Oleksii Kachmar; Roman Shuvar; Igor Kolych

doi:10.30970/eli.30.8

Abstract

Background. Forecasting retail sales is crucial for modern supply chain and inventory management. Traditional statistical models alone can be insufficient due to the large amounts of data generated by extensive retail chains. Combining time series analysis with machine learning can improve forecast accuracy.

Materials and Methods. This research used the M5 Forecasting Accuracy dataset, which contains over 30,000 time series of daily store-item sales. Data preprocessing handled missing values and split each series into training and hold-out test sets. Three forecasting methods were applied. The first accounted for autoregressive and moving-average components. The second explicitly included trend and seasonality by decomposing the series into those components, fitting a model to the trend-adjusted series, and then reintroducing the seasonal part. The third was a long short-term memory (LSTM) deep learning regressor trained to capture longer-range dependencies. Forecasts were evaluated on the test set using the Mean Absolute Error (MAE). Residual analysis examined the autocorrelation and the distribution of the errors.
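The decomposition-based approach and the MAE evaluation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the series is synthetic rather than taken from M5, the names `weekly_decompose`, `forecast`, and `mae` are hypothetical, and the model fitted in the study is replaced here by simply holding the last trend level flat.

```python
from statistics import mean

def mae(actual, predicted):
    """Mean Absolute Error over paired observations."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])

def weekly_decompose(series, period=7):
    """Additive decomposition: series = trend + seasonal + remainder."""
    half = period // 2
    # Centered moving average, defined only where a full window fits.
    trend = {t: mean(series[t - half:t + half + 1])
             for t in range(half, len(series) - half)}
    # Average detrended value for each position in the weekly cycle.
    seasonal = [mean([series[t] - trend[t] for t in trend if t % period == k])
                for k in range(period)]
    # Center the seasonal component so it averages to zero.
    offset = mean(seasonal)
    return trend, [s - offset for s in seasonal]

def forecast(train, horizon, period=7):
    """Extend the trend, then reintroduce the seasonal part."""
    trend, seasonal = weekly_decompose(train, period)
    # Assumption: a flat extension of the last trend value stands in
    # for the fitted autoregressive model used in the study.
    level = trend[max(trend)]
    n = len(train)
    return [level + seasonal[(n + h) % period] for h in range(horizon)]

# Synthetic daily sales with a fixed weekly pattern (illustrative only).
pattern = [0, 1, 2, 3, 5, 9, 8]
series = [20 + pattern[t % 7] for t in range(35)]
train, test = series[:28], series[28:]   # hold out the final week

pred = forecast(train, horizon=len(test))
baseline = [mean(train)] * len(test)     # flat forecast that ignores the cycle

print("MAE, decomposition forecast:", mae(test, pred))
print("MAE, flat baseline:", mae(test, baseline))
```

On this noise-free toy series the decomposition forecast recovers the hold-out week, while the flat baseline misses the weekly cycle. Real sales data would leave a remainder for the fitted model and the residual analysis to deal with.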

Results and Discussion. Analysis of a single item revealed a strong weekly cycle. The first autoregressive approach, which had no explicit seasonality, captured the data only partially and left significant autocorrelation in the residuals. The second autoregressive variant, which incorporated trend and weekly seasonal decomposition, achieved the best short-term predictive accuracy, as reflected by a lower MAE. The deep learning regressor, implemented in a recursive multi-step setup, did not outperform the autoregressive models, partly because of error accumulation and possibly because of a suboptimal choice of architecture.
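The error accumulation attributed to the recursive multi-step setup can be illustrated with a toy example. The sketch below is not the study's network: `recursive_forecast` is a hypothetical helper, and `biased_one_step` is an assumed stand-in for a trained one-step regressor that overshoots by 1%. Each prediction is fed back as input, so the small one-step bias compounds over the horizon.

```python
def recursive_forecast(one_step, history, horizon, window):
    """Forecast `horizon` steps by feeding each prediction back as input."""
    buf = list(history[-window:])
    out = []
    for _ in range(horizon):
        y_hat = one_step(buf)
        out.append(y_hat)
        # The prediction takes the place of the not-yet-observed actual value.
        buf = buf[1:] + [y_hat]
    return out

def biased_one_step(window_values):
    # Hypothetical stand-in for a trained one-step model: 1% overshoot.
    return 1.01 * window_values[-1]

# Toy series: constant sales of 100 units per day (illustrative only).
history = [100.0] * 14
actual = [100.0] * 7

recursive = recursive_forecast(biased_one_step, history, horizon=7, window=7)
errors = [abs(a - p) for a, p in zip(actual, recursive)]

for step, err in enumerate(errors, start=1):
    print(f"step {step}: absolute error {err:.3f}")
```

With true inputs at every step the same model would be off by about one unit each day. In the recursive loop the absolute error grows with every step, which is the behaviour that direct or sequence-to-sequence multi-step configurations aim to avoid.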

Conclusion. The study indicates that for retail data with clear weekly fluctuations, autoregressive moving-average models enhanced by trend and seasonal decomposition can provide robust forecasts. Neural network methods can model non-linearities but require more specialized sequence-to-sequence configurations to avoid cumulative forecast errors. Future work can involve combining methods for multi-horizon and hierarchical retail time series.

Keywords: Time series analysis, machine learning, retail forecasting, ARIMA, LSTM, seasonality

Electronics and information technologies / Електроніка та інформаційні технології
