Semantic-Enhanced Demand Forecasting: A Multimodal Transformer Integrating Product Descriptions and Customer Purchase History

Authors

F. Zare Baghiabad

https://doi.org/10.48314/anowa.v1i4.58

Abstract

Accurate forecasting of daily demand at the individual customer-product level remains a critical yet challenging problem in retail, hindered by data sparsity, volatile consumer behavior, and the underutilization of unstructured product information. This study addresses these challenges by proposing a Multimodal Semantic Transformer (MST) framework that integrates semantic product embeddings, derived by applying Natural Language Processing (NLP) to product descriptions, with structured customer purchase history and multi-scale temporal features. Using the Online Retail II dataset, the model was evaluated against benchmarks including LSTM, Gradient Boosting Machines, and a unimodal Transformer. The results show that the MST framework significantly outperforms all benchmarks, achieving a 15.5% reduction in Mean Squared Error (MSE) compared to the best baseline. Key findings confirm that semantic fusion provides a crucial signal for sparsely purchased products and that temporal embeddings with dynamic attention conditioning are essential for modeling complex seasonality and context. The study concludes that deep multimodal integration is an effective approach to granular demand forecasting, offering a scalable and interpretable solution that supports hyper-personalized inventory management and more resilient, customer-centric retail supply chains.
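
To make the headline figure concrete, the relation below is a minimal sketch that assumes the 15.5% reduction is measured relative to the best baseline's MSE, which is the usual convention but is not spelled out in the abstract. The symbols $\mathrm{MSE}_{\text{MST}}$, $\mathrm{MSE}_{\text{base}}$, $y_i$, $\hat{y}_i$, and $N$ are illustrative and are not notation taken from the paper.

```latex
% MSE over N customer-product-day observations (standard definition)
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2

% Relative reduction versus the best baseline (assumed interpretation)
\frac{\mathrm{MSE}_{\text{base}} - \mathrm{MSE}_{\text{MST}}}{\mathrm{MSE}_{\text{base}}} = 0.155
\quad\Longleftrightarrow\quad
\mathrm{MSE}_{\text{MST}} = 0.845 \, \mathrm{MSE}_{\text{base}}
```

Under this reading, the MST model's squared error is about 84.5% of the best baseline's. The abstract does not report the absolute MSE values.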

Keywords:

Multimodal transformer, Demand forecasting, Semantic integration, Natural language processing embeddings, Retail analytics

Published

2025-12-05

How to Cite

Zare Baghiabad, F. (2025). Semantic-Enhanced Demand Forecasting: A Multimodal Transformer Integrating Product Descriptions and Customer Purchase History. Annals of Optimization With Applications, 1(4), 221–233. https://doi.org/10.48314/anowa.v1i4.58