11th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2025)

Start: 2025-07-07T09:00:00+03:00
End: 2025-07-11T18:00:00+03:00

7–11 Jul 2025

Europe/Moscow timezone

Support

grid2025@jinr.ru

Anticipating Data Demand in HEP: A Transformer Approach

8 Jul 2025, 18:00

15m

Room 420

Sectional talk

Computing for MegaScience Projects

Mikhail Shubin (Lomonosov Moscow State University)

Modern high-energy physics (HEP) experiments generate and store vast volumes of data, which users access in complex and irregular patterns. Efficient data management in such environments requires accurate forecasts of dataset popularity to optimize storage, caching, and data distribution strategies. In this work, we propose an approach to predicting future dataset access patterns using transformer-based deep learning models. By leveraging historical logs of user interactions with HEP datasets, our method captures temporal dependencies and contextual signals to forecast both short- and medium-term data demand.
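As an illustration of how historical access logs might be turned into training examples for a sequence model, the following is a minimal sketch and not the authors' pipeline. The helper names `daily_counts` and `make_windows`, the daily granularity, the window lengths, and the sample log are all assumptions made for this example.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(access_dates, start, end):
    """Count accesses per calendar day from start to end (inclusive).

    Days with no accesses are filled with 0 so the series has no gaps.
    """
    counts = Counter(access_dates)
    n_days = (end - start).days + 1
    return [counts.get(start + timedelta(days=i), 0) for i in range(n_days)]


def make_windows(series, history, horizon):
    """Slide over the series and emit (past, future) pairs.

    `past` holds `history` consecutive days and `future` holds the
    `horizon` days that immediately follow them.
    """
    pairs = []
    for i in range(len(series) - history - horizon + 1):
        past = series[i:i + history]
        future = series[i + history:i + history + horizon]
        pairs.append((past, future))
    return pairs


# Hypothetical access log for one dataset: one date per access event.
log = [date(2025, 1, 1), date(2025, 1, 1), date(2025, 1, 3), date(2025, 1, 6)]

series = daily_counts(log, date(2025, 1, 1), date(2025, 1, 7))
windows = make_windows(series, history=4, horizon=2)

print(series)   # daily access counts for 1-7 January
print(windows)  # (past, future) pairs a forecasting model could train on
```

Each `past` window would serve as model input and each `future` window as the forecasting target; short- and medium-term forecasts would correspond to different choices of `horizon`.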

We evaluate our approach on real HEP access logs and compare its accuracy against previously used methods, including Facebook Prophet, Random Forest, and LSTM. Our results suggest that transformer architectures are a powerful tool for proactive data management in large-scale scientific computing environments. Although the method is demonstrated on data access patterns from user analysis, it is equally applicable to forecasting the popularity of production data.

Additionally, we implement a custom evaluation metric that compares the total number of predicted accesses with the total number of actual future accesses, rather than relying on traditional day-by-day accuracy metrics.
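The abstract does not give the exact formula for this metric, so the following is only a sketch of one plausible form: the relative difference between the summed predictions and the summed actual accesses. The function names, the normalization, the zero-total handling, and the sample numbers are assumptions for illustration.

```python
def daily_mae(actual, predicted):
    """Traditional day-by-day metric: mean absolute error per day."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def total_access_error(actual, predicted):
    """Aggregate metric (assumed form): relative difference of the totals.

    Only the sums over the forecast window are compared, so a forecast
    that places accesses on the wrong day is not penalized as long as
    the total volume is right.
    """
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    total_actual = sum(actual)
    total_predicted = sum(predicted)
    if total_actual == 0:
        # Assumed convention: perfect if nothing was predicted, else unbounded.
        return 0.0 if total_predicted == 0 else float("inf")
    return abs(total_predicted - total_actual) / total_actual


# Hypothetical five-day window: the forecast has the right volume
# but is shifted in time relative to the actual accesses.
actual = [0, 10, 0, 0, 5]
predicted = [0, 0, 10, 5, 0]

mae = daily_mae(actual, predicted)
total_err = total_access_error(actual, predicted)
print(mae, total_err)
```

In this example the day-by-day error is large while the aggregate error is zero, which shows why a total-based metric may suit storage and caching decisions that depend on overall demand more than on exact timing.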

Maria Grigorieva (Moscow State University)
Mikhail Shubin (Lomonosov Moscow State University)
Nina Popova (Lomonosov Moscow State University)

shubin-grid2025-pres.pdf
