11th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2025)

Name: 11th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2025)
Start: 2025-07-07T09:00:00+03:00
End: 2025-07-11T18:00:00+03:00

7–11 Jul 2025

Europe/Moscow timezone

Support

grid2025@jinr.ru

Data Shift Problem in Machine Learning for Particle Identification

8 Jul 2025, 14:45

15m

Room 310

Sectional talk Methods and Technologies for Experimental Data Processing

Vladimir Papoyan (JINR & AANL)

Particle identification (PID) is an essential step in the data analysis workflow of high-energy physics experiments. Over the last ten years, machine learning approaches have become widely used in high-energy physics in general and in PID in particular, because conventional PID algorithms perform poorly in the high-momentum range. However, because experimental data lack ground-truth labels, classifiers must be trained on Monte Carlo (MC) simulations. This creates a fundamental challenge: differences between the simulated and real data distributions, known as data shift, can significantly affect model generalization and performance. The impact of data shift was explored by comparing particle classification results across several MC datasets generated with different simulation settings, and the variation of the distributions of key features (momentum, energy, velocity, mass squared) between simulations was analyzed. The results highlight the need to carefully validate and adapt machine learning models to ensure reliable performance on data with potentially shifted distributions, especially in scenarios where real labels are unavailable.
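As an illustration of how a shift in a single feature distribution between two simulated samples might be quantified, the sketch below computes a two-sample Kolmogorov-Smirnov statistic in plain Python. It is not taken from the talk: the choice of the KS statistic, the helper names `ks_statistic` and `make_sample`, and the toy Gaussian "mass squared" values are all assumptions made for the example.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs,
    evaluated at every point of the pooled sample.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def make_sample(rng, n, mean, sigma):
    # Hypothetical toy "simulation": one smeared mass-squared value per track.
    # A real MC sample would come from a detector simulation, not a Gaussian.
    return [rng.gauss(mean, sigma) for _ in range(n)]


rng = random.Random(42)
reference = make_sample(rng, 2000, 0.88, 0.10)  # toy simulation setting A
same = make_sample(rng, 2000, 0.88, 0.10)       # setting A again, independent draw
shifted = make_sample(rng, 2000, 0.93, 0.13)    # toy simulation setting B

ks_same = ks_statistic(reference, same)
ks_shifted = ks_statistic(reference, shifted)

print(f"KS(reference, same settings)    = {ks_same:.3f}")
print(f"KS(reference, shifted settings) = {ks_shifted:.3f}")
```

Under these assumptions, the statistic for the shifted sample should come out clearly larger than for the independent draw with identical settings. The same comparison could be repeated per feature (momentum, energy, velocity, mass squared) to see which inputs differ most between simulation settings.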


PID_GRID2025_PapoyanVV.pdf
