Event index based correlation analysis for the JUNO experiment

Jul 6, 2021, 1:30 PM
407 or Online - https://jinr.webex.com/jinr/j.php?MTID=m573f9b30a298aa1fc397fb1a64a0fb4b

Sectional reports 9. Big data Analytics and Machine learning


Tao Lin (IHEP)


The Jiangmen Underground Neutrino Observatory (JUNO) experiment is designed primarily to determine the neutrino mass hierarchy and to measure oscillation parameters precisely by detecting reactor antineutrinos. The total event rate from the DAQ is about 1 kHz (roughly 86 million events per day), and the estimated volume of raw data is about 2 PB/year, whereas the rate of reactor antineutrino events is only about 60/day. One of the challenges for data analysis is therefore to select sparse physics signal events from a very large amount of data whose volume cannot be reduced by the traditional data streaming method. To speed up data analysis, a new correlated data analysis method based on event index data has been implemented. The index data contain the addresses of events in the original data files as well as all the information needed for event selection, and are produced during event pre-processing with JUNO's Sniper-based offline software. The index data are then filtered with refined selection criteria using Spark, which further reduces their volume. In the final stage of the analysis, only the events within the time window are loaded, using the event addresses stored in the index data. A performance study shows that this method achieves a 15-fold speedup compared with a correlation analysis that reads all the events. This contribution will introduce the detailed software design for event index based correlation analysis and present the performance measured with a prototype system.
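The three stages described in the abstract (build an index, filter the index, load only the correlated events by address) can be illustrated with a minimal Python sketch. It is not the JUNO implementation: it uses plain Python in place of Spark and the offline software, and the `IndexEntry` fields, the energy cut, the time window, and the in-memory `raw_files` dictionary standing in for the raw data files are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One index record (hypothetical layout, not the JUNO index format)."""
    file_id: int       # which raw data file holds the event
    entry: int         # position of the event inside that file
    time_ns: int       # event timestamp in nanoseconds
    energy_mev: float  # quantity used by the selection


def select_candidates(index, e_min, e_max):
    """Keep index entries passing a simple energy cut, sorted by time."""
    kept = [e for e in index if e_min <= e.energy_mev <= e_max]
    return sorted(kept, key=lambda e: e.time_ns)


def correlated_pairs(candidates, window_ns):
    """Pair each entry with later entries that follow it within window_ns."""
    times = [e.time_ns for e in candidates]
    pairs = []
    for i, first in enumerate(candidates):
        # One past the last entry whose time is <= first.time_ns + window_ns.
        stop = bisect_right(times, first.time_ns + window_ns)
        for later in candidates[i + 1:stop]:
            pairs.append((first, later))
    return pairs


def load_events(pairs, raw_files):
    """Fetch from the raw files only the events referenced by the pairs."""
    addresses = sorted({(e.file_id, e.entry) for pair in pairs for e in pair})
    return {addr: raw_files[addr[0]][addr[1]] for addr in addresses}


# Toy data: four indexed events spread over two "files" (assumed values).
index = [
    IndexEntry(0, 0, 1_000, 3.2),
    IndexEntry(0, 1, 150_000, 2.2),
    IndexEntry(0, 2, 5_000_000, 0.3),
    IndexEntry(1, 0, 9_000_000, 4.1),
]
raw_files = {0: ["evt-0-0", "evt-0-1", "evt-0-2"], 1: ["evt-1-0"]}

candidates = select_candidates(index, e_min=0.7, e_max=12.0)
pairs = correlated_pairs(candidates, window_ns=1_000_000)
events = load_events(pairs, raw_files)
print(len(events), "of", len(index), "events loaded")
```

In this toy run, only the two events that fall within one millisecond of each other are read from the raw data; the other two are rejected using the index alone, which is the source of the saving the abstract describes.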

Primary authors

Tao Lin (IHEP), Dr Yan Liu (IHEP), Prof. Weidong Li (IHEP), Dr Jiaheng Zou (IHEP)
