Development of JINR Tier-1 service monitoring system

11 Sept 2018, 14:45
15m
406B

Sectional reports 2. Operation, monitoring, optimization in distributed computing systems

Speaker

Mr Igor Pelevanyuk (JINR)

Description

The Tier-1 center for CMS at JINR has been operating successfully since 2015. Monitoring is an important part of ensuring its performance. Hardware monitoring was introduced when the center was built and has been upgraded continually along with it. The scientific community uses the center's resources through Grid services, which in turn depend on lower-level services. A dedicated monitoring system has been developed to track the state of all services related to Tier-1 operations. Its main objective is to collect data from different sources, process it, and provide a comprehensive overview on a web page. A mechanism was implemented to determine the status of each service by analyzing the collected data. The notion of an event was introduced so that the system can react to ongoing changes in any service. The whole system consists of core libraries and monitoring modules. A monitoring module can combine data collection, analysis, visualization, the set of possible statuses and events, and the reactions to those events. This makes it possible to build flexible monitoring modules which together form the Tier-1 service monitoring system.
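The module structure described above can be illustrated with a minimal Python sketch. It is not the JINR implementation: the class names (`MonitoringModule`, `TransferModule`, `Event`), the status set, the `failure_rate` metric and its thresholds are all hypothetical, chosen only to show how one module might bundle data collection, status analysis, events, and reactions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    """A status change of one monitored service (hypothetical structure)."""
    module: str
    old_status: Optional[str]
    new_status: str


class MonitoringModule:
    """Hypothetical base class: one module bundles collection, analysis, reactions."""

    name = "base"
    statuses = ("ok", "warning", "critical")  # assumed status set

    def __init__(self) -> None:
        self.status: Optional[str] = None
        self.reactions: List[Callable[[Event], None]] = []

    def collect(self) -> Dict[str, float]:
        """Fetch raw data from a source; implemented by each module."""
        raise NotImplementedError

    def analyze(self, data: Dict[str, float]) -> str:
        """Map collected data to one of the module's statuses."""
        raise NotImplementedError

    def run_once(self) -> Optional[Event]:
        """Collect, analyze, and emit an event only when the status changes."""
        new_status = self.analyze(self.collect())
        if new_status == self.status:
            return None
        event = Event(self.name, self.status, new_status)
        self.status = new_status
        for react in self.reactions:
            react(event)
        return event


class TransferModule(MonitoringModule):
    """Illustrative module fed from an in-memory list; thresholds are made up."""

    name = "transfers"

    def __init__(self, samples: List[float]) -> None:
        super().__init__()
        self._samples = iter(samples)

    def collect(self) -> Dict[str, float]:
        return {"failure_rate": next(self._samples)}

    def analyze(self, data: Dict[str, float]) -> str:
        rate = data["failure_rate"]
        if rate < 0.05:
            return "ok"
        if rate < 0.20:
            return "warning"
        return "critical"


def demo() -> List[str]:
    """Run three collection cycles and record the reactions that fired."""
    log: List[str] = []
    module = TransferModule([0.01, 0.02, 0.30])
    module.reactions.append(
        lambda e: log.append(f"{e.module}: {e.old_status} -> {e.new_status}")
    )
    for _ in range(3):
        module.run_once()
    return log
```

In this sketch a reaction fires only on a status change, so the second sample (still "ok") produces no event. A real system would likely replace the in-memory samples with queries to the actual data sources and add a visualization hook per module.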
