Accounting and monitoring infrastructure for Distributed Computing in the ATLAS experiment

9 Jul 2021, 09:00
45m
Conference Hall

Conference Hall, 5th floor
Plenary reports, 4. Distributed computing applications

Speaker

Mr Aleksandr Alekseev (Ivannikov Institute for System Programming of the RAS)

Description

The ATLAS experiment uses various tools to monitor and analyze the metadata of its main distributed computing applications. One of these tools is based entirely on the Unified Monitoring Architecture (UMA) provided by the CERN-IT Monit group. UMA uses open-source solutions such as Kafka, InfluxDB, Elasticsearch, Kibana and Grafana to collect, store and visualize metadata produced by the data and workflow management systems. This software stack has been adapted for the ATLAS experiment and allows dedicated monitoring and accounting dashboards to be developed in the Grafana visualization environment. This contribution presents the current state of the monitoring infrastructure and an overview of the core monitoring and accounting dashboards in ATLAS.

Primary authors

Mr Aleksandr Alekseev (Ivannikov Institute for System Programming of the RAS); Dario Barberis (University and INFN Genova (Italy)); Thomas Beermann (Wuppertal University (DE))
