Development of dashboards for the workflow management system in the ATLAS experiment

5 Jul 2021, 16:15
15m
403 or Online - https://jinr.webex.com/jinr/j.php?MTID=mf93df38c8fbed9d0bbaae27765fc1b0f

Sectional reports 4. Distributed computing applications

Speaker

Aleksandr Alekseev (National Research Tomsk Polytechnic University)

Description

The UMA software stack, developed by the CERN-IT Monit group, provides the main repository of monitoring dashboards. Its adaptation to the ATLAS experiment began in 2018 to replace the old monitoring system, and many improvements and fixes have been made to UMA since then. One of the most substantial enhancements was the migration of aggregated-data storage from InfluxDB to Elasticsearch, which significantly reduced the execution time of queries over long time ranges. Many Grafana dashboards have been created and updated for various user groups and use cases to monitor the workflow management system and the computing infrastructure. “Jobs accounting”, “Jobs monitoring”, “Site-oriented” and “HS06 reports” are examples of dashboards regularly used by ATLAS users. This presentation gives an overview of the jobs dashboards in the ATLAS experiment.

Primary authors

Aleksandr Alekseev (National Research Tomsk Polytechnic University)
Dario Barberis (University and INFN Genova (Italy))
Thomas Beermann (Wuppertal University (DE))
