BigData tools for the monitoring of the ATLAS EventIndex

11 Sept 2018, 16:15
15m
406B

Sectional reports 2. Operation, monitoring, optimization in distributed computing systems

Speakers

Mr Andrei Kazymov (JINR), Mr Evgeny Alexandrov (JINR)

Description

The ATLAS EventIndex collects event information from data at both CERN and Grid sites. It stores the results in Hadoop and provides access to them through web services. Successful operation depends on a number of different components, all of which must be monitored constantly to keep the system running. Each component has its own set of parameters and states and requires a dedicated monitoring approach. A scheduler runs monitoring tasks, which gather information by various methods: querying databases, web sites and storage systems, parsing logs, and using CERN host monitoring services. The information is then fed to Grafana dashboards via InfluxDB. This platform provides much faster performance and greater flexibility than the previously used Kibana system.
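
As an illustration of the pipeline described above, the following minimal Python sketch shows how a scheduled monitoring task might turn per-component check results into InfluxDB line protocol records, which Grafana can then query. It is not the authors' implementation: the measurement name `eventindex_status`, the `component` tag, the `up` and `error` fields, and the `run_checks` and `to_line_protocol` helpers are all hypothetical names chosen for this example, and the actual schema is not given in the abstract. Sending the resulting lines to an InfluxDB server is left out.

```python
from typing import Callable, Dict, List, Union

FieldValue = Union[int, float, bool, str]


def _escape_tag(text: str) -> str:
    """Escape commas, equals signs and spaces in tag keys/values and field keys."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value: FieldValue) -> str:
    """Render a field value using line protocol type conventions."""
    if isinstance(value, bool):  # must be tested before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, FieldValue], timestamp_ns: int) -> str:
    """Build one record: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("at least one field is required")
    head = measurement.replace(",", "\\,").replace(" ", "\\ ")
    for key in sorted(tags):
        head += f",{_escape_tag(key)}={_escape_tag(tags[key])}"
    body = ",".join(f"{_escape_tag(k)}={_format_field(v)}"
                    for k, v in sorted(fields.items()))
    return f"{head} {body} {timestamp_ns}"


def run_checks(checks: Dict[str, Callable[[], Dict[str, FieldValue]]],
               timestamp_ns: int) -> List[str]:
    """Run every component check once and return one record per component.

    A check that raises is reported as down (hypothetical 'up' field set to
    false) instead of aborting the whole monitoring pass.
    """
    lines = []
    for component, check in sorted(checks.items()):
        try:
            fields = dict(check())
            fields["up"] = True
        except Exception as exc:
            fields = {"up": False, "error": str(exc)}
        lines.append(to_line_protocol("eventindex_status",
                                      {"component": component},
                                      fields, timestamp_ns))
    return lines
```

In this sketch each component supplies its own check function, so components with entirely different parameters can share one scheduler loop and one output format; a real deployment would call `run_checks` periodically and write the returned lines to InfluxDB.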
