29 October 2023 to 3 November 2023
DLNP, JINR
Europe/Moscow timezone

Development of Monitoring Service for BM@N information systems

1 Nov 2023, 15:35
15m
Conference Hall, opposite the main building of the DLNP

Oral Information Technology

Speaker

Olga Nemova (MIPT)

Description

The software infrastructure of the BM@N experiment includes a set of information systems that are essential for working with experimental and simulated data at all processing stages, including collection, storage, intermediate processing and physics analysis. Examples include the Electronic Logbook Platform, the Condition Database and the Event Metadata System. If one of these systems stops functioning, work with BM@N data by collaboration members becomes either impossible or, at least, much less productive. Timely detection of failures in these systems, whether caused by software or hardware faults, is therefore important. The Monitoring Service described in the report checks the availability and health status of the information systems. This includes measuring, storing and visualizing monitored parameters and sending alert notifications on them. The parameters include CPU, memory and disk utilization, DBMS functioning parameters, response times of databases and API endpoints, and ping round-trip times, among others. The current implementation of the BM@N monitoring service is discussed in detail. The related task of building highly available information services is also briefly noted.
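
To make the idea of a response-time check with alerting concrete, the following is a minimal sketch in Python. It is not the BM@N Monitoring Service implementation, which the abstract does not describe at code level. The names `CheckResult`, `http_probe`, `run_check` and `collect_alerts`, as well as the one-second threshold, are assumptions introduced purely for illustration.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CheckResult:
    """Outcome of one availability check (hypothetical structure)."""
    name: str
    ok: bool
    response_time_s: float
    error: Optional[str] = None


def http_probe(url: str, timeout_s: float = 5.0) -> Callable[[], None]:
    """Build a probe for an HTTP endpoint.

    urlopen raises on connection errors, timeouts and HTTP 4xx/5xx
    responses, so a normal return means the endpoint answered.
    """
    def probe() -> None:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read(1)
    return probe


def run_check(name: str, probe: Callable[[], None],
              max_response_time_s: float = 1.0) -> CheckResult:
    """Run a probe, measure how long it takes, and classify the result."""
    start = time.perf_counter()
    try:
        probe()
    except Exception as exc:  # any failure of the probe counts as "down"
        elapsed = time.perf_counter() - start
        return CheckResult(name, False, elapsed, str(exc))
    elapsed = time.perf_counter() - start
    if elapsed > max_response_time_s:
        return CheckResult(name, False, elapsed,
                           "response time above threshold")
    return CheckResult(name, True, elapsed)


def collect_alerts(results: List[CheckResult]) -> List[str]:
    """Turn failed checks into alert messages (here: plain strings)."""
    return [
        f"ALERT {r.name}: {r.error} ({r.response_time_s:.3f} s)"
        for r in results
        if not r.ok
    ]
```

In such a sketch, a scheduler would call `run_check` periodically for each monitored system, for example with `http_probe` pointed at a service's API endpoint, store the resulting `CheckResult` values as a time series for visualization, and pass the messages from `collect_alerts` to a notification channel. Separating the probe from the timing logic means the same `run_check` could wrap a database query or a ping as well as an HTTP request.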

Primary author

Olga Nemova (MIPT)

Co-authors

Konstantin Gertsenberger (JINR) Peter Klimai (INR RAS)
