Description
Monitoring of computing cluster resources provides detailed real-time information about the status of the various components of computing nodes. However, such a tool shows the state of the CPU, RAM, software-defined storage, etc. only at the current moment. To evaluate the efficiency of a computing cluster, it is necessary to analyze statistics on the use of these components over time.
The paper presents a system developed for analyzing and visualizing resource usage statistics, based on data from the resources and traffic monitoring and analytics system of the heterogeneous HybriLIT platform. The collected data are stored in a PostgreSQL database at different levels of aggregation (week, month, year), which makes it possible to analyze them at different levels of detail. The analysis results are presented on a locally deployed Yandex DataLens BI platform as a set of informative charts and dashboards showing resource usage statistics for the heterogeneous HybriLIT platform.
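As an illustration of the aggregation levels mentioned above, the following is a minimal Python sketch and not the system's actual implementation. It assumes hypothetical usage samples of the form (timestamp, node name, CPU load in percent) and averages them per node into week, month, or year buckets. The names `bucket_key`, `aggregate`, and `SAMPLES`, as well as the sample values, are invented for this example; the real database schema and aggregation logic are not described in the abstract.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def bucket_key(ts: datetime, level: str) -> str:
    """Map a timestamp to a label for the requested aggregation level."""
    if level == "week":
        iso = ts.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if level == "month":
        return f"{ts.year}-{ts.month:02d}"
    if level == "year":
        return str(ts.year)
    raise ValueError(f"unknown aggregation level: {level}")


def aggregate(samples, level):
    """Average CPU load per (node, bucket) at the given level."""
    buckets = defaultdict(list)
    for ts, node, cpu_percent in samples:
        buckets[(node, bucket_key(ts, level))].append(cpu_percent)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical samples: (timestamp, node name, CPU load in percent).
SAMPLES = [
    (datetime(2024, 1, 1, 12), "node01", 40.0),
    (datetime(2024, 1, 3, 12), "node01", 60.0),
    (datetime(2024, 1, 10, 12), "node01", 80.0),
    (datetime(2024, 2, 5, 12), "node01", 20.0),
]

if __name__ == "__main__":
    for level in ("week", "month", "year"):
        print(level, aggregate(SAMPLES, level))
```

Coarser levels give a compact long-term view, while finer levels preserve short-term variation; in a PostgreSQL setting a comparable grouping could be expressed with `date_trunc`, though whether the described system does so is not stated.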
The developed system provides quantitative estimates of the efficiency of the various components of the computing cluster, which supports its optimization and improves its manageability. The system can also be used to analyze the operation of other high-performance computing systems.