Hyperconverged multi-layered system of processing and storing data from super-hot to super-cold on the “Govorun” supercomputer

10 Nov 2020, 14:15
15m
Oral Information Technology Information Technologies

Speaker

Mr Maxim Zuev

Description

The “Govorun” supercomputer is currently used to solve a variety of tasks facing JINR. One of the main tasks is the modeling of physical events for the NICA megaproject. Such tasks involve working with large volumes of simulated data, amounting to hundreds of terabytes. To speed up the processing of large data arrays, a hierarchical hyperconverged data processing and storage system with a software-defined architecture was implemented on the “Govorun” supercomputer. The system is divided by data access speed into layers that the user can choose between: a super-hot layer based on Intel Optane; a hot layer based on Intel NVMe SSDs managed by the Lustre file system; a warm layer implemented as “an on-demand storage system”, which can be managed by different user-defined file systems; and a cold layer implemented on HDDs, which offers sufficient capacity for data storage but does not meet the peak requirements of a computational task. Each layer of the developed storage system can be used both independently and as part of data processing workflows. Notably, part of the cold storage is managed by the geographically distributed EOS file system, which makes it possible to connect the data processing and storage system implemented on the “Govorun” supercomputer to geographically distributed storage, the so-called DataLakes. The super-cold layer is tape storage. The implemented hierarchical system provides low data access time and a data read/write speed of 300 Gb/s. The DIRAC software is currently used to manage jobs and the reading, writing, and processing of data across different types of storage and file systems.
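The layer hierarchy described above can be illustrated with a minimal Python sketch. The layer names, media, and file systems are taken from the description. The `Layer` class, the `layer_index` and `staging_path` helpers, and the layer-by-layer staging policy are hypothetical and introduced only for illustration. They do not reflect how DIRAC or the “Govorun” system actually moves data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Layer:
    name: str
    medium: str
    managed_by: Optional[str]  # None where the description names no file system


# Ordered from fastest to slowest access, following the description.
LAYERS = (
    Layer("super-hot", "Intel Optane", None),
    Layer("hot", "Intel SSD NVMe", "Lustre"),
    Layer("warm", "on-demand storage system", "user-defined file system"),
    Layer("cold", "HDD", "EOS (part of the layer)"),
    Layer("super-cold", "tape", None),
)


def layer_index(name: str) -> int:
    """Return the position of a layer in the hot-to-cold ordering."""
    for i, layer in enumerate(LAYERS):
        if layer.name == name:
            return i
    raise ValueError(f"unknown layer: {name!r}")


def staging_path(source: str, target: str) -> List[str]:
    """List the layers visited when moving data one layer at a time.

    Assumption: a step-by-step staging policy. The real system may
    copy data directly between any two layers.
    """
    start, end = layer_index(source), layer_index(target)
    step = 1 if end >= start else -1
    return [LAYERS[i].name for i in range(start, end + step, step)]
```

For example, under this assumed policy `staging_path("cold", "hot")` yields the cold, warm, and hot layers in that order, which corresponds to bringing archived data up to the Lustre-managed layer before a compute job starts.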
Thanks to the high performance of the described system, over 140 million events modeling collisions of heavy ions at different energies for the MPD experiment were generated and reconstructed for the NICA megaproject over the past year.
This work was supported by the RFBR special grant “Megascience – NICA”, No. 18-02-40101.
