Multi-GPU training and parallel CPU computing for the machine learning experiments using Ariadne library

8 Jul 2021, 13:45
15m
403 or Online - https://jinr.webex.com/jinr/j.php?MTID=mf93df38c8fbed9d0bbaae27765fc1b0f

5. High Performance Computing (HPC)

Speaker

Egor Shchavelev (Saint Petersburg State University)

Description

Modern machine learning (ML) tasks and neural network (NN) architectures require substantial GPU computing resources and highly parallel CPU data preprocessing. At the same time, the Ariadne library, which aims to solve complex high-energy physics tracking tasks with the help of deep neural networks, lacks multi-GPU training and efficient parallel data preprocessing on the CPU.
In this work, we present our approach to multi-GPU training in the Ariadne library. We describe efficient data caching, parallel CPU data preprocessing, and a generic ML experiment setup for prototyping, training, and inference of deep neural network models. Speed-up and performance results for the existing neural network approaches, obtained on the GOVORUN computing resources, are presented.
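The abstract names the main ingredients of the approach: distributed multi-GPU training and worker-based parallel CPU preprocessing. As a rough illustration only, the sketch below shows how such a setup is commonly wired with PyTorch Lightning's distributed data parallel (DDP) support; the dataset, the toy model, the class names, and the GPU count are assumptions for the example and do not reflect Ariadne's actual code or API.

    # Illustrative sketch (not Ariadne's actual API): multi-GPU training with
    # PyTorch Lightning DDP and parallel CPU preprocessing via DataLoader workers.
    import torch
    from torch import nn
    from torch.utils.data import DataLoader, Dataset
    import pytorch_lightning as pl


    class HitsDataset(Dataset):
        """Hypothetical dataset of preprocessed tracker hits loaded from a cache."""
        def __init__(self, cached_tensors):
            self.x, self.y = cached_tensors

        def __len__(self):
            return len(self.x)

        def __getitem__(self, idx):
            # Per-item work here runs in DataLoader worker processes (CPU parallelism).
            return self.x[idx], self.y[idx]


    class TrackNet(pl.LightningModule):
        """Toy stand-in for a tracking model; the real Ariadne models differ."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, x):
            return self.net(x)

        def training_step(self, batch, batch_idx):
            x, y = batch
            loss = nn.functional.mse_loss(self(x), y)
            self.log("train_loss", loss)
            return loss

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)


    if __name__ == "__main__":
        # Dummy cached tensors standing in for preprocessed events.
        data = (torch.randn(10_000, 3), torch.randn(10_000, 1))
        loader = DataLoader(HitsDataset(data), batch_size=256,
                            num_workers=8,        # parallel CPU data loading
                            pin_memory=True)
        trainer = pl.Trainer(accelerator="gpu", devices=4,  # e.g. 4 GPUs on one node
                             strategy="ddp", max_epochs=10)
        trainer.fit(TrackNet(), loader)

With DDP, each GPU runs its own process and receives a shard of every batch, so the effective batch size scales with the number of devices; the DataLoader workers keep preprocessing off the training processes' critical path.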

Primary authors

Egor Shchavelev (Saint Petersburg State University)
Gennady Ososkov (Joint Institute for Nuclear Research)
Pavel Goncharov (Sukhoi State Technical University of Gomel, Gomel, Belarus)
Anastasiia Nikolskaia
Ekaterina Rezvaya
Daniil Rusov

Presentation materials

There are no materials yet.