### Description

Population annealing is a promising approach for large-scale simulations because it is potentially scalable on any parallel architecture. We report an implementation of the algorithm on a hybrid program architecture combining CUDA and MPI [1]. The challenge is to keep all general-purpose graphics processing unit (GPGPU) devices as busy as possible by efficiently redistributing replicas. We...

Modern machine learning (ML) tasks and neural network (NN) architectures require large amounts of GPU computing resources and demand high CPU parallelization for data preprocessing. At the same time, the Ariadne library, which aims to solve complex high-energy physics tracking tasks with the help of deep neural networks, lacks multi-GPU training and efficient parallel data preprocessing on...

Modeling the spread of viruses is an urgent task today. In our model, contacts between people are represented as a Watts-Strogatz graph. We studied graphs with tens of thousands of vertices over a simulation period of six months. The paper proposes methods for accelerating computations on graph models using graphics processors. In the considered problem,...
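For readers unfamiliar with the Watts-Strogatz construction mentioned in this abstract, the following is a minimal CPU-only sketch in Python. It is not the authors' code: the function name `watts_strogatz` and the parameters `n`, `k`, `p`, and `seed` are illustrative assumptions, and the rewiring loop is written for clarity rather than for graphs with tens of thousands of vertices.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return a small-world contact graph as a dict mapping vertex -> set of neighbours.

    n: number of vertices (people); k: even number of ring neighbours per vertex;
    p: probability of rewiring each ring edge. All names are hypothetical.
    """
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    # Step 1: ring lattice, each vertex linked to k/2 neighbours on each side.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)

    # Step 2: with probability p, replace the lattice edge (v, u) by (v, w),
    # where w is a vertex that is neither v nor already a neighbour of v.
    for step in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + step) % n
            if rng.random() >= p:
                continue
            candidates = [w for w in range(n) if w != v and w not in adj[v]]
            if not candidates:
                continue
            w = rng.choice(candidates)
            adj[v].discard(u)
            adj[u].discard(v)
            adj[v].add(w)
            adj[w].add(v)
    return adj
```

Rewiring replaces one edge with another, so the total number of contacts stays at `n * k / 2`; only the distribution of short-range and long-range contacts changes with `p`.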

Empirical studies have repeatedly shown that in High-Performance Computing (HPC) systems, the user's resource estimation lacks accuracy [1]. Resource underestimation may cause the job to be killed at any stage of the computation, wasting the resources already allocated. Resource overestimation wastes resources as well. SLURM, a widely used job scheduler, has a mechanism to...

Explicit numerical methods are used to solve and simulate a wide range of mathematical problems whose origins can be mathematical models of physical conditions. However, simulations with large model spaces can require a tremendous number of floating-point calculations, and run times of several months or more are possible even on large HPC systems.

The vast majority of HPC systems in the field...

The ROOT software package plays a central role in high-energy analytics and is being upgraded in several ways to improve processing performance. In this paper, we consider several tools implemented in this framework for calculations on modern heterogeneous computing architectures.

PROOF (Parallel ROOT Facility, an extension of the ROOT system) uses the natural parallelism of data structures...

Obtaining a long-term reference trajectory on the chaotic attractor of the coupled Lorenz system is a difficult task because of the sensitive dependence on initial conditions. Using standard double-precision floating-point arithmetic, we cannot obtain a reference solution longer than 2.5 time units. Combining OpenMP parallel technology with the GMP library (GNU multiple precision...

In this talk, we discuss the optimal strategy for a parallel matrix-matrix multiplication algorithm that minimizes the time-to-solution by finding the best parameters of the algorithm for overlapping multiplications of separate tiles in each GPU with data transfers between GPUs. The new algorithm developed for multi-GPU nodes is discussed [1]. The correlation is analyzed between the optimal...
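To make the notion of multiplying separate tiles concrete, here is a minimal single-process sketch of tiled matrix-matrix multiplication in pure Python. It is only an illustration under stated assumptions: the function name `tiled_matmul` and the `tile` parameter are hypothetical, and the sketch does not model GPUs, inter-GPU transfers, or the overlap of computation with communication that the talk addresses.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (n x m) and b (m x p), given as lists of rows, tile by tile."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    c = [[0.0] * p for _ in range(n)]

    # Outer loops walk over tiles; in a multi-GPU setting each (i0, j0) output
    # tile is a natural unit of work that could be assigned to a device.
    for i0 in range(0, n, tile):
        for j0 in range(0, p, tile):
            for k0 in range(0, m, tile):
                # Inner loops accumulate one tile-by-tile product into c.
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, m)):
                        a_ik = a[i][k]
                        for j in range(j0, min(j0 + tile, p)):
                            c[i][j] += a_ik * b[k][j]
    return c
```

The tile size here plays the role of a tunable parameter: the result is the same for any `tile`, but the granularity of independent work units changes.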

Fault tolerance of parallel and distributed applications is a concern that becomes pressing for large computer clusters and large distributed systems. For a long time, the common solution to this problem was checkpoint and restart mechanisms implemented at the operating system level. However, they are inefficient for large systems and now application-level checkpoint and restart is...

The report gives an overview of two information systems (IS) under development on the basis of the HybriLIT platform. The major goal of creating these ISs is to automate calculations, as well as to ensure data storage and analysis for different research groups.

The information system for radiobiological research provides tools for storing experimental data of different types, a software set...

The HybriLIT heterogeneous computing platform is part of the Multifunctional Information and Computing Complex of the M.G. Meshcheryakov Laboratory of Information Technologies of the Joint Institute for Nuclear Research. Data on the use of the HybriLIT platform were analyzed, with particular attention paid to information on the resources used when launching jobs...

Due to the rapid development of high-performance computing systems, increasingly complex and time-consuming computer simulations can be carried out. This opens new opportunities for scientists and engineers. It is now common for a scientific group to have its own in-house research software, heavily optimized and adapted to a very narrow scientific problem. The main disadvantage...

In the wake of the success of the integration of the Titan supercomputer into the ATLAS computing infrastructure, the number of such projects began to increase. However, it turned out that it is extremely difficult to ensure efficient data processing on such types of resources without deep modernization of both applied software and middleware. This report discusses in detail the problems and...

High-performance supercomputers have become some of the most power-hungry machines. The top supercomputer draws about 30 MW. Recent legislative trends in the carbon footprint area are affecting high-performance computing. In our work, we collect energy data from different kinds of Intel server CPUs. We present a comparison of the energy efficiency of our new Poisson solver, which...

The minimum spanning tree problem is of great importance in computer science, network analysis, and engineering. However, sequential algorithms become unable to handle the problem as the volume of data representing graph instances grows excessively. Instead, high-performance computational resources are used to process large-scale graph instances in a distributed manner....
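As a point of reference for the sequential baseline mentioned above, the following is a minimal sketch of Kruskal's algorithm with a union-find structure in Python. It is a textbook method chosen for illustration, not necessarily the algorithm used in the talk; the function name `kruskal_mst` and the `(weight, u, v)` edge format are assumptions.

```python
def kruskal_mst(num_vertices, edges):
    """Return (total_weight, tree_edges) of a minimum spanning forest.

    edges: iterable of (weight, u, v) tuples with vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Find the component representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0
    # Consider edges in order of increasing weight; keep an edge only if it
    # joins two different components (that is, it does not close a cycle).
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v, weight))
            total += weight
    return total, tree
```

The global sort over all edges is the step that stops scaling once the edge list no longer fits on a single machine, which is the motivation for distributed approaches.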

The HybriLIT heterogeneous platform is part of the Multifunctional Information and Computing Complex (MICC) of the Laboratory of Information Technologies named after M.G. Meshcheryakov of JINR, Dubna. The heterogeneous platform consists of the Govorun supercomputer and the HybriLIT education and testing polygon. The data storage and processing system is one of the platform components. It is implemented using...

Commonly used job schedulers in high-performance computing environments do not allow resource oversubscription. They usually work by assigning the entire node to a single job even if the job utilises only a small portion of the node's available resources. This may lead to under-utilization of cluster resources and to an increase in job wait time in the queue. Every job may have different requirements...

The development and popularization of the AMD ROCm platform with HIP technology allows one to create code that is not locked to a specific vendor while maintaining a high level of performance. Many legacy but still supported codes were originally written in CUDA and are now gaining ROCm HIP support as well. In a recent paper [1], the performance of popular molecular dynamics packages with GPU...