Concurrently employing resources of several supercomputers with ParaSCIP solver by Everest platform

5 Jul 2021, 16:30
15m
Conference Hall or Online - https://jinr.webex.com/jinr/j.php?MTID=m6e39cc13215939bea83661c4ae21c095

Sectional reports 1. Distributed computing systems

Speaker

Sergey Smirnov (Institute for Information Transmission Problems of the Russian Academy of Sciences)

Description

ParaSCIP is one of the few open-source solvers that implement a parallel version of the Branch-and-Bound (BNB) algorithm for discrete and global optimization problems on distributed-memory computing systems such as clusters and supercomputers. According to published reports, it has successfully used up to 80,000 CPU cores when solving problems from the MIPLIB test libraries on the Titan supercomputer at Oak Ridge National Laboratory, USA. During operation, the solver periodically saves the current state of the solution process (so-called checkpoints). This makes it possible to resume the solution process later, possibly on another supercomputer. Usually, this feature is used to work around time limits on jobs submitted to a cluster. Still, there are many interesting scientific and industrial problems that cannot be solved in acceptable time on a single “ordinary” cluster with hundreds of CPU cores.
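The checkpoint-and-resume idea can be illustrated with a toy sketch. It is not ParaSCIP's checkpoint format or API: the knapsack data, the `run` and `solve_with_checkpoints` helpers, and the JSON state layout are all hypothetical, chosen only to show how a BNB search whose open nodes and incumbent are serialized can be split across several time-limited jobs.

```python
import json

# Made-up 0/1 knapsack instance: (value, weight) pairs and a capacity.
ITEMS = [(12, 4), (10, 6), (8, 5), (11, 7), (14, 3), (7, 1), (9, 6)]
CAPACITY = 18


def new_state():
    # Each open node is [next_item_index, value_so_far, weight_so_far].
    return {"open": [[0, 0, 0]], "incumbent": 0}


def run(state, node_budget):
    """Process at most node_budget nodes; return True if the search finished."""
    # suffix_value[i] is the total value of items i.., a simple upper bound.
    suffix_value = [sum(v for v, _ in ITEMS[i:]) for i in range(len(ITEMS) + 1)]
    open_nodes = state["open"]
    for _ in range(node_budget):
        if not open_nodes:
            return True
        i, value, weight = open_nodes.pop()
        # Every stored node is feasible, so its value is a valid incumbent.
        state["incumbent"] = max(state["incumbent"], value)
        if i == len(ITEMS) or value + suffix_value[i] <= state["incumbent"]:
            continue  # leaf reached, or the bound cannot beat the incumbent
        item_value, item_weight = ITEMS[i]
        open_nodes.append([i + 1, value, weight])  # branch: skip item i
        if weight + item_weight <= CAPACITY:       # branch: take item i
            open_nodes.append([i + 1, value + item_value, weight + item_weight])
    return not open_nodes


def solve_with_checkpoints(node_budget):
    """Chain 'jobs' of limited size, passing state through a JSON checkpoint."""
    checkpoint = json.dumps(new_state())
    jobs = 0
    while True:
        state = json.loads(checkpoint)   # resume, possibly on another machine
        finished = run(state, node_budget)
        checkpoint = json.dumps(state)   # save when the job ends
        jobs += 1
        if finished:
            return state["incumbent"], jobs


if __name__ == "__main__":
    print(solve_with_checkpoints(node_budget=5))
```

Here the node budget stands in for a cluster job's time limit. A small budget forces many restarts, yet the final incumbent matches that of a single uninterrupted run, because the serialized state carries everything the search needs.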
This study describes an approach that uses the resources of several clusters simultaneously to reduce solving time. It relies on the previously developed DDBNB (Domain Decomposition for BNB) toolkit, which speeds up the solution process through coarse-grained parallelization based on a preliminary decomposition of the feasible domain of the problem. DDBNB is available as an application of the Everest distributed computing platform, which is responsible for running jobs on heterogeneous computing resources (servers, cloud instances, clusters, etc.). DDBNB, Everest, and ParaSCIP were modified to enable the exchange of incumbents (feasible solutions found by the BNB solver) between several ParaSCIP instances running on different supercomputers.
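The following toy sketch illustrates why exchanging incumbents can help when the feasible domain is decomposed. It does not use DDBNB, Everest, or ParaSCIP. The city coordinates, the decomposition rule (fixing the second city of the tour), and all function names are assumptions made for illustration, and the subproblems are solved one after another rather than concurrently so that the result is deterministic.

```python
import itertools
import math

# Made-up city coordinates for a small traveling salesman instance.
POINTS = [(0, 0), (4, 1), (1, 5), (6, 4), (3, 3), (7, 0), (2, 8)]
DIST = [[math.dist(p, q) for q in POINTS] for p in POINTS]


def solve_subproblem(dist, prefix, incumbent=math.inf):
    """Depth-first BNB over tours that start with `prefix`.

    `incumbent` is the length of the best tour known so far, possibly found
    in another subdomain. Returns (best_length, best_tour, nodes_visited);
    best_tour is None if this subdomain holds nothing better than `incumbent`.
    """
    n = len(dist)
    best = [incumbent, None]
    nodes = 0

    def dfs(path, cost):
        nonlocal nodes
        nodes += 1
        if cost >= best[0]:
            return  # prune: partial length is a lower bound on the tour length
        if len(path) == n:
            total = cost + dist[path[-1]][path[0]]
            if total < best[0]:
                best[0], best[1] = total, tuple(path)
            return
        for city in range(n):
            if city not in path:
                dfs(path + [city], cost + dist[path[-1]][city])

    start_cost = sum(dist[a][b] for a, b in zip(prefix, prefix[1:]))
    dfs(list(prefix), start_cost)
    return best[0], best[1], nodes


def solve_decomposed(dist, share_incumbent):
    """Split the domain by the second city and solve each part in turn."""
    best_len, best_tour, total_nodes = math.inf, None, 0
    for second in range(1, len(dist)):
        bound = best_len if share_incumbent else math.inf
        length, tour, nodes = solve_subproblem(dist, (0, second), bound)
        total_nodes += nodes
        if tour is not None and length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour, total_nodes


def brute_force_length(dist):
    """Reference answer by full enumeration, used only as a check."""
    n = len(dist)
    return min(
        sum(dist[t[i]][t[(i + 1) % n]] for i in range(n))
        for t in ((0,) + p for p in itertools.permutations(range(1, n)))
    )


if __name__ == "__main__":
    for share in (False, True):
        length, tour, nodes = solve_decomposed(DIST, share_incumbent=share)
        print(f"share={share}: length={length:.3f}, nodes={nodes}")
```

In this sketch a subproblem that starts from an incumbent found elsewhere can prune earlier, so the total number of visited nodes with sharing never exceeds the number without it. How much is saved depends on the instance.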
The resulting system was benchmarked on three traveling salesman problem instances of different sizes. The supercomputers HPC5 of the NRC “Kurchatov Institute” and cHARISMa of the HSE University were used as computing resources. A speedup was observed for two of the instances, and it is especially noticeable for the more complex problem. For the simpler problem, however, the exchange of incumbents does not seem to affect the amount of speedup. For the third instance there is no particular effect, but at least no slowdown is observed.
This work is supported by the Russian Science Foundation (Project 20-07-00701).

Primary authors

Sergey Smirnov (Institute for Information Transmission Problems of the Russian Academy of Sciences)
Vladimir Voloshinov (Institute for Information Transmission Problems RAS)
Oleg Sukhoroslov (IITP RAS)
