Running Applications on a Hybrid Cluster

3 Jul 2014, 10:00
20m
Conference Hall (LIT JINR)

Russia, 141980 Moscow region, Dubna, JINR
Sectional reports
Section 1 - Technologies, architectures, models, methods and experiences of building distributed computing systems. Consolidation and integration of distributed resources
Plenary

Speaker

Mr Vladimir Gaiduchok (Saint Petersburg Electrotechnical University "LETI", Russia)

Description

A hybrid cluster combines computational devices with radically different architectures, usually a conventional CPU architecture (e.g. x86_64) and a GPU architecture (e.g. NVIDIA CUDA). Building and operating such a cluster requires some experience: to harness the full computational power of the system and obtain a substantial speedup for computational tasks, many factors must be taken into account. These include hardware characteristics (e.g. network infrastructure, type of data storage, GPU architecture) as well as the software stack (e.g. MPI implementation, GPGPU libraries). Running scientific applications therefore requires considering GPU capabilities, software features, task size and other factors. This report discusses the opportunities and problems of hybrid computations. Some statistics from runs of test programs and applications will be presented. The main focus is on open source applications (e.g. OpenFOAM) that support GPGPU, either with some parts rewritten to use GPGPU directly or through substituted libraries. There are several approaches to organising heterogeneous computations for different GPU architectures; CUDA and the OpenCL framework will be compared. CUDA has become quite typical for hybrid systems with NVIDIA cards, but OpenCL offers portability, which can be a deciding factor when choosing a framework for development. We also emphasise multi-GPU systems, which are often used to build hybrid clusters. Calculations were performed on a hybrid cluster of the SPbU computing centre.
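
As an illustration of why task size matters for GPU offloading, the following is a minimal sketch of a toy cost model, not a result from the report. It assumes that CPU time grows linearly with the number of items, while GPU time adds a fixed launch overhead and a host-to-device transfer cost. The function name `offload_speedup` and every parameter value are hypothetical and chosen only to show the shape of the curve.

```python
def offload_speedup(n_items, cpu_s_per_item, gpu_s_per_item,
                    bytes_per_item, link_bytes_per_s, launch_overhead_s):
    """Return estimated CPU time divided by estimated GPU time.

    The GPU estimate includes a fixed launch overhead and the time
    needed to move the input over the host-device link.
    """
    cpu_time = n_items * cpu_s_per_item
    transfer_time = n_items * bytes_per_item / link_bytes_per_s
    gpu_time = launch_overhead_s + transfer_time + n_items * gpu_s_per_item
    return cpu_time / gpu_time


# Hypothetical parameters (assumptions, not measurements from the report).
PARAMS = dict(cpu_s_per_item=1e-7,     # CPU cost per item, seconds
              gpu_s_per_item=5e-9,     # GPU cost per item, seconds
              bytes_per_item=8,        # one double per item
              link_bytes_per_s=8e9,    # host-device link bandwidth
              launch_overhead_s=1e-4)  # fixed cost per offload

for n in (10**2, 10**4, 10**6, 10**8):
    print(f"n = {n:>9}: estimated speedup {offload_speedup(n, **PARAMS):.2f}x")
```

With these assumed values the model predicts a slowdown for the smallest task, because the fixed overhead dominates, and a speedup that levels off for large tasks, where the transfer and per-item GPU costs set the limit. Real measurements on a given cluster may differ substantially.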

Primary authors

Mr Alexander Bogdanov (Saint Petersburg State University)
Mr Ivan Gankevich (Saint Petersburg State University)
Mr Nikolai Yuzhanin (Saint Petersburg State University)
Mr Vladimir Gaiduchok (Saint Petersburg Electrotechnical University "LETI", Russia)
