SCIENCE BRINGS NATIONS TOGETHER

The 7th International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2016)

Name: The 7th International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2016)
Start: 2016-07-04T07:00:00+03:00
End: 2016-07-09T17:00:00+03:00
Location: No location set

4 Jul 2016, 07:00 → 9 Jul 2016, 17:00 Europe/Moscow

Description

Welcome to GRID 2016!

The 7th International Conference "Distributed Computing and Grid-technologies in Science and Education" will be held at the Laboratory of Information Technologies (LIT) of the Joint Institute for Nuclear Research (JINR) on 4 - 9 July 2016 in Dubna.

Dubna is a small quiet town located 130 km north from Moscow on the picturesque banks of the Volga River. There is a convenient railway and bus communication from Moscow to Dubna.

The 6th conference on this topic took place at the Laboratory of information technologies, JINR in June, 2014 (http://grid2014.jinr.ru/). The Conference Proceedings are available both as hard copies and on-line(“Distributed Computing and Grid-technologies in Science and Education”, Computer Research and Modeling, Number 3, 2015 Vol. 7, ISSN 2076-7633, http://crm-en.ics.org.ru/journal/issue/167/).

This is a unique Conference, held in Russia on the questions related to the use of Grid-technologies in various areas of science, education, industry and business.

The main purpose of the Conference is to discuss the current Grid operation and future role of the Grid in Russia and worldwide, similar technologies for data-intensive computations: cloud computing, BigData etc. The Conference provides a way for discussing fresh results and for establishing contacts for closer cooperation in future.

Programme of the Conference includes plenary reports in English (30 min), sectional reports (15 min) and poster presentations (in English or Russian).

Working languages - Russian and English

Important deadlines:

Abstract submission — 27 May, 2016 (at Indico on-line registration or by e-mail)
Visa support — 8 June, 2016
Registration to Conference — 12 June, 2016 (on-line)
Arrival and hotel accommodation — from 3 July, 2016
Departure: on 9 - 10 July 2016

Contacts:

Address: 141980, Russia, Moscow region, Dubna, Joliot Curie Street, 6
Phone:       (7 496 21) 64019, 65736
Fax: (7 496 21) 65145
E-mail:      grid2016@jinr.ru
URL:   http://grid2016.jinr.ru/

Participants

280 View full list

Monday 4 July
- Mon 4 Jul
- Tue 5 Jul
- Wed 6 Jul
- Thu 7 Jul
- Fri 8 Jul
- 08:00
  
  Registration at the LIT Conference Hall LIT Conference Hall
  
  LIT Conference Hall
- 1
  
  Opening welcome from JINR Scientific Program of JINR LIT Conference Hall
  
  LIT Conference Hall
  
  Speaker: Prof. Victor Matveev (JINR)
  
  Slides
- 2
  
  Welcome from Sponsors LIT Conference Hall
  
  LIT Conference Hall
- 3
  
  Laboratory of Information Technologies: Status and Future
  
  Speaker: Dr Vladimir Korenkov (JINR)
  
  Slides
- 10:30
  
  Coffee
- Plenary reports LIT Conference Hall
  
  LIT Conference Hall
  - 4
    
    Grid and Cloud Computing at IHEP in China
    
    The Institute of High Energy Physics (IHEP) is the biggest and comprehensive fundamental research center in China. Particle physics is one of the most important research fields at IHEP. This presentation will give a brief introduction to the currently running experiments as well as the experiments being constructed. With accumulation of experimental data, growing needs of computing resources put great pressure on the limited local resources. To meet the challenge, the BESIII Grid computing system was developed based on the Dirac middleware. After years' successful running, it proved to be an effective solution for integrating computing resources dispersed in the collaboration members. Further work has been done to extend its support for multiple experiments with a single system setup. This presentation will also cover the development of cloud computing platform for both scientific computing and information services at IHEP. The feature of dynamic allocation of computing resources makes it possible to improve the overall efficiency of resource usage. Recently, a proposal of setting up a dedicated HPC facility has been raised by a couple of experiments to speed up the CPU-intensive simulation and analysis work. This presentation will discuss the possibility of integrating the future HPC facility with the current system.
    
    Speaker: Prof. Weidong Li (Computing Center Institute of High Energy Physics, Chinese Academy of Sciences)
    
    Slides
  - 5
    
    Project NICA
    
    Speaker: Dr Grigory TRUBNIKOV
    
    Slides
- 12:00
  
  Lunch
- Plenary reports
  - 6
    
    Building up Intelligible Parallel Computing World
    
    Speaker: Vladimir VOEVODIN
    
    Slides
  - 7
    
    Обзор решений и примеры внедрения высокопроизводительных вычислительных комплексов (High Performance Computing) компании Inspur
    
    Speaker: Eddie Hing (Inspur)
    
    Slides
  - 8
    
    Применение методов эволюционного моделирования в задачах аэрогидродинамики
    
    Speaker: Денис ШАГЕЕВ (Niagara)
    
    Slides
  - 9
    
    Инновационные сетевые решения Brocade в распределённых и высокопроизводительных вычислениях
    
    Требования к сетям передачи данных исследовательских организаций становятся, в последнее время, всё выше и выше. Растут как требования по полосе пропускания, так и требования по надёжности сетей передачи. Уже сейчас во многих исследовательских сетях объёмы передаваемого трафика данных исчисляются десятками петабайт в год. Как правило научные организации, в которых возникают такие потребности в передаче данных, находятся на острие научного прогресса. Поэтому, им необходимы высокопроизводительные решения, которые отвечают самым высоки м требованиям не только сегодняшнего дня, но и дня завтрашнего. Brocade Communications помогает трансформировать сети передачи данных научных и образовательных организаций, таким образом, чтобы они могли справиться с растущим трафиком. При этом, решения Brocade отвечают самым жёстким требованиям, предъявляемым как к опорным сетям, так и к сетям ЛВС и ЦОД. Решения Brocade в области программно-определяемых сетей и виртуализации сетевых функций позволяют научным организациям строить расширенные приложения и сервисы по управлению сетями передачи данных максимально увеличивая возврат инвестиций в сетевую инфраструктуру.
    
    Speaker: Peter DYAKOV (Brocade Communications)
    
    Slides
- 15:00
  
  Coffee
- Plenary reports
  - 10
    
    Хранение петабайтов данных, современный подход
    
    Speaker: Mr Sergey Maznichenko (I-Teco Company)
  - 11
    
    BIM-моделирование и CFD-анализ как инструменты эффективного проектирования и оптимизации инженерной и IT инфраструктуры ЦОДов и HPC решений
    
    Speakers: Максим Новиков (Компания "4-х Стихии"), Сергей Анпилов (Компания "4-х Стихии")
    
    Slides
  - 12
    
    Технологии виртуализации графики Nvidia GRID в проектах VDI
    
    Speaker: Дмитрий Юрьевич Соловьев (IBS Platformix)
    
    Slides
  - 13
    
    The era of hyper-convergence. How new storage technologies change the architecture of data centers
    
    For the past two years, the IT industry has been facing dramatic changes in speed, quality and density of storage devices. These changes leads us to a development of a new, hyper-converged infrastructure where the future of classic storage arrays seems quite debatable. This presentation shows milestones of the storage technology and discuss todays possibilities in building a modern, web-scale data centers.
    
    Speaker: Fedor Pavlov (DELL expert)
    
    Slides
  - 14
    
    On the way to SDN
    
    There are a lot of different offerings on SDN landscape these days. We will classify SDN approaches into three major category: control plane, network overlay and networking operating system itself. We will overview each of them, define major pros and cons and look at the use cases
    
    Speaker: Sergey Gusarov (DELL expert)
    
    Slides
  - 15
    
    Архитектура и технологии Intel для построения GRID-решений
    
    Speaker: Nikolay Mester (Intel)
    
    Slides
- Poster Session
  - 16
    
    A continuous integration system for MPD Root: Deployment and setup in GitLab
    
    The paper is focused on setting up a system of continuous integration within the available infrastructure of MPD Root project. The system’s deployment and debugging for MPD Root purposes are considered; the installation and setup of the required tools using GitLab, GitLab CI Runner and Docker are described; the test results of execution speed optimization are presented for the builds in question. The deployment of various MPD Root configurations includes four major steps: installation of environmental components; creation of Root and FairRoot dependencies; building MpdRoot; and MpdRoot testing. The load of a computing node employed in continuous integration was analysed in terms of performance of the central processor, RAM, and network connections. Various parameters of the build’s parallel launch on different computing nodes were considered. As a result, we substantially decreased the build time: from 45 to 2-3 minutes. The optimization was ensured by means of caching the project’s dependencies and environmental components. Caching was done by Docker container manager. The build scripts and the container image configuration were stored in a repository. This allowed to alter the configuration of a build’s dependencies without the developer having to directly access GitLab CI Runner service.
    
    Speaker: Prof. Alexander Degtyarev (Professor)
  - 17
    
    Accounting system for the computing cluster at IHEP
    
    Accounting system is a very important part of any computing cluster which allows to understand how and by whom the core of the cluster - computing resources are used. This task become more and more complicated when different groups of users are present and it needed somehow to calculate general usage to make a bill. In this work described the development of such system for Ihep central computing cluster.
    
    Speakers: Mr Victor Kotlyar (IHEP), Ms Victoria Ezhova (IHEP)
  - 18
    
    ALICE Job Submission to A Cloud without Dedicated Storage Element.
    
    The constant need for virtual resources at LHC experiments can be explained by a lack of computing resources and and also by flexibilty that gives virtualization for providing new platforms and computing power. Researches for virtual resource integration into ALICE Grid environment have been done by many teams [1][2][3] . In our report we present an compilation of existing approaches which include cloud access using htCondor [4] using MCVM image [5] and CEPH FS [6] for storing data and for running virtual machines. The work has been performed on distributed resources of Bogolyubov Institute of Theoretical Physics (Ukraine), Saint Petersbourg State University (Russia), and CERN (Switzerland). The tests and performance estimations for the suggested approach have been presented. References: Mikolaj Krzewicki, David Rohr, Sergey Gorbunov, et al. The ALICE High Level Trigger: status and plans. ournal of Physics: Conference Series Volume 664 (2015) 082023 ( http://iopscience.iop.org/article/10.1088/1742-6596/664/8/082023/pdf ) Berzano D, Ganis G et al. PROOF as a Service on the Cloud: a Virtual Analysis Facility based on the CernVM ecosystem—in pro- ceedings of Computing in High Energy and Nuclear Physics (CHEP) 2013. Mikhail Kompaniets , Oksana Shadura, Pavlo Svirin, Volodymyr Yurchenko, Andrey Zarochentsev “Integration of XRootD into the cloud infrastructure for ALICE data analysis” , Journal of Physics: Conference Series, Volume 664 (2015), Clouds and Virtualization (http://iopscience.iop.org/article/10.1088/1742-6596/664/2/022036/meta) htCondor web site (https://research.cs.wisc.edu/htcondor/) B Segal et al.; "LHC Cloud Computing with CernVM", Proceedings of the XIII. International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT10), Jaipur, 2010, PoS ACAT(2010)004 CEPH web site (http://docs.ceph.com/docs/master/#)
    
    Speaker: Mr Andrey Zarochentsev (SPbSU)
  - 19
    
    Development of new security infrastructure design principles for distributed computing systems
    
    The report presents our current work on design and development of security infrastructure of modern kind that is intended for different types of distributed computing systems (DCS). The main goal of the approach is to provide users and administrators with transparent, intuitive and yet secure interface to the computational resources. The key points of the proposed approach to security infrastructure development are listed as follows: -- All the connections in the DCS must be secured with SSL/TLS protocol. -- Initial user authentication is performed using a pair of login and password with the use of multi-factor authentication where necessary. -- After successful login a user obtains a special session key with a limited validity period for further password-free work. -- Every single computational request is protected by the individual hash which is not limited in time. -- These hashes are registered by the special authentication and authorization service, and states of the hashes are tracked on real time. The service also provides online requests authorization for delegation of user rights to the other services in the DCS. A prototype of the proposed security infrastructure was deployed on a testbed. It includes an authentication and authorization service, an execution service, a storage management service, and a user interface. Various tests have shown that the proposed algorithm and architecture are competitive in terms of functionality, usability, and performance. The results can be used in the grid systems, cloud structures, large data processing systems (Big Data), as well as for the organization of remote access via the Internet to supercomputers and computer clusters. This work is supported by the Ministry of Education and Science of the Russian Federation, agreement No.14.604.21.0146 (RFMEFI60414X0146).
    
    Speaker: Ms Yulia Dubenskaya (SINP MSU)
  - 20
    
    Formation and stability of structural defects in the crystaline structure of lead dioxide: a DFT study
    
    The mechanism of conductibility for lead dioxide - which is the main component of the psoitive electrode of the lead-acid batteries - was elucidated only recently. The DFT calculations as well as experimntal data indicated that a concentration of 1 to 2 percent of deffects in the crystal structure of lead dioxide may result in a signifficant change of the conductibility - which in particular proves to be essential for the functionong of the lead-acid battery. The aim of our investigation is to determine the energy barrier associated to formation of these defects as well as the geometric parameters characterizing them. To this end we used DFT calculations to point out the value of energy barriers existing between different structures of lead dioxide (i.e. ideal structure as well as structures including defects). We present different structures, the value of energy barriers and the paths connecting the ideal to defect structures.
    
    Speaker: Mrs Codruta Mihaela Varodi (National Institute for Research and Development of Isotopic and Molecular Technologies Cluj-Napoca, Romania)
  - 21
    
    From parallel to distributed computing as application to simulate magnetic properties-structure relationship for new nanomagnetic materials
    
    Modern materials science is based in principle on the fundamental experience that the properties of materials are not peremptorily determined by their average chemical composition but they are to a large extent influenced by their microstructure. Now, it is obvious that the outstanding success of magnetic materials for the last two decades may be ascribed to three relevant accomplishments: -overall improvements in general expertise and techniques in sample synthesis; -a dramatic refinement and development of new methods and probes for magnetic materials characterization; -the increasing importance of nano-level studies that led to the ingenious ways of producing nanoparticle samples, new techniques for element specific studies, goin down to atomic resolution studies and even to single atoms at surfaces and interfaces. In the last projects completed in recent years we have analyzed and studied magnetic materials mostly micro and nano scale of perovskite, enumerating here cobaltites, ferrimagnetics, ferroics, manganites and other nanomagnetic materials. Almost all of them, listed up require massive data processing. At that time, it became obvious to us that it needs another embodiment, namely in the processing data activity. We were at the beginning, in the 2010 years to introduce parallel computing applications on the simulation of the structure, magnetic and transport properties and to explain the structure-properties relationships for the new nanomagnetic materials that were fashionable in those years. Knowing quite substantial intersection of the parallel computing and distributed, we think it is of common sense to introduce our applicative work in magnetism and magnetic materials science modeling properties, in the context of distributed computing applications. Our latest research specialize in improving techniques for high-level simulation in the design of nano-materials with controlled magnetic properties. We used a package built on Linux, called Nmag (with acquiescence) on an open source platform, across a network of parallel computers.
    
    Speaker: Dr Mihail Liviu Craus (Joint Institute for Nuclear Research)
  - 22
    
    GEANT4-based simulation of a micro-CT scanner using cloud resources
    
    Research of methods of material recognition using Medipix-based micro-CT scanner MARS requires detailed simulation of passage of X-rays through the sample. The application based on the Geant4 toolkit has been developed to solve this task. Since the computation turned out to be very time consuming for a single PC (several hunderds of projections must be simulated for each sample) the cloud resources has been used. A job submission framework optimized to run Geant4 applications in Amazon EC2 has been implemented using Python language. The obtained results and performance of the simulation will be reported.
  - 23
    
    Jupyter extension for creating CAD designs and their subsequent analysis by the finite element method
    
    Creating designs in CAD and performing their stress-strain analysis are complex computational tasks. Their successful solution depends on a number of prerequisites: availability of large computational power; comprehensive knowledge of physical and mathematical computing; and solid skills of programming and working in a variety of separate software products that are not integrated to each other directly. The paper presents a system aimed at CAD models development and verification from the ground up. The system integrates geometry construction, mesh model creation and deformation analysis into a uniform computing environment operating as a SaaS solution. It is based exclusively on open source software and allows to use the Python programming language and SALOME, GMSH, FEniCS and ParaView libraries. The system’s architecture and certain issues of working with libraries are discussed. The paper also presents a browser-based tool for CAD design creation and analysis, which tool is the front end of the software product we created.
    
    Speaker: Dr Valery Grishkin (SPbGU)
  - 24
    
    Machine Learning Technologies to Predict the ATLAS Production System Behaviour
    
    The second generation of the ATLAS Production System (ProdSys2) is an automated scheduling system that is responsible for data processing, data analysis and Monte-Carlo production on the Grid, supercomputers and clouds. The ProdSys2 project was started in 2014 and commissioned in 2015 (just before the LHC Run2) and now it handles O(2M) tasks per year, O(2M) jobs per day running on more than 250000 cores, each task transforms in many jobs. ProdSys2 evolves to accommodate a growing number of users and new requirements from the ATLAS Collaboration, Physics groups and individual users. ATLAS Distributed Computing in its current state is a big and heterogeneous facilities, running on the WLCG, academic and commercial clouds and supercomputers. This cyber-infrastructure presents computing conditions in which contention for resources among high-priority data analyses happens routinely. Inevitably, over-utilized computing resources cause degradation of services or significant workload and data handling interruptions. For these and other reasons, grid data management and processing must inevitably tolerate a continuous stream of failures, errors, and faults. This makes simulating ProdSys2 behavior a very challenging task requiring unfeasibly large computing power. However, behavior of the system seems to contain regularities that can be modeled using Machine Learning (ML) algorithms. We proposed use of ML approach in conjunction with ProdSys2 jobs execution information to predict behavior of the system, starting with estimating task completion times. The WLCG ML R&D project was started in 2016, we will present our first results how ProdSys2 behavior could be predicted and simulated. On the next phase we will use ML algorithms to predict and to find anomalies in the ProdSys2 behavior.
    
    Speaker: Maksim Gubin (Tomsk Polytechnic University)
  - 25
    
    Network traffic analysis for the computing cluster at IHEP
    
    A task for analysis of network traffic flows on the high performance computing network for the computer cluster is very important and allows to understand the way of complicated computing and storage resources usage by different software applications running on the cluster. As soon as all these applications are not managed by the cluster administrators they need a tool to understand usage patterns to make then an appropriate tuning for the core cluster software to achieve more effective usage for the cluster resources. In the presented work the development of such system for the IHEP cluster will be shown.
    
    Speaker: Mrs Anna Kotlyar (IHEP)
  - 26
    
    PARAMETRIZATION OF THE REACTIVE MD FORCE FIELD FOR Zn-O-H SYSTEMS
    
    In this contribution we describe a procedure of optimizing the molecular dynamic force field for Zn-O-H chemical systems by means of a new parallel algorithm of a multifactorial search for the global minimum. This algorithm allows one to obtain numerous parameters of the ReaxFF classical force field based on quantum chemical computations of various characteristics of simple compounds. The force field may be then used for simulating of large-scale chemical systems consisting of the same elements by means of classical molecular dynamics. Our current implementation of the algorithm is done in C++ using MPI. We compare characteristics of simple compounds, obtained by 1) quantum chemical techniques, 2) molecular-dynamic methods using reference parameters of the force field, and 3) MD methods using optimized parameters of the force field. With the optimized parameter set we perform MD simulations (using LAMMPS package) of crystals of zinc and zinc oxide of various modifications at the room temperature. Finally, we compare results of the parameter optimization procedure by means of the algorithm described above and results of a parallel implementation of an evolutionary approach to minimum search using dynamic models of Zn and ZnO crystals. Also we discuss advantages and disadvantages of the both methods and their efficiency for extremal problems. All computations are performed with machines of the distributed scientific complex of the Faculty of Physics of St Petersburg State University.
    
    Speakers: Mr Konstantin Shefov (Saint Petersburg State University), Ms Sofia KOTRIAKHOVA (Saint-Petersburg State University), Dr Stepanova Margarita (SPbSU)
  - 27
    
    The GPU Simulation of the MoS2 diode current characteristics
    
    Modeling of semiconductor carrier transport properties based on the drift-diffusion model is one of emerging activity of computational electronics. The p-n junction between materials with different type of conductivity can set up an internal electric field, which is responsible for separation of charge pairs electron and hole. The distribution of electric field in the depletion layer can be obtained by solution of Poisson’s equation. Although this equation does not appear to have an analytical solution, numerical treatment offers a deeper comprehension of the structure, achieving a complete control on the various parameters and defining their role in the device operation. In this paper, we implemented the GPU-based calculation of the p-n junction diode current simulation in the multiple parameter space. In particular, the parameter space is chosen as external voltage, diffusion coefficient and lengths. The GPU-based simulation has achieved the 2 times speedup compared to the CPU mode.
    
    Speaker: Mr Zorig Badarch (National University of Mongolia)
  - 28
    
    YASTD: A simple set of CLI tools to manage Docker containers
    
    Docker is a one of the most popular systems for container virtualization on the market. It gives user a lot of possibilities, but its use requires root access which is sometimes dangerous. We propose a set of simple command line tools for managing Docker containers called YASTD (Yet Another Simple Tools for Docker). It has three purposes: - to allow users to create containers remotely accessible via SSH; - to let users configure their containers and save the changes as new images; - to isolate users from each other and restrict their access to the Docker features that could potentionally disrupt the work of a server. The tools are mostly named after Docker options and include 1) containers managing tools: - create - creates a container from an image, accessible by the user via SSH; - show - shows the containers started by a user along with their statuses; - stop - stops a running container; - start - restarts a stopped container; - pause - pauses all processes within a container; - unpause - unpauses a paused container; - rm - removes a container; 2) images managing tools: - commit - saves a container into an image; - images - shows the list of images available to a user; - rmi - removes a user image; 3) an administrative tool: - create-user - creates a new user, sets up a personal storage directory and prepares the user's public SSH key. The administrators have to prepare a set of basic images with SSH service configured. The users can create their own containers from these images, each container only accessible to the user who created it. The scripts provide two mechanisms for saving changes made by users: - a personal storage directory is mapped to each container started by a user, allowing them to access the changes made in the respective directory of their other containers; - if the changes are not limited to the mapped directory, e.g. a software is installed or modified inside the container, a user can save the container to a new image and start containers from this new image. Images created by users are only accessible to the users who created them. Potentially these images can be migrated to other computers. All commands were realized as Python scripts.
    
    Speaker: Dr Stanislav Polyakov (SINP MSU)
  - 29
    
    Имитационное моделирование дорожного трафика с помощью сетей Петри
    
    Данная работа посвящена проблеме построения низкоуровневых моделей дорожного трафика на основе расширенных сетей Петри. Актуальность предлагаемого подхода к моделированию движения городского трафика заключается в том, что в последнее время наблюдается значительный рост интереса к низкоуровневому моделированию разнообразных систем. Этот интерес, не в последнюю очередь, обусловлен широким развитием высокопроизводительных систем для массивных параллельных вычислений. Сети Петри, обладают значительным внутренним параллелизмом, что и делает их перспективными с точки зрения реализации на современных многопроцессорных вычислительных системах с использованием практически любой технологии параллельного программирования. В работе рассматриваются принципы построения моделей дорожных систем на основе сетей Петри. Анализируется проблема преобразования исходной графовой схемы дорожной системы в соответствующую ей сеть Петри. Предлагается алгоритм такого автоматического преобразования. Рассматриваются несколько подходов к обработке конфликтных ситуаций, которые должны разрешаться на основе тех или иных правил дорожного движения. В качестве примера приложения разработанной программной системы проводится численное исследование пропускной способности нескольких стандартных типов дорожных развязок (перекрестков) в зависимости от частоты появления машин на их входах. Также в работе рассматриваются вопросы параллельной крупноблочной реализации построенной модели на основе технологии MPI.
    
    Speaker: Ms Irina Martynova (student)
  - 30
    
    Применение технологии OpenCL для решения задач сейсмики сеточно-характеристическим методом
    
    В данной работе рассмотрены аспекты применения технологии OpenCL для распараллеливания решения задач сейсмики используя сеточно-характеристический метод на графических сопроцессорах. В качестве системы определябщих уравнений используются уравнения линейной теории упругости. Проведено сравнение реализаций расчетного алгоритма на технологиях CUDA и OpenCL при работе на устройствах GPU от nVidia. Также проведено сравнение с устройствами от AMD при использовании технологии OpenCL. Для тестирования были использованы устройства от NVIDIA и AMD. От NVIDIA использовались устройства GeForce и Tesla, в том числе и последние модели Tesla k80 и Tesla k40m. От AMD использовались GPU серий Radeon HD и Radeon R9. Были рассмотрены различия в эффективности реализации c одинарной точностью и с двойной точностью. Исследование выполнено при финансовой поддержке РФФИ в рамках научного проекта № 15-07-01931 А.
    
    Speaker: Nikolay Khokhlov (MIPT)
  - 31
    
    Распределенная модель бактериального поиска на основе Марковских клеточных автоматов
    
    Групповая робототехника представляет собой перспективное направление в области роевого интеллекта, заключающееся в построении робототехнических систем, состоящих из большого числа относительно просто устроенных роботов. Актуальной задачей в этой области является разработка распределенных алгоритмов управления такого рода системами. Проблема заключается в том, что требуется разработать алгоритм решения некоторой глобальной задачи, недоступной отдельным роботам, программируя локальное поведение многих таких роботов, действующих параллельно. В настоящей работе в качестве инструмента разработки такого рода распределенных алгоритмов предлагается использовать клеточные автоматы с Марковскими системами правил. Особенностью такого подхода является то, что он позволяет единообразно описать как алгоритмическое поведение самих роботов, так и неупорядоченное поведение среды. В работе рассматривается процесс разработки колонии искусственных бактерий, выполняющих коллективный поиск питательных веществ (bacterial foraging algorithm). Рассматривается реализация основных механизмов поведения бактерий – движение, рост, деление. Особое внимание уделяется построению механизма управления движением бактерий под действием химических сигналов – хемотаксису бактерий. Приводятся результаты компьютерного моделирования и численных экспериментов, показывающие работоспособность, как построенных распределенных алгоритмов, так и всего подхода к созданию такого рода алгоритмов.
    
    Speaker: Mr Sergey Poluyan (Dubna International University for Nature, Society and Man)
  - 32
    
    Распределенная программная система автоматического построения схемы городской транспортной сети
    
    В работе рассматриваются вопросы, связанные с построением цифрового описания схем движения городского транспорта на основе геолокационных данных. Такое описание может в дальнейшем использоваться либо для компьютерного моделирования движения транспорта, например, с целью оптимизации той или иной дорожной развязки, либо для разработки алгоритмов и программ управления автономными транспортными средствами. Одной из проблем при использовании геолокационной информации является ее сильная зашумленность. Поэтому на первом этапе исследования были разработаны алгоритмы, позволяющие по нескольким трекам, соответствующим одному и тому же участки дороги, провести процедуру удаления шума и выделить «чистые» траектории движения транспорта по данному участку дороги. Серьезным недостатком такого подхода оказалась необходимость в существенной ручной предобработке входных данных. Предлагается следующая схема решения исходной задачи. Разрабатывается программа-геотрекер для мобильных устройств, служащая для сбора ими первичной геолокационной информации в непрерывном режиме. Эта информация автоматически пересылается на некоторое облачное хранилище данных, откуда забирается сервером, выполняющим всю статистическую обработку данных. На первом этапе с использованием методов машинного обучения из треков выделяются части, соответствующие движению на транспортных средствах (автомобилях). Далее производится разбиение этих треков на фрагменты, относящиеся к одинаковым участкам дорог. На третьем этапе выполняется выделение «чистых» траекторий движения транспорта и связывание их в общую систему. Настоящая работа посвящена разработке алгоритмов решения задач, возникающих на первом и третьем этапах вышеописанной схемы. Приводятся результаты численных исследований на основе реальных данных движения транспорта в городе Дубна. Работа выполнена при финансовой поддержке РФФИ (грант №14-07-00628 А).
    
    Speaker: Ms Natalia Puchkova (student)
- 18:30
  
  Welcome Party St. Stroiteley, 2 (International Conference Hall)
  
  St. Stroiteley, 2
  
  International Conference Hall
Tuesday 5 July
- Mon 4 Jul
- Tue 5 Jul
- Wed 6 Jul
- Thu 7 Jul
- Fri 8 Jul
- Plenary reports
  - 33
    
    Tier-1 CMS at JINR: Past, Present and Future
    
    The Tier-1 centre for CMS at JINR is a high-tech platform for computing systems and systems of long-term data storage with high concentration of the network and server equipment. Work on this centre was carried out in the framework of the project "Creation of the automated system of data processing of experiments at the Large Hadron Collider (LHC) of the Tier1 level and provision of grid-services for distributed analysis of these data" within the Federal target program (FTP) of the Ministry of Education and Science of the Russian Federation "Research and development on the priority directions of developing the scientific-technological complex of Russia for 2007-2013”. On 28 September, 2012, at the meeting of the Supervisory Board of the WLCG project a plan of creating a Tier-1 center in Russia was accepted. In early 2015, a full-scale WLCG site of the Tier1 level for CMS was launched at LIT, JINR. The importance of creating and maintaining such a centre is determined by several factors. First of all, the need for the development, modernization and expansion of computational performance and data storage systems of the Tier-1 centre is dictated by the research program of the CMS experiment in which JINR physicists are actively involved in the framework of the RDMS CMS collaboration. Another important thing is that in the course of work on the creation and exploitation of Tier-1 at JINR, an invaluable experience was gained that is already in demand and will be needed in the future to design, build and subsequently exploit the informational – computational centre for experiments within the NICA project. An overview of the JINR Tier-1 centre for the CMS experiment at the LHC is given. Special emphasis is placed on the results of the Tier-1 operation and future plans.
    
    Speaker: Dr Tatiana Strizh (JINR)
    
    Slides
  - 34
    
    Центр Тиер1 НИЦ КИ во втором сеансе БАК
    
    Speaker: Василий ВЕЛИХОВ
    
    Slides
  - 35
    
    Future trends in distributed infrastructures - Nordic Tier-1 example
    
    Distributed computing infrastructures have a long history, starting with pools of PCs and clusters three decades ago, briefly passing a stage of HPC Grids, and now converging to fully virtualised facilities. The Nordic Tier-1, prototype of which was launched back in 2003, is a small-scale model of a distributed infrastructure, and thus an interesting case for studying experience and future trends in a controlled environment. The talk will present an overview of the current trends in distributed computing and data storage from the perspective of the distributed Nordic Tier-1, covering a variety of aspects, from technological to political ones.
    
    Speaker: Dr Oxana Smirnova (Lund University)
    
    Slides
  - 36
    
    ATLAS Production System
    
    The second generation of the ATLAS production system called ProdSys2 is a distributed workload manager which used by thousands of physicists to analyze the data remotely, with the volume of processed data is beyond the exabyte scale, across a more than hundred heterogeneous sites. It achieves high utilization by combining dynamic job definition based on many criterias, such as input and output size, memory requirements and CPU consumption with manageable scheduling policies and by supporting different kind of computational resources, such as GRID, clouds, supercomputers and volunteering computers. Besides jobs definition Production System also includes flexible web user interface, which implements user-friendly environment for main ATLAS workflows, e.g. simple way of combining different data flows, and real-time monitoring, optimised for using with huge amount of information to present. We present an overview of the ATLAS Production System major components: job and task definition, workflow manager web user interface. We describe the important design decisions, and lessons learned from an operational experience during the first years of LHC Run2.
    
    Speaker: Mr Mikhail Borodin (NRNU MEPHI, NRC KI)
    
    Slides
- 10:00
  
  Coffee
- Plenary reports LIT Conference Hall
  
  LIT Conference Hall
  - 37
    
    DIRAC Data Management System
    
    DIRAC Project is developing software for building distributed computing systems for the needs of research communities. It provides a complete solution covering both Workload Management and Data Management tasks of accessing computing and storage resources. The Data Management subsystem (DMS) of DIRAC includes all the necessary components to organize distributed data of a given scientific community. The central component of the DMS is the File Catalog (DFC) service. It allows to build a logical File System of DIRAC presenting all the distributed storage elements as a single entity for the users with transparent access to the data. The Metadata functionality of the DFC service is provided to classify data with user defined tags. This can be used for an efficient search of the data necessary for a particular analysis. The DMS supports all the usual data management tasks of uploading and downloading, replication, removal files, etc. A special attention is paid to the bulk data operations involving large numbers of files. Automation of data operations driven by new data registrations is also possible. In this contribution we will make an overview of the DIRAC Data Management System and will give examples of its usage by several research communities.
    
    Speaker: Dr Andrei Tsaregorodtsev (CPPM-IN2P3-CNRS)
    
    Slides
  - 38
    
    Federated data storage system prototype for LHC experiments and data intensive science
    
    Rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) prompted national physics groups to evaluate new data handling and processing solutions. Russian grid sites and universities’ clusters scattered over a large area aim at the task of uniting their resources for future productive work, at the same time giving an opportunity to support large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include development and implementation of federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centers of different levels and University clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, St.-Petersburg, Gatchina and Geneva. This project intends to implement a federated distributed storage for all kind of operations such as read/write/transfer and access via WAN from Grid centers, university clusters, supercomputers, academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests including real data processing and analysis workflows from ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputer. We present topology and architecture of the designed system, report performance and statistics for different access patterns and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and reformations of computing style, for instance how bioinformatics program running on supercomputer can read/write data from federated storage.
    
    Speaker: Mr Andrey Kiryanov (PNPI)
    
    Slides
  - 39
    
    The activity of Russian Chapter of International Desktop Grid Federation
    
    Results of activity of the Russian Chapter of International Desktop Grid Federation (IDGF) are considered. Including interaction with community of the russian volunteers (crunchers), start new and support of the existing projects of the volunteer distributed computing.
    
    Speaker: Mr Ilya Kurochkin (IITP RAS)
    
    Slides
  - 40
    
    Storage operations at CERN: EOS and CERNBox
    
    EOS is the high-performance CERN IT distributed storage for High-Energy Physics. Originally used for analysis, it now supports part of the data-taking and reconstruction workflows, notably for the LHC Run2. EOS currently holds about 200PB of raw disks and takes advantage from its wide-area scheduling capability exploiting both CERN's computing facilities (Geneva-Meyrin and Budapest-Wigner) about 1,000 km apart (~20-ms latency). In collaboration with the Australia's Academic and Research Network (AARNET) and Academia Sinica Grid Computing (ASGC) in Taiwan, CERN-IT Storage Group set up an R&D project to explore even more the EOS potential for geo-scheduling running a distributed storage system between Europe (Geneva, Budapest), Australia (Melbourne) and Asia (ASGC Taipei), allowing different type of data placement and data access across these four sites with latency higher than 300ms (16,500 km apart). EOS is the storage system for CERNBox (the CERN cloud synchronisation service for end-users) to provide as well sync and share capabilities to users and for scientific and engineering use-cases. The success of EOS/CERNBox has been demonstrated by the high demand in the community for such easily accessible cloud storage solution which recently crossed 5000 users.
    
    Speaker: Luca Mascetti (CERN)
    
    Slides
  - 41
    
    Технологии NVIDIA для высокопроизводительных вычислений
    
    Speaker: Антон Джораев (NVIDIA)
    
    Slides
- 12:20
  
  Lunch
- 2. Operation, monitoring, optimization in distributed computing systems 406B
  
  406B
  - 42
    
    The monitoring system of MICC-JINR
    
    The development of the monitoring system of the Multifunctional Information and Computing Complex (MICC) of JINR is aimed at optimizing its work and at increasing the availability and reliability toward the absolute 100% level limit. The monitoring is continuously improved such as to encompass all the existing hardware and primary software service modules. Monitoring allows detecting predictable system failures at incipient stages and the provision of beforehand solutions. Suitable interfaces provide timely notifications, enable history-based predictions concerning the future MICC functioning. This makes the JINR-LCG2 Tier2 and the JINR-CMS Tier1 sites amongst the most efficient ones within the WLCG.
    
    Speaker: Ivan Kashunin (JINR)
  - 43
    
    IHEP tier-2 computing center: status and operation.
    
    RU-Protvino-IHEP site is the one of three biggest WLCG Tier-2 centers in Russia. The computing infrastructure serves for "big four" high energy physics experiments such as Atlas, Alice, CMS, LHCb and local experiments at IHEP such as OKA, BEC, radio biology stands and others. In this presentation the current status of the computing capacities, networking and engineering infrastructure will be shown as well as the contribution of the grid site to the collaboration experiments.
    
    Speaker: Mr Victor Kotlyar (IHEP)
    
    Slides
  - 44
    
    Prototyping and operating scenarios in a distributed computing environment for NICA
    
    In this work by saying Virtual Accelerator we mean a set of services and tools enabling transparent execution of computational software for modeling beam dynamics in accelerators using distributed computing resources. The main use of the Virtual Accelerator is simulation of beam dynamics by different packages with the opportunity to match the and the possibility to create pipelines of tasks when the results of one processing step based on a particular software package can be sent to the input of another processing step. In the case of charged particle beams Virtual Accelerator is working as a prediction mechanism: witch analytical model should we use to exclude, may be partly , the negative effect in beam dynamics. With the help of external fields these changes can be done. To simulate the large number of particle we need distributed resources for our computations. In this paper different parallel techniques to simulate space charge effects are presented. In particular, the investigation of overall performance of the predictor-corrector method is made.
    
    Speaker: Mr Ivan Gankevich (Saint Petersburg State University)
  - 45
    
    Remote Operation Center at Dubna for NOvA experiment
    
    The ROC-Dubna, Remote Operation Center at Joint Institute for Nuclear Research (Dubna, Russia), supports the NOvA experiment located 14000 km away in Fermilab (Batavia, Illinois, USA) and Ash River (Minnesota, USA). The ROC allows Russian physicists to operate the NOvA detector and monitor the NuMI neutrino beam complex. ROC-Dubna for NOvA is the first fully operational ROC outside USA.
    
    Speaker: Oleg Samoylov (JINR)
    
    Slides
  - 46
    
    JINR Tier-1 service monitoring system: Ideas and Design
    
    In 2015, a Tier-1 center for processing data from the LHC CMS detector was launched at JINR. After a year of operation it became the third among CMS Tier-1 centers considering Completed Jobs. The large and growing infrastructure pledged QoS and complex architecture all make support and maintenance very challenging. It is vital to detect signs of service failures as early as possible and enough information to react properly. Apart from the infrastructure monitoring, which is done on the JINR Tier-1 with Nagios, there is a need for consolidated service monitoring. The top-level services that accept jobs and data from the Grid depend on lower-level storage and processing facilities that themselves rely on the underlying infrastructure. The sources of information about the state and activity of the Tier-1 services are diverse and isolated from each other. Several tools have been examined for the service monitoring role, including HappyFace and Nagios, but the decision was made to develop a new system. The goals are to retrieve a monitoring information from various sources, to process the data into events and statuses, and to react according to a set of rules, e.g. to notify service administrators. Another important part of the system is an interface visualizing data and a state of the systems. A prototype has been developed and evaluated at JINR. The architecture, current and planned functionality of the system are presented in this report.
    
    Speaker: Mr Igor Pelevanyuk (JINR)
    
    Slides
- 1. Technologies, architectures, models of distributed computing systems 310
  
  310
  - 47
    
    Creating distributed rendering applications
    
    Abstract. This article discusses the aspect of visualization appliance by the usage of distributed computing systems. It describes possible practical scope for several visualization technologies on the basis of an example for the construction such an application exploiting modern technologies and ready-made solutions. An extra attention is paid to the selection of software packages and to the provisioning of a final result to the end user taking in mind the issue of unusual computer graphics output approaches. In the light of these questions this study is carrying out an analysis of implementation’s hardware and software features. Keywords: Distributed Computing, Distributed Rendering, Computer Graphics, Visualization
    
    Speaker: Mr Andrei Ivashchenko (St.Petersburg State University)
    
    Slides
  - 48
    
    Kipper – a Grid bridge to Identity Federation
    
    Identity Federation (IdF, aka Federated Identity) is the means of interlinking people's electronic identities stored across multiple distinct identity management systems. This technology has gained momentum in the last several years and is becoming popular in academic organisations involved in international collaborations. One example of such federation is eduGAIN, which interconnects European educational and research organisations, and enables trustworthy exchange of identity-related information. In this work we will show an integrated Web-oriented solution code-named Kipper with a goal of providing access to WLCG resources using a user's IdF credentials from their home institute with no need for user-acquired X.509 certificates. Kipper achieves “X.509-free” access to Grid resources with the help of two additional services: STS and IOTA CA. STS allows credential translation from SAML2 format used by Identity Federation to the VOMS-enabled X.509 used by most of the Grid, and the IOTA CA is responsible for automatic issuing of short-lived X.509 certificates. Kipper comes with a JavaScript API considerably simplifying development of rich and convenient “X.509-free” Web-interfaces to Grid resources, and also advocating adoption of IOTA-class CAs among WLCG sites. We will describe a working prototype of IdF support in the WebFTS interface to the FTS3 data transfer engine, enabled by integration of multiple services: WebFTS, CERN SSO (member of eduGAIN), CERN IOTA CA, STS, and VOMS.
    
    Speaker: Mr Andrey Kiryanov (PNPI)
    
    Slides
  - 49
    
    Construction of visualization system for scientific experiments
    
    Abstract: This proposal considers about possible approaches for the creation of visualization system designed for scientific experiments. Possible hardware solutions represented with various combinations of numerous components, such as computing nodes based on graphics cards, visualization and virtualization servers, and peripheral devices are studied in details. Description of software solutions having the ability to exploit that solution is also provided. Keywords: Computing Infrastructure, Computer Graphics, Visualization
    
    Speaker: Ms Evgeniya Milova (St.Petersburg State University)
    
    Slides
  - 50
    
    Current status of the BY-NCPHEP Tier3 site
    
    Status of the BY-NCPHEP site presented. Upgrade and transition to rackmounted servers continued. Improvements in the usability of storage system usage is discussed.
    
    Speaker: Mr Vitaly Yermolchyk (INP BSU)
    
    Slides
  - 51
    
    Status of the RDMS CMS Computing
    
    The Compact Muon Solenoid (CMS) detector is one of two general purpose experiments at the Large Hadron Collider (LHC) at CERN. CMS is a high-performance detector for seeking out new physics. The detector was designed and built by a worldwide collaboration of about two thousand physicists from more than 30 countries. Russia and Dubna Member States (RDMS) CMS collaboration involves more than twenty institutes from Russia and Joint Institute for Nuclear Research (JINR) member states. A proper computing grid-infrastructure has been constructed at the RDMS institutes for the participation in the running phase of the CMS experiment and the RDMS CMS computing centers have been integrated into the WLCG global grid-infrastructure providing a proper functionality of grid services for CMS. RDMS CMS computing infrastructure satisfies the LHC data processing and analysis requirements at the running phase of the CMS experiment. It makes possible for RDMS CMS physicists to take a full-fledged part in the CMS experiment at its running phase. An overview of RDMS CMS physics tasks and RDMS CMS computing activities is presented.
    
    Speaker: Dr Elena Tikhonenko (JINR)
    
    Slides
  - 52
    
    Operating system Plan9 as the implementation of the GRID ideology
    
    When we organize parallel computations on a cluster system, the computer system structure is not hidden from the user, and should be taken into account while writing parallel programs. GRID ideology introduces an additional level of abstraction and makes it possible to link together heterogeneous computing systems. In fact, the inability to control the operating environment makes developers to create superfluous infrastructure at the application level. We offer to go down to the below level and to implement the necessary functionality within the operating environment kernel. Plan9 operating system has the immanent structure necessary for the implementation of GRID ideology. The main architectural elements of the operating system allow you to use all the resources of the remote computer as the local resources. The work is partially supported by RFBR grants No's 14-01-00628, 15-07-08795, and 16-07-00556.
    
    Speaker: Dr Dmitry Kulyabov (PFUR & JINR)
    
    Slides
- 7. Desktop grid technologies and volunteer computing 406A
  
  406A
  - 53
    
    Scheduling in Desktop grid: a review
    
    In the report we will review and discuss challenges, approaches, and results of scheduling tasks in Desktop grids. Besides a review of published articles, we are going to present our approach, models, and results.
    
    Speaker: Dr Ilia Chernov (IAMR KRC RAS)
    
    Slides
  - 54
    
    Implementation of the branch-and-bound method on the desktopgrid systems
    
    The report considered the effective implementation of the branch-and-bound method on the desktopgrid systems. The description of the BOINC system, its features, significant from the point of view of the implementation of this method. Description of experimental calculations and observed phenomena. The report contains a description of the approaches to the calculation and analysis of their results. The practice of calculations showed that: - It is necessary to work out a preliminary assessment of computational complexity of the assignment method. - Resource efficiency depends on the ratio of the total computational complexity of the problem being solved and the potential of the infrastructure. This ratio must be considered in the development of optimal resource utilization strategies. - Further computational experiments must be carried out with more computationally-capacious tasks. Conclusions of the work can be used to solve practical problems, reducible to optimization problems.
    
    Speakers: Dr Mikhail Posypkin (ITTP RAS), Nikolay Khrapov (Pavlovich)
  - 55
    
    BOINC’s issues and directions of future development
    
    This review summarizes and classifies all issues with which faced developers during maintaining and developing BOINC’s based projects. In conclusion have been claimed directions of BOINC’s development, which targets on meeting needs of BOINC’s users.
    
    Speaker: Mr Anatoliy Saevskiy (National University of Science and Technology MISiS)
  - 56
    
    Task Scheduling in a Desktop Grid for Virtual Drug Screening
    
    In many applied scientific problems, diversity of the results set is no less important than their amount. An example of such problem is the virtual drug screening. According to the principles of drug development, the structural diversity is one of the key characteristics of the resulting molecules set. At the same time, virtual drug screening is a time- and resource-consuming computational problem that requires high-performance computing resources in order to be solved. This work aims to develop a task scheduling algorithm which allows to obtain the results set of the largest possible size and diversity in the shortest time. In particular, we consider Desktop Grid systems that are widely used for virtual screening. In such systems task scheduling is complicated by the heterogeneity of computational nodes and the ability of separate nodes to join and leave Desktop Grid in random times. So, the computing nodes are considered as independent agents. To model task scheduling process in a Desktop Grid we use a congestion game. Congestion games describe situations where players compete for using a shared set of resources, and the utility of each player depends not only on the chosen resource, but also on the number of other players using it ("congestion level"). The game considered in the work is proven to have at least one Nash equilibrium in pure strategies. The equilibrium situation means that no node can increase amount of its useful work by unilaterally deviating from the schedule. Moreover, best-and better-response dynamics are guaranteed to converge to equilibrium in polynomial time.
    
    Speaker: Ms Natalia Nikitina (Institute of Applied Mathematical Research, Karelian Research Center RAS)
    
    Slides
  - 57
    
    Using General Lattice Program on the desktopgrid systems
    
    The prediction of the crystal structure is an important area in the chemistry and physics. There are many software implementations for solving this problem. The report contains the overview of the main implementations. The specific theme of the report is to adapt the application GULP to the infrastructure. The General Utility Lattice Program (GULP) is designed to perform a variety of tasks based on force field methods. The report contains the description of specifics of the functioning GULP application and various approaches to adapt it to the BONC system. Description of the experimental computing and various approaches to automated analysis of the calculation results and the generation of new jobs. The report contains recommendations on the organization of calculations of this kind.
    
    Speaker: Nikolay Khrapov (Pavlovich)
- 8. High performance computing, CPU architectures, GPU, FPGA LIT Conference Hall
  
  LIT Conference Hall
  - 58
    
    Optimization algorithms for computing options for the hybrid system
    
    In the article the problem of optimization of GPGPU option pricing algorithms. The main goal to achieve maximum efficiency from the use of the hybrid system. The authors offered some transformation algorithm derived from Blake-Scholes model for pricing European and Asian option on the Monte Carlo method based on GPGPU architecture features. The basic idea is the constant optimization of work with one large array of data. Keywords: European option, Asian option, hybrid system, GPGPU, CUDA, Monte Carlo method, Blake-Scholes model.
    
    Speaker: Dmitry Khmel (Saint Petersburg State University)
    
    Slides
  - 59
    
    Pragmatic appliance of GPGPU technology
    
    Abstract. This article presents an analysis of the practical appliance of GPGPU technology. Modern graphics accelerators can be used in a fairly wide range of activity fields beginning with entertainment, multimedia systems and not even limited with scientific investigations. To engage the enormous potential concealed in the device, multiple special tools and programming languages programming, such as OpenGL, OpenCL and CUDA, have been developed and standardized, helping Wolfram Mathematica, MATLAB, Maple and other popular computational packages reveal the real power of GPGPU. So, Keywords: GPGPU, CUDA, OpenCL, Parallel Computing
    
    Speaker: Dmitry Khmel (Saint Petersburg State University)
    
    Slides
  - 60
    
    Elastic Imaging using Multiprocessor Computing Systems
    
    One of the main method for search and exploration of oil and gas deposits and understanding the structure of the Earth’s crust is the seismic survey. The migration process allows us to estimate the position of geological boundaries under the day surface. A lot of different migration techniques were developed using the acoustic approximation. One of the obvious improvements is the transfer to the elastic medium model that successfully describes P-waves and S-waves. Consequences of it – the grows of numerical complexity of the mathematical problem and the increase of computer resource requirements. Real field data contain the information about kilometers of area and for the reasonable imaging the step of computational mesh must be from one to ten meters. Thus, for the realization of elastic migration procedures it is necessary to use modern HPC systems. The goal of this work was the development and investigation of the elastic imaging method using Born approximation for quasi-homogeneous elastic media. The research software in Mathematica system was developed, and a set of calculations for simple geological models were carried out using 12-cores shared memory system. The assessment of the scalability shows approximately 90 % of effectiveness. The study was funded by Ministry of Education and Science of the Russian Federation under grant agreement No. 14.575.21.0084 on October 20, 2014 (the unique identifier PNI: RFMEFI57514X0084) in the Moscow Institute of Physics and Technology (State University).
    
    Speaker: Dr Vasily Golubev (Moscow Institute of Physics and Technology)
    
    Slides
  - 61
    
    HPC Cluster. The Modern Paradigm.
    
    The modern science has a class of problems related to theoretical research, computer simulation and big data analysis. The classical method of solution is based on usage of serial (single threaded) algorithms having a long run time. The modern method of solution offers high-performance parallel (many threaded) algorithms. They reduce the run time greatly but have specific requirements for computer systems. The article is described the architecture and system software of a high-performance cluster by the example of heterogeneous cluster HybriLIT. The modern paradigm of high performance cluster was formulated. The construction principles of such computer systems were given.
    
    Speaker: Dmitry Belyakov (JINR)
    
    Slides
  - 62
    
    Development of algorithm for efficient pipeline execution of batch jobs on computer cluster
    
    The problem of reliability and stability of high performance computing parallel jobs become more and more topical with the increasing number of cluster nodes. Existing solutions rely mainly on inefficient process of RAM dumping to stable storage. In case of really big supercomputers, such approach – making checkpoints - may be completely unacceptable. In this study, I examined the model of distributed computing – Actor model - and on this basis I developed an algorithm of batch jobs processing on a cluster that restores interrupted computation state without checkpoints. The algorithm is part of a computing model that, to be specific, I called "computational kernels model in the name of its core component – computational kernel. This work describes all the components of the new model, its internal processes, benefits and potential problems.
    
    Speaker: Mr Yury Tipikin (Saint Petersburg University)
  - 63
    
    Учебный кластер из одноплатных компьютеров Raspberry Pi
    
    Встраиваемые устройства (embedded devices) становятся все более популярными. Их можно найти в моб. телефонах, планшетах, автомобильной электронике, роботах и т.д. Количество задач, решаемых на таких устройствах, постоянно растет, что приводит к необходимости объединения их в кластеры. Такой кластер, например, установлен на сингапурском спутнике X-Sat[xsat]. Исследование возможностей встраиваемых устройств упростилась с появлением одноплатного компьютера Raspberry Pi [rpi]. Этот компьютер размером с кредитную карту обладает 4х ядерным ARM-процессором Cortex-A7 с частотой 900МГц, 1 Гб ОЗУ (характеристики приведены для Raspberry Pi 2 model B). Для него существует Debian-подобная ОС Raspbian и даже специальная версия Windows (Windows IoT). Устройство потребляет до 600 мА (без внешней периферии) при напряжении в 5В. Эти характеристики вкупе с невысокой стоимостью (35$) сделали это устройство весьма популярным. На его основе было построено несколько учебных кластеров (например 64х узловый кластер Iridis-Pi в университете Саутгемптона[iridis]). Такие системы как правило позиционируются как прототипы реальных распределенных систем, позволяющие студентам опробовать такие технологии как Hadoop, Spark и др. Достоинствами таких встраиваемых кластеров (embedded cluster) являются невысокая стоимость, низкое энергопотребление и компактнось. По всей видимости, в будущем такие кластеры смогут найти применение и в реальных приложениях - например, в робототехнике. В работах, посвященных кластерам (напр [iridis]), как правило приводятся результаты различных стандартных тестов (пропускная способность, Linpack и т.д.) В предлагаемой работе мы исследуем возможности распределенных вычислений широко известной среды статистических вычислений R. Вычислительные эксперименты проводятся на кластере из Raspberry Pi, построенном одним из авторов. В качестве теста используется программа, оценивающая прогноз линейной регрессии методом бутстрап. Наш кластер состоит из трех Raspberry Pi 2 model B , соединенных с помощью бюджетного роутера фирмы Asus. Питается кластер от USB-хаба. На каждом узле установлена ОС Raspbian. На каждом узле запущен ssh-сервер, что позволяет получать доступ с терминала через любой ssh-клиент. Используется R версии 3.2.3. Основные выводы, полученные в результате работы: 1) Кластер из Raspberry Pi может быть построен очень быстро и сравнительно с небольшими затратами; 2) скорость вычислений на таких машинах невысока, но они представляют собой отличный полигон для изучения концепций и технологий распределенных вычислений; 3) среда статистических вычислений R включает несколько интересных инструментов для распределенных вычислений, весьма удобных в использовании;по всей видимости для быстрого старта в распределенных вычислениях эта среда подходит лучше, чем такие тяжелые технологии, как Hadoop и Spark. [xsat]I. V. McLoughlin, T. R. Bretschneider, Chen Zheming Virtualized Development and Testing for Emb edded Cluster Computing International Journal of Networking and Computing Volume 2, Number 2, pages 160–187, July 20A [rpi]https://www.raspberrypi.org/ [iridis] Simon J. Cox, James T. Cox, Richard P. Boardman, Steven J. Johnston, Mark Scott, Neil S. O'Brien Iridis-pi: a low-cost, compact demonstration cluster Cluster Computing, June 2013, Volume 17, Issue 2, pp 349-358
    
    Speaker: Илья Никольский (МГУ им М.В. Ломоносова)
    
    Slides
- 15:00
  
  Coffee
- 2. Operation, monitoring, optimization in distributed computing systems 406B
  
  406B
  - 64
    
    Центр управления Многофункциональным Информационно-Вычислительным Комплексом ЛИТ ОИЯИ
    
    Для обеспечения качества сервиса, требуемого для центра обработки и хранения данных уровня Тier1, на территории ЛИТ ОИЯИ был создан центр управления Многофункциональным информационно-вычислительным комплексом. Основными задачами данного центра управления являются не только круглосуточное наблюдение за состоянием оборудования ЦОД, работоспособности сервисов, правильного функционирования инженерной инфраструктуры, но и агрегация и аналитика получаемых данных от различных средств мониторинга, с целью предугадать и заранее предотвратить развитие нештатной ситуации. В процессе создания Центра управления МИВК был решен ряд задач по его техническому и программному обеспечению, которые освещены в данном докладе.
    
    Speaker: Mr Aleksei GOLUNOV (JINR)
  - 65
    
    Web-service for monitoring heterogeneous cluster "HybriLIT"
    
    The heterogeneous cluster HybriLIT is designed for the development of parallel applications and for carrying out parallel computations asked by a wide range of tasks arising in the scientific and applied research conducted by JINR. The efficient work on the cluster needs the implementation of service of statistics provided to the users. Even though tasks of monitoring of distributed computing and gathering its statistics are encountered more and more frequently, there is not so many well-known methods to do this. We developing web-service for hybrid heterogeneous cluster “HybriLIT”, that solves that task using Node.JS as it’s server and AngularJS for a presentment of data. Monitoring itself carried out by a sensor written on C++ with the using of libgtop library. At the moment functions of monitoring CPU load, memory load,network load of the computing node and browsing that data in both table and graphical form are already implemented. Also, there are diagrams of usage for different laboratories and users, information about currently running jobs and an archive table for a jobs that was computed on a cluster.
    
    Speaker: Mr Yurii Butenko (JINR)
  - 66
    
    Implementation of Coarse-Grained Parallel Scheme of Branch-and-Bound Algorithm for Discrete Optimization in Everest Platform
    
    In this study we examine the coarse-grained approach to parallelization of the branch-and-bound algorithm. Our approach is that we divide a mixed-integer programming problem on a set of subproblems by fixing some of the integer variables. Subproblems are solved by open-source solvers running in parallel on multiple hosts. When solver finds an incumbent it broadcasts it and other solvers can use its objective value during solution process. Solver life cycle is managed by the Everest Web-based platform for distributed computing. The platform was also modified to allow solvers exchange messages with incumbents and other data. The system was test on several mixed-integer programming problems and a noticeable speedup was shown. The reported study was partially supported by RFBR, research project №16-07-01150 А.
    
    Speaker: Mr Sergey Smirnov (Institute for Information Transmission Problems of the Russian Academy of Sciences)
  - 67
    
    Tuning parameters of a mixed-integer programming solver for real world problems
    
    Occasionally one needs to teach a solver to solve similar problems quickly. It can be achieved by tuning parameters of the solver algorithm. This process can also be automatized. We present a tool that tunes configuration parameters of an algorithm. Parameters are tuned to minimize the solving time for a set of problems. SCIP is a mixed-integer programming solver developed at Zuse Institute Berlin. The solver has more than 1500 configuration parameters. Most of the parameters are related to the solution process, others apply to solver's input/output. There are both discrete and continuous parameters. Our tool modifies parameters one by one to find ones having the most impact on the solving time. Then combinations of the best parameter values are evaluated. This approach implies that a great amount of solver runs is needed: 1-2 values of every parameter multiplied by the number of parameters multiplied by the number of test problems. Thus we employ a public cloud to create a temporary computational cluster for faster processing. The paper presents an overview of the system and some real world usage examples.
    
    Speaker: Mr Sergey Smirnov (Institute for Information Transmission Problems of the Russian Academy of Sciences)
  - 68
    
    THE EULERIAN GRAPHS APPROXIMATION METHODS FOR THE PROBLEM OF COMMUNICATIONS MONITORING
    
    One of the approaches to organization of communication system monitoring is approx-imation of corresponding graph by Eurelian graph, that excludes repeated vertex advancing. The work provides classification of the input graph vertexes in regard to edges between subsets with even and odd degrees. The study also considers various approximating graph construction methods, first of all those, that minimize changes in adjacent matrix. The work shows the conditions of existence of interior approximation without additional elements. . [1] Фляйшнер Г. Эйлеровы графы и смежные вопросы. М.: Мир, 2002, 335 с. [2 Fleischner H. Eurelian graphs and related topics. Part 1, v. 2, Amsterdam: Elsevier science publishers B.V., 1991, 337 p. [3] Кристофидес Н. Теория графов (алгоритмический подход). М.: Мир, 1978, 432 с. [4] Раппопорт А.М. Измерение расстояний между взвешенными графами структуризо-ванных экспертных суждений / Многокритериальный выбор при решении слабострук-туризованных проблем/. Сб. трудов. Вып. 5, М.: ВНИИСИ, 1978, с. с. 97 - 108. [5] Раппопорт А.М. Эффективный мониторинг коммуникаций на основе внешней ап-проксимации графа. /Распределенные вычисления в науке и образовании/: Труды пя-той международной конференции –Дубна: ОИЯИ, 2012, с. с. 377-382.
    
    Speaker: Alexander RAPPOPORT (A. A. Kharkevich Institute for Information Transmission Problems, RAS)
  - 69
    
    Automated Troubleshooting in Computer Clusters and Data Centers
    
    Speaker: Эдуард МЕДВЕДЕВ (Brocade Communications)
- 1. Technologies, architectures, models of distributed computing systems 310
  
  310
  - 70
    
    Application of TRIE data structure and corresponding associative algorithms for process optimization in GRID environment
    
    GRID model became widely used last years, arranging lots of computational resources in different environments, revealing problems of Big Data and horizontally scalable multiuser systems. In this paper there is an analysis of TRIE data structure and its application in contemporary GRID-related technologies, including routing (L3 OSI) and specialized key-value storages engine implementation (L7 OSI). The main goal is to show how TRIE mechanisms can influence operation of GRID environment, delivery process of the resources and corresponding services. The article describes how mechanisms of associative memory implemented by TRIE can dramatically reduce the level of latency in various GRID sub-systems at different layers of abstraction. This analysis covers base algorithms, technologies review and experimental data gathering, which represents basis for conclusions and decision taking. ------ Download abstract in .docx file format at Google Drive: https://drive.google.com/file/d/0ByqW7uAnBOBYQ2JHUUE0MTdYelE/view
    
    Speaker: Mr Vladislav Kashansky (SUSU, Electronics Department)
    
    Slides
  - 71
    
    Mock Data Challenge for the MPD experiment on the NICA cluster
    
    The simulated data processing before receiving first experimental data is an important issue in the high-energy physics experiments. This work presents Mock Data Challenge (MDC) for the MPD experiment at the NICA accelerator complex. It uses the ongoing simulation studies to exercise in a stress-testing distributed computing infrastructure and experiment software in the full production environment from simulated data through a physical analysis. The presentation describes a hardware part – the current scheme and structure of the distributed NICA cluster for storing and processing data obtained at the MPD detector. In addition, software for building data storage and parallelization of the MPD data processing is noted. The MDC presented in the work allows one to test the full processing chain (simulation, reconstruction and following physical analysis) for the MC data stream parallelized by the MPD scheduling system on the NICA cluster and helps to identify its potential issues.
    
    Speaker: Konstantin Gertsenberger (JINR)
  - 72
    
    Моделирование системы распределенной обработки данных эксперимента BM@N в составе комплекса T0-T1 NICA
    
    В ЛИТ ОИЯИ проводятся работы по созданию компьютерного off-line комплекса для моделирования, обработки, анализа и хранения данных комплекса NICA. В состав комплекса входят различные экспериментальные установки, в том числе BM@N для проведения экспериментов на Нуклотроне с выведенными пучками тяжелых ионов. В 2017 году планируется запуск эксперимента BM@N, в связи с чем необходимо создать систему распределенной обработки данных, полученных с установки. Для выбора архитектуры распределенной системы сбора, хранения и обработки данных и определения необходимого состава оборудования используется программа моделирования SyMSim. В работе представлены результаты моделирования вычислительного комплекса по приему и обработке данных с эксперимента BM@N.
    
    Speaker: Дарья Пряхина (ЛИТ)
    
    Slides
  - 73
    
    Modelling LIT Cloud Infrastructure at JINR and Evaluating the Model
    
    The project is aimed at modelling LIT Cloud infrastructure at JINR and solving a problem of effective resource utilization and performance evaluation of allocation and migration algorithms considering virtual machines deployment at the facilities. The main intention is to distribute processes over involved resources with the highest density possible. In order to fulfil the requirements regarding a successful solution of the problem, the assessment of existing modelling frameworks and their comparison have been carried out. Furthermore, an initial to-be-optimised model has been implemented. Currently, the project is focused on evaluation of the model and its adjustment to the used real-life environment.
    
    Speaker: Mr Vagram Airiian (LIT JINR / Dubna State University)
    
    Slides
  - 74
    
    Optimization for Bioinformatics genome sequencing pipelines by means of HEP computing tools for Grid and Supercomputers
    
    Modern biology uses complex algorithms and sophisticated software toolkits for genome sequencing studies, computations for which are impossible without access to powerful or significant computing resources. Recent advances of Next Generation Genome Sequencing (NGS) technology led to increasing volumes of sequencing data that need to be processed, analyzed and made available for bioinformaticians worldwide. Analysis of ancient genomes sequencing data using popular software pipeline PALEOMIX can require resource allocation of powerful standalone computer for a few weeks. PALEOMIX include typical set of software used to process NGS data including adapter trimming, read filtering, sequence alignment, genotyping and phylogenetic or metagenomic analysis. Organization the computation by sophisticated WMS and efficient usage of the supercomputers can greatly enhance this pipeline. Using related storage systems facilitate subsequent analysis. Bioinformatics and other compute intensive sciences draw attention to the success of the projects which use PanDA beyond HEP and Grid. PanDA - Production and Distributed Analysis Workload Management System has been developed to address data processing and analysis challenges of ATLAS experiment at LHC. Recently PanDA has been extended to run HEP and beyond HEP scientific applications on Leadership Class Facilities and supercomputers. In this paper we will describe the adaptation of the PALEOMIX pipeline to a distributed computing environment powered by PanDA for Ancient Mammoths DNA samples. We used PanDA to manage computational tasks on a multi-node parallel supercomputer. That was possible as we split input files into chunks which could be computed in parallel on different nodes as separate inputs for PALEOMIX and finally merge output result. We dramatically decreased the total computation time because of jobs brokering, submission and auto resubmission of failed ones by means of PanDA, which also demonstrated it earlier for the HEP applications in the Grid. Thus using software tools developed initially for HEP and Grid can reduce computation time for bioinformatics tasks such as PALEOMIX pipeline for Ancient Mammoths DNA samples from weeks to days.
    
    Speaker: Mr Alexander Novikov (National Research Centre "Kurchatov Institute")
    
    Slides
  - 75
    
    GRID and Quanputers
    
    GRIG is considered as a discrete dynamical system. Quanputers are discrete dynamical systems extended by corresponding linear systems up to the hamiltonian systems. The theory, construction and programming of quanputers considered [1]. References [1] Makhaldiani N. DYNAMICS, DISCRETE DYNAMICS AND QUANPUTERS, Reports of Enlarged Session of the Seminar of I. Vekua Institute of Applied Mathematics 2015, V. 29, P. 76.
    
    Speaker: Dr Nugzar Makhaldiani (JINR)
    
    Slides
- 7. Desktop grid technologies and volunteer computing
  - 76
    
    Using grid systems for enumerating combinatorial objects on example of diagonal Latin squares
    
    One of important classes of combinatorial and discrete optimization problems [1] is formed by enumeration problems. During their decision one needs to determine a number of objects with specified properties. The simplest examples of such problems are well known problems about chess rooks, chess queens, etc. For some of them precise analytic decisions are known, while to solve others one needs to perform exhaustive search to enumerate the number of decisions satisfying the specified constraints. For example, for chess rooks problem the number of possible dispositions of N rooks on the board of size N×N matches the number of permutations and is equal to N!. For some problems from the considered class the number of decisions can be expressed using Stirling numbers (of the first and second kind), Bell numbers [2], the number of combinations or partial permutations and so on. At the same time the precise analytical formulas for the number of decisions for chess queens problem or the number of Latin squares of order N are unknown (in the latter case there are known upper and lower bounds). The number of decisions usually grows rapidly with the increase of the dimension of a problem N, that is why when enumerating corresponding objects using brute force strategy one has to develop highly effective program implementation that takes into account the features of considered problem and provides high rate of generation of enumerated objects. From the point of view of parallel programming the enumeration problems of such type are weakly coupled problems, thus the algorithms for their solving can be implemented in the form of parallel programs that are efficient within the context of parallel computing environments with various architecture that comprise grid systems. One of the combinatorial objects of considered type is diagonal Latin squares (DLS) that are square tables of size N×N, where each cell is filled by an element of some alphabet (typically a number from 0 to N–1) and in each row, each column and also in main and second diagonals all elements are distinct. Basically, DLS are a special case of Latin squares (LS) that satisfy additional diagonality constraints. Using simple transformations that do not violate any of the constraints any DLS can be reduced to DLS in which the elements of the first row are sorted in ascending order. The corresponding squares form an isomorphism class of size N!. The dependence of number of LS on N is well known and presented by A000315 sequence in the Online Encyclopedia of Integer Sequences (OEIS) [3], the dependence of the number of LS with fixed first row has the number A000479. For DLS similar sequences are unknown and can be calculated using brute force. Native program implementation of this enumerating process is rather ineffective and has the rate of generating squares of order 10 less than 1 DLS/s. In order to increase this rate and, as a result, reduce computing time costs we introduced into the implementation the following optimizations: altering the order of filling elements of DLS; using static data structures instead of placing it in the dynamic memory; using information about the number of possible values for none filled cells of square combined with filling the cells with single cardinality out of order and avoiding unpromising branches of combinatorial tree with zero cardinality early; using auxiliary data structures (one dimension arrays) for filling the set of allowed items fast; switching off the Hyper-Threading technology during single threaded generation of DLS combined with avoiding background load on the CPU cores not used for generation; selecting the order of the filling cells by minimal cardinality criterion to decrease the arity of nodes of combinatorial tree; using PGO compilation. As a result it was possible to achieve the rate of generation of about 220 000 DLS/s for recurrent single threaded CPU-oriented program implementation on Delphi language and 240 000 DLS/s for similar implementation on C language (processor Intel Core i7 4770). By developing an alternative special iterative program implementation with nested loops the rate of generation was additionally increased to 790 000 DLS/s. Thus the developed program implementation is almost 6 orders more effective compared to native implementation, and this fact makes it possible to use it to enumerate some combinatorial objects (for example, DLS and pairs of orthogonal DLS, also known as Graeco-Latin squares). We applied it to enumerating DLS with fixed first row for several values of N. The corresponding numerical sequence is as follows: 1, 0, 0, 2, 8, 128, 171200, 7447587840. The total number of DLS can be calculated from the given sequence by multiplying its members by the cardinality of corresponding isomorphism class that is equal to N!: 1, 0, 0, 48, 960, 92160, 862848000, 300286741708800. At this moment authors are preparing the computational experiment aimed at organizing distributed enumeration of the number of DLS for cases with greater values of N using grid systems organized on voluntary basis. The research was partially supported by the state assignments for the Southwest State University (2014–2017 years, no. 2246), by Russian Foundation for Basic Research (grants 14-07-00403-a, 15-07-07891-a and 16-07-00155-a) and by Council for Grants of the President of the Russian Federation (grants NSh-8081.2016.9, MK-9445.2016.8 and stipend SP-1184.2015.5). We thank citerra [Russia team] for his help in development and implementing of some algorithms. Bibliography 1. Vatutin E.I., Titov V.S., Emelyanov S.G. Basics of discrete combinatorial optimization (in Russian). M.: ARGAMAC-MEDIA, 2016. 270 p. 2. Vatutin E.I. Logic multicontrollers design. Getting separations of parallel graph-schemes of algorithms (in Russian). Saarbrucken: Lambert Academic Publishing, 2011. 292 p. 3. https://oeis.org/A000315
    
    Speaker: Eduard Vatutin (Southwest State University)
  - 77
    
    Quality analysis of block separations of graph-schemes of parallel control algorithms during logic control systems design using grid systems on volunteer basis
    
    One of promising approaches to the logic control systems (LCS) design is their implementation within logic multicontrollers (LMC) basis which are presented as the collective of similar controllers connected by network with regular topology [1]. One of the problems corresponding to a class of discrete combinatorial optimization problems [2] that arise during LMC design is a problem of getting separations [3]. Quality of its decision has direct influence on hardware complexity and speed characteristics of designed LMC. To solve the problem in practice heuristic methods are used which are characterized by different quality of decisions and different computing time costs for its forming. Computing experiments performed with samples of graph-schemes of parallel control algorithms with pseudorandom structure and customizable settings show dependencies of separations quality as on the size of the problem (number of vertices N within graph-scheme) as on power of restrictions. For more detailed analysis significant computing resources are necessary therefore needed computations were organized within volunteer computing project Gerasim@Home based on BOINC platform. The main aim of computations was getting separations for selected points of space formed by the size of the problem N and restrictions and (for number of logic control signals received from controlling object and for size of microprogram memory respectively) followed by pair comparison of samples with estimated quality parameters, calculating two dimensional slices of parameters space and its analysis. At this moment analysis of two slices of space was performed (1≤N≤700, 3≤Xmax≤150 and 1≤N≤600, 3≤Wmax≤200) for methods of S.I. Baranov, its modification with adjacency restriction and for parallel-sequential decomposition method. For small area of space (1≤N≤200) research opportunities of random search method with fixed number of iterations was performed. Corresponding computations were performed within Gerasim@Home project from July 2010 to March 2016 (with some breaks for data analysis, server works and different problems computations). Resulting volume of raw experimental data was more than 500 GB. During calculations within the project real performance made up about 2–3 TFLOPS that was provided by more than 2000 volunteers from 69 countries, attracted more than 1000 personal computers. After analyzing of given results a set of conclusions and recommendations was formulated and published. First, strong zone dependence of decisions quality on the point of parameters space was confirmed. It is revealed that the area of preferable use of S.I. Baranov method is the zone of weak or absent restrictions, for method of parallel-sequential decomposition – zone of strong and very strong restrictions, and adjacent modification of S.I. Baranov method has compromise position in the area of mean force restrictions. Random search method may be used in the strong restrictions zone for graph-schemes with small number of vertices. With use of given dependences zones of insensitivity are identified where changing the power of restrictions has influence only on hardware complexity of controllers and doesn’t change its functions characteristics. This feature may be used during structural-parametric optimization of LMC structure and allows (by some times) to reduce significantly its hardware complexity through selecting optimal structure with big number of relatively simple controllers. As future investigations it is planned to expand studied areas of parameter space (by N to 800, by Xmax to 200 and by Wmax to 300), at this moment post processing of corresponding experiments results is taking place. Also development, program optimization and approbation of additional program implementations are planned that correspond to more intellectual well known iterative heuristic methods of getting separations and its multistage modifications that has less computing time costs, higher quality of decisions and convergence rate. The work was performed within the base part of the state assignments for the Southwest State University in 2014–2017, number 2246 and under support of President grant MK-9445.2016.8. The authors would like to thank all volunteers who took part in the calculation within the distributed computing project Gerasim@Home. The authors also wish to thank Anna Vayzbina for assistance in preparing the English version of the article. Bibliography 1. Organization and synthesis of microprogram multimicrocontrollers (in Russian) / Zotov I.V., Koloskov V.A., Titov V.S. et al. Kursk, 1999. 368 p. 2. Vatutin E.I., Titov V.S., Emelyanov S.G. Basics of discrete combinatorial optimization (in Russian). M.: ARGAMAC-MEDIA, 2016. 270 p. 3. Vatutin E.I. Logic multicontrollers design. Getting separations of parallel graph-schemes of algorithms (in Russian). Saarbrucken: Lambert Academic Publishing, 2011. 292 p.
    
    Speaker: Eduard Vatutin (Southwest State University)
  - 78
    
    Using volunteer computing for comparison of quality of decisions of heuristic methods in the problem of getting shortest path in the graph with graph density constraint
    
    There is a well known extensive class of optimization problems where parameters are discrete variables. They include problems from graphs theory, scheduling, operations research and so on. One part of them named as hard solved problems and forming a NP complexity class can’t be solved precisely for reasonable computing time therefore in practice for their solving heuristic methods are used. At this moment the most popular and widely used in practice is the set of the following methods [1]: greedy methods, methods of restricted enumeration (or brute force with restricted depth of analyzed combinatorial tree, restricted number of its branches and so on), methods of random and weighted (directed) search, bioinspired methods (for example, methods of ants or bees colony), simulated annealing method, genetic (evolutionary) methods. Their modifications are well known corresponding, for example, to early clipping of unpromising solutions (branch and bound strategy), supporting combinatorial returns for breaking the deadlocks, changing order of selecting elements during decisions forming, etc. Complexity of practice implementation, computing time costs and quality of decisions differ significantly both for different methods and for different conditions of its using. The set of some methods demands smart tuning of its numerical parameters performed during meta-optimization that is computing time intensive procedure. It is interesting to compare the quality of decisions and selecting subset of methods that are characterized by high convergence rate and provides getting decisions with the best quality for minimum computing time costs. To find out the most promising methods well known discrete problem of getting shortest path in graph was selected. Its optimal decision can be found for squared time with using Dijkstra algorithm that makes it convenient to compare the quality of decisions of different heuristic methods with known optimum. For this reason corresponding computing unit was developed including program implementations of listed above methods and their modifications. For each method that has tuning parameters meta-optimization was performed (at this moment this stage performed in automatic mode and needs some tens of hours of computing time). After this computing unit was deployed within volunteer computing project Gerasim@Home at BOINC platform. With its using from April 2014 to June 2014 and from February 2015 to June 2015 a series of computing experiments was organized which aim to investigation quality decisions of heuristic methods for samples with random graphs with number of vertices N≤500 and density 0≤d≤1 and with fixed number of iterations. As a result of experimental data analysis a set of conclusions was formulated. First, in this problem zone dependence is much weaker comparing to getting separations problem [2] that had been investigated earlier. The experimental dependences look like hyperbole in (N, d) coordinates that is consistent with the theoretical concepts. For high density graphs well known heuristic methods without modifications provide sufficient quality of decisions, the most promising of them are ant colony optimization method and genetic method. By decreasing density of graphs ant colony optimization method provides the best quality of solutions with support of combinatorial returns. Many well known heuristic methods which are successfully used in practice do not demonstrated high quality of decisions in this area. For example, simulated annealing method provides rare probability in finding correct decisions (paths) due to difficulties during modification of correct decision with saving its correctness; bee colony optimization method has strong dependence of tuning parameters on coordinates (N, d) that does not allow to select universal set of values of tuning parameters and forces to perform computing time complex meta-optimization by each use. Formulated recommendations can be expanded in future and may be used for solving more complex discrete combinatorial optimization problems that have practice significance. In addition, in the perspective of further researches it is necessary to investigate time costs analysis and convergence rate analysis for set of listed above heuristic methods. According to these results it is possible to work out more complex multistage methods improving the given characteristics. The work was performed within the base part of the state assignments for the Southwest State University in 2014–2017 number 2246 and under support of President grant MK-9445.2016.8. The authors would like to thank all volunteers who took part in the calculation within the distributed computing project Gerasim@Home. The authors also wish to thank Anna Vayzbina for assistance in preparing the English version of the article. Bibliography 1. Vatutin E.I., Titov V.S., Emelyanov S.G. Basics of discrete combinatorial optimization (in Russian). M.: ARGAMAC-MEDIA, 2016. 270 p. 2. Vatutin E.I. Logic multicontrollers design. Getting separations of parallel graph-schemes of algorithms (in Russian). Saarbrucken: Lambert Academic Publishing, 2011. 292 p.
    
    Speaker: Eduard Vatutin (Southwest State University)
  - 79
    
    Using volunteer computing to solve SAT-based cryptanalysis problems for the Bivium keystream generator
    
    Usually if the cryptanalysis is considered as a SAT problem then it is called a SAT-based cryptanalysis. In this case to find a secret key it is sufficient to find a solution of corresponding satisfiable SAT instance. Here we consider the SAT-based cryptanalysis of the Bivium keystream generator. This generator uses two shift registers of a special kind. The first register contains 93 cells and the second contains 84 cells. To initialize the cipher, a secret key of length 80 bit is put to the first register, and a fixed (known) initialization vector of length 80 bit is put to the second register. All remaining cells are filled with zeros. An initialization phase consists of 708 rounds during which keystream output is not released. We considered cryptanalysis problems for Bivium in the following formulation. Based on the known fragment of keystream we search for the values of all registers cells (177 bits) at the end of the initialization phase. Therefore, in our experiments we used SAT encodings where the initialization phase was omitted. The SAT-based cryptanalysis of Bivium turned out to be very hard, that is why we decided to solve several weakened cryptanalysis instances for this generator. Below we use the notation BiviumK to denote a weakened problem for Bivium with known values of K variables encoding the last K cells of the second shift register. In SAT@home 5 Bivium9 instances were successfully solved in 2014. We also tried another approach for solving weakened Bivium instances. On the first stage a SAT instance is being processed on a computational cluster by running the PDSAT solver (which was developed by us) in the solving mode. During this process, the time limit equal to 0.1 seconds (this value was selected according to experiments) for every subproblem, is used. PDSAT collects (by writing to a file) all subproblems which could not be solved within the time limit. It turned out, that this approach allowed to solve 2 out of 3 instances Bivium10 on a cluster (i.e., despite the time limit, PDSAT found a satisfying assignments for these 2 instances). It should be noted, that during processing these 2 instances the new approach was about 2 times faster than the approach without time limits. Solving of the remaining instance was launched in SAT@home with the help of the file with data about the hard subproblems (interrupted by time limit), collected by PDSAT. Finally, this instance was successfully solved too. So, we can conclude that with the help of the proposed approach some instances can be quickly processed on a computational cluster, and a volunteer computing project suits well for processing the remaining instances. We hope that this approach will help us to solve nonweakened instances of cryptanalysis of Bivium in the nearest future.
    
    Speaker: Mr Oleg Zaikin (Institute for System Dynamics and Control Theory of Siberian Branch of Russian Academy of Sciences)
  - 80
    
    Using dynamic deadline in the volunteer computing project SAT@home
    
    In every volunteer computing project some deadline is used. It is the time limit for execution of a computational tasks on a user computer. If a task wasn’t processed in deadline, then project server sends this task to another user’s host. The value of deadline has a great influence to effectiveness of a volunteer computing project. If this value is low, then a lot of hosts will not keep tasks processing in time. If this value is high, then in some cases a solution can be found slower (if this solution corresponds to any task of a host which has been reached the deadline). We developed the algorithm for calculation the value of deadline which suits well for a particular volunteer computing project. The input data here is percentage of tasks we want to be processed in deadline. According to the algorithm we analyze database of the project – we need to now how many tasks could be processed in different possible variants of a deadline. As a result of this analysis algorithm determine the value of deadline which should be used in the past in ideally case. Based on this value we can predict the value of deadline we should use in the nearest future. The suggested algorithm was implemented and used in the volunteer computing project SAT@home. We used the threshold percentage value equal to 97 % as an input of algorithm. Deadline was dynamically changed once a day during several months. It turned out, that default deadline (equal to 10 days) is not the better choiсe – in the case of SAT@home the value of 8 days is better.
    
    Speaker: Mr Alexey Zhuravlev (Internet portal BOINC.ru)
    
    Slides
  - 81
    
    Распределенная вычислительная сеть MarGrid на базе компьютеров республики Марий Эл
    
    В статье рассматривается архитектура и особенности реализации распределенной вычислительной сети MarGrid для решения задач, обладающих высокой вычислительной сложностью. MarGrid представляет собой клиент-серверную распределенную вычислительную сеть (РВС), построенная на базе трехуровневой архитектуры (клиентская часть приложения - сервер приложения - сервер базы данных). Для обмена сообщениями между клиентом и сервером используется программный фреймворк Windows Communication Foundation. Используется централизованный метод обнаружения ресурсов, централизованная доступность ресурсов, централизованное взаимодействие узлов. На сегодняшний день MarGrid объединяет 670 компьютеров и серверов университетов и общеобразовательных школ г.Йошкар-Олы с потенциальной вычислительной производительностью до 106TFlops и до 5000 одновременно выполняющихся задач. Разработанная распределенная вычислительная сеть MarGrid удовлетворяет всем предъявляемым требованиям: масштабируемость, поддержание логической целостности данных, устойчивость, WCF обеспечивает безопасность, эффективность. Полученное решение способствует увеличению производительности ресурсоемких вычислений в приоритетных направлениях Республики Марий Эл, которые включают: молекулярно-генетические исследования, биотехнологии, компьютерное моделирование, развитие инфокоммуникационной инфраструктуры научных исследований и информационную поддержку инновационной деятельности и др.
    
    Speaker: Vladimir Bezrodny (Volga State University of Technology)
  - 82
    
    Рейтинг проектов добровольных распределенных вычислений
    
    Предложен новый измерительный инструмент для оценки качества проектов добровольных распределенных вычислений - ЯК-рейтинг. Разработана анкета для сбора информации от сообщества добровольцев распределенных вычислений. Представлены первичные результаты опроса и составлен предварительный рейтинг проектов.
    
    Speaker: Mr Ilya Kurochkin (IITP RAS)
  - 83
    
    МОДЕЛИ ПОВЕДЕНИЯ РОССИЙСКИХ УЧАСТНИКОВ ДОБРОВОЛЬНЫХ РАСПРЕДЕЛЕННЫХ ВЫЧИСЛЕНИЙ НА ПЛАТФОРМЕ BOINC
    
    В исследовании показано, что виртуальное сообщество российских участников добровольных распределённых вычислений (volunteer computing, VC) на платформе BOINC может быть рассмотрено в форме сети. В качестве узлов сети рассмотрены 2 типа объектов: участники сообщества (аккаунты пользователей, зарегистрированных на сайте boinc.ru) и исследовательские проекты, в которых пользователи принимают участие (зарегистрированные в системе BOINC учетные записи проектов). В графе, представляющем данную сеть, ребро соединяет одну из вершин, относящихся к первому типу – пользователи, а другую, ко второму – проекты, в которых пользователь, отображенный первой вершиной, принимает участие (предоставляет свои ресурсы для вычислений). В результате мы получим двудольный граф с типами вершин “участник” и “проект”. Вес каждого ребра равен количеству “кредитов” заработанных участником на исследовании, с которым он связан данным ребром. Рассматриваемая сеть состоит из 134 проектов, 740 проектов и 44985 участников, что в сумме дает 45119 вершин, 82827 связей между ними. Средняя степень вершины в графе составляет примерно 1,83. Среднее количество участников в проекте – 618. Диаметр графа равен 6, средняя длина пути: 2.14. По мнению исследователей российского BOINC-сообщества [Андреева А.,2014; Курочкин И. И., Якимец В.Н.,2014] мотивами участия в проектах и соответствующими моделями поведения участников VC являются: - ощущение причастности к важным научным исследованиям и, соответственно, получению значимых научных результатов; - командный дух и атмосфера состязательности; участники VC могут объединяться в команды по разным признакам (национальному, региональному, пр.); за выполненное задание пропорционально затраченным вычислительным ресурсам участникам проектов начисляются т.н. «кредиты»; количество «кредитов» является характеристикой, по которой команды и отдельные участники соревнуются между собой; - информированность о командном и/или индивидуальном участии в проекте; при получении результатов обычно на сайте проекта выкладывается информация об участнике, на ПК которого был получен данный результат. Описанные модели поведения участников VC рассмотрим в качестве гипотез исследования двудольного графа – тематическая, командная и количественная. Для проверки гипотез были использованы четыре различных метода кластеризации двудольных графов (методы SRE, k-means, PDDP, “информационного бутылочного горлышка”), в основном ранее применявшиеся для кластеризации документов. Сравнение результатов работы методов показало высокую степень применимости каждого из них к данному объекту исследования. Можно с уверенностью говорить о том, что тематическая гипотеза о том, что поведение участников boinc.ru находится в сильной зависимости от их тематических научных интересов, была доказана. Гипотеза о том, что поведение участников находится в зависимости от команды, в которой они состоят, была подтверждена частично. И наконец, предположение о том, что общая активность участников будет сильным сигналом их поведения при выборе проектов, было почти полностью опровергнута, за исключением отдельной группы сверхактивных участников. Результаты кластеризации подтверждают сформированное ранее представление о статистике участия российских кранчеров в проектах VC [Тищенко В. И., Прочко А. Л., 2014]. Те участники добровольных распределённых вычислений, которые были ориентированы на развитие проекта как способа решения фундаментальной научной проблемы и видели результаты своего труда – демонстрировали лучшие показатели в предоставлении вычислительных мощностей своих компьютеров – активность, постоянство, время подключения и т.п. Полученные результаты могут иметь существенное применение при решении практических задач по оптимизации работы в рамках сети Boinc.ru.
    
    Speaker: Dr Виктор Тищенко (Институт системного анализа ФИЦ "Информатика и управление" РАН)
- 8. High performance computing, CPU architectures, GPU, FPGA
  - 84
    
    Parallelization of a finite difference scheme for solving systems of 2D Sine-Gordon equations
    
    The numerical solving of systems of 2D Sine-Gordon equations is important both for pure mathematical theory and for applications. A second-order finite difference scheme is proposed for solving particular systems of 2D perturbed Sine-Gordon equations coupled via a cyclic tridiagonal matrix. The systems are considered on rectangular domains. In some cases the computational domain size and the number of time steps may be very large, which motivates a parallelization of the difference scheme. The difference scheme is parallelized by using MPI and OpenMP technologies. For different performance tests we use the computational resources of the HybriLIT cluster and the IICT-BAS cluster. Very good performance scalability is achieved.
    
    Speaker: Dr Ivan Hristov (University of Sofia / JINR)
    
    Slides
  - 85
    
    PARALLEL EVOLUTIONARY ALGORITHM IN HIGH-DIMENSIONAL OPTIMIZATION PROBLEM
    
    An implementation of a combined evolutionary algorithm for searching of an extremum of functions with many parameters is proposed. The algorithm is designed to optimize parameters of the molecular-dynamics reactive force field potential ReaxFF. It can be efficient for a variety of extremum searching problems with arbitrary complex objective function. The algorithm itself is a hybrid of two evolutionary methods: Genetic Algorithm which uses the principle of a natural selection in the population of individuals, and the Particle Swarm Optimization, which imitates the self-organization of the particle swarm. Individuals in a population as well as swarm particles can be considered as trial solution vectors. Combination of these two methods provides one with a possibility to work with objective functions with an unknown complex structure which often has a composition of specific peculiarities insurmountable by simple algorithms. Genetic Algorithm parameterizations regarding choosing its main strategies for computations with different objective functions has been analyzed. Results for the classical test functions convergence speed testing are presented. Effectiveness of the algorithm working on the platform of the computational system with shared memory and on the platform of the distributed system has been compared. Good scalability of implemented algorithm is demonstrated for the distributed computational systems.
    
    Speaker: Ms Sofia KOTRIAKHOVA (Saint-Petersburg State University)
    
    Slides
  - 86
    
    Parallel algorithms for calculation of binding energies and adatom diffusion on GaN (0001) surface
    
    Empirical many-body potentials were used to calculate the binding energies and diffusion barriers for various adatoms on the GaN (0001) surface. Potential energy surfaces were calculated for adatoms by letting them relax in c-direction on Ga terminated (0001) surface of GaN. The minimum energy positions for Ga and N adatoms on the Ga-terminated GaN (0001) surface were identified. This allowed determination of the diffusion pathways and the diffusion barriers for adatoms. Parallel GPGPU computing was performed for the acceleration of empirical potentials and the algorithms of energy minimization. OpenCL technology was used to support both CPU and GPU computing. The basic results were compared with DFT calculations for the same structures. It was shown that Tersoff and Stillinger-Weber empirical potentials do not reproduce correctly the minimum energy positions of adatoms on GaN (0001) surface. Additional potential fitting is required to reproduce binding energies and diffusion barriers. Interatomic potential fitting and potential energy surface calculation are the problems to be done in parallel by high performance hardware or GRID infrastructure.
    
    Speaker: Dr Alexander Minkin (National Research Centre "Kurchatov Institute")
    
    Slides
  - 87
    
    Calculation of ground states of few-body nuclei using NVIDIA CUDA technology
    
    The possibility of application of modern parallel computing solutions to speed up the calculations of ground states of few-body nuclei by Feynman's continual integrals method has been investigated. These calculations may require large computational time, particularly in the case of systems with many degrees of freedom. The results of application of general-purpose computing on graphics processing units (GPGPU) using NVIDIA CUDA technology are presented. The algorithm allowing us to perform calculations directly on GPU was developed and implemented in C++ programming language. Calculations were performed on the NVIDIA Tesla K40 accelerator installed within the heterogeneous cluster of the Laboratory of Information Technologies, Joint Institute for Nuclear Research, Dubna. The energy and the square modulus of the wave function of the ground states of several few-body nuclei have been calculated. The results show that the use of GPGPU significantly increases the speed of calculations.
    
    Speaker: Mikhail Naumenko (Joint Institute for Nuclear Research)
    
    Slides
  - 88
    
    The parallel framework for the partial wave analysis
    
    Partial wave analysis is a fundamental technique for extracting hadron spectra and hadron decay properties. It is employed in some current experiments like BES-III, LHCb, COMPASS, and future ones like PANDA. The analysis is typically performed using the event-by-event maximum likelihood method. For the BES-III experiment, fitting the accumulated data (about 1.225 billion J/psi decays) using currently employed software takes a long time which significantly complicates and sometimes restricts the data analysis. The development of new multicore CPU's and GPU's makes using parallel programming technologies natural to decrease the data fitting time. For this purpose, the parallel framework for partial wave analysis is being developed. Parallelization options employing various computing technologies including OpenMP, MPI, and OpenMP with Xeon Phi co-processor extensions were studied taking into account distinctive features of the task and external software used in the reation. They were tested using the resources of the heterogeneous cluster HybriLIT. The results on calculation speedup and efficiency as well as a comparative analysis of the developed parallel implementations are presented.
    
    Speaker: Ms Victoria Tokareva (JINR)
  - 89
    
    Параллельные программы библиотеки JINRLIB
    
    JINRLIB (http://www.jinr.ru/programs/jinrlib/) - библиотека программ, предназначенных для решения широкого круга математических и физических задач. Программы объединяются в библиотеки объектных модулей или существуют в виде самостоятельных пакетов прикладных программ. В настоящий момент насчитывается более 60 программных пакетов. В последнее время происходит бурное развитие технологий программирования параллельных вычислений, в частности, MPI. Эта тенденция нашла свое отражение и в библиотеке JINRLIB. Была сформулирована следующая стратегия распараллеливания: программа, подготовленная для работы в среде MPI, должна успешно работать при любом количестве NP параллельных процессов, вовлекаемых в решение задачи, в том числе и при NP=1. Таким образом, возникает единый исходный текст программы, равно пригодный к эксплуатации как на традиционных последовательных вычислительных системах, так и на современных кластерах, состоящих из большого числа процессоров. Эта идея была успешно реализована при распараллеливании программ, описанных ниже. MINUIT - параллельная версия программы минимизации функций многих переменных. На примере MINUIT обсуждаются проблемы распараллеливания больших вычислительных программ. PFUMILI - модификация известной программы FUMILI, допускающая ее эффективную эксплуатацию на современных вычислительных кластерах, объединяющих сотни однотипных процессоров. CLEBSCH2 - вычисление простейшей формы коэффициентов Клебша-Гордана (k,n)=k!*(n-k)!/n!. Программа свободна от типичных при вычислении факториалов "в лоб" случаев переполнения при умножении. PRIMUS - программа Л.Александрова, реализующая классический алгоритм так называемого решета Эратосфена для генерации простых чисел. Авторский интерфейс был модифицирован для упрощения возможности использования нескольких процессоров в рамках технологии MPI. PROFILE - программный инструмент для исследования производительности программ в определяемых пользователем интервалах. Программа пригодна для использования в традиционных (последовательных) фортранных программах, так и в распараллеленных с использованием технологии MPI. Программы адаптированы для работы на кластере HybriLIT. Приведены результаты тестирования.
    
    Speaker: Dr Tatiana Sapozhnikova (JINR)
    
    Slides
  - 90
    
    Parallel implementations of image reconstruction algorithms for X-ray microtomography
    
    Significant improvement of detector resolution and, consequently, rapid growth of acquired data amounts, typical for evolving modern tomographic systems, demands development of more efficient image reconstruction software. A Medipix semiconductor detector with 55 μm spatial resolution and a cone-beam scanning scheme are used for taking projections in a microCT scanner MARS, being run at the Dzhelepov Laboratory of Nuclear Problems of JINR. The FDK algorithm realization developed at JINR, which is currently used for image reconstruction, requires a significant time to process the data. Its reducing is a priority task. For this purpose, parallel implementations of the reconstruction algorithm using OpenMP, MPI, and CUDA technologies have been developed and deployed for calculations on heterogeneous computing systems. A comparative analysis of the developed parallel implementations has been done, the results on calculation speedup and efficiency are presented. The computations were performed on the heterogeneous cluster HybriLIT at the JINR Laboratory of Information Technologies.
    
    Speaker: Ms Victoria Tokareva (JINR)
    
    Slides
  - 91
    
    Применение технологии MPI для решения задач распространения динамических волновых возмущений с использованием многоблочных структурных сеток
    
    В работе предложен алгоритм декомпозиции многоблочных структурных сеток для расчета на многопроцессорных вычислительных системах с распределенной памятью с применением технологии MPI. Его особенностью является сохранение структурного разбиения в рамках каждого блока. Проведена аналитическая оценка эффективности предложенного алгоритма. В качестве примера рассматривается задача прохождения динамических волновых возмущений через здание, которое представлено набором структурных сеток. Приведены результаты тестирования алгоритма на данной задаче. Исследование выполнено при финансовой поддержке РФФИ в рамках научного проекта № 15-37-20673 мол_а_вед.
    
    Speaker: Nikolay Khokhlov (MIPT)
Wednesday 6 July
- Mon 4 Jul
- Tue 5 Jul
- Wed 6 Jul
- Thu 7 Jul
- Fri 8 Jul
- 92
  
  Anniversary of LIT LIT Conference Hall
  
  LIT Conference Hall
- 11:00
  
  Boat and Picnic Party
Thursday 7 July
- Mon 4 Jul
- Tue 5 Jul
- Wed 6 Jul
- Thu 7 Jul
- Fri 8 Jul
- Plenary reports LIT Conference Hall
  
  LIT Conference Hall
  - 93
    
    Grid Site Monitoring and Log Processing using ELK
    
    Typical WLCG Tier-2 centres use several hundreds of servers with different services. Manual checks of all log files is impossible and various smart solutions for monitoring and log file analysis are used. We describe used procedures in the Computing Centre of the Institute of Physics in Prague, which hosts Tier-2 centre for ALICE and ATLAS experiments and provides resources for several other projects. Nagios is used as a basic monitoring tool set. Our custom plug-in aggregates warning and standard error messages and sends them summarised 3 times per day to administrators via email. Errors on critical components are sent immediately via email and Short Message System to predefined phone numbers. Nagios is complemented by Munin and Ganglia for better status overview of each server and the whole infrastructure. ELK stack is the most recent part of our monitoring set up. All log files from all production servers are shipped for processing by Logstash and then are stored in Elastic Search. We will describe used hardware, roles of each machine in the ELK cluster, technological challenges, obstacles and our cluster set up and its tuning. Typical examples of searches and graphical outputs will be presented.
    
    Speaker: Mr Alexandr Mikula (Institute of Physics of the Czech Academy of Sciences)
    
    Slides
  - 94
    
    Russian sites in processing and storage of ALICE experimental data in the LHC run-2.
    
    Russian sites in processing and storage of ALICE experimental data in the LHC run-2. The report presents new approaches to the processing and storage of ALICE experimental data during the LHC run-2. Participation and work of Russian sites is discussed. The trends of development of the Russian resource base are presented both for the update of the currently operating sites as for the development of new ones. Implementation and testing of novel software in Russian ALICE data centers is described. Participation of Russian research groups in the international projects to develop and implement the new distributed computing and storage structures is shown. Вклад российских институтов в обработку и хранение данных ALICE на втором этапе работы БАК. В докладе представлены новые подходы к обработке и хранению данных эксперимента ALICE на БАК, используемые на втором этапе работы коллайдера. Отдельно говориться о работе российских сайтов и институтов в этом направлении. Представлена тенденция развития российской ресурсной базы - как обновление работающих сайтов, так и появление новых. Рассказано о внедрении и тестировании новейшего программного обеспечения в российских центрах обработки данных эксперимента ALICE. Показано участие российских научных групп в международных проектах по разработке и внедрению новых распределенных вычислительных структур и структур хранения информации.
    
    Speaker: Mr Andrey Zarochentsev (SPbSU)
    
    Slides
  - 95
    
    High-level software for finite-dimensional and dynamic optimization in distributed computing infrastructure
    
    Optimization models are widely applied in the most different research areas (natural sciences, technology, economy, sociology etc.). But, heterogeneity of available optimization software and computing infrastructure complicates widespread practical usage of this approach. Especially, it is so for small and venture research teams at the beginning phase of work during optimization model fitting (including input and output data), selection of relevant algorithm and solvers, evaluating of required computing power and available computing resources (standalone servers, clusters, clouds and/or Grid infrastructure). To now, there are exist a large reserve in the theory and numerical methods for solving optimization problems of different types. We have a wide choice of optimization software: 1) solvers for mathematical programming, discrete optimization, optimal control problems etc.; 2) a number of translators for optimization modeling algebraic languages (AMPL, GAMS, ZIMPL, Fico-XPRESS etc.) compatible with most of solvers.; 3) another special scientific software, e.g. Computer Algebra Systems (Stochastic, Geometry etc, as well). Dynamic systems optimization (e.g. optimal control) involve solution of Cauchy and/or boundary-value problems. Despite the fact that some of solvers may run in multi-threaded or parallel (on clusters) modes, complexity of the considering problems increases and it requires the use of more powerful computing systems. Although active development of technologies for running optimization software as Web- or REST-services the work is far from completion. Still we don't have conventional, generally accepted technology for "on-demand" deployment of high-performance problem-oriented optimization modeling systems on the base of solvers and optimization languages translators running in distributed heterogeneous computing infrastructure (including standalone servers, clusters, clouds and Grid). At the beginning of the report we present survey of existing technologies of optimization modeling in distributed computing environment: from “NEOS: server for optimization”, http://www.neos-server.org (Argonne National Laboratory, University of Wisconsin) and framework «OS: Optimization Services», http://www.optimizationservices.org to one of the most recent Fico Optimization Suite running at Fico Analytic Cloud, http://www.fico.com/en/analytics/optimization. Then we present our approach based on Everest, http://everest.distcomp.org, a cloud platform for researchers supporting publication, execution and composition of applications running across distributed heterogeneous computing infrastructure. Everest software is developing by our research team in Center for Distributed Computing, http://distcomp.ru. We demonstrate a number of optimization models implemented by Everest platform. All of them are based on REST-services providing remote access to LP/MILP/NLP solvers and to AMPL-translator. We provide an extension of AMPL (so called AMPLX, https://gitlab.com/ssmir/amplx) which allows running any AMPL-script (data processing algorithm involving optimization written in AMPL language) in distributed mode, when independent problems are solved in parallel by remote solvers. As example an implementation of a branch-and-bound algorithm of carbonaceous nanomaterial structure identification with a joint X-Ray and neutron diffraction data analysis. Another example concerns implementation of coarse grained type of branch-and-bound algorithm with preliminary static decomposition of initial MILP problems into a fixed number of sub-problems in accordance with some heuristic rules implemented as an AMPL-script. Then all these subproblems are solving in parallel by a number of MILP-solvers connected to Everest. This system provides exchange of B&B “incumbents” (the best known feasible solution found in the branching tree) between solvers. This exchange accelerates the search of optimal solution of the initial MILP-problem. This approach is demonstrated via Traveling Salesmen Problem and Tasks-Workers Scheduling problems.
    
    Speaker: Mr Vladimir Voloshinov (Institute for Information Transmission Problems RAS)
    
    Slides
  - 96
    
    Automation of Distributed Scientific Computations with Everest
    
    The report discusses common problems associated with the automation of scientific computations, as well as promising approaches to solving these problems based on cloud computing models. The considered problems include running computations on HPC resources, integration of multiple computing resources, sharing of computing applications, combined use of mutliple applications and running parameter sweep experiments. Due to the inherent complexity of computing software and infrastructures, as well as the lack of required IT expertise among the researchers, all these actions require a significant amount of automation in order to be widely applied in practice. The use of service-oriented approach in scientific computing can improve the research productivity by enabling publication and reuse of computing applications, as well as creation of cloud services for automation of computation processes. An implementation of this approach is presented in the form of Everest cloud platform which supports publication, execution and composition of computing applications in a distributed environment. Everest follows the Platform as a Service model by providing all its functionality via remote web and programming interfaces. A single instance of the platform can be accessed by many users in order to create, run and share applications with each other without the need to install additional software on their machines. Any application added to Everest is automatically published both as a user-facing web form and a web service. Unlike other solutions, Everest runs applications on external computing resources connected by users, implements flexible binding of resources to applications and provides an open programming interface. The implementation of the platform, application use cases and future research directions are discussed.
    
    Speaker: Dr Oleg Sukhoroslov (IITP RAS)
    
    Slides
  - 97
    
    Data Knowledge Base Prototype for Collaborative Scientific Research
    
    The most common characteristics of large-scale modern scientific experiments are long lifetime, complex experimental infrastructure, sophisticated data analysis and processing tools, peta- and exascale data volume. All stages of an experiment life cycle are accompanied with the auxiliary metadata, required for monitoring, control and scientific results replicability and reproducibility. The actual issue for the majority of scientific communities is a very loose coupling between metadata describing data processing cycle, and metadata representing annotations, indexing and publication of the experimental results. Researchers from Kurchatov Institute and Tomsk Polytechnic University have investigated main metadata sources for one of the most data-intensive modern experiment - ATLAS, at LHC. It has been noticed that there is a lack of connectivity between data and meta-information, for instance between physics notes and publications, and data collection(s) used to conduct the analysis. Besides, to reproduce and to verify some previous data analysis, it’s very important for the scientists to mimic the same conditions or to process data collection with new software releases or/and algorithms. That’s why all information about data analysis process must be preserved, starting from the initial hypothesis following by processing chain description, data collection, initial results presentation and final publication. A knowledge-based infrastructure (Data Knowledge Base - DKB) gives such possibility and provides fast access to relevant scientific and accompanying information. The infrastructure architecture has been developed and prototyped. DKB is functioning on the basis of formalized representation of scientific research lifecycle – HEP data analysis ontology. The architecture has two data storage layers: Hadoop storage, where data from many metadata sources are integrated and processed to obtain knowledge-based characteristics of all stages of the experiment, and Virtuoso ontology database, where all extracted data are registered. DKB agents process and aggregate metadata from data management and data processing systems, metadata interface, conference notes archives, workshops and meetings agendas, and publications. Additionally, this data is linking with the scientific topic documentation pages (such as Twikis, Google documents, etc) and information, extracted from full texts of experiment supporting documentation. In this way, rather than require the physicists to annotate all meta-information in details, DKB agents will extract, aggregate and integrate all necessary metadata automatically. In this talk we will outline our accomplishments and discuss the next steps and possible DKB implementation in more details.
    
    Speaker: Ms Grigorieva Maria (NRC KI)
    
    Slides
  - 98
    
    ATLAS BigPanDA Monitoring and Its Evolution
    
    BigPanDA is the latest generation of the monitoring system for the Production and Distributed Analysis (PanDA) system. The BigPanDA monitor is a core component of PanDA and also serves the monitoring needs of the new ATLAS Production System Prodsys-2. BigPanDA has been developed to serve the growing computation needs of the ATLAS Experiment and the wider applications of PanDA beyond ATLAS. Through a system-wide job database, the BigPanDA monitor provides a comprehensive and coherent view of the tasks and jobs executed by the system, from high level summaries to detailed drill-down job diagnostics. The system has been in production and has remained in continuous development since mid 2014, today effectively managing more than 2 million jobs per day distributed over 150 computing centers worldwide. BigPanDA also delivers web-based analytics and system state views to groups of users including distributed computing systems operators, shifters, physicist end-users, computing managers and accounting services. Providing this information at different levels of abstraction and in real time has required solving several design problems described in this work. We describe our approach, design, experience and future plans in developing and operating BigPanDA monitoring.
    
    Speaker: Tatiana Korchuganova (National Research Tomsk Polytechnic University)
    
    Slides
- 10:00
  
  Coffee
- Plenary reports LIT Conference Hall
  
  LIT Conference Hall
  - 99
    
    Integration Of PanDA Workload Management System With Supercomputers for ATLAS
    
    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production ANd Distributed Analysis system) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3+ petaFLOPS, next LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in United States, Europe and Russia (in particular with Titan supercomputer at Oak Ridge Leadership Computing Facility, MIRA supercomputer at Argonne Leadership Computing Facilities, and others). In our talk we will consider different approaches towards ATLAS data processing on supercomputers: using dedicated allocation of supercomputer time, working in backfill mode, and multi-step processing. Special attention will be devoted to AES (ATLAS event service) on HPC and multi-job pilot. We will present our recent accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
    
    Speaker: Mr Danila Oleynik (JINR LIT)
    
    Slides
  - 100
    
    INCDTIM Grid and MPI activity
    
    Speaker: Dr Attila Bende (National Institute for R&D of Isotopic and Molecular Technologies)
    
    Slides
  - 101
    
    HYBRILIT – KEY OF THE HIGH PERFORMANCE COMPUTING IN JINR
    
    The overwhelming part of the scientific investigations carried out in the Joint Institute for Nuclear Research (JINR) asks for the solution of a broad spectrum of computing intensive tasks of ever increasing complexity. To cope with such tasks, the Laboratory of Information Technologies (LIT) of the JINR has developed a significant dedicated information-computing infrastructure the evolution of which will grow up during the next years into a Multifunctional Information and Computing Complex (MICC). The present paper deals with the high performance computing component of the MICC, which is being implemented in the heterogeneous computing cluster HybriLIT (http://hybrilit.jinr.ru/ ). The gradually developed HybriLIT configuration includes compute nodes with different types of coprocessors (graphical accelerators (GPU) NVIDIA and Intel Xeon Phi coprocessors). A hardware-software environment was created which matches the requirements of scalability and high fault tolerance and secures efficient system administration. The network access to remote software resources secures efficient fulfillment of user needs and creates hope for the connection of the resources of remote clusters.
    
    Speaker: Prof. Gheorghe Adam (JINR)
    
    Slides
- 12:30
  
  Lunch
- 10. Databases, Distributed Storage systems, Big data Analytics 406A
  
  406A
  - 102
    
    Grid datasets popularity estimation using gradient boosting
    
    In this paper we present machine learning based technique to estimate access patterns for grid datasets. We have analyzed 3-year historical data of Kurchatov Institute Tier1 site and applied gradient boosting algorithm to predict nearest future access patterns for the dataset based on its previous access statistics. We show our method to be effective for ATLAS data popularity estimation. Our method can be used to optimize grid storage in two ways: to move unpopular datasets to tape storage to save disk space and to increase number of replicas of popular datasets to reduce access latency.
    
    Speaker: Daria Chernyuk (Andreevna)
  - 103
    
    Comparison of Parallelization Methods in New SQL DB Environment
    
    All known approaches to parallel data processing in relational client-server database management systems are based only on inter-query parallelism. Nevertheless, it’s possible to achieve intra-query parallelism by consideration of a request structure and implementation of mathematical methods of parallel calculations for its equivalent transformation. This article presents an example of complex query parallelization and describes applicability of the graph theory and methods of parallel computing both for query parallelization and optimiza-tion.
    
    Speakers: Prof. Alexander Degtyarev (Professor), Prof. Yulia Shichkina (St.Petersburg State Electrotechnical University)
  - 104
    
    PhEDEx - main component in data management system in the CMS experiment
    
    Many tens of petabytes of data from the CMS experiment at over a couple hundred sites around the world need to be managed. This represents more than an order of magnitude increase in data volume over HEP experiments in a pre-LHC era. The Physics Experiment Data Export (PhEDEx) project was chosen to fulfill this need. It deals with different storage systems, interacts with the CMS catalogues, handles replica information and allows sites and users to subscribe/move datasets. It is implemented as a series of autonomous, robust, persistent processes called agents. They are running at sites and exchanging information via a central database. PhEDEx provides a safe and secure environment for operations. There will be given an overview of PhEDEx architecture and its advantages. The PhEDEx configuration at the T1_RU_JINR and T2_RU_JINR sites will be presented along with arguments in support of particular changes made to some of constituent agents.
    
    Speaker: Mr Nikolay Voytishin (LIT)
  - 105
    
    DATA GRID AND THE DISTRIBUTED DATABASE SYSTEM
    
    Abstract. The purpose of this paper is to examine distributed databases and some problems of Grid dealing with the data replication from different points of view. This paper is referred to find out the most efficient commonalities between Data Grid and that managed data stored in object-oriented databases. We discuss about features of Distributed Database System (DDBS) such as, design, architecture, performance and concurrency control and submit some research that has been carried out in this specific area of Distributed Database System (DDBS). Query optimization, distribution optimization, fragmentation optimization, and joint option to the optimization in Internet are included in our research. Our project design, advantages of its results and examples concerning about our topic are presented in this paper.
    
    Speaker: Mr Thurein Kyaw (Lwin)
  - 106
    
    Processing of Multidimensional Data in Distributed Systems for Solving the Task of Tsunami Waves Modeling
    
    Many applied research of geography and oceanology require big data processing. One of these problems it is tsunami waves modeling. This task involves dynamic re-interpolation of bathymetry data on multiple grids of different scales. It is determined by the distance from the coastline and existence of islands along the front of wave. In this work re-interpolation is implemented by applying parallel programming primitives to multidimensional arrays of data, which are distributed across the computer cluster nodes. It allows to work effectively with data that does not fit into the memory of one compute node. In addition to this, it improves processing speed compared to the sequential program. NetCDF format that use for storage bathymetry data it is hierarchical format and he hasn't ready-made solutions for processing in distributed systems. Paper views alternative solutions and use one of these for solve given task.
    
    Speaker: Mrs Svetlana Sveshnikova (Saint Petersburg State University)
    
    Slides
  - 107
    
    Разработка комплекса баз данных для эксперимента CBM
    
    Представлены результаты по разработке комплекса баз данных для эксперимента CBM. На основе анализа и изучения опыта использования баз данных в физике высоких энергий предложен набор и основные структурные характеристики требуемых баз данных. Комплекс включает следующие базы данных: Configuration DB, Condition DB, Geometry DB, TagEvent DB и Component DB. Приводится описание реализованной на данный момент Компонентной базы данных (Component DB)и принципов работы Геометрической базы данных (Geometry DB), реализация которой находится на завершающей стадии разработки.
    
    Speaker: Mr Evgeny Alexandrov (JINR)
- 3. Middleware and services for production-quality infrastructures
  - 108
    
    Automation of organizational and technical arrangements for scientific researches
    
    Abstract. The article highlights the set of problems associated with automation of computing and informational support of scientific research, and offers a possible integrated solution based on the service desk system. The attention is focused on the common organizational and technical activities related to the lifecycle of scientific research, such as registration, accounting and technical support. A gracefully integrated software complex with the usage of universal web service is provided as a solution. Mentioned system binding tool allows automating the workflow of key applications and simplifying the staff decision-making problem. Achieved data relevance, reduction of human impact factor and man-hour costs are mentioned as a positive factor of integration solution. Keywords: ITSM, Service Desk, Integration, Infrastructure, Computing Center
    
    Speaker: Mr Nikolai Yuzhanin (Saint Petersburg State University)
    
    Slides
  - 109
    
    Security Issues Formalization
    
    Software bugs are primary security issues. Semantic templates and Software fault patterns are overviewed as tools for software bugs specification tools. Further research on the topic is discussed. Software weaknesses are described in formatted text. There is no widely accepted formal notation for that purpose. This paper shows how Z-notation can be used for formal specification of CWE-119.
    
    Speaker: Prof. Vladimir Dimitrov (University of Sofia)
    
    Slides
  - 110
    
    Simplified pilot module development and testing within the ATLAS PanDA Pilot 2.0 Project
    
    The Production and Distributed Analysis (PanDA) system has been developed to meet ATLAS production and analysis requirements for a data-driven workload management system capable of operating at the LHC data processing scale. The PanDA pilot is one of the major components in the PanDA system. It runs on a worker node and takes care of setting up the environment, fetching and pushing data to storage, getting jobs from the PanDA server and executing them. The original PanDA Pilot was designed over 10 years ago and has since then grown organically. Large parts of the original pilot code base are now getting old and are difficult to maintain. Incremental changes and refactoring have been pushed to the limit, and the time is now right for a fresh start, informed by a decade of experience, with the PanDA Pilot 2.0 Project. To create a testing environment for module development and automated unit and functional testing for next generation pilot tasks, a simple pilot version was developed. It resembles the basic workflow of pilot tasks used in production and provides a simple and clean template for module construction. The miniPilot has a simple structure and is easy to use for development, testing and debugging server-client interactions with new protocols and application interfaces. The unit and functional test system will be developed on top of the miniPilot, and will be used to run automatic tests. This presentation will describe the miniPilot and the test system that will be used during the Pilot 2.0 Project.
    
    Speaker: Mr Daniel Drizhuk (NRC Kurchatov Institute)
    
    Slides
  - 111
    
    Approaches to Manage Computational Cluster Resources
    
    Managing a computational cluster is a complex task which involves many aspects from generic hardware settings to improving the performances of a particular software application. But the complexity of this task is sometime underestimated: organizations just rely on a conventional approaches and systems. As a result cluster utilization could be inefficient. Mistakes (in terms of performance) made during system integration could be imposed by used approach. That is why the choice of a correct approach to system management is very important in many cases (especially in case of nonstandard hybrid complexes). This article reflects the current situation in cluster management, describes the up-to-date approaches used to solve the mentioned problems, analyze the advantages and disadvantages of particular implementations and propose a convenient way to manage clusters. Keywords: Computational clusters, PBS, Virtualization, Grid, Single system image, Cloud.
    
    Speaker: Mr Vladimir Gaiduchok (Saint Petersburg Electrotechnical University "LETI", Russia)
    
    Slides
  - 112
    
    User interface for a computational cluster: resource description approach.
    
    Computational centers provide users with HPC resources. This usually includes not only hardware (e.g. powerful computational clusters) and system software (e.g. Linux and some PBS implementation), but also application software too. In case of university computational center system administrators install some scientific applications, provide users with access to it and manage that access. They are usually responsible for updating such software and solving problems. But such access usually implies only the ability to submit user jobs to resources: users are responsible for creating correct job description (in terms of necessary resources) and job start script (conventional and widely used systems like PBS usually require some script which contains commands that actually start the job). Such task is easy for an average system administrator, but difficult for an average user. Users usually consider job scripting to be very complex. Moreover, some of them can not request the necessary resources correctly too. But as matter of fact, users should learn scripting language (e.g. bash), some tasks of system administration (e.g. Linux administration) and learn information about all the hardware in the cluster. Such tasks could be facilitated by creating some template scripts by administrators. And even in this case users should learn scripting in order to modify such script for their needs and learn work with command line interface (in order to submit their jobs to the cluster management system). But users are usually not accustomed with such work. And all they need is to run their computations and retrieve the results. This article is dedicated to this problem: the mentioned above tasks will be discussed in details on example of a cluster with widely used management system (PBS). Then the possible solution to this problem will be proposed: graphical user interface used based on resource and task description system. This case implies a language for resource (hardware and applications) description. Such descriptions are created by system administrators. They represent available hardware and software as an abstraction. They are used by graphical user interface in order to represent the available resources in a common way. Such approach eliminates the need to learn command line interface and scripting for users and allow them to access resources in a convenient way. Moreover, this leads to more efficient resource utilization since users will rarely do mistakes when requesting resources. Keywords: Computational clusters, User interface, Cluster management systems, Automation, Parallel computing.
    
    Speaker: Mr Vladimir Gaiduchok (Saint Petersburg Electrotechnical University "LETI", Russia)
    
    Slides
  - 113
    
    Improving Networking Performance of a Linux Cluster.
    
    Networking is known to be a "bottleneck" in scientific computations on HPC clusters. It could become a problem that limits the scalability of systems with a cluster architecture. And that problem is a worldwide one since clusters are used almost everywhere. Expensive clusters usually have some custom networks. Such systems imply expensive and powerful hardware, custom protocols, proprietary operating systems. But the vast majority of up-to-date systems use conventional hardware, protocols and operating systems, for example, Ethernet network with OS Linux on cluster nodes. This article is devoted to the problems of small and medium clusters that are often used in universities. We will focus on Ethernet clusters with OS Linux. This topic will be discussed by the example of implementing a custom protocol. TCP/IP stack is used very often in cluster computing, even small clusters use it. While it was originally developed for Internet and could impose unnecessary overheads when it is used in a small cluster with reliable network. We will discuss different aspects of Linux networking stack (e.g. NAPI) and modern hardware (e.g. GSO and GRO); compare performance of TCP, UDP, custom protocol implemented with raw sockets and as a kernel module; discuss possible optimizations. As a result several recommendations on improving networking performance of Linux clusters will be given. Keywords: Computational clusters, Linux, Networking, Protocols, Kernel, Sockets, NAPI, GSO, GRO.
    
    Speaker: Mr Vladimir Gaiduchok (Saint Petersburg Electrotechnical University "LETI", Russia)
- 4. Scientific, Industry and Business Applications in Distributed Computing System LIT Conference Hall
  
  LIT Conference Hall
  - 114
    
    Requirements for distributed systems solving optimization problems on the basis of experience in solving business problems.
    
    Solving optimization problems for commerical companies bring specific requirements. High price of commerical solvers, different type of licenses, time and CPU consumption of different algorithms, business requirements for algorithms. All these raise specific requirements for constructing distributed optimisation system. Business in Russia right now demand for: easy to deploy and test, distributed license free optimization system based on open source optimization solvers.
    
    Speaker: Alexey Tarasov (IITP)
    
    Slides
  - 115
    
    Optimization of selected components in MPD Root project: Capabilities of distributed programming techniques
    
    The article analyses the prospects of optimizing the architecture and the execution logic of selected scripts available in MPD Root project. We considered the option of porting the scripts to allow execution on massive parallel architectures. We collected and structured large data illustrating the project’s work: • the project’s dependency tables were drawn up with regard to the support of parallel and concurrent computing; • the source code database was indexed to identify cross dependencies among architectural entities; • the code profiling measurements were made at launch time of the scripts in question, the call sequences were analyzed, the execution time of the scripts was evaluated. The study evaluated the prospects of using various libraries and platforms: CUDA, OpenMP, OpenCL, TBB, and MPI. The obtained measurements and the analysis of the best practices of the software under consideration allows to make recommendations for modifying MPD Root in order to optimize: • vectorization of loops; • transfer of continuous computing segments to co-processing architecture; • source code segments whose operation can be represented as call graphs; • source code segments that can be subject to load allocation between computing nodes.
    
    Speaker: Anna Fatkina (Saint-Petersburg State University)
    
    Slides
  - 116
    
    Smart grid в энергетике: исследование возможных сценариев нетехнологических потерь электроэнергии
    
    In many countries, the damage from theft of electricity annually is estimated at billions of dollars. Despite the fact that modern electricity metering devices have advanced protection against tampering, their implementation in United States and European Union has shown that these devices do not solve the problem of electricity thefts. There is still no standard of secure transmission and data exchange via communication channels within the network. The aim of this study is to describe current vulnerabilities and scenarios of unauthorized disclosure and modification of metered electricity consumption in smart grids, taking into account most advanced security systems. Во многих странах кража электроэнергии ежегодно оценивается в миллиарды долларов. Несмотря на то, что современные приборы учета электроэнергии содержат более современную защиту от вскрытия и механического воздействия, по опыту внедрения в США и странах европейского союза, эти устройства не решают проблему хищений энергии. До сих пор не выработано общего стандарта безопасности передачи и обмена данными по каналам связи внутри сети. Целью данного исследования является описание актуальных уязвимостей и сценариев несанкционированного раскрытия и модификация информации об использовании электроэнергии в умных сетях, с учетом использования в них наиболее современных систем защиты.
    
    Speakers: Mr Алексей Чухров (Университет Дубна), Mr Анатолий Минзов (Национальный исследовательский университет "МЭИ")
    
    Slides
  - 117
    
    Distributed system for detection of biological contaminants
    
    The paper proposes a distributed system for detecting the types of biological contaminants existing on the objects surfaces. The system realizes biofouling detection method based on image processing technique. The system processes a series of object images obtained in the visible and near infrared spectral ranges. One image in the series is marked as the base image. All images of the series are converted to one common shooting point and to one common angle. The object of interest is detected on the base image, and then the background on all images is removed. To recognize the type of biological contaminants, we used a pre-trained classifier based on support vector machine method. The proposed detection method has an obvious parallelism in data processing. Each image in the series, except the base, can be processed independently. Therefore it is quite easy to implement on a computing cluster using standard parallel computing libraries such as MPI. The central host of cluster is used to implement non-parallel branches of image processing algorithm. These include interactive segmentation, the search for key points of the base image, and classifier training. The central host also solves the problem of the data distribution on cluster nodes and synchronization of all nodes. Other knots of a cluster are processing all images in a series, except the base image. They implement the search key points, converting the images to a single point shooting, background removal and identification types of biofouling. After processing, for each image the map of pollution is formed. This map is sent to the storage, located at the central node of the cluster.
    
    Speaker: Dr Valery Grishkin (SPbGU)
    
    Slides
  - 118
    
    Design of nanomechanical sensors based on graphene nanoribbons in a distributed computing system
    
    Software tools for ab initio modeling of nanoscale resonators were implemented within the GRIDIFIN infrastructure, improving the yield of the computational workload. The distributed computing system was used for the investigation of elongated edge-passivated graphene nanoribbons with one open end and one fixed end, by means of density functional calculations. Their oscillatory behavior was studied through molecular dynamics simulations, as a function of key parameters like nanoribbon length, initial structural deformation and amplitude. Several practical utilizations are envisioned for such nanostructures, such as high-frequency oscillators or ultra-sensible acceleration detectors. The scaling of the MPI application with the number of cores was studied and the results were used for defining the optimal number of cores in the subclusters on which separate instances of the code were distributely ran. The study was also used as an in-house benchmark of the grid system.
    
    Speaker: Dr Camelia Mihaela Visan (Horia Hulubei National Institute for R&D in Physics and Nuclear Engineering (IFIN-HH))
    
    Slides
  - 119
    
    Планирование выполнения композитных приложений в грид-среде
    
    Для многих вычислительных задач, особенно исследовательских, не существует интегрированных программных решений и приходится применять целый ряд разнородных инструментов, связав их в единую вычислительную схему - так называемое композитное приложение (КП). Композитные приложения могут составляться из сотен и даже тысяч отдельных задач, многие из которых могут быть выполнены параллельно с использованием грид-среды. Для эффективного выполнения КП в распределенной среде необходимо планирование - выбор узлов сети, которые будут выполнять отдельные задачи в КП. Задача планирования в большинстве случаев является NP-сложной и получение строго оптимального решения обычно недоступно, в связи с чем было разработано большое количество эвристических методов планирования. В докладе приведен обзор различных постановок задачи планирования, предложена классификация методов и рассмотрен ряд конкретных алгоритмов планирования, включая наиболее современные многокритериальные подходы. Кроме того, рассмотрены проблемы, возникающие при практической реализации планирования и требующие дополнительного исследования.
    
    Speaker: Alexey Nazarenko (IITP RAS, DATADVANCE)
    
    Slides
- Consolidation and integration of distributed resources. Distributed Computing in Education 406B
  
  406B
  - 120
    
    Methods of Semantic Integration in Distributed Information Systems: Challenges of Application
    
    Semantic assets are fundamental for data collection, search and analysis together with data visualization based on semantic properties as well as for semantic interoperability of distributed information systems in general. While the technical and organizational interoperability levels are well developed, the semantic interoperability, which is quite essential for heterogeneous environment of distributed systems, still meets some challenges. The ability of information systems to interact on the semantic level can be achieved by joining the efforts of IT-specialists and domain experts. Ontologies, thesauri and glossaries, created by the experts for the domain formalization, should be transferred from paper documents into machine-readable format. Without that, the dissemination of knowledge outside of a particular information system is difficult and insufficient for the “understanding” and the (re)use by other interacting systems. This report marks the main challenges in the application of semantic integration methods, especially the synergies between IT-specialists and domain experts. Sematic integration collaboration platform is represented as а solution, which provides the formalization of the domain based on the (re)use of semantic assets for modeling of interoperable distributed information systems, transformation of open data to linked open data and for improving the quality of scientific and technical information search.
    
    Speaker: Ms Elena Yasinovskaya (Plekhanov Russian University of Economics)
    
    Slides
  - 121
    
    Интеграция эксперимента ALICE и суперкомпьютера Titan с применением системы управления потоками заданий PanDA.
    
    Вычислительная среда эксперимента ALICE на Большом Адронном Коллайдере позволяет обрабатывать различные задачи, используя ГРИД сайты, расположенные по всему миру. Тем не менее, следующий запуск БАК подразумевает использование больших ресурсов, чем может предоставить ГРИД. Вследствие чего ALICE ищет способы наращивания ресурсов, в частности суперкомпьютеров.
    
    Speaker: Mr Andrey Kondratyev (JINR)
    
    Slides
  - 122
    
    PanDA for COMPASS: processing data via Grid
    
    The development of PanDA (Production and Distributed Analysis System) is a workload management system for ATLAS started in 2005. Since that time the system has grown up and in 2013 the BigPanDA project started, aiming to extend the scope of the system to non-LHC experiments. One of the experiments to which production management PanDa is being applied, is COMPASS at CERN. The workflow of the experiment has to be changed to enable Grid for production and user jobs. Lots of the infrastructure work is being performed on backstage. PanDA jobs definition replaces native batch system job definition, automatic submission to Condor Computing Elements come in place of console job submission, Grid user certificates identify job submitters instead of AFS user names, Grid Storage Elements substitute local directories on AFS and EOS. Production software moves from private directory of production account to CVMFS. Also, a virtual organization with role management has been established for the experiment. Central monitoring was enabled. The experiment is about to start using several computing sites instead of one local batch. How the COMPASS' data are being processed via Grid will be presented in this report.
    
    Speaker: Mr Artem Petrosyan (JINR)
    
    Slides
  - 123
    
    Using EGI Resources with Everest Platform
    
    The report discusses integration of Everest platform with European Grid Infrastructure (EGI). Everest is a cloud platform supporting publication, execution and composition of computing applications in a distributed environment. The platform allows users to attach different types of external computing resources and use these resources in order to run applications published on Everest. The integration with EGI enabled Everest users to seamlessly run applications on grid resources via a web browser. In contrast to classic grid portals, the presented solution supports additional use cases such as combined use of EGI and other resources, composition of applications via REST API and running of parameter sweep experiments.
    
    Speaker: Dr Oleg Sukhoroslov (IITP RAS)
    
    Slides
  - 124
    
    Ontology distribution for a test generation system
    
    Use of ontologies in test generation systems for education provides many benefits. It allows handling different subject areas in the same way, reducing them to the structure, where complex interconnections could be used for generation of diverse questions for the end user. It is obvious that the growth of the test ontology causes the growth of its processing costs. However, the whole ontology is rarely used in a test generation process, as different tests stick with different parts of the ontology. Allocation of such parts allows distribution of the ontology and provides the test generation system with the ability to flexibly adapt to different runtime environments. In this work the way of performing such allocation is presented. It is based on test cases descriptions and their rules of question generation.
    
    Speaker: Mr Dmitry Gushchanskiy (SPbSU)
    
    Slides
- 15:00
  
  Coffee
- 10. Databases, Distributed Storage systems, Big data Analytics 406A
  
  406A
  - 125
    
    Разработка и внедрение системы электронного документооборота "СЭД Дубна" в ОИЯИ.
    
    В статье приведены результаты анализа существующих систем электронного документооборота и особенности организации документооборота в ОИЯИ. На основе этого анализа были предложены модель и технологические решения, использованные в ходе проектирования, разработки и поэтапного внедрения системы электронного документооборота "СЭД Дубна" в ОИЯИ, а также её интеграции с другими корпоративными информационными системами ОИЯИ. Описаны архитектура СЭД, интерфейс пользователя, методы организации данных и управления потоками данных, опыт внедрения и результаты эксплуатации системы.
    
    Speaker: Alexander Yakovlev (JINR)
  - 126
    
    Grafana и Splunk как пример решения проблем визуализации данных в современных системах сбора информации.
    
    Настраиваемые среды визуализации данных, такие как Grafana и Splunk, решают большой спектр задач возникающих при разработке Web-приложений визуализации данных, таких как, например, построение панелей приборов мониторинга. Такие задачи обычно решались путем написания большого количества ПО с применением графики. Настраиваемые системы обеспечивают доступ к разным типам источников данных, имеют встроенные возможности по настройке графического представления результатов с использованием встроенных Web-серверов. Аспекты применения данных систем проиллюстрированы на примерах ведущихся авторами работ. С помощью Splunk в системе сбора данных (TDAQ) ATLAS реализован сбор статистики по логам, предоставляемым Log Service. Данные индексируются для возможности поиска интересующей информации и представления ее в удобной для пользователя форме. Данный плагин с реализацией клиент-серверной модели, исключающую ограничения по доступу вне CERN, заменяет целую компаненту TDAQ. На данный момент Grafana используется в TDAQ для визуализации таких данных как: Data Flow Overview, ROS Information, HLT Farm Information, ROIB, Dead Time Information. Опыт применения оказался успешным и в настоящее время идут работы по внедрению Grafana для визуализации данных сетевого трафика подсистем TDAQ. Это заменит действующий пакет визуализации, требующий значительных усилий по поддержке и уступает в гибкости настройки панелей управления.
    
    Speaker: Mr Mikhail Mineev (JINR)
    
    Slides
  - 127
    
    The Automation of the content filling for JDS system
    
    The JINR Document Server (JDS – jds.jinr.ru), has been launched and developed in the framework of Open Access Initiative. Open Access (OA) in science is a way to collect and preserve the intellectual output of scientific organization and disseminate it all over the world. JDS possesses a digital library functionality which is provided by the software Invenio. It covers all the aspects of the modern digital library management. JDS includes collections of video lectures for young scientists, posters, audio lectures and news about JINR. In the future JDS is considered as part of the JINR corporate information system. The filling of content for such a big repository is a hard work that takes a lot of time. The goal of our activity is the maximal automation of the information system filling. As part of the work for the automation of filling the content, two applications have been developed: • The application for the Grants collection for authority records, that allows one to gather information from the official JINR website. The application is based on the Web Scraping technology (technology of data from web pages) for a subsequent download at the JINR DOCUMENT SERVER Information System (JDS). Work is made on the base of jds-test3 testing server. • The application for the Preprints collection for bibliographic records. The methods of automating and collecting information, current and planned functionality of the system are presented in the report.
    
    Speaker: Ms Tatiana Zaikina (JINR, LIT)
    
    Slides
  - 128
    
    Создание, поддержка и развитие модели интерпретации смыслов
    
    В мире наблюдается постоянный и устойчивый рост количества научных рецензируемых журналов и публикуемых в них статей. [1,2]. По оценкам некоторых авторов в мире издается порядка 1 млн. статей в год, что соответствует ежедневному выпуску порядка 2700 публикаций [2]. Хотя значительная часть такого рода ресурсов находится в электронных фондах (ЭФ), тщательно изучать такие информационные массивы для ученых и исследователей традиционными способами становится все труднее. Но смысловой поиск – это необходимый этап, предшествующий генерации нового знания в научной среде, и порядка 60% времени ученый тратит именно на поиск научной информации [3]. Научные статьи представляют собой тексты на естественном языке, отвечающие определенным требованиям к структуре и содержанию: однозначность, логика от посыла к следствию, явная целевая установка статьи, ясность и точность. Несмотря на шаблон, в них есть неопределенность, связанная с неоднозначностью интерпретации читателем. В работе излагается технология извлечения смысла из научных текстов на основе модели интерпретации (донесения до определенной аудитории) смысла, состоящей из следующих компонентов: 1) выделение смысла из научной статьи (выжимка, словарь, семантическая модель (слова и связи)); 2) формирование логико-семантической модели (сети); 3) вопросно-ответный параметрический навигатор. Такая модель используется для структурирования информационных фондов на основе каталожной службы, представляющей собой множество логико-семантической сетей (ЛСС) [4]. Эффектом является сокращение времени на изучение фонда за счет повышения уровня понимания данной статьи. Организация такой службы является трудоемкой задачей и требует от эксперта (аналитика ЭФ) выполнения большого объема работы для анализируемого информационного ресурса. В статье исследуется возможность автоматического анализа научных текстов на предмет выявления основных и скрытых смыслов с использованием набора техник интеллектуального анализа текстов (Text Mining). Результаты анализа будут использоваться экспертом для построения и поддержки набора ЛСС. На основе описанной модели создан прототип, частично реализующий функционал АРМ аналитика ЭФ и вопросно-ответный навигатор. [1] Т. Н. Домнина, О. А. Хачко (ВИНИТИ РАН).Научные журналы: количество, темпы роста//Информационное обеспечение науки: новые технологии, Cб. науч. тр. / Каленов Н.Е., Цветкова В.А. (ред.). - М.: БЕН РАН, 2015.-с.83-96. [Электронный ресурс] — URL: http://www.benran.ru/SEM/Sb_15/sbornik/83.pdf [2] Галина Якшонок. Эффективный поиск и анализ научно-исследовательской информации в SciVerse: Scopus, Hub, ScienceDirect//МГИМО, 2012. [Электронный ресурс]. URL: http://mgimo.ru/files2/y03_2012/220642/MGIMO_March-2012.ppt [3] Коэн Э. Сравнительный анализ научных публикаций российских и зарубежных ученых. - 2006 //«Эльзевир» [Электронный ресурс] — URL: http://www.elsevier.ru/about/articles/?id=792 [4] Добрынин В.Н., Филозова И.А. Семантический поиск в научных электронных библиотеках//Информатизация образования и науки № 2(22)/2014. - c.110-110
    
    Speaker: Irina Filozova (JINR)
  - 129
    
    Adaptive method for computer-aided translation of research and technology texts
    
    The research is devoted to the experimental use of the software in order to obtain an assisted translation for research and technology texts. The current relevance of computer-assisted translation programs such as SDL Trados is due to the fact that this system allows improving and speeding up the translation process. As well as other CATs, it assists in simplifying the traditional process of translation which is performed and controlled directly by the translator. To increase efficiency, productivity and quality within the localization process it encompasses such core technologies as translation memory and terminology management.
    
    Speaker: Vera Inkina (NRNU MEPhI)
  - 130
    
    Multidisciplinary digest structure formation methods
    
    The growth of information (especially scientific and technical) traffic in the Internet is a key mark of the 21st century. The problem of qualitative filtering and topical and relevant data selection, that is necessary in correct decisionmaking by organizations’ management. Different analytical documents, such as topic-based news digests and executive summaries are the basis for process of decisionmaking. In this paper, the digests’ and executive summaries’ creation algorithms are described, as well as data search activities are described.
    
    Speaker: Dmitry Kshnyakov (NRNU MEPhI)
  - 131
    
    Scientific visualization of information analytical researches
    
    Due to development of international cooperation in the scientific field a large number of mega projects appears, as well as international research organizations and facilities in which a lot of studies carried out. As a result, they generate unstructured data that is an object of interest for wide range of scientists and specialists around the world. Apart from the need to filter that large amounts of research material, there is also a need in its structure and visualization. To solve these problems various visualization tools can be used, such as infographics, semantic networks, dynamic dossier and information digests. Selecting imaging instrument depends on the end-user objectives, scope and type of the rendered material. The topic of this paper is the usage of different types of visualization in agent systems.
    
    Speaker: Kristina Ionkina (NRNU MEPhI)
- 3. Middleware and services for production-quality infrastructures 310
  
  310
  - 132
    
    Experiment STAR
    
    Reaching its 16th year of operation and data taking, the STAR experiment, at the Relativistic Heavy Ion Collider (RHIC), faces daunting data processing challenges. The data accumulated so far is in excess of 24 PetaBytes of raw data and projected to reach 40 PetaBytes in 2020. The CPU needs are as challenging – STAR estimates that, at flat funding profile, resources up to 20% level of its 118 Millions hours of total CPU power at its main facility is needed from external sites to achieve a mere 1.2 production passes. However, high quality dataset production is best achieved in HEP/NP communities with 2 or more passes, implying a dire need to harvest resources far beyond what already exists – remote facilities, accessible via the Grid, may very well accommodate the anticipated data reproduction needs and accelerate scientific discoveries. To this aim, STAR has successfully exploited its Tier-1 center at NERSC/PDSF as well as KISTI/Korea and is now in the process of evaluating the usability of the resources available at Dubna/JINR. JINR offers a possible avenue to distributed resource integration and growth as a contribution to the experiment's needs. We will present our workflow approach, modeled after the KISTI exercise and success, show our results to date as well as the obstacles encountered and how they were resolved. To help identify the issues, we will show stability and efficiency decomposed into STAR-specific workflow, infrastructure, and Grid interface-related issues. Solutions and perspective for a path forward will be discussed.
    
    Speaker: Levente Hajdu (BNL)
    
    Slides
  - 133
    
    Development of the active monitoring system for the computer center at IHEP
    
    Computer center at IHEP is a complex system of many technologies gathered together. Among them distributed computing, high throughput networking, high reliable uninterpretable power systems, precision cooling systems. Monitoring and control such complex is a very difficult task. Even more difficult is to create self-optimization, self-healing and self-defense systems on top of the monitoring. As a first step it might be a creation of several databases to accumulate all information about center infrastructure, events, logs, statutes and then as second step a creation of an active monitoring system which could be able to perform simple tasks itself or make advices for the human interventions. The current status of the development of such system for the IHEP computer center described in this work.
    
    Speaker: Mr Victor Kotlyar (IHEP)
    
    Slides
  - 134
    
    IHEP cluster for Grid and distributed computing
    
    To build a computer cluster for the Grid and distributed computing is a highly complex task. Such cluster has to seamlessly combine grid middleware and different types of the other software in one system with shared cpu, storage, network and engineering infrastructure. To be able to run effectively and be flexible for the still unknown future usage patterns many software systems must be gathered together to build a complete system with high level of complexity. This work present a general possible architecture for such systems and a cluster software stack which could be used to build and operate it using IHEP computer cluster as an example.
    
    Speaker: Mr Victor Kotlyar (IHEP)
    
    Slides
  - 135
    
    Security infrastructure for distributed computing systems on the basis of blockchain technology
    
    To ensure secure access to resources of distributed computing systems (DCS) [1] with the account of rights of a given user and the service/resource policy, a security infrastructure is needed, which should be enough reliable and on the other hand not create significant difficulties for users. In the work [2], a method of user authentication, which is based on a login/password pair together with a session-restricted key was suggested. This approach provides a substantial simplification of both the registration of new users in the system and their operation in DCS, compared with the commonly used in DCS public key infrastructure (PKI) with the use of proxy certificates. However, a vulnerability area of both the PKI, and the solution proposed in [2], is the need of operation of a fail-proof and tamper-resistant central server in the security infrastructure. In the case of the infrastructure suggested in [2] this is the authentication/authorization server and in the case of PKI this is the server of proxy certificates renovation. In this work, we investigate the possibility of abandoning the special dedicated servers in the DCS security infrastructure and the use instead of them a distributed database on the basis of the blockchain technology [3], the paradigm of smart contracts [4] and Ethereum protocol [5,6]. Since in this case the database of the security infrastructure is distributed across all the nodes in the system, this approach will increase the resiliency and security of DCS. This work was supported by the Ministry of Science and Education of the Russian Federation, the Agreement No. 14.604.21.0146; unique identifier is RFMEFI60414X0146. References 1. A. P. Kryukov, A. P. Demichev, and S. P. Polyakov, Web Platforms for Scientific Research, Programming and Computer Software, 2016, Vol. 42, No. 3, pp. 129–141 2. J. Dubenskaya, A. Kryukov, A. Demichev, N. Prikhodko, New security infrastructure model for distributed computing systems, Journal of Physics: Conference Series. 2016. Vol. 681, P. 012051-1-012051-5. 3. Public versus Private Blockchains , BitFury Group, 2015, http://bitfury.com/content/5-white-papers-research/public-vs-private-pt1-1.pdf 4. N. Szabo, The Idea of Smart Contracts, http://szabo.best.vwh.net/smart_contracts_idea.html 5. V. Buterin, Ethereum White Paper, https://github.com/ethereum/wiki/wiki/White-Paper 6. G. Wood, Ethereum: A secure decentralised generalised transaction ledger, http://gavwood.com/paper.pdf
    
    Speaker: Dr Alexander Kryukov (SINP MSU)
    
    Slides
  - 136
    
    DDS – The Dynamic Deployment System
    
    The Dynamic Deployment System (DDS) is a tool-set that automates and significantly simplifies a deployment of user-defined processes and their dependencies on any resource management system (RMS) using a given topology. A number of basic concepts are taken into account in DDS. Namely, DDS implements a single responsibility principle command line tool-set and APIs. The system treats users’ tasks as black boxes – they can be executables and scripts. DDS doesn’t depend on RMS and provides deployment via SSH, when no RMS is present. It doesn’t require pre-installation and pre-configuration on the worker nodes. DDS deploys private facilities on demand with isolated sandboxes. The system provides a key-value property propagation service for tasks. DDS provides a rule-based execution of tasks. In this report detailed description, current status and future developments of the DDS will be presented.
    
    Speaker: Andrey Lebedev (GSI, Darmstadt)
    
    Slides
- 4. Scientific, Industry and Business Applications in Distributed Computing System LIT Conference Hall
  
  LIT Conference Hall
  - 137
    
    Типичные схемы составных инженерных и научных расчетов
    
    Вычислительное моделирование играет огромную роль в разработке современных инженерных изделий. Однако для проведения полноценного анализа изделия как правило недостаточно единичного расчета модели - необходимо решение более высокоуровневых задач, таких как анализ чувствительности и оптимизация. Поэтому решение конечной задачи обычно требует совместного использования ряда вычислительных пакетов. Одним из подходов к организации таких вычислений является использование специального программного комплекса - интеграционной среды. Для постановки требований к интеграционной среде для инженерных расчетов были проанализированы типичные задачи и соответствующие им расчетные схемы. В докладе приводится список наиболее часто встречающихся расчетных схем, рассматриваются особенности работы с данными, обработки ошибок, кэширования и параллельного выполнения.
    
    Speaker: Alexander Prokhorov (IITP RAS, DATADVANCE)
  - 138
    
    Организация потоковой обработки гиперспектральных изображений в распределённой вычислительной среде
    
    В настоящее время использование облачной инфраструктуры для размещения данных и их обработки становится важным технологическим трендом. При размещении, хранении и обработке данных дистанционного зондирования Земли (ДЗЗ), в частности, гиперспектральных изображений, изображений большого размера, наборов связанных фрагментов изображений, в распределённой информационной среде следует учитывать, помимо большого размера, структурные особенности этих данных, а именно, пространственную зависимость данных, необходимость согласования многомерной структуры данных и последовательного представления их при хранении. Именно поэтому достаточно популярный путь перехода в «облако» на основе использования технологий «больших данных» (Big Data), в частности, программной платформы Hadoop, для организации обработки и хранения данных дистанционного зондирования наталкивается на значительное число проблем, связанных с тем, что технологии Big Data в первую очередь ориентированы на обработку текстовой информации. Отдельные проекты использования платформы Hadoop для хранения и обработки набора изображений только подтверждают это. Однако, общие принципы и методологию распределённого хранения и параллельной обработки данных полезно использовать при разработке систем обработки и хранения данных ДЗЗ. Следует выделить две основных идеи: 1) тотальное распараллеливание данных и вычислений с обеспечением отказоустойчивости как на уровне хранения, так и при обработке, и 2) наличие базовой инструментальной среды, позволяющей исследователям сосредоточится на написании содержательной части обработки, обеспечивая распределённое отказоустойчивое хранение данных, снимая проблемы организации распределённой обработки «на месте хранения», доставки исполняемого кода на узлы хранения для проведения вычислений. Начальные реализации платформы Hadoop основаны на парадигме MapReduce. Основная проблема заключается в сложности представления произвольного алгоритма обработки (получения результата) в виде двухэтапной процедуры, где на первом этапе (этапе Map) происходит формирование частичных промежуточных результатов обработки отдельных частей единого массива данных, а на втором этапе (этапе Reduce) на их основе собирается окончательный результат. Дополнительной проблемой такой организации вычислений является необходимость сохранять все промежуточные результаты в локальной дисковой памяти узлов распределённого вычислительного кластера, поэтому для обеспечения приемлемой эффективности процесса такой обработки объём данных после первого этапа должен уменьшиться на несколько порядков. Именно поэтому, организация распределённой обработки изображений на базе платформы Hadoop зачастую реализуется в вырожденном виде (только этап Map), по сути, используя только встроенные в данную инфраструктуру средства организации распределённой обработки «на месте хранения». Поскольку не все методы обработки можно реализовать в рамках парадигмы MapReduce, то появились потоковые методы обработки больших данных, которые также основаны на идее разбиения исходного набора данных на независимые блоки с последующей потоковой схемой их преобразования в результат с использованием распределённой отказоустойчивой организации вычислений. Возросшая при этом эффективность обработки связана, в первую очередь, с реализованной возможностью не сохранять промежуточные результаты вычислений на диск. При творческом развитии и адаптации изложенных идей применительно к задачам разработки распределённых систем обработки и хранения данных ДЗЗ необходимо учитывать особенности этих данных, представленных преимущественно в виде изображений, т.е. пространственно-зависимых данных. Простое разбиение крупноформатного изображения на отдельные фрагменты, которое реализуется в системах на базе платформы Hadoop, в большинстве случаев не применимо, оно должно быть программно-контролируемым, особенно при использовании в процессе обработки локальных операций на основе скользящей окрестности. Для технологий обработки данных ДЗЗ (в первую очередь, крупноформатных гиперспектральных изображений) характерно наличие предварительной обработки в виде многоэтапного процесса, при котором промежуточные данные существенно не редуцируются в объёме. Распределённая организация вычислений при этом связана с использованием декомпозиции на перекрывающиеся фрагменты, то есть совместно с основными отсчётами, которые будут изменяться в процессе обработки, хранятся пограничные отсчёты из соседних фрагментов, которые используются только в процессе обработки и при многоэтапной обработке должны корректироваться по результатам завершения очередной операции. Решением в данном случае является использование концепции Data-as-a-Service. Все компоненты системы хранения и обработки гиперспектральных изображений, в том числе сами данные, реализуются как сервисы, размещенные в облачной вычислительной среде. В качестве интеграционного решения используется децентрализованное объединение (федерация) равноправных взаимодействующих друг с другом сервисов, реализованное на базе одной из составляющих платформы Hadoop – сервисе координации процессов распределённых приложений Zookeeper. Основной структурной единицей хранения в системе является распределённое изображение, которое доступно пользователю как набор взаимодействующих сервисов хранения, т.е. каждый фрагмент распределённого изображения представляется в виде отдельного сервиса (фрейм-сервиса) со своим уникальным именем в иерархии изображений-сервисов. Фрейм-сервис, помимо интерфейса доступа к данным, реализует интерфейс, обеспечивающий получение и выполнение задания на обработку контролируемых данных. Причём, в процессе такой обработки данные не изменяются, а создаётся новое распределённое изображение, фрагменты которого могут быть не согласованы между собой в части обеспечения последующей сбалансированности распределённой обработки. Тем не менее, наличие "интеллекта" у фрейм-сервиса и связей с соседями позволяет выполнить все необходимые для такого согласования действия автономно в фоновом режиме, предоставляя пользователю необходимые ему функции (чаще всего это визуализация) и данные непосредственно сразу после создания. Таким образом, реализуется ориентированный на данные подход к организации вычислений, при котором фрагменты данных заранее распределены по узлам хранения распределённой системы, а в процессе обработки вместо данных пересылаются процедуры их обработки. Сервисы-обработчики могут функционировать в двух режимах – как фильтр и как ресурс. В первом случае сервис получает и перерабатывает потоковые данные, во втором – передает специальным образом оформленную задачу, которая выполняется в виртуальной машине фрейм-сервиса. Все сервисы регистрируются в распределенном репозитарии Zookeeper, который обеспечивает группировку всех сервисов по передаваемой от них информации: сервисов-обработчиков по типам операций, фрейм-сервисов по принадлежности к исходному ГСИ, пространственным координатам и другим статистическим характеристикам. Особенностью вычислительных задач обработки изображений является тот факт, что значительное число методов сложной обработки изображений могут быть реализованы как последовательное применение некоторых законченных типовых операций над изображениями. Именно это обстоятельство определяет эффективность применения разнообразных программных систем обработки изображений к решению широкого спектра исследовательских и прикладных задач обработки и анализа изображений. Базовый вариант формирования задачи обработки изображений заключается в задании последовательности выполняемых операций обработки, т.е. графа обработки. Наряду с последовательным вариантом организации вычислений по такому графу обработки при хранении всех промежуточных данных в оперативной памяти фрейм-сервиса рассматривается схема с параллельным выполнением всех шагов алгоритма и непосредственной передачей части данных от одной операции к другой по мере готовности. Фактически формируется некоторый поток данных (потоковая сеть), элементы которого преобразуются на каждом шаге и передаются последующему для дальнейшей обработки. Из-за необходимости анализа процесса вычислений в потоковых сетях произвольного вида обычно рассматриваются только конвейеры из последовательно соединенных операций. При реализации произвольных потоковых сетей необходимо проводить предварительный анализ графа обработки на корректность вычислений и выполнять возможное согласование структуры данных и процессов вычислений. Предлагаемая архитектура предоставляет не просто способ построения сложной распределенной системы, а качественно иной принцип использования привычных сущностей, которые вместе обеспечивают определённый синергетический эффект. Рассмотренные решения позволят организовать технологию обработки данных ДЗЗ с использованием метапрограммирования и мультимодального подхода, при котором пользователь может указывать не конкретные операции обработки, а некоторые обобщающие этапы, формулировать требуемые цели обработки и намечать пути их реализации. Работа выполнена при поддержке Российского фонда фундаментальных исследований (проект № 1529-07077).
    
    Speaker: Sergey Popov (IPSI RAS - Branch of the FSRC "Cristallography and Photonics" RAS)
    
    Slides
  - 139
    
    ИССЛЕДОВАНИЕ КАЧЕСТВЕННЫХ ОСОБЕННОСТЕЙ ДИФФЕРЕНЦИАЛЬНЫХ УРАВНЕНИЙ С ПОЛИНОМИАЛЬНОЙ ПРАВОЙ ЧАСТЬЮ В РАСПРЕДЕЛЕННОЙ КОМПЬЮТЕРНОЙ СРЕДЕ
    
    Speaker: Ирина ЕМЕЛЬЯНОВА (Тверской государственный технический универсистет)
  - 140
    
    ОБОБЩЕННАЯ СХЕМА ГОРНЕРА ДЛЯ ВЫЧИСЛЕНИЯ ЗНАЧЕНИЙ МНОГОМЕРНЫХ МНОГОЧЛЕНОВ В РАСПРЕДЕЛЕННОЙ КОМПЬЮТЕРНОЙ СРЕДЕ
    
    Speaker: Ирина Емельянова (Тверской государственный технический универсистет)
    
    Slides
  - 141
    
    New approaches to supercomputer industry and Georgian supercomputer project
    
    Last several decades supercomputer industry is developing very quickly. New approaches and more and more new technologies are involved to make supercomputers faster and more effective. Today many countries, companies, universities or other research facilities are using supercomputers to solve complex problems. Georgia is making first steps in this direction. We are working on building first Georgian supercomputer with modern approaches and technologies to solve several complex problems. One of these problems is: Accurate weather forecast. The current paper contains information about new project: Establishment of first Georgian supercomputer for research and education at Georgian Technical University.
    
    Speakers: Mr Archil Prangishvili (Georgian Technical University (GTU)), Mr Bekar Oikashvili (Georgian Technical University (GTU)), Mr Zurab Modebadze (Georgian Technical University (GTU))
    
    Slides
  - 142
    
    HPC design of ion based molecular clusters
    
    Different self-assembled molecular clusters grown around H+ and Na+ ion kernels have been investigated using density functional theory method considering advanced exchange-correlation DFT functionals. Supramolecular structures of high-order molecular associations observed in different mass spectra and their formation dynamics were explained with the help of HPC technique. The presence of new and unusual weak intermolecular forces which keep together these supramolecular assemblies were observed and characterized in more details using molecular modeling methods.
    
    Speaker: Dr Attila Bende (National Institute for R&D of Isotopic and Molecular Technologies)
    
    Slides
  - 143
    
    Comparison of CPU and GPU performance for the charge-transfer simulation in the 1D molecular chains
    
    The process of charge transfer in biopolymers is modeled by specific ODE system. To estimate the thermodynamic properties of the model, we use direct simulation - calculation of the set of trajectories and averaging over ensemble. Such calculations require a lot of computer time. We compared the three program realization, using MPI, openMP, and GPU technology. At present, attention of researchers attracts possible application of biological macromolecules in nanobioelectronics, especially DNA, for example, in the development of electronic biochips and use as molecular wires. Therefore, the study of thermodynamic characteristics and conducting properties of biopolymers is of interest [1,2]. The model of charge transfer along the chain of sites is described by the self-consistent system of ODE. To simulate thermostat, we add friction term and random value with special properties in the right-hand side (Langevin equation). In computationally, the resource-intensive part of the problem is the calculation of a large number of samples (the dynamics of the charge distribution from the different initial data and with different values of the random force), since the accuracy of the mean is proportional to the square root of the simulation amount. Such task imply the natural parallelization "one sample - one core" using MPI technology. The problem of attaining the thermodynamic equilibrium can require huge integration time [3]. To reduce the computation time, we have studied the possibility of paralleling using openMP, and program realization based on GPUs with NVIDIA CUDA technology. Although explicit iterative method use for numerical integration of the ODE system, but the equations includes explicitly only the nearest neighbors. So we can "divide" the chain into several fragments, which are integrated in one step independently on different cores with openMP. To realize the GPU program version, we assemble a single array of several implementations with dividers - Resettable sites. Number of samples depends on the amount of sites in the chain. Due to hardware multithreading operations, calculations are carried out in parallel simultaneously on all components of the vector. The results of tests and comparison of the performance of these three program realizations are discussed. The work was done with partial support from the Russian Foundation for Basic Research, projects No. 16-07-00305, 14-07-00894, 15-07-06426, and Russian Science Foundation, project 16-11-10163. [1] Nanobioelectronics - for Electronics, Biology, and Medicine. Eds. Offenhousser A., Rinaldi R. New York: Springer, 337 p., 2009. [2] Lakhno V.D.: DNA nanobioelectronics. International Journal of Quantum Chemistry, 108 (11): 1970-1981, 2008. [3] Lakhno V.D., Fialko N.S. On the dynamics of a polaron in a classical chain with finite temperature. JETP, 120: P.125-131, 2015.
    
    Speaker: Dr Nadezhda Fialko (Institute of Mathematical Problems of Biology RAS – the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences)
    
    Slides
  - 144
    
    Distributed evolutionary optimization algorithms for peptide structure prediction
    
    This work presents an approach to the construction of distributed stochastic evolutionary optimization algorithms for peptide secondary structure prediction. The prediction can be formulated as optimal peptide structure search in continuous space of torsion angles. The optimal peptide structure corresponds to the global minimum of free energy. Two main regular peptide secondary structures are considered: alpha-helix and beta-sheet. The authors proposed a scheme for application of evolutionary algorithms to this problem by changing non-covalent force-field term during optimal structure search. Results of numerical experiments for model and real peptides are presented.
    
    Speaker: Mr Sergey Poluyan (Dubna International University for Nature, Society and Man)
    
    Slides
  - 145
    
    Method of terminological system's formation to determine the level of product development
    
    Development of a competitive equipment with advanced technical characteristics requires a high level of scientific and technological progress, as well as an effective system of controlling the design and operation of advanced machinery during the entire period of its life cycle. To create such a system it is necessary to develop a unified approach to the identification of the development level of technologies and products. In this paper, a model to identify the level of product development was developed. To achieve this goal the official standards of the United States of America were analyzed.
    
    Speaker: Valeriya Danilova (NRNU MEPhI)
    
    Slides
  - 146
    
    Portal for organization of scientific data processing on heterogeneous computing resources at NRCKI
    
    National Research Center “Kurchatov Institute” Data Processing Center provides its user community with state-of-art large scale heterogeneous computing resources, such as supercomputers, cloud and grid computing. The requirement to integrate computing facilities as a single computing entity for the end-user has motivated us to develop and to implement the data processing portal. The ATLAS Production and Distributed Analysis workload management system (PanDA) has been chosen as a base technology for the portal. PanDA has demonstrated excellent capabilities to manage various workflows at scale in the ATLAS experiment at LHC. PanDA instance was installed at NRC KI, adapted to the scientific needs of its user community and integrated with NRC KI computing resources. This integration has required the development of user interface to provide scientists with a single point of access to the computing resources, and file handling system to transfer input/output data between file storages and working nodes. The portal has demonstrated its efficiency in running bioinformatical scientific applications, in particular for genome analysis. Success of biological application has attracted interest from other compute-intensive sciences, and we plan to expand portal’s usage in the nearest future. In this report our accomplishments will be reviewed and then we’ll discuss the portal’s potential usage.
    
    Speaker: Mr Alexey Poyda (NRC KURCHATOV INSTITUTE)
    
    Slides
- Consolidation and integration of distributed resources. Distributed Computing in Education 406B
  
  406B
  - 147
    
    TASKS SCHEDULING ALGORITHMS IN HETEROGENEOUS CLUSTER
    
    Over the past decade, the grid computing system has been an active research field. One of the important issues in grid computing systems is to increase the efficiency of resource utilization and reduction of completion time for grid computing. In other words to improve the resource utilization of heterogeneous cluster, scheduler must avoid unnecessary data transmission. The aim of this paper is to build the desktop grid environment using UNICORE middle-ware and a built-in test mode of the emulation of the MAUI scheduler was used. The results of practical testing for UNICORE sites with the computing cluster under the local resource management system Torque management for the most common types of parallel jobs are presented. The effectiveness of the algorithms is researched by analyzing the results of simulation and testing of scheduler on the computing cluster in situation of changing intensity of incoming workflows.
    
    Speaker: Ms Zolzaya Kherlenchimeg (National University of Mongolia)
  - 148
    
    The experience of deploying a virtual computer lab in education — running failover clusters in a virtualized environment
    
    When training highly skilled IT professionals, it is an important challenge for the university to teach professional competencies to graduates that they will be able to use to successfully solve a broad range of substantive problems that arise at all stages of the lifecycle of corporate information systems. Such information systems in practice, as a rule, are used for enterprise management, workflow management in technological processes, IT infrastructure management, creating web-solutions for high availability, data collection, and data analysis and storage. It is obvious that in order for students to learn these professional competencies, they need to master a large amount of theoretical material and to carry out practical exercises and research on the development of modern information systems, their deployment and support, the effective implementation of solutions for problem-oriented tasks, etc. The organization of an effective process for the goal-directed training of IT experts has demanded a speedy solution to the following problems: an often insufficient number of classroom hours for students to cover a necessary and sufficient set of practical exercises that help students learn complex information systems; on a typical personal computer with average capabilities it is impossible to get real practical experience working with multi-component information systems because the hardware requirements for such systems often go beyond what is offered on typical home, office and laptop computers; sometimes there are difficulties installing and supporting some information systems, and these problems cannot be solved without gaining experience about how to use such systems; the single-user license cost is too high, and in most cases, such a license is required only for the duration of the learning process. The main way to solve these problems has been to create a virtual computer lab that is able to solve the problem of insufficient computing and software resources and to provide an adequate level of technological and methodological support; to teach how to use modern technologies to work with distributed information systems; to organize group work with educational materials by involving users in the process of improving these materials and allowing them to communicate freely with each other on the basis of self-organizational principles. The virtual computer lab provides a set of software and hardware-based virtualization tools that enable the flexible and on-demand provision and use of computing resources in the form of "cloud" Internet services for carrying out research projects, resource-intensive computational calculations and tasks related to the development of complex corporate and other information systems. The service also provides dedicated virtual servers for innovative projects that are carried out by students and staff at the Institute of System Analysis and Management. The presentation (master class) will demonstrate the deployment of a failover cluster in a virtual computer lab environment. It will emphasize the service's features that have been adapted to the needs of the educational process at the university. We have chosen to highlight this use case deliberately. The task of designing and deploying failover clusters forms the topic of several special courses, which are designed to satisfy the demand for these skills by modern companies. When designing corporate information systems and ensuring the availability of critical applications that are independent of a particular hardware and software environment, it is critically important to ensure the successful implementation of many key business processes. Downtime, including for scheduled maintenance, leads to additional costs and the loss of customers, and the long outages are simply unacceptable for modern high-tech enterprises. In learning such practical skills, students must independently master the requirements for creating a failover cluster; determine the critical components that require redundancy; configure virtual machines; become familiar with advanced data storage tools and technologies, the principles for creating distributed systems, different types of server operating systems (Windows and Unix), and ways for ensuring their interoperability; learn about communication protocols on the basis of iSCSI; set up computer networks; draft security policies; and solve the problem of integrating system components. The task of deploying failover clusters demonstrates the capabilities of the virtual computer lab. It also illustrates how it can be used as part of practical lessons and extracurricular work, making it possible to train IT professionals in accordance with the requirements of the most advanced educational and professional standards. The implementation of a virtual computer lab makes it possible to implement innovations, and it represents a significant leap forward over traditional educational approaches. It should also be emphasized that the virtual computer lab has helped us provide an optimal and sustainable technological, educational-organizational, scientific-methodological, and regulatory-administrative environment for supporting innovative approaches to computer education. It promotes the integration of the scientific and educational potential of Dubna State University and the formation of industry and academic research partnerships with leading companies that are potential employers of graduates of the Institute of System Analysis and Management. The results that the Institute of System Analysis and Management has achieved in improving the educational process represent strategic foundations for overcoming perhaps one of the most acute problems in modern education: the fact that it tends to respond to changes in the external environment weakly and slowly.
    
    Speakers: Mrs Elena Kirpicheva (Dubna International University of Nature, Society, and Man. Institute of system analysis and management), Mrs Evgenia Cheremisina (Dubna State University), Mr Mikhail Belov (Dubna State Univeristy), Mikhail Lishilin (Dubna State University), Mrs Nadezhda Tokareva (Dubna State Univeristy), Mr Pavel Lupanov (Dubna State University)
  - 149
    
    Established Educational & Training Grid site in the National University of Mongolia
    
    Powerful supercomputers are required to make such scientific calculations and Mongolia is lacking in the calculation based computational science field, so we rely on outsourcing our massive weather, image processing of geographical, economical, chemical and physical calculations to other countries which is quite costly. Thus, with the help of this research work, we are striving to become a joint educational grid site which is one of our first steps of implementing the calculation grid. Therefore, we have completed work on our internal network, system architecture and implementation process. The National University of Mongolia have started research on this matter since 2013 and in 2015 we have started a project to join the t-infrastructure of JINR as well as taking t-infrastructure administration courses and suggestions from the people of LIT, JINR in 2012, 2013 and 2016. We have managed to create an internal educational training grid site and joined the t-infrastructure based on gLite of JINR. This has opened up a new direction in our research and allowed our researchers, teachers and students to work and test on a grid site on a practical level.
    
    Speaker: Prof. Bolormaa Dalanbayar (SEAS-NUM, Ulaanbaatar Mongolia)
  - 150
    
    ИНТЕЛЛЕКТУАЛЬНЫЙ РОБОТОТЕХНИЧЕСКИЙ ТРЕНАЖЕР
    
    Разработка, развитие и реализация эффективных высоких наукоемких информационных технологий (ИТ) (создаваемых в различных областях науки и техники) неразрывно связаны с необходимостью разработки и повышения уровня интеллектуальности используемых процессов и систем управления, объективно учитывающие в законах управления контекстуально-зависимые физические эффекты, ограничения и информационные границы, реально существующие в конкретных моделях объекта управления (ОУ). Важную роль при формировании уровня интеллектуальности системы автоматического управления (САУ) играет выбор используемого инструментария технологии интеллектуальных вычислений (ИВ) для проектирования соответствующей базы знаний (БЗ) при заданной цели управления [1]. Одной из основных проблем эффективного применения технологии мягких вычислений в задачах управления являлось решение следующих задач: • объективное определение вида ФП и ее параметров в продукционных правилах в БЗ; • определение оптимальной структуры нечетких нейронных сетей (ННС) в задачах обучения (аппроксимация обучающего сигнала с требуемой (заданной) ошибкой и с минимальным количеством продукционных правил в БЗ); • применение генетического алгоритма (ГА) в задачах многокритериального управления при наличии дискретных ограничений на параметры ОУ. Перечисленные проблемы были решены и апробированы на основе Оптимизатора Баз Знаний (ОБЗ) с применением технологии мягких вычислений [1]. Разработанный интеллектуальный инструментарий позволил проектировать робастные БЗ на основе решения одной из алгоритмически трудно решаемых задач теории искусственного интеллекта – извлечения, обработки и формирования объективных знаний без использования экспертных оценок. В данном оптимизаторе используются три генетических алгоритма, которые позволяют спроектировать оптимальную структуру нечеткого регулятора (вид и число ФП, их параметры, а также число самих правил нечеткого вывода), аппроксимирующей обучающий сигнал с требуемой ошибкой. При этом автоматически проектируется оптимальная структуры нечеткой нейронной сети и универсального аппроксиматора в виде нечеткого регулятора. Под руководством профессора Ульянова С.В., в институте «Системного Анализа и Управления» университета «Дубна» реализуется широкое внедрение в учебный процесс научных разработок, активное вовлечение студентов, аспирантов, молодых ученых в исследовательскую и изобретательскую деятельность. Так, на базе университета «Дубна» и НТП «Дубна», был разработан и реализован программно - аппаратный комплекс для обучения технологии проектирования интеллектуальных систем управления.
    
    Speaker: Dr Andrey Reshetnikov (University "Dubna")
Friday 8 July
- Mon 4 Jul
- Tue 5 Jul
- Wed 6 Jul
- Thu 7 Jul
- Fri 8 Jul
- Plenary reports LIT Conference Hall
  
  LIT Conference Hall
  - 151
    
    ONLINE PROCESSING OF LARGE VOLUME OF EXPERIMENTAL DATA IN STRONG ELECTROMAGNETIC AND RADIATION FIELDS IN ITER
    
    ITER data acquisition and control system consists of CODAC (Control, Data Access and Communication), Plasma control system (PCS), Central Safety System (CSS) and Central Interlock System (CIS). All of them have two tiers of control: global – central supervisor orchestrating the whole plant, and local – subsystem control. Core System (CCS) is the main conventional SCADA system of the ITER based on EPICS. More than 220 diagnostics and technical subsystems are interconnected. Some plasma diagnostics have tens of measurement channels with frequency band ~500 MHz and more. These multi-channel diagnostics systems (~50 synchronous channels) generate data flow ~200 Gbit/sec. These data arrays have to be measured and sent to CODAC and PCS in real time. Design of diagnostic acquisition systems must satisfy special ITER requirements, which creates problems for designers. The first problem is in external environment. Some sensors and front-end electronics are located in Port Cells, Galleries close to ITER vacuum vessel and so work in strong magnetic (~0.4 Теsla), neutron (~ 107 n.cm-2.s-1), and Hard X-ray fields. In addition, this equipment located near megawatt radiofrequency oscillators (up to 170 GHz) and must have good electromagnetic shielding. The second problem is in equipment itself. It cannot be relocated outside operation zone of installation because of long cables induce noise on experimental signals. The third problem is in electronics parts. In ITER the number of different electronics parts in radiation zone is very large ~500, compared with about 20 different parts in ATLAS detector. Development of special radiation-hard electronics parts for ITER is very expensive and not appropriate for the project. The fourth problem is in networks. Huge amount of experimental data have to be transmitted into Central systems in real time. Estimations show the required data transmission velocity is more than 10 times exceeds the channel capacity of existing ETHERNET and computer buses (~10 Gb/sec). The fifth problem is in data network organization. For plasma control system it is necessary to take into account not only data transmission velocity, but also jitter, latency and packets collision between plant system and supervisor and between plant systems. All above mentioned require optimization of acquisition systems in whole. Report presents data acquisition system developed for Russian ITER diagnostics, which satisfy requirements for QС3 and QС4 class equipment. Main features of this system are follows. External block with front-end electronics, designed as Faraday casing, placed in ITER Port Cell, in shielded cabinet. It contains minimum electronics components. Main block placed in protected zone at the distance ~100 m. Main and external equipment are connected by multi-channel digital optical lines (10 Gb/sec). Preliminary data processing implemented in smart ADCs (250 -500 MHz step, 14 bit, FPGA). Main data processing is organized inside plant controller (FPGA + CPU) and in local cluster (NvidiaTESLA S1070) connected to plant controller. System uses data compression and fits the size of the data flow to the ITER networks bandwidth
    
    Speaker: Dr Igor Semenov (ITER Russian Domestic Agency)
    
    Slides
  - 152
    
    Geographically Distributed Software Defined Storage
    
    The volume of the coming data in HEP is growing. Also growing volume of the data to be hold long time. Actually large volume of data – big data – is distributed around the planet. In other words now there is situation where the data storage does integrate storage resources from many data centers located far from each other. That means the methods, approaches how to organize, manage the globally distributed data storage are required. For personal needs the distributed storage has several examples like owncloud.org, pydio.com, seafile.com, sparkleshare.org. For enterprise level there are a number of distributed storage systems SWIFT (part of Openstack), CEPH and the like which are mostly object storage. When distributed storage integrate several data center resources the organization of data links becomes very important issue especially if several parallel data links between data centers are used. The situation on data centers and in data links might be changed each hour. All that means each part of distributed data storage has to be able to rearrange usage of data links and storage servers in each data center. In addition for each customer of distributed storage the different requirements have to be satisfied. Above topics are planned to be discussed in the proposal of data storage architecture.
    
    Speaker: Mr andrey shevel (PNPI, ITMO)
    
    Slides
  - 153
    
    Modeling of electron dynamics and thermodynamics in DNA chains
    
    The lecture gives a review of numerical experiments on a charge transfer in DNA. The charge motion is described in terms of quantum mechanics, whereas vibrational degreese of fredom are treated both classically and quantum mechanically. Special attention is given to dynamics of polaron state formation, polaron motion in electric field, Bloch oscillations and breather states. The dynamics of charge migration was modeled to calculate temperature dependencies of its thermodynamic equilibrium values such as energy and electronic heat capacity in homogeneous polynucleotide chains. The work was done with partial support from RFBR, project № 16-07-00305 and RSF project № 16-11-10163. References 1. A.P.Chetvericov, W. Ebeling, V. Lakhno, A.S.Shigaev, M.G.Velarde, Eur. Phys. J.B., (2016) 89:101 2. N. Fialko, E.Sobolev, V. Lakhno, Phys. Lett. A, (2016), 380, 1547 3. Lakhno V.D., Fialko N.S. Math. Biol. Bioinf, (2015), 10(2), 562
    
    Speaker: Victor Lakhno (Institute of Mathematical Problems of Biology RAS – the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences and Acting Head of the Laboratory of Quantum-Mechanical Systems of the IMPB RAS)
    
    Slides
  - 154
    
    UTFSM/CCTVal Data Center: 10 years of experience
    
    Starting end of 2006 UTFSM Data Center provides its computational resources for users from UTFSM and other Chilean universities. Since 2009 cluster facilities are also available in the frame of Grid Computing. The range of problems being solved at the cluster is huge: from fundamental and applied problems in different branches of Physics, Chemistry, Astronomy, Computer Science, etc. to the educational and training purposes. History and current cluster status (including configuration and operational modes) are presented.
    
    Speaker: Yury Ivanov (UTFSM, Valparaiso, Chile)
    
    Slides
  - 155
    
    SUPERCOMPUTER MODELLING THE DESTRUCTION PROCESSES OF INTERCONNECTIONS IN ELECTRONIC CIRCUITS
    
    Reduction of the basic electronic components sizes and their packing density has led to an aggravation of electronic schemes reliability problem. It is connected with the fact that the characteristic sizes of schemes active elements and thicknesses of the lines (interconnections) bringing a current to them became comparable with the defects sizes (for example, the intergranular spaces and cracks), and also with a diffusion length in conductive materials (10-15 nm). The solution of reliability problem demands the detailed research in this area and taking into account the mathematical models of atomic and molecular structure of individual construction elements. Realization of such approach brings to multiscale mathematical models and specific methods of their analysis. In this work we propose one of such multiscale approaches to the calculations of electronic schemes degradation processes. For the numerical realization of approach the technique using grid models at the macrolevel and molecular dynamics equations at the microlevel is developed. As the total algorithm is rather resource intensive, its realization is initially focused on parallel computing. In report the details of numerical and computer implementations are discussed, the examples of solving the particular problem are given. The work was supported by Russian Foundation for Basic Research (projects 15-07-06082-a, 14-01-00663-a).
    
    Speaker: Prof. Sergey Polyakov (Keldysh Institute of Applied Mathematics)
- 11:00
  
  Lunch
- Mathematical Methods and Algorithms for Parallel and Distributed Computing 406A
  
  406A
  - 156
    
    The brachistochrone problem as a minimum path on the graph
    
    The brachistochrone problem was posed by Johann Bernoulli in Acta Eruditorum in June 1696. He introduced the problem as follows [1]. Given two points A and B in a vertical plane. What is the curve traced out by a point acted on only by gravity, which starts at A and reaches B in the shortest time. The modern method to solve this problem is the method of calculus of variations exclusively [2]. The proposed report presents a graphical method of solution of the brachistochrone problem. The method consists of constructing a number of curves connecting points A and B. The minimal time of sliding along every of curve is calculated, and the minimal time curve is taken as the solution of the problem. The curves are searching as the broken lines. The minimum time broken line is searched as the minimal path on a special graph. The nodes of this graph are the nodes of a rectangular grid superimposed on the domain of the plane. The domain contains the points A and B. This graph is similar to graph from [3]. But now the weight of the arc is the time to move from one node to another. The special rule determines for each node a set of adjacent nodes. This rule uses the mutually simple numbers. The power of set may be 8, 16, 32, 48, 80, .... The minimal path is searched by means of the Dijkstra algorithm [4], adapted to this task. If the power of that set tends to infinity, then the minimal path tends to a smooth curve. If the power of that set tends to infinity, then the minimal path tends to a smooth curve. The paper [5] gives formula to calculate the proximity in case the curve is straight line and power of set is finite. This procedure replaces the continuous variational problem with the problem of discrete optimization. The closeness of the discrete problem solution to the solution of the continuous problem is defined. The proximity of the discrete problem solution to the continuous problem solution is determined. by means of the concept of proximity k-th order for curves and y = [2]. The proximity of the zero-order is determined through the distance from points of one curve to another curve. The proximity of the first order, is determined through that distance and at the same time through the difference in the length of these curves. References 1. http://www-history.mcs.st-andrews.ac.uk/HistTopics/Brachistochrone.html 2. Эльсгольц Л.Э. Дифференциальные уравнения и вариационное исчисление. УРСС, Москва, 1998. 3. Лотарев Д.Т. Применение метода поиска кратчайшего пути на графе для приближенного решения вариационной задачи// АиТ № 9. 2002 4. Dijkstra E.W. A Note on Two Problems in Conexion with Graphs// Numerische Mathematik l, 269 - 27 I (l 959). 5. Лотарев Д.Т. Построение цифровой модели местности на территории с равнинным рельефом// аит 1998. № 8. С. 53 - 62.
    
    Speaker: Dmitry LOTAREV (A.A. Kharkevich Institute for Information Transmission Problem, RAS)
  - 157
    
    Прикладные проблемы решения задачи вписывания многогранников
    
    В докладе рассматривается задача нахождения многогранников заданной формы внутри других многогранников. Данная задача является частным случаем третьей части 18-ой проблемы Гильберта. Доклад посвящен методам, которые используются для решения данной задачи в промышленных целях. Рассматриваются альтернативные методы, основанные на сведении данной задачи к задаче нелинейного оптимизации, их плюсы и недостатки по сравнению с общепринятыми переборочными подходами.
    
    Speaker: Mr Denis Kokorev (Institute for information transmission problems RAoS)
    
    Slides
  - 158
    
    Эволюционный метод минимизации суммы квадратов нелинейных функций (Opportunistic Evolutionary Method to Minimize a Sum of Squares of Nonlinear Functions)
    
    Нахождение глобального минимума суммы квадратов нелинейных функций — стандартная задача, возникающая при подгонке параметров математической модели по экспериментальным измерениям. В случае суммы линейных функций решение обычно находят при помощи обобщенного метода наименьших квадратов. В случае суммы нелинейных функций минимизационные алгоритмы, как правило, не используют тот факт, что целевая функция представлена в виде суммы квадратов. В работе предложен эволюционный метод, во время минимизации идентифицирующий группы коррелированных слагаемых. В случае частично-разделяемых задач метод позволяет ускорить сходимость к минимуму. Для параллельных вычислений метод использует схему ведущий/ведомые с асинхронным обменом заданиями (аргументами функций) и результатами вычислений между вычислительными узлами. Для типичных проблем ускорение по сравнению с вычислениями на одном процессоре почти равно числу вычислительных узлов (для систем из нескольких десятков узлов). Finding the global minimum of a sum of squares of nonlinear functions is ubiquitous in curve fitting, when one tries to determine physical parameters from experimental observations. While generalized least squares is a technique commonly used to minimize a sum of squares of linear functions, there are a few approaches to solve minimization problems with a sum of nonlinear functions as an objective function. We present an evolutionary method, which identifies groups of correlated summands during the minimization process. If a problem is partially separable, the method makes use of this opportunity to speed-up convergence. The proposed method implements a master-worker model for parallel calculations with asynchronous exchange of tasks (function arguments) and results between processing nodes and the master process. For practical problems the speed-up with respect to a single processor mode is nearly equal to a number of computing nodes (up to several tenths of processors).
    
    Speaker: Dr Mikhail Zhabitsky (Joint Institute for Nuclear Research)
    
    Slides
  - 159
    
    Эффективные алгоритмы построения и визуализации рельефа Земли и батиметрии Мирового океана
    
    Прямые вычислительные эксперименты с целью прогноза потенциально опасных морских явлений и оценки состояния прибрежных и океанских акваторий и атмосферы в настоящее время способны оперировать обширной информационной базой разномасштабных батиметрический карт, многолетними массивами наблюдений за параметрами состояния Мирового океана, и в том числе с возможностью вовлечения оперативных потоков данных о зондировании океана из космоса и телеметрических измерениях в реальном времени. Практическое использование больших массивов информации из разнородных источников требует разработки эффективных, а значит специализированных процедур для быстрой конвертации материалов, архивированных в различных форматах аппаратной кодировки данных, с последующей графической визуализацией аналитических выборок на больших географических покрытиях, и с возможностью быстрого синтеза различных вариантов представления результатов моделирования совместно с текущими данными наблюдений о состоянии моря.
    
    Speaker: Dr Vasily Khramushin (Saint-Petersburg State University)
  - 160
    
    Особенности разделения процессов физического моделирования в континуально-корпускулярных вычислительных экспериментах
    
    Исследование возможности достижения высокой эффективности вычислительных экспериментов при моделировании естественных процессов и явлений в трехмерных физических пространствах делает востребованным построение особого математического аппарата, непосредственно ассоциированного с набором операций и числовых объектов в архитектуре современных вычислительных систем. Традиционное математическое определение физических законов в интегральной и дифференциальной формах отмечается рассогласованием методов последовательных или рекуррентных соотношений с реальными возможностями параллельных расчетов на множестве независимых вычислительных ядер, к тому же обеспеченных универсальным аппаратом трехмерных геометрических преобразований в однородных координатах с автоматической визуализации результатов, в том числе включающих графические сцены в наглядных перспективных проекциях. Оптимальным вариантом математического моделирования в механике сплошных сред выбирается метод разделения вычислительного эксперимента на этапы с пространственной интерполяцией неразрывных – континуальных явлений внутри сеточной области с нерегулярными узлами для фиксации состояния моделируемых физических параметров, где в образующихся смежных ячейках позиционируются соразмерные подвижные и деформируемые корпускулы – обладающие инерционными свойствами и особой поляризаций для сопряжения свободных частиц во взаимозависимые кластеры в непосредственной близости от опорных ячеек исходной сеточной области.
    
    Speaker: Dr Vasily Khramushin (Saint-Petersburg State University)
  - 161
    
    Hybrid combination genetic algorithm and gradient methods to train a neural network
    
    Speaker: Ivan Priezzhev (Dubna University)
  - 162
    
    ПОЛНЫЕ ГРАФЫ И СЕТИ И ПРОБЛЕМА РЕАЛИЗУЕМОСТИ В НИХ ПОТОКОВ ПРОДУКТОВ
    
    Полные N-вершинные графы рассматриваются как выпуклые конусы G в вещественном пространстве R размерности N(N-1)/2, орты в котором суть ребра графа, а крайние векторы суть следующие векторы , где - орты пространства R. Показано, что крайние векторы конуса , сопряженного конусу G, можно интерпретировать как сети, имеющие потоковые и сетевые ребра. Показано также, что для любой сети, определяемой полным графом, а также выделенными в нем потоковыми ребрами с заданными интенсивностями потоков и сетевыми ребрами с заданными пропускными способностями, можно определить вектор в пространстве R – вектор сети – что реализуемость в сети соответствующего многопродуктового потока зависит от того, принадлежит ли этот вектор конусу G или нет. Это, в свою очередь, означает, что критерием реализуемости потока является неотрицательность скалярных произведений вектора сети с некоторыми из крайних векторов конусов .
    
    Speaker: Яков Гринберг (ИППИ РАН)
    
    Slides
  - 163
    
    Three-loop numerical calculation of critical exponents of the directed percolation process
    
    Directed bond percolation near its second-order phase transition is investigated by the means of the perturbative renormalization group approach. We present a numerical calculation of the renormalization group functions in the ε-expansion. Within this procedure anomalous dimensions are expressed in terms of irreducible renormalized Feynman diagrams and thus the calculation of renormalization constants could be entirely skipped. Numerical calculation of integrals has been done on Hybrilit cluster employing vegas algorithm from CUBA library.
    
    Speaker: Mr Lukas Mizisin (Pavol Josef Safarik University in Kosice)
    
    Slides
  - 164
    
    Research of MPI application efficiency on cloud and heterogeneous infrastructures of MICC JINR
    
    A comparative analysis of the efficiency of a distributed computing system based on the cloud infrastructure of MICC JINR and the heterogeneous cluster HybriLIT for parallel applications with MPI technology has been carried out. The calculations include both test problems and MPI-implementation of a program complex for calculations of physical characteristics in long Josephson junctions. Dependence of the amount of interprocessors interactions on speedup of computations in parallel mode has been investigated. For the model of Josephson junctions, parameter values of computational schemes based on 4-step Runge-Kutta method and 1-step Euler method has been obtained. The use of these parameters facilitates the efficiency of calculations with the cloud infrastructure in its current configuration. The work is supported by RFBR grant No 15-29-01217.
    
    Speaker: Mr Maxim Zuev (JINR)
  - 165
    
    Simulation processes of hydrodynamics with help the software OpenFOAM
    
    Speaker: Dr Dmitry Podgainy (JINR)
- 4. Scientific, Industry and Business Applications in Distributed Computing System LIT Conference Hall
  
  LIT Conference Hall
  - 166
    
    New developments of Distributed Computing Technologies in Moldova
    
    The recent developments are using results of previous projects like the regional project “Experimental Deployment of an Integrated Grid and Cloud Enabled Environment in BSEC Countries on the Base of g-Eclipse (BSEC gEclipseGrid)” supported by Black Sea Economical Cooperation Programme (http://www.blacksea-cloud.net). In this project we selected middleware for implementation of computing architecture that provide a collaborative, network-based model that enables the sharing of computing resources: data, applications, storage and computing cycles. The project allowed to introduce the general idea of federated Cloud infrastructure, which can provide different solutions for universities, scientific and research communities [1]. The project was focused on implementation approaches to combine the Grid and Cloud resources together as a single enhanced computational power and offer the possibility to use Grid or Cloud resources on demand. As an example, if the user requires parallel computational resources, he will submit a job on the Grid, but if the user needs any specific software or environment to solve some special problem, he can use a dedicated Cloud service or virtual image for that purpose. Fig. 1 shows the skeleton of the suggested platform. The proposed platform made possible to solve the following problems: increasing the effective usage of computational resources; providing additional different services for scientific and research communities; close collaboration between different resources providers to solve common regional problems. Figure 1: General structure of the proposed heterogeneous regional platform The objective is to create an infrastructure, which uses resources provided by geographically distributed heterogeneous computing clusters. These sites are operated by independent organizations, which have total control on managing their own resources, including the setup and enforcement of special administrative policies regarding authorization and access, security, resource usage quota, monitoring and auditing of the local infrastructure. The resource providers delegate the control for a part of their infrastructure in a safe and efficient way, so a federated infrastructure can be build based on the resources available on these distinct administrative domains. This is a major challenge regarding the implementation of a federated Cloud infrastructure as it is sought to be achieved by the currently implementing initiatives. These achievements we used for development of the national distributed computing infrastructure. The general scheme, showing the plan of the integrated computational infrastructure evolution in Moldova presented in fig. 2. Figure 2: The prospects of evolution of the distributed computing infrastructure in Moldova Future developments of integrated heterogeneous distributed computing infrastructure since 2016 continued within new regional project VI-SEEM (VRE for regional Interdisciplinary communities in Southeast Europe and the Eastern Mediterranean). During preparations for this new project the works were effectuated to unite in one regional infrastructure various distributed computing resources like Grid, HPC, storage and computing cloud. To achieve the initial idea to ensure heterogeneous resources management for HPC, Grid and storage access on the cloud we in Moldova continued works on development of relevant and flexible basic cloud infrastructure. The first experimental infrastructure based on OpenStack middleware was deployed using the latest at the moment Ubuntu Server 14.04 LTS as a base operation system for all nodes and the latest version of OpenStack release “Juno”. This infrastructure was interconnected via two dedicated Gbit switches - one for management and one for data network. External networking connectivity and internal networking for virtual machines provided via the Network Node. The Network Node runs SDN (Software Defined Network) technology software – Open Virtual Switch. It created virtual network infrastructure for virtual machines and segregates different network slices using GRE (Generic Routing Encapsulation) tunneling protocol. It also supports many networking protocols including OpenFlow and it allowed us to implement very flexible and powerful instrument. To develop OpenStack management capabilities and flexibility at the next step of cloud infrastructure extension we used Fuel open source deployment and management tool for OpenStack. Developed as an OpenStack community effort and approved as an OpenStack project under the Big Tent governance model, it provides an intuitive, GUI-driven experience for deployment and management of a variety of OpenStack distributions and plug-ins. Fuel brings consumer-grade simplicity to streamline and accelerate the otherwise time-consuming, often complex, and error-prone process of deploying various configuration flavors of OpenStack at scale. Unlike other platform-specific deployment or management utilities, Fuel is an upstream OpenStack project that focuses on automating the deployment of OpenStack and a range of third-party options, so it’s not compromised by hard bundling or vendor lock-in [2]. We deployed our Open Stack cluster using Mirantis Fuel project solutions. The cluster deployed is containing two computing nodes, one controller and now has in total 18 CPU cores, 36GB RAM and 1,1TB HDD storage. Fuel gives us the speed, ease and reliability of Open Stack cluster deployment, as well as the flexibility to configure the cluster on the fly. The current Open Stack cluster configuration is shown in fig. 3. New cluster’s components, such as computing nodes, controllers and storage nodes can be easily connected to an existing infrastructure or be removed from it when needed. The entire installation and configuration takes place automatically, followed by a set of special scripts that check system availability after cluster deployment (health check). Thanks to this, we are able to efficient use of the existing computing resources, reconfigure access to other virtualized facilities and save huge amount of time that was previously required to deploy and manage the distributed resources manually. Figure 3: Configuration of the Open Stack cluster based on open source Mirantis Fuel project. To ensure operation of federated mechanism to access distributed computing resources were finalized works to realize solutions that allow providing unified access to cloud infrastructures and be integrated in the creating Research & Educational identity management federations operated within eduGAIN inter-federation authorization & authentication mechanism (AAI) [3]. The practical results in the area of federated access to cloud implementation based on the EGI-Inspire AAI Cloud Pilot project “Federated Authentication and Authorization Infrastructure (AAI) for NREN services” and other new results obtained during deployment and administration of OpenStack cloud infrastructure. References: 1. Hrachya Astsatryan, Andranik Hayrapetyan, Wahi Narsesian, Peter Bogatencov, Nicolai Iliuha, Ramaz Kvatadze, Nugzar Gaamtsemlidze, Vladimir Florian, Gabriel Neagu, Alexandru Stanciu. Deployment of a Federated Cloud Infrastructure in the Black Sea Region. Computer Science and Information Technologies. Proceedings of the CSIT Conference, September 23-27, 2013, Erevan, Armenia, pp.283-285. ISBN 978-5-8080-0797-0. 2. https://www.mirantis.com/products/mirantis-openstack-software/openstack-deployment-fuel/. 3. P. Bogatencov, N. Degteariov, N. Iliuha, and P. Vaseanovici “Implementation of Scientific Cloud Testing Infrastructure in Moldova,” Proceeding of The Third Conference of Mathematical Society of the Republic of Moldova, August 2014, Chisinau, Moldova, pp. 463–466.
    
    Speaker: Dr Peter Bogatencov (RENAM, Moldova)
  - 167
    
    Инфраструктура и основные задачи дата-центра института Физики НАН Азербайджана.
    
    Грид и облачные технологии являются основными направлениями развития дата-центра Института физики НАН Азербайджана. Пользователи дата-центра, используя возможности, которые предлагают эти технологии, успешно решают задачи в области физики высоких энергий, физики твердого тела и других научных направлений института. Инфраструктура дата-центра представлена высокопроизводительными серверами Supermicro, способными решать серьезные задачи на самом высоком уровне. Эффективному развитию дата-центра способствует сотрудничество с ОИЯИ, CERN и другими международными научными центрами.
    
    Speaker: Mr Aleksey Bondyakov (JINR (Joint Institute For Nuclear Research))
    
    Slides
  - 168
    
    KIAM JOB_CONTROL TASK MANAGEMENT SYSTEM AND ITS APPLICATION TO THE CLOUD AND GRID COMPUTING
    
    Nowadays, the tasks related to development of nanotechnologies are of a great interest. The solution of such tasks requires the application of high-performance computing systems. Those computer systems are very expensive in installation and service therefore it is important to reduce the amount of their idle time. Due to this fact, the task of redistribution of calculations and restarting the applications on other available computing systems became a challenging problem for the systems end-users. In this work we provide a web-environment for carrying out mass supercomputer calculations. Throughout this work the KIAM Job_Control service was developed for controlling the applications and computing resources. KIAM Job_Control allows to perform long-term calculations on a set of supercomputers by automatic transfer of applications and calculated data between them. Basic functions of the system: application launch, application termination, data saving, relocation of save point on other supercomputer, application restart, interactive communication with the application during computation. The KIAM Job_Control service is designed to deal with a wide variety of tasks. The GIMM_NANO software complex oriented on the solution of actual tasks of nanotechnologies works using this service. One of applications is a module for molecular dynamics calculations. By means of this module multiscale simulation of non-linear processes in gas–metal microsystems is made. In the presentation we will describe the technologies allowing to realize the required service functionality. Also an example of its usage will be presented on the tasks of molecular dynamics calculations. The work was supported by Russian Foundation for Basic Research (projects 15-07-06082-a, 15-29-07090-ofi_m).
    
    Speaker: Mr Dmitry Puzyrkov (Keldysh Institute of Applied Mathematics)
  - 169
    
    Визуализация большого массива данных по мгновенному сердечному ритму с использованием фазового пространства
    
    К настоящему времени основным подходом к оценке и предсказанию риска развития фатальных сердечно-сосудистых осложнений считается анализ вариабельности сердечного ритма (ВСР). Одним из перспективных направлений исследования ВСР является анализ мгновенного сердечного ритма (МСР). Холтеровское мониторирование (ХМ) позволяет получать данные о МСР в течение нескольких суток. За 24 часа ХМ получается массив из порядка 150000 данных. В работах [1,2] введена МСР функция - y(t), которая наиболее адекватно отражает динамику сердечно-сосудистой системы. Наряду с y(t) можно ввести функцию, которая характеризует скорость изменения МСР, которую мы назовем МССР (мгновенной скоростью сердечного ритма) функцией - v(t). Функции y(t) и v(t) содержат полную информацию о характере МСР на интересующем нас промежутке времени. Множество точек в R2 с декартовыми координатами y(t) и v(t) образуют фазовое пространство (ФП) МСР. Функции y(t) и v(t) определяют фазовую траекторию (ФТ), а каждая ее точка определяет состояние МСР и, соответственно, носит название фазовой точки. Наряду с ФП МСР используется расширенное фазовое пространство (РФП) МСР, которая представляет собой множество точек в R3 с декартовыми координатами (y(t),v(t),n(t)). Функция n(t) описывает число прохождений ФТ через точку с координатами (y(t),v(t)). В этом случае ФТ в ФП представляет собой проекцию ФТ в РФП на плоскость с координатами (y,v). Приводится и анализируется конкретный вид РФП и ФП МСР одного из пациентов Тверского областного кардиологического диспансера. Время ХМ взято равным 632 сек. Проведенное нами исследование указывает на действительно сложный характер поведения функций y(t) и v(t), характеризующих состояние МСР. На конкретном примере мы показали эффективность исследования этих функций на основе визуализации большого массива данных по МСР с использованием РФП . Литература. [1] Кудинов А.Н., Лебедев Д.Ю., Цветков В.П., Цветков И.В. Математическая модель мультифрактальной динамики и анализ сердечных ритмов. Математическое моделирование, том 26, № 10, 2014, стр. 127 – 136. [2] Иванов А.П., Кудинов А.Н., Лебедев Д.Ю., Цветков В.П., Цветков И.В. Анализ мгновенного сердечного ритма в модели мультифрактальной динамики на основе холтеровского мониторирования. Математическое моделирование, том 27, № 4, 2015, стр. 16 – 30.
    
    Slides
  - 170
    
    Регуляризация обратной задачи определения параметров нелокального переноса энергии в термоядерной плазме и ее решение методами оптимизации
    
    Явление чрезвычайно быстрого распространения тепла в магнитно-удерживаемой термоядерной плазме, часто называемого «супердиффузией» или «нелокальным переносом» тепла известно уже более 40 лет и наблюдается в экспериментальных установках термоядерного синтеза различных типов («Токамак» и «Стелларатор»). Однако общепризнанной адекватной теории этого процесса не создано до сих пор. В докладе рассматривается одна из математических моделей, предложенная сотрудниками НИЦ «Курчатовский институт». В ее основе — подлежащие определению коэффициенты поглощения и испускания плазмой гипотетических «переносчиков» энергии как функции температуры и частоты. В докладе рассматривается упрощенный «монохроматический случай», когда функции источника и поглощения зависят только от температуры плазмы, зависящей, в свою очередь, от нормированной радиальной координаты (r) «вдоль» эффективного радиуса плазменного шнура. В предположении однозначной связи функций поглощения и испускания, неизвестную функцию поглощения К(Т) предлагается искать в результате решения интегрального уравнения специального вида. В качестве исходных данных доступны профили температуры плазмы вида T(r,t) и ee «электронной» плотности e(r,t). Здесь t — отметки времени замеров после начала процесса переноса энергии (до этого плазма находилась в некотором равновесном состоянии). Первоначально эти данные не корректировались, а для целей численного интегрирования и дифференцирования к ним применялась процедура кубической сплайн-интерполяции. Непосредственное применение к указанному интегральному уравнению методов оптимизационной идентификации параметров искомой функции K(T) в результате минимизации его «невязки» не удовлетворяло наших коллег точностью «пападания» в предоставленные ими экспериментальные данные. Кроме того, применяемый поначалу метод решения обратной задачи не позволял количественно учесть очевидную погрешность исходных данных. Эти соображения заставили применить к задаче иной метод, в основе которого регуляризующая аппроксимация исходных данных. Для регуляризации задачи (и оценки исходных и возникающих погрешностей) применяется оригинальный метод аппроксимации SvF (Simplicity versus Fitting, Простота модели против точности повторения экспериментальных данных). Метод состоит в поиске компромисса между простотой модели и точностью повторения экспериментальных данных. В качестве меры простоты модели используется гладкость искомых кривых и поверхностей (интегралы от квадрата 2-ой производной), в качестве меры близости к данным – обычное, среднеквадратичное отклонение. Результат (сглаженные кривые и оценки точности) достигается путем минимизации функционала состоящего из взвешенной суммы меры простоты и меры близости. Для выбора оптимального значения весов используется процедура перекрёстного оценивания (Cross-validation). В результате оказалось, что оставаясь в пределах четырех процентов (4%) коррекции исходных данных удалось определить искомые неизвестные функции, т.е. добиться почти точного совпадения экспериментальных данных с расчетными, что свидетельствует о перспективности применения SvF-метода в подобных обратных задачах.
    
    Speaker: Александр Соколов (Витальевич)
    
    Slides
  - 171
    
    ПАРАЛЛЕЛЬНЫЕ АЛГОРИТМЫ ДЛЯ 3D МОДЕЛИРОВАНИЯ ТЕРМИЧЕСКИХ ПРОЦЕССОВ В РАМКАХ ГИПЕРБОЛИЧЕСКИХ УРАВНЕНИЙ ТЕПЛОПРОВОДНОСТИ
    
    Гиперболическое уравнение теплопроводности[1] описывают процессы теплопереноса в локально-неравновесных условиях. Одной из задач приводящим к локально-неравновесным условиям является облучение материалов тяжелыми ионами высоких энергий. Взаимодействие ионов высоких энергий с материалами происходит во временах порядка ~ 10-14 c и в таких временах для описания тепловых процессов применяется гиперболическое уравнение теплопроводности. В данной работе для численного решения гиперболического уравнения теплопроводности применяется экономичные схемы конечно-разностного метода[2] и их параллельная реализация. Исследована эффективность вычислительных схем и параллельных алгоритмов для 3D моделирования. Получены результаты исследований термических процессов в металлических мишенях при облучении тяжелыми ионами высоких энергий. [1] Лыков А.В. Тепломассобмен: 2-ое изд., М. Энергия, 1978, 480с. [2] Самарский А.А., Гулин А.В. Численные методы М.: Наука,1989, 432 c.
    
    Speaker: Zarif Sharipov (JINR)
  - 172
    
    НЕПРЕРЫВНО-АТОМИСТИЧЕСКОЕ МОДЕЛИРОВАНИЕ С ИСПОЛЬЗОВАНИЕМ ВЫСОКОПРОИЗВОДИТЕЛЬНЫХ ВЫЧИСЛИТЕЛЬНЫХ СИСТЕМ ПРОЦЕССОВ ВЗАИМОДЕЙСТВИЯ ТЯЖЕЛЫХ ИОНОВ С МЕТАЛЛАМИ
    
    Исследования в области облучения материалов тяжелыми ионами высоких энергий (ТИВЭ) проводятся на протяжении нескольких десятилетий. Проведение экспериментальных исследований в этих областях трудоемко и дорого, поэтому актуальным становится математическое моделирование, которое требует разработки новых и развитие существующих моделей на основе новых экспериментальных данных взаимодействия ТИВЭ с материалами. В работе предлагается непрерывно-атомистический подход[1,2] для моделирования взаимодействия ТИВЭ с конденсированными средами. Непрерывно-атомистическая модель представляет собой два разных класса задач, а именно непрерывные уравнения теплопроводности модели термического пика[3] и уравнения движения материальных точек метода молекулярной динамики[4]. Использование высокопроизводительных систем для непрерывно-атомистического моделирования требует разработки новых вычислительных схем и параллельных алгоритмов. В работе для решения уравнений непрерывно-атомистической модели разработан алгоритм и вычислительная схема с возможностью использования в многопроцессорных системах. Исследована эффективность вычислительных схем и параллельных алгоритмов. Получены результаты исследований на примере металлических мишеней при облучении тяжелыми ионами. [1] D. Ivanov and L. Zhigilei. Combined atomistic-continuum modeling of short-pulse laser melting and disintegration of metal films // Physical Review B 68. 064114 (2003). [2] Norman G.E., Starikov S.V., Stegailov V.V., Saitov I.M., Zhilyaev P.A. Atomistic modeling of warm dense matter in the two-temperature state // Contrib. Plasma Phys. 2013. V. 53. P. 129-139. [3] М.И. Каганов, И.М. Лифшиц, Л.В. Танатаров. Релаксация между электронами и решеткой // ЖЭТФ. 1956. N.31. № 2(8). C.232-237. [4] Х.Т. Холмуродов, М.В. Алтайский, И.В. Пузынин и др. Методы молекулярной динамики для моделирования физических и биологических процессов // ЭЧАЯ. 2003. Т. 34. Вып. 2. С. 472-515. Работа выполнена при частичной финансовой поддержке гранта РФФИ № 15-01-06055-а и гранта Полномочного представителя Республики Болгария в ОИЯИ.
- 6. Cloud computing, Virtualization
  - 173
    
    JINR cloud: status report
    
    A number of the JINR cloud users as well as a demand in cloud resources from both individual users and experiments are growing permanently. Thus, the JINR cloud was re-organized to increase a reliability and availability of service. To cover the resources deficit, work is in progress in three directions: 1) the resources usage optimization with the help of a smart scheduler which consolidates underloaded virtual machines (VMs) and containers (CTs) within the hosts; 2) the external clouds integration with the JINR one following the so called “cloud bursting” approach what a custom driver was developed for; 3) buying more servers. The new JINR cloud architecture, the basic idea of the smart cloud scheduler, specifics of the “cloud bursting” driver implementation are given.
    
    Speaker: Dr Nikolay Kutovskiy (JINR)
    
    Slides
  - 174
    
    Monitoring systems comparison for collecting metrics for cloud management and optimization.
    
    Cloud technologies provide new tools for improving the efficiency of IT infrastructure. To manage load and optimize cloud resource utilization, e.g. through overcommitment and automated migration, it is necessary to gather performance metrics from the whole cloud infrastructure. The OpenNebula built-in monitoring system has limited configuration possibilities and does not keep historical data for long enough. For this reason, an external monitoring system needs to be used. There are many popular tools suitable for this task. To have an objective metric to help compare these systems, a performance test scheme has been proposed. With one monitoring server and an increasing number of monitored nodes, the server load was measured over time. The results for Ganglia, Icinga2, NetXMS, NMIS and Zabbix are given in this report.
    
    Speaker: Mr Ivan Kadochnikov (JINR)
  - 175
    
    Smart Cloud Scheduler
    
    Several years of experience in managing the LIT JINR Cloud service have revealed some shortcomings and reasons of its underutilization. A need and possibilities of the cloud infrastructure optimization is demonstrated. The talk covers the Smart Cloud Scheduler project development process aimed at optimizing the JINR Cloud utilization through a dynamic reallocation and consolidation of virtual resources with a minimal impact on the cloud users’ applications.
    
    Speaker: Mr Nikita Balashov (JINR)
    
    Slides
  - 176
    
    Virtual clusters as a way to experiment software
    
    In this article we represent researches done in virtualization sphere which help to create computing infrastructures based on container clusters. Consider an opportunity to create different topologies and test software behaviour without the need to construct real network topologies with real nodes. One of such examples is an application running on multiple nodes over a network. We cannot rely on experiments made using Internet because it doesn't provide exact repetition of experiment conditions. However, another topology or network characteristics can speedup computations thus such experiments should be done. Nowadays virtualization techniques enables instant creation of virtual clusters and it's even possible to simulate the conditions of poor communication between nodes, to limit the bandwidth, to add delays like in real network, to add some errors (BER) and so on. We investigate available tools and do experiments with different limitations on network and node characteristics.
    
    Speaker: Mr Artem Krosheninnikov (SPbSU)
  - 177
    
    Dynamical virtualized computing clusters for HEP at Budker INP
    
    There are several experimental groups at Budker INP participating in local and international HEP projects. Their requirements on computing resources and environments vary widely, often being incompatible. On the other hand, there are several computing sites nearby the institute, providing supercomputer resources for academic users, each site having its own specific setups. The dynamical virtualized computing cluster concept allows to integrate these remote resources for BINP users in a convenient for each users group manner. The presentation shows details of the implementation, namely the mechanisms of integration of local and remote batch systems and running virtualized machines inside batch system task. Several use cases and success stories at BINP are listed.
    
    Speaker: Mr Andrey Sukharev (Budker INP)
  - 178
    
    CLOUD PLATFORM FOR DATA MANAGEMENT OF THE ENVIRONMENTAL MONITORING NETWORK: UNECE ICP VEGETATION CASE
    
    There are seven International Cooperative Programmes (ICPs) and Task Forces that report to the Working Group of the Convention on Long-Range Transboundary Air Pollution (CLRTAP) on the effects of atmospheric pollutants on different components of the environment and health. Among those programmes the UNECE ICP Vegetation*, which was established in the late 1980s, plays a special role by providing data on the concentrations of twelve heavy metals, nitrogen, POPs** and radionuclides of natural and technogenic origin in naturally growing mosses throughout Europe. The aim of the UNECE ICP Vegetation surveys carried out every 5 years is to identify the main polluted areas, produce regional maps and further develop the understanding of the long-range transboundary pollution. Since January 2014, the coordination of moss surveys in 36 European and Asian countries has been conducted from the JINR in Russia. Analytical results and information on the sampling sites (MossMet set) reported to JINR include confidential acceptance of the data from individual contributors, the storage of large data arrays, their initial multivariate statistical possessing followed by applying GIS*** technology, and, possibly, the use of artificial neural networks for predicting concentrations of chemical elements in various environments. To simplify the whole procedure, it is proposed to build a unified platform consisting of a set of interconnected services and tools to be developed, deployed and hosted in the JINR cloud. Such an approach also will allow scaling up and down cloud resources depending on the service load. That will increase efficiency of the hardware utilization as well as the reliability and availability of the service itself for the end users. The overall and specific objectives, as well as the design principles of the platform are discussed. Tools for acceptance, storage, processing and interpretation of network data are presented. The developed software can be used for global air pollution monitoring purposes anywhere in the world to assess the pathway of pollutants in the atmosphere.
    
    Speaker: Dr Alexander Uzhinskiy (Dr.)
  - 179
    
    Методика и алгоритм выбора стандартов для профиля интероперабельности в облачных вычислениях.
    
    Настоящий доклад содержит материалы, представляющие собой развитие наших результатов по «проблеме интероперабельности» в облачных вычислениях, доложенных на предыдущих конференциях «GRID’2010», «GRID’2012» и «GRID’2014» [1,2,3,4]. Напомним, что «Интероперабельность - способность систем или компонентов обмениваться информацией и использовать эту информацию (ISO/IEC 24765-2010). В основе достижения интероперабельности лежит технология открытых систем и использование согласованных наборов ИТ-стандартов – профилей [5]. Проблема интероперабельности возникает в гетерогенной ИКТ-среде для информационных систем практически любого назначения и масштаба (от наносистем до Грид-систем, систем облачных вычислений и сверхбольших систем – systems of systems). И она тем острее, чем выше уровень гетерогенности среды. Обеспечение интероперабельности – сложная научно-техническая задача, которой занимаются многие организации и исследователи – основными международными организациями в области Грид-систем и систем облачных вычислений следует считать Open Grid Forum (OGF) и Open Cloud Consorcium (ОСС). Этими вопросами занимается также IEEE. В нашей стране проблема развития принципов интероперабельности, стандартов и технологий открытых систем, а также развитие технологий и стандартов Грид включены в Программу фундаментальных исследований государственных академий наук в 2013-2020 гг. На конференции «GRID’2014» авторами была рассмотрена методика обеспечения интероперабельности в Грид-среде и облачных вычислениях, которая использует принципы системной инженерии, базируется на едином подходе, разработанном нами ранее и зафиксированном в ГОСТ Р 55062-2012. Напомним, что методика состоит из основных этапов, таких как: «Построение концепции», «Построение архитектуры», «Построение проблемно-ориентированной модели», «Построение профиля интероперабельности», «Реализация» и вспомогательных этапов: «Построение дорожной карты разработки стандартов», «Разработка глоссария», «Разработка стандартов». Также были рассмотрены различия в содержании этапов методики для Грид и облаков и представлена модель интероперабельности облачных вычислений [4]. В предстоящем докладе будет детально рассматриваться ключевой этап методики, который ранее был наиболее слабо проработан, а именно этап построения профиля интероперабельности для облачных вычислений. Поскольку профиль представляет согласованный набор стандартов, структурированный в терминах эталонной модели интероперабельности, в докладе будет рассмотрена усовершенствованная модель интероперабельности в облачных вычислениях (см. рис. 1), которая является развитием модели, представленной на предыдущей конференции [4]. Рисунок 1 - Модель обеспечения интероперабельности облачных вычислений. Построение профиля в целом осуществляется в соответствии с документом Госстандарта Р50.1.041-2002 [6]. В докладе будут детально рассмотрены методика и алгоритм выбора стандартов для профиля интероперабельности в облачных вычислениях. В основе методики выбора стандартов лежит метод многокритериального анализа альтернатив, основанный на механизме нечеткого логического вывода [7]. Процесс выбора стандартов заключается в формировании системы критериев для отбора стандартов, на основе которых вычисляются промежуточные коэффициенты соответствия и обобщающий коэффициент, позволяющие сделать выбор в пользу одного из стандартов. В докладе будет представлена адаптация методики для среды облачных вычислений за счет расширения критериев. К основным подэтапам построения профиля, которые можно частично формализовать, а, следовательно, автоматизировать, следует отнести следующие: 1. Формирование базы данных используемых стандартов (инфраструктура базовых стандартов, ИБС). 2. Идентификация и выбор нормативных документов для служб, обеспечивающих информационные технологии. 3. Идентификация и выбор нормативных документов для сервисов информационных систем. В докладе будут представлены алгоритмы работы автоматизированной системы поддержки проектирования профиля (АСППП). Применение предложенных методики и алгоритмов позволит наиболее обоснованно выбрать стандарты для профиля интероперабельности облачных вычислений с учетом особенностей предметной области. Список литературы 1. Журавлев Е.Е., Корниенко В.Н., Олейников А.Я. Вопросы стандартизации и обеспечения интероперабельности в GRID-системах. // Распределенные вычисления и Грид-технологии в науке и образовании: Труды 4-й междунар. конф. (Дубна, 28 июня – 3 июля, 2010 г.). Дубна. 2010. С. 364-372. 2. Журавлев Е.Е., Корниенко В.Н., Олейников А.Я. Исследование особенностей проблемы интероперабельности в GRID технологии и технологии облачных вычислений // Исследование особенностей проблемы интероперабельности в GRID технологии и технологии облачных вычислений. Дубна. 2012. С. 312-320. 3. Иванов С.В. Вопросы интероперабельности в облачных вычислениях // Распределенные вычисления и грид-технологии в науке и образовании: Труды 5-й международной конференции (Дубна, 16-21 июля, 2012г). Дубна: ОИЯИ. 2012. С. 321-325. 4. Журавлев Е.Е., Иванов С.В., Каменщиков А.А., Корниенко В.Н., Олейников А.Я., Широбокова Т.Д., "Особенности методики обеспечения интероперабельности в грид-среде и облачных вычислениях," Компьютерные исследования и моделирование, Т. 7, № 3, Июль 2015. С. 675-682. 5. Олейников А.Я. Технология открытых систем. Москва: Янус-К, 2004. 6. Рекомендации Госстандарта Р50.1.041-2002 «Информационные технологии. Руководство по проектированию профилей среды открытой системы организации пользователя». 7. Королев А.С. Модели и алгоритмы интеллектуальной поддержки принятия решений при создании открытых информационных систем // дис. канд. тех. наук. Моск. гос. ин-т радиотехники, электроники и автоматики. 2007.
    
    Speaker: Sergey Ivanov (RosNOU)
  - 180
    
    bwCloud: cross-site server virtualization
    
    Computer Centers are using the Cloud Computing, providing their services based on different models. The purpose of bwCloud project is to build the prototype for cross-site Infrastructure as a Service (IaaS) system, based on the set of universities in Baden Württemberg (Germany). The system architecture, project status and the future work are the subjects of this work.
    
    Speaker: Oleg Dulov (Karlsruhe Institute of Technology)
    
    Slides
- 14:30
  
  Closing LIT Conference Hall
  
  LIT Conference Hall
- 15:00
  
  Coffee

Choose timezone

SCIENCE BRINGS NATIONS TOGETHER The 7th International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2016)

Welcome to GRID 2016!

LIT Conference Hall

LIT Conference Hall

LIT Conference Hall

LIT Conference Hall

St. Stroiteley, 2

International Conference Hall

LIT Conference Hall

406B

310

406A

LIT Conference Hall

406B

310

LIT Conference Hall

LIT Conference Hall

LIT Conference Hall

406A

LIT Conference Hall

406B

406A

310

LIT Conference Hall

406B

LIT Conference Hall

406A

LIT Conference Hall

LIT Conference Hall

SCIENCE BRINGS NATIONS TOGETHER

The 7th International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2016)