The 8th International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2018)

10–14 Sept 2018

Europe/Moscow timezone

Support

grid2018@jinr.ru

Session

2. Operation, monitoring, optimization in distributed computing systems

11 Sept 2018, 13:30

234. Development of an advanced data acquisition system based on TRB-3

Andrey Kondratyev (JINR)

11/09/2018, 13:30

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

As the volume of data produced by the ALICE experiment at the Large Hadron Collider grows, the requirements on detector data acquisition systems rise as well, for example the need for higher throughput. One possible way to address this problem is to use the TRB-3 platform. The solution is a deep modernization of the existing data acquisition model.

322. Mechanisms for ensuring the integrity of information in distributed computing systems in the long-term period of time

Mr Anatoly Minzov (MPEI)

11/09/2018, 13:45

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

The article discusses how to ensure the integrity of information over a long period of time. This task has not been raised before. However, experience shows that, over long periods, information in electronic archives can change uncontrollably or even disappear. Attacks on the integrity of electronic archives can be targeted. This requires the creation of...

201. Trigger information data flow for the ATLAS EventIndex

Mr Mikhail Mineev (JINR)

11/09/2018, 14:00

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

Trigger information is an important part of the ATLAS event data. In the EventIndex, trigger information is collected for various use cases, including event selection and overlap counting. Decoding the trigger information from the event records, where it is stored as a bit mask, requires additional input from the conditions metadata database, because trigger configurations evolve with time. It depends on...

227. Improving Networking Performance of a Linux Node

Mr Vladimir Gaiduchok (Saint Petersburg Electrotechnical University "LETI", Russia)

11/09/2018, 14:15

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

Linux networking performance is excellent, and the Linux networking stack is very effective. There are many options that can be configured for different cases. This article is devoted to the implementation and configuration of the networking stack. Questions related to effective configuration will be discussed, and possible problems and solutions will be shown. Different options...

253. Application of unified monitoring system in LHAASO

Mr Qingbao Hu (IHEP)

11/09/2018, 14:30

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

The online computer room of the LHAASO experiment is located at high altitude in a poor natural environment, and there are no permanent resident maintenance personnel, so an automatic operation and maintenance system must be deployed for remote management. According to the characteristics of the LHAASO cluster management, we have designed a distributed monitoring...

295. Development of JINR Tier-1 service monitoring system

Mr Igor Pelevanyuk (JINR)

11/09/2018, 14:45

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

The Tier-1 center for CMS at JINR has been operating successfully since 2015. Monitoring is an important aspect of ensuring its performance. Hardware monitoring of the Tier-1 center was introduced at construction time and has been continually upgraded together with the center. The scientific community makes use of the resources through Grid services that depend on lower-level services. A dedicated...

243. The BigPanDA monitoring system architecture

Tatiana Korchuganova (National Research Tomsk Polytechnic University)

11/09/2018, 15:30

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

Currently running large-scale scientific projects involve unprecedented amounts of data and computing power. For example, the ATLAS experiment at the Large Hadron Collider (LHC) collected 140 PB of data over the course of Run 1; this value is increasing at a rate of ~800 MB/s during the ongoing Run 2 and has recently reached 350 PB. Processing and analysis of such amounts of data demands...

312. The BigPanDA self-monitoring alarm system for ATLAS

Mr Aleksandr Alekseev (National Research Tomsk Polytechnic University)

11/09/2018, 15:45

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

The BigPanDA monitoring system is a Web application created to deliver real-time analytics covering many aspects of the ATLAS experiment's distributed computing. The system serves about 35000 requests daily and provides critical information used as input for various decisions: from distribution of the payload among available resources to issue tracking related to any of 350k jobs running...

354. Search for Anomalies in the Computational Jobs of the ATLAS Experiment with the Application of Visual Analytics

Ms Grigorieva Maria (NRC KI)

11/09/2018, 16:00

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

ATLAS is the largest experiment at the LHC. It generates vast volumes of scientific data accompanied by auxiliary metadata. These metadata represent all stages of data processing and Monte-Carlo simulation, as well as characteristics of the computing environment, such as software versions and infrastructure parameters, detector geometry and calibration values. The systems responsible for data...

200. BigData tools for the monitoring of the ATLAS EventIndex

Mr Andrei Kazymov (JINR), Mr Evgeny Alexandrov (JINR)

11/09/2018, 16:15

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

The ATLAS EventIndex collects event information from data both at CERN and Grid sites. It uses the Hadoop system to store the results, and web services to access them. Its successful operation depends on a number of different components that have to be monitored constantly to ensure continuous operation of the system. Each component has completely different sets of parameters and states and...

314. Tier-1 centre at NRC «Kurchatov institute» between LHC Run2 and Run3

Igor Tkachenko (NRC "Kurchatov Institute")

11/09/2018, 16:30

2. Operation, monitoring, optimization in distributed computing systems

Plenary reports

The development and modernization of the Tier-1 center at the National Research Center "Kurchatov Institute" are considered in light of the changing requirements of experiments at the Large Hadron Collider. Increasing requirements for computing resources, driven by an increase in simulations, led to an increase in their volumes, which in turn required the development of...

340. Performance measurements for the WLCG Cost Model

Victoria Matskovskaya

11/09/2018, 16:45

2. Operation, monitoring, optimization in distributed computing systems

Sectional reports

The high energy physics community needs metrics that characterize the resource usage of the experiments' workloads in enough detail that the impact of changes in the infrastructure or the workload implementations can be quantified with a precision high enough to guide design decisions towards improved efficiencies. This model has to express the resource utilization of the workloads in...

375. Evaluation of the performance of a cluster monitoring system based on Icinga2

Ivan Kashunin (JINR)

11/09/2018, 17:00
