The 7th International Conference "Distributed Computing and Grid-technologies in Science and Education" (GRID 2016)

Start: 2016-07-04T07:00:00+03:00
End: 2016-07-09T17:00:00+03:00

4–9 Jul 2016

Europe/Moscow timezone

Grid Site Monitoring and Log Processing using ELK

7 Jul 2016, 08:00

20m

LIT Conference Hall

Plenary reports; 10. Databases, Distributed Storage systems, Big data Analytics

Mr Alexandr Mikula (Institute of Physics of the Czech Academy of Sciences)

A typical WLCG Tier-2 centre runs several hundred servers providing different services. Manually checking every log file is impossible, so various smart solutions for monitoring and log file analysis are used. We describe the procedures used in the Computing Centre of the Institute of Physics in Prague, which hosts a Tier-2 centre for the ALICE and ATLAS experiments and provides resources for several other projects. Nagios is used as the basic monitoring tool set. Our custom plug-in aggregates warning and standard error messages and sends a summary to administrators by email three times per day. Errors on critical components are sent immediately by email and by Short Message Service (SMS) to predefined phone numbers. Nagios is complemented by Munin and Ganglia for a better status overview of each server and of the whole infrastructure. The ELK stack is the most recent part of our monitoring setup. Log files from all production servers are shipped to Logstash for processing and then stored in Elasticsearch. We will describe the hardware used, the role of each machine in the ELK cluster, the technological challenges and obstacles, and our cluster setup and its tuning. Typical examples of searches and graphical outputs will be presented.
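
The alert handling described in the abstract (periodic summaries for ordinary warnings and errors, immediate notification for errors on critical components) can be illustrated with a minimal sketch. This is not the authors' Nagios plug-in: the `Alert` and `AlertRouter` names, the host names, and the digest format are all hypothetical, and actual delivery by email or SMS is left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Alert:
    host: str       # hypothetical host name, e.g. "se1"
    severity: str   # "warning" or "error"
    message: str


@dataclass
class AlertRouter:
    """Route alerts either to an immediate queue or to a periodic digest.

    Assumption: a component is "critical" if its host is in critical_hosts.
    """
    critical_hosts: set
    pending: dict = field(default_factory=lambda: defaultdict(list))
    immediate: list = field(default_factory=list)

    def handle(self, alert: Alert) -> str:
        # Errors on critical components bypass the digest entirely.
        if alert.severity == "error" and alert.host in self.critical_hosts:
            self.immediate.append(alert)
            return "immediate"
        # Everything else is collected per host until the next digest.
        self.pending[alert.host].append(alert)
        return "digest"

    def flush_digest(self) -> str:
        """Build one summary text from all pending alerts and clear them."""
        lines = []
        for host in sorted(self.pending):
            alerts = self.pending[host]
            lines.append(f"{host}: {len(alerts)} message(s)")
            for a in alerts:
                lines.append(f"  [{a.severity}] {a.message}")
        self.pending.clear()
        return "\n".join(lines)


router = AlertRouter(critical_hosts={"se1"})
router.handle(Alert("se1", "error", "disk failed"))    # goes out immediately
router.handle(Alert("wn12", "warning", "load high"))   # waits for the digest
router.handle(Alert("wn12", "error", "job failed"))    # waits for the digest
digest = router.flush_digest()
```

In a real deployment `flush_digest` would be called by a scheduler three times per day, and the `immediate` queue would be drained by whatever email and SMS gateway the site uses.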


Slides

mikula.pdf
