ATLAS Production System

5 Jul 2016, 09:30
30m
Plenary reports
3. Middleware and services for production-quality infrastructures

Speaker

Mr Mikhail Borodin (NRNU MEPHI, NRC KI)

Description

The second generation of the ATLAS production system, ProdSys2, is a distributed workload manager used by thousands of physicists to analyze data remotely. The volume of processed data is beyond the exabyte scale, spread across more than a hundred heterogeneous sites. ProdSys2 achieves high utilization by combining dynamic job definition based on many criteria, such as input and output size, memory requirements and CPU consumption, with manageable scheduling policies, and by supporting different kinds of computational resources, such as the Grid, clouds, supercomputers and volunteer computers. Besides job definition, the Production System also includes a flexible web user interface, which provides a user-friendly environment for the main ATLAS workflows, e.g. a simple way of combining different data flows, and real-time monitoring optimised for presenting huge amounts of information. We present an overview of the major components of the ATLAS Production System: job and task definition, the workflow manager, and the web user interface. We describe the important design decisions and the lessons learned from operational experience during the first years of LHC Run 2.

Primary author

Mr Mikhail Borodin (NRNU MEPHI, NRC KI)

Co-authors

Dr Alexei Klimentov (Brookhaven National Lab)
Mr Dmitry Golubkov (Institute for High Energy Physics)
Fernando Barreiro Megino (University of Texas at Arlington)
Dr Kaushik De (University of Texas at Arlington)
Dr Tadashi Maeno (Brookhaven National Laboratory)
Dr Torre Wenaus (Brookhaven National Laboratory)
