GridFTP frontend with redirection for DMLite

2 Jul 2014, 16:30
20m
Conference Hall (LIT JINR)

Conference Hall

LIT JINR

Russia, 141980 Moscow region, Dubna, JINR
sectional reports Section 3 - Technology for storaging, searching and processing of Big Data Technology for storaging, searching and processing of Big Data

Speaker

Mr Andrey Kiryanov (PNPI)

Description

One of the most widely used storage solutions in WLCG is a Disk Pool Manager (DPM) developed and supported by SDC/ID group at CERN. Recently DPM went through a massive overhaul to address scalability and extensibility issues of the old code. New system was called DMLite. Unlike the old DPM that was based on daemons, DMLite is arranged as a library that can be loaded directly by an application. This approach greatly improves performance and transaction rate by avoiding unnecessary inter-process communication via network as well as threading bottlenecks. DMLite has a modular architecture with its core library providing only the very basic functionality. Backends (storage engines) and frontends (data access protocols) are implemented as plug-in modules. Doubtlessly DMLite wouldn't be able to completely replace DPM without GridFTP as it is used for most of the data transfers in WLCG. In DPM GridFTP support was implemented in a Data Storage Interface (DSI) module for Globus’ GridFTP server. In DMLite an effort was made to rewrite a GridFTP module from scratch in order to take advantage of new DMLite features and also implement new functionality. The most important improvement over the old version is a redirection capability. With old GridFTP frontend a client needed to contact SRM on the head node in order to obtain a transfer URL (TURL) before reading or writing a file. With new GridFTP frontend this is no longer necessary: a client may connect directly to the GridFTP server on the head node and perform file I/O using only logical file names (LFNs). Data channel is then automatically redirected to a proper disk node. This renders the most often used part of SRM unnecessary, simplifies file access and improves performance. It also makes DMLite a more appealing choice for non-LHC VOs that were never much interested in SRM. With new GridFTP frontend it's also possible to access data on various DMLite-supported backends like HDFS, S3 and legacy DPM.

Primary author

Presentation materials