9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021)

Name: 9th International Conference "Distributed Computing and Grid Technologies in Science and Education" (GRID'2021)
Start: 2017-12-03T12:00:00+03:00
End: 2021-07-09T19:05:00+03:00
5–9 Jul 2021

Europe/Moscow timezone

Support

grid2021@jinr.ru

Transformer-based Model for the Semantic Parsing of Error Messages in Distributed Computing Systems in High Energy Physics

8 Jul 2021, 17:00

15m

Conference Hall or Online - https://jinr.webex.com/jinr/j.php?MTID=m6e39cc13215939bea83661c4ae21c095

Sectional reports: 2. Research infrastructure

Speaker: Dmitry Grin

Large-scale computing centers that support modern scientific experiments store and analyze vast amounts of data. A noticeable share of the computing jobs executed in these complex distributed computing environments end with errors of some kind, and the volume of error log data generated every day complicates manual analysis by human experts. Moreover, traditional methods, such as specifying regular expression patterns to group error messages automatically, become impractical in a heterogeneous computing environment where error messages lack a well-defined structure. The ClusterLogs framework for error message clustering was developed to address this challenge. The framework discovers common patterns in error messages from various sources and groups similar messages together. One of the essential results of this process is a clear, automatically generated description of each resulting cluster, which is then used for analysis.
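The idea of grouping messages by shared patterns and describing each group can be illustrated with a minimal sketch. This is not the ClusterLogs algorithm: it swaps in a simple greedy grouping based on token similarity from the Python standard library, and the function names, the similarity threshold, and the sample messages are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def tokenize(message):
    """Split a message into word-like tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", message)


def similarity(a, b):
    """Token-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster(messages, threshold=0.7):
    """Greedy single-pass grouping (illustrative assumption, not ClusterLogs).

    A message joins the first cluster whose first member is similar enough;
    otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of token lists
    for message in messages:
        tokens = tokenize(message)
        for members in clusters:
            if similarity(members[0], tokens) >= threshold:
                members.append(tokens)
                break
        else:
            clusters.append([tokens])
    return clusters


def common_pattern(members):
    """Describe a cluster: keep tokens shared by all members and replace
    every differing stretch with a single '*' placeholder."""
    pattern = members[0]
    for tokens in members[1:]:
        matcher = SequenceMatcher(None, pattern, tokens, autojunk=False)
        merged = []
        for tag, i1, i2, _, _ in matcher.get_opcodes():
            if tag == "equal":
                merged.extend(pattern[i1:i2])
            elif not merged or merged[-1] != "*":
                merged.append("*")
        pattern = merged
    return " ".join(pattern)


# Hypothetical error messages, invented for illustration only.
messages = [
    "Transfer failed: connection timed out after 30 seconds",
    "Transfer failed: connection timed out after 120 seconds",
    "No space left on device /dev/sda1",
]
groups = cluster(messages)
descriptions = [common_pattern(group) for group in groups]
```

Here the two transfer failures fall into one group whose description keeps the shared wording and masks the varying timeout value, while the disk error forms a group of its own.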
In this research, we propose treating error messages as natural language, which makes it possible to apply transformer-based deep learning models such as BERT to this task. A model that extracts the relevant part of each message was trained and integrated into ClusterLogs so that each cluster is represented by a few actionable items, which makes the clustering results easier to interpret and validate.
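One common way to frame "extracting the relevant part of a message" is token classification, where a model assigns a tag to every token and tagged runs are then merged into spans. Whether the presented model works this way is an assumption; the abstract does not say. The sketch below covers only the post-processing step, with hand-written BIO tags standing in for model output. The tag names (`STATUS`, `CAUSE`) and the sample message are hypothetical.

```python
def extract_spans(tokens, tags):
    """Merge contiguous B-/I- tagged tokens into (label, text) pairs.

    Assumes BIO tagging: 'B-X' opens a span of type X, 'I-X' continues it,
    and 'O' marks a token outside any span.
    """
    spans = []
    current_label, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A new span starts; close the previous one if it is open.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            current_tokens.append(token)
        else:
            # 'O' tag: close any open span.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_tokens:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hypothetical message and tags; a trained model would produce the tags.
tokens = ["Job", "1234", "failed", ":", "no", "space", "left", "on", "device"]
tags = ["O", "O", "B-STATUS", "O",
        "B-CAUSE", "I-CAUSE", "I-CAUSE", "I-CAUSE", "I-CAUSE"]
items = extract_spans(tokens, tags)
```

The resulting pairs are the kind of short, actionable items that could summarize a cluster, in place of the full raw messages.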

Authors: Dmitry Grin, Dr Maria Grigorieva (Moscow State University)

Presentation materials: Презентация GRID.pdf ("Presentation GRID.pdf")
