Speaker
Ms
Victoriya Osipova
(Tomsk Polytechnic University, Tomsk, Russia)
Description
Traditional relational databases (RDBMS) have been a consistent choice for normalized data structures. They have served well for decades, but the technology is not optimal for data processing and analysis in data-intensive fields such as social networks, the oil and gas industry, and experiments at the Large Hadron Collider. Several challenges have recently been raised concerning the scalability of data-warehouse-like workloads against a transactional schema, in particular for the analysis of archived data and the aggregation of data for summary and accounting purposes. We have evaluated new approaches to handling vast amounts of data. In particular, we have studied a new class of technologies commonly referred to as non-relational (NoSQL) databases. These include schema-less approaches based on key-value stores, such as HBase, Cassandra, and MongoDB. We studied the performance, throughput, and scalability of these technologies for several scientific and industrial use cases. These detailed studies and comparisons make the project applicable to a range of heterogeneous systems. This paper presents the technologies and architectures we have studied, and describes the back-end application that uploads data from an RDBMS to a NoSQL data warehouse, the organization of the NoSQL database, and how it could be used for data analytics in the future.
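The upload step from an RDBMS to a NoSQL store could be sketched roughly as follows. This is a minimal illustration only, not the application described in the paper: the `jobs` and `files` tables, their columns, the `job:<id>` key format, and the function names are all hypothetical, an in-memory SQLite database stands in for the relational source, and a plain Python dictionary stands in for the key-value store.

```python
import sqlite3


def rows_to_documents(conn):
    """Join normalized tables and fold each job's rows into one nested document."""
    cursor = conn.execute(
        "SELECT j.id, j.owner, j.status, f.name, f.size_bytes "
        "FROM jobs AS j LEFT JOIN files AS f ON f.job_id = j.id "
        "ORDER BY j.id, f.name"
    )
    documents = {}
    for job_id, owner, status, file_name, size_bytes in cursor:
        # The row key plays the role of the key in a key-value store
        # (hypothetical "job:<id>" format).
        doc = documents.setdefault(
            f"job:{job_id}", {"owner": owner, "status": status, "files": []}
        )
        # Jobs without files produce NULL file columns from the LEFT JOIN.
        if file_name is not None:
            doc["files"].append({"name": file_name, "size_bytes": size_bytes})
    return documents


def build_example_store():
    """Create a tiny in-memory relational source and export it to a dict 'store'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner TEXT, status TEXT);
        CREATE TABLE files (job_id INTEGER, name TEXT, size_bytes INTEGER);
        INSERT INTO jobs VALUES (1, 'alice', 'finished'), (2, 'bob', 'failed');
        INSERT INTO files VALUES (1, 'a.root', 100), (1, 'b.root', 250);
        """
    )
    try:
        return rows_to_documents(conn)
    finally:
        conn.close()


store = build_example_store()
```

Because each document already carries its related rows, a summary such as the total file size per job can be computed from a single key lookup, without a join at query time. A real deployment would replace the dictionary with writes through the client library of the chosen store.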
Primary author
Ms
Victoriya Osipova
(Tomsk Polytechnic University, Tomsk, Russia)
Co-authors
Mr
Alexey Alekseev
(Tomsk Polytechnic University, Tomsk, Russia)
Mr
Alexeyi Klimentov
(Brookhaven National Laboratory, Upton, USA; National Research Center “Kurchatov Institute”, Moscow, Russia)
Ms
Asel Seidova
(Tomsk Polytechnic University, Tomsk, Russia)
Mr
Maksim Ivanov
(Tomsk Polytechnic University, Tomsk, Russia)
Ms
Nina Grigoreva
(Tomsk Polytechnic University, Tomsk, Russia)
Ms
Olga Gerget
(Tomsk Polytechnic University, Tomsk, Russia)