The "Fawkes" procedure is discussed as a method of protecting facial images posted on social networks against unauthorized use and recognition. As an example, experimental results are given confirming the low rate of face image recognition by a CNN when the "Fawkes" procedure is applied with the parameter mode = "high". Based on a comparative analysis with the original...
The ATLAS EventIndex provides a global event catalogue and event-level metadata for ATLAS analysis groups and users. The LHC Run 3, starting in 2022, will see increased data-taking and simulation production rates, with which the current infrastructure would still cope but may be stretched to its limits by the end of Run 3. This talk describes the implementation of a new core storage service...
There is a huge number of scientific and commercial applications written with sequential execution in mind. Such programs can be run on multiprocessor systems, but without exploiting the advantages of those systems. To execute a program with these capabilities in mind, it is often necessary to rewrite it. However, this is not always the optimal choice. This work considers...
Air pollution has a significant impact on human and environmental health. The aim of the UNECE International Cooperative Program (ICP) Vegetation in the framework of the United Nations Convention on Long-Range Transboundary Air Pollution (CLRTAP) is to identify the main polluted areas of Europe, produce regional maps and further develop the understanding of the long-range transboundary...
Current status of the INP BSU grid site. An overview of INP BSU computational facilities usage and cloud resources integration with JINR cloud is presented.
In this paper, computational studies of the effectiveness of transfer learning methods for recognizing human brain tumors from MRI images are carried out. The deep convolutional networks VGG-16, ResNet-50, Inception_v3, and MobileNet_v2 were used as the base models. Based on them, various strategies for training and fine-tuning models for recognizing brain...
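For illustration, a minimal sketch of such a transfer-learning setup in Keras/TensorFlow; the input size, frozen backbone, and four-class head are assumptions for illustration, not the paper's exact configuration:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    # Pre-trained VGG-16 backbone without its ImageNet classification head.
    base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                       input_shape=(224, 224, 3))
    base.trainable = False  # freeze the backbone for the first training stage

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dense(4, activation="softmax"),  # tumor classes (assumed count)
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])

Fine-tuning would then unfreeze the top backbone layers and retrain with a small learning rate.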
The ATLAS EventIndex is going to be upgraded in advance of LHC Run 3. A framework for testing the performance of both the existing system and the new system has been developed. It generates various queries (event lookup, trigger searches, etc.) on sets of the EventIndex data and measures the response times. Studies of the response time dependence on the amount of requested data, and data...
Recently, deep learning has taken a central position in the automation of our daily life and has delivered considerable improvements compared to traditional machine learning algorithms. Enhancing image quality is a fundamental image processing task. A high-quality image is always expected in several vision tasks, and degradations like noise, blur, and low resolution are required...
Research Cloud Computing Ecosystem in Armenia
Abstract
Given the growing needs for computational resources and data storage within higher-educational institutions, and the considerable investment and financial resources they require, the concept of a "National Research Cloud Platform (NRCP)" is crucial to provide the necessary IT support for educational, research and development activities,...
This paper presents the technology developed by the authors to improve the semantic interoperability of heterogeneous systems exchanging information through an object-oriented bus. We demonstrate a solution that allows semantically mapping the models of interacting information systems to a unified data model (domain ontology) when developing an information exchange package.
Very-high-energy gamma ray photons interact with the atmosphere to give rise to cascades of secondary particles - Extensive Air Showers (EASs), which in turn generate very short flashes of Cherenkov radiation. These flashes are detected on the ground with Imaging Air Cherenkov Telescopes (IACTs). In the TAIGA project, in addition to images directly detected and recorded by the experimental...
Since 2019, the COMPASS experiment has been running on the Frontera high-performance computer. This is a large machine (number 5 in the ranking of the most powerful supercomputers in 2019), and the details, problems, and approaches to organizing data processing on this machine are presented in this report.
The project of the Super Charm-Tau (SCT) factory --- a high-luminosity
electron-positron collider for studying charmed hadrons and the tau lepton
--- is proposed by Budker INP. The project implies a single collision point
equipped with a universal particle detector. The Aurora software
framework has been developed for the SCT detector. It is based on
software tools trusted and widely used in high energy...
The UMA software stack developed by the CERN-IT Monit group provides the main repository of monitoring dashboards. The adaptation of this stack to the ATLAS experiment began in 2018 to replace the old monitoring system. Since then, many improvements and fixes have been implemented in UMA. One of the most considerable enhancements was the migration of the storage for aggregated data from...
ParaSCIP is one of the few open-source solvers implementing a parallel version of the Branch-and-Bound (BNB) algorithm for discrete and global optimization problems adapted for computing systems with distributed memory, e.g. clusters and supercomputers. As is known from publications, runs using up to 80,000 CPU cores have been successful in solving problems from the MIPLIB test libraries....
This work describes the design of a digital model of an HPC system for processing data from the Super Charm-Tau factory electron-positron collider of the "megascience" class. This model is developed using the AGNES multiagent modeling platform. The model includes intelligent agents that mimic the behavior of the main subsystems of the supercomputer, such as a task scheduler, computing...
Machine learning methods including convolutional neural networks
(CNNs) have been successfully applied to the analysis of extensive air
shower images from imaging atmospheric Cherenkov telescopes (IACTs).
In the case of the TAIGA experiment, we previously demonstrated that
both the quality of selection of gamma ray events and the accuracy of
estimates of the gamma ray energy by CNNs are good...
We give an overview of the CMS experiment activities to apply Machine Learning (ML) techniques to Data Quality Monitoring (DQM).
In the talk, special attention will be paid to ML for the Muon System and muon physics object DQM. ML applications for data certification (anomaly detection) and release validation will be discussed.
The report presents the results of the work of Russian institutes on processing ALICE experiment data during the last 3 years of operation of the Large Hadron Collider (LHC), including the end of LHC RUN2 and the 1st year of the COVID-19 pandemic. The main problems and tasks facing both ALICE Grid Computing and its Russian segment before LHC RUN3, including the problems of...
At the moment, there are many different platforms for web-application development: Django, ASP.NET Core, Express, Angular, etc. Usually, these platforms assume a division of labour in which a relatively large group of developers works on a project, each engaged in their own part (design, layout, front-end, back-end).
In real life, however, usually only 1-2 people (full-stack developers)...
The development of artificial intelligence and big data technologies has been a stimulus for developing new tools for organizing and automating workflows. The Jupyter project is one of the main workflow automation projects in the field of artificial intelligence. The key paradigms of the project are the client-server model and a graphical interactive...
DIRAC Interware is a development framework and a set of ready-to-use components that allow building distributed
computing systems of any complexity. Services based on the DIRAC Interware are used by several large scientific
collaborations such as LHCb, CTA and others. Multi-community DIRAC services are also provided by a number of
grid infrastructure projects, for example EGI, GridPP...
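For illustration, a hedged sketch of job submission through the DIRAC client API (a configured DIRAC client and a valid proxy are assumed; the executable and job name are illustrative):

    from DIRAC.Core.Base.Script import Script
    Script.parseCommandLine()  # initialize the DIRAC client environment

    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    job = Job()
    job.setName("demo-job")
    job.setExecutable("/bin/echo", arguments="hello from DIRAC")
    result = Dirac().submitJob(job)
    print(result)  # S_OK/S_ERROR structure containing the job ID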
The development of the digital economy implies storing the history of a large number of transactions of every citizen involved in business processes based on digital technologies, starting from receiving public and social services in electronic form and ending with consumption of electronic goods and services produced by e-business and e-commerce.
If we look carefully at the data structure...
The Jiangmen Underground Neutrino Observatory (JUNO) experiment is mainly designed to determine the neutrino mass hierarchy and precisely measure oscillation parameters by detecting reactor anti-neutrinos. The total event rate from the DAQ is about 1 kHz and the estimated volume of raw data is about 2 PB/year, but the event rate of reactor anti-neutrinos is only about 60 per day. So one of the challenges...
A quantum self-organization algorithm model of wise knowledge base design for intelligent fuzzy controllers with a required robustness level is considered. The background of the model is a new model of quantum inference based on a quantum genetic algorithm. The quantum genetic algorithm is applied online to search for quantum correlation types between unknown solutions in the quantum superposition of imperfect...
Technological advances in the field of virtual reality and personal computation in general have brought us to the era of web-based virtual reality, where virtual environments can be accessed directly from web browsers without the need to install any additional software. Such online virtual environments seem to be a promising tool for scientific data visualization. When accessed through...
The Multi-Purpose Detector collaboration began using distributed computing for centralized Monte-Carlo generation in mid-2019. DIRAC Interware is used as a platform for the integration of heterogeneous distributed computing resources. Since then, workflows for job submission, data transfer and storage have been designed, tested and successfully applied. Moreover, we observe the growth of...
The paper presents and discusses a new computer toolkit for assessing the seaworthiness of a ship in stormy sailing conditions, intended for testing new design solutions for promising ocean-going ships and ships of unlimited ocean navigation, as well as organizing full-function simulators. The presented toolkit can be used by captains to select effective and safe sailing modes, as well as to...
Introduction
In world practice, the number of articles published in leading scientific journals is an indicator of the results of the scientific activities of researchers, research organizations and higher educational institutions. International publication activity reflects the level of development of national science against the background of other countries, especially in the field of basic...
Quantum Machine Learning is one of the most promising applications on near-term quantum devices, which possess the potential to solve problems faster than traditional computers. Classical Machine Learning is taking up a significant role in particle physics to speed up detector simulations. Generative Adversarial Networks (GANs) have been shown to achieve a similar level of accuracy...
The modern Big Data ecosystem provides tools to build a flexible platform for processing data streams and batch datasets. Both the functioning of modern giant particle physics experiments and the services necessary for the work of many individual physics researchers generate and transfer large quantities of semi-structured data. Thus, it is promising to apply cutting-edge...
Introduction
Technological progress places increased demands on the strength properties of structural elements of buildings and structures, machines and mechanisms, and on reducing their material consumption. This leads to the need for effective use of existing methods of solid mechanics, the creation of new ones, and the training of new highly qualified specialists.
One of...
Studies of the geometrical aspects of quantum information are becoming very topical owing to practical purposes.
Driven by demands from quantum technology, the formulation of quantum estimation theory has moved to the frontier of modern research. In particular, the interrelations between phase-space quasidistributions and the classical Fisher metric are of current...
The paper proposes a method for detecting fertile soils based on the processing of satellite images. As a result of its application, a map of the location of fertile and infertile soils for a given region of the earth's surface is formed and the corresponding areas are calculated. Currently, data from most satellites are in the public domain and, as a rule, are multispectral images of the...
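One common way to implement such a map, sketched here under the assumption that a vegetation index is the discriminating feature: compute NDVI from the red and near-infrared bands and threshold it (the 0.4 threshold and the per-pixel area are illustrative, not the paper's values):

    import numpy as np

    def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
        # Normalized Difference Vegetation Index per pixel.
        return (nir - red) / (nir + red + 1e-9)

    def fertile_mask(red, nir, threshold=0.4):
        return ndvi(red, nir) > threshold

    def area_km2(mask, pixel_area_m2=100.0):
        # Area estimate: masked pixel count times per-pixel ground area.
        return mask.sum() * pixel_area_m2 / 1e6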
The Large Hadron Collider experiments at CERN produce a large amount of data which are analyzed by the High Energy Physics (HEP) community in hundreds of institutes around the world.
Both efficient transport and distribution of data across HEP centres are, therefore, crucial.
The HEP community has thus established high-performance interconnects for data transport---notably the Large Hadron...
We present an analysis of the performance of the MPD data analysis/simulation software MPDroot, obtained with profilers and benchmarks.
Based on this, we draw preliminary conclusions and set out perspectives for future optimization improvements.
Accurate simulations of elementary particles in High Energy Physics (HEP) detectors are fundamental to accurately reproduce and interpret the experimental results and to correctly reconstruct particle flows. Today, detector simulations typically rely on Monte Carlo-based methods which are extremely demanding in terms of computing resources. The need for simulated data at future experiments -...
Air quality sensors represent an emerging technology for air quality monitoring. Their main advantage is that they are significantly cheaper than standard monitoring equipment. Low-cost, mass-produced sensors have the potential to form much denser monitoring networks and provide more detailed information on air pollution distribution. The drawback of sensor air...
Research at the NICA accelerator complex (JINR) requires efficient and fast software implementations of event simulation and reconstruction algorithms. The BmnRoot package, created for the BM@N experiment, is based on the ROOT environment, GEANT4 and the object-oriented FairRoot framework, and includes tools for studying the characteristics of the BM@N detector as well as for reconstructing and analyzing...
In a huge number of applications, air pollution dispersion modelling using standard Gaussian methodologies is an excessively data-intensive process that requires considerable computing power. Land Use Regression (LUR) represents an alternative modelling methodology. LUR presumes that pollution concentration is determined by factors obtained via spatial analysis. These factors are chosen on the...
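A minimal sketch of the LUR idea under stated assumptions: pollutant concentration is regressed on spatial predictors; the synthetic predictors here (traffic density, industrial land share, distance to a major road) stand in for factors derived by GIS analysis:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    # Columns: traffic density, industrial land share, distance to major road.
    X = rng.random((200, 3))
    y = 30*X[:, 0] + 15*X[:, 1] - 10*X[:, 2] + rng.normal(0, 2, 200)

    lur = LinearRegression().fit(X, y)
    print(lur.coef_, lur.intercept_)  # fitted weights of the spatial factors
    # The fitted model then predicts concentration wherever the predictors
    # can be computed, yielding a dense pollution map.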
Identifying news that affects financial markets is an important task on the way to predicting financial markets. A large number of articles are devoted to this topic. But the main problem in analyzing news lies in the neural networks that are used. These neural networks were created to analyze user reviews of a particular object, be it a restaurant, a movie or a purchased item. In such reviews, the...
Modern applications of quantum mechanics have renewed interest in the properties of the set of density matrices of finite size. The issue of establishing Riemannian structures on the quantum counterparts of the space of probability measures has become a subject of recent investigations.
We study quantum analogues of a well-known, natural Riemannian metric, the so-called Fisher metric. Explicit...
Pneumonia is a lung disease caused by either a bacterial or viral infection. It can be life-threatening if not acted on at the right time, so early diagnosis of pneumonia is vital. The aim of this work is the automatic detection of bacterial and viral pneumonia on the basis of X-ray images. Four different pre-trained deep convolutional neural networks (CNNs): VGG16,...
Epidemic algorithms are widely explored in the case of distributed systems based on trustful environments. However, the assumption of arbitrary peer behaviour in the Byzantine fault tolerance problem calls into question the appropriateness of well-studied gossip algorithms, since some of them are based on aggregated network information, i.e. the number of nodes in the network, etc. Given this...
The search for short-lived particles is an important part of the physics research in experiments with relativistic heavy ions.
Such investigations mainly study decays of neutral particles into charged daughter particles, which can be registered in the detector system. In order to find, select and study the properties of such short-lived particles in real time in the CBM experiment...
The report overviews the current state and key directions of the advanced development of the National Research and Education Network (NREN) of Russia for the period of 2021-2024. Unified NREN called National Research Computer Network (NIKS) was created in 2019 according to the results of integration of federal-level telecommunication networks in the fields of higher education (RUNNet) and...
Potential benefits of implementation of distributed ledger technology are widely discussed among different business actors and governmental structures. Within the last decade, with growing popularity of blockchain-based payment systems and cryptocurrencies, these discussions considerably sharpened. Therefore, an extensive body of research has emerged on this soil. The goal of this study is to...
This paper proposes a method for predicting and assessing land conditions based on satellite image processing using neural networks. In some regions, whose economies are mainly based on agriculture and cattle breeding, the threat of irreversible soil changes has appeared, in particular desertification, which can lead to serious environmental and economic problems. Therefore, it is necessary to identify both the...
The paper describes a method for modeling beam dynamics based on solving ordinary differential equations with Taylor mapping. This method allows one to obtain solutions of the system in both symbolic and numerical form. Using numerical simulation methods, one can obtain partial solutions of the beam dynamics process. The paper considers the possibility of solving the inverse problem -...
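A minimal sketch of the Taylor-mapping idea for a one-dimensional ODE dx/dt = f(x), using sympy (an assumption; the paper's implementation may differ): the flow x(t+h) is expanded as a truncated Taylor series in h, with time derivatives generated by repeatedly applying d/dt = f(x) d/dx:

    import sympy as sp

    x, h = sp.symbols("x h")
    f = -x + x**3 / 6  # hypothetical right-hand side

    # d^n x/dt^n obtained recursively: d/dt g(x) = g'(x) * f(x)
    derivs = [x]
    for _ in range(4):
        derivs.append(sp.expand(sp.diff(derivs[-1], x) * f))

    taylor_map = sum(d * h**n / sp.factorial(n) for n, d in enumerate(derivs))
    print(sp.expand(taylor_map))  # symbolic polynomial map x -> x(t + h)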
The formation of a new generation of digital technologies, which were called «cross-cutting» due to the scale and depth of their influence, determined a large-scale transformation of business and social models. These changes have a strong impact on the content of professional activity: new skills are required from employees, and therefore new competencies. The rapid digitalization of the...
Classical methods use statistical moments to determine the type of
modulation in question. This essentially correct approach for discerning
amplitude modulation (AM) from frequency modulation (FM) fails for more
demanding cases such as AM vs. AM-LSB (lower side-band rejection), radio
signals being richer in information than statistical moments. Parameters
with good...
The data flow paradigm has established itself as a powerful approach to machine learning. Indeed, it is also very powerful for computational physics, although it is not used as much in that field. One of the complications is that physical models are much less homogeneous than ML models, which makes their description quite a complicated task.
In this talk we present a syntax analyzer...
Information published by users in the public domain can serve as a good resource for data collection when building datasets for training neural networks. One of the largest existing platforms for sharing photos and videos is Instagram. The main way users interact with each other on this platform is by publishing images. At the same time...
High-resolution images processing for land-surface monitoring is fundamental to analyse the impact of different geomorphological processes on earth surface for different climate change scenarios. In this context, photogrammetry is one of the most reliable techniques to generate high-resolution topographic data, being key to territorial mapping and change detection analysis of landforms in...
The recognition of particle trajectories (tracks) from experimental measurements plays a key role in the reconstruction of events in experimental high-energy physics. Knowledge about the primary vertex of an event can significantly improve the quality of track reconstruction. To solve the problem of primary vertex finding in the BESIII inner tracking detector we applied the LOOT program which...
We present an approach to predicting ECG waves with non-linear
autoregressive exogenous (NARX) neuromorphic software. These predictions
are important for comparing the underlying QRS complex of the ECG wave
with the slowly deteriorating waves (or arrhythmia) in cardiac patients.
A deep Q-wave, for instance (such as 1/4 of the R-wave), is a typical
sign of (inferior wall) myocardial...
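A hedged sketch of the NARX idea: the next ECG sample is regressed on a window of past samples (the autoregressive part) plus a window of an exogenous input. The MLP and the toy waveform stand in for the neuromorphic model and real ECG data; the lag length is an assumption:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def make_lagged(y, u, lag=8):
        # Feature = [y(t-lag..t-1), u(t-lag..t-1)], target = y(t).
        X, t = [], []
        for i in range(lag, len(y)):
            X.append(np.concatenate([y[i-lag:i], u[i-lag:i]]))
            t.append(y[i])
        return np.array(X), np.array(t)

    t_axis = np.linspace(0, 20, 2000)
    ecg_like = np.sin(t_axis) + 0.3*np.sin(7*t_axis)  # toy waveform
    exo = np.cos(t_axis)                              # toy exogenous signal

    X, y = make_lagged(ecg_like, exo)
    model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)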
Education systems provide specialists of different levels and specialization for the labor market. However, in the modern dynamic world of artificial intelligence, pandemic, and remote work, the labor market evolves dramatically from year to year. Universities and colleges must keep track of these changes to adapt educational programs and manage the number of student slots offered for...
We applied distributed computing to study new peptide dendrimers with Lys-2Lys and Lys-2Arg repeating units in water. These molecules are promising nanocontainers for drug and gene delivery. The dendrimers have recently been synthesized and studied by NMR (Sci. Reports, 2018, 8, 8916; RSC Advances, 2019, 9, 18018) and successfully tested as carriers for gene delivery (Bioorg. Chem., 2020,...
The Spark/Hadoop ecosystem includes a wide variety of components and can be integrated with any tool required for Big Data nowadays. From release to release, the developers of these frameworks optimize the inner workings of the components and make their usage more flexible and elaborate.
However, ever since the invention of the MapReduce programming model and the first Hadoop releases, data skew has been and...
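One standard skew mitigation, sketched here in PySpark with hypothetical column names: a "salting" column spreads a hot key over several partitions, and the aggregation is done in two stages:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("salting-demo").getOrCreate()
    df = spark.createDataFrame([("hot", 1)] * 1000 + [("cold", 1)] * 10,
                               ["key", "value"])

    n_buckets = 8  # salt buckets per key (tuning assumption)
    salted = df.withColumn("salt", (F.rand() * n_buckets).cast("int"))
    partial = salted.groupBy("key", "salt").agg(F.sum("value").alias("s"))
    result = partial.groupBy("key").agg(F.sum("s").alias("total"))
    result.show()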
In modern eLearning systems, educational measurements are used both to evaluate students' achievements and to control the learning process. However, eLearning systems usually have comparatively trivial embedded features for analyzing measurement results, which are insufficient for thorough statistical research of the quality of assessment tools. To identify the...
The need for big data analysis arises in many areas of science and technology: economics, medicine, geophysics, astronomy, particle physics and many others.
This task is greatly simplified if the big data has structural patterns. In this talk, we consider the case when big data are, to a high degree of accuracy, fractals.
We propose to analyze the fractal structure of big data based on...
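For illustration, a minimal box-counting sketch, one standard way to estimate the fractal dimension of a point set (the grid scales and test data are illustrative):

    import numpy as np

    def box_counting_dimension(points, scales=(2, 4, 8, 16, 32, 64)):
        # Normalize to the unit cube, then count occupied cells per scale.
        points = (points - points.min(0)) / (np.ptp(points, 0) + 1e-12)
        counts = [len(np.unique(np.floor(points * s), axis=0)) for s in scales]
        # The slope of log N(s) versus log s estimates the dimension.
        slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
        return slope

    pts = np.random.rand(10000, 2)  # uniform points in the plane -> ~2.0
    print(box_counting_dimension(pts))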
Machine learning methods can be used to solve the problems of detecting and countering attacks on software-defined networks. For such methods, it is necessary to prepare a large amount of initial data for training. Mininet is used as a modeling environment for SDN. The main tasks of modeling a software-defined network are studying traffic within the network, as well as practicing various...
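For illustration, a small sketch of such an emulation setup using Mininet's Python API (the topology and traffic are illustrative, not the paper's setup; Mininet requires a Linux host with root privileges):

    from mininet.net import Mininet
    from mininet.topo import SingleSwitchTopo

    net = Mininet(topo=SingleSwitchTopo(k=4))  # 4 hosts behind one switch
    net.start()
    net.pingAll()                              # baseline reachability traffic
    h1, h2 = net.get("h1", "h2")
    print(h1.cmd("ping -c 3", h2.IP()))        # per-host traffic to record
    net.stop()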
This work is devoted to the development and study of methods for controlling collective behavior in swarm robotic systems, using as an example the model problem of a robot swarm cleaning a given bounded territory. The work considers several distributed algorithms for solving this problem, based on various classical methods and models of swarm intelligence....
This work addresses the application of optical character recognition technologies and machine learning methods to the recognition of 19th-century printed Russian-language texts. The specific features of this task are analyzed in comparison with the general problem of optical character recognition. Existing methods and programs for solving the problem under consideration are reviewed. We propose our own...
To calculate the lifetime of mesons in hot and dense nuclear matter, it
is necessary to compute 5-dimensional integrals with a complicated
integrand function. This work presents algorithms and methods for
calculating complicated integrals based on the Monte Carlo method. To
optimize the computation, a parallel algorithm was implemented in the
C++ programming language...
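A minimal Python sketch of the plain Monte Carlo estimator for a 5-dimensional integral over the unit hypercube (the paper's implementation is parallel C++; the integrand here is a toy stand-in for the complicated physical one):

    import numpy as np

    def integrand(x):  # x has shape (n_samples, 5)
        return np.exp(-np.sum(x**2, axis=1))

    rng = np.random.default_rng(42)
    n = 10**6
    samples = rng.random((n, 5))   # uniform points in [0, 1]^5
    values = integrand(samples)
    estimate = values.mean()       # cube volume is 1
    error = values.std(ddof=1) / np.sqrt(n)
    print(f"I ~ {estimate:.6f} +- {error:.6f}")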
Quantum computing began in the early 1980s, when physicist Paul Benioff
constructed a quantum mechanical model of the Turing machine, and physicist
Richard Feynman and mathematician Yuri Manin discussed the potential
of quantum computers to simulate phenomena a classical computer could
not feasibly simulate.
In 1994 Peter Shor developed a polynomial quantum algorithm for factoring
integers with the...
An ontology-based approach to exploratory analysis of textual data can significantly improve the quality of the obtained results. On the other hand, the use of domain knowledge defined in the form of ontologies increases the time needed to prepare a model and makes the required calculations more complex. The presentation will discuss selected aspects of cluster analysis performed on documents...
ITCost and Dell, Monday. Sponsors' talk.
The “Govorun” supercomputer is a heterogeneous computing system that contains computing architectures of different types, including graphics accelerators. The given architecture of the supercomputer allows users to choose optimal computing facilities for solving their tasks.
To enhance the efficiency of solving user tasks, as well as to improve the utilization of both the computing...
A population annealing method is a promising approach for large-scale simulations because it is potentially scalable on any parallel architecture. We report an implementation of the algorithm on a hybrid program architecture combining CUDA and MPI [1]. The problem is to keep all general-purpose graphics processing unit devices as busy as possible by efficiently redistributing replicas. We...
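For illustration, a sketch of the resampling step behind replica redistribution in population annealing: when the inverse temperature rises from beta1 to beta2, each replica is duplicated in proportion to its reweighting factor, and the resampled population can then be split evenly across devices (all numbers are illustrative):

    import numpy as np

    def resample(energies, beta1, beta2, rng):
        # Reweighting factors exp(-(beta2-beta1)*E), normalized.
        w = np.exp(-(beta2 - beta1) * (energies - energies.min()))
        w /= w.sum()
        counts = rng.multinomial(len(energies), w)  # copies per replica
        return np.repeat(np.arange(len(energies)), counts)

    rng = np.random.default_rng(1)
    idx = resample(rng.normal(size=1000), 0.5, 0.6, rng)
    chunks = np.array_split(idx, 4)  # e.g. redistribute over 4 GPUs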
A substantial growth in data volume will come with the start of the HL-LHC era. It is not well covered by the current LHC computing model, even taking into account hardware evolution. The WLCG DOMA project was established to carry out research on data management and storage. National data lake R&D efforts, as part of the DOMA project, should address the study of possible technology solutions for...
Contemporary computing systems are commonly characterized in terms of data-intensive workflows managed by utilizing a large number of heterogeneous computing and storage elements interconnected through complex communication topologies. As the scale of the system grows and workloads become more heterogeneous in both inner structure and arrival patterns, the scheduling problem becomes...
The development of experiments in various fields leads to growth in the volume of stored data and in the intensity of data processing. This raises the quantitative and qualitative requirements for data storage systems. The current status, the results of the recently performed modernization and the prospects for further development of the robotic data storage of the Multifunctional...
Research on load balancing strategies in Grid systems is carried out. The main classes of load distribution strategies are identified with the aim of increasing the efficiency of distributed systems. A model based on the fractal method for describing the load dynamics is considered.
Modern machine learning (ML) tasks and neural network (NN) architectures require huge amounts of GPU computational facilities and demand high CPU parallelization for data preprocessing. At the same time, the Ariadne library, which aims to solve complex high-energy physics tracking tasks with the help of deep neural networks, lacks multi-GPU training and efficient parallel data preprocessing on...
The model of Russian Remote Participation Center (RPC) was created under the contract between Russian Federation Domestic Agency (RF DA) and ROSATOM as the prototype of full-scale Remote Participation Center for ITER experiments and for coordination activities in the field of Russian thermonuclear research. This prototype was used for investigation of the following technical and scientific...
Modeling the spread of viruses is an urgent task in modern conditions. In the created model, contacts between people are represented in the form of a Watts-Strogatz graph. We studied graphs with tens of thousands of vertices with a simulation period of six months. The paper proposes methods for accelerating computations on graph models using graphics processors. In the considered problem,...
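A small CPU-side sketch of the contact model described above: an SIR-type epidemic on a Watts-Strogatz graph over a six-month horizon (all rates and sizes are illustrative, not the paper's parameters):

    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(0)
    g = nx.watts_strogatz_graph(n=10000, k=10, p=0.1)

    state = np.zeros(g.number_of_nodes(), dtype=int)  # 0=S, 1=I, 2=R
    state[rng.choice(len(state), 10, replace=False)] = 1

    for day in range(180):  # six-month simulation period
        infected = np.flatnonzero(state == 1)
        for i in infected:
            for j in g.neighbors(i):
                if state[j] == 0 and rng.random() < 0.05:
                    state[j] = 1  # transmission along a contact edge
        state[infected[rng.random(len(infected)) < 0.1]] = 2  # recovery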
The Russian Scientific Data Lake is a part of Data Lake R&D conducted by the DOMA project. It aims to mitigate the present LHC computing model drawbacks to cope with an unprecedented scientific data volume at the multi-exabyte scale that will be delivered by experiments in the High Luminosity phase of the LHC. The prototype of the Russian Scientific Data Lake is being implemented and it tests...
Empirical studies have repeatedly shown that in High-Performance Computing (HPC) systems, users' resource estimates lack accuracy [1]. With resource underestimation, the job may be killed at any step of the computation, and the resources already allocated will be wasted; resource overestimation wastes resources as well. SLURM, a well-known job scheduler, has a mechanism to...
The topic of the presented report is various approaches to modeling the process of solving optimization problems using a desktop grid [1]. The report summarizes practical experience of performing computations on local infrastructures and in volunteer computing projects. The creation of preliminary models of the computational process will make it possible to avoid many systemic complexities in the...
Abstract. The questions of constructing an optimal logical structure of a distributed database (DDB) are considered. Solving these issues will make it possible to increase the speed of processing requests in a DDB in comparison with a traditional database. In particular, such tasks arise in organizing systems for processing huge amounts of information from the Large Hadron Collider, the...
The article describes approaches to the modernization of a distributed electronic infrastructure that combines various types of resources aimed at supporting the research and educational activities in Moldova. The development trends of computer infrastructures and technologies aimed at creating conditions for solving complex scientific problems with high requirements for computing...
AQMS is here a broad term that covers, in particular, measurement and mathematical modeling of air pollution, geoinformation technologies for analyzing their results, and the preparation and execution of modeling on parallel supercomputer clusters. Over the past few years, my team and I have been researching and refining mathematical models, expanding the amount of processed input...
The CREST project for a new conditions database prototype for Run 3 (intended to be used for production in Run 4) is focused on improving Athena-based access, metadata management and, in particular, global tag management. The project addresses the evolution of the data storage design and the optimization of conditions data access, enhancing the caching capabilities of the system in the context of...
One of the uppermost tasks in creating the computing system of the NICA complex is to model the centers for storing and processing data that come from the experimental setups of the complex, in particular the BM@N detector, or are generated using special software for checking the developed data processing algorithms and for comparison with the expected physical result.
After reviewing the existing...
Processing and analyzing of experimental and simulated data are an integral part of all modern high-energy physics experiments. These tasks are of particular importance in the experiments of the NICA project at the Joint Institute for Nuclear Research (JINR) due to the high interaction rate and particle multiplicity of ion collision events, therefore the task of automating the considered...
In this work we consider a distributed computing system in which the control functions are dispersed over several dominant nodes that are directly connected to all the others. This configuration reduces the vulnerability of the entire network compared to a single control element, whose failure would immediately disrupt its operation. On the other hand, the large length of the maximum shortest chain...
The concept of a "virtual testbed", namely the creation of a problem-oriented environment for full-featured modeling of the phenomenon under investigation or the behavior of a complex technical object,
has now acquired a finished look. The design of this concept contributed to the development of complex mathematical models suitable for full-fledged computational experiments and the improvement of computer...
Particle collision experiments are known to generate substantial amounts of data that must be stored and later analyzed. Typically, only a small subset of all the collected events is relevant when performing a particular physics analysis task. Although it is possible to obtain the required subset of records directly, by iterating through the whole volume of the collected data, the process is...
We are developing a tool for calling functions between different environments, where by "different environments" we mean both different programming languages and different machines. This tool is a remote procedure call protocol (and its implementation) that is optimized for simplicity and can support higher-order functions. In our implementation, functions are never serialized and are...
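A toy sketch of the central idea under stated assumptions: functions cross the wire only as opaque handles, so they are never serialized, and calling a handle dispatches back to the side that owns it (all names and the protocol shape are hypothetical):

    import uuid

    registry = {}  # handle -> local callable

    def export(fn):
        handle = str(uuid.uuid4())
        registry[handle] = fn
        return handle  # only this id ever travels over the network

    def call(handle, *args):
        # A real system would dispatch to whichever peer owns the handle.
        return registry[handle](*args)

    twice = export(lambda f, x: call(f, call(f, x)))  # higher-order function
    inc = export(lambda x: x + 1)
    print(call(twice, inc, 40))  # -> 42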
The Multifunctional Information and Computing Complex of the Laboratory of Information Technologies of the Joint Institute for Nuclear Research is a multicomponent hardware and software complex that supports a wide range of tasks related to the processing, analysis and storage of data in world-class research conducted at JINR and in the world centers collaborating...
The report presents a solution for completely decentralized data management systems in geographically distributed environments with administratively unrelated or loosely related user groups and in conditions of partial or complete lack of trust between them. The solution is based on the integration of blockchain technology, smart contracts and provenance metadata driven data management....
Obtaining a long-term reference trajectory on the chaotic attractor of the coupled Lorenz system is a difficult task due to the sensitive dependence on the initial conditions. Using standard double-precision floating-point arithmetic, we cannot obtain a reference solution longer than 2.5 time units. Combining OpenMP parallel technology with the GMP library (GNU multiple precision...
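For comparison, a minimal multiple-precision sketch in Python with mpmath (the paper combines OpenMP with GMP in compiled code; the precision, horizon, and the single uncoupled Lorenz system here are illustrative):

    from mpmath import mp, odefun

    mp.dps = 60  # about 60 significant digits

    sigma, r, b = 10, 28, mp.mpf(8) / 3
    def lorenz(t, y):
        x, v, z = y
        return [sigma*(v - x), x*(r - z) - v, x*v - b*z]

    sol = odefun(lorenz, 0, [1, 1, 1])  # Taylor-series integrator
    print(sol(10))  # high-precision trajectory point at t = 10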
In recent years, intelligent video surveillance systems have become increasingly widespread: security systems, traffic analysis systems, systems for detecting deviant behavior; the number of video cameras is growing continuously, the resolution of the captured images is increasing, and the processing algorithms are becoming more complex. All this leads to a continuous growth of the generated information, and the corresponding...
The Data Knowledge Base (DKB) project is aimed at knowledge acquisition and metadata integration, providing fast response for a variety of complicated queries, such as summary reports and monitoring tasks (aggregation queries) and multi-system join queries, which are not easy to implement in a timely manner and, obviously, are less efficient than a query to a single system with integrated...
This report will present services for performing computational neurobiology tasks, working with experimental data from nuclear magnetic resonance imaging of the human brain. These services are created as separate modules based on the "Digital Laboratory platform" of NRC "Kurchatov Institute".
On the basis of the distributed modular platform «Digital Laboratory» at NRC "Kurchatov Institute"...
Fault tolerance of parallel and distributed applications is one of the concerns that become topical for large computer clusters and large distributed systems. For a long time, the common solution to this problem was checkpoint-and-restart mechanisms implemented at the operating system level; however, they are inefficient for large systems, and now application-level checkpoint and restart is...
Rational tools are proposed for the innovative development and improvement of methods for assessing the quality of technical systems at the operation stage. The innovation is that a set of interrelated models, techniques and computer programs implementing them is proposed, which will reduce the time and resources spent on assessing the quality of systems and (or) elements...
In this work, we present a formal mathematical model and a software library for modelling hardware components and software systems based on the SOPN Petri net and the C# programming language.
A discrete stochastic model denoted as SOPN is presented, which combines the qualities of coloured, hierarchical and generalized Petri nets. The model is a series of extensions over the necessary...
The HybriLIT heterogeneous computing platform is part of the
Multifunctional Information and Computing Complex of the Meshcheryakov
Laboratory of Information Technologies of the Joint Institute for Nuclear
Research. An analysis of data on the usage of the HybriLIT platform was
carried out: particular attention was paid to studying information about
the resources used when launching tasks...
Due to the rapid development of high-performance computing systems, more and more complex and time-consuming computer simulations can be carried out. This opens new opportunities for scientists and engineers. A standard situation for scientific groups now is to have their own in-house research software, significantly optimized and adapted for a very narrow scientific problem. The main disadvantage...
The ATLAS experiment uses various tools to monitor and analyze the metadata of the main distributed computing applications. One of the tools is fully based on the unified monitoring infrastructure (UMA) provided by the CERN-IT Monit group. The UMA infrastructure uses modern and efficient open-source solutions such as Kafka, InfluxDB, ElasticSearch, Kibana and Grafana to collect, store and...
The DNA molecule is a clear example of data storage and biocomputing. By performing millions of operations simultaneously, a DNA biocomputer allows the performance rate to increase exponentially. The limiting problem is that each stage of parallel operations requires time measured in hours or days. Nanobioelectronics can overcome this problem.
The central problem of nanobioelectronics is...
The medical field, and especially diagnosis, is still an extremely poorly formalized field. This is especially true in the study of diseases associated with changes and disorders in the activity of the brain. In order to improve the results of medical research in this area, various methods of analyzing the condition of patients are used. These include both instrumental methods (MRI, EEG) and...
The report will present the results of the development of the algorithmic block of the Information System (IS) for radiobiological studies, created within a joint project of MLIT and LRB JINR, in terms of solving the segmentation problem for morphological research to study the effect of ionizing radiation on biological objects. The problem of automating the morphological analysis of...
High-performance supercomputers have become some of the biggest power-consuming machines; the top supercomputer's power draw is about 30 MW. Recent legislative trends in the carbon footprint area are affecting high-performance computing. In our work, we collect energy measurements from different kinds of Intel server CPUs. We present a comparison of the energy efficiency of our new Poisson solver, which...
The problems of silent data corruption detection in data storage systems (Reed-Solomon codes) and faulty share detection in distributed voting protocols (Shamir scheme) are treated from a uniform point of view. Namely, both can be interpreted as the problem of systematic error detection in the data set {(x_1, y_1), ..., (x_N, y_N)} generated by a polynomial function y = f(x) in some...
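A uniform toy sketch of the detection idea: if all N pairs are supposed to lie on one polynomial of degree < k, the redundant points act as checks. Here we interpolate through the first k points with exact rational arithmetic and flag mismatches (a simplification: real schemes such as Berlekamp-Welch also handle errors inside the interpolation set):

    from fractions import Fraction

    def lagrange_eval(pts, x):
        # Exact Lagrange interpolation through pts, evaluated at x.
        total = Fraction(0)
        for i, (xi, yi) in enumerate(pts):
            term = Fraction(yi)
            for j, (xj, _) in enumerate(pts):
                if i != j:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total

    def corrupted_indices(points, k):
        base = points[:k]  # assumes the first k points are intact
        return [i for i, (x, y) in enumerate(points)
                if lagrange_eval(base, x) != y]

    pts = [(x, 3*x**2 + 2*x + 1) for x in range(1, 7)]
    pts[4] = (5, 999)  # inject a silent corruption
    print(corrupted_indices(pts, 3))  # -> [4]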
Nowadays, cloud resources are the most flexible tool for providing access to infrastructures for establishing services and applications. But they are also a valuable resource for scientific computing. At the Joint Institute for Nuclear Research, the computing cloud was integrated with the DIRAC system, which allowed the submission of scientific computing tasks directly to the cloud. With that experience, the...
The existence of an exact closed-form formula for the price of a derivative is a rather rare event in derivative pricing; therefore, to determine the price of a derivative, one has to apply various numerical methods, including finite difference methods, binomial trees and Monte Carlo simulations. Alternatively, derivative prices can be approximated with deep neural networks.
We study the pricing of...
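For context, a compact sketch of the Monte Carlo baseline such neural-network approximations are typically compared against: a European call under Black-Scholes dynamics (all parameters are illustrative):

    import numpy as np

    def mc_call_price(s0, strike, r, sigma, T, n=10**6, seed=0):
        rng = np.random.default_rng(seed)
        z = rng.standard_normal(n)
        # Terminal price under risk-neutral geometric Brownian motion.
        st = s0 * np.exp((r - sigma**2 / 2) * T + sigma * np.sqrt(T) * z)
        payoff = np.maximum(st - strike, 0.0)
        return np.exp(-r * T) * payoff.mean()  # discounted expected payoff

    print(mc_call_price(s0=100, strike=105, r=0.01, sigma=0.2, T=1.0))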
Nowadays, the problem of identification and authentication on the Internet is more urgent than ever. There are several reasons for this: on the one hand, there are many Internet services that keep records of users and differentiate their access rights to certain resources; on the other hand, cybercriminals' attacks on web services have become much more frequent lately. At the same time, in...
Started in the natural sciences, the high demand for analyzing vast amounts of complex data has reached such research areas as economics and social sciences. Big Data methods and technologies provide new efficient tools for researchers. In this paper, we discuss the main principles and architecture of a digital analytical platform aimed to support socio-economic applications. Integrating specific...
The JINR cloud infrastructure hosts a number of cloud services to facilitate scientific workflows of individual researchers and research groups. While the batch processing systems are still the major compute power consumers of the cloud, new auxiliary cloud services and tools are being adopted by researchers and are gradually changing the landscape of the cloud environment. While such...
The article discusses the main provisions (methods, risk models, calculation algorithms, etc.) of organizing the protection of personal data (PD) based on the application of an anonymization procedure. The authors show the relevance of the studied problem based on the general growth of informatization and the further development of Big Data technology. This...
Cloud computing has emerged as a new paradigm for on-demand access to a vast pool of computing resources that provides a promising alternative to traditional on-premises resources. There are several advantages of using clouds for scientific computing. Clouds can significantly lower time-to-solution via quick resource provision, skipping the lengthy process of building a new cluster on-premises...
Commonly used job schedulers in high-performance computing environments do not allow resource oversubscription. They usually work by assigning an entire node to a single job, even if the job utilises only a small portion of the node's available resources. This may lead to under-utilization of cluster resources and to an increase in job wait time in the queue. Every job may have different requirements...
The process of digitalization of the Russian economy as the basis for the transition to the digital economy is conditioned by the requirements of objective reality and is based, first of all, on the introduction of digital technologies into the activities of its actors. The most promising is Blockchain technology, which enables the most effective coordination of the...
This work is devoted to the development of a software system for the mutual classification of families of population-based optimization algorithms and multidimensional continuous optimization problems. One of the goals of this study is to develop methods for predicting the performance of the algorithms included in the system and for selecting the most efficient ones for solving a given...
To accurately detect texts containing elements of hatred or enmity, it is necessary to take into account various features: syntax, semantics and discourse relations between text fragments. Unfortunately, at present, methods for identifying discourse relations in the texts of social networks are poorly developed. The paper considers the issue of classifying discourse relations between two...
Modern IT infrastructure is developing at a forced pace, with a transition to cloud computing and the introduction of virtualization methods.
At the same time, the effective solution of large-scale scientific problems currently requires high-performance computing, including the use of distributed computing environments (DCEs) for various purposes.
All...
Humans and other animals can understand concepts from only a few examples, while standard machine learning algorithms require a large number of examples to extract hidden features. Unsupervised learning is the procedure of revealing hidden features from unlabeled data.
In deep neural network training, unsupervised data pre-training increases the final accuracy of the algorithm by decreasing an...
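A minimal sketch of the pre-training idea in Keras (the layer sizes and the 784-dimensional input are assumptions): an autoencoder first learns features from unlabeled data, then its encoder initializes the supervised classifier:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    encoder = models.Sequential([
        layers.Dense(64, activation="relu", input_shape=(784,)),
        layers.Dense(16, activation="relu"),
    ])
    decoder = models.Sequential([
        layers.Dense(64, activation="relu", input_shape=(16,)),
        layers.Dense(784, activation="sigmoid"),
    ])
    autoencoder = models.Sequential([encoder, decoder])
    autoencoder.compile(optimizer="adam", loss="mse")
    # autoencoder.fit(x_unlabeled, x_unlabeled, epochs=10)  # unsupervised

    classifier = models.Sequential([encoder,  # reuse the learned features
                                    layers.Dense(10, activation="softmax")])
    classifier.compile(optimizer="adam",
                       loss="sparse_categorical_crossentropy")
    # classifier.fit(x_labeled, y_labeled, epochs=10)  # supervised stage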