Machine learning (ML) methods began to be used in the MLIT laboratory from the very beginning of its organization in 1966, when one of the main tasks of the LVTA was the automation of film data processing used at that time in physics experiments. This included the problems of automating film measurements and calibration of the then-built scanning machines Spiral Reader and AELT (Automat on...
Different aspects of deep learning applications in collider physics will be discussed in the talk. The main topic of the talk is the methodology of optimizing data analysis with deep neural networks. A short overview of methods for searching for "new physics" with neural network techniques will be presented.
The GRAPES-3 experiment located in Ooty consists of a dense array of 400 plastic scintillator detectors spread over an area of 25,000 m$^2$ and a large-area (560 m$^2$) tracking muon telescope. Every day, the array records about 3 million showers in the energy range of 1 TeV - 10 PeV induced by the interaction of primary cosmic rays in the atmosphere. These showers are reconstructed in order to...
The Imaging Atmospheric Cherenkov Telescopes (IACTs) of the TAIGA astrophysical complex make it possible to observe high-energy gamma radiation, which helps to study many astrophysical objects and processes. TAIGA-IACT enables us to select gamma quanta from the total cosmic radiation flux and recover their primary parameters, such as energy and direction of arrival. The traditional method of processing the resulting...
The TAIGA experimental complex is a hybrid observatory for high-energy gamma-ray astronomy in the range from 10 TeV to several EeV. The complex consists of such installations as TAIGA-IACT, TAIGA-HiSCORE and a number of others. The TAIGA-HiSCORE facility is a set of wide-angle synchronized stations that detect Cherenkov radiation scattered over a large area. With TAIGA-HiSCORE data provides an...
The Monte Carlo method is commonly used to simulate Cherenkov telescope images of atmospheric events caused by high-energy particles. We investigate the possibility of augmenting Monte Carlo-generated sets using other methods. One such method is the variational autoencoder.
We trained conditional variational autoencoders (CVAE) using a set of Monte Carlo-generated images from one...
Currently, generative adversarial networks (GANs) are a promising tool for image generation in the astronomy domain. Of particular interest are conditional GANs (CGANs), which allow you to divide images into several classes according to the value of some property of the image, and then specify the required class when generating images. In the case of images from Imaging Atmospheric Cherenkov...
The Jiangmen Underground Neutrino Observatory (JUNO) is a neutrino experiment under construction with a broad physics program. The main goals of JUNO are the determination of the neutrino mass ordering and the high precision measurement of neutrino oscillation properties. High quality reconstruction of reactor neutrino energy is crucial for the success of the experiment.
The JUNO detector...
Particle tracking is an essential part of any high-energy physics experiment. Well-known tracking algorithms based on the Kalman filter do not scale well with the amounts of data being produced in modern experiments. In our work we present a particle tracking approach based on deep neural networks for the BM@N experiment and the future SPD experiment. We have already applied similar approaches...
Taking into account that, at a Higgs boson mass of 125 GeV, the probability of its decay into bb is greater than the sum of the probabilities of all other decay channels, this channel makes a great contribution to the study of the Higgs boson. A more suitable channel for the production of the Higgs boson for studying it in bb decay is associative production with a vector boson. It was in this...
Machine learning methods are now widely used for particle identification (PID) in experimental high-energy physics. Particle identification plays an important role in high-energy physics analysis and therefore largely determines the success of an experiment, which makes the application of machine learning to the PID problem important. This report gives a preliminary status of application of...
During the experiment, 9 water bodies located in the Pskov region were studied: the pond of the Mirozhka River, the delta of the Velikaya River, the Kamenka River, lakes Kalatskoye, Teploe, Lesitskoye, Tiglitsy, Chudskoye (Peipsi), Pskovskoye. Water samples with phytoplankton were taken from each water body, and toxicants (CdSO$_4$ or K$_2$Cr$_2$O$_7$) were added at a concentration of 20 μM...
Most modern machine learning models are known as black-box models. By default, these predictors don't provide an explanation as to why a certain event or example has been assigned a particular class or value. Model explainability methods aim to interpret the decision-making process of a black-box model and present it in a way that is easy for researchers to understand. These methods can...
One of the methods for analysis of complex spectral contours (especially for spectra of liquid objects) is their decomposition into a limited number of spectral bands with physically reasonable shapes (Gaussian, Lorentzian, Voigt, etc.). Subsequent analysis of the dependencies of the parameters of these bands on some external conditions in which the spectra are obtained may reveal some...
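As an illustration of band decomposition, the sketch below fits only the band amplitudes by linear least squares, assuming the band centers and widths are already known. This is a simplification and not the method of the abstract: a full decomposition would also optimize centers and widths with a nonlinear optimizer. The function names, the wavenumber grid, and the two synthetic bands are hypothetical.

```python
import numpy as np

def gaussian(x, center, width):
    """Unit-height Gaussian band; width is the standard deviation."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def lorentzian(x, center, width):
    """Unit-height Lorentzian band; width is the half width at half maximum."""
    return width ** 2 / ((x - center) ** 2 + width ** 2)

def fit_amplitudes(x, spectrum, bands):
    """Least-squares amplitudes for bands given as (shape_fn, center, width).

    Centers and widths are held fixed (an assumption of this sketch),
    which makes the problem linear in the amplitudes.
    """
    design = np.column_stack([fn(x, c, w) for fn, c, w in bands])
    amplitudes, *_ = np.linalg.lstsq(design, spectrum, rcond=None)
    return amplitudes, design @ amplitudes

# Synthetic two-band contour (hypothetical values, arbitrary units).
x = np.linspace(3000.0, 3800.0, 401)
bands = [(gaussian, 3250.0, 90.0), (lorentzian, 3450.0, 60.0)]
true_amplitudes = np.array([1.0, 0.6])
spectrum = sum(a * fn(x, c, w) for a, (fn, c, w) in zip(true_amplitudes, bands))

amplitudes, reconstruction = fit_amplitudes(x, spectrum, bands)
```

The fitted amplitudes can then be tracked against an external condition such as temperature or concentration, which is the kind of dependence the abstract refers to.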
In this paper we estimate the accuracy of relation extraction from texts containing pharmacologically significant information on a set of corpora in two languages:
1) the expanded version of the RDRS corpus, which contains texts of internet reviews of medications in Russian;
2) the DDI2013 dataset, containing MEDLINE abstracts and documents from the DrugBank database in English;
3)...
V.V. Korenkov, A.G. Reshetnikov, S.V. Ulyanov, P.V. Zrelov
MLIT, JINR
The physical interpretation of the self-organization control process at the quantum level is discussed based on quantum information-thermodynamic models of the exchange and extraction of quantum (hidden) value information from/between classical particles' trajectories in a particle swarm [1,2]. Main physics and information...
Traditional linear approximations of quantum mechanical wave functions are not practically applicable to systems with more than 3 degrees of freedom due to the “curse of dimensionality”. Indeed, the number of parameters required to describe a wave function in high-dimensional space grows exponentially with the number of degrees of freedom. Inevitably, strong model assumptions should be...
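The exponential growth can be made concrete with a small count. The sketch below assumes a direct-product basis with the same number of basis functions per degree of freedom. The figure of 10 functions per degree of freedom is an illustrative assumption, not a value from the abstract.

```python
def linear_expansion_size(basis_per_dof: int, dof: int) -> int:
    """Number of coefficients in a direct-product linear expansion."""
    return basis_per_dof ** dof

# Assumed: 10 basis functions per degree of freedom, float64 coefficients.
for dof in (1, 3, 6, 12):
    n = linear_expansion_size(10, dof)
    print(f"{dof:2d} degrees of freedom -> {n:.1e} coefficients, "
          f"{n * 8 / 1e9:.3g} GB as float64")
```

Each added degree of freedom multiplies the storage by the per-dimension basis size, which is why linear expansions stop being practical after a few degrees of freedom.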
The paper presents the application of the methodology of machine learning (artificial neural networks) and the method of principal component analysis to the problem of classifying data on the base of credit institutions.
The feed-forward neural network (multilayer perceptron with hidden layers) was applied to specially prepared input data. As a result, the set of credit institutions was...
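A minimal sketch of the principal component step is given below, assuming the indicators of the credit institutions form a numeric matrix with one row per institution. The data here are synthetic stand-ins, and the function name `pca` and the choice of two components are assumptions made for illustration. The resulting scores could serve as inputs to a multilayer perceptron.

```python
import numpy as np

def pca(features, n_components):
    """Project centered data onto its leading principal components via SVD."""
    centered = features - features.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    explained_variance = singular_values ** 2 / (len(features) - 1)
    scores = centered @ components[:n_components].T
    return scores, explained_variance[:n_components]

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 200 institutions, 6 indicators, 2 latent factors.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 6))

scores, variance = pca(features, n_components=2)
```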
Spiking neural networks, which model action potentials in biological neurons, are increasingly popular for machine learning applications thanks to ongoing progress in implementing spiking networks in low-power neuromorphic hardware. However, obtaining a spiking neural network model that solves a classification task as accurately as a formal neural network remains a...
In the framework of the joint project of LIT and LRB JINR, aimed at creating an information system for the tasks of radiation biology, a module is being developed to study the behavioral patterns of small laboratory animals exposed to radiation. The module for behavioral analysis automates the analysis of video data obtained by testing of the laboratory animals in the different test...
Hyperspectral images are a unique source for obtaining many kinds of information about the Earth's surface. Modern platforms enable users to perform complex analyses on collections of images without any specialized software. Google Earth Engine (GEE) is a planetary-scale platform for Earth science data and analysis. Atmospheric, radiometric, and geometric corrections have been...
Cloud cover is the main physical factor limiting the downward shortwave (SW) solar radiation flux. In modern models of climate and weather forecasts, physical models describing radiative transfer through clouds may be used. However, this is a computationally expensive option. Instead, one may use parameterizations which are simplified schemes for approximating environmental variables. The...
Cloudiness plays an important role in the hydrological cycle of the atmosphere. Cloud types and other spatial and temporal characteristics of clouds provide the ability to make short-term in situ weather forecasts. With the help of clouds, one may also track the content of various impurities in the air. Most importantly, clouds are the major obstacle on the pathway of incoming solar radiation,...
Surface wind is one of the most important fields in climate change research. Accurate prediction of high-resolution surface wind has a wide variety of applications, such as renewable energy and extreme weather forecasts. Downscaling is a methodology for high-resolution approximation of physical variables from low-resolution modeling outputs. Statistical downscaling methods make it possible to avoid...
Analysing the inhabitants of the underwater world is relevant to a wide range of applied problems: construction, fishing, and mining. Currently, this task is carried out on an industrial scale through rigorous review by human experts in the field of underwater life. In this work, we present a tool that significantly reduces the time spent by a...
Recently, haze removal methods have attracted increasing attention from researchers. An objective comparison of haze removal methods is hampered by the lack of real data. Capturing pairs of images of the same scene with and without haze in a real environment is a very complicated task. Therefore, most modern haze datasets contain artificial images, generated by some model of...
This study is devoted to the inverse problems of exploration geophysics, which consist in reconstructing the spatial distribution of the properties of the medium in the Earth's interior from the geophysical fields measured on its surface. We consider the methods of gravimetry, magnetometry, and magnetotelluric sounding, as well as their integration, i.e. simultaneous use of data from several...
This work considers a method for predicting the contact matrix of peptides. To simplify the calculations, peptides up to 45 amino acid residues long were selected. Convolutional neural networks (CNNs) were used for the prediction because the feature space of proteins is similar to that of images, to which convolutional neural networks are usually applied successfully. The features were...
The quantitative, granulometric and classification-based distributions of oceanic sediment grains are important indicators in paleo-reconstruction of the characteristics of marine waters. Currently, the classification of grains is performed visually by an expert on a limited subset of a sediment sample using a binocular microscope. It is a highly time-consuming process in which geological expertise...
Currently, there are more than two years of statistics accumulated on COVID-19 for a large number of regions, which allows the use of algorithms that require large training sets, such as neural networks, to predict the dynamics of the disease.
The article provides a comparative analysis of various COVID-19 models based on forecasting for the period from 07/20/2020 to 05/05/2022 using...
In neural network solutions to many physical problems, there is a need to reduce the dimension of the input data in order to achieve a more accurate and stable solution while reducing computational complexity.
When solving an inverse problem in spectroscopy, multicollinearity is often observed between the input features, making it necessary to use a selection method that takes into account...
Weather forecasts have a significant impact on many areas of human activity. In particular, knowledge of short-term wind speed conditions is essential for fishery, energy management, surfing and others. One of the most effective neural network models for time series forecasting is LSTM (long short-term memory); however, the accuracy of its forecast decreases significantly with...
The report presents the possibilities for using the ML/DL/HPC ecosystem deployed on the HybriLIT Heterogeneous Platform (MLIT JINR) on top of JupyterHub, which provides opportunities for solving tasks not only in the field of machine learning and deep learning, but also for the convenient organization of calculations and scientific visualization. The ecosystem allows one to develop and...
The paper presents an analytical platform that implements automated monitoring and analysis of the labor market in the Russian Federation. The platform is based on Big Data solutions and technologies. End-to-end processing corresponds to the general scheme of step-by-step solving of the problem - from data collection, their transformation, analysis, and modeling to services for visualization...
The BIOHLIT information system (IS) is intended for analyzing behavioral and pathomorphological changes in the central nervous system when studying the effect of ionizing radiation on laboratory animals. The information system is being jointly developed by specialists from MLIT and LRB JINR.
The IS is necessary for storing data in a single information space, enhancing the detection of laboratory animals in...
The most complete information about the state of the human cardiovascular system is provided by the analysis of the array of cardiointervals (RR-intervals) of 24-hour Holter monitoring (HM).
The most important task of analyzing a large array of HM RR-intervals is the introduction of the main parameters that most adequately reflect the properties of this array. One way to solve this problem is...
During the start-up regime of the IBR-2M, power fluctuations appear, which the AR system dampens. Their origin is not completely clear, however it is known that the major reactivity sources are from design - respectively the OPO and DPO reflectors (axial fluctuations towards the active zone and their relative phase of intersecting each other facing the center of the active zone).
A...
Yearly nuclide mass data is fitted to improved versions of the Weizsaecker formula. The present attempt at furthering the precision of this endeavour aims to reach beyond just precision, and obtain predictive capability about the "Stability Island" of nuclides. The method is to perform a fit to a recent improved liquid drop model with isotonic shift. The residuals are then fed to a neural...
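For orientation, the sketch below evaluates the textbook Bethe-Weizsaecker (semi-empirical) binding energy and the residual with respect to a measured value. It is not the improved liquid drop model with isotonic shift used in the abstract. The coefficient values are one common textbook set and are assumptions here, as are the function names.

```python
def binding_energy(A: int, Z: int,
                   a_v: float = 15.75, a_s: float = 17.8, a_c: float = 0.711,
                   a_a: float = 23.7, a_p: float = 11.18) -> float:
    """Textbook semi-empirical binding energy in MeV (assumed coefficients)."""
    N = A - Z
    if Z % 2 == 0 and N % 2 == 0:
        pairing = a_p / A ** 0.5       # even-even nuclei are more bound
    elif Z % 2 == 1 and N % 2 == 1:
        pairing = -a_p / A ** 0.5      # odd-odd nuclei are less bound
    else:
        pairing = 0.0                  # odd-A nuclei
    return (a_v * A                              # volume term
            - a_s * A ** (2 / 3)                 # surface term
            - a_c * Z * (Z - 1) / A ** (1 / 3)   # Coulomb term
            - a_a * (A - 2 * Z) ** 2 / A         # asymmetry term
            + pairing)

def residual(A: int, Z: int, measured_binding_energy: float) -> float:
    """Measured minus model binding energy in MeV."""
    return measured_binding_energy - binding_energy(A, Z)

per_nucleon_fe56 = binding_energy(56, 26) / 56
```

Residuals of this kind, computed against tabulated masses, are the quantity the abstract describes passing on to the next stage of the analysis.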
We present the effect of using the Metropolis-Hastings algorithm for sampling the integrand on the accuracy of calculating the value of the integral. In addition, a hybrid method for sampling the integrand is proposed, in which part of the training sample is generated by applying the Metropolis-Hastings algorithm, and the other part includes points of a uniform grid. Numerical experiments show...
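A minimal one-dimensional sketch of such a hybrid training sample follows. It assumes a random-walk Metropolis-Hastings chain whose target density is proportional to the absolute value of the integrand, combined with a uniform grid. The proposal width, the split between the two parts, the test integrand, and all function names are assumptions for illustration and are not taken from the abstract.

```python
import math
import random

def metropolis_hastings(density, start, n_samples, step, low, high, rng):
    """Random-walk Metropolis-Hastings restricted to [low, high]."""
    samples = []
    x = start
    p_x = density(x)
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)
        # Outside the interval the target density is zero, so reject.
        if low <= candidate <= high:
            p_c = density(candidate)
            # Symmetric proposal: accept with probability min(1, p_c / p_x).
            if p_x == 0.0 or rng.random() < p_c / p_x:
                x, p_x = candidate, p_c
        samples.append(x)
    return samples

def hybrid_sample(f, low, high, n_mh, n_grid, step=0.2, seed=0):
    """Return (x, f(x)) pairs: uniform grid points plus MH-sampled points."""
    rng = random.Random(seed)
    grid = [low + i * (high - low) / (n_grid - 1) for i in range(n_grid)]
    chain = metropolis_hastings(lambda x: abs(f(x)), 0.5 * (low + high),
                                n_mh, step, low, high, rng)
    return [(x, f(x)) for x in grid + chain]

def integrand(x):
    """Hypothetical sharply peaked test integrand."""
    return math.exp(-50.0 * (x - 0.5) ** 2)

points = hybrid_sample(integrand, 0.0, 1.0, n_mh=2000, n_grid=101)
```

The grid part guarantees coverage of the whole interval, while the Metropolis-Hastings part concentrates training points where the integrand is large.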
The Spin Physics Detector is a universal facility for studying the nucleon spin structure and other spin-related phenomena with polarized proton and deuteron beams. It will be placed at one of the two interaction points of the NICA collider that is under construction at the Joint Institute for Nuclear Research (Dubna, Russia). The main objective of the proposed experiment is the comprehensive...