Keywords: cloud technologies, knowledge-intensive applications, workflow management system, workflow as a service
1.Introduction
In this work, we propose a comprehensive solution to the problem of scheduling the execution of complex applications on cloud-based WaaS (Workflow as a Service) platforms. Multitenant WaaS environments make it possible to implement efficient mechanisms for managing continuous flows of diverse types of jobs in various fields of knowledge [1]. A workflow is a widely used model that can be represented as a directed acyclic graph (DAG), where vertices correspond to individual tasks and arcs to the information links between them. Cloud technologies are actively used to run scientific workflows, including CyberShake (seismology), Epigenomics and SIPHT (bioinformatics), Montage (astrophysics), and LIGO (gravitational-wave physics). The execution of such applications is automated by workflow management systems (WMSs), which provide functionality for resource management and for scheduling task execution and data transfers. Many WMSs exist today, among them ASKALON, Galaxy, HyperFlow, Kepler, Pegasus, Taverna, and CloudBus [2].
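As a minimal sketch of the DAG model described above, a workflow can be stored as a mapping from each task to the set of tasks it depends on. The task names and the helper `execution_order` are hypothetical and are not taken from the project repository; the example relies only on the Python standard library (`graphlib`, Python 3.9+).

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical toy workflow: each key is a task, each value is the set of
# tasks whose output it consumes (its predecessors in the DAG).
workflow = {
    "stage_in": set(),
    "project_a": {"stage_in"},
    "project_b": {"stage_in"},
    "merge": {"project_a", "project_b"},
}


def execution_order(graph):
    """Return one valid task order; raise ValueError if the graph has a loop."""
    try:
        # static_order() yields every task after all of its predecessors.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"workflow is not a DAG: {exc.args[1]}") from exc


order = execution_order(workflow)
print(order)
```

Any order returned this way starts with `stage_in` and ends with `merge`. The relative order of the two independent `project_*` tasks is not fixed, which is exactly the freedom a scheduler can exploit.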
2.Integrated Approach for Workflow Scheduling and Resource Management
The main strength of cloud computing is its scalability: Infrastructure as a Service (IaaS) allows a WMS to access a virtually unlimited pool of resources. At the same time, a number of serious problems arise related to job-flow scheduling algorithms, and the way they are solved critically affects the efficiency of resource utilization in a cloud environment.
The first is the problem of selecting appropriate IaaS providers, forming a pool of virtual machines (VMs), and allocating them on physical servers. As a rule, known solutions do not consider the simultaneous execution of tasks belonging to different workflows on one VM. This reduces the utilization of active VMs and forces the deployment of additional VMs and containers in them, which in turn incurs time costs and degrades performance.
Although diverse types of workflows can be represented as DAGs, the graph structures of different applications, for example Montage, CyberShake and SIPHT, differ significantly in the computational complexity of their constituent tasks and in the types of information links. This creates additional challenges when executing heterogeneous workflows on a single WaaS platform. Another challenge is the natural presence of loops in a number of applications. WMSs such as Pegasus, Apache Airflow, Taverna and Kepler resort to palliative methods to eliminate such loops, which increases workflow scheduling time [2].
Taken together, these aspects call for an integrated approach to workflow scheduling and resource management in WaaS platforms. The scientific novelty of the approach lies in the development of multifactor strategies for workflow and resource management. Efficient workflow execution must take into account the heterogeneity of resources of different IaaS providers and the time costs of accessing global data storage, such as Amazon S3, as well as the specifics of each individual job. Scheduling must also consider multiple user preferences and constraints, including VM performance requirements, monetary cost under a pay-per-use model, deadline constraints, power consumption, reliability, and several other aspects [3].
3.Main Results
3.1. Methods and tools for scheduling independent and heterogeneous job-flows on a WaaS platform.
The proposed scheduling methods and tools meet the following requirements.
A pool of VMs from different IaaS providers, and of containers, is created dynamically, taking into account multiple factors: the cloud platform utilization level, data storage and transfer policies, the workflow structure, and user estimates of the execution time.
The execution time comprises the following components: the actual processing time, computed as the ratio of the computation volume to the CPU performance of a VM of the corresponding type; the time for data transfer between subtasks of the job (including reading from and writing to global storage); and the time to deploy VMs and containers in VMs of the corresponding type.
The WaaS platform is expected to receive many workflows at any given time. Each workflow subtask can be executed on a particular subset of the VM types available from the IaaS provider. Specialized mechanisms were implemented to resolve conflicts between tasks competing for the same VMs.
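The execution-time components listed above can be illustrated with a small sketch. Only the processing term (computation volume divided by CPU performance) follows the text directly. Modelling transfer time as data volume divided by a single bandwidth figure, and deployment as a fixed per-type delay, are simplifying assumptions made here for illustration. All class, field and function names are hypothetical and do not come from the project repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    name: str
    performance: float   # computation units per second (assumed unit)
    bandwidth: float     # data units per second to/from storage (assumed)
    deploy_time: float   # seconds to start the VM and its container (assumed)


@dataclass(frozen=True)
class Task:
    name: str
    volume: float                 # computation volume of the subtask
    data_in: float                # data read before processing
    data_out: float               # data written after processing
    allowed_vm_types: frozenset   # names of VM types the task may run on


def estimate_time(task, vm, needs_deploy=True):
    """Estimate execution time as processing + transfer + deployment."""
    if vm.name not in task.allowed_vm_types:
        raise ValueError(f"{task.name} cannot run on VM type {vm.name}")
    processing = task.volume / vm.performance
    transfer = (task.data_in + task.data_out) / vm.bandwidth
    # Deployment is paid only when no suitable VM or container is running yet.
    deploy = vm.deploy_time if needs_deploy else 0.0
    return processing + transfer + deploy


small = VMType("small", performance=100.0, bandwidth=40.0, deploy_time=30.0)
task = Task("merge", volume=1000.0, data_in=50.0, data_out=30.0,
            allowed_vm_types=frozenset({"small"}))

cold = estimate_time(task, small)                      # new VM is deployed
warm = estimate_time(task, small, needs_deploy=False)  # active VM is reused
print(cold, warm)
```

With these illustrative numbers the estimate is 42.0 seconds on a freshly deployed VM and 12.0 seconds when an already active VM is reused, which shows why placing tasks from different workflows on active VMs can pay off.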
3.2. Selecting a WMS and extending its functionality to implement the core components of the WaaS platform: a workflow scheduler; a resource provisioning manager; a system for monitoring the state of platform resources and services; and a system for storing workflow execution histories on the WaaS platform.
3.3. Experimental studies of multifactor workflow and resource management strategies on synthetic datasets and real applications available in open-source repositories [2].
The results obtained can be applied to various knowledge-intensive applications that place high demands on the computing resources of distributed environments such as cloud platforms. The project's code is hosted in a GitHub repository (https://github.com/Sorran973/Scheduling-in-Workflow-as-a-Service).
References
1. Toporkov, V. Job Batch Scheduling in Workflow-as-a-Service Platforms / V. Toporkov, D. Yemelyanov, A. Bulkhak, M. Pirogova // Proc. PCT 2024. Communications in Computer and Information Science. - 2024. V. 2241, Springer, Cham. - P. 65–79.
2. Amstutz, P. Existing Workflow systems [Electronic resource] https://s.apache.org/existing-workflow-systems / P. Amstutz, M. Mikheev, M. R. Crusoe, et al. (accessed 18.08.2024).
3. Toporkov, V. Micro-scheduling for Dependable Resources Allocation / V. Toporkov, D. Yemelyanov // Performance Evaluation Models for Distributed Service Networks. Studies in Systems, Decision and Control. - 2021. V. 343. Editors: Bocewicz, Grzegorz, Pempera, Jarosław, Toporkov, Victor. Springer International Publishing. - P. 81–105.