Computing and Information Systems - Theses

  • Item
    Resource provisioning and scheduling algorithms for scientific workflows in cloud computing environments
    Rodriguez Sossa, Maria Alejandra (2016)
    Scientific workflows describe a series of computations that enable the analysis of data in a structured and distributed manner. Their importance is heightened in today's big data era, as they have become a compelling means to process and extract knowledge from the ever-growing data produced by increasingly powerful tools such as telescopes, particle accelerators, and gravitational wave detectors. Because workflows are large in scale, scheduling algorithms are key to efficiently automating their execution in distributed environments and, as a result, to facilitating and accelerating the pace of scientific progress. The emergence of the latest distributed system paradigm, cloud computing, brings with it tremendous opportunities to run workflows at low cost without the need to own any infrastructure. In particular, Infrastructure as a Service (IaaS) clouds offer an easily accessible, flexible, and scalable infrastructure for deploying these scientific applications by providing access to a virtually infinite pool of resources that can be acquired, configured, and used as needed and are charged on a pay-per-use basis. This thesis investigates novel resource provisioning and scheduling approaches for scientific workflows in IaaS clouds. These approaches address fundamental challenges that arise from the multi-tenant, resource-abundant, and elastic resource model and are capable of fulfilling a set of quality of service requirements expressed in terms of execution time and cost. The thesis advances the field by making the following key contributions:
 1. A taxonomy and survey of state-of-the-art scientific workflow scheduling algorithms designed exclusively for IaaS clouds. 
 2. A novel static scheduling algorithm that leverages Particle Swarm Optimization to generate a workflow execution and resource provisioning plan that minimizes the infrastructure cost while meeting a deadline constraint. 
 3. A hybrid algorithm based on a variation of the Unbounded Knapsack Problem that finds a trade-off between making static decisions to find better-quality schedules and dynamic decisions to adapt to unexpected delays. 
 4. A scalable algorithm that combines heuristics and two different Integer Programming models to generate schedules that minimize the execution time of the workflow while meeting a budget constraint. 
 5. The implementation of a cloud resource management module and its integration to an existing Workflow Management System. 

  • Item
    Resource provisioning in spot market-based cloud computing environments
    Voorsluys, William (2014)
    Recently, cloud computing providers have started offering unused computational resources in the form of dynamically priced virtual machines (VMs), also known as "spot instances". Despite their apparent economic advantage, these biddable resources are inherently intermittent, which may cause VM unavailability. When an out-of-bid situation occurs, i.e. the current spot price rises above the user's maximum bid, spot instances are terminated by the provider without prior notice. This thesis presents a study on employing cloud computing spot instances as a means of executing computational jobs on cloud computing resources. We start by proposing a resource management and job scheduling policy, named SpotRMS, which addresses the problem of running deadline-constrained compute-intensive jobs on a pool of low-cost spot instances, while also exploiting variations in price and performance to run applications in a fast and economical way. This policy relies on job runtime estimations to decide which types of spot instances are best suited to each job and when jobs should run. It is able to minimise monetary spending and make sure jobs finish within their deadlines. We also propose an improvement to SpotRMS that addresses the problem of running compute-intensive jobs on a pool of intermittent virtual machines, while also aiming to run applications in a fast and economical way. To mitigate potential unavailability periods, a multifaceted fault-aware resource provisioning policy is proposed. Our solution employs price and runtime estimation mechanisms, as well as three fault tolerance techniques, namely checkpointing, task duplication and migration. As a further improvement, we equip SpotRMS with prediction-assisted resource provisioning and bidding strategies. Our results demonstrate that both cost savings and strict adherence to deadlines can be achieved when the policy mechanisms are properly combined and tuned. In particular, the fault tolerance mechanism that employs migration of VM state provides superior results in virtually all metrics. Finally, we employ a statistical model of spot price dynamics to artificially generate price patterns of varying volatility. We then analyse how SpotRMS performs in environments with highly variable price levels and more frequent changes. Fault tolerance is shown to be even more crucial in such scenarios.