Computing and Information Systems - Theses

  • Item
    Machine Learning-based Energy and Thermal Efficient Resource Management Algorithms for Cloud Data Centres
    Ilager, Shashikant Shankar (2021)
    Cloud data centres are the backbone infrastructure of modern digital society and the economy. Data centres have witnessed tremendous growth, consuming enormous amounts of energy to power IT equipment and cooling systems. Data centres are estimated to consume 2% of the electricity generated globally, and cooling systems alone account for up to 50% of that consumption. Therefore, to save significant energy and provide reliable services, workloads should be managed in both an energy- and thermal-efficient manner. However, existing heuristics and static rule-based resource management policies often fail to find an optimal solution due to the massive complexity and non-linear characteristics of the data centre and its workloads. In this thesis, we focus on resource management algorithms for energy and thermal efficiency in Cloud data centres that are based on machine learning techniques, which have proven efficient at capturing non-linearity between interdependent parameters. We explore how these techniques can be adapted to resource management problems to increase the energy and thermal efficiency of Cloud data centres while simultaneously satisfying application QoS requirements. In particular, we propose algorithms for workload placement, consolidation, application scheduling, and configuring efficient frequencies of resources in Cloud data centres. The proposed solutions are evaluated using various simulation toolkits and prototype systems implemented on real testbeds.
  • Item
    Distributed data stream processing and task placement on edge-cloud infrastructure
    Amarasinghe, Gayashan Niroshana (2021)
    The indubitable growth of smart and connected edge devices with substantial processing power has made ubiquitous computing possible. These edge devices either produce streams of information related to the environment in which they are deployed or are located in proximity to such information producers. Distributed Data Stream Processing is a programming paradigm introduced to process these event streams and acquire relevant insights in order to make informed decisions. While deploying data stream processing frameworks on distributed cloud infrastructure has been the convention, the communication overhead between the cloud and the edge is detrimental for latency-critical real-time applications that rely on data streams produced outside the cloud on edge devices. Privacy concerns surrounding where the data streams are processed are also contributing to the move towards utilising edge devices for processing user-specific data. The emergence of Edge Computing has helped to mitigate these challenges by enabling processes to be executed on edge devices to utilise their unused potential. Distributed data stream processing that shares edge and cloud computing infrastructure is a nascent field which we believe to have many practical applications in the real world, such as federated learning, augmented/virtual reality and healthcare applications. In this thesis, we investigate novel modelling techniques and solutions for sharing the workload of distributed data stream processing applications that utilise edge and cloud computing infrastructure. The outcome of this study is a series of research works that emanate from a comprehensive model and a simulation framework developed using this model, which we utilise to develop workload sharing strategies that consider the intrinsic characteristics of data stream processing applications executed on edge and cloud resources.
First, we focus on developing a comprehensive model for representing the inherent characteristics of data stream processing applications, such as the event generation rate and the distribution of event sizes at the sources, the selectivity and productivity distribution at the operators, the placement of tasks onto the resources, and the recording of metrics such as end-to-end latency, processing latency, networking latency and power consumption. We also incorporate the processing, networking, power consumption, and curating characteristics of edge and cloud computing infrastructure into the model from the perspective of data stream processing. Based on our model, we develop a simulation tool, which we call ECSNeT++, and verify its accuracy by comparing the latency and power consumption metrics acquired from the calibrated simulator and a real test-bed, both of which execute identical applications. We show that ECSNeT++ can model a real deployment, with proper calibration. With the public availability of ECSNeT++ as open source software, and the verified accuracy of our results, ECSNeT++ can be used effectively for predicting the behaviour and performance of stream processing applications running on large-scale, heterogeneous edge and cloud computing infrastructure. Next, we investigate how to optimally share the application workload between the edge and cloud computing resources while upholding quality of service requirements. A typical data stream processing application is formed as a directed acyclic graph of tasks that consists of sources that generate events, operators that process incoming events and sinks that act as destinations for event streams. In order to share the workload of such an application, these tasks need to be placed onto the available computing resources.
To this end, we devise an optimisation framework, consisting of a constraint satisfaction formulation and a system model, that aims to minimise end-to-end latency through appropriate placement of tasks on either cloud or edge devices. We test our optimisation framework using ECSNeT++, with realistic topologies and calibration, and show that it is capable of achieving an 8-14% latency reduction and a 14-15% energy reduction when compared to the conventional cloud-only placement, and a 14-16% latency reduction when compared to a naive edge-only placement while also reducing the energy consumption per event by 1-5%. Finally, in order to cater to the multitude of applications that operate under dynamic conditions, we propose a semi-dynamic task switching methodology that can be applied to optimise the end-to-end latency of the application. Here, we approach the task placement problem for changing environment conditions in two phases: in the first phase, locally optimal task placements are acquired for each of a set of discrete environment conditions; these are then fed to the second phase, where the problem is modelled as an Infinite Horizon Markov Decision Process with discounted rewards. By solving this problem, an optimal policy can be obtained, and we show that this optimal policy can improve the performance of distributed data stream processing applications when compared with a dynamic greedy task placement approach as well as static task placement. For real-world applications executed on ECSNeT++, our approach can improve latency by as much as 10-17% on average when compared to a fully dynamic greedy approach.
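
The second phase described above can be illustrated with a minimal value-iteration sketch for an infinite-horizon discounted Markov Decision Process. This is not the thesis's formulation: the environment conditions, candidate placements, latency figures, transition probabilities, switching cost and discount factor below are all hypothetical values chosen only to show the general shape of the technique. The state is assumed to be the pair (environment condition, current placement), and the reward is assumed to be the negative latency minus a penalty for switching placements.

```python
# Hypothetical inputs: none of these names or numbers come from the thesis.
CONDITIONS = ["low_load", "high_load"]
PLACEMENTS = ["edge_heavy", "cloud_heavy"]  # stand-ins for phase-one placements

# Assumed mean end-to-end latency (ms) of each placement under each condition.
LATENCY = {
    "low_load": {"edge_heavy": 40.0, "cloud_heavy": 60.0},
    "high_load": {"edge_heavy": 90.0, "cloud_heavy": 70.0},
}

# Assumed probability of moving from one environment condition to the next.
TRANSITION = {
    "low_load": {"low_load": 0.8, "high_load": 0.2},
    "high_load": {"low_load": 0.3, "high_load": 0.7},
}

SWITCH_COST = 5.0  # assumed penalty (ms) for changing the active placement


def q_value(values, cond, current, action, gamma):
    """Expected discounted return of choosing `action` in state (cond, current)."""
    reward = -LATENCY[cond][action]
    if action != current:
        reward -= SWITCH_COST
    expected_next = sum(
        prob * values[(nxt, action)] for nxt, prob in TRANSITION[cond].items()
    )
    return reward + gamma * expected_next


def value_iteration(gamma=0.9, tol=1e-6):
    """Apply Bellman optimality updates until the value function stabilises."""
    states = [(c, p) for c in CONDITIONS for p in PLACEMENTS]
    values = {s: 0.0 for s in states}
    while True:
        new_values = {
            (c, p): max(q_value(values, c, p, a, gamma) for a in PLACEMENTS)
            for (c, p) in states
        }
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            break
    # The greedy policy with respect to the converged values.
    policy = {
        (c, p): max(PLACEMENTS, key=lambda a: q_value(values, c, p, a, gamma))
        for (c, p) in states
    }
    return values, policy


if __name__ == "__main__":
    _, policy = value_iteration()
    for state, action in sorted(policy.items()):
        print(state, "->", action)
```

The resulting policy maps each (condition, current placement) pair to the placement to run next. With a real system, the latency table would presumably come from measurements or simulation, and the transition probabilities from observed changes in environment conditions.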