Computing and Information Systems - Theses


Search Results

  • Item
    Cost-efficient Management of Cloud Resources for Big Data Applications
    Islam, Muhammed Tawfiqul (2020)
    Analyzing vast amounts of business and user data on big data analytics frameworks is becoming common practice in organizations seeking a competitive advantage. These frameworks are usually deployed in a computing cluster to meet the analytics demands in every major domain, including business, government, financial markets, and health care. However, buying and maintaining a massive amount of on-premise resources is costly and difficult, especially for start-ups and small business organizations. Cloud computing provides infrastructure, platform, and software systems for storing and processing data. Thus, Cloud resources can be utilized to set up a cluster with a required big data processing framework. However, several challenges need to be addressed for Cloud-based big data processing, including deciding how many Cloud resources are needed for each application, how to maximize the utilization of these resources to improve applications' performance, and how to reduce the monetary cost of resource usage. In this thesis, we focus on a user-centric view, where a user can be either an individual or a small/medium business organization that wants to deploy a big data processing framework on the Cloud. We explore how resource management techniques can be tailored to various user demands, such as performance improvement and deadline guarantees for applications, all while reducing the monetary cost of using the cluster. In particular, we propose efficient resource allocation and scheduling mechanisms for Cloud-deployed Apache Spark clusters.
  • Item
    Profit optimization of resource management for big data analytics-as-a-service platforms in cloud computing environments
    Zhao, Yali (2020)
    Discovering optimal resource management solutions to support data analytics to extract value from big data is an increasingly important research area. It is fair to say that the success of many organizations, companies, and individuals now relies heavily on data analytics solutions. Cloud computing greatly supports big data analytics by providing scalable resources based on user demand and supporting elastic resource provisioning in a pay-as-you-go model. Big data Analytics as a Service (AaaS) platforms provision AaaS to various domains as consumable services in an easy-to-use manner across cloud computing environments. AaaS platforms aim to deliver efficient data analytics solutions to benefit decision-making and problem solving in a wide range of application domains such as engineering, science, and government. However, big data analytics solutions face a range of challenges: the dynamic nature of query requests; the heterogeneity of cloud resources; the different Quality of Service (QoS) requirements; the potential for lengthy data processing times and the associated high resource costs; and the need to handle big data processing demands under potentially limited or constrained budgets, deadlines, and/or data accuracies. These challenges need to be tackled by efficient resource management solutions that support AaaS platforms in delivering reliable, cost-effective, and fast AaaS. Optimal resource management solutions are essential for AaaS platforms to maximize profits and minimize query times while guaranteeing Service Level Agreements (SLAs) during AaaS delivery. To tackle these challenges, this thesis systematically studies profit optimization solutions to support AaaS platforms. Key contributions are made through a range of resource management solutions. These include admission control and resource scheduling algorithms that handle various problem scenarios in which data must be processed under heterogeneous, constrained, or limited budgets, deadlines, or accuracies, with support for data splitting and/or data sampling-based methods that reduce data processing times and costs with potential accuracy trade-offs. These algorithms allow AaaS platforms to optimize profits and minimize query times through optimal resource management solutions, and thereby increase market share by maximizing query admissions and improve reputation by delivering SLA-supported AaaS solutions.
  • Item
    A big data infrastructure for real-time traffic analytics on the cloud
    Gong, Yikai (2019)
    With the increasing urbanisation occurring globally, cities are facing unprecedented challenges. One major challenge is related to traffic and the increasingly common congestion issues that arise in cities. At the same time, digital data is being created across all walks of life by industry, governments and society more generally. The term "big data" has now entered the common vernacular. Big data can include officially captured data, e.g. from traffic measurement systems run by government organisations such as VicRoads in Australia, as well as other forms of data generated by the population at large, e.g. social media. This thesis explores the unique characteristics of traffic-related data and focuses on the development and evaluation of an underpinning Cloud-based platform that can tackle some of the unique big data challenges related to such data. In particular, the thesis focuses on challenges related to the volume, velocity and variety of traffic data. We explore how different forms of data can be processed in real time, including official sensor data such as that from the Sydney Coordinated Adaptive Traffic System (SCATS), which is widely rolled out across Victoria and supported by VicRoads, as well as how social media data such as that from Twitter can be used as a cheaper proxy for SCATS to better understand traffic in cities. We also develop novel real-time clustering algorithms that tackle the unique spatial and temporal aspects of traffic-related data.