dc.contributor.author | Islam, Muhammed Tawfiqul | |
dc.date.accessioned | 2021-01-11T05:15:34Z | |
dc.date.available | 2021-01-11T05:15:34Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | http://hdl.handle.net/11343/258658 | |
dc.description | © 2020 Muhammed Tawfiqul Islam | |
dc.description.abstract | Analyzing vast amounts of business and user data on big data analytics frameworks is becoming common practice in organizations seeking a competitive advantage. These frameworks are usually deployed in a computing cluster to meet analytics demands in every major domain, including business, government, financial markets, and health care. However, buying and maintaining a massive amount of on-premises resources is costly and difficult, especially for start-ups and small business organizations. Cloud computing provides infrastructure, platform, and software systems for storing and processing data. Thus, Cloud resources can be used to set up a cluster with the required big data processing framework. However, several challenges need to be addressed for Cloud-based big data processing, including deciding how many Cloud resources each application needs, how to maximize the utilization of these resources to improve application performance, and how to reduce the monetary cost of resource usage. In this thesis, we focus on a user-centric view, where a user can be either an individual or a small or medium business organization that wants to deploy a big data processing framework on the Cloud. We explore how resource management techniques can be tailored to various user demands, such as performance improvement and deadline guarantees for applications, all while reducing the monetary cost of using the cluster. In particular, we propose efficient resource allocation and scheduling mechanisms for Cloud-deployed Apache Spark clusters. | |
dc.rights | Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works. | |
dc.subject | Big Data | |
dc.subject | Cloud Computing | |
dc.subject | Cost-efficiency | |
dc.subject | Performance Improvement | |
dc.subject | Resource Management | |
dc.subject | Resource Allocation | |
dc.subject | Scheduling | |
dc.subject | Artificial Intelligence | |
dc.subject | Cluster Scheduling | |
dc.subject | Apache Spark | |
dc.subject | Apache Mesos | |
dc.title | Cost-efficient Management of Cloud Resources for Big Data Applications | |
dc.type | PhD thesis | |
melbourne.affiliation.department | Computing and Information Systems | |
melbourne.affiliation.faculty | Engineering | |
melbourne.thesis.supervisorname | Rajkumar Buyya | |
melbourne.contributor.author | Islam, Muhammed Tawfiqul | |
melbourne.thesis.supervisorothername | Shanika Karunasekera | |
melbourne.tes.fieldofresearch1 | 080501 Distributed and Grid Systems | |
melbourne.tes.fieldofresearch2 | 080503 Networking and Communications | |
melbourne.tes.confirmed | true | |
melbourne.accessrights | Open Access | |