Computing and Information Systems - Research Publications

Now showing 1 - 10 of 46
  • Item
    Exploiting patterns to explain individual predictions
    Jia, Y ; Bailey, J ; Ramamohanarao, K ; Leckie, C ; Ma, X (Springer London, 2020-03)
    Users need to understand the predictions of a classifier, especially when decisions based on those predictions can have severe consequences. The explanation of a prediction reveals the reason why a classifier makes a certain prediction, and it helps users to accept or reject the prediction with greater confidence. This paper proposes an explanation method called Pattern Aided Local Explanation (PALEX) to provide instance-level explanations for any classifier. PALEX takes a classifier, a test instance, and a frequent pattern set summarizing the training data of the classifier as inputs, and then outputs the supporting evidence that the classifier considers important for the prediction of the instance. To study the local behavior of a classifier in the vicinity of the test instance, PALEX uses the frequent pattern set from the training data as an extra input to guide the generation of new synthetic samples in the vicinity of the test instance. Contrast patterns are also used in PALEX to identify locally discriminative features in the vicinity of a test instance. PALEX is particularly effective for scenarios where multiple explanations exist. In our experiments, we compare PALEX to several state-of-the-art explanation methods over a range of benchmark datasets and find that it can identify explanations with both high precision and high recall.
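    The following is a minimal, assumption-laden sketch of the general idea behind instance-level explanation: sample synthetic points around the test instance, query the black-box classifier, and fit a proximity-weighted surrogate whose largest coefficients serve as the explanation. It deliberately omits PALEX's distinguishing use of frequent and contrast patterns to guide sampling, and the dataset, model, and parameter choices below are illustrative.
```python
# Sketch of local, instance-level explanation for a black-box classifier:
# perturb the test instance, label the perturbations with the classifier,
# and fit a proximity-weighted linear surrogate. NOTE: this is NOT the PALEX
# algorithm -- PALEX guides sampling with frequent patterns and uses contrast
# patterns; plain Gaussian sampling here is an illustrative simplification.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

data = load_breast_cancer()
X, y = data.data, data.target
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def explain_instance(x, n_samples=2000, sigma=0.5, top_k=5):
    """Return the features the black box appears to rely on near x."""
    rng = np.random.default_rng(0)
    scale = X.std(axis=0)
    samples = x + rng.normal(0.0, sigma, size=(n_samples, x.size)) * scale
    probs = black_box.predict_proba(samples)[:, 1]        # black-box outputs
    dist = np.linalg.norm((samples - x) / scale, axis=1)  # proximity weights
    weights = np.exp(-dist ** 2 / (2.0 * x.size))
    surrogate = Ridge(alpha=1.0).fit(samples, probs, sample_weight=weights)
    order = np.argsort(-np.abs(surrogate.coef_))[:top_k]
    return [(data.feature_names[i], round(surrogate.coef_[i], 4)) for i in order]

print(explain_instance(X[0]))
```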
  • Item
    Spectral-based fault localization using hyperbolic function
    Neelofar, N ; Naish, L ; Ramamohanarao, K (WILEY, 2018-03)
    Debugging is crucial for producing reliable software. One of the effective bug localization techniques is spectral-based fault localization, which tries to locate a buggy statement by applying an evaluation metric to program spectra and ranking program components by the resulting scores. Here, we propose a restricted class of “hyperbolic” metrics, with a small number of numeric parameters. This class of functions is based on past theoretical and empirical results. We show that optimization methods such as genetic programming and simulated annealing can reliably discover effective metrics over a wide range of data sets of program spectra. We evaluate the performance for both real programs and model programs with single bugs, multiple bugs, “deterministic” bugs, and nondeterministic bugs, and find that the proposed class of metrics performs as well as or better than the previous best-performing metrics over a broad range of data.
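    As a concrete illustration of the spectral-based ranking pipeline described above, the sketch below scores statements from their spectra tuples (ef, ep, nf, np). It uses the well-known Ochiai metric as a stand-in; the paper's hyperbolic metrics form a small parameterized family whose parameters are tuned by optimization, and their exact form is not reproduced here. The toy spectra are invented for illustration.
```python
# Minimal sketch of spectral-based fault localization: each statement's
# spectrum is the tuple (ef, ep, nf, np) -- how often it is executed / not
# executed in failing / passing tests -- and statements are ranked by a
# metric over that tuple. Ochiai is used as a stand-in metric; the paper's
# parameterized "hyperbolic" family (tuned e.g. by genetic programming)
# is not reproduced here.
import math

def ochiai(ef, ep, nf, np_):
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def rank_statements(spectra):
    """spectra: {statement_id: (ef, ep, nf, np)} -> sorted by suspiciousness."""
    scores = {s: ochiai(*counts) for s, counts in spectra.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy spectra for three statements over 10 failing and 90 passing tests.
spectra = {
    "s1": (9, 10, 1, 80),   # executed by most failing tests -> suspicious
    "s2": (5, 50, 5, 40),
    "s3": (1, 70, 9, 20),
}
for stmt, score in rank_statements(spectra):
    print(stmt, round(score, 3))
```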
  • Item
    On the effectiveness of isolation-based anomaly detection in cloud data centers
    Calheiros, RN ; Ramamohanarao, K ; Buyya, R ; Leckie, C ; Versteeg, S (WILEY, 2017-09-25)
    The high volume of monitoring information generated by large-scale cloud infrastructures poses a challenge to the capacity of cloud providers in detecting anomalies in the infrastructure. Traditional anomaly detection methods are resource-intensive and computationally complex for training and/or detection, which is undesirable in very dynamic and large-scale environments such as clouds. Isolation-based methods have the advantage of low complexity for training and detection and are optimized for detecting failures. In this work, we explore the feasibility of Isolation Forest, an isolation-based anomaly detection method, for detecting anomalies in large-scale cloud data centers. We propose a method to encode time-series information as extra attributes that enable temporal anomaly detection, and establish its ability to adapt to seasonality and trends in the time series and to be applied online and in real time.
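    A minimal sketch of the idea follows, assuming scikit-learn's IsolationForest and a sliding-window encoding that appends lagged readings as extra attributes so the detector sees short-term temporal context. The window-based encoding, the synthetic CPU trace, and all parameters are assumptions made for illustration, not the paper's exact scheme.
```python
# Detect anomalies in a stream of utilization readings with IsolationForest,
# adding lagged values as extra attributes for temporal context. The specific
# window-based encoding is an assumption; the paper's encoding may differ.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
cpu = 0.4 + 0.1 * np.sin(np.linspace(0, 20, 500)) + rng.normal(0, 0.02, 500)
cpu[300:305] += 0.5                      # injected anomaly (load spike)

def windowed(series, lags=5):
    """Turn a 1-D series into rows of [x_t, x_{t-1}, ..., x_{t-lags}]."""
    rows = [series[t - lags:t + 1][::-1] for t in range(lags, len(series))]
    return np.array(rows)

X = windowed(cpu)
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
labels = model.fit_predict(X)            # -1 = anomaly, 1 = normal
print("anomalous windows end at t =", np.where(labels == -1)[0] + 5)
```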
  • Item
    Improving spectral-based fault localization using static analysis
    Neelofar, N ; Naish, L ; Lee, J ; Ramamohanarao, K (WILEY, 2017)
  • Item
    Holistic resource management for sustainable and reliable cloud computing: An innovative solution to global challenge
    Gill, SS ; Garraghan, P ; Stankovski, V ; Casale, G ; Thulasiram, RK ; Ghosh, SK ; Ramamohanarao, K ; Buyya, R (Elsevier Inc., 2019-09-01)
    Minimizing the energy consumption of servers within cloud computing systems is of utmost importance to cloud providers in reducing operational costs and enhancing service sustainability by consolidating services onto fewer active servers. Moreover, providers must also provision high levels of availability and reliability; hence, cloud services are frequently replicated across servers, which subsequently increases server energy consumption and resource overhead. These two objectives can present a potential conflict within cloud resource management decision making, which must balance service consolidation against replication to minimize energy consumption whilst maximizing server availability and reliability, respectively. In this paper, we propose a cuckoo optimization-based energy-reliability aware resource scheduling technique (CRUZE) for holistic management of cloud computing resources including servers, networks, storage, and cooling systems. CRUZE clusters and executes heterogeneous workloads on provisioned cloud resources, enhancing energy efficiency and reducing the carbon footprint in data centers without adversely affecting cloud service reliability. We evaluate the effectiveness of CRUZE against existing state-of-the-art solutions using the CloudSim toolkit. Results indicate that our proposed technique is capable of reducing energy consumption by 20.1% whilst improving reliability and CPU utilization by 17.1% and 15.7%, respectively, without affecting other Quality of Service parameters.
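    The sketch below illustrates only the energy-versus-reliability tension that such a scheduler must balance: a candidate VM placement is scored by a weighted fitness combining an estimated power draw with a crude reliability penalty for heavily packed hosts. The power model, reliability proxy, weights, and fitness form are all illustrative assumptions, not the published cuckoo-optimization formulation.
```python
# Score a VM-to-host placement on the consolidation/replication trade-off:
# consolidating onto fewer hosts lowers energy but increases the chance that
# a single host is overloaded, hurting the reliability proxy. All constants
# and the fitness form are illustrative assumptions.
from collections import Counter

IDLE_W, PEAK_W, CAPACITY = 100.0, 250.0, 8     # per-host power model, VM slots

def fitness(placement, w_energy=0.6, w_rel=0.4):
    """placement: list mapping VM index -> host id. Higher is better."""
    load = Counter(placement)
    energy = sum(IDLE_W + (PEAK_W - IDLE_W) * n / CAPACITY for n in load.values())
    # crude reliability proxy: shrinks as any host approaches full capacity
    reliability = 1.0
    for n in load.values():
        reliability *= 1.0 - (n / CAPACITY) ** 2
    return w_energy * (1.0 / energy) + w_rel * reliability

consolidated = [0] * 6 + [1] * 2          # 8 VMs packed onto 2 hosts
spread_out   = [0, 1, 2, 3, 0, 1, 2, 3]   # 8 VMs spread over 4 hosts
print(round(fitness(consolidated), 3), round(fitness(spread_out), 3))
```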
  • Item
    Continuous Spatial Query Processing: A Survey of Safe Region Based Techniques
    Qi, J ; Zhang, R ; Jensen, CS ; Ramamohanarao, K ; He, J (ASSOC COMPUTING MACHINERY, 2018-07)
    In the past decade, positioning-system-enabled devices such as smartphones have become increasingly prevalent. This has driven the growing popularity of location-based services in business as well as daily applications such as navigation, targeted advertising, and location-based social networking. Continuous spatial queries serve as a building block for location-based services. As an example, an Uber driver may want to be kept aware of the nearest customers or service stations. Continuous spatial queries require updates to the query result as the query or data objects move. This poses challenges to query efficiency, which is crucial to the user experience of a service. A large number of approaches address this efficiency issue using the concept of a safe region. A safe region is a region within which arbitrary movement of an object leaves the query result unchanged. Such a region helps reduce the frequency of query result updates and hence improves query efficiency. As a result, safe region-based approaches have been popular for processing various types of continuous spatial queries. Safe regions have interesting theoretical properties and are worth in-depth analysis. We provide a comparative study of safe region-based approaches. We describe how safe regions are computed for different types of continuous spatial queries, showing how they improve query efficiency. We compare the different safe region-based approaches and discuss possible further improvements.
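    The simplest instance of a safe region is sketched below: for a continuous range query with a moving query point over static objects, the result cannot change while the query point moves less than the smallest slack between each object's distance and the query radius, so a disk of that radius is a safe region. The scenario and function names are illustrative; the surveyed techniques handle far more general queries and region shapes.
```python
# Safe region for a continuous range query with a moving query point q over
# static objects: the result (objects within radius r of q) cannot change
# while q moves less than the smallest "slack", so the disk of that radius
# centred at q is a safe region. Deliberately the simplest possible case.
import math

def range_query_with_safe_region(q, objects, r):
    dist = {o: math.dist(q, p) for o, p in objects.items()}
    result = {o for o, d in dist.items() if d <= r}
    # An in-result object can only leave after q moves at least r - d;
    # an outside object can only enter after q moves at least d - r.
    slack = min(abs(d - r) for d in dist.values())
    return result, slack   # result stays valid while q moves less than slack

objects = {"a": (1.0, 1.0), "b": (4.0, 0.0), "c": (0.0, 6.0)}
result, safe_radius = range_query_with_safe_region((0.0, 0.0), objects, r=3.0)
print(result, round(safe_radius, 3))   # {'a'} remains valid within ~1.0 of movement
```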
  • Item
    A cautionary note on the use of SIFT in pathological connectomes
    Zalesky, A ; Sarwar, T ; Ramamohanarao, K (WILEY, 2020-03)
  • Item
    Performance anomaly detection using isolation-trees in heterogeneous workloads of web applications in computing clouds
    Kardani-Moghaddam, S ; Buyya, R ; Ramamohanarao, K (John Wiley & Sons Ltd., 2019-10-25)
    Cloud computing is a model for on-demand access to shared resources based on the pay-per-use policy. In order to efficiently manage the resources, a continuous analysis of the operational state of the system is required to detect performance degradations and malfunctioning resources as soon as possible. Every change in the workload, hardware condition, or software code can change the state of the system from normal to abnormal, causing performance and quality of service degradations. These changes or anomalies vary from a simple gradual increase in the load to flash crowds, hardware faults, software bugs, etc. In this paper, we propose the Isolation-Forest-based anomaly detection (IFAD) framework, built on the unsupervised isolation technique, for anomaly detection in a multi-attribute space of performance indicators for web-based applications. The unsupervised nature of the algorithm and its fast execution make it well suited to dynamic environments where data patterns change frequently. The experimental results demonstrate that IFAD can achieve good detection accuracy, especially in terms of precision, for multiple types of anomaly. Moreover, we show the importance of validating the accuracy of anomaly detection algorithms with regard to both Area Under the Curve (AUC) and Precision-Recall AUC (PRAUC) in an extensive set of comparisons including multiple unsupervised algorithms. The effectiveness of each algorithm as shown by the PRAUC results indicates the importance of PRAUC, which is largely ignored in the literature, in selecting a suitable anomaly detection algorithm.
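    The evaluation point about AUC versus PRAUC can be illustrated with a small sketch: score an imbalanced synthetic dataset with an Isolation Forest and report both ROC AUC and Precision-Recall AUC (approximated here by average precision). The synthetic data and parameters are assumptions made for illustration only.
```python
# On imbalanced anomaly-detection data, report Precision-Recall AUC alongside
# ROC AUC, since ROC AUC can look healthy while precision is poor. Synthetic
# data and the use of IsolationForest scores are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, size=(2000, 4))
anomalies = rng.normal(3, 1, size=(20, 4))           # ~1% anomalies
X = np.vstack([normal, anomalies])
y = np.r_[np.zeros(len(normal)), np.ones(len(anomalies))]

model = IsolationForest(n_estimators=100, random_state=0).fit(X)
scores = -model.decision_function(X)                  # higher = more anomalous

print("ROC AUC:", round(roc_auc_score(y, scores), 3))
print("PR  AUC:", round(average_precision_score(y, scores), 3))
```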
  • Item
    ETAS: Energy and thermal-aware dynamic virtual machine consolidation in cloud data center with proactive hotspot mitigation
    Ilager, S ; Ramamohanarao, K ; Buyya, R (John Wiley & Sons Ltd., 2019-09-10)
    Data centers consume an enormous amount of energy to meet the ever-increasing demand for cloud resources. Computing and cooling are the two main subsystems that largely contribute to energy consumption in a data center. Dynamic Virtual Machine (VM) consolidation is a widely adopted technique to reduce the energy consumption of computing systems. However, aggressive consolidation leads to the creation of local hotspots that have adverse effects on the energy consumption and reliability of the system. These issues can be addressed through efficient and thermal-aware consolidation methods. We propose an Energy and Thermal-Aware Scheduling (ETAS) algorithm that dynamically consolidates VMs to minimize overall energy consumption while proactively preventing hotspots. ETAS is designed to address the trade-off between time and cost savings, and it can be tuned based on requirements. We perform extensive experiments using real-world traces with precise power and thermal models. The experimental results and empirical studies demonstrate that ETAS outperforms other state-of-the-art algorithms by reducing overall energy without any hotspot creation.
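    A minimal sketch of the kind of thermal-aware placement decision described above follows: a destination host is accepted only if its predicted temperature stays under a hotspot threshold, and among acceptable hosts the one with the smallest power increase is chosen. The linear power and temperature models, threshold, and host data are illustrative assumptions, not the calibrated models used in the paper.
```python
# Pick a destination host that minimizes the power increase while keeping the
# predicted temperature below a hotspot threshold. All constants and the
# linear models are illustrative assumptions.
IDLE_W, PEAK_W = 100.0, 250.0
AMBIENT_C, HEAT_SLOPE_C = 22.0, 45.0       # temperature rises with utilization
T_HOTSPOT_C = 60.0

def power(util):
    return IDLE_W + (PEAK_W - IDLE_W) * util

def temperature(util):
    return AMBIENT_C + HEAT_SLOPE_C * util

def place_vm(vm_util, hosts):
    """hosts: {name: current CPU utilization}. Return chosen host or None."""
    best, best_delta = None, float("inf")
    for name, util in hosts.items():
        new_util = util + vm_util
        if new_util > 1.0 or temperature(new_util) > T_HOTSPOT_C:
            continue                       # would overload or create a hotspot
        delta = power(new_util) - power(util)
        if delta < best_delta:
            best, best_delta = name, delta
    return best

hosts = {"h1": 0.70, "h2": 0.35, "h3": 0.10}
print(place_vm(0.2, hosts))                # h1 is rejected: 0.90 util -> 62.5 degC
```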
  • Item
    Mapping connectomes with diffusion MRI: deterministic or probabilistic tractography?
    Sarwar, T ; Ramamohanarao, K ; Zalesky, A (WILEY, 2019-02)
    PURPOSE: Human connectomics necessitates high-throughput, whole-brain reconstruction of multiple white matter fiber bundles. Scaling up tractography to meet these high-throughput demands yields new fiber tracking challenges, such as minimizing spurious connections and controlling for gyral biases. The aim of this study is to determine which of the two broadest classes of tractography algorithms, deterministic or probabilistic, is most suited to mapping connectomes. METHODS: This study develops numerical connectome phantoms that feature realistic network topologies and that are matched to the fiber complexity of in vivo diffusion MRI (dMRI) data. The phantoms are utilized to evaluate the performance of tensor-based and multi-fiber implementations of deterministic and probabilistic tractography. RESULTS: For connectome phantoms that are representative of the fiber complexity of in vivo dMRI, multi-fiber deterministic tractography yields the most accurate connectome reconstructions (F-measure = 0.35). Probabilistic algorithms are hampered by an abundance of false-positive connections, leading to lower specificity (F = 0.19). While omitting connections with the fewest streamlines (thresholding) improves the performance of probabilistic algorithms (F = 0.38), multi-fiber deterministic tractography remains optimal when it also benefits from thresholding (F = 0.42). CONCLUSIONS: Multi-fiber deterministic tractography is well suited to connectome mapping, while connectome thresholding is essential when using probabilistic algorithms.
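    The two evaluation steps mentioned in the abstract, streamline-count thresholding followed by F-measure scoring against a ground-truth network, can be sketched as below. The toy matrices are invented for illustration; the study uses numerical phantoms matched to in vivo dMRI fiber complexity.
```python
# Threshold a reconstructed connectome by streamline count, then score it
# against a ground-truth network with the F-measure over binary connections.
# The toy matrices are illustrative assumptions, not phantom data.
import numpy as np

def f_measure(truth, recon):
    """truth, recon: binary adjacency matrices (upper triangle used)."""
    iu = np.triu_indices_from(truth, k=1)
    t, r = truth[iu].astype(bool), recon[iu].astype(bool)
    tp, fp, fn = (t & r).sum(), (~t & r).sum(), (t & ~r).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

rng = np.random.default_rng(0)
truth = (rng.random((20, 20)) < 0.2).astype(int)            # toy ground truth
streamlines = (truth * rng.integers(5, 50, (20, 20))         # true connections
               + (rng.random((20, 20)) < 0.3) * rng.integers(1, 4, (20, 20)))  # weak spurious edges

for thresh in (0, 4):                       # unthresholded vs. drop weak edges
    recon = (streamlines > thresh).astype(int)
    print(f"threshold {thresh}: F = {f_measure(truth, recon):.2f}")
```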