Melbourne School of Population and Global Health - Research Publications

  • Item
    An Uncertainty-Accuracy-Based Score Function for Wrapper Methods in Feature Selection
    Maadi, M ; Khorshidi, HA ; Aickelin, U (IEEE, 2023-08-13)
    Feature Selection (FS) is an effective preprocessing method to deal with the curse of dimensionality in machine learning. Redundant features in datasets decrease classification performance and increase computational complexity. Wrapper methods are an important category of FS methods that evaluate various feature subsets and select the best one using performance measures related to a classifier. In these methods, classifier accuracy is the most common performance measure for FS. Although the performance of classifiers depends on their uncertainty, this important criterion is neglected in these methods. In this paper, we present a new performance measure called Uncertainty-Accuracy-based Performance Measure for Feature Selection (UAPMFS) that uses an ensemble approach to measure both the accuracy and the uncertainty of classifiers. UAPMFS uses bagging and an uncertainty confusion matrix. This performance measure can be used in all wrapper methods to improve FS performance. We design two experiments to evaluate the performance of UAPMFS in wrapper methods. In the experiments, we use the leave-one-variable-out strategy, a common strategy in wrapper methods, to evaluate features. We also define a feature score function based on UAPMFS to rank and select features. In the first experiment, we investigate the importance of considering uncertainty in the FS process and show how neglecting uncertainty affects FS performance. In the second experiment, we compare the performance of the UAPMFS-based feature score function with the most common feature score functions for FS. Experimental results show the effectiveness of the proposed performance measure on different datasets.
  • Item
    Cluster-based Diversity Over-sampling: A Density and Diversity Oriented Synthetic Over-sampling for Imbalanced Data
    Yang, Y ; Khorshidi, H ; Aickelin, U (SCITEPRESS - Science and Technology Publications, 2022)
    Imbalanced data is commonly observed in real-life classification tasks. Mainstream machine learning algorithms typically assume that the classes in the underlying datasets are relatively well balanced. Violating this assumption can lead to a biased representation of the models’ performance. This has encouraged the use of re-sampling techniques to generate more balanced datasets. However, mainstream re-sampling methods fail to account for the distribution of minority data and the diversity within generated instances. Therefore, in this paper, we propose a data-generation algorithm, Cluster-based Diversity Over-sampling (CDO), that considers the distribution of minority instances during data generation. Diversity optimisation is used to promote diversity within the generated data. We have conducted extensive experiments on synthetic and real-world datasets to evaluate the performance of CDO in comparison with SMOTE-based and diversity-based methods (DADO, DIWO, BL-SMOTE, DB-SMOTE, and MAHAKIL). The experiments show the superiority of CDO.
  • Item
    Collaborative Human-ML Decision Making Using Experts' Privileged Information under Uncertainty
    Maadi, M ; Khorshidi, HA ; Aickelin, U (2021-01-01)
    Machine Learning (ML) models have been widely applied to clinical decision making. However, in this critical field, human decision making is still prevalent, because clinical experts are more skilled at working with unstructured data, especially when dealing with uncommon situations. In this paper, we use clinical experts' privileged information as an information source for clinical decision making, alongside the information provided by ML models, and introduce a collaborative human-ML decision making model. In the proposed model, two groups of decision makers, ML models and clinical experts, collaborate to reach a consensus decision. As decision making always comes with uncertainty, we present an interval modelling approach to capture uncertainty in the proposed collaborative model. For this purpose, clinical experts are asked to give their opinions as intervals, and we generate prediction intervals as the outputs of ML models. Using the Interval Agreement Approach (IAA) as the aggregation function in our proposed collaborative model helps minimize the loss of information by aggregating the intervals into a fuzzy set. The proposed model can not only improve the accuracy and reliability of decision making but also be more interpretable, especially when it comes to critical decisions. Experimental results on synthetic data show the power of the proposed collaborative decision making model in some scenarios.
  • Item
    IMPROVING ACCURACY OF RECORD LINKAGE USING GRAPH STRUCTURES: RELEVANCE FOR HEALTH OUTCOMES RESEARCH?
    IJzerman, N ; Lin, P ; IJzerman, M ; Aickelin, U (ELSEVIER SCIENCE INC, 2020-05-01)
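
The interval aggregation step mentioned in the abstract of "Collaborative Human-ML Decision Making Using Experts' Privileged Information under Uncertainty" above can be illustrated with a minimal sketch. It assumes the basic form of the Interval Agreement Approach, in which the membership of a value is the fraction of collected intervals that contain it. The function names, the sample intervals, and the evaluation grid are hypothetical and are not taken from the paper, which may use a more elaborate variant.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound), closed on both ends


def iaa_membership(intervals: Sequence[Interval], x: float) -> float:
    """Membership of x: the share of intervals that contain x (assumed basic IAA form)."""
    if not intervals:
        raise ValueError("at least one interval is required")
    covering = sum(1 for lower, upper in intervals if lower <= x <= upper)
    return covering / len(intervals)


def iaa_fuzzy_set(
    intervals: Sequence[Interval], grid: Sequence[float]
) -> List[Tuple[float, float]]:
    """Evaluate the aggregated fuzzy set on a discrete grid of candidate values."""
    return [(x, iaa_membership(intervals, x)) for x in grid]


# Hypothetical inputs: two expert opinions and one ML prediction interval,
# each expressed as an interval on a 0..1 risk scale.
opinions: List[Interval] = [(0.2, 0.6), (0.4, 0.8), (0.5, 0.7)]

# Hypothetical evaluation grid: 0.0, 0.1, ..., 1.0
grid = [i / 10 for i in range(11)]
fuzzy_set = iaa_fuzzy_set(opinions, grid)

for x, mu in fuzzy_set:
    print(f"x={x:.1f}  membership={mu:.2f}")
```

Values on which all sources agree receive membership 1, values covered by no interval receive 0, and partial agreement falls in between, so the original intervals are not collapsed to a single point before aggregation.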