Engineering and Information Technology Collected Works - Research Publications

Search Results

  • Item
    Enhancing Predictive Modeling in Emergency Departments
    Kouhounestani, M ; Song, L ; Luo, L ; Aickelin, U (SCITEPRESS - Science and Technology Publications, 2024)
    The global rise in Emergency Department (ED) visits, exacerbated by COVID-19, has presented multiple challenges in recent years. Electronic Health Records (EHRs), as comprehensive digital repositories of patient health information, offer a pathway to constructing prediction systems that address these issues. However, the heterogeneity of EHRs complicates accurate prediction. A notable challenge is the prevalence of high-cardinality nominal features (NFs) in EHRs. Because they have numerous distinct values, these features are often excluded from analysis, risking information loss and reduced accuracy and interpretability. This study proposes a framework that integrates a preprocessing technique with target encoding (TE-PrepNet) into machine learning (ML) models to address the challenges posed by NFs in MIMIC-IV-ED. We evaluate the performance of TE-PrepNet on two ED-based prediction tasks: triage-based hospital admissions and ED reattendance within 72 hours at discharge time. Incorporating three NFs, our approach improves on the baseline and outperforms previous research that overlooked NFs. A random forest model with TE-PrepNet achieved an AUROC of 0.8458 in the prediction of hospitalisation, compared to the baseline AUROC of 0.7520. For the prediction of ED reattendance within 72 hours, XGBoost yielded an improvement, attaining an AUROC of 0.6975 against the baseline AUROC of 0.6166.
  • Item
    Capturing prediction uncertainty in upstream cell culture models using conformal prediction and Gaussian processes
    Pham, TD ; Aickelin, U ; Bassett, R ; Papadopoulos, H ; Nguyen, KA ; Boström, H ; Carlsson, L (ML Research Press, 2023)
    This extended abstract compares the efficacy of Gaussian process and conformal XGBoost regressions in capturing prediction uncertainty in simulated and industrial cell culture data.
  • Item
    An Uncertainty-Accuracy-Based Score Function for Wrapper Methods in Feature Selection
    Maadi, M ; Khorshidi, HA ; Aickelin, U (Institute of Electrical and Electronics Engineers, 2023)
    Feature Selection (FS) is an effective preprocessing method for dealing with the curse of dimensionality in machine learning. Redundant features in datasets decrease classification performance and increase computational complexity. Wrapper methods are an important category of FS methods that evaluate various feature subsets and select the best one using performance measures related to a classifier. In these methods, classifier accuracy is the most common performance measure for FS. Although the performance of classifiers depends on their uncertainty, this important criterion is neglected in these methods. In this paper, we present a new performance measure called Uncertainty-Accuracy-based Performance Measure for Feature Selection (UAPMFS) that uses an ensemble approach to measure both the accuracy and the uncertainty of classifiers. UAPMFS uses bagging and an uncertainty confusion matrix. This performance measure can be used in all wrapper methods to improve FS performance. We design two experiments to evaluate the performance of UAPMFS in wrapper methods. In the experiments, we use the leave-one-variable-out strategy, a common strategy in wrapper methods, to evaluate features. We also define a feature score function based on UAPMFS to rank and select features. In the first experiment, we investigate the importance of considering uncertainty in the FS process and show how neglecting uncertainty affects FS performance. In the second experiment, we compare the performance of the UAPMFS-based feature score function with the most common feature score functions for FS. Experimental results show the effectiveness of the proposed performance measure on different datasets.
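
The first abstract relies on target encoding to keep high-cardinality nominal features in the model. The abstract does not describe the TE-PrepNet preprocessing itself, so the following is only a minimal sketch of generic smoothed target encoding: each category is replaced by its mean target value, shrunk toward the global mean. The function names, the `smoothing` parameter, and the example complaint strings are all hypothetical and are not taken from the paper or from MIMIC-IV-ED.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Learn a smoothed mean-target value per category (hypothetical helper)."""
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Shrink each category mean toward the global mean;
    # categories with few rows are shrunk the most.
    mapping = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return mapping, global_mean


def apply_target_encoding(categories, mapping, global_mean):
    """Encode categories; values unseen during fitting get the global mean."""
    return [mapping.get(cat, global_mean) for cat in categories]


# Made-up example: a nominal feature and a binary outcome (1 = admitted).
complaints = ["chest pain", "chest pain", "chest pain", "fever", "fever", "rash"]
admitted = [1, 1, 0, 0, 0, 1]

mapping, global_mean = fit_target_encoding(complaints, admitted, smoothing=2.0)
encoded = apply_target_encoding(["fever", "headache"], mapping, global_mean)
```

In practice the mapping should be fitted on training data only (or within cross-validation folds), because an encoding computed from the rows it is applied to leaks the target into the feature.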
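
The third abstract combines bagging with a leave-one-variable-out strategy to score features by both accuracy and uncertainty. The abstract does not give the UAPMFS formula or the uncertainty confusion matrix, so the sketch below substitutes a deliberately simple stand-in: ensemble accuracy minus a weighted disagreement rate among bagged models. The nearest-centroid classifier, the `weight` parameter, and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter


def centroid_fit(X, y):
    """Nearest-centroid classifier (stand-in model): one mean vector per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def centroid_predict(model, row):
    """Return the label whose centroid is closest to the row."""
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, model[label])),
    )


def bagged_score(X, y, n_models=15, seed=0):
    """Return (accuracy, mean disagreement) of a bagged ensemble.

    Evaluated on the same rows used for fitting, for brevity only;
    a real experiment would use held-out data.
    """
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        models.append(centroid_fit([X[i] for i in idx], [y[i] for i in idx]))
    correct, uncertainty = 0, 0.0
    for row, label in zip(X, y):
        votes = Counter(centroid_predict(m, row) for m in models)
        winner, top = votes.most_common(1)[0]
        correct += winner == label
        uncertainty += 1.0 - top / n_models  # share of models outvoted
    return correct / n, uncertainty / n


def lovo_feature_scores(X, y, weight=0.5, n_models=15, seed=0):
    """Leave-one-variable-out: score each feature by the drop in the
    combined measure (accuracy - weight * uncertainty) when it is removed."""
    def combined(data):
        acc, unc = bagged_score(data, y, n_models=n_models, seed=seed)
        return acc - weight * unc

    full = combined(X)
    scores = []
    for j in range(len(X[0])):
        reduced = [row[:j] + row[j + 1:] for row in X]
        scores.append(full - combined(reduced))
    return scores


# Synthetic data: feature 0 separates the classes, feature 1 does not.
X = [[0.1 * i, float(i % 2)] for i in range(6)]
X += [[10 + 0.1 * i, float(i % 2)] for i in range(6)]
y = [0] * 6 + [1] * 6
scores = lovo_feature_scores(X, y)
```

A higher score means that removing the feature hurts the combined measure more, so on this synthetic data feature 0 should rank above feature 1. Setting `weight=0` reduces the measure to accuracy alone, which is the accuracy-only baseline the abstract argues against.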