Computing and Information Systems - Theses

    Scalable and Explainable Time Series Classification
    Cabello Wilson, Nestor Stiven (2022)
    Time series data are ubiquitous, and the explosion of sensor technologies in industrial applications has further accelerated their growth. Modern applications that collect large datasets of long time series require fast automated decisions. In addition, legislation such as the European General Data Protection Regulation now requires explanations for automated decisions. Although state-of-the-art time series classification methods are highly accurate, their computationally expensive and complex models are typically impractical for large datasets or cannot provide explanations. To address these issues, this thesis proposes two time series classification methods that are comparable to state-of-the-art methods in accuracy while also providing scalable and explainable classifications. Our first method uses a novel supervised selection of sub-series to pre-compute a set of features that maximizes classification accuracy when fed into a tree-based ensemble. Our second method further proposes a perturbation scheme for the supervised feature selection and for the node-splitting process used when training the tree-based ensemble. We also propose a highly time-efficient strategy to build the tree-based ensemble. Both methods make our classification results explainable and are orders of magnitude faster than the state-of-the-art methods. Our second method, in particular, is significantly faster and more accurate than our first, and its classification accuracy does not differ significantly from that of the state-of-the-art methods. Moreover, motivated to explore a more general model for time series classification, we propose a novel graph-based method that learns to classify time series without the order constraint inherent to time series data. This method classifies time series by learning the relationships among the data points independently of their positions (i.e., time-stamps) within the series. We show that this method outperforms state-of-the-art methods on several time series datasets, thus opening up a new direction for the design of time series classifiers.
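
To make the idea of supervised sub-series selection concrete, here is a minimal sketch. It is not the thesis's algorithm. The abstract does not say which features or which selection criterion are used, so the per-interval features (mean, standard deviation, slope), the Fisher-style score, and every function name below are assumptions chosen for illustration.

```python
import random
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope of the values against their indices."""
    n = len(values)
    if n < 2:
        return 0.0
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def interval_features(series, start, end):
    """Assumed feature set for one sub-series: (mean, std, slope)."""
    window = series[start:end]
    return (mean(window), pstdev(window), slope(window))


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter (assumed criterion)."""
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        g_mean = mean(group)
        between += len(group) * (g_mean - overall) ** 2
        within += sum((v - g_mean) ** 2 for v in group)
    if within > 0:
        return between / within
    return float("inf") if between > 0 else 0.0


def select_intervals(X, y, candidates, top_k):
    """Rank candidate sub-series by their best feature score; keep the top_k."""
    scored = []
    for start, end in candidates:
        feats = [interval_features(s, start, end) for s in X]
        best = max(fisher_score([f[j] for f in feats], y) for j in range(3))
        scored.append((best, (start, end)))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [interval for _, interval in scored[:top_k]]


# Toy data: the two classes differ only in positions 10..14.
rng = random.Random(0)
X, y = [], []
for label in (0, 1):
    for _ in range(10):
        s = [rng.gauss(0.0, 0.1) for _ in range(20)]
        if label == 1:
            for t in range(10, 15):
                s[t] += 1.0
        X.append(s)
        y.append(label)

candidates = [(0, 5), (5, 10), (10, 15), (15, 20)]
selected = select_intervals(X, y, candidates, top_k=1)
print(selected)
```

In a full pipeline, the features computed on the selected sub-series would then be passed to a tree-based ensemble. Because each feature maps back to a named interval of the series, a split on that feature points to the region of the series that drove the decision, which is one plausible route to the explainability the abstract describes.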