Computing and Information Systems - Theses

Search Results

  • Item
    Scalable and Explainable Time Series Classification
    Cabello Wilson, Nestor Stiven (2022)
    Time series data are ubiquitous, and the explosion of sensor technologies in industrial applications has further accelerated their growth. Modern applications that collect large datasets with long time series require fast automated decisions. In addition, legislation such as the European General Data Protection Regulation now requires explanations for automated decisions. Although state-of-the-art time series classification methods are highly accurate, their computationally expensive and complex models are typically impractical for large datasets or cannot provide explanations. To address these issues, this thesis proposes two time series classification methods that are comparable to state-of-the-art methods in accuracy while providing scalable and explainable classifications. Our first method proposes a novel supervised selection of sub-series to pre-compute a set of features that maximizes classification accuracy when fed into a tree-based ensemble. Our second method further proposes a perturbation scheme for the supervised feature selection and the node-splitting process when training the tree-based ensemble. We also propose a highly time-efficient strategy to build the tree-based ensemble. Both methods enable explainability for our classification results and are orders of magnitude faster than the state-of-the-art methods. Our second method, in particular, is significantly faster and more accurate than our first, and its classification accuracy is not significantly different from that of the state-of-the-art methods. Moreover, motivated to explore a more general model for time series classification, we propose a novel graph-based method that learns to classify time series without the order constraint inherent to time series data. This method classifies time series by learning the relationships among the data points independently of their positions (i.e., timestamps) within the series. We show that this method outperforms state-of-the-art methods on several time series datasets, thus opening up a new direction for the design of time series classifiers.
  • Item
    Automatic caloric expenditure estimation with smartphone's built-in sensors
    Cabello Wilson, Nestor Stiven (2016)
    Fitness-tracking systems are technologies commonly used to enhance people's lifestyles. Feedback, usability, and ease of acquisition are fundamental to achieving the goal of good physical condition. Users need constant motivation to keep their interest in the fitness system and, consequently, to stay on a healthy lifestyle track. However, although feedback is increasingly being incorporated into many fitness-tracking systems, usability and ease of acquisition remain shortcomings that need to be addressed. Features such as automatic activity identification, low energy consumption, simplicity, and goals-achieved notifications provide a good user experience. Nevertheless, most of these functions require the acquisition of a relatively expensive fitness-tracking device. Smartphones provide a partial solution by giving users easy access to multiple fitness applications, which reduces the need to purchase another gadget. Nonetheless, improvements in the user experience are still necessary. On the other hand, wearable devices satisfy the usability requirement, but their cost is an impediment for some users. The system proposed in this research aims to address these issues by combining the benefits of mobile applications, such as feedback and ease of acquisition, with the usability that wearable devices provide, in a smartphone Android application. Data collected from a single user performing a series of common daily activities, namely walking, jogging, cycling, climbing stairs, and walking downstairs, were used to classify and automatically identify these activities with an overall accuracy of 91%, and to identify the stairs activities with an accuracy of 81%. Finally, the caloric expenditure, which we considered the most important metric for motivating a user to perform a physical activity, was estimated by following the oxygen consumption equations from the American College of Sports Medicine (ACSM).
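
As a rough illustration of the last step in the 2016 abstract, the sketch below applies the commonly cited ACSM metabolic equations for walking and running and converts oxygen consumption to kilocalories. It is not taken from the thesis: the function names, the 5 kcal per litre of oxygen conversion factor, and the restriction to walking and running are assumptions made for this example, and the abstract does not say how cycling or stairs activities were handled.

```python
# Sketch of ACSM-style caloric expenditure estimation (illustrative only).
# Units: speed in metres per minute, grade as a fraction (0.05 = 5%),
# VO2 in millilitres of oxygen per kilogram of body mass per minute.

RESTING_VO2 = 3.5  # ml/kg/min, the resting component (1 MET)


def walking_vo2(speed_m_per_min, grade=0.0):
    """ACSM walking equation: horizontal + vertical + resting components."""
    return 0.1 * speed_m_per_min + 1.8 * speed_m_per_min * grade + RESTING_VO2


def running_vo2(speed_m_per_min, grade=0.0):
    """ACSM running equation: horizontal + vertical + resting components."""
    return 0.2 * speed_m_per_min + 0.9 * speed_m_per_min * grade + RESTING_VO2


def kcal_per_minute(vo2_ml_kg_min, body_mass_kg, kcal_per_litre_o2=5.0):
    """Convert relative VO2 to kcal/min.

    Assumes roughly 5 kcal are expended per litre of oxygen consumed,
    a standard approximation rather than a value from the thesis.
    """
    litres_o2_per_min = vo2_ml_kg_min * body_mass_kg / 1000.0
    return litres_o2_per_min * kcal_per_litre_o2


def activity_kcal(activity, speed_m_per_min, body_mass_kg, minutes, grade=0.0):
    """Estimate total kcal for a classified activity (hypothetical helper)."""
    equations = {"walking": walking_vo2, "jogging": running_vo2}
    if activity not in equations:
        raise ValueError(f"no equation sketched for activity: {activity}")
    vo2 = equations[activity](speed_m_per_min, grade)
    return kcal_per_minute(vo2, body_mass_kg) * minutes


if __name__ == "__main__":
    # A 70 kg user walking on level ground at 80 m/min for 30 minutes.
    print(activity_kcal("walking", 80.0, 70.0, 30.0))
```

Under these assumptions, a 70 kg user walking on level ground at 80 m/min for 30 minutes would be estimated at about 121 kcal. In a system like the one described, the activity label would come from the classifier and the speed from the phone's sensors.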