Computing and Information Systems - Research Publications

Permanent URI for this collection

http://hdl.handle.net/11343/350

A Transferable Technique for Detecting and Localising Segments of Repeating Patterns in Time series

Mirmomeni, M ; Kulik, L ; Bailey, J (IEEE, 2021)

In time series data, consecutively repeated patterns occur in many applications, including activity recognition from wearable sensors. Repeating patterns may vary over time and appear in various shapes and sizes, which makes their detection a challenging problem. We develop a novel technique, RP-Mask, that can detect and localise segments of consecutively repeated patterns without prior knowledge of the shape and length of the repeats. Our technique represents time series using recurrence plots (RP), a method for visualising repetition in time series. We identify two key features of recurrence plots: checkerboard patterns, and vertical/horizontal lines marking the start and end of checkerboard patterns. We use object recognition on RP images to detect and localise the checkerboard patterns, which are mapped to the segments of consecutively repeating patterns on the underlying time series. Since the collection and labeling of a real-world dataset that exhibits all possible variations of a repetition is prohibitive, we demonstrate that our model is able to learn effectively from synthetically curated data and perform equally effectively on a real-world dataset, while remaining noise tolerant. We compare our method to a number of state-of-the-art techniques and show that it outperforms them both when trained on real activity recognition data and when trained on synthetic data.

An effective and versatile distance measure for spatiotemporal trajectories

Naderivesal, S ; Kulik, L ; Bailey, J (SPRINGER, 2019-05)

The analysis of large-scale trajectory data has tremendous benefits for applications ranging from transportation planning to traffic management. A fundamental building block for the analysis of such data is the computation of similarity between trajectories. Existing work on similarity computation focuses mainly on the spatial aspects of trajectories and rarely takes time into account in conjunction with space. A key challenge when considering time is how to handle trajectories that are sampled asynchronously or at variable rates, which can lead to uncertainty. To tackle this problem, we quantify trajectory similarity as an interval, rather than a single value, to capture the uncertainty that can result from different sampling rates and asynchronous sampling. Based on this perspective, we develop a new trajectory similarity measure, Trajectory Interval Distance Estimation, which models similarity computation as a convex optimisation problem. Using two real datasets, we demonstrate that our proposed measure is extremely effective for assessing similarity in comparison to existing state-of-the-art measures.

Privacy- and context-aware release of trajectory data

Naghizade, E ; Kulik, L ; Tanin, E ; Bailey, J (ACM, 2020-03)

The availability of large-scale spatio-temporal datasets, along with advancements in analytical models and tools, has created a unique opportunity to generate valuable insights into managing key areas of society, from transportation and urban planning to epidemiology and natural disaster management. This has encouraged the practice of releasing/publishing trajectory datasets among data owners. However, an ill-informed publication of such rich datasets may have serious privacy implications for individuals. Balancing privacy and utility, as a major goal in the data exchange process, is challenging due to the richness of spatio-temporal datasets. In this article, we focus on an individual's stops as the most sensitive part of the trajectory and aim to preserve them through spatio-temporal perturbation. We model a trajectory as a sequence of stops and moves and propose an efficient algorithm that either substitutes sensitive stop points of a trajectory with moves from the same trajectory or introduces a minimal detour if no safe Point of Interest (POI) can be found on the same route. This limits the amount of unnecessary distortion, since the footprint of the original trajectory is preserved as much as possible. Our experiments show that our method balances user privacy and data utility: it protects privacy by preventing an adversary from making inferences about sensitive stops while maintaining a high level of similarity to the original dataset.

An automated matrix profile for mining consecutive repeats in time series

Mirmomeni, M ; Kowsar, Y ; Kulik, L ; Bailey, J ; Geng, X ; Kang, BH (Springer Nature, 2018-01-01)

A key application of wearable sensors is remote patient monitoring, which enables clinicians to observe patients non-invasively by examining the time series of sensor readings. For analysis of such time series, a recently proposed technique is Matrix Profile (MP). While effective for certain time series mining tasks, MP depends on a key input parameter, the length of subsequences for which to search. We demonstrate that MP’s dependency on this input parameter impacts its effectiveness for finding patterns of interest. We focus on finding consecutive repeating patterns (CRPs), which represent human activities and exercises while being tracked using wearable sensors. We demonstrate that MP cannot detect CRPs effectively and extend it by adding a locality preserving index. Our method automates the use of MP and reduces the need for data labeling by experts. We demonstrate our algorithm’s effectiveness in detecting regions of CRPs through a number of real and synthetic datasets.

Characteristics of Local Intrinsic Dimensionality (LID) in Subspaces: Local Neighbourhood Analysis

Hashem, T ; Rashidi, L ; Bailey, J ; Kulik, L ; Amato, G ; Gennaro, C ; Oria, V ; Radovanovic, M (Springer, 2019-01-01)

The local intrinsic dimensionality (LID) model enables assessment of the complexity of the local neighbourhood around a specific query object of interest. In this paper, we study variations in the LID of a query, with respect to different subspaces and local neighbourhoods. We illustrate the surprising phenomenon of how the LID of a query can substantially decrease as further features are included in a dataset. We identify the role of two key feature properties in influencing the LID for feature combinations: correlation and dominance. Our investigation provides new insights into the impact of different feature combinations on local regions of the data.

PRESS: A personalised approach for mining top-k groups of objects with subspace similarity

Hashem, T ; Rashidi, L ; Kulik, L ; Bailey, J (Elsevier, 2020-07)

Personalised analytics is a powerful technology that can be used to improve the career, lifestyle, and health of individuals by providing them with an in-depth analysis of their characteristics as compared to other people. Existing research has often focused on mining general patterns or clusters, but without the facility for customisation to an individual's needs. It is challenging to adapt such approaches to the personalised case, due to the high computational overhead they require for discovering patterns that are good across an entire dataset, rather than with respect to an individual. In this paper, we tackle the challenge of personalised pattern mining and propose a query-driven approach to mine objects with subspace similarity. Given a query object in a categorical dataset, our proposed algorithm, PRESS (Personalised Subspace Similarity), determines the top-k groups of objects, where each group has high similarity to the query for some particular subspace. We evaluate the efficiency and effectiveness of our approach on both synthetic and real datasets.

Automatically recognizing places of interest from unreliable GPS data using spatio-temporal density estimation and line intersections

Bhattacharya, T ; Kulik, L ; Bailey, J (Elsevier, 2015-05)

Stay points are important for recognizing significant places from a mobile user’s GPS trajectory. Such places are often located indoors and in urban canyons, where GPS is unreliable. Consequently, mapping a user’s stay point to a Place of Interest (POI) using only GPS data is particularly challenging. Our novel algorithm employs both spatio-temporal density estimation and line count inference to predict and rank a user’s POI(s) at building-level accuracy from noisy time-annotated GPS data points. An experimental study demonstrates the superiority of our algorithm over several baseline approaches, with a recall of 96.5% for the top 5 retrieved locations.
