Computing and Information Systems - Theses

  • Item
    Scaling learning algorithms using locality sensitive hashing
    Aye, Zay Maung Maung (2018)
    With increasing volumes of data, it is imperative that data analysis scales appropriately. However, many common machine learning algorithms, e.g., metric learning, manifold landmark learning, and trajectory processing, suffer from poor computational complexity in the size of the training data. In this thesis, we propose generic methods for scaling up learning algorithms using locality sensitive hashing. First, we propose finding representative samples via locality sensitive hashing, and demonstrate the usefulness of these samples for large-scale supervised metric learning. Our methods achieve a quadratic speedup with only a minimal decrease in accuracy. Second, representative samples are leveraged for adaptive minibatch selection when fitting Gaussian processes for landmarking manifolds. Our methods exploit the compatibility between locality sensitive hashing and the manifold assumption in high-dimensional data, thereby limiting expensive optimization to relevant regions of the data. Training the state-of-the-art learner with our compressed dataset achieves superior accuracy compared to training with randomly selected samples. We also demonstrate that our methods can find manifold landmarks without learning Gaussian processes at all, which leads to an orders-of-magnitude speedup with only a minimal decrease in accuracy. Finally, we propose locality sensitive hashing based feature hashing methods that map variable-length trajectories to constant-length representations for efficient similarity computation in Euclidean space. Our methods accelerate trajectory clustering while achieving accuracy competitive with clustering under more complicated distance functions, such as Dynamic Time Warping.
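    The core building block described above, selecting representative samples by hashing nearby points into the same bucket, can be illustrated with a minimal sketch. This is not the thesis's actual method; it uses standard random-hyperplane (sign-of-projection) LSH, and the function names and parameters (`lsh_signatures`, `representatives`, `n_bits`) are illustrative choices. Each point gets a short binary signature, and one point per occupied bucket is kept as a representative, compressing the dataset before running an expensive learner.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def lsh_signatures(X, n_bits=8):
        """Random-hyperplane LSH: each bit is the sign of a random projection,
        so nearby points (small angle) tend to share a signature."""
        planes = rng.standard_normal((X.shape[1], n_bits))
        return (X @ planes > 0).astype(int)

    def representatives(X, n_bits=8):
        """Keep the first sample seen in each hash bucket as its representative."""
        sigs = lsh_signatures(X, n_bits)
        buckets = {}
        for i, s in enumerate(sigs):
            buckets.setdefault(tuple(s), i)  # one index per occupied bucket
        return np.array(sorted(buckets.values()))

    X = rng.standard_normal((1000, 20))   # toy dataset: 1000 points in 20-D
    reps = representatives(X, n_bits=8)   # at most 2**8 = 256 representatives
    print(len(reps))
    ```

    Hashing costs O(nd) per point set, so the representative subset is found in a single pass, after which the quadratic-or-worse learner only sees the much smaller compressed dataset.
    
    
    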