Scaling learning algorithms using locality sensitive hashing
AuthorAye, Zay Maung Maung
AffiliationComputing and Information Systems
MetadataShow full item record
Document TypePhD thesis
Access StatusOpen Access
© 2018 Dr. Zay Maung Maung Aye
With increasing volumes of data, it is imperative that data analysis can appropriately scale. However, many common machine learning algorithms, e.g., metric learning, manifold landmark learning, and processing trajectories, suffer poor computational complexity in the size of training data. In this thesis, we propose generic methods for scaling up learning algorithms by utilizing locality sensitive hashing. First, finding representative samples utilizing locality sensitive hashing is proposed. The usefulness of these samples is demonstrated on large-scale supervised metric learning. Our methods achieve quadratic speed up with only minimal decrease in accuracy. Second, representative samples are leveraged for adaptive minibatch selection for fitting Gaussian processes for landmarking manifolds. Our methods exploit the compatibility of locality sensitive hashing and the manifold assumption in high-dimensional data, thereby limiting expensive optimization to relevant regions of the data. Training the state-of-the-art learner with our compressed dataset achieves superior accuracy compared to training with randomly selected samples. We also demonstrate that our methods can be used to find manifold landmarks without learning Gaussian processes at all, which leads to orders-of-magnitude speed up with only minimal decrease in accuracy. And finally, we propose locality sensitive hashing based feature hashing methods which map variable length trajectories to constant length trajectories for efficient similarity computation in Euclidean space. Our methods can accelerate trajectory clustering while achieving competitive accuracy in comparison to clustering using more complicated distance function, such as Dynamic Time Warping.
Keywordshashing; machine learning; LSH; DBH; metric learning; clustering; scalable learning; locality sensitive hashing
- Click on "Export Reference in RIS Format" and choose "open with... Endnote".
- Click on "Export Reference in RIS Format". Login to Refworks, go to References => Import References