Computing and Information Systems - Theses

  • Item
    Regularization methods for neural networks and related models
    Demyanov, Sergey (2015)
    Neural networks have become very popular in the last few years. They have demonstrated the best results in image classification, image segmentation, speech recognition, and text processing. The major breakthrough happened in the early 2010s, when it became feasible to train deep neural networks (DNNs) on a GPU, which made the training process several hundred times faster. At the same time, large labeled datasets with millions of objects, such as ImageNet, became available. The GPU implementation of a convolutional DNN with over 10 layers and millions of parameters could handle the ImageNet dataset in just a few days. As a result, such networks were able to reduce the classification error in the ILSVRC-2010 image classification competition by 40% compared with algorithms based on hand-crafted features. Deep neural networks demonstrate excellent results on tasks with a complex classification function and a sufficient amount of training data. However, since DNN models have a huge number of parameters, they can also be easily overfitted when the amount of training data is not large enough. Regularization techniques for neural networks are therefore crucially important for making them applicable to a wide range of problems. In this thesis we provide a comprehensive overview of existing regularization techniques for neural networks, together with their theoretical explanation.

    Training of neural networks is performed using the backpropagation (BP) algorithm. Standard BP has two passes: forward and backward. It computes the predictions for the current input and the loss function in the forward pass, and the derivatives of the loss function with respect to the input and the weights in the backward pass. The nature of the data usually implies that two very close data points have the same label, which means that the predictions of a classifier should not change quickly near the points of the dataset. We propose a natural extension of the backpropagation algorithm that minimizes the length of the vector of derivatives of the loss function with respect to the input values, and demonstrate that this algorithm improves the accuracy of the trained classifier (a minimal sketch of this idea is given below).

    The proposed invariant backpropagation algorithm requires an additional hyperparameter that defines the strength of regularization and therefore controls the flexibility of the classifier. In order to achieve the best results, the initial value of this parameter needs to be carefully chosen. Usually such a hyperparameter is chosen using a validation set or cross-validation; however, these methods might not be accurate and can be slow. We propose a method of choosing the parameter that affects a classifier's flexibility and demonstrate its performance on Support Vector Machines. This method is based on the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), and uses the disposition of misclassified objects and the VC-dimension of the classifier (a generic information-criterion sketch follows this abstract).

    In some tasks, data consists of feature values as well as additional information about feature location in one or more dimensions, usually space and time. For example, image pixels are described by their coordinates along the horizontal and vertical axes, and time series necessarily carry information about when their elements were recorded. This information can be used by a classifier, and some regularizers are specifically designed for this purpose, restricting a classifier from learning an inappropriate model. We present an overview of such regularization methods, describe some of their applications, and report the results of their usage.
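As an illustration of the input-derivative penalty proposed above, the following is a minimal sketch of input-gradient regularization for a binary logistic-regression classifier. It is not the thesis's invariant backpropagation implementation: the simple model, the function names, and the default values (`train`, `lam`, learning rate, epoch count) are illustrative assumptions; only the idea of penalizing the length of dL/dx comes from the abstract.

```python
# A minimal sketch (not the thesis's exact invariant backpropagation algorithm):
# binary logistic regression trained with an extra penalty on ||dL/dx||, the
# length of the vector of derivatives of the loss with respect to the input.
# The penalty weight `lam` is the regularization-strength hyperparameter; all
# names and default values here are illustrative assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, lam=0.1, lr=0.1, epochs=200):
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)            # forward pass: predictions
        err = p - y                       # dL/dz for the cross-entropy loss
        # backward pass: gradients of the data loss w.r.t. the weights
        gw = X.T @ err / n
        gb = err.mean()
        # penalty R = (lam/2) * mean ||dL/dx||^2; for this model dL/dx = err * w,
        # so R = (lam/2) * mean(err^2) * ||w||^2, and its gradient is added below
        s = p * (1.0 - p)                 # derivative of the sigmoid
        gw += lam * (np.mean(err ** 2) * w
                     + np.dot(w, w) * (X.T @ (err * s)) / n)
        gb += lam * np.dot(w, w) * np.mean(err * s)
        w -= lr * gw
        b -= lr * gb
    return w, b
```

For this simple model the penalty and its gradient can be written in closed form; for deep networks the abstract's proposed extension computes the analogous quantity within the backpropagation procedure itself.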
    Video is one of the domains where one has to consider time. One of the challenging tasks in this domain is deception detection from visual cues. The psychology literature indicates that people are unable to detect deception with high accuracy, performing only slightly better than random guessing. At the same time, trained individuals were shown to be able to detect liars with an accuracy of up to 73% (Ekman & O'Sullivan, 1991; Ekman et al., 1999). This result confirms that the visual and audio channels contain enough information to detect deception. In this thesis we describe an automated multilevel system of video processing and feature engineering based on facial movements. We demonstrate that the extracted features provide a classification accuracy that is statistically significantly better than a random guess. Another contribution of this thesis is the collection of one of the largest datasets of videos of truthful and deceptive people, recorded in more natural conditions than other existing datasets.
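The hyperparameter-selection contribution mentioned earlier in this abstract builds on AIC and BIC. Below is a generic sketch of information-criterion model selection using only the standard definitions AIC = 2k - 2 ln(L_hat) and BIC = k ln(n) - 2 ln(L_hat); the thesis's actual method additionally uses the disposition of misclassified objects and the classifier's VC-dimension, which this sketch does not capture, and all function names are illustrative.

```python
# Generic information-criterion model selection (illustrative only; the
# thesis's method for SVMs is more specific than this standard recipe).
import numpy as np

def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln(L_hat)."""
    return 2.0 * k - 2.0 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L_hat)."""
    return k * np.log(n) - 2.0 * log_likelihood

def select_model(candidates, n, use_bic=True):
    """candidates: iterable of (name, log_likelihood, n_params) tuples.
    Returns the candidate with the lowest criterion value."""
    if use_bic:
        return min(candidates, key=lambda c: bic(c[1], c[2], n))
    return min(candidates, key=lambda c: aic(c[1], c[2]))
```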
  • Item
    A generic framework for the simulation of biologically plausible spiking neural networks on graphics processors
    Abi-Samra, Jad (2011)
    The structure and functionality of the brain have been ardently investigated, as the implications of such research may aid in the treatment and diagnosis of mental diseases. This has led to a growing interest in numerical simulation tools that can model its network complexity, in order to achieve a greater understanding of the underlying processes of this complex biological system. The computational requirements of neural modeling make high-performance multi-core systems a desirable architecture when simulating large-scale networks. Graphics processing units (GPUs) are an inexpensive, power-efficient supercomputing alternative for solving compute-intensive scientific applications. However, the irregular communication and execution patterns in realistic spiking neural networks pose a challenge to their implementation on these massively data-parallel devices. In this work, we propose a generic framework for simulating large-scale spiking neural networks with biologically realistic connectivity on GPUs. We provide an extensive list of optimization techniques and strategies which target the main issues involved with neural simulation on these devices, such as optimal access patterns, synaptic referencing, current aggregation, firing representation, and task distribution. We succeed in building a GPU-based simulator that preserves the flexibility, accuracy, and biological plausibility of neural simulation, while providing high performance and efficient memory usage. Overall, our implementation achieves speedups of around 35-84 times on a single graphics card over an optimized CPU implementation based on the SPIKESIM simulator. We also provide a comparison with other GPU neural simulators related to this work. Following that, we analyze the communication aspects of migrating the system onto a multi-GPU cluster. This is done in an attempt to quantitatively determine the implications of communication overhead for large-scale neural simulation when employing distributed clusters of GPU devices. We describe a model to determine the dependency cost that arises from partitioning a neural network across the different components of the distributed system. We also discuss various techniques for minimizing the overhead resulting from frequent messaging and global synchronization. Finally, we provide a theoretical analysis of the suggested communication model in relation to computational and overall performance, as well as a discussion of the relevance of the work.
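To make the per-time-step, data-parallel work described in this abstract concrete, the following is a minimal NumPy sketch of one update step of a leaky integrate-and-fire network with sparse connectivity: synaptic currents are aggregated per postsynaptic neuron, membrane potentials are integrated, and firing is represented as a boolean vector. It is an illustrative CPU sketch of the operations a GPU simulator maps onto device kernels, not the thesis's SPIKESIM-derived implementation; all parameter values and names are assumptions.

```python
# Illustrative sketch of one simulation step of a leaky integrate-and-fire
# network; the per-synapse and per-neuron updates shown here are the kind of
# data-parallel work a GPU simulator distributes across threads. Not the
# thesis's implementation; parameter values and names are assumptions.
import numpy as np

def lif_step(v, spiked, pre, post, weight, i_ext,
             dt=1.0, tau=20.0, v_rest=-65.0, v_reset=-65.0, v_thresh=-50.0):
    """Advance all neurons by one time step; return (potentials, spike flags)."""
    n = v.shape[0]
    # current aggregation: sum the weights of synapses whose presynaptic
    # neuron fired in the previous step, accumulated per postsynaptic neuron
    active = spiked[pre]                      # boolean mask over synapses
    i_syn = np.zeros(n)
    np.add.at(i_syn, post[active], weight[active])
    # leaky integration of the membrane potential
    v = v + dt * (-(v - v_rest) + i_syn + i_ext) / tau
    # firing representation: boolean vector of neurons that crossed threshold
    spiked = v >= v_thresh
    v = np.where(spiked, v_reset, v)
    return v, spiked

# Example: 1,000 neurons, 50,000 random synapses, 100 time steps
rng = np.random.default_rng(0)
n_neurons, n_syn = 1_000, 50_000
pre = rng.integers(0, n_neurons, n_syn)
post = rng.integers(0, n_neurons, n_syn)
weight = rng.normal(0.5, 0.1, n_syn)
v = np.full(n_neurons, -65.0)
spiked = np.zeros(n_neurons, dtype=bool)
for _ in range(100):
    v, spiked = lif_step(v, spiked, pre, post, weight,
                         i_ext=rng.normal(20.0, 5.0, n_neurons))
```

On a GPU, the synapse loop implied by the current-aggregation step and the per-neuron membrane update become the main data-parallel kernels, which is where the access-pattern and task-distribution optimizations discussed in the abstract apply.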