Show simple item record

dc.contributor.author  Demyanov, Sergey
dc.date.accessioned  2015-12-21T00:16:14Z
dc.date.available  2015-12-21T00:16:14Z
dc.date.issued  2015
dc.identifier.uri  http://hdl.handle.net/11343/57198
dc.description  © 2015 Dr. Sergey Demyanov
dc.description.abstract  Neural networks have become very popular in the last few years. They have demonstrated the best results in the areas of image classification, image segmentation, speech recognition, and text processing. The major breakthrough happened in the early 2010s, when it became feasible to train deep neural networks (DNNs) on a GPU, which made the training process several hundred times faster. At the same time, large labeled datasets with millions of objects, such as ImageNet, became available. The GPU implementation of a convolutional DNN with over 10 layers and millions of parameters could handle the ImageNet dataset in just a few days. As a result, such networks could decrease the classification error in the ILSVRC-2010 image classification competition by 40% compared with hand-crafted feature algorithms. Deep neural networks demonstrate excellent results on tasks with a complex classification function and a sufficient amount of training data. However, since DNN models have a huge number of parameters, they can also easily overfit when the amount of training data is not large enough. Regularization techniques for neural networks are therefore crucially important for making them applicable to a wide range of problems. In this thesis we give a comprehensive overview of existing regularization techniques for neural networks, together with their theoretical explanation. Training of neural networks is performed using the backpropagation (BP) algorithm. Standard BP makes two passes: in the forward pass it computes the predictions for the current input and the value of the loss function; in the backward pass it computes the derivatives of the loss function with respect to the inputs and the weights. The nature of the data usually implies that two very close data points have the same label, so the predictions of a classifier should not change quickly near the points of a dataset.
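To make the relevant quantity concrete, the sketch below computes the derivative of the loss with respect to the input for a logistic-regression model in closed form, and verifies it against finite differences. This is a minimal illustration of the general idea described above, not code from the thesis; the model, variable names, and data are invented for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b, x, y):
    """Cross-entropy loss of a logistic model -- the 'forward pass'."""
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def input_gradient(w, b, x, y):
    """dL/dx = (p - y) * w -- the input derivative available on the 'backward pass'.
    A penalty on the squared norm of this vector keeps predictions from
    changing quickly near the data points."""
    p = sigmoid(w @ x + b)
    return (p - y) * w

rng = np.random.default_rng(0)
w, b = rng.normal(size=3), 0.1
x, y = rng.normal(size=3), 1.0

g = input_gradient(w, b, x, y)

# Finite-difference check of the analytic input gradient.
eps = 1e-6
g_fd = np.array([
    (loss(w, b, x + eps * e, y) - loss(w, b, x - eps * e, y)) / (2 * eps)
    for e in np.eye(3)
])
print(np.allclose(g, g_fd, atol=1e-6))  # True
```

Adding lambda * ||dL/dx||^2 to the training objective (with lambda a regularization-strength hyperparameter) then penalizes classifiers whose predictions vary rapidly near the training points.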
We propose a natural extension of the backpropagation algorithm that additionally minimizes the length of the vector of derivatives of the loss function with respect to the input values, and demonstrate that it improves the accuracy of the trained classifier. The proposed invariant backpropagation algorithm requires an additional hyperparameter that defines the strength of regularization and therefore controls the flexibility of the classifier. To achieve the best results, the value of this hyperparameter must be carefully chosen. It is usually selected using a validation set or cross-validation; however, these methods can be inaccurate and slow. We propose a method for choosing the parameter that controls classifier flexibility and demonstrate its performance on support vector machines. The method is based on the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), and uses the disposition of misclassified objects and the VC dimension of the classifier. In some tasks, the data consist of feature values together with additional information about the location of the features in one or more dimensions, usually space or time. For example, image pixels are described by their coordinates on the horizontal and vertical axes, and time series necessarily record when their elements were observed. This information can be exploited by a classifier, and some regularizers are particularly targeted at this goal, restricting a classifier from learning an inappropriate model. We present an overview of such regularization methods, describe some of their applications, and report the results of their usage. Video is one of the domains where time must be considered. One of the challenging tasks in this domain is deception detection from visual cues. The psychology literature indicates that ordinary people are unable to detect deception reliably: their accuracy is only slightly better than a random guess.
At the same time, trained individuals have been shown to detect liars with an accuracy of up to 73% (Ekman & O'Sullivan, 1991; Ekman, O'Sullivan & Frank, 1999). This result confirms that the visual and audio channels contain enough information to detect deception. In this thesis we describe an automated multilevel system of video processing and feature engineering based on facial movements. We demonstrate that the extracted features provide a classification accuracy that is statistically significantly better than a random guess. Another contribution of this thesis is the collection of one of the largest datasets of videos of truthful and deceptive people, recorded under more natural conditions than previous collections.  en_US
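The AIC and BIC criteria mentioned in the abstract score a model by its maximized log-likelihood penalized by its parameter count: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of parameters, n the sample size, and L the maximized likelihood. The sketch below applies them to a hypothetical polynomial-degree selection problem with Gaussian noise; it illustrates the criteria only, not the thesis method itself (which additionally uses the disposition of misclassified objects and the VC dimension).

```python
import numpy as np

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood with the MLE sigma^2 = RSS / n."""
    n = residuals.size
    sigma2 = np.mean(residuals ** 2)
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)

def aic(log_lik, k):
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    return k * np.log(n) - 2 * log_lik

# Hypothetical data: a degree-2 polynomial plus noise.
rng = np.random.default_rng(1)
n = 200
x = np.linspace(-1.0, 1.0, n)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(scale=0.1, size=n)

scores = {}
for degree in range(1, 7):
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    ll = gaussian_log_likelihood(residuals)
    k = degree + 2  # polynomial coefficients plus the noise variance
    scores[degree] = (aic(ll, k), bic(ll, k, n))

best_bic = min(scores, key=lambda d: scores[d][1])
print(best_bic)
```

Both criteria trade goodness of fit against model complexity; BIC penalizes extra parameters more heavily than AIC whenever ln n > 2, so it tends to pick the more parsimonious model on larger samples.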
dc.subject  regularization  en_US
dc.subject  neural networks  en_US
dc.subject  model selection  en_US
dc.subject  support vector machines  en_US
dc.subject  deception detection  en_US
dc.title  Regularization methods for neural networks and related models  en_US
dc.type  PhD thesis  en_US
melbourne.affiliation.department  Computing and Information Systems
melbourne.affiliation.faculty  Engineering
melbourne.contributor.author  Demyanov, Sergey
melbourne.accessrights  Open Access