Electrical and Electronic Engineering - Research Publications

Permanent URI for this collection

http://hdl.handle.net/11343/362

Search Results

Real value solvent accessibility prediction using adaptive support vector regression

Gubbi, J ; Shilton, A ; Palaniswami, M ; Parker, M (IEEE, 2007)
A monomial ν-SV method for regression

SHILTON, ALISTAIR ; Lai, Daniel ; PALANISWAMI, MARIMUTHU (2007)

In the present paper we describe a new formulation for Support Vector regression (SVR), namely monomial ν-SVR. Like the standard ν-SVR, the monomial ν-SVR method automatically adjusts the radius of insensitivity (the tube width, epsilon) to suit the training data. However, by replacing Vapnik’s epsilon-insensitive cost with a more general monomial epsilon-insensitive cost (and likewise replacing the linear tube shrinking term with a monomial tube shrinking term), the performance of the monomial ν-SVR is improved for data corrupted by a wider range of noise distributions. We focus on the quadric form of monomial ν-SVR and show that the dual form of this is simpler than the standard ν-SVR. We show that, like Suykens’ Least-Squares SVR (LS-SVR) method (and unlike standard ν-SVR), the quadric ν-SVR dual has a unique global solution. Comparisons are made between the asymptotic efficiency of our method and that of standard ν-SVR and LS-SVR which demonstrate the superiority of our method for the special case of higher order polynomial noise. These theoretical predictions are validated using experimental comparisons with the alternative approaches of standard ν-SVR, LS-SVR and weighted LS-SVR.
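
The difference between Vapnik's epsilon-insensitive cost and a monomial one can be sketched in a few lines. This is a minimal illustration, not the authors' formulation: the function name, the 1/q scaling and the parameter names `eps` and `q` are assumptions made here for clarity.

```python
def monomial_eps_insensitive(residual: float, eps: float, q: int = 2) -> float:
    """Cost of one residual under a monomial epsilon-insensitive loss.

    Assumed form (illustrative only): max(0, |r| - eps)**q / q.
    q = 1 gives Vapnik's linear epsilon-insensitive cost,
    q = 2 gives a quadric (squared) variant.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if q < 1:
        raise ValueError("q must be at least 1")
    # Residuals inside the tube of half-width eps cost nothing.
    excess = max(0.0, abs(residual) - eps)
    return excess ** q / q


inside_tube = monomial_eps_insensitive(0.05, eps=0.1)       # inside the tube
linear_cost = monomial_eps_insensitive(0.6, eps=0.1, q=1)   # Vapnik-style
quadric_cost = monomial_eps_insensitive(0.6, eps=0.1, q=2)  # quadric variant
```

With q = 2 a residual just outside the tube is penalised less than under the linear cost, while a residual far outside it is penalised more.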
Protein topology classification using two-stage support vector machines.

Gubbi, J ; Shilton, A ; Parker, M ; Palaniswami, M (Universal Academy Press, 2006)

The determination of the first 3-D model of a protein from its sequence alone is a non-trivial problem. The first 3-D model is the key to the molecular replacement method of solving the phase problem in x-ray crystallography. If the sequence identity is more than 30%, homology modelling can be used to determine the correct topology (as defined by CATH) or fold (as defined by SCOP). If the sequence identity is less than 25%, however, the task is very challenging. In this paper we address the topology classification of proteins with sequence identity of less than 25%. The input information to the system is the amino acid sequence, the predicted secondary structure and the predicted real value relative solvent accessibility. A two-stage support vector machine (SVM) approach is proposed for classifying the sequences into three different structural classes (alpha, beta, alpha+beta) in the first stage and 39 topologies in the second stage. The method is evaluated using a newly curated dataset from CATH with maximum pairwise sequence identity less than 25%. Overall accuracies of 87.44% and 83.15% are reported for class and topology prediction, respectively. In the class prediction stage, a sensitivity of 0.77 and a specificity of 0.91 are obtained. Data file, SVM implementation (SVMHEAVY) and result files can be downloaded from http://www.ee.unimelb.edu.au/ISSNIP/downloads/.
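
The two-stage idea can be sketched as a simple dispatch: a first model picks the structural class, and a second model chosen by that class picks the topology. This is only a sketch under stated assumptions. The abstract does not say how the second stage is organised, so the per-class topology models, the function name `two_stage_predict` and the stub classifiers below are all hypothetical stand-ins for trained SVMs.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Model = Callable[[Features], str]


def two_stage_predict(
    features: Features,
    class_model: Model,
    topology_models: Dict[str, Model],
) -> Tuple[str, str]:
    """Predict a structural class, then a topology within that class."""
    structural_class = class_model(features)
    # Assumption: one second-stage model per structural class.
    topology = topology_models[structural_class](features)
    return structural_class, topology


# Hypothetical stubs standing in for trained classifiers.
def class_stub(x: Features) -> str:
    return "alpha" if x[0] > 0.5 else "beta"


topology_stubs: Dict[str, Model] = {
    "alpha": lambda x: "topology-A1",
    "beta": lambda x: "topology-B1",
}

result = two_stage_predict([0.9, 0.1], class_stub, topology_stubs)
```

One consequence of this design is that a first-stage error cannot be corrected by the second stage, so topology accuracy is bounded by class accuracy.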
Stability Analysis of the Decomposition Method for solving Support Vector Machines

Lai, Daniel ; SHILTON, ALISTAIR ; Mani, N. ; PALANISWAMI, MARIMUTHU (2005)

In situations where processing memory is limited, the Support Vector Machine quadratic program can be decomposed into smaller sub-problems and solved sequentially. The convergence of this method has been proven previously through the use of a counting method. In this initial investigation, we approach the convergence analysis by treating the decomposed sub-problems as subsystems of a general system. The gradients of the sub-problems and the inequality constraints are explicitly modelled as system variables. The change in these variables during optimization forms a dynamic system modelled by vector differential equations. We show that the change in the objective function can be written as the energy in the system. This makes it a natural Lyapunov function which has an asymptotically stable point at the origin. The asymptotic stability of the whole system then follows under certain assumptions.
Disulphide Bridge Prediction using Fuzzy Support Vector Machines

Jayavardhana, Rama G. L. ; SHILTON, ALISTAIR ; PARKER, MICHAEL ; PALANISWAMI, MARIMUTHU (2005)

One of the major contributors to the native form of a protein is cysteines forming covalent bonds in the oxidized state. The prediction of such bridges from the sequence is a very challenging task, given that the number of possible bridges rises exponentially as the number of cysteines increases. We propose a novel technique for disulphide bridge prediction based on Fuzzy Support Vector Machines. We call the system DIzzy. In our investigation, we look at disulphide bond connectivity given two cysteines, with and without a priori knowledge of the bonding state. We make use of a new encoding scheme based on physico-chemical properties and statistical features, such as the probability of occurrence of each amino acid in different secondary structure states, along with PSI-BLAST profiles. The performance is compared with normal support vector machines. We evaluate our method and compare it with the existing method using the SPX dataset.
Prediction of cystine connectivity using SVM

Rama, JGL ; Shilton, AP ; Parker, MM ; Palaniswami, M (BIOMEDICAL INFORMATICS, 2005)

One of the major contributors to protein structures is the formation of disulphide bonds between selected pairs of cysteines in the oxidized state. Prediction of such disulphide bridges from sequence is challenging, given that the number of possible combinations of cysteine pairs grows rapidly as the number of cysteines in a protein increases. Here, we describe an SVM (support vector machine) model for the prediction of cystine connectivity in a protein sequence with and without a priori knowledge of the bonding state. We make use of a new encoding scheme based on physico-chemical properties and statistical features (probability of occurrence of each amino acid residue in different secondary structure states along with PSI-BLAST profiles). We evaluate our method on the SPX dataset (an extended dataset of SP39 (Swiss-Prot 39) and SP41 (Swiss-Prot 41) with known disulphide information from PDB) and compare our results with the recursive neural network model described for the same dataset.
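
To give a sense of how quickly the search space grows, the sketch below counts the ways of pairing up a set of cysteines when every one of them takes part in exactly one bridge. That full-bonding assumption and the function name `connectivity_patterns` are introduced here for illustration and are not taken from the paper. Under that assumption, 2B cysteines can be paired in (2B-1)!! = 1 x 3 x ... x (2B-1) ways.

```python
def connectivity_patterns(n_bridges: int) -> int:
    """Count the pairings of 2 * n_bridges cysteines into n_bridges bridges.

    Assumes every cysteine is bonded to exactly one partner, so the
    count is the double factorial (2 * n_bridges - 1)!!.
    """
    if n_bridges < 0:
        raise ValueError("n_bridges must be non-negative")
    count = 1
    # Multiply the odd numbers 1, 3, ..., 2 * n_bridges - 1.
    for odd in range(1, 2 * n_bridges, 2):
        count *= odd
    return count


for bridges in (2, 3, 4, 5):
    print(bridges, connectivity_patterns(bridges))
```

Going from two bridges to five takes the count from 3 to 945, which is why exhaustive enumeration becomes unattractive for cysteine-rich proteins.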
Distributed data fusion using support vector machines

Challa, S. ; Palaniswami, M. ; Shilton, A. (2002)

The basic quantity to be estimated in the Bayesian approach to data fusion is the conditional probability density function (CPDF). In recent times, computationally efficient particle filtering approaches have been gaining importance in estimating these CPDFs. In this approach, i.i.d. samples are used to represent the conditional probability densities. However, their application in data fusion is severely limited because the information is stored in the form of a large set of samples. In all practical data fusion systems that have limited communication bandwidth, broadcasting this probabilistic information, available as a set of samples, to the fusion center is impractical. Support vector machines, through statistical learning theory, provide a way of compressing information by generating optimal kernel-based representations. In this paper we use SVMs to compress the probabilistic information available in the form of i.i.d. samples and apply the result to solve the Bayesian data fusion problem. We demonstrate this technique on a multi-sensor tracking example.
Machine learning using support vector machines

Palaniswami, M. ; Shilton, A. ; Ralph, D. ; Owen, B. D. (2000)

Machine learning invokes the imagination of many scientific minds due to its potential to solve complex and difficult real world problems. This paper gives methods of constructing machine learning tools using Support Vector Machines (SVMs). We first give a simple example to illustrate the basic concept and then demonstrate further with a practical problem. The practical problem is concerned with electronic monitoring of fishways for automatic counting of different fish species for the purpose of environmental management in Australian rivers. The results illustrate the power of the SVM approaches on the sample problem and their computational attractiveness for practical implementations.
A convergence rate estimate for the SVM decomposition method

Lai, D. ; Shilton, A. ; Palaniswami, M. (2005)

The training of Support Vector Machines using the decomposition method has one drawback, namely the selection of working sets such that convergence is as fast as possible. It has been shown by Lin that the rate is linear in the worst case under the assumption that all bounded Support Vectors have been determined. The analysis was based on the change in the objective function and assumed an SVMlight selection rule. However, the rate estimate given is independent of time and hence gives little indication as to how the linear convergence speed varies during the iteration. In this initial analysis, we provide a treatment of the convergence from a gradient contraction perspective. We propose a necessary and sufficient condition which, when satisfied, provides strict linear convergence of the algorithm. The condition can also be interpreted as a basic requirement for a sequence of working sets in order to achieve such a convergence rate. Based on this condition, a time-dependent rate estimate is then derived. This estimate is shown to monotonically approach unity from below.
Incremental training of support vector machines

Shilton, A ; Palaniswami, M ; Ralph, D ; Tsoi, AC (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2005-01)

We propose a new algorithm for the incremental training of support vector machines (SVMs) that is suitable for problems of sequentially arriving data and fast constraint parameter variation. Our method involves using a "warm-start" algorithm for the training of SVMs, which allows us to take advantage of the natural incremental properties of the standard active set approach to linearly constrained optimization problems. Incremental training involves quickly retraining a support vector machine after adding a small number of additional training vectors to the training set of an existing (trained) support vector machine. Similarly, the problem of fast constraint parameter variation involves quickly retraining an existing support vector machine using the same training set but different constraint parameters. In both cases, we demonstrate the computational superiority of incremental training over the usual batch retraining method.
