Electrical and Electronic Engineering - Research Publications

  • Item
    Protein topology classification using two-stage support vector machines.
    Gubbi, J ; Shilton, A ; Parker, M ; Palaniswami, M (Universal Academy Press, 2006)
    Determining the first 3-D model of a protein from its sequence alone is a non-trivial problem. The first 3-D model is the key to the molecular replacement method of solving the phase problem in x-ray crystallography. If the sequence identity is more than 30%, homology modelling can be used to determine the correct topology (as defined by CATH) or fold (as defined by SCOP). If the sequence identity is less than 25%, however, the task is very challenging. In this paper we address the topology classification of proteins with sequence identity of less than 25%. The inputs to the system are the amino acid sequence, the predicted secondary structure and the predicted real-valued relative solvent accessibility. A two-stage support vector machine (SVM) approach is proposed that classifies the sequences into three structural classes (alpha, beta, alpha+beta) in the first stage and into 39 topologies in the second stage. The method is evaluated using a newly curated dataset from CATH with maximum pairwise sequence identity less than 25%. Overall accuracies of 87.44% and 83.15% are reported for class and topology prediction, respectively. In the class prediction stage, a sensitivity of 0.77 and a specificity of 0.91 are obtained. The data file, SVM implementation (SVMHEAVY) and result files can be downloaded from http://www.ee.unimelb.edu.au/ISSNIP/downloads/.
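The two-stage idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' SVMHEAVY implementation. It assumes that the second stage uses one topology model per structural class, trained only on examples of that class, which the abstract does not state. The names `TwoStageClassifier` and `NearestCentroid` are hypothetical, the toy feature vectors and topology labels are invented for illustration, and the nearest-centroid model is a dependency-free stand-in, not an SVM.

```python
from collections import defaultdict


class NearestCentroid:
    """Tiny stand-in classifier (NOT an SVM) so the sketch has no dependencies."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] += 1
        # One mean vector per label.
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def nearest(x):
            return min(
                self.centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, self.centroids[c])),
            )

        return [nearest(x) for x in X]


class TwoStageClassifier:
    """Stage 1 predicts the structural class; stage 2 predicts the topology
    with a model trained only on examples of that class (an assumption)."""

    def __init__(self, make_model=NearestCentroid):
        self.make_model = make_model

    def fit(self, X, classes, topologies):
        self.stage1 = self.make_model().fit(X, classes)
        grouped = defaultdict(lambda: ([], []))
        for x, c, t in zip(X, classes, topologies):
            grouped[c][0].append(x)
            grouped[c][1].append(t)
        self.stage2 = {
            c: self.make_model().fit(xs, ts) for c, (xs, ts) in grouped.items()
        }
        return self

    def predict(self, X):
        # Route each sample to the topology model of its predicted class.
        return [
            (c, self.stage2[c].predict([x])[0])
            for x, c in zip(X, self.stage1.predict(X))
        ]


# Invented toy data: two features, three classes, two topologies per class.
X = [[0, 0], [0, 1], [5, 0], [5, 1], [10, 0], [10, 1]]
classes = ["alpha", "alpha", "beta", "beta", "alpha+beta", "alpha+beta"]
topologies = ["a1", "a2", "b1", "b2", "ab1", "ab2"]

model = TwoStageClassifier().fit(X, classes, topologies)
predictions = model.predict([[0.1, 0.9], [9.8, 0.1]])
print(predictions)
```

Any object with `fit` and `predict` methods could be passed as `make_model`, so an SVM implementation can replace the stand-in. One consequence of this design is that a stage-1 error cannot be corrected in stage 2, so the topology accuracy is bounded by the class accuracy.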
  • Item
    Incremental training of support vector machines
    Shilton, A ; Palaniswami, M ; Ralph, D ; Tsoi, AC (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2005-01)
    We propose a new algorithm for the incremental training of support vector machines (SVMs) that is suitable for problems of sequentially arriving data and fast constraint parameter variation. Our method involves using a "warm-start" algorithm for the training of SVMs, which allows us to take advantage of the natural incremental properties of the standard active set approach to linearly constrained optimization problems. Incremental training involves quickly retraining a support vector machine after adding a small number of additional training vectors to the training set of an existing (trained) support vector machine. Similarly, the problem of fast constraint parameter variation involves quickly retraining an existing support vector machine using the same training set but different constraint parameters. In both cases, we demonstrate the computational superiority of incremental training over the usual batch retraining method.
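The warm-start idea described in the abstract above can be sketched in a few lines. This is not the paper's active set algorithm. As a simpler substitute it uses dual coordinate descent on a linear, bias-free, hinge-loss SVM, and only shows how a previous dual solution can seed retraining after new vectors arrive or after the constraint parameter `C` changes. The function name `train_svm` and the toy data are hypothetical.

```python
def train_svm(X, y, C=1.0, alpha=None, tol=1e-6, max_sweeps=10_000):
    """Dual coordinate descent for a linear SVM without bias term.

    `alpha` is an optional warm start: dual variables from an earlier run.
    Returns (w, alpha, number_of_sweeps_used).
    """
    n, d = len(X), len(X[0])
    alpha = list(alpha) if alpha is not None else []
    alpha += [0.0] * (n - len(alpha))      # newly added vectors start at zero
    alpha = [min(a, C) for a in alpha]     # stay feasible if C was reduced

    # Rebuild the primal weights from the (possibly warm-started) duals.
    w = [0.0] * d
    for a, xi, yi in zip(alpha, X, y):
        for j in range(d):
            w[j] += a * yi * xi[j]

    for sweep in range(1, max_sweeps + 1):
        largest_change = 0.0
        for i in range(n):
            xi, yi = X[i], y[i]
            q_ii = sum(v * v for v in xi)
            if q_ii == 0.0:
                continue
            grad = yi * sum(wj * v for wj, v in zip(w, xi)) - 1.0
            new_alpha = min(max(alpha[i] - grad / q_ii, 0.0), C)  # clip to [0, C]
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                alpha[i] = new_alpha
                for j in range(d):
                    w[j] += delta * yi * xi[j]
            largest_change = max(largest_change, abs(delta))
        if largest_change < tol:
            return w, alpha, sweep
    return w, alpha, max_sweeps


# Train on two vectors, then add a third and retrain from the old solution.
X_old, y_old = [[2.0], [-2.0]], [1, -1]
w_old, alpha_old, _ = train_svm(X_old, y_old, C=1.0)

X_new, y_new = X_old + [[1.0]], y_old + [1]
w_warm, alpha_warm, _ = train_svm(X_new, y_new, C=1.0, alpha=alpha_old)
w_cold, _, _ = train_svm(X_new, y_new, C=1.0)

# Constraint parameter variation: same data, smaller C, seeded with alpha_warm.
w_c, alpha_c, sweeps_c = train_svm(X_new, y_new, C=0.5, alpha=alpha_warm)
print(w_old, w_warm, w_cold, w_c, sweeps_c)
```

In this toy case the warm-started and batch runs reach the same weights. How much work a warm start saves depends on how far the new solution lies from the old one, so the sketch makes no claim about speed-ups in general.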