Computing and Information Systems - Theses

  • Item
    Computational gene finding in the human malaria parasite Plasmodium vivax
    Stivala, Alexander David (2006-10)
    Different approaches to genome annotation are reviewed and compared with reference-based annotation using GeneMapper in the human malaria parasite Plasmodium vivax. It is found that the latter approach does not achieve sensitivity and specificity as high as those for some ab initio techniques. Potential reasons for this are identified and discussed. As part of the process of using GeneMapper, codon substitution matrices are constructed and examined. This leads to the discovery of evidence from which we derive a conjecture regarding Plasmodium evolution.
  • Item
    Algorithms for the study of RNA and protein structure
    Stivala, Alexander David (2010)
    The growth in the number of known sequences and structures of RNA and protein molecules has led to the need to solve many computationally demanding problems in the analysis of RNA and protein structure. This thesis describes algorithms for structural comparison of RNA and protein molecules. In the case of proteins, it also describes a technique for automatically generating two-dimensional diagrammatic representations for visual comparison. A general technique for parallelizing dynamic programs in a straightforward way, by means of a shared lock-free hash table implementation and randomization of subproblem ordering, is described. This generic approach is applied to several well-known dynamic programs, as well as a dynamic program for structural alignment of RNA molecules by aligning their base pairing probability matrices. Two algorithms for protein structure and substructure searching are described. These algorithms are also capable of finding non-sequential matches, that is, matches between structures where the sequential order of secondary structure elements is not preserved. The first algorithm is based on the relaxation of an earlier quadratic integer problem (QIP) formulation to a quadratic program (QP). The second algorithm uses the same formulation but approximates it using simulated annealing. It is shown that this results in significant increases in speed. This algorithm is also capable of greater accuracy when assessed as a fold recognition method. A parallel implementation of this algorithm on modern graphics processing unit (GPU) hardware is also described. This parallel implementation results in a further significant speedup, and, to the best of our knowledge, is the first use of a GPU for the protein structural search problem. Finally, a system to automatically generate two-dimensional representations of protein structure is described. Such diagrams are particularly useful in analysing complex protein folds. A method for using these diagrams as an interface to the protein substructure search methods is also described.
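
    The idea of parallelizing a dynamic program through a shared memo table and randomized subproblem ordering can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `lcs_length_parallel`, the choice of longest common subsequence as the example problem, and the use of an ordinary Python dict in place of a lock-free hash table are all assumptions made for illustration. Every worker runs the same top-down recursion from the root subproblem but visits child subproblems in its own random order, so workers tend to fill different regions of the shared table and reuse each other's results.

```python
import random
import threading


def lcs_length_parallel(a: str, b: str, num_threads: int = 4, seed: int = 0) -> int:
    """Length of the longest common subsequence of a and b (illustrative sketch)."""
    # Shared memo table. A plain dict stands in for the lock-free hash table
    # described in the abstract (assumption made for this sketch).
    memo: dict[tuple[int, int], int] = {}

    def solve(i: int, j: int, rng: random.Random) -> int:
        if i == len(a) or j == len(b):
            return 0
        key = (i, j)
        cached = memo.get(key)
        if cached is not None:
            return cached
        if a[i] == b[j]:
            value = 1 + solve(i + 1, j + 1, rng)
        else:
            # Randomize the order in which the child subproblems are explored,
            # so that different workers diverge into different parts of the table.
            children = [(i + 1, j), (i, j + 1)]
            rng.shuffle(children)
            value = max(solve(ci, cj, rng) for ci, cj in children)
        # Two workers may compute the same entry, but they always compute the
        # same value, so a duplicate write is harmless.
        memo[key] = value
        return value

    workers = [
        threading.Thread(target=solve, args=(0, 0, random.Random(seed + t)))
        for t in range(num_threads)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # The root entry is normally cached by now; this call simply reads it
    # (or handles the empty-input case, which never touches the table).
    return solve(0, 0, random.Random(seed))


print(lcs_length_parallel("AGGTAB", "GXTXAYB"))
```

    The sketch is only suitable for short inputs, since it recurses to a depth proportional to the combined string lengths. In CPython the global interpreter lock also prevents a real speedup from threads; the code shows the structure of the approach rather than its performance.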