Computing and Information Systems - Research Publications

Permanent URI for this collection

Search Results

Now showing 1 - 10 of 32
  • Item
    Thumbnail Image
    A time decoupling approach for studying forum dynamics
    Kan, A ; Chan, J ; Hayes, C ; Hogan, B ; Bailey, J ; Leckie, C (SPRINGER, 2013-11)
  • Item
    Thumbnail Image
    Discovery and analysis of consistent active subnetworks in cancers
    Gaire, RK ; Smith, L ; Humbert, P ; Bailey, J ; Stuckey, PJ ; Haviv, I (BMC, 2013-01-21)
    Gene expression profiles can show significant changes when genetically diseased cells are compared with non-diseased cells. Biological networks are often used to identify active subnetworks (ASNs) of the diseases from the expression profiles to understand the reason behind the observed changes. Current methodologies for discovering ASNs mostly use undirected PPI networks and node centric approaches. This can limit their ability to find the meaningful ASNs when using integrated networks having comprehensive information than the traditional protein-protein interaction networks. Using appropriate scoring functions to assess both genes and their interactions may allow the discovery of better ASNs. In this paper, we present CASNet, which aims to identify better ASNs using (i) integrated interaction networks (mixed graphs), (ii) directions of regulations of genes, and (iii) combined node and edge scores. We simplify and extend previous methodologies to incorporate edge evaluations and lessen their sensitivity to significance thresholds. We formulate our objective functions using mixed integer programming (MIP) and show that optimal solutions may be obtained. We compare the ASNs obtained by CASNet and similar other approaches to show that CASNet can often discover more meaningful and stable regulatory ASNs. Our analysis of a breast cancer dataset finds that the positive feedback loops across 7 genes, AR, ESR1, MYC, E2F2, PGR, BCL2 and CCND1 are conserved across the basal/triple negative subtypes in multiple datasets that could potentially explain the aggressive nature of this cancer subtype. Furthermore, comparison of the basal subtype of breast cancer and the mesenchymal subtype of glioblastoma ASNs shows that an ASN in the vicinity of IL6 is conserved across the two subtypes. This result suggests that subtypes of different cancers can show molecular similarities indicating that the therapeutic approaches in different types of cancers may be shared.
  • Item
    Thumbnail Image
    A fast indexing approach for protein structure comparison
    Zhang, L ; Bailey, J ; Konagurthu, AS ; Ramamohanarao, K (BMC, 2010)
    BACKGROUND: Protein structure comparison is a fundamental task in structural biology. While the number of known protein structures has grown rapidly over the last decade, searching a large database of protein structures is still relatively slow using existing methods. There is a need for new techniques which can rapidly compare protein structures, whilst maintaining high matching accuracy. RESULTS: We have developed IR Tableau, a fast protein comparison algorithm, which leverages the tableau representation to compare protein tertiary structures. IR tableau compares tableaux using information retrieval style feature indexing techniques. Experimental analysis on the ASTRAL SCOP protein structural domain database demonstrates that IR Tableau achieves two orders of magnitude speedup over the search times of existing methods, while producing search results of comparable accuracy. CONCLUSION: We show that it is possible to obtain very significant speedups for the protein structure comparison problem, by employing an information retrieval style approach for indexing proteins. The comparison accuracy achieved is also strong, thus opening the way for large scale processing of very large protein structure databases.
  • Item
    Thumbnail Image
    A voting approach to identify a small number of highly predictive genes using multiple classifiers
    Hassan, MR ; Hossain, MM ; Bailey, J ; Macintyre, G ; Ho, JWK ; Ramamohanarao, K (BMC, 2009-01-30)
    BACKGROUND: Microarray gene expression profiling has provided extensive datasets that can describe characteristics of cancer patients. An important challenge for this type of data is the discovery of gene sets which can be used as the basis of developing a clinical predictor for cancer. It is desirable that such gene sets be compact, give accurate predictions across many classifiers, be biologically relevant and have good biological process coverage. RESULTS: By using a new type of multiple classifier voting approach, we have identified gene sets that can predict breast cancer prognosis accurately, for a range of classification algorithms. Unlike a wrapper approach, our method is not specialised towards a single classification technique. Experimental analysis demonstrates higher prediction accuracies for our sets of genes compared to previous work in the area. Moreover, our sets of genes are generally more compact than those previously proposed. Taking a biological viewpoint, from the literature, most of the genes in our sets are known to be strongly related to cancer. CONCLUSION: We show that it is possible to obtain superior classification accuracy with our approach and obtain a compact gene set that is also biologically relevant and has good coverage of different biological processes.
  • Item
    Thumbnail Image
    Discovering and Summarising Regions of Correlated Spatio-Temporal Change in Evolving Graphs
    CHAN, J. ; BAILEY, J. ; LECKIE, C. (IEEE Computer Society, 2006)
  • Item
    Thumbnail Image
    Semantic-compensation-based recovery in multi-agent systems
    Unruh, A ; Harjadi, H ; Bailey, J ; Ramamohanarai, K (IEEE, 2005)
  • Item
  • Item
    Thumbnail Image
    Mining, ranking, and using acronym patterns
    Ji, X ; Xu, C ; Bailey, J ; Li, H ; Zhang, Y ; Yu, G ; Bertino, E ; Xu, G (SPRINGER-VERLAG BERLIN, 2008)
  • Item
    Thumbnail Image
    Discovery of minimal unsatisfiable subsets of constraints using hitting set dualization
    Bailey, J ; Stuckey, PJ ; Hermenegildo, M ; Cabeza, D (SPRINGER-VERLAG BERLIN, 2005)
  • Item
    Thumbnail Image
    is-rSNP: a novel technique for in silico regulatory SNP detection
    Macintyre, G ; Bailey, J ; Haviv, I ; Kowalczyk, A (OXFORD UNIV PRESS, 2010-09)
    MOTIVATION: Determining the functional impact of non-coding disease-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) is challenging. Many of these SNPs are likely to be regulatory SNPs (rSNPs): variations which affect the ability of a transcription factor (TF) to bind to DNA. However, experimental procedures for identifying rSNPs are expensive and labour intensive. Therefore, in silico methods are required for rSNP prediction. By scoring two alleles with a TF position weight matrix (PWM), it can be determined which SNPs are likely rSNPs. However, predictions in this manner are noisy and no method exists that determines the statistical significance of a nucleotide variation on a PWM score. RESULTS: We have designed an algorithm for in silico rSNP detection called is-rSNP. We employ novel convolution methods to determine the complete distributions of PWM scores and ratios between allele scores, facilitating assignment of statistical significance to rSNP effects. We have tested our method on 41 experimentally verified rSNPs, correctly predicting the disrupted TF in 28 cases. We also analysed 146 disease-associated SNPs with no known functional impact in an attempt to identify candidate rSNPs. Of the 11 significantly predicted disrupted TFs, 9 had previous evidence of being associated with the disease in the literature. These results demonstrate that is-rSNP is suitable for high-throughput screening of SNPs for potential regulatory function. This is a useful and important tool in the interpretation of GWAS. AVAILABILITY: is-rSNP software is available for use at: www.genomics.csse.unimelb.edu.au/is-rSNP.