Computing and Information Systems - Research Publications

  • Item
    On the Noncyclic Property of Sylvester Hadamard Matrices
    Tang, X ; Parampalli, U (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2010-09)
  • Item
    Efficient identity-based signatures in the standard model
    Narayan, S ; Parampalli, U (INST ENGINEERING TECHNOLOGY-IET, 2008-12)
  • Item
    Building more robust multi-agent systems using a log-based approach
    Unruh, A ; Bailey, J ; Ramamohanarao, K (IOS Press, 2009-03-23)
  • Item
    MUSTANG: A multiple structural alignment algorithm
    Konagurthu, AS ; Whisstock, JC ; Stuckey, PJ ; Lesk, AM (WILEY, 2006-08-15)
    Multiple structural alignment is a fundamental problem in structural genomics. In this article, we define a reliable and robust algorithm, MUSTANG (MUltiple STructural AligNment AlGorithm), for the alignment of multiple protein structures. Given a set of protein structures, the program constructs a multiple alignment using the spatial information of the Cα atoms in the set. Broadly based on the progressive pairwise heuristic, this algorithm gains accuracy through novel and effective refinement phases. MUSTANG reports the multiple sequence alignment and the corresponding superposition of structures. Alignments generated by MUSTANG are compared with several hand-curated alignments in the literature as well as with the benchmark alignments of 1033 alignment families from the HOMSTRAD database. The performance of MUSTANG was compared with DALI at a pairwise level, and with other multiple structural alignment tools such as POSA, CE-MC, MALECON, and MultiProt. MUSTANG performs comparably to popular pairwise and multiple structural alignment tools for closely related proteins, and performs more reliably than other multiple structural alignment methods on hard data sets containing distantly related proteins or proteins that show conformational changes.
  • Item
    Reference-Free Validation of Short Read Data
    Schroeder, J ; Bailey, J ; Conway, T ; Zobel, J ; Aramayo, R (PUBLIC LIBRARY SCIENCE, 2010-09-22)
    BACKGROUND: High-throughput DNA sequencing techniques offer the ability to rapidly and cheaply sequence material such as whole genomes. However, the short-read data produced by these techniques can be biased or compromised at several stages in the sequencing process; the sources and properties of some of these biases are not always known. Accurate assessment of bias is required for experimental quality control, genome assembly, and interpretation of coverage results. An additional challenge is that, for new genomes or material from an unidentified source, there may be no reference available against which the reads can be checked. RESULTS: We propose analytical methods for identifying biases in a collection of short reads, without recourse to a reference. These, in conjunction with existing approaches, comprise a methodology that can be used to quantify the quality of a set of reads. Our methods involve use of three different measures: analysis of base calls; analysis of k-mers; and analysis of distributions of k-mers. We apply our methodology to a wide range of short read data and show that, surprisingly, strong biases appear to be present. These include gross overrepresentation of some poly-base sequences, per-position biases towards some bases, and apparent preferences for some starting positions over others. CONCLUSIONS: The existence of biases in short read data is known, but they appear to be greater and more diverse than identified in previous literature. Statistical analysis of a set of short reads can help identify issues prior to assembly or resequencing, and should help guide chemical or statistical methods for bias rectification.
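    As a rough illustration of two of the measures named in the abstract above (k-mer analysis and per-position base-call analysis), the following is a minimal sketch only. It is not the authors' implementation; the function names `kmer_counts` and `per_position_base_frequencies` and the toy reads are assumptions made for this example.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (length-k substring) across a collection of reads."""
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 overlapping k-mers.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def per_position_base_frequencies(reads):
    """For each read position, return the fraction of reads showing each base."""
    position_counts = []
    for read in reads:
        for pos, base in enumerate(read):
            # Grow the list the first time a position is seen.
            if pos == len(position_counts):
                position_counts.append(Counter())
            position_counts[pos][base] += 1
    return [
        {base: n / sum(c.values()) for base, n in c.items()}
        for c in position_counts
    ]


# Hypothetical toy reads, used only to show the call pattern.
toy_reads = ["AAAA", "AAAC"]
counts = kmer_counts(toy_reads, 2)
freqs = per_position_base_frequencies(toy_reads)
```

    In a sketch like this, a poly-base k-mer whose count is far above that of other k-mers, or a read position where one base dominates, would be a candidate for the kinds of bias the abstract describes; deciding what counts as "far above" needs a statistical model that is outside the scope of this example.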
  • Item
    MUSTANG-MR Structural Sieving Server: Applications in Protein Structural Analysis and Crystallography
    Konagurthu, AS ; Reboul, CF ; Schmidberger, JW ; Irving, JA ; Lesk, AM ; Stuckey, PJ ; Whisstock, JC ; Buckle, AM ; Fernandez-Fuentes, N (PUBLIC LIBRARY SCIENCE, 2010-04-06)
    BACKGROUND: A central tenet of structural biology is that related proteins of common function share structural similarity. This has key practical consequences for the derivation and analysis of protein structures, and is exploited by the process of "molecular sieving" whereby a common core is progressively distilled from a comparison of two or more protein structures. This paper reports a novel web server for "sieving" of protein structures, based on the multiple structural alignment program MUSTANG. METHODOLOGY/PRINCIPAL FINDINGS: "Sieved" models are generated from MUSTANG-generated multiple alignment and superpositions by iteratively filtering out noisy residue-residue correspondences, until the resultant correspondences in the models are optimally "superposable" under a threshold of RMSD. This residue-level sieving is also accompanied by iterative elimination of the poorly fitting structures from the input ensemble. Therefore, by varying the thresholds of RMSD and the cardinality of the ensemble, multiple sieved models are generated for a given multiple alignment and superposition from MUSTANG. To aid the identification of structurally conserved regions of functional importance in an ensemble of protein structures, Lesk-Hubbard graphs are generated, plotting the number of residue correspondences in a superposition as a function of its corresponding RMSD. The conserved "core" (or typically active site) shows a linear trend, which becomes exponential as divergent parts of the structure are included into the superposition. CONCLUSIONS: The application addresses two fundamental problems in structural biology: first, the identification of common substructures among structurally related proteins--an important problem in characterization and prediction of function; second, generation of sieved models with demonstrated uses in protein crystallographic structure determination using the technique of Molecular Replacement.
  • Item
    Classifying proteins using gapped Markov feature pairs
    Ji, X ; Bailey, J ; Ramamohanarao, K (ELSEVIER, 2010-08)
  • Item
    Crossing the agent technology chasm: Lessons, experiences and challenges in commercial applications of agents
    Munroe, S ; Miller, T ; Belechean, RA ; Pechoucek, M ; McBurney, P ; Luck, M (CAMBRIDGE UNIV PRESS, 2006-12)
    Agent software technologies are currently still in an early stage of market development, where, arguably, the majority of users adopting the technology are visionaries who have recognized the long-term potential of agent systems. Some current adopters also see short-term net commercial benefits from the technology, and more potential users will need to perceive such benefits if agent technologies are to become widely used. One way to assist potential adopters to assess the costs and benefits of agent technologies is through the sharing of actual deployment histories of these technologies. Working in collaboration with several companies and organizations in Europe and North America, we have studied deployed applications of agent technologies, and we present these case studies in detail in this paper. We also review the lessons learnt, and the key issues arising from the deployments, to guide decision-making in research, in development and in implementation of agent software technologies.
  • Item
    A binary decision diagram based approach for mining frequent subsequences
    Loekito, E ; Bailey, J ; Pei, J (SPRINGER LONDON LTD, 2010-08)