Computing and Information Systems - Research Publications

Permanent URI for this collection

Search Results

Now showing 1 - 10 of 11
  • Item
    Thumbnail Image
    MUSTANG: A multiple structural alignment algorithm
    Konagurthu, AS ; Whisstock, JC ; Stuckey, PJ ; Lesk, AM (WILEY, 2006-08-15)
    Multiple structural alignment is a fundamental problem in structural genomics. In this article, we define a reliable and robust algorithm, MUSTANG (MUltiple STructural AligNment AlGorithm), for the alignment of multiple protein structures. Given a set of protein structures, the program constructs a multiple alignment using the spatial information of the C(alpha) atoms in the set. Broadly based on the progressive pairwise heuristic, this algorithm gains accuracy through novel and effective refinement phases. MUSTANG reports the multiple sequence alignment and the corresponding superposition of structures. Alignments generated by MUSTANG are compared with several handcurated alignments in the literature as well as with the benchmark alignments of 1033 alignment families from the HOMSTRAD database. The performance of MUSTANG was compared with DALI at a pairwise level, and with other multiple structural alignment tools such as POSA, CE-MC, MALECON, and MultiProt. MUSTANG performs comparably to popular pairwise and multiple structural alignment tools for closely related proteins, and performs more reliably than other multiple structural alignment methods on hard data sets containing distantly related proteins or proteins that show conformational changes.
  • Item
    Thumbnail Image
    MUSTANG-MR Structural Sieving Server: Applications in Protein Structural Analysis and Crystallography
    Konagurthu, AS ; Reboul, CF ; Schmidberger, JW ; Irving, JA ; Lesk, AM ; Stuckey, PJ ; Whisstock, JC ; Buckle, AM ; Fernandez-Fuentes, N (PUBLIC LIBRARY SCIENCE, 2010-04-06)
    BACKGROUND: A central tenet of structural biology is that related proteins of common function share structural similarity. This has key practical consequences for the derivation and analysis of protein structures, and is exploited by the process of "molecular sieving" whereby a common core is progressively distilled from a comparison of two or more protein structures. This paper reports a novel web server for "sieving" of protein structures, based on the multiple structural alignment program MUSTANG. METHODOLOGY/PRINCIPAL FINDINGS: "Sieved" models are generated from MUSTANG-generated multiple alignment and superpositions by iteratively filtering out noisy residue-residue correspondences, until the resultant correspondences in the models are optimally "superposable" under a threshold of RMSD. This residue-level sieving is also accompanied by iterative elimination of the poorly fitting structures from the input ensemble. Therefore, by varying the thresholds of RMSD and the cardinality of the ensemble, multiple sieved models are generated for a given multiple alignment and superposition from MUSTANG. To aid the identification of structurally conserved regions of functional importance in an ensemble of protein structures, Lesk-Hubbard graphs are generated, plotting the number of residue correspondences in a superposition as a function of its corresponding RMSD. The conserved "core" (or typically active site) shows a linear trend, which becomes exponential as divergent parts of the structure are included into the superposition. CONCLUSIONS: The application addresses two fundamental problems in structural biology: first, the identification of common substructures among structurally related proteins--an important problem in characterization and prediction of function; second, generation of sieved models with demonstrated uses in protein crystallographic structure determination using the technique of Molecular Replacement.
  • Item
    Thumbnail Image
    The island confinement method for reducing search space in local search methods
    Fang, H ; Kilani, Y ; Lee, JHM ; Stuckey, PJ (SPRINGER, 2007-12)
  • Item
    Thumbnail Image
    Optimal sum-of-pairs multiple sequence alignment using incremental Carrillo and Lipman bounds
    Konagurthu, AS ; Stuckey, PJ (MARY ANN LIEBERT, INC, 2006-04)
    Alignment of sequences is an important routine in various areas of science, notably molecular biology. Multiple sequence alignment is a computationally hard optimization problem which involves the consideration of different possible alignments in order to find an optimal one, given a measure of goodness of alignments. Dynamic programming algorithms are generally well suited for the search of optimal alignments, but are constrained by unwieldy space requirements for large numbers of sequences. Carrillo and Lipman devised a method that helps to reduce the search space for an optimal alignment under a sum-of-pairs measure using bounds on the scores of its pairwise projections. In this paper, we generalize Carrillo and Lipman bounds and demonstrate a novel approach for finding optimal sum-of-pairs multiple alignments that allows incremental pruning of the optimal alignment search space. This approach can result in a drastic pruning of the final search space polytope (where we search for the optimal alignment) when compared to Carrillo and Lipman's approach and hence allows many runs that are not feasible with the original method.
  • Item
    Thumbnail Image
    Fast Set Bounds Propagation Using a BDD-SAT Hybrid
    Gange, G ; Stuckey, PJ ; Lagoon, V (AI ACCESS FOUNDATION, 2010)
    Binary Decision Diagram (BDD) based set bounds propagation is a powerful approach to solving set-constraint satisfaction problems. However, prior BDD based techniques in- cur the significant overhead of constructing and manipulating graphs during search. We present a set-constraint solver which combines BDD-based set-bounds propagators with the learning abilities of a modern SAT solver. Together with a number of improvements beyond the basic algorithm, this solver is highly competitive with existing propagation based set constraint solvers.
  • Item
    Thumbnail Image
    Solving set constraint satisfaction problems using ROBDDS
    Hawkins, P ; Lagoon, V ; Stuckey, PJ (AI ACCESS FOUNDATION, 2005)
    In this paper we present a new approach to modeling finite set domain constraint problems using Reduced Ordered Binary Decision Diagrams (ROBDDs). We show that it is possible to construct an efficient set domain propagator which compactly represents many set domains and set constraints using ROBDDs. We demonstrate that the ROBDD-based approach provides unprecedented flexibility in modeling constraint satisfaction problems, leading to performance improvements. We also show that the ROBDD-based modeling approach can be extended to the modeling of integer and multiset constraint problems in a straightforward manner. Since domain propagation is not always practical, we also show how to incorporate less strict consistency notions into the ROBDD framework, such as set bounds, cardinality bounds and lexicographic bounds consistency. Finally, we present experimental results that demonstrate the ROBDD-based solver performs better than various more conventional constraint solvers on several standard set constraint problems.
  • Item
    Thumbnail Image
    Incremental Satisfiability and Implication for UTVPI Constraints
    Schutt, A ; Stuckey, PJ (INFORMS, 2010-09-01)
    Unit two-variable-per-inequality (UTVPI) constraints form one of the largest class of integer constraints that are polynomial time solvable (unless P = NP). There is considerable interest in their use for constraint solving, abstract interpretation, spatial database algorithms, and theorem proving. In this paper we develop new incremental algorithms for UTVPI constraint satisfaction and implication checking that require ℴ(m + n log n + p) time and ℴ(n + m + p) space to incrementally check satisfiability of m UTVPI constraints on n variables, and we check the implication of p UTVPI constraints. The algorithms can be straightforwardly extended to create nonincremental implication checking and generation of all (nonredundant) implied constraints, as well as generate minimal unsatisfiable subsets and minimal implicants.
  • Item
    Thumbnail Image
    Dynamic programming to minimize the maximum number of open stacks
    de la Banda, MG ; Stuckey, PJ (INFORMS, 2007-01-01)
    We give a dynamic-programming solution to the problem of minimizing the maximum number of open stacks. Starting from a call-based dynamic program, we show a number of ways to improve the dynamic-programming search, preprocess the problem to simplify it, and determine lower and upper bounds. We then explore a number of search strategies for reducing the search space. The final dynamic-programming solution is, we believe, highly effective.
  • Item
    Thumbnail Image
    Exploration of Networks Using Overview plus Detail with Constraint-based Cooperative Layout
    Dwyer, T ; Marriott, K ; Schreiber, F ; Stuckey, PJ ; Woodward, M ; Wybrow, M (IEEE COMPUTER SOC, 2008)
    A standard approach to large network visualization is to provide an overview of the network and a detailed view of a small component of the graph centred around a focal node. The user explores the network by changing the focal node in the detailed view or by changing the level of detail of a node or cluster. For scalability, fast force-based layout algorithms are used for the overview and the detailed view. However, using the same layout algorithm in both views is problematic since layout for the detailed view has different requirements to that in the overview. Here we present a model in which constrained graph layout algorithms are used for layout in the detailed view. This means the detailed view has high-quality layout including sophisticated edge routing and is customisable by the user who can add placement constraints on the layout. Scalability is still ensured since the slower layout techniques are only applied to the small subgraph shown in the detailed view. The main technical innovations are techniques to ensure that the overview and detailed view remain synchronized, and modifying constrained graph layout algorithms to support smooth, stable layout. The key innovation supporting stability are new dynamic graph layout algorithms that preserve the topology or structure of the network when the user changes the focus node or the level of detail by in situ semantic zooming. We have built a prototype tool and demonstrate its use in two application domains, UML class diagrams and biological networks.
  • Item
    Thumbnail Image
    Philosophy of the MiniZinc challenge
    Stuckey, PJ ; Becket, R ; Fischer, J (SPRINGER, 2010-07)