Computing and Information Systems - Research Publications

Search Results

  • Item
    Modeling of microalgal shear-induced flocculation and sedimentation using a coupled CFD-population balance approach.
    Golzarijalal, M ; Zokaee Ashtiani, F ; Dabir, B (Wiley, 2018)
    In this study, shear-induced flocculation of Chlorella sp. microalgae was modeled by combining population balance modeling and CFD. The inhomogeneous Multiple Size Group (MUSIG) and the Euler-Euler two-fluid models were coupled via the Ansys-CFX-15 software package to capture both fluid and particle dynamics during flocculation. For the first time, a detailed model was proposed to calculate the collision frequency and breakage rate during microalgae flocculation, using response surface methodology as an optimization tool. The particle size distribution resulting from the model was in good agreement with that of the jar test experiment. Furthermore, the subsequent sedimentation step was examined by removing the shear rate in both simulations and experiments. Consequently, variation in the shear rate and its effects on flocculation behavior, sedimentation rate and recovery efficiency were evaluated. Results indicate that flocculation of Chlorella sp. microalgae under shear rates of 37, 182, and 387 s⁻¹ is a promising method of pre-concentration which guarantees the cost efficiency of the subsequent harvesting process by recovering more than 90% of the biomass.
  • Item
    Finding Time for Tabletop: Board Game Play and Parenting
    Rogerson, MJ ; Gibbs, M (SAGE PUBLICATIONS INC, 2018)
    Hobby board gaming is a serious leisure pastime that entails large commitments of time and energy. When serious hobby board gamers become parents, their opportunities for engaging in the pastime are constrained by their new family responsibilities. Based on an ethnographic study of serious hobby board gamers, we investigate how play is constrained by parenting and how serious board gamers with these responsibilities create opportunities to continue to play board games by negotiating the context, time, location, and medium of play. We also examine how these changes influence the enjoyment players derive from board games across the key dimensions of sociality, intellectual challenge, variety, and materiality.
  • Item
    TREE-BASED STATISTICAL MACHINE TRANSLATION: EXPERIMENTS WITH THE ENGLISH AND BRAZILIAN PORTUGUESE PAIR
    Beck, D ; Caseli, H (SBIC, 2013)
    Machine Learning paradigms have dominated recent research in Machine Translation. Current state-of-the-art approaches rely only on statistical methods that gather all necessary knowledge from parallel corpora. However, this lack of explicit linguistic knowledge makes them unable to model some linguistic phenomena. In this work, we focus on models that take into account the syntactic information from the languages involved in the translation process. We follow a novel approach that preprocesses parallel corpora using syntactic parsers and uses translation models composed of Tree Transducers. We perform experiments with English and Brazilian Portuguese, providing the first known results in syntax-based Statistical Machine Translation for this language pair. These results show that this approach is better able to model phenomena like long-distance reordering, and they give directions for future improvements in building syntax-based translation models for this pair.
  • Item
    RAR/RXR binding dynamics distinguish pluripotency from differentiation associated cis-regulatory elements
    Chatagnon, A ; Veber, P ; Morin, V ; Bedo, J ; Triqueneaux, G ; Semon, M ; Laudet, V ; d'Alche-Buc, F ; Benoit, G (OXFORD UNIV PRESS, 2015-05-26)
    In mouse embryonic cells, ligand-activated retinoic acid receptors (RARs) play a key role in inhibiting pluripotency-maintaining genes and activating some major actors of cell differentiation. To investigate the mechanism underlying this dual regulation, we performed joint RAR/RXR ChIP-seq and mRNA-seq time series during the first 48 h of the RA-induced Primitive Endoderm (PrE) differentiation process in F9 embryonal carcinoma (EC) cells. We show here that this dual regulation is associated with RAR/RXR genomic redistribution during the differentiation process. In-depth analysis of RAR/RXR binding site occupancy dynamics and composition shows that in undifferentiated cells, RAR/RXR interact with genomic regions characterized by binding of pluripotency-associated factors and high prevalence of the non-canonical DR0-containing RA response element. By contrast, in differentiated cells, RAR/RXR-bound regions are enriched in functional Sox17 binding sites and are characterized by a higher frequency of the canonical DR5 motif. Our data offer an unprecedentedly detailed view of the action of RA in triggering pluripotent cell differentiation and demonstrate that RAR/RXR action is mediated via two different sets of regulatory regions tightly associated with cell differentiation status.
  • Item
    Stratification bias in low signal microarray studies
    Parker, BJ ; Guenter, S ; Bedo, J (BMC, 2007-09-02)
    BACKGROUND: When analysing microarray and other small sample size biological datasets, care is needed to avoid various biases. We analyse a form of bias, stratification bias, that can substantially affect analyses using sample-reuse validation techniques and lead to inaccurate results. This bias is due to imperfect stratification of samples in the training and test sets and the dependency between these stratification errors, i.e. the variations in class proportions in the training and test sets are negatively correlated. RESULTS: We show that when estimating the performance of classifiers on low signal datasets (i.e. those which are difficult to classify), which are typical of many prognostic microarray studies, commonly used performance measures can suffer from a substantial negative bias. For error rate this bias is only severe in quite restricted situations, but can be much larger and more frequent when using ranking measures such as the receiver operating characteristic (ROC) curve and area under the ROC (AUC). Substantial biases are shown in simulations and on the van 't Veer breast cancer dataset. The classification error rate can have large negative biases for balanced datasets, whereas the AUC shows substantial pessimistic biases even for imbalanced datasets. In simulation studies using 10-fold cross-validation, AUC values of less than 0.3 can be observed on random datasets rather than the expected 0.5. Further experiments on the van 't Veer breast cancer dataset show these biases exist in practice. CONCLUSION: Stratification bias can substantially affect several performance measures. In computing the AUC, the strategy of pooling the test samples from the various folds of cross-validation can lead to large biases; computing it as the average of per-fold estimates avoids this bias and is thus the recommended approach. As a more general solution applicable to other performance measures, we show that stratified repeated holdout and modified versions of k-fold cross-validation (balanced, stratified cross-validation and balanced leave-one-out cross-validation) avoid the bias. Therefore, for model selection and evaluation of microarray and other small biological datasets, these methods should be used and unstratified versions avoided. In particular, the commonly used (unbalanced) leave-one-out cross-validation should not be used to estimate AUC for small datasets.
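    The contrast between pooled and per-fold AUC can be illustrated with a minimal sketch. This is not the authors' code or experimental setup: the fold compositions are hypothetical, and the "classifier" is an assumed uninformative one that scores every test sample with the positive-class fraction of its training set, so that the only signal in the scores is the stratification error itself.

```python
from fractions import Fraction


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical unstratified 4-fold split of 10 positives and 10 negatives.
# Each tuple is (positives, negatives) in that fold's test set.
folds = [(4, 1), (1, 4), (3, 2), (2, 3)]
total_pos = sum(p for p, _ in folds)
total_n = sum(p + n for p, n in folds)

pooled_scores, pooled_labels, fold_aucs = [], [], []
for p, n in folds:
    # Assumed uninformative classifier: the score is the fraction of positives
    # in the training set (all samples outside this fold). Exact fractions
    # avoid floating-point surprises when comparing scores across folds.
    score = Fraction(total_pos - p, total_n - (p + n))
    scores = [score] * (p + n)
    labels = [1] * p + [0] * n
    fold_aucs.append(auc(scores, labels))
    pooled_scores += scores
    pooled_labels += labels

mean_fold_auc = sum(fold_aucs) / len(fold_aucs)       # average of per-fold AUCs
pooled_auc = auc(pooled_scores, pooled_labels)        # AUC over pooled test samples

print("mean per-fold AUC:", mean_fold_auc)
print("pooled AUC:", pooled_auc)
```

    In this toy setting the per-fold average stays at 0.5, while the pooled estimate drops to 0.25, because folds with more positives in the test set necessarily have fewer in the training set and therefore receive lower scores.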
  • Item
    Plasma lipid profiling in a large population-based cohort
    Weir, JM ; Wong, G ; Barlow, CK ; Greeve, MA ; Kowalczyk, A ; Almasy, L ; Comuzzie, AG ; Mahaney, MC ; Jowett, JBM ; Shaw, J ; Curran, JE ; Blangero, J ; Meikle, PJ (ELSEVIER, 2013-10)
    We have performed plasma lipid profiling using liquid chromatography electrospray ionization tandem mass spectrometry on a population cohort of more than 1,000 individuals. From 10 μl of plasma we were able to acquire comparative measures of 312 lipids across 23 lipid classes and subclasses including sphingolipids, phospholipids, glycerolipids, and cholesterol esters (CEs) in 20 min. Using linear and logistic regression, we identified statistically significant associations of lipid classes, subclasses, and individual lipid species with anthropometric and physiological measures. In addition to the expected associations of CEs and triacylglycerol with age, sex, and body mass index (BMI), ceramide was significantly higher in males and was independently associated with age and BMI. Associations were also observed for sphingomyelin with age but this lipid subclass was lower in males. Lysophospholipids were associated with age and higher in males, but showed a strong negative association with BMI. Many of these lipids have previously been associated with chronic diseases including cardiovascular disease and may mediate the interactions of age, sex, and obesity with disease risk.
  • Item
    A Decomposition-Based Algorithm for the Scheduling of Open-Pit Networks over Multiple Time Periods
    Blom, M ; Pearce, A ; Stuckey, P (INFORMS (Institute for Operations Research and Management Sciences), 2016)
    We consider the multiple-time-period, short-term production scheduling problem for a network of multiple open-pit mines and ports. Ore produced at each mine, in each period, is transported by rail to a set of ports and blended into products for shipping. Each port forms these blends to a specification, as stipulated in contracts with downstream customers. This problem belongs to a class of multiple producer/consumer scheduling problems in which producers are able to generate a range of products, a combination of which are required by consumers to meet specified demands. In practice, short-term schedules are formed independently at each mine, tasked with achieving a grade and quality target outlined in a medium-term plan. Because of uncertainty in the data available to a medium-term planner and the dynamics of the mining environment, such targets may not be feasible in the short term. In this paper, we present an algorithm in which the grade and quality targets assigned to each mine are iteratively adapted, ensuring the satisfaction of blending constraints at each port while generating schedules for each mine that maximise resource utilisation. This paper was accepted by Yinyu Ye, optimization.
  • Item
    Multi-objective short-term production scheduling for open-pit mines: a hierarchical decomposition-based algorithm
    Blom, M ; Pearce, AR ; Stuckey, PJ (TAYLOR & FRANCIS LTD, 2018-12-02)
    This article presents a novel algorithm for solving a short-term open-pit production-scheduling problem in which several objectives, of varying priority, characterize the quality of each solution. A popular approach employs receding horizon control, dividing the horizon into N period-aggregates of increasing size (number of periods or span). An N-period mixed integer program (MIP) is solved for each period in the original horizon to incrementally construct a production schedule one period at a time. This article presents a new algorithm that, in contrast, decomposes the horizon into N period-aggregates of equal size. Given a schedule for these N periods, obtained by solving an N-period MIP, the first of these aggregates is itself decomposed into an N-period scheduling problem, with guidance provided on which regions of the mine should be extracted. The performance of this hierarchical decomposition-based approach is compared with that of receding horizon control on a suite of data sets generated from an operating mine producing millions of tons of ore annually. As the number of objectives being optimized increases, the hierarchical decomposition-based algorithm outperforms receding horizon control in a majority of instances.
  • Item
    Automated human-level diagnosis of dysgraphia using a consumer tablet
    Asselborn, T ; Gargot, T ; Kidzinski, L ; Johal, W ; Cohen, D ; Jolly, C ; Dillenbourg, P (NATURE PORTFOLIO, 2018-08-31)
    The academic and behavioral progress of children is associated with the timely development of reading and writing skills. Dysgraphia, characterized as a handwriting learning disability, is usually associated with dyslexia, developmental coordination disorder (dyspraxia), or attention deficit disorder, which are all neuro-developmental disorders. Dysgraphia can seriously impair children in their everyday life and requires therapeutic care. Early detection of handwriting difficulties is, therefore, of great importance in pediatrics. Since the beginning of the 20th century, numerous handwriting scales have been developed to assess the quality of handwriting. However, these tests usually involve an expert visually inspecting sentences written by a subject on paper, and, therefore, they are subjective, expensive, and scale poorly. Moreover, they ignore potentially important characteristics of motor control such as writing dynamics, pen pressure, or pen tilt. With the increasing availability of digital tablets, however, features to measure these ignored characteristics are now potentially available at scale and very low cost. In this work, we developed a diagnostic tool requiring only a commodity tablet. To this end, we modeled data of 298 children, including 56 with dysgraphia. Children performed the BHK test on a digital tablet covered with a sheet of paper. We extracted 53 handwriting features describing various aspects of handwriting, and used the Random Forest classifier to diagnose dysgraphia. Our method achieved 96.6% sensitivity and 99.2% specificity. Given the intra-rater and inter-rater levels of agreement in the BHK test, our technique has accuracy comparable to that of experts and can be deployed directly as a diagnostic tool.
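    For readers unfamiliar with the two reported metrics, the following minimal sketch shows how sensitivity and specificity are derived from a binary confusion matrix. The function name and the counts are hypothetical illustrations only; they are not taken from the study and do not reproduce its figures.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of affected children correctly flagged.
    specificity = TN / (TN + FP): share of unaffected children correctly cleared.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=18, fp=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```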
  • Item
    BioCaster: detecting public health rumors with a Web-based text mining system.
    Collier, N ; Doan, S ; Kawazoe, A ; Goodwin, RM ; Conway, M ; Tateno, Y ; Ngo, Q-H ; Dien, D ; Kawtrakul, A ; Takeuchi, K ; Shigematsu, M ; Taniguchi, K (Oxford University Press (OUP), 2008-12-15)
    SUMMARY: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between laymen's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens as well as geographical locations with their latitudes/longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and event recognition. Higher order event analysis is used to detect more precisely specified warning signals, which can then be sent to registered users via email alerts. Evaluation of the system for topic recognition and entity identification is conducted on a gold standard corpus of annotated news articles. AVAILABILITY: The BioCaster map and ontology are freely available via a web portal at http://www.biocaster.org.