Computing and Information Systems - Research Publications
- TREE-BASED STATISTICAL MACHINE TRANSLATION: EXPERIMENTS WITH THE ENGLISH AND BRAZILIAN PORTUGUESE PAIR
  Beck, D ; Caseli, H (SBIC, 2013)
  Machine Learning paradigms have dominated recent research in Machine Translation. Current state-of-the-art approaches rely only on statistical methods that gather all necessary knowledge from parallel corpora. However, this lack of explicit linguistic knowledge makes them unable to model some linguistic phenomena. In this work, we focus on models that take into account the syntactic information from the languages involved in the translation process. We follow a novel approach that preprocesses parallel corpora using syntactic parsers and uses translation models composed of Tree Transducers. We perform experiments with English and Brazilian Portuguese, providing the first known results in syntax-based Statistical Machine Translation for this language pair. These results show that this approach is able to better model phenomena like long-distance reordering and give directions for future improvements in building syntax-based translation models for this pair.
- Plasma lipid profiling in a large population-based cohort
  Weir, JM ; Wong, G ; Barlow, CK ; Greeve, MA ; Kowalczyk, A ; Almasy, L ; Comuzzie, AG ; Mahaney, MC ; Jowett, JBM ; Shaw, J ; Curran, JE ; Blangero, J ; Meikle, PJ (ELSEVIER, 2013-10)
  We have performed plasma lipid profiling using liquid chromatography electrospray ionization tandem mass spectrometry on a population cohort of more than 1,000 individuals. From 10 μl of plasma we were able to acquire comparative measures of 312 lipids across 23 lipid classes and subclasses including sphingolipids, phospholipids, glycerolipids, and cholesterol esters (CEs) in 20 min. Using linear and logistic regression, we identified statistically significant associations of lipid classes, subclasses, and individual lipid species with anthropometric and physiological measures. In addition to the expected associations of CEs and triacylglycerol with age, sex, and body mass index (BMI), ceramide was significantly higher in males and was independently associated with age and BMI. Associations were also observed for sphingomyelin with age but this lipid subclass was lower in males. Lysophospholipids were associated with age and higher in males, but showed a strong negative association with BMI. Many of these lipids have previously been associated with chronic diseases including cardiovascular disease and may mediate the interactions of age, sex, and obesity with disease risk.
- Abstract Interpretation over Non-Lattice Abstract Domains
  Gange, G ; Navas, JA ; Schachte, P ; Søndergaard, H ; Stuckey, PJ ; Logozzo, F ; Fahndrich, M (Springer, 2013)
  The classical theoretical framework for static analysis of programs is abstract interpretation. Much of the power and elegance of that framework rests on the assumption that an abstract domain is a lattice. Nonetheless, and for good reason, the literature on program analysis provides many examples of non-lattice domains, including non-convex numeric domains. The lack of domain structure, however, has negative consequences, both for the precision of program analysis and for the termination of standard Kleene iteration. In this paper we explore these consequences and present general remedies.
- Feasibility of using Clinical Element Models (CEM) to standardize phenotype variables in the database of genotypes and phenotypes (dbGaP).
  Lin, K-W ; Tharp, M ; Conway, M ; Hsieh, A ; Ross, M ; Kim, J ; Kim, H-E ; Raghava, GPS (Public Library of Science (PLoS), 2013)
  The database of Genotypes and Phenotypes (dbGaP) contains various types of data generated from genome-wide association studies (GWAS). These data can be used to facilitate novel scientific discoveries and to reduce cost and time for exploratory research. However, idiosyncrasies and inconsistencies in phenotype variable names are a major barrier to reusing these data. We addressed these challenges in standardizing phenotype variables by formalizing their descriptions using Clinical Element Models (CEM). Designed to represent clinical data, CEMs were highly expressive and thus were able to represent a majority (77.5%) of the 215 phenotype variable descriptions. However, their high expressivity also made it difficult to directly apply them to research data such as phenotype variables in dbGaP. Our study suggested that simplification of the template models makes it more straightforward to formally represent the key semantics of phenotype variables.
- A highly optimized algorithm for continuous intersection join queries over moving objects
  Zhang, R ; Qi, J ; Lin, D ; Wang, W ; Wong, RC-W (SPRINGER, 2012-08)
- A proximity-aware load balancing in peer-to-peer-based volunteer computing systems
  Ghafarian, T ; Deldari, H ; Javadi, B ; Buyya, R (SPRINGER, 2013-08)
- A time decoupling approach for studying forum dynamics
  Kan, A ; Chan, J ; Hayes, C ; Hogan, B ; Bailey, J ; Leckie, C (SPRINGER, 2013-11)
- An enhanced XCS rule discovery module using feature ranking
  Abedini, M ; Kirley, M (SPRINGER HEIDELBERG, 2013-06)
- Automatic keyphrase extraction from scientific articles
  Kim, SN ; Medelyan, O ; Kan, M-Y ; Baldwin, T (SPRINGER, 2013-09)
- Conservative scales in packing problems
  Belov, G ; Kartak, VM ; Rohling, H ; Scheithauer, G (SPRINGER, 2013-03)