Computing and Information Systems - Research Publications

Permanent URI for this collection

Search Results

Now showing 1 - 10 of 365
  • Item
    Thumbnail Image
    Towards a semantic lexicon for biological language processing
    Verspoor, K (HINDAWI LTD, 2005)
    This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.
  • Item
    Thumbnail Image
    Assessment of disease named entity recognition on a corpus of annotated sentences
    Jimeno, A ; Jimenez-Ruiz, E ; Lee, V ; Gaudan, S ; Berlanga, R ; Rebholz-Schuhmann, D (BMC, 2008)
    BACKGROUND: In recent years, the recognition of semantic types from the biomedical scientific literature has been focused on named entities like protein and gene names (PGNs) and gene ontology terms (GO terms). Other semantic types like diseases have not received the same level of attention. Different solutions have been proposed to identify disease named entities in the scientific literature. While matching the terminology with language patterns suffers from low recall (e.g., Whatizit) other solutions make use of morpho-syntactic features to better cover the full scope of terminological variability (e.g., MetaMap). Currently, MetaMap that is provided from the National Library of Medicine (NLM) is the state of the art solution for the annotation of concepts from UMLS (Unified Medical Language System) in the literature. Nonetheless, its performance has not yet been assessed on an annotated corpus. In addition, little effort has been invested so far to generate an annotated dataset that links disease entities in text to disease entries in a database, thesaurus or ontology and that could serve as a gold standard to benchmark text mining solutions. RESULTS: As part of our research work, we have taken a corpus that has been delivered in the past for the identification of associations of genes to diseases based on the UMLS Metathesaurus and we have reprocessed and re-annotated the corpus. We have gathered annotations for disease entities from two curators, analyzed their disagreement (0.51 in the kappa-statistic) and composed a single annotated corpus for public use. Thereafter, three solutions for disease named entity recognition including MetaMap have been applied to the corpus to automatically annotate it with UMLS Metathesaurus concepts. The resulting annotations have been benchmarked to compare their performance. CONCLUSIONS: The annotated corpus is publicly available at ftp://ftp.ebi.ac.uk/pub/software/textmining/corpora/diseases and can serve as a benchmark to other systems. In addition, we found that dictionary look-up already provides competitive results indicating that the use of disease terminology is highly standardized throughout the terminologies and the literature. MetaMap generates precise results at the expense of insufficient recall while our statistical method obtains better recall at a lower precision rate. Even better results in terms of precision are achieved by combining at least two of the three methods leading, but this approach again lowers recall. Altogether, our analysis gives a better understanding of the complexity of disease annotations in the literature. MetaMap and the dictionary based approach are available through the Whatizit web service infrastructure (Rebholz-Schuhmann D, Arregui M, Gaudan S, Kirsch H, Jimeno A: Text processing through Web services: Calling Whatizit. Bioinformatics 2008, 24:296-298).
  • Item
    Thumbnail Image
    Unveiling Hidden Unstructured Regions in Process Models
    Polyvyanyy, A ; Garcia-Banuelos, L ; Weske, M ; Meersman, R ; Dillon, T ; Herrero, P (SPRINGER-VERLAG BERLIN, 2009-01-01)
    Process models define allowed process execution scenarios. The models are usually depicted as directed graphs, with gateway nodes regulating the control flow routing logic and with edges specifying the execution order constraints between tasks. While arbitrarily structured control flow patterns in process models complicate model analysis, they also permit creativity and full expressiveness when capturing non-trivial process scenarios. This paper gives a classification of arbitrarily structured process models based on the hierarchical process model decomposition technique. We identify a structural class of models consisting of block structured patterns which, when combined, define complex execution scenarios spanning across the individual patterns. We show that complex behavior can be localized by examining structural relations of loops in hidden unstructured regions of control flow. The correctness of the behavior of process models within these regions can be validated in linear time. These observations allow us to suggest techniques for transforming hidden unstructured regions into block-structured ones.
  • Item
    Thumbnail Image
    On application of structural decomposition for process model abstraction
    Polyvyanyy, A ; Smirnov, S ; Weske, M (Springer Verlag, 2009-12-01)
    Real world business process models may consist of hundreds of elements and have sophisticated structure. Although there are tasks where such models are valuable and appreciated, in general complexity has a negative influence on model comprehension and analysis. Thus, means for managing the complexity of process models are needed. One approach is abstraction of business process models - creation of a process model which preserves the main features of the initial elaborate process model, but leaves out insignificant details. In this paper we study the structural aspects of process model abstraction and introduce an abstraction approach based on process structure trees (PST). The developed approach assures that the abstracted process model preserves the ordering constraints of the initial model. It surpasses pattern-based process model abstraction approaches, allowing to handle graph-structured process models of arbitrary structure. We also provide an evaluation of the proposed approach.
  • Item
    Thumbnail Image
    Flexible Process Graph: A Prologue
    Polyvyanyy, A ; Weske, M ; Meersman, R ; Tari, Z (SPRINGER-VERLAG BERLIN, 2008-01-01)
    Businesses document their operational processes as process models. The common practice is to represent process models as directed graphs. The nodes of a process graph represent activities and directed edges constitute activity ordering constraints. A flexible process graph modeling approach proposes to generalize process graph structure to a hypergraph. Obtained process structure aims at formalization of ad-hoc process control flow. In this paper we discuss aspects relevant to concurrent execution of process activities in a collaborative manner organized as a flexible process graph. We provide a real world flexible process scenario to illustrate the approach.
  • Item
    Thumbnail Image
    Process Model Abstraction: A Slider Approach
    Polyvyanyy, A ; Smirnov, S ; Weske, M (IEEE COMPUTER SOC, 2008-01-01)
    Process models provide companies efficient means for managing their business processes. Tasks where process models are employed are different by nature and require models of various abstraction levels. However, maintaining several models of one business process involves a lot of synchronization effort and is erroneous. Business process model abstraction assumes a detailed model of a process to be available and derives coarse grained models from it. The task of abstraction is to tell significant model elements from insignificant ones and to reduce the latter. In this paper we argue that process model abstraction can be driven by different abstraction criteria. Criterion choice depends on a task which abstraction facilitates. We propose an abstraction slider-a mechanism that allows user control of the model abstraction level. We discuss examples of combining the slider with different abstraction criteria and sets of process model transformation rules.
  • Item
    Thumbnail Image
    An ontology-based service discovery approach for the provisioning of product-service bundles
    Knackstedt, R ; Kuropka, D ; MĂĽller, O ; Polyvyanyy, A (ECIS, 2008-12-01)
    More and more traditional manufacturing companies bundle their products with services to offer integrated solutions. Some of these services can be digitized completely and thus fully delivered electronically. Other services require the physical integration of external factors, but can still be coordinated electronically. In both cases companies face the problem of discovering concrete service offerings in the market. Based on ideas from the web service discovery discipline we propose a meetin- the-middle approach between heavy-weight semantic technologies and simple full-text search to address this issue. Our approach is able to identify and process semantic and linguistic relations in service descriptions and thus delivers better results than syntax-based search. However – unlike most semantic approaches – it does not require the use of any artificial language and thus requires less resources and skills for both service providers and consumers. To fully realize the potentials of the proposed approach a domain ontology is needed. In this research-in-progress paper we construct such an ontology for the domain of product-service bundles through analysis and synthesis of related work on service description. This will serve as an anchor for future research to iteratively improve and evaluate the ontology through collaborative design efforts and practical application.
  • Item
    Thumbnail Image
    Semantic Querying of Business Process Models
    Awad, A ; Polyvyanyy, A ; Weske, M (IEEE COMPUTER SOC, 2008)
  • Item
    Thumbnail Image
    Reducing complexity of large EPCs
    Polyvyanyy, A ; Smirnov, S ; Weske, M ( 2008-12-01)
  • Item
    Thumbnail Image
    Hypergraph-Based Modeling of Ad-Hoc Business Processes
    Polyvyanyy, A ; Weske, M ; Ardagna, D ; Mecella, M ; Yang, J (SPRINGER-VERLAG BERLIN, 2009-01-01)
    Process models are usually depicted as directed graphs, with nodes representing activities and directed edges control flow. While structured processes with pre-defined control flow have been studied in detail, flexible processes including ad-hoc activities need further investigation. This paper presents flexible process graph, a novel approach to model processes in the context of dynamic environment and adaptive process participants’ behavior. The approach allows defining execution constraints, which are more restrictive than traditional ad-hoc processes and less restrictive than traditional control flow, thereby balancing structured control flow with unstructured ad-hoc activities. Flexible process graph focuses on what can be done to perform a process. Process participants’ routing decisions are based on the current process state. As a formal grounding, the approach uses hypergraphs, where each edge can associate any number of nodes. Hypergraphs are used to define execution semantics of processes formally. We provide a process scenario to motivate and illustrate the approach.