Computing and Information Systems - Theses

    Toward semantic interoperability for software systems
    Lister, Kendall (2008)
    “In an ill-structured domain you cannot, by definition, have a pre-compiled schema in your mind for every circumstance and context you may find ... you must be able to flexibly select and arrange knowledge sources to most efficaciously pursue the needs of a given situation.” [57]

    To interact and collaborate effectively, agents, whether human or software, must be able to communicate through common understandings and compatible conceptualisations. Ontological differences that arise either from pre-existing assumptions or as side-effects of the process of specification are a fundamental obstacle that must be overcome before communication can occur. Similarly, the integration of information from heterogeneous sources is an unsolved problem. Efforts have been made to assist integration, through both methods and mechanisms, but automated integration remains an unachieved goal. Communication and information integration are problems of meaning and interaction, or semantic interoperability.

    This thesis contributes to the study of semantic interoperability by identifying, developing and evaluating three approaches to the integration of information. These approaches have in common that they are lightweight in nature, pragmatic in philosophy and general in application.

    The first work presented is an effort to integrate a massive, formal ontology and knowledge-base with semi-structured, informal heterogeneous information sources via a heuristic-driven, adaptable information agent. The goal of the work was to demonstrate a process by which task-specific knowledge can be identified and incorporated into the massive knowledge-base in such a way that it can be generally re-used.
The practical outcome of this effort was a framework that illustrates a feasible approach to providing the massive knowledge-base with an ontologically-sound mechanism for automatically generating task-specific information agents to dynamically retrieve information from semi-structured information sources without requiring machine-readable meta-data.

    The second work presented is based on reviving a previously published but neglected algorithm for inferring semantic correspondences between fields of tables from heterogeneous information sources. An adapted form of the algorithm is presented and evaluated first on relatively simple and consistent data collected from web services, in order to verify the original results, and then on poorly-structured and messy data collected from web sites, in order to explore the limits of the algorithm. The results are presented via standard measures and are accompanied by detailed discussions on the nature of the data encountered and an analysis of the strengths and weaknesses of the algorithm and the ways in which it complements other approaches that have been proposed.

    Acknowledging the cost and difficulty of integrating semantically incompatible software systems and information sources, the third work presented is a proposal and a working prototype for a web site to facilitate the resolution of semantic incompatibilities between software systems prior to deployment. It is based on the commonly-accepted software engineering principle that the cost of correcting faults increases exponentially as projects progress from phase to phase, with post-deployment corrections being significantly more costly than those performed earlier in a project’s life. The barriers to collaboration in software development are identified and steps are taken to overcome them.
The system presented draws on the recent successes of social and collaborative on-line projects such as SourceForge, Del.icio.us, digg and Wikipedia, and on a variety of techniques for ontology reconciliation, to provide an environment in which data definitions can be shared, browsed and compared, with recommendations automatically presented to encourage developers to adopt data definitions compatible with previously developed systems.

    In addition to the experimental works presented, this thesis contributes reflections on the origins of semantic incompatibility, with a particular focus on interaction between software systems, and between software systems and their users, as well as a detailed analysis of the existing body of research into methods and techniques for overcoming these problems.