Knowledge base enrichment via deep neural networks
Author: Trisedya, Bayu Distiawan
Affiliation: Computing and Information Systems
Document Type: PhD thesis
Access Status: Open Access
© 2020 Bayu Distiawan Trisedya
A knowledge base is a large repository that typically stores information about real-world entities. Several efforts have been made to develop knowledge bases in general and specific domains, such as DBpedia, YAGO, LinkedGeoData, and Wikidata. These knowledge bases contain millions of facts about entities. However, they are far from complete and require continuous enrichment and curation. In this thesis, we study three common methods to enrich a knowledge base. The first is a Knowledge Base Alignment method that aims to find entities in two knowledge bases that represent the same real-world entity and then integrates these knowledge bases based on the aligned entities. Many knowledge bases have been created separately for particular purposes with overlapping entity coverage. These knowledge bases are complementary to each other in terms of completeness. We may integrate such knowledge bases to form a more extensive knowledge base for knowledge inference. The second is a Relation Extraction method that aims to extract entities and their relationships from sentences of a corpus and map them to an existing knowledge base. Given the large amount of unstructured data sources (i.e., sentences), relation extraction is an essential method for extracting facts to enrich a knowledge base. The third is a Description Generation method that aims to generate a sentence to describe a target entity from its properties in a knowledge base. The generated description can be used to enrich the presentation of the knowledge in a knowledge base, which can later be used in many downstream applications. For example, in question answering, the generated sentence can be used to describe the entity in the answer. For knowledge base alignment, we propose an embedding-based entity alignment model. Our model exploits attribute embeddings that capture the similarity between entities in different knowledge bases.
We also propose an end-to-end relation extraction model for knowledge base enrichment. The proposed model integrates the extraction and canonicalization tasks. This integration helps the model reduce the error propagation between relation extraction and named entity disambiguation that existing approaches are prone to. For description generation, we propose a content-plan-based attention model to generate sentences from knowledge base triples in the form of a star-shaped graph. We further propose a graph-based encoder to handle arbitrarily shaped graphs for generating entity descriptions. Extensive experimental results show that the proposed methods outperform the state-of-the-art methods in the knowledge base enrichment problems studied.
Keywords: Knowledge base; Knowledge graph; Deep neural networks; Knowledge base alignment; Sentence generation; Relation extraction