Computing and Information Systems - Theses

From Discourse and Keyphrases, to Language Modeling in Automatic Summarization
Fajri, Fajri (2022)
This thesis aims to enhance single-document automatic summarization by exploring four dimensions: language models, discourse, keyphrases, and evaluation.

First, progress on language models and automatic summarization has been made predominantly in English, which leaves open the question of whether these advances transfer effectively to other languages. To address this, we perform a case study on Indonesian by releasing two pre-trained language models (IndoBERT and IndoBERTweet) and two large-scale summarization corpora (Liputan6 and LipKey). While our findings suggest that current approaches do work effectively in Indonesian, we identify particular challenges in evaluating Indonesian text summarization caused by morphological variation, synonyms, and abbreviations in system-generated summaries.

Second, modern summarization systems are built on pre-trained language models, which serve as their foundation. However, it remains unclear whether these language models truly learn the summarization task or simply memorize patterns in the input documents and human-written summaries. In this thesis, we argue that these language models are still imperfect, and we investigate the benefits of discourse information and keyphrases for summarization systems: discourse provides information about text organization, while keyphrases capture succinct and salient words about the text. To test this hypothesis, we first perform discourse probing on pre-trained language models to understand the extent to which they capture discourse relations, and we introduce a novel approach to discourse parsing, the task of recovering the discourse structure of a document. We then explicitly incorporate discourse and keyphrases into summarization systems and find that the quality of machine-generated summaries improves.

Lastly, despite significant progress in the development of summarization models, both automatic and manual evaluation of text summarization remain understudied. Reliable and scalable evaluation is critical for measuring research progress in summarization, and ROUGE, the de facto summarization evaluation metric, is inadequate: it assesses summary quality only by comparing word overlap between machine-generated and human-written summaries, while broader aspects such as faithfulness (the extent to which the generated summary contains genuine details found in the document) and linguistic quality (e.g. fluency of the language) are not covered. The final contribution of this thesis is a comprehensive automatic evaluation framework for text summarization that compiles prominent aspects used in the manual evaluations of prior work. We introduce this proposal as the FFCI framework, comprising four aspects: faithfulness, focus, coverage, and inter-sentential coherence, and we propose methods to automatically assess summarization quality along each of these aspects.
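
As a rough illustration of how released pre-trained language models such as those described above are typically consumed, the sketch below loads an Indonesian encoder with the Hugging Face transformers library. The hub identifier is an assumption (a placeholder), not necessarily the official release name; the authoritative checkpoint names should be taken from the thesis or its accompanying repositories.

```python
# Minimal sketch: loading an Indonesian pre-trained encoder with Hugging Face
# transformers. The hub identifier below is an assumption, not necessarily the
# official release name for IndoBERT.
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "indolem/indobert-base-uncased"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

# Encode an Indonesian sentence and inspect the contextual embeddings.
inputs = tokenizer("Selamat pagi, apa kabar?", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, num_tokens, hidden_size)
```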
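
To make the word-overlap criticism of ROUGE concrete, here is a minimal, self-contained sketch of ROUGE-1 F1 (unigram overlap). It illustrates the underlying idea only; the official ROUGE implementation adds stemming, multi-reference handling, and other options.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between a system summary and a reference."""
    cand_tokens = Counter(candidate.lower().split())
    ref_tokens = Counter(reference.lower().split())
    # Clipped overlap: each candidate token counts at most as often as it
    # appears in the reference.
    overlap = sum((cand_tokens & ref_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_tokens.values())
    recall = overlap / sum(ref_tokens.values())
    return 2 * precision * recall / (precision + recall)

# Only literal token matches ("on", "monday") are rewarded, even though the
# meaning is close -- the same limitation noted above for synonyms and
# morphological variants in Indonesian summaries.
print(rouge1_f1("the cabinet was reshuffled on monday",
                "ministers were replaced on monday"))
```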