Show simple item record

dc.contributor.author: Tan, Ping Ping
dc.date.accessioned: 2020-01-13T01:15:17Z
dc.date.available: 2020-01-13T01:15:17Z
dc.date.issued: 2019
dc.identifier.uri: http://hdl.handle.net/11343/233794
dc.description: © 2019 Ping Ping Tan
dc.description.abstract: Despite advances in collaborative writing tools, the track changes capability remains limited to highlighting syntactic changes, and authors are still required to read through each revision manually. We envision a collaborative authoring system in which an author could accept all minor edits first and then focus on the substantial changes. The primary goal of this thesis is to develop a computational framework for significant revision identification, a task that paraphrase approaches cannot fully support. An existing taxonomy of revision analysis categorises revisions into surface changes (i.e. no change in meaning) and text-base changes (i.e. a change in meaning). Surface changes are further divided into formal changes and meaning-preserving changes, while text-base changes are sub-divided into micro-structure and macro-structure changes. However, the taxonomy lacks the detail needed for computational modelling. Through examination of work in the domain of psycho-linguistics, introspective analysis, and feedback from both authors and non-authors on what constitutes a significant revision, a conceptual framework for significant revision identification is proposed. An inter-rater agreement of Krippendorff's alpha = 0.745 was obtained for the annotation between the authors and non-authors. The core concept of our proposed approach is bi-directional textual entailment assessment. We demonstrate that this concept is computationally feasible by relying on existing textual entailment systems. Our proposed approach is more accurate (micro-averaged F1 = 0.541) than several baseline approaches based on edit distance, which are similar to the track changes capability currently built into most word processors. Computationally identifying significant revisions between two versions of a text document has the potential to improve the revision process in a multi-author environment where multiple revisions are made by different authors.
dc.rights: Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.
dc.subject: text revision, significant revision identification, textual entailment
dc.title: Significant Revision Identification between Revised Texts in a Multi-Author Environment
dc.type: PhD thesis
melbourne.affiliation.department: Computing and Information Systems
melbourne.affiliation.faculty: Engineering
melbourne.thesis.supervisorname: Cornelia Verspoor
melbourne.contributor.author: Tan, Ping Ping
melbourne.thesis.supervisorothername: Timothy Miller
melbourne.tes.fieldofresearch1: 080107 Natural Language Processing
melbourne.tes.fieldofresearch2: 200402 Computational Linguistics
melbourne.tes.confirmed: true
melbourne.accessrights: Open Access


