Significant Revision Identification between Revised Texts in a Multi-Author Environment
Author: Tan, Ping Ping
Affiliation: Computing and Information Systems
Document Type: PhD thesis
Access Status: Open Access
© 2019 Ping Ping Tan
Despite advances in collaborative writing tools, the track changes capability remains limited to highlighting syntactic changes, and authors are still required to read through each revision manually. We envision a collaborative authoring system in which an author can accept all minor edits first and then focus on the substantial changes. The primary goal of this thesis is to develop a computational framework for significant revision identification, a task that paraphrase approaches cannot fully support. An existing taxonomy of revision analysis categorises revisions into surface changes (i.e. meaning is unaffected) and text-base changes (i.e. meaning is affected). Surface changes are further divided into formal changes and meaning-preserving changes, while text-base changes are sub-divided into micro-structure and macro-structure changes. However, the taxonomy lacks the detail needed for computational modelling. Through examination of work in psycho-linguistics, introspective analysis, and feedback from both authors and non-authors on what constitutes a significant revision, a conceptual framework for significant revision identification is proposed. An inter-rater agreement of Krippendorff's alpha = 0.745 was obtained for the annotation between the authors and non-authors. The core concept of our proposed approach is bi-directional textual entailment assessment. We demonstrate that this concept is computationally feasible by relying on existing textual entailment systems. Our proposed approach is more accurate (micro-averaged F1 = 0.541) than several baseline approaches based on edit distance, which are similar to the track changes capability currently built into most word processors. Computationally identifying significant revisions between two versions of a text document has the potential to improve the revision process in a multi-author environment, where multiple revisions are made by different authors.
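As a rough illustration of the contrast drawn in the abstract, the Python sketch below sets an edit-distance baseline beside a bi-directional entailment check. It is not the thesis's implementation. The decision rule (a revision counts as significant unless each version entails the other), the threshold value, and all function names are assumptions made for this example, and `toy_entails` is a deliberately crude token-overlap stand-in for a real textual entailment system.

```python
from typing import Callable


def token_edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed over whitespace-separated tokens."""
    x, y = a.split(), b.split()
    prev = list(range(len(y) + 1))
    for i, xi in enumerate(x, start=1):
        curr = [i]
        for j, yj in enumerate(y, start=1):
            cost = 0 if xi == yj else 1
            curr.append(min(prev[j] + 1,          # delete a token
                            curr[j - 1] + 1,      # insert a token
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def baseline_is_significant(original: str, revised: str, threshold: int = 2) -> bool:
    """Hypothetical baseline: flag a revision once enough tokens have changed."""
    return token_edit_distance(original, revised) >= threshold


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Stand-in only: true when every hypothesis token occurs in the premise.

    A real system would call a trained textual entailment model here.
    """
    return set(hypothesis.lower().split()) <= set(premise.lower().split())


def is_significant(original: str, revised: str,
                   entails: Callable[[str, str], bool] = toy_entails) -> bool:
    """Assumed rule: not significant only if entailment holds in both directions."""
    return not (entails(original, revised) and entails(revised, original))


old = "the cat sat on the mat"
reordered = "on the mat the cat sat"   # many token edits, same content
swapped = "the dog sat on the mat"     # one token edit, different content

print(baseline_is_significant(old, reordered), is_significant(old, reordered))
print(baseline_is_significant(old, swapped), is_significant(old, swapped))
```

In this toy setting the two checks disagree on both pairs: the reordering involves many token edits but passes the entailment check in both directions, while the single-word substitution involves one token edit but fails it. That is the kind of gap between syntactic and meaning-level change that the abstract describes.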
Keywords: text revision, significant revision identification, textual entailment