Scaling conditional random fields using error correcting codes
Author: COHN, TREVOR; SMITH, ANDREW; OSBORNE, MILES
Source Title: Proceedings, 43rd Annual Meeting of the Association for Computational Linguistics
University of Melbourne Author/s: Cohn, Trevor
Affiliation: Engineering: Department of Computer Science and Software Engineering
Document Type: Conference Paper
Citations: Cohn, T., Smith, A., & Osborne, M. (2005). Scaling conditional random fields using error correcting codes. In Proceedings, 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan.
Access Status: Open Access
Conditional Random Fields (CRFs) have been applied with considerable success to a number of natural language processing tasks. However, these tasks have mostly involved very small label sets. When deployed on tasks with larger label sets, the requirements for computational resources mean that training becomes intractable. This paper describes a method for training CRFs on such tasks, using error correcting output codes (ECOC). A number of CRFs are independently trained on the separate binary labelling tasks of distinguishing between a subset of the labels and its complement. During decoding, these models are combined to produce a predicted label sequence which is resilient to errors by individual models. Error-correcting CRF training is much less resource intensive and has a much faster training time than a standardly formulated CRF, while decoding performance remains quite comparable. This allows us to scale CRFs to previously impossible tasks, as demonstrated by our experiments with large label sets.
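The error-correcting idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`exhaustive_code`, `hamming`, `decode`), the four-label tag set, and the choice of an exhaustive code with per-token Hamming-distance decoding are all assumptions made for illustration, and the binary predictions are supplied by hand where the paper would use the outputs of independently trained binary CRFs.

```python
from itertools import combinations


def exhaustive_code(labels):
    """Assign each label a codeword; each bit position is one binary task.

    Every bit corresponds to a split of the label set into a subset and its
    complement. The first label is always placed in the subset so that no
    split is generated twice, and the full set is skipped because it would
    give a task with no negative examples.
    """
    first, rest = labels[0], labels[1:]
    columns = []
    for size in range(len(rest)):  # sizes 0 .. len(rest) - 1
        for extra in combinations(rest, size):
            columns.append({first, *extra})
    return {lab: tuple(int(lab in col) for col in columns) for lab in labels}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bit_rows, code):
    """Pick, for each token, the label whose codeword is nearest in Hamming distance.

    bit_rows[t] holds one 0/1 prediction per binary model for token t.
    """
    return [min(code, key=lambda lab: hamming(code[lab], bits)) for bits in bit_rows]


# Hypothetical tag set; the paper's experiments use much larger label sets.
labels = ["B-NP", "I-NP", "B-VP", "O"]
code = exhaustive_code(labels)

# Simulate one binary model being wrong on a token whose true label is "I-NP".
noisy = list(code["I-NP"])
noisy[2] = 1 - noisy[2]
predicted = decode([tuple(noisy)], code)
```

With four labels this construction yields seven binary tasks, and any two codewords differ in four positions, so a single wrong binary prediction per token still decodes to the intended label. An exhaustive code grows exponentially with the number of labels, so it only serves here to make the subset-versus-complement idea concrete.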
Keywords: error-correcting codes; machine learning; named entity recognition; natural language processing; part of speech tagging; noun phrase chunking