Computing and Information Systems - Research Publications

  • Item
    Optimising Equal Opportunity Fairness in Model Training
    Shen, A ; Han, X ; Cohn, T ; Baldwin, T ; Frermann, L (Association for Computational Linguistics, 2022)
  • Item
    Evaluating Debiasing Techniques for Intersectional Biases
    Subramanian, S ; Han, X ; Baldwin, T ; Cohn, T ; Frermann, L (Association for Computational Linguistics, 2021-01-01)
    Bias is pervasive in NLP models, motivating the development of automatic debiasing techniques. Evaluation of NLP debiasing methods has largely been limited to binary attributes in isolation, e.g., debiasing with respect to binary gender or race; however, many corpora involve multiple such attributes, possibly with higher cardinality. In this paper we argue that a truly fair model must consider 'gerrymandering' groups which comprise not only single attributes but also intersectional groups. We evaluate a form of bias-constrained model which is new to NLP, as well as an extension of the iterative nullspace projection technique which can handle multiple protected attributes.
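    The abstract mentions extending iterative nullspace projection to multiple protected attributes. The following is a minimal, illustrative sketch of the standard iterative-nullspace-projection idea (repeatedly fit a linear attribute classifier, then project its predictive directions out of the representations), not the paper's extension or code; the function names and the intersectional label encoding are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """Projection matrix onto the nullspace of the rows of W."""
    # Orthonormal basis of W's row space via SVD, then project it out.
    _, S, Vt = np.linalg.svd(W, full_matrices=False)
    basis = Vt[S > 1e-10]                        # directions predictive of the attribute
    return np.eye(W.shape[1]) - basis.T @ basis

def inlp(X, z, n_iters=10):
    """Iteratively remove linearly decodable information about labels z from X."""
    P = np.eye(X.shape[1])
    X_proj = X.copy()
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X_proj, z)
        P = nullspace_projection(clf.coef_) @ P  # compose projections
        X_proj = X @ P.T                         # re-project the original features
    return X_proj, P

# One simple way to target intersectional ('gerrymandering') groups is to
# debias against the joint label, e.g. for two hypothetical binary attributes:
# z_joint = 2 * gender_labels + race_labels
```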
  • Item
    Fairness-aware Class Imbalanced Learning
    Subramanian, S ; Rahimi, A ; Baldwin, T ; Cohn, T ; Frermann, L (Association for Computational Linguistics, 2021-01-01)
    Class imbalance is a common challenge in many NLP tasks, and has clear connections to bias, in that bias in training data often leads to higher accuracy for majority groups at the expense of minority groups. However, there has traditionally been a disconnect between research on class-imbalanced learning and mitigating bias, and only recently have the two been looked at through a common lens. In this work we evaluate long-tail learning methods for tweet sentiment and occupation classification, and extend a margin-loss-based approach with methods to enforce fairness. We empirically show through controlled experiments that the proposed approaches help mitigate both class imbalance and demographic biases.
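    As a rough illustration of the kind of margin-loss objective referred to above, here is a label-distribution-aware margin (LDAM-style) cross-entropy with an optional extra margin for under-represented demographic groups. This is a sketch under assumed inputs, not the paper's loss; the per-group term in particular is illustrative.

```python
import torch
import torch.nn.functional as F

def group_margin_loss(logits, labels, class_counts, groups=None,
                      group_counts=None, max_margin=0.5, scale=30.0):
    """Cross-entropy with larger margins for rarer classes and, optionally,
    rarer groups. class_counts / group_counts are 1-D tensors of counts."""
    # Per-class margin proportional to n_c^(-1/4), rescaled so the rarest
    # class receives max_margin.
    m = 1.0 / class_counts.float() ** 0.25
    m = m * (max_margin / m.max())
    margin = m[labels]
    if groups is not None and group_counts is not None:
        g = 1.0 / group_counts.float() ** 0.25
        g = g * (max_margin / g.max())
        margin = margin + g[groups]              # illustrative fairness term
    # Subtract the margin from the gold-class logit before the softmax.
    adjusted = logits.clone()
    adjusted[torch.arange(len(labels)), labels] -= margin
    return F.cross_entropy(scale * adjusted, labels)
```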
  • Item
    Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics
    Mathur, N ; Baldwin, T ; Cohn, T (Association for Computational Linguistics, 2020-07)
    Automatic metrics are fundamental for the development and evaluation of machine translation systems. Judging whether, and to what extent, automatic metrics concur with the gold standard of human evaluation is not a straightforward problem. We show that current methods for judging metrics are highly sensitive to the translations used for assessment, particularly the presence of outliers, which often leads to falsely confident conclusions about a metric’s efficacy. Finally, we turn to pairwise system ranking, developing a method for thresholding performance improvement under an automatic metric against human judgements, which allows quantification of type I versus type II errors incurred, i.e., insignificant human differences in system quality that are accepted, and significant human differences that are rejected. Together, these findings suggest improvements to the protocols for metric evaluation and system performance evaluation in machine translation.
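    To make the type I / type II framing concrete, here is a small sketch of how such error rates could be tallied for pairwise system comparisons, given a metric-score gap threshold and human significance judgements. The data layout and threshold are assumptions, not the paper's protocol.

```python
from dataclasses import dataclass

@dataclass
class SystemPair:
    metric_delta: float        # metric(A) - metric(B)
    human_significant: bool    # did humans judge A and B significantly different?

def error_rates(pairs, threshold):
    accepted = [p for p in pairs if p.metric_delta > threshold]
    rejected = [p for p in pairs if p.metric_delta <= threshold]
    # Type I: the metric accepts a difference humans find insignificant.
    type1 = sum(not p.human_significant for p in accepted)
    # Type II: the metric rejects a difference humans find significant.
    type2 = sum(p.human_significant for p in rejected)
    return (type1 / max(len(accepted), 1),
            type2 / max(len(rejected), 1))
```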
  • Item
    Take and Took, Gaggle and Goose, Book and Read: Evaluating the Utility of Vector Differences for Lexical Relation Learning
    Vylomova, E ; Rimell, L ; Cohn, T ; Baldwin, T ; Erk, K ; Smith, NA (Association for Computational Linguistics, 2016)
    Recent work on word embeddings has shown that simple vector subtraction over pre-trained embeddings is surprisingly effective at capturing different lexical relations, despite lacking explicit supervision. Prior work has evaluated this intriguing result using a word analogy prediction formulation and hand-selected relations, but the generality of the finding over a broader range of lexical relation types and different learning settings has not been evaluated. In this paper, we carry out such an evaluation in two learning settings: (1) spectral clustering to induce word relations, and (2) supervised learning to classify vector differences into relation types. We find that word embeddings capture a surprising amount of information, and that, under suitable supervised training, vector subtraction generalises well to a broad range of relations, including over unseen lexical items.
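    The supervised setting described above can be illustrated with a very small sketch: represent each word pair by the difference of its embedding vectors and train a standard classifier over relation types. The embedding lookup, example pairs, and labels are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def diff_features(pairs, embeddings):
    """pairs: list of (w1, w2); embeddings: dict mapping word -> np.ndarray."""
    return np.stack([embeddings[w1] - embeddings[w2] for w1, w2 in pairs])

def train_relation_classifier(train_pairs, train_labels, embeddings):
    # e.g. (("took", "take"), "past-tense"), (("gaggle", "goose"), "collective-noun")
    X = diff_features(train_pairs, embeddings)
    return LogisticRegression(max_iter=1000).fit(X, train_labels)
```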