Laboratoire Interdisciplinaire des Sciences du Numérique
Conference paper, Year: 2022

Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System

Abstract

Multiple studies have shown that existing NMT systems exhibit some form of "gender bias". As a result, MT output appears to err more often on feminine forms and to amplify social gender misrepresentations, which is potentially harmful to users and practitioners of these technologies. This paper continues this line of investigation and reports results obtained with a new test set under strictly controlled conditions. This setting allows us to better understand the multiple inner mechanisms that cause these biases, which include the linguistic expression of gender, the unbalanced distribution of masculine and feminine forms in the language, the modelling of morphological variation, and the training process dynamics. To counterbalance these effects, we formulate several proposals and notably show that modifying the training loss can effectively mitigate such biases.
Main file
source_transfert.pdf (302.37 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03912438, version 1 (24-12-2022)

Identifiers

  • HAL Id: hal-03912438, version 1

Cite

Guillaume Wisniewski, Lichao Zhu, Nicolas Ballier, François Yvon. Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System. BlackboxNLP 2022, Dec 2022, Abu Dhabi, United Arab Emirates. ⟨hal-03912438⟩
