Learning with Noise-Contrastive Estimation: Easing training by learning to scale - Laboratoire Interdisciplinaire des Sciences du Numérique
Conference paper. Year: 2018

Learning with Noise-Contrastive Estimation: Easing training by learning to scale

Abstract

Noise-Contrastive Estimation (NCE) is a learning criterion that is regularly used to train neural language models in place of Maximum Likelihood Estimation, since it avoids the computational bottleneck caused by the output softmax. In this paper, we analyse and explain some of the weaknesses of this objective function, linked to the mechanism of self-normalization, by closely monitoring comparative experiments. We then explore several remedies and modifications to propose tractable and efficient NCE training strategies. In particular, we propose to make the scaling factor a trainable parameter of the model, and to use the noise distribution to initialize the output bias. Though simple, these solutions yield stable and competitive performance on both small- and large-scale language modelling tasks.
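To make the two remedies in the abstract concrete, below is a minimal sketch in PyTorch, assuming the standard binary-classification form of the NCE objective. The names `NCEOutput`, `nce_loss`, and `unigram_probs`, along with all shapes and hyperparameters, are illustrative assumptions, not the authors' code: the scaling factor is learned as a single scalar `log_z`, and the output bias is initialized to the log of the noise (unigram) distribution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NCEOutput(nn.Module):
    """Output layer producing unnormalized log-scores, with a learned scale."""
    def __init__(self, hidden_dim: int, vocab_size: int, unigram_probs: torch.Tensor):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)   # output word embeddings
        # Remedy (ii): initialize the output bias with the log noise probabilities.
        self.bias = nn.Parameter(unigram_probs.log())
        # Remedy (i): make the scaling factor a trainable scalar (log-domain),
        # instead of fixing the partition function to a constant.
        self.log_z = nn.Parameter(torch.zeros(()))

    def log_score(self, hidden: torch.Tensor, words: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, dim); words: (batch, n) word indices (target + noise)
        w = self.embed(words)                               # (batch, n, dim)
        b = self.bias[words]                                # (batch, n)
        return torch.einsum("bd,bnd->bn", hidden, w) + b - self.log_z


def nce_loss(output_layer, hidden, targets, noise_dist, k=25):
    # Standard NCE: classify the observed word against k noise samples, with
    # sigmoid(s(w, h) - log(k * q(w))) as the posterior probability of "data".
    batch = targets.size(0)
    noise = torch.multinomial(noise_dist, batch * k, replacement=True).view(batch, k)
    words = torch.cat([targets.unsqueeze(1), noise], dim=1)  # (batch, 1 + k)
    logits = output_layer.log_score(hidden, words) - (k * noise_dist[words]).log()
    labels = torch.zeros_like(logits)
    labels[:, 0] = 1.0                                       # column 0 holds the data word
    return F.binary_cross_entropy_with_logits(logits, labels)
```

Learning the scale in the log domain keeps the parameter unconstrained during optimization while the effective normalizer exp(log_z) remains positive; initializing the bias from the noise distribution starts the model's scores close to the unigram baseline, which is one plausible reading of why training stabilizes.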
Main file: C18-1261.pdf (573.48 KB)
Origin: publisher files authorized on an open archive

Dates and versions

hal-02912385, version 1 (05-08-2020)

Identifiers

  • HAL Id: hal-02912385, version 1

Cite

Matthieu Labeau, Alexandre Allauzen. Learning with Noise-Contrastive Estimation: Easing training by learning to scale. 27th International Conference on Computational Linguistics (COLING 2018), Aug 2018, Santa Fe, NM, United States. pp.3090-3101. ⟨hal-02912385⟩
