Preprint, Working Paper. Year: 2023

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

Abstract

The ability to dynamically adjust the computational load of neural models at inference time is crucial for on-device processing, where computational power is limited and time-varying. Established approaches for neural model compression exist, but they yield architecturally static models. In this paper, we investigate early-exit architectures, which rely on intermediate exit branches, applied to large-vocabulary speech recognition. This enables dynamic models that adapt their computational cost to the available resources and to recognition performance. Unlike previous works, besides using pre-trained backbones we also train the model from scratch with an early-exit architecture. Experiments on public datasets show that early-exit architectures trained from scratch not only preserve performance when fewer encoder layers are used, but also improve task accuracy compared to single-exit or pre-trained models. Additionally, we investigate an exit selection strategy based on posterior probabilities as an alternative to frame-based entropy.
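The abstract contrasts two exit-selection criteria: frame-based entropy and a rule based on posterior probabilities. The following is a minimal PyTorch sketch of how such selection could work at inference time; the function names, thresholds, and module interfaces are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of early-exit selection for an encoder with
# intermediate exit branches. Thresholds and interfaces are assumed,
# not taken from the paper.
import torch

def average_frame_entropy(log_probs: torch.Tensor) -> float:
    """Mean per-frame entropy of exit posteriors, shape (T, vocab)."""
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1)  # (T,)
    return entropy.mean().item()

def average_max_posterior(log_probs: torch.Tensor) -> float:
    """Mean over frames of the highest posterior probability."""
    return log_probs.exp().max(dim=-1).values.mean().item()

def run_with_early_exit(encoder_layers, exit_heads, feats,
                        use_posterior=False,
                        entropy_threshold=0.5,
                        posterior_threshold=0.9):
    """Run encoder layers sequentially and stop at the first exit
    whose confidence criterion is met, returning its posteriors."""
    hidden = feats
    for layer, head in zip(encoder_layers, exit_heads):
        hidden = layer(hidden)
        log_probs = head(hidden).log_softmax(dim=-1)  # (T, vocab)
        if use_posterior:
            # Posterior-based rule: exit once predictions are confident.
            if average_max_posterior(log_probs) >= posterior_threshold:
                return log_probs
        else:
            # Entropy-based rule: exit once per-frame uncertainty is low.
            if average_frame_entropy(log_probs) <= entropy_threshold:
                return log_probs
    return log_probs  # fall through to the final (deepest) exit
```

In this sketch, each exit head produces frame-level posteriors; computation stops as soon as the chosen confidence measure crosses its threshold, so easier inputs use fewer encoder layers.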
Main file: 2309.09546.pdf (4.17 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04216190, version 1 (23-09-2023)

Identifiers

Cite

George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, et al. Training dynamic models using early exits for automatic speech recognition on resource-constrained devices. 2023. ⟨hal-04216190⟩