Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR

Parsons, Phoebe; Kvale, Knut; Svendsen, Torbjørn; Salvi, Giampiero

dc.contributor.author	Parsons, Phoebe
dc.contributor.author	Kvale, Knut
dc.contributor.author	Svendsen, Torbjørn
dc.contributor.author	Salvi, Giampiero
dc.contributor.editor	Johansson, Richard
dc.contributor.editor	Stymne, Sara
dc.coverage.spatial	Tallinn, Estonia
dc.date.accessioned	2025-02-18T13:59:06Z
dc.date.available	2025-02-18T13:59:06Z
dc.date.issued	2025-03
dc.description.abstract	We introduce “Match ‘em”, a new framework for aligning output from automatic speech recognition (ASR) with reference transcriptions. It allows a more detailed analysis of the errors produced by end-to-end ASR systems than word error rate (WER) alone. Match ‘em performs the alignment at both the word and character levels, with each level relying on information from the other to provide the most meaningful global alignment. At the character level, we define a character similarity metric motivated by speech production. At the word level, we rely on character similarities to define word similarity and, additionally, reconcile compounding errors (insertion or deletion of spaces). We evaluated Match ‘em on transcripts of three European languages produced by wav2vec2 and Whisper. We show that Match ‘em yields more similar word substitution pairs and that compound reconciliation can capture a broad range of spacing errors. We believe Match ‘em to be a valuable tool for ASR error analysis across many languages.
dc.identifier.uri	https://hdl.handle.net/10062/107240
dc.language.iso	en
dc.publisher	University of Tartu Library
dc.relation.ispartofseries	NEALT Proceedings Series, No. 57
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title	Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR
dc.type	Article
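
The abstract describes word alignment driven by character similarity. As a rough illustration only, the sketch below aligns reference and hypothesis words with a dynamic program whose substitution cost is a length-normalised, weighted character edit distance. It is not the authors' implementation: the vowel/consonant similarity classes, the 0.5 cost, and the names `char_cost`, `char_distance`, `word_cost` and `align_words` are all assumptions made for this example, and compound reconciliation is not covered.

```python
from typing import List, Optional, Tuple

# Hypothetical similarity classes: a crude stand-in for the paper's
# speech-production-motivated metric, which is NOT reproduced here.
VOWELS = set("aeiouyæøåäöõü")


def char_cost(a: str, b: str) -> float:
    """Substitution cost in [0, 1]; cheaper within the same assumed class."""
    if a == b:
        return 0.0
    if a.isalpha() and b.isalpha() and ((a in VOWELS) == (b in VOWELS)):
        return 0.5  # assumed cost for vowel-vowel or consonant-consonant
    return 1.0


def char_distance(ref: str, hyp: str) -> float:
    """Weighted Levenshtein distance; insertions and deletions cost 1."""
    prev = [float(j) for j in range(len(hyp) + 1)]
    for i, rc in enumerate(ref, 1):
        cur = [float(i)]
        for j, hc in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1.0,               # delete rc
                           cur[j - 1] + 1.0,            # insert hc
                           prev[j - 1] + char_cost(rc, hc)))
        prev = cur
    return prev[-1]


def word_cost(ref: str, hyp: str) -> float:
    """Character distance normalised by the longer word, so it lies in [0, 1]."""
    if not ref and not hyp:
        return 0.0
    return char_distance(ref, hyp) / max(len(ref), len(hyp))


def align_words(ref: List[str], hyp: List[str]) -> List[Tuple[Optional[str], Optional[str]]]:
    """Align word lists; None marks a deleted or inserted word."""
    n, m = len(ref), len(hyp)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = float(i)
    for j in range(1, m + 1):
        dp[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(dp[i - 1][j] + 1.0,
                           dp[i][j - 1] + 1.0,
                           dp[i - 1][j - 1] + word_cost(ref[i - 1], hyp[j - 1]))
    # Trace back from the end, preferring substitution/match when it is optimal.
    pairs: List[Tuple[Optional[str], Optional[str]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and abs(
            dp[i][j] - (dp[i - 1][j - 1] + word_cost(ref[i - 1], hyp[j - 1]))
        ) < 1e-9:
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and abs(dp[i][j] - (dp[i - 1][j] + 1.0)) < 1e-9:
            pairs.append((ref[i - 1], None))  # word missing from hypothesis
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # extra word in hypothesis
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    print(align_words(["we", "saw", "them"], ["we", "then"]))
```

With these assumed costs, "them" is paired with the similar "then" and "saw" is treated as a deletion, rather than "saw" being substituted by "then".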

Files

Original bundle

Name: 2025_nodalida_1_48.pdf
Size: 200.01 KB
Format: Adobe Portable Document Format

Collections

Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)