Transcribing Diverse Voices: Using Whisper for ICE corpora

Odette Scharenborg (Ed.). Proceedings of Interspeech 2025. https://www.isca-archive.org/index.html: ISCA Archive 2025, pp. 3359-3363

Year of publication: 2025

Publication type: Miscellaneous (conference paper)

Language: English

DOI/URN: 10.21437/Interspeech.2025-1980

Verified by: Library

Abstract


The precise transcription of speech data is crucial yet work-intensive in the field of sociolinguistics. Although recent advancements in end-to-end ASR (e.g. Whisper) offer great potential across various disciplines, these models have rarely been tested for sociolinguistic corpus transcription. This study addresses this gap by harnessing all Whisper models for the re-transcription of classic sociolinguistic reference corpora of non-standard varieties: ICE Nigeria and ICE Scotland. Employing WER metrics, the study utilizes linear mixed-effects modelling to determine significant factors affecting transcription accuracy. The results show that Whisper can manage both varieties, though it is slightly less accurate for Nigerian English. An increased model size reduces WER and boosts robustness, though accuracy varies by sound file. While Whisper proves useful for corpus transcription work overall, challenges such as speaker diarization, hallucinations and idealized transcriptions persist.

Authors


Weilinghoff, Andreas (Author)