Browse by Author "Tolmats, Mona"


Open access
Automaatne kõnesünteesi kvaliteedi hindamine soome-ugri keeltele (Automatic Evaluation of Speech Synthesis Quality for Finno-Ugric Languages)
(University of Tartu, 2025) Tolmats, Mona; Rätsep, Liisa, supervisor; University of Tartu. Faculty of Science and Technology; University of Tartu. Institute of Computer Science
Automatic evaluation of synthesised speech quality accelerates the development of text-to-speech models by replacing costly human listening tests that produce mean opinion scores (MOS). This capability is especially valuable for low-resource languages, where only limited speech and text corpora are available and recruiting an adequate group of human evaluators is difficult. The aim of this thesis is to train a model that evaluates the naturalness of Estonian synthetic speech and generalises to other Finno-Ugric languages. A wav2vec 2.0 model was trained to predict mean opinion scores for the outputs of Estonian text-to-speech models. Separately, a wav2vec 2.0 model pre-trained with the SCOREQ loss function was fine-tuned, and the UTMOSv2 model was also adapted through fine-tuning. Training drew on three distinct datasets, while cross-lingual generalisability was evaluated on a single Võro-language test set. The experimental findings indicated that UTMOSv2 achieved the highest Pearson and Spearman correlations with human judgments and demonstrated superior generalisation to previously unseen Finno-Ugric languages.
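
The abstract compares models by the Pearson and Spearman correlations between predicted and human mean opinion scores. As a rough illustration of what those two metrics measure, the following minimal Python sketch computes both using only the standard library. It is not the thesis's evaluation code: the function names are hypothetical and the score lists are made-up example values, not results from the thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up example values (NOT from the thesis): human MOS per utterance
# and the scores a hypothetical predictor assigned to the same utterances.
human_mos = [1.8, 2.5, 3.1, 3.9, 4.4]
predicted_mos = [2.0, 2.4, 3.5, 3.6, 4.6]

print("Pearson: ", pearson(human_mos, predicted_mos))
print("Spearman:", spearman(human_mos, predicted_mos))
```

Pearson rewards predictions that track the human scores linearly, while Spearman only checks that utterances are ranked in the same order, which is why the two are commonly reported together for MOS predictors.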