Does Preprocessing Matter? An Analysis of Acoustic Feature Importance in Deep Learning for Dialect Classification

dc.contributor.author: Fischbach, Lea
dc.contributor.author: Kleen, Caroline
dc.contributor.author: Flek, Lucie
dc.contributor.author: Lameli, Alfred
dc.contributor.editor: Johansson, Richard
dc.contributor.editor: Stymne, Sara
dc.coverage.spatial: Tallinn, Estonia
dc.date.accessioned: 2025-02-17T14:16:28Z
dc.date.available: 2025-02-17T14:16:28Z
dc.date.issued: 2025-03
dc.description.abstract: This paper examines the effect of preprocessing techniques on spoken dialect classification using raw audio data. We focus on modifying Root Mean Square (RMS) amplitude, DC-offset, articulation rate (AR), pitch, and Harmonics-to-Noise Ratio (HNR) to assess their impact on model performance. Our analysis determines whether these features are important, irrelevant, or misleading for the classification task. To evaluate these effects, we use a pipeline that tests the significance of each acoustic feature through distortion and normalization techniques. While preprocessing did not directly improve classification accuracy, our findings reveal three key insights: deep learning models for dialect classification are generally robust to variations in the tested audio features, suggesting that normalization may not be necessary. We identify articulation rate as a critical factor, directly affecting the amount of information in audio chunks. Additionally, we demonstrate that intonation, specifically the pitch range, plays a vital role in dialect recognition.
dc.identifier.uri: https://hdl.handle.net/10062/107207
dc.language.iso: en
dc.publisher: University of Tartu Library
dc.relation.ispartofseries: NEALT Proceedings Series, No. 57
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Does Preprocessing Matter? An Analysis of Acoustic Feature Importance in Deep Learning for Dialect Classification
dc.type: Article
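
Illustrative note on the abstract: two of the preprocessing steps it names, DC-offset removal and RMS amplitude normalization, have standard textbook definitions, sketched below in plain Python. This is not the authors' pipeline; the function names, the target RMS of 0.1, and the synthetic test tone are all assumptions chosen for illustration, and the input is assumed to be a non-empty sequence of float samples.

```python
import math


def remove_dc_offset(samples):
    """Subtract the mean so the waveform is centered on zero."""
    mean = math.fsum(samples) / len(samples)
    return [s - mean for s in samples]


def rms(samples):
    """Root mean square amplitude of a non-empty sample sequence."""
    return math.sqrt(math.fsum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms=0.1):
    """Scale the signal so its RMS equals target_rms (assumed value)."""
    current = rms(samples)
    if current == 0.0:
        # Silence cannot be rescaled; return it unchanged.
        return list(samples)
    gain = target_rms / current
    return [s * gain for s in samples]


# Synthetic one-second 220 Hz tone with an artificial DC offset of 0.2.
sample_rate = 16000
tone = [
    0.5 * math.sin(2 * math.pi * 220 * n / sample_rate) + 0.2
    for n in range(sample_rate)
]
centered = remove_dc_offset(tone)
normalized = normalize_rms(centered, target_rms=0.1)
```

Removing the offset before normalizing matters in this sketch because a constant offset inflates the RMS value, so the gain computed from an uncentered signal would be too small.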

Files

Original bundle

Name: 2025_nodalida_1_16.pdf
Size: 9.98 MB
Format: Adobe Portable Document Format