Boosting up the sentiment analysis models’ accuracy by blending multi-label learning with a large sentiment lexicon

dc.contributor.author: Kokkinakis, Dimitrios
dc.contributor.editor: Nermo, Magnus
dc.contributor.editor: Papadopoulou Skarp, Frantzeska
dc.contributor.editor: Tienken, Susanne
dc.contributor.editor: Widholm, Andreas
dc.contributor.editor: Blåder, Anna
dc.date.accessioned: 2025-12-19T12:41:11Z
dc.date.available: 2025-12-19T12:41:11Z
dc.date.issued: 2025
dc.description.abstract: This study compares sentiment analysis approaches for Swedish texts using a manually annotated gold-standard dataset. Two methods were examined: i) a multi-label sentiment classifier trained for Swedish, and ii) the Swedish version of VADER, a lexicon-based tool that computes sentiment scores from a vocabulary of polarity-weighted words. The analysis also examined agreement and disagreement between the two methods, with a focus on mixed or context-dependent sentiment. Results indicate that the multi-label classifier aligns more closely with human judgments, especially for medium- or long-text segments with complex or subtle emotional tones. VADER, while prone to errors in idiomatic or nuanced expressions, performs reliably on short, informal utterances, offering computational efficiency and transparency. A hybrid approach combining classifier predictions with lexicon-based scores was investigated to leverage their complementary strengths. Findings underscore the value of rigorous evaluation against human annotations and highlight strategies to improve sentiment analysis in under-resourced languages such as Swedish. (en)
dc.identifier.issn: 1736-6305
dc.identifier.uri: https://hdl.handle.net/10062/118298
dc.language.iso: en
dc.publisher: Tartu University Library
dc.relation.ispartofseries: NEALT Proceedings Series 60
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: sentiment analysis
dc.subject: multi-label classifier
dc.subject: multi-class model
dc.subject: lexicon-based method (VADER/svVADER)
dc.subject: Swedish dataset
dc.title: Boosting up the sentiment analysis models’ accuracy by blending multi-label learning with a large sentiment lexicon
dc.type: Article

Files

Original bundle

Name: paper_8.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format