Boosting up the sentiment analysis models’ accuracy by blending multi-label learning with a large sentiment lexicon
| dc.contributor.author | Kokkinakis, Dimitrios | |
| dc.contributor.editor | Nermo, Magnus | |
| dc.contributor.editor | Papadopoulou Skarp, Frantzeska | |
| dc.contributor.editor | Tienken, Susanne | |
| dc.contributor.editor | Widholm, Andreas | |
| dc.contributor.editor | Blåder, Anna | |
| dc.date.accessioned | 2025-12-19T12:41:11Z | |
| dc.date.available | 2025-12-19T12:41:11Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | This study compares sentiment analysis approaches for Swedish texts using a manually annotated gold-standard dataset. Two methods were examined: i) a multi-label sentiment classifier trained for Swedish, and ii) the Swedish version of VADER, a lexicon-based tool that computes sentiment scores from a vocabulary of polarity-weighted words. The analysis also examined agreement and disagreement between the two methods, with a focus on mixed or context-dependent sentiment. Results indicate that the multi-label classifier aligns more closely with human judgments, especially for medium- or long-text segments with complex or subtle emotional tones. VADER, while prone to errors in idiomatic or nuanced expressions, performs reliably on short, informal utterances, offering computational efficiency and transparency. A hybrid approach combining classifier predictions with lexicon-based scores was investigated to leverage their complementary strengths. Findings underscore the value of rigorous evaluation against human annotations and highlight strategies to improve sentiment analysis in under-resourced languages such as Swedish. | en |
| dc.identifier.issn | 1736-6305 | |
| dc.identifier.uri | https://hdl.handle.net/10062/118298 | |
| dc.language.iso | en | |
| dc.publisher | Tartu University Library | |
| dc.relation.ispartofseries | NEALT Proceedings Series 60 | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | sentiment analysis | |
| dc.subject | multi-label classifier | |
| dc.subject | multi-class model | |
| dc.subject | lexicon-based method (VADER/svVADER) | |
| dc.subject | Swedish dataset | |
| dc.title | Boosting up the sentiment analysis models’ accuracy by blending multi-label learning with a large sentiment lexicon | |
| dc.type | Article | |
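
The abstract describes a lexicon-based scorer and a hybrid that combines classifier predictions with lexicon scores, but this record does not say how the two are combined. The sketch below is one possible reading, not the author's method. The toy lexicon and its valences, the weighted-average blend, the `weight` value, and the label thresholds are all assumptions made for illustration. The only borrowed element is the VADER-style normalisation that squashes a summed valence into the range [-1, 1].

```python
import math

# Hypothetical toy lexicon with made-up valences (assumption);
# a real polarity lexicon such as svVADER's is far larger.
TOY_LEXICON = {"bra": 1.9, "underbar": 3.0, "dålig": -2.5, "hemsk": -3.1}


def lexicon_score(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum the valences of known words and squash the total into [-1, 1]."""
    tokens = (tok.strip(".,!?").lower() for tok in text.split())
    total = sum(lexicon.get(tok, 0.0) for tok in tokens)
    # VADER-style normalisation: total / sqrt(total^2 + alpha)
    return total / math.sqrt(total * total + alpha)


def blend(classifier_polarity, lexicon_polarity, weight=0.7):
    """Weighted average of two polarity scores in [-1, 1].

    `weight` is the share given to the classifier (assumed value).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * classifier_polarity + (1.0 - weight) * lexicon_polarity


def to_label(score, threshold=0.05):
    """Map a blended score to a coarse label (threshold is an assumption)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    text = "Filmen var underbar!"
    classifier_polarity = 0.4  # placeholder for a real classifier's output
    combined = blend(classifier_polarity, lexicon_score(text))
    print(to_label(combined))
```

In this sketch the classifier output is a placeholder number; in practice it would come from a trained model, and the weight would be tuned against the human-annotated gold standard.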