Boosting up the sentiment analysis models’ accuracy by blending multi-label learning with a large sentiment lexicon

Kokkinakis, Dimitrios

Files

paper_8.pdf (1.08 MB)

Date

2025

Authors

Kokkinakis, Dimitrios

Publisher

Tartu University Library

Abstract

This study compares sentiment analysis approaches for Swedish texts using a manually annotated gold-standard dataset. Two methods were examined: i) a multi-label sentiment classifier trained for Swedish, and ii) the Swedish version of VADER, a lexicon-based tool that computes sentiment scores from a vocabulary of polarity-weighted words. The analysis also examined agreement and disagreement between the two methods, with a focus on mixed or context-dependent sentiment. Results indicate that the multi-label classifier aligns more closely with human judgments, especially for medium- or long-text segments with complex or subtle emotional tones. VADER, while prone to errors in idiomatic or nuanced expressions, performs reliably on short, informal utterances, offering computational efficiency and transparency. A hybrid approach combining classifier predictions with lexicon-based scores was investigated to leverage their complementary strengths. Findings underscore the value of rigorous evaluation against human annotations and highlight strategies to improve sentiment analysis in under-resourced languages such as Swedish.
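The hybrid idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the toy lexicon, the function names, the expected shape of `class_probs`, and the length-dependent weights are all assumptions chosen for illustration. The only detail borrowed from VADER is its normalization of a summed polarity score into the range [-1, 1] with a constant of 15.

```python
import math

# Hypothetical toy lexicon (word -> polarity weight); NOT taken from svVADER.
TOY_LEXICON = {"bra": 1.9, "underbar": 3.1, "dålig": -2.5, "hemsk": -3.0}


def lexicon_score(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum word polarities and squash the total into [-1, 1], VADER-style."""
    total = sum(
        lexicon.get(tok.strip(".,!?").lower(), 0.0) for tok in text.split()
    )
    return total / math.sqrt(total * total + alpha)


def hybrid_score(text, class_probs, short_len=10, w_short=0.6, w_long=0.2):
    """Blend a classifier's signed score with the lexicon score.

    class_probs is assumed to be a dict with 'positive' and 'negative'
    probabilities from a multi-label classifier (both may be high for
    mixed sentiment). The weights are illustrative assumptions: the
    lexicon counts for more on short texts, the classifier on longer ones.
    """
    clf = class_probs.get("positive", 0.0) - class_probs.get("negative", 0.0)
    w = w_short if len(text.split()) <= short_len else w_long
    return w * lexicon_score(text) + (1.0 - w) * clf
```

Giving the lexicon more weight on short inputs mirrors the abstract's observation that VADER is reliable on short, informal utterances while the classifier tracks human judgments better on longer segments. In practice, any such weights would need tuning against the gold-standard annotations.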

Keywords

sentiment analysis, multi-label classifier, multi-class model, lexicon-based method (VADER/svVADER), Swedish dataset

URI

https://hdl.handle.net/10062/118298

Collections

Proceedings of the 2nd Huminfra Conference
