Annotating and Classifying Direct Speech in Historical Danish and Norwegian Literary Texts

dc.contributor.author: Al-Laith, Ali
dc.contributor.author: Conroy, Alexander
dc.contributor.author: Degn, Kirstine Nielsen
dc.contributor.author: Bjerring-Hansen, Jens
dc.contributor.author: Hershcovich, Daniel
dc.contributor.editor: Johansson, Richard
dc.contributor.editor: Stymne, Sara
dc.coverage.spatial: Tallinn, Estonia
dc.date.accessioned: 2025-02-17T13:38:52Z
dc.date.available: 2025-02-17T13:38:52Z
dc.date.issued: 2025-03
dc.description.abstract: Analyzing direct speech in historical literary texts provides insights into character dynamics, narrative style, and discourse patterns. In late 19th-century Danish and Norwegian fiction, direct speech reflects characters' social and geographical backgrounds. However, inconsistent typographic conventions in Scandinavian literature complicate computational methods for distinguishing direct speech from other narrative elements. To address this, we introduce an annotated dataset from the MeMo corpus, capturing speech markers and tags in Danish and Norwegian novels. We evaluate pre-trained language models for classifying direct speech, with results showing that a Danish Foundation Model (DFM), trained on extensive Danish data, has the highest performance. Finally, we conduct a classifier-assisted quantitative corpus analysis and find a downward trend in the prevalence of speech over time.
dc.identifier.uri: https://hdl.handle.net/10062/107192
dc.language.iso: en
dc.publisher: University of Tartu Library
dc.relation.ispartofseries: NEALT Proceedings Series, No. 57
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title: Annotating and Classifying Direct Speech in Historical Danish and Norwegian Literary Texts
dc.type: Article

Files

Original bundle

Name: 2025_nodalida_1_1.pdf
Size: 201.24 KB
Format: Adobe Portable Document Format