Annotating and Classifying Direct Speech in Historical Danish and Norwegian Literary Texts

Al-Laith, Ali; Conroy, Alexander; Degn, Kirstine Nielsen; Bjerring-Hansen, Jens; Hershcovich, Daniel

dc.contributor.author	Al-Laith, Ali
dc.contributor.author	Conroy, Alexander
dc.contributor.author	Degn, Kirstine Nielsen
dc.contributor.author	Bjerring-Hansen, Jens
dc.contributor.author	Hershcovich, Daniel
dc.contributor.editor	Johansson, Richard
dc.contributor.editor	Stymne, Sara
dc.coverage.spatial	Tallinn, Estonia
dc.date.accessioned	2025-02-17T13:38:52Z
dc.date.available	2025-02-17T13:38:52Z
dc.date.issued	2025-03
dc.description.abstract	Analyzing direct speech in historical literary texts provides insights into character dynamics, narrative style, and discourse patterns. In late 19th-century Danish and Norwegian fiction, direct speech reflects characters' social and geographical backgrounds. However, inconsistent typographic conventions in Scandinavian literature complicate computational methods for distinguishing direct speech from other narrative elements. To address this, we introduce an annotated dataset from the MeMo corpus, capturing speech markers and tags in Danish and Norwegian novels. We evaluate pre-trained language models for classifying direct speech, with results showing that a Danish Foundation Model (DFM), trained on extensive Danish data, has the highest performance. Finally, we conduct a classifier-assisted quantitative corpus analysis and find a downward trend in the prevalence of speech over time.
dc.identifier.uri	https://hdl.handle.net/10062/107192
dc.language.iso	en
dc.publisher	University of Tartu Library
dc.relation.ispartofseries	NEALT Proceedings Series, No. 57
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title	Annotating and Classifying Direct Speech in Historical Danish and Norwegian Literary Texts
dc.type	Article

Files

Original bundle

Name: 2025_nodalida_1_1.pdf
Size: 201.24 KB
Format: Adobe Portable Document Format

Collections

Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)