SweSAT-1.0: The Swedish University Entrance Exam as a Benchmark for Large Language Models

Kurfalı, Murathan; Zahra, Shorouq; Gogoulou, Evangelia; Dürlich, Luise; Carlsson, Fredrik; Nivre, Joakim

dc.contributor.author	Kurfalı, Murathan
dc.contributor.author	Zahra, Shorouq
dc.contributor.author	Gogoulou, Evangelia
dc.contributor.author	Dürlich, Luise
dc.contributor.author	Carlsson, Fredrik
dc.contributor.author	Nivre, Joakim
dc.contributor.editor	Johansson, Richard
dc.contributor.editor	Stymne, Sara
dc.coverage.spatial	Tallinn, Estonia
dc.date.accessioned	2025-02-18T09:33:52Z
dc.date.available	2025-02-18T09:33:52Z
dc.date.issued	2025-03
dc.description.abstract	This paper introduces SweSAT-1.0, a new benchmark dataset created from the Swedish university entrance exam (Högskoleprovet) to assess large language models in Swedish. The current version of the benchmark includes 867 questions across six different tasks, including reading comprehension, mathematical problem solving, and logical reasoning. We find that some widely used open-source and commercial models excel in verbal tasks, but we also see that all models, even the commercial ones, struggle with reasoning tasks in Swedish. We hope that SweSAT-1.0 will facilitate research on large language models for Swedish by enriching the breadth of available tasks and offering a challenging evaluation benchmark that is free from any translation biases.
dc.identifier.uri	https://hdl.handle.net/10062/107227
dc.language.iso	en
dc.publisher	University of Tartu Library
dc.relation.ispartofseries	NEALT Proceedings Series, No. 57
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.title	SweSAT-1.0: The Swedish University Entrance Exam as a Benchmark for Large Language Models
dc.type	Article

Files

Original bundle

Name: 2025_nodalida_1_36.pdf
Size: 1.05 MB
Format: Adobe Portable Document Format

Collections

Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)