Holdt, Špela Arhar; Antloga, Špela; Munda, Tina; Pori, Eva; Krek, Simon; Tudor, Crina Madalina; Debess, Iben Nyholm; Bruton, Micaella; Scalvini, Barbara; Ilinykh, Nikolai; Holdt, Špela Arhar
2025-02-14; 2025-02-14; 2025-03
https://hdl.handle.net/10062/107125

Large Language Models (LLMs) have demonstrated significant potential in natural language processing, but they depend on vast, diverse datasets, creating challenges for languages with limited resources. The paper presents a national initiative that addresses these challenges for Slovene. We outline strategies for large-scale text collection, including the creation of an online platform to engage the broader public in contributing texts and a communication campaign promoting openly accessible and transparently developed LLMs.

en
Attribution-NonCommercial-NoDerivatives 4.0 International
https://creativecommons.org/licenses/by-nc-nd/4.0/
From Words to Action: A National Initiative to Overcome Data Scarcity for the Slovene LLM
Article