Pashchenko, Dmytro; Yankovskaya, Lisa; Fishel, Mark; Johansson, Richard; Stymne, Sara
2025-02-18
2025-02-18
2025-03
https://hdl.handle.net/10062/107242
We develop paragraph-level machine translation for four low-resource Finno-Ugric languages: Proper Karelian, Livvi, Ludian, and Veps. The approach is based on sentence-level pre-trained translation models, which are fine-tuned with paragraph-parallel data. This allows the resulting model to develop a native ability to handle discourse-level phenomena correctly, in particular when translating from grammatically gender-neutral input in Finno-Ugric languages. We collect monolingual and parallel paragraph-level corpora for these languages. Our experiments show that paragraph-level translation models translate sentences no worse than sentence-level systems, while handling discourse-level phenomena better. For evaluation, we manually translate part of FLORES-200 into these four languages. All our results, data, and models are released openly.
en
Attribution-NonCommercial-NoDerivatives 4.0 International
https://creativecommons.org/licenses/by-nc-nd/4.0/
Paragraph-Level Machine Translation for Low-Resource Finno-Ugric Languages
Article