Browse by Author "Dehouck, Mathieu"
Comparative Concepts or Descriptive Categories: a UD Case study (University of Tartu Library, 2025-03) Boyer, Matthieu Pierre; Dehouck, Mathieu; Johansson, Richard; Stymne, Sara
In this paper, we present a series of methods used to quantify the soundness of using the same names to annotate cases in different languages. We follow the idea described by Martin Haspelmath that descriptive categories and comparative concepts are different objects and we look at the necessary simplification taken by the Universal Dependencies project. We thus compare cases in closely related languages as belonging to commensurable descriptive categories. Then we look at the corresponding underlying comparative concepts. We finally looked at the possibility of assigning cases to adpositions.

Lattice @MultiGEC-2025: A Spitful Multilingual Language Error Correction System Using LLaMA (University of Tartu Library, 2025-03) Seminck, Olga; Dupont, Yoann; Dehouck, Mathieu; Wang, Qi; Durandard, Noé; Novikov, Margo; Muñoz Sánchez, Ricardo; Alfter, David; Volodina, Elena; Kallas, Jelena
This paper reports on our submission to the NLP4CALL shared task on Multilingual Grammatical Error Correction (MultiGEC-2025) (Masciolini et al., 2025). We developed two approaches: fine-tuning a large language model, LLaMA 3.0 (8B), for each MultiGEC corpus, and a pipeline based on the encoder-based language model XLM-RoBERTa. During development, the first method significantly outperformed the second, except for languages that are poorly supported by LLaMA 3.0 and have limited MultiGEC training data. Therefore, our official results for the shared task were produced using the neural network system for Slovenian, while fine-tuned LLaMA models were used for the eleven other languages. In this paper, we first introduce the shared task and its data. Next, we present our two approaches, as well as a method to detect cycles in the LLaMA output. We also discuss a number of hurdles encountered while working on the shared task.