Browsing by Author "Scherrer, Yves"

Now showing 1 - 4 of 4
  • Boosting Neural Machine Translation from Finnish to Northern Sámi with Rule-Based Backtranslation
    (Reykjavik, Iceland (Online), Linköping University Electronic Press, Sweden, pp. 351-356, 2021) Aulamo, Mikko; Virpioja, Sami; Scherrer, Yves; Tiedemann, Jörg; Dobnik, Simon; Øvrelid, Lilja
  • Interactive maps for corpus-based dialectology
    (University of Tartu Library, 2025-03) Scherrer, Yves; Kuparinen, Olli; Johansson, Richard; Stymne, Sara
    Traditional data collection methods in dialectology rely on structured surveys, whose results can be easily presented on printed or digital maps. But in recent years, corpora of transcribed dialect speech have become a precious alternative data source for data-driven linguistic analysis. For example, topic models can be advantageously used to discover both general dialectal variation patterns and specific linguistic features that are most characteristic for certain dialects. Multilingual (or rather, multilectal) language modeling tasks can also be used to learn speaker-specific embeddings. In connection with this paper, we introduce a website that presents the results of two recent studies in the form of interactive maps, allowing visitors to explore the effects of various parameter settings. The website covers two tasks (topic models and speaker embeddings) and three language areas (Finland, Norway, and German-speaking Switzerland). It is available at https://www.corcodial.net/.
  • Multi-label Scandinavian Language Identification (SLIDE)
    (University of Tartu Library, 2025-03) Fedorova, Mariia; Frydenberg, Jonas Sebulon; Handford, Victoria; Langø, Victoria Ovedie Chruickshank; Willoch, Solveig Helene; Midtgaard, Marthe Løken; Scherrer, Yves; Mæhlum, Petter; Samuel, David; Tudor, Crina Madalina; Debess, Iben Nyholm; Bruton, Micaella; Scalvini, Barbara; Ilinykh, Nikolai; Holdt, Špela Arhar
    Identifying closely related languages at sentence level is difficult, in particular because it is often impossible to assign a sentence to a single language. In this paper, we focus on multi-label sentence-level Scandinavian language identification (LID) for Danish, Norwegian Bokmål, Norwegian Nynorsk, and Swedish. We present the Scandinavian Language Identification and Evaluation, SLIDE, a manually curated multi-label evaluation dataset and a suite of LID models with varying speed–accuracy tradeoffs. We demonstrate that the ability to identify multiple languages simultaneously is necessary for any accurate LID method, and present a novel approach to training such multi-label LID models.
  • OpusDistillery: A Configurable End-to-End Pipeline for Systematic Multilingual Distillation of Open NMT Models
    (University of Tartu Library, 2025-03) Gibert, Ona de; Nieminen, Tommi; Scherrer, Yves; Tiedemann, Jörg; Johansson, Richard; Stymne, Sara
    In this work, we introduce OpusDistillery, a novel framework to streamline the Knowledge Distillation (KD) process of multilingual NMT models. OpusDistillery's main features are the integration of openly available teacher models from OPUS-MT and Hugging Face, comprehensive multilingual support and robust GPU utilization tracking. We describe the tool in detail and discuss the individual contributions of its pipeline components, demonstrating its flexibility for different use cases. OpusDistillery is open-source and released under a permissive license, aiming to facilitate further research and development in the field of multilingual KD for any sequence-to-sequence task. Our code is available at https://github.com/Helsinki-NLP/OpusDistillery.
