Andmebaasi logo
Valdkonnad ja kollektsioonid
Kogu ADA
Eesti
English
Deutsch
  1. Esileht
  2. Sirvi autori järgi

Sirvi Autor "Nieminen, Tommi" järgi

Tulemuste filtreerimiseks trükkige paar esimest tähte
Nüüd näidatakse 1 - 2 2
  • Tulemused lehekülje kohta
  • Sorteerimisvalikud
  • Laen...
    Pisipilt
    listelement.badge.dso-type Kirje ,
    Incorporating Target Fuzzy Matches into Neural Fuzzy Repair
    (University of Tartu Library, 2025-03) Nieminen, Tommi; Tiedemann, Jörg; Virpioja, Sami; Johansson, Richard; Stymne, Sara
    Neural fuzzy repair (NFR) is a simple implementation of retrieval-augmented translation (RAT), based on data augmentation. In NFR, a translation database is searched for translation examples where the source sentence is similar to the sentence being translated, and the target side of the example is concatenated with the source sentences. We experiment with introducing retrieval that is based on target similarity to NFR during training. The results of our experiments confirm that including target similarity matches during training supplements source similarity matches and leads to better translations at translation time.
  • Laen...
    Pisipilt
    listelement.badge.dso-type Kirje ,
    OpusDistillery: A Configurable End-to-End Pipeline for Systematic Multilingual Distillation of Open NMT Models
    (University of Tartu Library, 2025-03) Gibert, Ona de; Nieminen, Tommi; Scherrer, Yves; Tiedemann, Jörg; Johansson, Richard; Stymne, Sara
    In this work, we introduce OpusDistillery, a novel framework to streamline the Knowledge Distillation (KD) process of multilingual NMT models. OpusDistillery's main features are the integration of openly available teacher models from OPUS-MT and Hugging Face, comprehensive multilingual support and robust GPU utilization tracking. We describe the tool in detail and discuss the individual contributions of its pipeline components, demonstrating its flexibility for different use cases. OpusDistillery is open-source and released under a permissive license, aiming to facilitate further research and development in the field of multilingual KD for any sequence-to-sequence task. Our code is available at https://github.com/Helsinki-NLP/OpusDistillery.

DSpace tarkvara autoriõigus © 2002-2025 LYRASIS

  • Teavituste seaded
  • Saada tagasisidet