Browsing by Author "Tars, Maali"
Item Extremely low-resource machine translation for closely related languages (Reykjavik, Iceland (Online), Linköping University Electronic Press, Sweden, pp. 41-52, 2021) Tars, Maali; Tättar, Andre; Fišel, Mark; Dobnik, Simon; Øvrelid, Lilja

Item Improving translation for low-resource Finno-Ugric languages with Neural Machine Translation models (Tartu Ülikool, 2021) Tars, Maali; Tättar, Andre, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Training a good neural machine translation model requires a lot of data, yet most of the world's languages have little suitable data available for this task. One possible solution is to develop a multilingual model that combines high-resource and low-resource languages in a shared vocabulary space, so that knowledge gained from high-resource languages is applied to translating low-resource ones. Another useful technique is to produce new data for low-resource languages by creating synthetic translations of monolingual data with a baseline model. In this thesis we use both methods: we train a multilingual baseline model on Finno-Ugric language family data, then increase the amount of data for the smaller Finno-Ugric languages by translating monolingual data with that baseline model, in order to improve machine translation quality for low-resource languages.

Item Low-resource Finno-Ugric Neural Machine Translation through Cross-lingual Transfer Learning (Tartu Ülikool, 2023) Tars, Maali; Tättar, Andre, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
The first high-quality machine translation models focused mainly on large languages, such as English and German. Thankfully, the trend has been shifting toward supporting languages with fewer resources. Most Finno-Ugric languages are low-resource and rely on additional techniques and on information from larger languages during translation. Recently, several large companies have released multilingual pre-trained neural machine translation models that can be adapted to low-resource languages. However, some of the Finno-Ugric languages in our work were not part of the training data of these pre-trained models. We therefore use cross-lingual transfer to fine-tune the models to our selected languages. In addition, we augment the data with back-translation to alleviate the data scarcity of low-resource languages. We train multiple models to determine the best setting for our selected languages and improve over previous results for all language pairs. As a result, we deploy the best model and create the first multilingual NMT system for multiple low-resource Finno-Ugric languages.

Item Machine Translation for Low-resource Finno-Ugric Languages (University of Tartu Library, 2023-05) Yankovskaya, Lisa; Tars, Maali; Tättar, Andre; Fišel, Mark