Sirvi Autor "Purason, Taido, juhendaja" järgi

Nüüd näidatakse 1 - 1 1

Suurte keelemudelite võrdlev analüüs Eesti bioloogiaolümpiaadide küsimuste põhjal
(Tartu Ülikool, 2025) Kiil, Ahto; Purason, Taido, juhendaja; Kuulmets, Hele-Andra, juhendaja; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Several types of tests are used to evaluate large language models – translation, text comprehension, image recognition, answering questions etc. Typically, evaluation datasets are translated from English, and there is a lack of test sets that consider specific local context and are originally composed in Estonian. As part of this BA thesis, a multiple-choice dataset consisting of 1,031 questions was compiled using tasks from Estonian biology olympiads between 2005 and 2024. In the second phase, five OpenAI models, 13 Estonian-trained models from the Hugging Face platform and nine of the most recent closed commercial models accessed via websites were evaluated. The best model's accuracy (85.35%) is comparable to the average result (87.16%) of pupils who placed in the top three in Estonian olympiads.