Dialectal treebanks and their relation with the standard variety: The case of East Cretan and Standard Modern Greek

Kuupäev

2025-03

Ajakirja pealkiri

Ajakirja ISSN

Köite pealkiri

Kirjastaja

University of Tartu Library

Abstrakt

We report on the development of the first treebank and parser for Eastern Cretan in the framework of Universal Dependencies (UD). Eastern Cretan is a living but under-resourced dialect of Modern Greek. We have worked on the transcription of oral material and relied on active annotation and knowledge transfer from GUD, a treebank of Standard Modern Greek. Along with its other phonological and morphosyntactic differences from Standard Modern Greek, Eastern Cretan (and other varieties of Modern Greek) makes heavy use of euphonics and voicing that have not been included in the UD annotation guidelines so far. We have provided annotation guidelines for East Cretan euphonics and voicing and included them in the models. Knowledge transfer from the treebank of Standard Modern Greek to the dialectal models helped to initiate annotation via an active annotation procedure.

Kirjeldus

Märksõnad

Viide