Establishing a Document Layout Analysis Baseline for Historical Cipher Keys

Publisher

Tartu University Library

Abstract

Historical cipher keys encode mappings between plaintext elements and cipher symbols and are characterized by complex, heterogeneous handwritten layouts. This paper establishes a baseline for document layout analysis (DLA) of historical cipher keys using a newly annotated dataset of 350 images from European archives dating from ca. 1300 to 1850 CE. We evaluate four YOLO-based architectures under three conditions: training from scratch, cross-domain transfer from models pre-trained on DocLayNet and CATMuS in a class-agnostic setting, and fine-tuning of these pre-trained models on cipher key data. Results show that training from scratch is limited by data scarcity and unstable convergence, while direct transfer across DLA domains performs poorly. In contrast, fine-tuning consistently improves performance across all architectures, demonstrating the feasibility of adapting existing DLA models to cipher keys and supporting downstream tasks such as key extraction and comparative cryptographic analysis.
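
The class-agnostic transfer setting mentioned in the abstract can be illustrated with a minimal sketch. It assumes YOLO-style label lines (`class x_center y_center width height`, normalized), discards the class id so that DocLayNet, CATMuS, and cipher key label sets need not agree, and counts a predicted region as correct when it overlaps an unmatched ground-truth region with IoU of at least 0.5. The function names, the greedy matching, and the 0.5 threshold are illustrative assumptions, not the evaluation protocol used in the paper.

```python
from typing import List, Tuple

# (x_center, y_center, width, height), all normalized to [0, 1]
Box = Tuple[float, float, float, float]


def parse_yolo_labels(text: str) -> List[Box]:
    """Parse YOLO-format label lines and drop the class id (class-agnostic)."""
    boxes: List[Box] = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip malformed or empty lines
        _cls, xc, yc, w, h = parts[:5]
        boxes.append((float(xc), float(yc), float(w), float(h)))
    return boxes


def _corners(b: Box) -> Tuple[float, float, float, float]:
    """Convert a center-format box to (x1, y1, x2, y2)."""
    xc, yc, w, h = b
    return xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = _corners(a)
    bx1, by1, bx2, by2 = _corners(b)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(preds: List[Box], gts: List[Box], thr: float = 0.5) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (true_pos, false_pos, false_neg).

    Assumes `preds` is already sorted by descending confidence.
    """
    unmatched = list(range(len(gts)))
    tp = 0
    for p in preds:
        best_idx, best_iou = -1, thr
        for i in unmatched:
            score = iou(p, gts[i])
            if score >= best_iou:
                best_idx, best_iou = i, score
        if best_idx >= 0:
            unmatched.remove(best_idx)
            tp += 1
    return tp, len(preds) - tp, len(unmatched)
```

Ignoring class ids in this way measures only whether layout regions are localized, which is the one comparison that remains meaningful when the source and target domains use different label sets.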

Keywords

cipher keys, document layout analysis, historical cryptology
