From Statistics to Neural Networks: Enhancing Ciphertext-Plaintext Alignment in Historical Substitution Ciphers for Automatic Key Extraction

Publisher

Tartu University Library

Abstract

Ciphertext manuscripts in archival collections are often intermingled with plaintext manuscripts in various languages, which makes the manual analysis needed to match the documents complex and labour-intensive. Automating the alignment of these texts to reconstruct the corresponding cipher keys is therefore highly beneficial, particularly when handling large volumes of documents. This study introduces an automated method for aligning homophonic substitution ciphertexts with their plaintexts using neural networks, specifically Long Short-Term Memory (LSTM) architectures. Compared with traditional statistical approaches, the LSTM models achieve significant accuracy improvements, including perfect alignment for ciphertexts of 50 characters or fewer. To facilitate practical application, a program has also been developed that accepts uploaded transcriptions of ciphertext and plaintext documents and uses the optimized models to automatically align the texts and extract the substitution key.
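
To illustrate what "extracting the substitution key" can mean once an alignment is available, the following is a minimal sketch and not the program described in the abstract. It assumes the alignment step has already produced one plaintext letter per cipher symbol; the function name `extract_key`, the majority-vote rule, and the toy data are hypothetical choices made for this example only.

```python
from collections import Counter, defaultdict


def extract_key(cipher_symbols, plaintext):
    """Build a homophonic key from an already aligned symbol/letter sequence.

    Assumption: position i of cipher_symbols encrypts position i of plaintext.
    Producing that alignment is the hard part and is not done here.
    """
    if len(cipher_symbols) != len(plaintext):
        raise ValueError("aligned sequences must have equal length")

    # Count which plaintext letter each cipher symbol was aligned to.
    votes = defaultdict(Counter)
    for symbol, letter in zip(cipher_symbols, plaintext):
        votes[symbol][letter] += 1

    # Decryption table: each symbol maps to its most frequently aligned letter,
    # which tolerates a small number of alignment or transcription errors.
    decrypt = {s: counts.most_common(1)[0][0] for s, counts in votes.items()}

    # Homophone table: each plaintext letter lists all symbols that encode it.
    homophones = defaultdict(list)
    for symbol, letter in decrypt.items():
        homophones[letter].append(symbol)

    return decrypt, {letter: sorted(syms) for letter, syms in homophones.items()}


# Toy example: the letter "a" is encoded by two homophones, "11" and "17".
decrypt, homophones = extract_key("05 11 20 17 20 11".split(), "banana")
print(decrypt)
print(homophones)
```

In a homophonic cipher a single plaintext letter may be written with several cipher symbols, which is why the sketch returns both a symbol-to-letter table for decryption and a letter-to-symbols table listing the homophones.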


Keywords

Ciphertext alignment, Plaintext alignment, Historical cryptanalysis, Neural cryptanalysis, Homophonic substitution ciphers, Long Short-Term Memory (LSTM), Text alignment, Automated key extraction, Historical manuscripts, Computational cryptanalysis, Sequence-to-sequence models
