Artificial neural network for hoax cryptogram identification

Foxon, Floe

Files

Article_10.pdf (122.89 KB)

Date

2024

Authors

Foxon, Floe

Publisher

Tartu University Library

Abstract

Numerous putative cryptograms remain unsolved. Some, including the Dorabella cryptogram, have been suggested as hoaxes, i.e., some sort of gibberish with no meaningful underlying plaintext. The statistical properties of a putative cryptogram may be modelled to determine whether the cryptogram groups more closely with real or with randomly generated plaintext. Ten thousand plaintexts from an English-language corpus and ten thousand (pseudo-)randomly generated English-alphabet gibberish texts were studied through their statistical properties, including the alphabet length; the frequency, separation, and entropy of n-grams; the index of coincidence; Zipf's law; and mean associated contact counts. An artificial neural network (deep learning) model was fitted to these data, with a cross-validated mean accuracy of 99.8% (standard deviation: 0.1%). This model correctly predicted that arbitrary, out-of-sample simple substitution ciphers represented meaningful English plaintext (as opposed to gibberish) with probabilities close to 1; correctly predicted that arbitrary, out-of-sample gibberish texts were gibberish (as opposed to simple substitution ciphers) with probabilities close to 1; and assigned a probability of meaningful English plaintext of 0.9996 to the Dorabella cryptogram.
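
Some of the statistical properties named in the abstract can be illustrated with a short sketch. The code below is not the paper's implementation; the function names (`index_of_coincidence`, `ngram_entropy`, `features`) and the choice of features are assumptions made for illustration. It computes the alphabet length, the index of coincidence, and the Shannon entropy of n-grams using their standard textbook definitions.

```python
import math
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two symbols drawn without replacement are identical."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def ngram_entropy(text: str, n: int = 1) -> float:
    """Shannon entropy, in bits, of the empirical n-gram distribution."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    total = len(grams)
    counts = Counter(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def features(text: str) -> dict:
    """Hypothetical feature vector; the paper's full feature set is larger."""
    return {
        "alphabet_length": len(set(text)),
        "index_of_coincidence": index_of_coincidence(text),
        "entropy_1gram": ngram_entropy(text, 1),
        "entropy_2gram": ngram_entropy(text, 2),
    }


if __name__ == "__main__":
    plaintext = "ATTACKATDAWN"
    # A simple substitution relabels symbols one-to-one.
    ciphertext = plaintext.translate(str.maketrans("ATCKDWN", "QZXJVPL"))
    print(features(plaintext))
    print(features(ciphertext))
```

Because a simple substitution only relabels symbols one-to-one, these particular quantities come out the same for a plaintext and its ciphertext, which is why statistics of this kind can be compared between enciphered text and gibberish.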

Keywords

Machine learning, Simple substitution cipher, Hoax, Dorabella cryptogram

URI

https://hdl.handle.net/10062/98469
https://doi.org/10.58009/aere-perennius0094

Collections

Proceedings of the 7th International Conference on Historical Cryptology (HistoCrypt 2024)
