This paper presents state-of-art cryptanalysis studies on attacks of the substitution and transposition ciphers using various metaheuristic algorithms. Traditional cryptanalysis methods employ an exhaustive search, which is computationally expensive. Therefore, metaheuristics have attracted the interest of researchers in the cryptanalysis field. Metaheuristic algorithms are known for improving the search for the optimum solution and include Genetic Algorithm, Simulated Annealing, Tabu Search, Particle Swarm Optimization, Differential Evolution, Ant Colony, the Artificial Bee Colony, Cuckoo Search, and Firefly algorithms. The most important part of these various applications is deciding the fitness function to guide the search. This review presents how these algorithms have been implemented for cryptanalysis purposes. The paper highlights the results and findings of the studies and determines the gaps in the literature.

Cryptology is a research field focused on methods for secret communication; the field comprises two areas of study: cryptography and cryptanalysis. Cryptography is the study of secret communication methods while cryptanalysis is the study of methods for attacking ciphers. The purpose of cryptanalysis is to obtain the original text (plaintext) from the cipher text without having any information about the key that was used in the encryption process. There are three fundamental strategies for attacks: ciphertext-only, known-plaintext, and chosen-plaintext [

The substitution cipher is the more established type of cryptography algorithm. This cipher replaces each letter and substitutes it with another letter in ciphertext. This substitution system is transformable in that it authorizes the expected text receivers to inverse-exchange cipher text letters to get the plaintext. The simple substitution cipher is the primary type of substitution cipher. Every letter in the original text maps to a letter in the cipher text [

A simple substitution cipher, or monoalphabetic cipher, is one in which every letter in the original text is exchanged with an identical letter of ciphertext.

A homophonic substitution cipher is as simple as the substitution cryptosystem, except that a letter of the original text can map to one of a few letters of the ciphertext.

A polygram substitution cipher is outlines of the letters are encrypted in a set. The Playfair cipher and the Hill cipher are examples of this kind of cipher.

A polyalphabetic substitution cipher combines several simple substitution ciphers. In this method, a certain cipher is used to change the position of all letters of the original text, such as in the Vigenere cipher.

One-time pad is a large nonrepeating group of irregular key letters, which are written on a piece of a paper then glued to each other in a pad.

Another type of cryptography algorithm is transposition ciphers in which the main idea is breaking the original text into blocks of a specific size, depending on a specific permutation. In the substitution and transposition ciphers, the group of potential keys is the group of all potential permutations of the letter (for English alphabet letter); there are over 403 septillion possible permutations [

Metaheuristic algorithms use randomization to discover near-optimal solutions to computationally unmanageable issues. In classical ciphers, because the group of potential keys is the group of all potential permutations of the letters (alphabet letter), there are many possibilities permutations. Therefore, due to the computational cost, cryptanalysis by brute force is not effective. For this reason, stochastic operations, guided by optimization algorithms, are used to get the optimal key.

Using the metaheuristics to attack the substitution and transposition ciphers requires a method to determine the viability of a key, called fitness function. The fitness functions are utilized to consider the validity of a certain key depending on the type of the frequency analysis. The aim of frequency analysis is to compare the frequencies in the decrypted text with the frequencies found in English literature. In other words, the frequency analysis authorizes us to evaluate precisely exact matches between a certain text and the language in which the original text was written. The frequency analysis is performed by extracting all unigrams ^{u}_{i}, ^{b}_{ij}, and ^{t}_{ijk}, respectively. When these tallies are classified, one can score the aggregate number of all unigrams, bigrams, and trigrams, as show in

Finally, fractional indication frequencies are obtained for all unigrams, bigrams, and trigrams using

In the frequency analysis, the message subjected to cryptanalysis must have sufficient length because implementing frequency analysis on short messages leads to low fitness values; therefore, only longer messages can be minutely analyzed by this method [

Holland, introduced the Genetic algorithms that modulate the idea of the

Matthews [

Spillman et al. [

where SF[i] indicates the corresponding frequency of letter

Clark [

Lin and Kao [

Clark et al. [

_{1}, _{2}, and _{3} are known unigram, bigram, and trigram statistics, respectively. _{1}, _{2}, and _{3} are statistics for the decrypted text, and _{1}, _{2}, and _{3} are weights chosen to equal one. This study initially used unigrams and bigrams only, then later used trigrams. This method obtained 90% of the original by using 600 known cipher text letters for each key. In this test, the weights were _{1} = 0.4, _{2} = 0.6, and _{3} = 0.0. After 100 iterations, the weights were gradually changed such that _{1}, _{2}, _{3} {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0}; the preferable outcome occurred when _{3} = 0.8.

Clark and Dawson [

Suppose, ^{(1)}, ^{(2)},and ^{(3)} are the known unigram, bigram, and trigram statistics, respectively. ^{(1)}, ^{(2)}, and ^{(3)} are statistics for the decrypted text, and _{1} , _{2}, and _{3} are weights chosen to equal one. When the algorithm was implemented 5 times on 10 different texts of text length 600 characters, 70% of the original key characters were restored correctly. An improved result was obtained when the values of _{1}, _{2}, and _{3} were 0.1, 0.7 and 0.8 respectively.

Clark and Dawson [

where

Dimovski and Gligoroski [

The results showed that each algorithm could successfully recover the key for periods less than 15 (13.25) with 1000 available cipher text characters. On average, the simulated annealing attacks could set 25 of the key elements for a transposition cipher of period 30.

Li et al. [

Toemeh and Arumugam [

Toemeh and Arumugam [

Song et al. [

Garg [

Erickson and Hausman [

Omran et al. [

Omran et al. [

Heydari et al. [_{f}

where Ci and Dj indicate the frequency of the bigram and trigram characters respectively, with _{i} and _{j} indicate the frequency of the very common characters, respectively.

Dureha and Kaur [

The results showed that the number of bits recovered was proportional (key-length and cipher-text length). Also, it was proportional to population size, which helps in preserving population divergence. In addition, the time was reduced by decreasing the number of generations to less than 50 generations, which gave the algorithm higher robustness.

Khalid et al. [

where Pi refers to the frequency of the bigrams or trigrams in the text, _{i} refers to the fitness value to the ^{th} bigram or trigram tested, and the result of the summation is over the

Bhateja and Kumar [

where ^{th} unigram in the original text, ^{th} unigram in the cipher text, ^{th} bigram in the original text, and ^{th} bigram in the cipher text. Experimental results showed that elitism improves the effectiveness and performance of the algorithm because it prevents the loss of the best key, which is possibly the right key.

Boryczka and Dworak [

Boryczka and Dworak [

Alkathiry and Al-Mogren [

Saveetha et al. [

Sadeghzadeha and Taherbaghalb [

Jadaun et al. [

where ^{b}_{(i,j)} and ^{b}_{(i,j),}indicate the used language bigrams and decrypted text bigrams, respectively. The values of

Bhateja et al. [

where ^{th} unigram in the decrypted text, ^{th} b-gram in standard English and ^{th} bigram in the decrypted text. The Vigenere cipher was attacked by PSO and GA. The α and

Khalid and Al-Khafagi [

The sensitiveness to large values of the difference was reduced by scaling the summation terms. From the results, increasing population size leads to a decrease of the number of generations required to get to the optimal solution, but it leads to an increase in time required to perform a single generation. A lower mutation average decreases the prospect of genetic jump, as well. Also, the researchers applied two kinds of crossover, the single point crossover and the multipoint crossover, and stated that the multipoint crossover had better results compared to the single point crossover.

Bergmann et al. [

Habeeb [

Mudgal et al. [

Jain et al. [

Suppose, K(u), K (b) and K (t) are the known unigram, bigram, and trigram statistics, respectively. D(1), D(2), and D(3) are statistics for the decrypted text, and the value 0.1, 0.3 and 0.6 are weights chosen to equal one. Results showed that the combination of GA and SA was the most robust method for breaking the monoalphabetic cipher.

Forhad et al. [

Kirkpatrick et al., recommended SA which models the physical process of heating and cooling a material to minimize the energy related to the material. Some applications of SA on the cryptanalysis of classical ciphers are given below.

Forsyth and Safavi-Naini [

where _{i} is the value of the cost function,

Giddy and Safavi-Naini [

The cost function offered in this study was an improvement of the version used by Forsyth and Safavi-Naini [_{αβ} indicate the probability of the bigram _{αβ}. The success of the algorithm depends on the length of the ciphertext.

Nevertheless, in this study, comparing the result of the algorithm provided important information for guiding the cryptanalysis. The technique solved some transposition ciphers with periods of 25 and cipher lengths of 500 letters. In some cases, solutions reached 80% accuracy. The capability of the cost function to successfully identify the valid text from a series of letters depended on the algorithm’s decryption capability. Results showed that SA makes cryptanalysis of transposition ciphers easier and supplies a robust technique for analyzing most of the developed ciphers.

Glover, used TS which finds the best neighbour solution for each iteration and acceptable changes are restricted using a tabu list which can prevent the cycling problem. Below are some applications of TS to the cryptanalysis of classical ciphers and some comparisons with GA.

Verma et al. [

Kennedy and Eberhart, developed PSO based on the birds’ adaptation by information sharing to find rich food supplies and to avoid being hunted. In the algorithm, each particle uses its previous experience while setting its own position for the best position in the track. Below are some applications of PSO to the cryptanalysis of classical ciphers and some comparisons with SA, TS, and GA.

Uddin and Youssef [

Hameed and Hmood [

Rajkumar [

Price and Storn, developed the DE algorithm for optimization problems through continuous scope based on crossover, mutation and greedy selection. Below is an application of Differential Evolution to the cryptanalysis of classical ciphers.

Wulandari et al. [

Sabonchi and Akay [

The ACO algorithm developed by Dorigo, simulates the intelligent behaviours of ants related to selecting the path with a high concentration of pheromones. Below are some applications of the ACO to the cryptanalysis of classical ciphers and some comparisons with GA.

Russell et al. [

_{d} is the number of

where _{x}, and _{x}, indicate the ^{th} character in columns _{std} (

Uddint and Youssef [

Mekhaznia and Menai [

Grari et al. [

Suppose,

CS was presented based on the obligate brood parasitism of some cuckoo species that lay their eggs in the nests of other host birds. Below are some applications of CS to the cryptanalysis of classical ciphers and some comparisons based on TS, PSO, and GA.

Sadiq and Kareem [

where

Jain and Chaudhari [

Bhateja et al. [

Jain and Chaudhari [

Jain and Chaudhari [

FA developed by Yang, is a modern evolutionary algorithm inspired by the flashing behavior of fireflies. Below is an application of the FA to the cryptanalysis of classical ciphers.

Luthra and Pal [

The ABC algorithm defined by Karaboğa, simulates the food source searching behaviors of honey bees. Because of the function division of the bees and their self-organization, the bees do many functions as a group in a good way. While food source position refers to the solution, amount of the nectar refers to the function of the solution in ABC algorithm. It has three phases; employed, onlooker and scout bees. Below are some applications of ABC algorithm to the cryptanalysis of classical ciphers and some comparisons with FA, CS, DE, PSO, and GA.

Sabonchi and Akay [

Sabonchi and Akay [

Brezočnik et al. [

Sabonchi and Akay [

In this paper, 9 algorithms are reviewed about the applications of metaheuristic algorithms on the cryptanalysis of various classical ciphers, such as monoalphabetic, polyalphabetic, substitution, Hill, and Vernem ciphers. Among 55 papers, 33 papers employed GA, 2 papers employed SA algorithm, 4 papers employed ACO algorithm, 5 papers employed CS algorithm, 3 papers employed PSO algorithm, 4 papers employed ABC algorithm, 2 paper employed DE algorithm, 1 paper employed TS algorithm, and 1 paper employed FA algorithm.

Algorithm | Publication | Application |
---|---|---|

GA | Matthews (1993) [ |
The use of genetic algorithms in cryptanalysis |

Spillman et al. (1993) [ |
cryptanalysis of simple substitution ciphers | |

Clark (1994) [ |
Modern optimization algorithms and cryptanalysis | |

Lin and Kao (1995) [ |
A genetic algorithm for cipher text-only attack in cryptanalysis | |

Clark et al. (1996) [ |
Cryptanalysis of polyalphabetic ciphers | |

Clark and Dawson (1997) [ |
A parallel GA for cryptanalysis of the polyalphabetic cipher | |

Clark and Dawson (1998) [ |
Automated cryptanalysis of classical ciphers | |

Dimovski and Gligoroski (2003) [ |
Attacks on the transposition ciphers | |

Li et al. (2005) [ |
Heuristic cryptanalysis of classical and modern cipher | |

Toemeh and Arumugam (2007) [ |
Breaking transposition cipher | |

Toemeh and Arumugam (2008) [ |
Searching key space of polyalphabetic ciphers | |

Song et al. (2008) [ |
Cryptanalysis of transposition cipher using SA & GA algorithm | |

Garg (2009) [ |
A comparison and cryptanalysis of transposition cipher | |

Hausman and Erickson (2009) [ |
A Dominant gene GA and substitution cipher | |

Omran et al. (2010) [ |
Break a monoalphabetic substitution cipher | |

Omran et al. (2011) [ |
A cryptanalytic attack on vigenere | |

Heydari et al. [ |
Cryptanalysis of transposition ciphers with long key and an improved GA | |

Dureha and Kaur (2013) [ |
Automate an attack on classical ciphers | |

Al-Khalid et al. (2013) [ |
Break a simple transposition cipher | |

Bhateja and Kumar (2014) [ |
Genetic algorithm with elitism for cryptanalysis of vigenere cipher | |

Boryczka and Dworak (2014) [ |
Genetic transformation techniques in cryptanalysis | |

Boryczka and Dworak (2014) [ |
Cryptanalysis of transposition cipher using EA | |

Alkathiry and Al-Mogren (2014) [ |
A powerful GA to crack a transposition cipher | |

Saveetha et al. (2014) [ |
Cryptography and optimization heuristics techniques | |

Sadeghzadeh and Taherbaghal (2014) [ |
A new method and comparison with TS and SA | |

Jadaun et al. (2014) [ |
Deciphering of transposition ciphers | |

Bhateja et al. (2014) [ |
Analysis of different cryptosystems and meta heuristic techniques | |

Al-Khalid and Al-Khfagi (2015) [ |
Cryptanalysis of a hill cipher | |

Bergmann et al. (2015) [ |
Cryptanalysis using GA | |

Habeeb (2016) [ |
Arabic text cryptanalysis using genetic algorithm | |

Mudgal et al. (2017) [ |
Application of genetic algorithm in cryptanalysis of mono-alphabetic substitution cipher | |

Jain et al. (2018) [ |
Cryptanalysis of Mono-Alphabetic Substitution Ciphers using Genetic Algorithms and Simulated Annealing | |

Forhad et al. (2018) [ |
An improved fitness function for automated cryptanalysis using genetic algorithm | |

SA | Forsyth and Safavi-Naini (1993) [ |
Automated cryptanalysis of substitution ciphers |

Giddy and Safavi-Naini (1994) [ |
Automated cryptanalysis of transposition ciphers | |

TS | Verma et al. (2007) [ |
Attack on the monoalphabetic cipher |

PSO | Uddin and Youssef (2006) [ |
Cryptanalysis of simple substitution ciphers |

Hameed and Hmood (2010) [ |
Cryptanalysis of transposition cipher | |

Rajkumar (2017) [ |
Cryptanalysis of Substitution Ciphers Using Particle Swarm Optimization | |

DE | Wulandari et al. (2015) [ |
Cryptanalysis of transposition cipher |

Sabonchi and Akay (2020) [ |
Cryptanalysis of Polyalphabetic Cipher Using Differential Evolution Algorithm | |

ACO | Russell et al. (2003) [ |
Breaking transposition ciphers |

Uddin and Youssef (2006b) [ |
Cryptanalysis of simple substitutions cipher | |

Mekhaznia and Menai (2014) [ |
Cryptanalysis of classical ciphers | |

Grari et al. (2016) [ |
A novel ant colony optimization-based cryptanalysis of substitution cipher | |

CS | Sadiq et al. (2014) [ |
Attacking transposition cipher using improved CS |

Jain and Chaudhari (2015) [ |
A new heuristic based on the CS for cryptanalysis of substitutions cipher | |

Bhateja et al. (2015) [ |
Cryptanalysis of Vigenere cipher | |

Jain and Chaudhari (2018) [ |
A novel cuckoo search technique for solving discrete optimization problems | |

Jain and Chaudhari (2019) [ |
An Improved Genetic Algorithm and A New Discrete Cuckoo Algorithm for Solving the Classical Substitution Cipher | |

FA | Luthra and Pal (2011) [ |
A hybrid FA and cryptanalysis of a monoalphabetic cipher |

ABC | Sabonchi and Akay (2017) [ |
Cryptanalysis using Artificial Bee Colony Algorithm Guided by Frequency based Fitness Value |

Sabonchi and Akay (2017) [ |
Cryptanalytic of Substitution Ciphers by Artificial Bee Colony Algorithm Guided by Statistics based Fitness Function | |

Brezočnik et al. (2020) [ |
Nature-Inspired Cryptoanalysis Methods for Breaking Vigenère Cipher | |

Sabonchi and Akay (2020) [ |
A Binomial Crossover Based Artificial Bee Colony Algorithm for Cryptanalysis of Polyalphabetic Cipher |

We can see that there are still gaps in the literature, especially regarding the improvement of the algorithms’ efficiency for modern cryptographic techniques. We hope that this survey will be helpful for readers interested in cryptanalysis based on metaheuristic algorithms.

This study is supported by Erciyes University Research Projects Unit with grant number FDK-2016-7085.