hcrypto.analyzer
Class DigramAnalyzer

java.lang.Object
  extended by hcrypto.analyzer.CryptoAnalyzer
      extended by hcrypto.analyzer.DigramAnalyzer
All Implemented Interfaces:
Analyzer, java.lang.Runnable

public class DigramAnalyzer
extends CryptoAnalyzer

Analyzes traditional cryptograms of the sort you find in the newspaper. It assumes the text is a word-delimited cryptogram created using a simple substitution cipher.

Credits: The algorithm here is based on the algorithm reported by Thomas Jakobsen, A fast method for the cryptanalysis of substitution ciphers. in Cryptologia, Volume 19, Issue 3 July 1995 , pages 265 - 274.

At present this analyzer only works for the alphabet 'a' to 'z'.

The algorithm computes a digram distribution matrices, that are used to compare the distribution of digram frequencies in the cryptogram with their distribution in standard English text (as computed from a book, such as Tom Sawyer). Cryptanalysis proceeds by generating an initial key and then computing how close to a correct decryption it comes. However, rather than using the key to decrypt the cryptogram, the algorithm simply compares the distribution matrices. This results in a much faster algorithm.

The fitness function, in this case, is the sum of the numerical differences between the frequencies of the English and crypto matrices. The closer to 0, the better the fit.

The algorithm

  1. Construct an initial key, k, based on letter frequencies of cryptogram.
  2. Let v = f(d(c,k)) be the fitness of a decryption based on the key, k.
  3. Let k' = k.
  4. Change k' by swapping two elements a and b in k'.
  5. Let v' = f(d(c,k')).
  6. If v' < v, let v = v' and k = k'.
  7. Go to 3 or terminate if v' < v hasn't been true in N iterations for some N.

Speedup. This algorithm can be sped up by noticing that swapping a and b in the key would result in swapping columns a and b and then rows a and b in the cryptograms digram distribution matrix. The speedup is due to the fact that the crypto text needs to be parsed only once.

Swapping strategy. Although the algorithm will work for random a and b, it uses a more intelligent strategy -- namely, swap letters whose frequencies are close together in the cryptogram. This can be done by letting S represent a vector of ciphertext symbols ranked in order from highest to lowest, so S1 probably probably represents the space, S2, the letter 'e' and so forth.

The swapping strategy goes as follows: S1/S2, S2/S3,...,S26/S27. Then S1/S3, S2/S4, ...,S25/S27, and so on.


Field Summary
 
Fields inherited from class hcrypto.analyzer.CryptoAnalyzer
PERMUTATION, PLAYFAIR, RAILFENCE, SIMPLESUB
 
Fields inherited from interface hcrypto.analyzer.Analyzer
DECIPHER_LIMIT
 
Constructor Summary
DigramAnalyzer()
           
DigramAnalyzer(AnalyzerFrame f)
           
DigramAnalyzer(TextStatistics ts)
           
 
Method Summary
 java.lang.String decrypt(boolean replace_brace)
          Applies the current key, d_key, to the cryptogram itself.
 void doAnalysis()
          This method performs an analysis of the text using the algorithm described above.
 java.lang.String getReport()
          This method is part of the Analyzer interface.
 void run()
          This method is part of the Analyzer interface.
 void setup(java.lang.String text)
          This method initializes the Analyzer.
 java.lang.String toString()
          This method returns the report generated by the analysis.
 
Methods inherited from class hcrypto.analyzer.CryptoAnalyzer
prettyPrint, setup, stopThread, threadIsStopped
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DigramAnalyzer

public DigramAnalyzer()

DigramAnalyzer

public DigramAnalyzer(AnalyzerFrame f)

DigramAnalyzer

public DigramAnalyzer(TextStatistics ts)
               throws java.lang.NullPointerException
Throws:
java.lang.NullPointerException
Method Detail

setup

public void setup(java.lang.String text)
This method initializes the Analyzer.

Specified by:
setup in interface Analyzer
Overrides:
setup in class CryptoAnalyzer
Parameters:
text - a String pointing to the text being analyzed

run

public void run()
This method is part of the Analyzer interface. It runs the analysis.

Specified by:
run in interface Analyzer
Specified by:
run in interface java.lang.Runnable
Overrides:
run in class CryptoAnalyzer

getReport

public java.lang.String getReport()
This method is part of the Analyzer interface. It returns the report generated by the analysis.

Specified by:
getReport in interface Analyzer
Overrides:
getReport in class CryptoAnalyzer

toString

public java.lang.String toString()
This method returns the report generated by the analysis.

Overrides:
toString in class CryptoAnalyzer

decrypt

public java.lang.String decrypt(boolean replace_brace)
Applies the current key, d_key, to the cryptogram itself. It is only necessary to do this when the hill-climbing algorithm terminates.

Parameters:
replace_brace - true if you want to replace } with a blank
Returns:
The decrypted ciphertext.

doAnalysis

public void doAnalysis()
This method performs an analysis of the text using the algorithm described above. It assumes the text is a cryptogram created using a simple substitution cipher.