Lattices are hard. As someone who doesn't consider mathematics his primary interest I take solice in the words of whoever wrote the NTL LLL documentation:
I think it is safe to say that nobody really understands how the LLL algorithm works. The theoretical analyses are a long way from describing what "really" happens in practice. Choosing the best variant for a certain application ultimately is a matter of trial and error.
There are a few libraries with LLL support:
- liblll - the easiest to get working, simply clone source and use in your project folder. Pip install didn't work. Blog here was helpful (http://kutioo.blogspot.com/2011/12/liblll.html)
- fpylll - should have worked easily but couldn't get it installed with all the dependencies on CentOS 7. It's a python frontend for the C library fplll and relies on cython which had problems installing due to cysignals which has to be compiled without FORTIFY_SOURCE. Nearest I can tell this is something that's a problem on CentOS due to security measures like SELinux... not worth mucking with.
- pymatgen - this is listed on Wikipedia, but be aware it only supports 3x3 matrices... so seems useless for crypto problems.
- fplll - Floating Point LLL (FPLLL) is one of the more prominent in Google searches and was simple to clone and build from source. Getting a basic client program example working with it isn't as straight forward as liblll however...
- ntl - Number Theory Library (NTL) is listed on Wikipedia, wasn't too hard to download, build, and hack an example to use.
- https://kel.bz/post/lll/ - helpful blog about the LLL algorithm in general, with code snippets
- http://www.math.ucsd.edu/~crypto/Projects/JenniferBakker/Math187/#Anchor-Attackin-35484 - blog about the LLL algorithm, not sure it's input/output is actually correct... but gives 1 data point.
- http://kutioo.blogspot.com/2011/12/liblll.html - blog about using liblll
- https://martinralbrecht.wordpress.com/2016/04/03/fpylll/ - blog about using fpylll
Using a Merkle-Hellman Knapsack encryption engine with the following parameters:
plaintext = 'goddoesnotplaydice'
# superincreasing set of integers
A = [3, 5, 10, 19, 41, 79, 161, 320, 641, 1311]
# integer such that M > sum(A)
M = 2629
# integer such that 1 < W < M and gcd(M,W) = 1
W = 1036
You get the following public key and ciphertext (I'm using a specific method of encoding the plaintext ASCII into decimal values first):
publickey = [479, 2551, 2473, 1281, 412, 345, 1169, 266, 1568, 1632]
ciphertext = [5622, 3258, 4589, 8349, 9224, 7001, 1514, 3460, 1926]
The question is how can you recover some information about the private key or the plaintext itself? The LLL algorithm provides an answer!
The following is using the liblll
Python library:
for cipherdecimal in ciphertext:
M = create_matrix_from_knapsack(publickey,cipherdecimal)
M_reduced = lll_reduction(M)
plainbits = best_vect_knapsack(M_reduced)
plainbits_array.append(plainbits[:10])
plaintext = decode(plainbits_array)
print "Plaintext: " + str(plaintext) + "\n"
This includes the helper functions create_matrix_from_knapsack
and best_vect_knapsack
which just organize some input and output data for the real heavy lifter lll_reduction
. The function decode
in this case is mine, and just maps the plaintext decimal values back to ASCII digraph pairs. The output of using the liblll
library for this ciphertext/public key was:
Plaintext: goddaasnotplayaace
Clearly there are a few errors here, though it did a decent job, enough that standard cryptanalysis could figure out the rest. The two errors resulted inaa
digraphs which is basically the value 00
. Each time this happens the liblll
library prints the following to stdout:
No direct solution found, apply heuristic
no solution found with heuristics
Not sure what's going on here, so I tried a new library to see if the results are the same.
Since the Python fpylll
didn't install easily, decided to just go straight into using the C lib. To run a quick test, you can check the equivalent to the liblll
's lll_reduction
function by running fplll
executable from a terminal with a file that contains the matrix to reduce (one matrix per ciphertext value). A file called test
with the following:
[[ 1 0 0 0 0 0 0 0 0 0 479]
[0 1 0 0 0 0 0 0 0 0 2551]
[0 0 1 0 0 0 0 0 0 0 2473]
[0 0 0 1 0 0 0 0 0 0 1281]
[0 0 0 0 1 0 0 0 0 0 412]
[0 0 0 0 0 1 0 0 0 0 345]
[0 0 0 0 0 0 1 0 0 0 1169]
[0 0 0 0 0 0 0 1 0 0 266]
[0 0 0 0 0 0 0 0 1 0 1568]
[0 0 0 0 0 0 0 0 0 1 1632]
[0 0 0 0 0 0 0 0 0 0 -5622]]
Contains the identity matrix, public key, and ciphertext decimal. The matrix expected by fplll
is rotated compared to the format used by liblll
so just be aware if there are differences in output that it could be the problem. This produces the following when you pipe this matrix into the binary:
cat test | /usr/local/bin/fplll
[[0 0 1 0 1 0 1 0 1 0 0 ]
[0 1 -1 0 0 -1 0 1 0 0 -1 ]
[1 0 0 0 -2 1 0 0 0 0 0 ]
[-1 0 0 0 0 -2 1 0 0 0 0 ]
[1 -1 0 0 0 0 2 -1 0 0 0 ]
[-1 0 0 0 0 1 -1 -1 1 0 -1 ]
[1 1 1 0 0 -1 -1 0 0 1 -1 ]
[0 0 0 1 -1 -1 0 1 1 2 0 ]
[-1 2 0 1 0 -1 0 0 -1 1 1 ]
[-1 1 0 -2 1 1 0 -1 0 0 1 ]
[0 0 2 0 0 1 0 1 -1 1 -1 ]
]
The result of the best_vect_knapsack
in liblll
is the same as the first row in this case:
[0 0 1 0 1 0 1 0 1 0 0 ]
Encrypt using the Merkel-Hellman Knapsack scheme:
plaintext = [ 0 1 1 1 1 1 0 1 0 1 ]
public key = [479 2551 2473 1281 412 345 1169 266 1568 1632]
2551 + 2473 + 1281 + 412 + 345 + 266 + 1632
ciphertext = 8960
Use the public key and ciphertext to create the input matrix to the LLL algorithm:
[[1 0 0 0 0 0 0 0 0 0 479]
[0 1 0 0 0 0 0 0 0 0 2551]
[0 0 1 0 0 0 0 0 0 0 2473]
[0 0 0 1 0 0 0 0 0 0 1281]
[0 0 0 0 1 0 0 0 0 0 412]
[0 0 0 0 0 1 0 0 0 0 345]
[0 0 0 0 0 0 1 0 0 0 1169]
[0 0 0 0 0 0 0 1 0 0 266]
[0 0 0 0 0 0 0 0 1 0 1568]
[0 0 0 0 0 0 0 0 0 1 1632]
[0 0 0 0 0 0 0 0 0 0 -8960]]
The fplll
implementation outputs:
[1 0 0 0 -2 1 0 0 0 0 0]
[0 0 0 0 -2 -1 1 0 0 0 0]
[0 1 -1 0 0 -1 0 1 0 0 -1]
[2 0 -1 0 0 1 1 0 0 0 -1]
[-1 0 0 0 0 1 -1 -1 1 0 -1]
[0 1 1 1 -1 0 1 1 0 1 0]
[0 0 1 0 1 -1 -2 -1 -1 1 0]
[-1 0 1 0 1 0 -1 1 -2 1 -1]
[1 -1 -1 1 0 0 0 0 0 2 0]
[-1 1 0 -2 1 1 0 -1 0 0 1]
[0 1 -1 1 1 0 0 -1 -2 1 1]
And liblll
outputs:
[1 -1 0 1 -1 -1 2 1 0 -1 0]
[0 0 1 0 0 1 0 -1 1 0 1]
[0 0 -1 -1 0 0 0 -1 -1 1 1]
[0 0 0 0 0 -2 0 1 1 0 1]
[-2 0 0 2 0 1 -1 0 1 1 -1]
[1 -2 -1 0 1 1 -1 0 0 0 0]
[0 1 0 1 -1 0 0 0 0 -1 1]
[0 0 1 0 -1 -1 -1 0 -1 1 1]
[0 0 0 0 1 0 -1 0 -2 -2 0]
[0 0 0 0 0 0 1 2 1 1 1]
[0 0 -1 -1 -1 1 -1 0 1 -1 0]
And ntl
outputs:
[1 0 0 0 -2 1 0 0 0 0 0]
[0 0 0 0 -2 -1 1 0 0 0 0]
[0 1 -1 0 0 -1 0 1 0 0 -1]
[2 0 -1 0 0 1 1 0 0 0 -1]
[-1 0 0 0 0 1 -1 -1 1 0 -1]
[0 1 1 1 -1 0 1 1 0 1 0]
[1 0 1 0 -1 0 -2 -1 -1 1 0]
[0 0 1 0 -1 1 -1 1 -2 1 -1]
[1 -1 -1 1 0 0 0 0 0 2 0]
[-1 1 0 -2 1 1 0 -1 0 0 1]
[0 1 -1 1 1 0 0 -1 -2 1 1]
Obviously none of these are valid answers as they aren't in the set {0,1}
, though it is interesting that both ntl
and fplll
produce exactly the same results. Likely they implement the algorithm identically.
Pipe plaintext in to get ciphertext out. Plaintext must be all lowercase without punctuation, spaces and newlines will be automatically removed. For example:
$ cat plaintext.txt
the most merciful thing in the world i think is the inability of the human mind
to correlate all its contents we live on a placid island of ignorance in the midst
of black seas of infinity and it was not meant that we should voyage far
the sciences each straining in its own direction have hitherto harmed us little
but some day the piecing together of dissociated knowledge will open up such
terrifying vistas of reality and of our frightful position therein that we shall
either go mad from the revelation or flee from the light into the peace and safety
of a new dark age
$ cat plaintext.txt | python merkle_hellman_knapsack.py
[8960, 2304, 7592, 9799, 4839, 2192, 4652, 4940, 6233, 5346, 7166, 8960, 5041, 7656, 4595, 7366, 6233, 5612, 5734, 8960, 2038, 5745, 1980, 4797, 2313, 6221, 8960, 6328, 1090, 3067, 3832, 6075, 8497, 2849, 10674, 5238, 3545, 2862, 5764, 7366, 8484, 7390, 8630, 6142, 9454, 5238, 6064, 3872, 5745, 7001, 1568, 7299, 5734, 5899, 6075, 6221, 5933, 4244, 8518, 4443, 2038, 6142, 5967, 3832, 1693, 8497, 5673, 5899, 3760, 7819, 1913, 6221, 7166, 5210, 6914, 2313, 3067, 3460, 745, 1913, 4244, 9799, 2862, 6142, 8960, 3545, 1760, 11019, 5024, 4595, 1236, 2517, 2885, 4041, 9372, 5967, 8484, 4365, 4443, 4775, 2862, 5126, 10183, 8518, 7166, 7166, 3151, 6914, 479, 6592, 6075, 5798, 4430, 8896, 7390, 5064, 2725, 6233, 8960, 4839, 8497, 5064, 8139, 6062, 3561, 4797, 2111, 4531, 3415, 479, 5758, 6062, 1514, 8960, 5504, 4365, 2192, 5346, 8497, 2885, 8960, 4839, 6221, 3460, 8551, 7247, 4099, 8630, 2795, 4244, 5226, 6062, 2885, 2026, 5764, 8958, 3936, 4290, 7886, 5126, 8630, 10674, 5997, 3686, 5346, 3628, 10183, 1913, 6221, 8784, 4369, 7366, 2517, 6075, 6221, 5024, 10416, 7873, 6085, 10262, 4024, 5635, 8085, 8896, 7390, 8960, 4839, 2038, 6142, 5064, 745, 4775, 5064, 5764, 2038, 8960, 4839, 5622, 4477, 4826, 7474, 8201, 5967, 8784, 2725, 5899, 8896, 7390, 7656, 5540, 3128, 6018, 5758, 8960, 5238, 5933, 6555, 7166, 8497, 8960, 5504, 2862, 1926, 3067, 1693, 1898, 6407, 5520, 4041, 6011, 5871, 1977, 2817, 2885]
This uses some default private key parameters A, M, W, and therefore public key B. If you want to edit them you have to edit the parameters in the python file. Or you can just use the functions in the file directly from another python program.
You can also pipe the ciphertext output of merkle_hellman_knapsack.py
to the lll_lattice_basis_reduction_attack.py
file to run it through the LLL algorithm attack.
$ cat plaintext.txt | python merkle_hellman_knapsack.py | python lll_lattice_basis_reduction_attack.py
### Recovered plaintext:
------tmercifu----ngin----orldit----is----nability----ehumanmindtocorr--ateallitscon--ntsw--iv--naplacidis--n
d--ignoranc--nthemidstofb----seas--infinityan--twasnotmeant--atweshould----gefarthescienceseachstraininginits
owndirec--on--ve----erto--rmeduslittlebutsomeday----iecingtoge--er----ss--ia--dknowledgewillop--upsuch--rrify
ingvistas----alit--nd--ourfrigh--ulposi--on--er--nt--twes--ll----ergomadfromthe--ve----onorfleefrom----ightin
to----eaceandsafet--fa----arkage