Here are my findings of entropy extraction estimates from mouse movement events in the browser. Tables below show the results sorted by the minimum entropy extraction. Timing events, keyboard events, and other potential sources of entropy that can be collected from the user are not considered here.
A visual representation of slow, medium, and fast mouse movements can help visualize why the entropy estimation increases as the mouse velocity increases. The recorded data was plotted with Gnuplot as follows:
set term png;
set output 'mouse-slow.png';
set key off;
set xlabel "Browser client x-coordinate";
set ylabel "Browser client y-coordinate";
# the browser window size on my Pinebook- adjust as necessary
set xrange [426:938];
set yrange [10:522];
set term png size 512,512;
plot 'mouse-slow.txt' linetype 1 pointsize .5 linecolor black;
The plots were then drawn with:
$ gnuplot -c plot.gp
At the bottom of this document is an HTML proof-of-concept to record (X, Y) coordinate pairs for your own testing.
To estimate the entropy inherent in the mouse movement events, I am using compression to compress the recorded (X, Y) coordinate pairs, using the tightest compression possible for each compression algorithm. The goal is to get an approximation of the minimum entropy as a base to build from.
Per NIST, compression can be used as a good entropy estimator:
The compression estimate, proposed by Hagerty and Draper [HD12], computes the entropy rate of a dataset, based on how much the dataset can be compressed. This estimator is based on the Maurer Universal Statistic [Mau92]. The estimate is computed by generating a dictionary of values, and then computing the average number of samples required to produce an output, based on the dictionary. One advantage of using the Maurer statistic is that there is no assumption of independence. When sequences with dependencies is tested with this statistic, the compression rate is affected (and therefore the entropy), but an entropy estimate is still obtained. A calculation of the Maurer statistic is efficient, as it requires only one pass through the dataset to provide an entropy estimate.
Another advantage of using compression is the availability of various compression algorithms in userspace. This makes it trivial to extract entropy estimates without writing possibly complex tools for calculating collision estimates or lag prediction estimates.
The following command line switches were passed to each compression algorithm:
- Bzip2:
$ bzip --best $file
- Gzip:
$ gzip --best $file
- LZ4:
$ lz4 -9 $file
- LZMA:
$ lzma --best $file
- LZO:
$ lzop --best $file
- XZ:
$ xz --best $file
- ZPAQ:
$ zpaq -a ${file}.zpaq $file -m5
- lrzip:
$ lrz --best $file
As compression algorithms improve, also per the NIST SP800-90B document, it's recommended to reduce the minimum entropy by a factor of 2, just to be safe.
Because the <div>
is 512x512 pixels square, every (X, Y) coordinate pair can
fit within 18 bits. This fact was used when calculating how much entropy can be
extracted knowing the "bits per byte" result.
From painfully slow mouse movements across the <div>
element, trying to get
(X, Y) coordinates as sequential as possible, the following entropy extraction
estimates are recommended:
- Min: 2.75-bits
- NIST: 1.375-bits
The original file size was 7936 bytes. Sorted by compressed bytes:
Comp. Alg. | Comp. Bytes | Comp. % | Ent. Bits/Byte | Ent. Extracted |
---|---|---|---|---|
LZMA | 1209 | 84.77 | 1.22 | 2.75 |
XZ | 1256 | 84.17 | 1.27 | 2.86 |
lrzip | 1280 | 83.87 | 1.29 | 2.90 |
ZPAQ | 2200 | 72.28 | 2.22 | 5.00 |
Gzip | 2224 | 71.98 | 2.24 | 5.04 |
BZip2 | 2366 | 70.19 | 2.38 | 5.36 |
LZO | 3402 | 57.13 | 3.43 | 7.72 |
LZ4 | 4771 | 39.88 | 4.81 | 10.82 |
From moving the mouse around the <div>
element as I would normally. No rhyme
or reason was made in drawing anything in the browser. As such, the following
entropy extraction estimates would be recommended:
- Min: 5.24-bits
- NIST: 2.62-bits
The original file size was 7791 bytes. Sorted by compressed bytes:
Comp. Alg. | Comp. Bytes | Comp. % | Ent. Bits/Byte | Ent. Extracted |
---|---|---|---|---|
LZMA | 2270 | 70.86 | 2.33 | 5.24 |
XZ | 2316 | 70.27 | 2.38 | 5.36 |
lrzip | 2340 | 69.97 | 2.40 | 5.40 |
BZip2 | 2831 | 63.66 | 2.91 | 6.55 |
ZPAQ | 3025 | 61.71 | 3.06 | 6.89 |
Gzip | 3426 | 56.03 | 3.52 | 7.92 |
LZO | 4439 | 43.02 | 4.56 | 10.26 |
LZ4 | 5964 | 23.45 | 6.12 | 13.77 |
I moved the mouse across the browser window as insanely fast as possible. No
surprise, this approach took the longest to gather all 1,000 events, as most
mouse movement was likely happing outside of the <div>
, and as such, not
getting recorded. These entropy extraction estimates would be:
- Min: 6.62-bits
- NIST: 3.31-bits
The original file size was 7775 bytes. Sorted by compressed bytes:
Comp. Alg. | Comp. Bytes | Comp. % | Ent. Bits/Byte | Ent. Extracted |
---|---|---|---|---|
BZip2 | 2862 | 63.19 | 2.94 | 6.62 |
LZMA | 3014 | 61.23 | 3.10 | 6.98 |
XZ | 3060 | 60.64 | 3.15 | 7.09 |
lrzip | 3083 | 60.35 | 3.17 | 7.13 |
ZPAQ | 3423 | 55.97 | 3.52 | 7.92 |
Gzip | 3557 | 54.25 | 3.66 | 8.24 |
LZO | 4537 | 41.65 | 4.67 | 10.51 |
LZ4 | 6033 | 22.41 | 6.21 | 13.97 |
You should run your own tests, and many of them, then record the minimum entropy extraction per event reduced by a factor of 2, per NIST's recommendation. At the bare minimum, you can extract 1-bit per event, also per NIST.
Once collected, run the data through a cryptographically secure key derivation function (KDF) with arbitrary output. Recommended is PBKDF2, scrypt, or Argon2. An extensible output function (XOF) would be sufficient also, such as SHAKE128, SHAKE256, & BLAKE2,
Via my findings, if I were to use 1.375-bits for me extracted estimate per event, then based on the security margin I need in bits, the following table shows the minimum number of event necessary to capture to make that possible:
Security Needed | Events Needed |
---|---|
64 | 47 |
70 | 51 |
80 | 59 |
96 | 70 |
128 | 94 |
224 | 163 |
256 | 187 |
384 | 280 |
512 | 373 |
521 | 379 |
1024 | 745 |
2048 | 1490 |
4096 | 2979 |
8192 | 5958 |
Use this to test mouse entropy extraction yourself. The (X, Y) coordinate pairs are logged to the JavaScript console.
<!DOCTYPE html>
<html lang="en">
<head>
<title>JavaScript entropy collector</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style>
#ent {
background-color: white;
border: 1px solid black;
height: 512px;
width: 512px;
}
</style>
</head>
<body id="body">
<center>
<div id="ent"></div>
</center>
<script>
var i=0, limit=1000;
var logCoords = function(e) { console.log(e.clientX, e.clientY); };
var ent = document.getElementById("ent");
var body = document.getElementById("body");
ent.onmousemove = function(e) {
// console.log({i, limit}); // debug
if(++i <= limit) { return logCoords(e); }
body.style.backgroundColor = "red";
return (ent.onmousemove = null);
};
</script>
</body>
</html>