If you have a command line that can generate a result, and you want to test whether successive results are statistically independent of each other, you can use the following:
seq <N_TESTS> \
| parallel -n0 <COMMAND...> \
| uniq -c \
| awk '{ NR%2 ? n1+=$1 : n2+=$1 } END { print NR, n1, n2 }' \
| python -c 'from statistics import NormalDist as N; import math; n_runs, n1, n2 = map(int, input().split()); n = n1+n2; μ=1.0*(2*n1*n2)/n + 1; σ=math.sqrt(1.0*(μ-1)*(μ-2)/(n-1)); Z = (n_runs-μ)/σ; p = N().cdf(-abs(Z)); p_two_tailed = p*2; print(); print(*list(vars().items())[-9:], sep="\n")'
(To explain: uniq -c counts consecutive identical lines, i.e. it produces one count per run -- it only gets you overall unique counts if you sort the data first.)
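As a sketch of what that means, Python's itertools.groupby groups consecutive equal items the same way uniq does, so it can illustrate the runs-vs-totals distinction:

```python
from itertools import groupby

data = ["a", "a", "b", "b", "b", "a"]

# uniq -c equivalent: one (count, value) pair per run of consecutive equal lines
runs = [(len(list(group)), value) for value, group in groupby(data)]
print(runs)    # [(2, 'a'), (3, 'b'), (1, 'a')]

# sorting first merges the runs, giving overall counts instead (like sort | uniq -c)
totals = [(len(list(group)), value) for value, group in groupby(sorted(data))]
print(totals)  # [(3, 'a'), (3, 'b')]
```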
p_two_tailed in this case represents the "probability of getting the observed results if they are statistically independent of each other". You can only conclude that the results are not independent if the p-value is small, e.g. less than 0.05 or 0.01.
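For reference, the statistics packed into the Python one-liner can be unrolled into a readable function -- a sketch of the same arithmetic (this is the Wald-Wolfowitz runs test), not a drop-in replacement for the pipeline:

```python
from math import sqrt
from statistics import NormalDist

def runs_test(n_runs: int, n1: int, n2: int) -> float:
    """Two-tailed p-value for the Wald-Wolfowitz runs test.

    n_runs: observed number of runs of consecutive identical values
    n1, n2: total occurrences of each of the two values
    """
    n = n1 + n2
    mu = 2.0 * n1 * n2 / n + 1                    # expected number of runs
    sigma = sqrt((mu - 1) * (mu - 2) / (n - 1))   # its standard deviation
    z = (n_runs - mu) / sigma
    return 2 * NormalDist().cdf(-abs(z))

p = runs_test(258, 242, 258)  # numbers from the sample run below
print(p)                      # ≈ 0.515, matching p_two_tailed in the sample run
```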
Note that you may need to either:
- add extra quoting to <COMMAND...>, because parallel interprets the args as a shell script by default, or
- pass the -q/--quote flag to parallel if your shell command consists of a single program with args.
You can also pass additional args (e.g. --progress) to parallel, as long as they only output to stderr, not stdout.
For example:

$ seq 500 \
> | parallel -n0 echo '$((RANDOM%2))' \
> | uniq -c \
> | awk '{ NR%2 ? n1+=$1 : n2+=$1 } END { print NR, n1, n2 }' \
> | python -c 'from statistics import NormalDist as N; import math; n_runs, n1, n2 = map(int, input().split()); n = n1+n2; μ=1.0*(2*n1*n2)/n + 1; σ=math.sqrt(1.0*(μ-1)*(μ-2)/(n-1)); Z = (n_runs-μ)/σ; p = N().cdf(-abs(Z)); p_two_tailed = p*2; print(); print(*list(vars().items())[-9:], sep="\n")'
('n_runs', 258)
('n1', 242)
('n2', 258)
('n', 500)
('μ', 250.744)
('σ', 11.157671366941461)
('Z', 0.6503149054468893)
('p', 0.25774441546058247)
('p_two_tailed', 0.5154888309211649)