I found out recently that Coq had a heap implementation based on Braun trees, contributed by Jean-Christophe Filliâtre. It hadn’t heard of Braun trees, so it got me curious. I found a delightful article by Chris Okasaki (which also cites plenty of previous work which I haven’t taken time to read yet).
My main interrogation was how it compared with pairing heaps an implementation of heaps (which apparently predates the Braun-tree based one by 10 years) which is famous for being both very simple and very efficient. So I got around to making measurements (and learnt along the way that doing meaningful measurements is not an obvious thing to do).
It’s worth pointing out that, if the Braun-tree based implementation is rather simple, pairing heaps are incredibly easy to code (in fact, it may have taken me less time to write the pairing heap implementation from scratch, than to find the Coq implementation again on my hard drive, copy it in and adapt it for the sake of uniformity).
Also, besides the efficiency considerations, pairing heap support the
merge
operation which the Braun-tree based implementation does not.
I benchmarked random sequences of operations using the Core_bench micro-benchmarking tool. I ran two types of sequences of operations:
- Sequences of (random)
add
followed by the same number ofremove_min
(ordered in the result table, followed by the number ofadd
). - Random sequences of (random)
add
andremove_min
where the proportion ofremove_min
is a parameter (they are indicated by the proportion ofremove_min
, followed by the number of operations).
In random sequences, whatever the proportion of remove_min
, pairing
heap are consistently faster: a significant 25% faster. The proportion
of remove_min
-s does not affect average performance. In the case
where add
-s are done first and are followed by remove_min
-s, the
difference is smaller but pairing heaps still come out ahead.
Memory consumption is higher for the Braun-tree implementation in both cases (over twice higher in the case of the random sequences of operation).
Results grouped by test kind (sorted by time/run below):
Name | Time/Run | mWd/Run | mjWd/Run | Prom/Run | Percentage |
---|---|---|---|---|---|
Pairing heap (ordered):50 | 3.15us | 1.96kw | 1.49w | 1.49w | 4.45% |
Pairing heap (ordered):100 | 7.24us | 4.79kw | 7.26w | 7.26w | 10.24% |
Pairing heap (ordered):250 | 21.83us | 14.78kw | 54.84w | 54.84w | 30.86% |
Pairing heap (ordered):500 | 50.83us | 33.67kw | 245.65w | 245.65w | 71.85% |
Pairing heap (1/2):100 | 4.79us | 1.46kw | 0.71w | 0.71w | 6.78% |
Pairing heap (1/2):200 | 9.81us | 3.16kw | 2.82w | 2.82w | 13.87% |
Pairing heap (1/2):500 | 25.29us | 8.54kw | 17.79w | 17.79w | 35.75% |
Pairing heap (1/2):1000 | 52.41us | 17.84kw | 72.64w | 72.64w | 74.09% |
Pairing heap (1/3):100 | 4.80us | 1.46kw | 0.71w | 0.71w | 6.78% |
Pairing heap (1/3):200 | 10.00us | 3.16kw | 2.84w | 2.84w | 14.13% |
Pairing heap (1/3):500 | 25.81us | 8.54kw | 17.77w | 17.77w | 36.48% |
Pairing heap (1/3):1000 | 52.74us | 17.84kw | 72.55w | 72.55w | 74.56% |
Pairing heap (1/4):100 | 4.87us | 1.63kw | 1.10w | 1.10w | 6.89% |
Pairing heap (1/4):200 | 9.92us | 3.51kw | 4.57w | 4.57w | 14.02% |
Pairing heap (1/4):500 | 25.63us | 9.49kw | 29.45w | 29.45w | 36.23% |
Pairing heap (1/4):1000 | 52.37us | 19.79kw | 119.55w | 119.55w | 74.02% |
Pairing heap (1/7):100 | 4.65us | 1.77kw | 1.65w | 1.65w | 6.57% |
Pairing heap (1/7):200 | 9.85us | 3.82kw | 6.92w | 6.92w | 13.92% |
Pairing heap (1/7):500 | 24.98us | 10.35kw | 45.57w | 45.57w | 35.31% |
Pairing heap (1/7):1000 | 53.33us | 21.65kw | 185.02w | 185.02w | 75.39% |
Braun heap (ordered):50 | 3.42us | 2.66kw | 1.21w | 1.21w | 4.83% |
Braun heap (ordered):100 | 8.39us | 6.69kw | 5.92w | 5.92w | 11.86% |
Braun heap (ordered):250 | 25.35us | 21.24kw | 45.32w | 45.32w | 35.83% |
Braun heap (ordered):500 | 60.70us | 49.81kw | 209.43w | 209.43w | 85.80% |
Braun heap (1/2):100 | 5.81us | 2.28kw | 0.80w | 0.80w | 8.21% |
Braun heap (1/2):200 | 12.58us | 5.58kw | 3.54w | 3.54w | 17.78% |
Braun heap (1/2):500 | 33.25us | 17.57kw | 26.10w | 26.10w | 47.01% |
Braun heap (1/2):1000 | 68.71us | 40.76kw | 116.67w | 116.67w | 97.13% |
Braun heap (1/3):100 | 5.63us | 2.28kw | 0.80w | 0.80w | 7.96% |
Braun heap (1/3):200 | 12.14us | 5.58kw | 3.51w | 3.51w | 17.16% |
Braun heap (1/3):500 | 32.30us | 17.58kw | 25.91w | 25.91w | 45.65% |
Braun heap (1/3):1000 | 70.02us | 40.77kw | 116.73w | 116.73w | 98.97% |
Braun heap (1/4):100 | 5.76us | 2.57kw | 1.22w | 1.22w | 8.14% |
Braun heap (1/4):200 | 12.26us | 6.16kw | 5.48w | 5.48w | 17.34% |
Braun heap (1/4):500 | 33.05us | 18.77kw | 39.71w | 39.71w | 46.72% |
Braun heap (1/4):1000 | 70.13us | 42.87kw | 179.13w | 179.13w | 99.14% |
Braun heap (1/7):100 | 5.70us | 2.77kw | 1.78w | 1.78w | 8.06% |
Braun heap (1/7):200 | 12.03us | 6.47kw | 7.94w | 7.94w | 17.01% |
Braun heap (1/7):500 | 32.32us | 19.40kw | 57.75w | 57.75w | 45.69% |
Braun heap (1/7):1000 | 70.74us | 43.66kw | 263.82w | 263.82w | 100.00% |
Results sorted by time/run:
Name | Time/Run | mWd/Run | mjWd/Run | Prom/Run | Percentage |
---|---|---|---|---|---|
Pairing heap (ordered):50 | 3.15us | 1.96kw | 1.49w | 1.49w | 4.45% |
Braun heap (ordered):50 | 3.42us | 2.66kw | 1.21w | 1.21w | 4.83% |
Pairing heap (1/7):100 | 4.65us | 1.77kw | 1.65w | 1.65w | 6.57% |
Pairing heap (1/2):100 | 4.79us | 1.46kw | 0.71w | 0.71w | 6.78% |
Pairing heap (1/3):100 | 4.80us | 1.46kw | 0.71w | 0.71w | 6.78% |
Pairing heap (1/4):100 | 4.87us | 1.63kw | 1.10w | 1.10w | 6.89% |
Braun heap (1/3):100 | 5.63us | 2.28kw | 0.80w | 0.80w | 7.96% |
Braun heap (1/7):100 | 5.70us | 2.77kw | 1.78w | 1.78w | 8.06% |
Braun heap (1/4):100 | 5.76us | 2.57kw | 1.22w | 1.22w | 8.14% |
Braun heap (1/2):100 | 5.81us | 2.28kw | 0.80w | 0.80w | 8.21% |
Pairing heap (ordered):100 | 7.24us | 4.79kw | 7.26w | 7.26w | 10.24% |
Braun heap (ordered):100 | 8.39us | 6.69kw | 5.92w | 5.92w | 11.86% |
Pairing heap (1/2):200 | 9.81us | 3.16kw | 2.82w | 2.82w | 13.87% |
Pairing heap (1/7):200 | 9.85us | 3.82kw | 6.92w | 6.92w | 13.92% |
Pairing heap (1/4):200 | 9.92us | 3.51kw | 4.57w | 4.57w | 14.02% |
Pairing heap (1/3):200 | 10.00us | 3.16kw | 2.84w | 2.84w | 14.13% |
Braun heap (1/7):200 | 12.03us | 6.47kw | 7.94w | 7.94w | 17.01% |
Braun heap (1/3):200 | 12.14us | 5.58kw | 3.51w | 3.51w | 17.16% |
Braun heap (1/4):200 | 12.26us | 6.16kw | 5.48w | 5.48w | 17.34% |
Braun heap (1/2):200 | 12.58us | 5.58kw | 3.54w | 3.54w | 17.78% |
Pairing heap (ordered):250 | 21.83us | 14.78kw | 54.84w | 54.84w | 30.86% |
Pairing heap (1/7):500 | 24.98us | 10.35kw | 45.57w | 45.57w | 35.31% |
Pairing heap (1/2):500 | 25.29us | 8.54kw | 17.79w | 17.79w | 35.75% |
Braun heap (ordered):250 | 25.35us | 21.24kw | 45.32w | 45.32w | 35.83% |
Pairing heap (1/4):500 | 25.63us | 9.49kw | 29.45w | 29.45w | 36.23% |
Pairing heap (1/3):500 | 25.81us | 8.54kw | 17.77w | 17.77w | 36.48% |
Braun heap (1/3):500 | 32.30us | 17.58kw | 25.91w | 25.91w | 45.65% |
Braun heap (1/7):500 | 32.32us | 19.40kw | 57.75w | 57.75w | 45.69% |
Braun heap (1/4):500 | 33.05us | 18.77kw | 39.71w | 39.71w | 46.72% |
Braun heap (1/2):500 | 33.25us | 17.57kw | 26.10w | 26.10w | 47.01% |
Pairing heap (ordered):500 | 50.83us | 33.67kw | 245.65w | 245.65w | 71.85% |
Pairing heap (1/4):1000 | 52.37us | 19.79kw | 119.55w | 119.55w | 74.02% |
Pairing heap (1/2):1000 | 52.41us | 17.84kw | 72.64w | 72.64w | 74.09% |
Pairing heap (1/3):1000 | 52.74us | 17.84kw | 72.55w | 72.55w | 74.56% |
Pairing heap (1/7):1000 | 53.33us | 21.65kw | 185.02w | 185.02w | 75.39% |
Braun heap (ordered):500 | 60.70us | 49.81kw | 209.43w | 209.43w | 85.80% |
Braun heap (1/2):1000 | 68.71us | 40.76kw | 116.67w | 116.67w | 97.13% |
Braun heap (1/3):1000 | 70.02us | 40.77kw | 116.73w | 116.73w | 98.97% |
Braun heap (1/4):1000 | 70.13us | 42.87kw | 179.13w | 179.13w | 99.14% |
Braun heap (1/7):1000 | 70.74us | 43.66kw | 263.82w | 263.82w | 100.00% |
The Braun-tree based implementation has good worst-case bounds: add
and remove_min
are both O(log(n)) worst-case. Pairing heaps, on the
other hands, have O(1) add
but O(n) remove_min
in the worst-case.
In the table, we can observe that the time of a sequence of operations
is roughly linear in its length, indicating that both implementations
have O(1) amortised time for both add
and remove_min
. However,
this is under the assumption that the heaps are used in a
single-threaded manner. Since both implementation are purely
functional, it is possible to make pairing heap very inefficient by
repeating many times a slow remove_min
. Braun heaps are much more
resistant to this kind of attacks.
In an environment with ocamlbuild
, ocamlfind
and core_bench
available, run the benchmarks with:
ocamlbuild -use-ocamlfind bench.native --
In an environment with ocamlbuild
, ocamlfind
and qtest
available, run the tests with:
ocamlbuild -use-ocamlfind test.native --
To build insert the table into Org mode I run the tests with
ocamlbuild -use-ocamlfind heap.native -- -ascii
Then I copy it in an Org mode buffer, remove the first column because
it isn’t parsed well by Org mode. Select the table, C-c |
, paste the
first column back, insert |
in front of each line with
string-insert-rectangle
, make sure the horizontal-rule line starts
with |-
(no space), normalise the table (C-c C-c
).
Thanks @aspiwack for this quite interesting benchmark!
For the record, this is because of Okasaki's paper that I opted for the Braun tree based version. At that time, I knew already several other ways of implementing priority queues (e.g. binomial heaps, leftist heap, skew heaps, and pairing heaps, at least). But Braun trees are my favorite.
Note: The truth is that, most of the time, you don't need priority queues to be applicative. If so, a binary heap stored inside a (resizable) array is to be preferred, as is much more efficient. It is a pity that the literature is full of cute applicative heap data structures, but that we don't need them in practice. But may be that's the case for this specific application in Coq...