I was curious about the results reported here, which reports that Scala's mutable maps are slower than Java's: http://www.infoq.com/news/2011/11/yammer-scala
In my tests, Scala's OpenHashMap equals or beats java's HashMap:
Insertion 100k elements (String keys) time in ms:
- scala HashMap: 92.75
- scala OpenHashMap: 14.03125
- java HashMap: 15.78125
Insertion 100k elements (Integer keys) time in ms:
- scala HashMap: 98.625
- scala OpenHashMap: 13.71875
- java HashMap: 23.5625
Hastily thrown together timing code below. I at least run enough trials to get a half second sample.
Scala's mutable HashMap implementation was known to be slower than HashMap, which motivated David MacIver to write OpenHashMap way back in 2008. Unfortunately, not many people know it.
package org.apache.spark
object Test {
def time(warmup: Int)(block: => Any) = {
for (i <- 1 to warmup) block // possibly trigger JIT on freq used methods in block
var goodSample = false
var start = 0L; var stop = 0L;
var N = 10
while (!goodSample) {
start = System.currentTimeMillis
for (i <- 1 to N) block
stop = System.currentTimeMillis
if (stop - start < 500) N = N*4 // require at least half a second for a decent sample
else goodSample = true
}
val perOp = (stop.toDouble - start.toDouble) / N
perOp
}
def main(args: Array[String]) {
val vs = (0 to 100000).map(_ toString).toArray
val warmup = 10
//
// {
// println(time(warmup) {
// val m = new scala.collection.mutable.HashMap[String, Int];
// var i = 0;
// while (i < 100000) {
// m.put(vs(i), i);
// i = i + 1;
// };
// })
//
// println(time(warmup) {
// val m = new scala.collection.mutable.OpenHashMap[String, Int];
// var i = 0;
// var start = System.currentTimeMillis();
// while (i < 100000) {
// m.put(vs(i), i);
// i = i + 1;
// };
// })
//
// println(time(warmup) {
// val m = new java.util.HashMap[String, Int];
// var i = 0;
// while (i < 100000) {
// m.put(vs(i), i);
// i = i + 1;
// };
// })
// }
// With Integer keys
{
println("scala HashMap: " + time(warmup) {
val m = new scala.collection.mutable.HashMap[Int, Int];
var i = 0;
while (i < 100000) {
m.put(i, i);
i = i + 1;
};
})
println("scala OpenHashMap: " + time(warmup) {
val m = new scala.collection.mutable.OpenHashMap[Int, Int];
var i = 0;
var start = System.currentTimeMillis();
while (i < 100000) {
m.put(i, i);
i = i + 1;
};
})
println("Spark OpenHashMap: " + time(warmup) {
val m = new org.apache.spark.util.collection.OpenHashMap[Int, Int]
var i = 0
while (i < 100000) {
m(i) = i
i = i + 1
}
})
println("java HashMap: " + time(warmup) {
val m = new java.util.HashMap[Int, Int];
var i = 0;
while (i < 100000) {
m.put(i, i);
i = i + 1;
};
})
}
}
}