Here are the latest results and conclusion.
Essentially, for Span specifically, calling Slice and then IndexOf is at least as fast as calling an IndexOf overload that removes the need to Slice.
(I am limiting the discussion to fast span, since the conclusion does not generally change for slow span.)
When comparing:
var temp = bytes.IndexOf(LookupVal, startIndex, count);
VS
var temp = bytes.Slice(startIndex, count).IndexOf(LookupVal);
We observe that the Slice overhead is smaller than the overhead of calling an IndexOf overload that takes the additional arguments, together with the helper function changes that overload requires. The only difference between the two helper implementations is a constant-time change to two local variables outside the indexing loop.
The execution time breakdown of slicing and then calling IndexOf shows that the Slice overhead is a very small constant.
In essence, Slice is an O(1) operation while IndexOf is an O(n) operation, so IndexOf dominates for large span lengths. The initial goal was to investigate whether, for some small n, removing the constant-time Slice overhead by implementing an IndexOf overload with additional parameters would improve performance. It turns out that the constant-time overhead of supporting the new overload is greater than (or equal to) the Slice overhead.
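As a rough illustration of the two call shapes being compared, here is a minimal sketch. `IndexOfInRange` is a hypothetical stand-in for the proposed overload, not the implementation from the PR, and its choice to return an index relative to `startIndex` is an assumption made so that both calls agree.

```csharp
using System;

class SliceIndexOfSketch
{
    // Hypothetical stand-in for the proposed IndexOf(value, startIndex, count) overload.
    // Assumption: the result is relative to startIndex, matching Slice(...).IndexOf(...).
    static int IndexOfInRange(ReadOnlySpan<byte> span, byte value, int startIndex, int count)
    {
        // Constant-time range validation, done once before the loop.
        if ((uint)startIndex > (uint)span.Length || (uint)count > (uint)(span.Length - startIndex))
            throw new ArgumentOutOfRangeException();

        // O(count) linear scan over the requested window.
        for (int i = 0; i < count; i++)
        {
            if (span[startIndex + i] == value)
                return i;
        }
        return -1;
    }

    static void Main()
    {
        byte[] data = { 10, 20, 30, 40, 50, 60 };
        Span<byte> bytes = data;

        // Slice first (O(1)), then search the resulting window (O(n)).
        int viaSlice = bytes.Slice(2, 3).IndexOf((byte)40);

        // Search the same window without creating an intermediate span.
        int viaOverload = IndexOfInRange(bytes, 40, 2, 3);

        Console.WriteLine(viaSlice);    // 1
        Console.WriteLine(viaOverload); // 1
    }
}
```

Both forms pay a constant cost up front (creating the sliced span in one case, validating and carrying the extra arguments in the other) before the same linear scan, which is why the difference can only show up for small n.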
Regarding the inconsistent benchmark results for large span lengths (where some compiler/JIT optimization was affecting the results): making explicit use of the return value of IndexOf makes the results consistent:
var temp = bytes.Slice(startIndex, count).IndexOf(LookupVal);
if (temp == -1)
{
    Console.WriteLine(temp);
}
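A minimal timing sketch of the same idea follows. It swaps in a different technique from the snippet above: instead of branching on the result, it accumulates the result into a `sink` variable that is printed at the end. The buffer size, iteration count, and names are assumptions chosen for illustration. The premise that an unused result lets the JIT remove or reorder work is my reading of the inconsistency, not something measured here.

```csharp
using System;
using System.Diagnostics;

class ConsumeResultSketch
{
    static void Main()
    {
        // Illustrative sizes only; not the values used in the original benchmarks.
        byte[] data = new byte[1024];
        data[data.Length - 1] = 1;   // place the lookup value at the very end
        const int iterations = 100_000;

        long sink = 0;               // consumes every result so the call has an observable effect
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Span<byte> bytes = data;
            sink += bytes.Slice(0, data.Length).IndexOf((byte)1);
        }
        stopwatch.Stop();

        // Printing the sink keeps the accumulated value live past the loop.
        Console.WriteLine($"sink={sink}, elapsed={stopwatch.ElapsedMilliseconds} ms");
    }
}
```

A hand-rolled Stopwatch loop like this is only a sanity check; it does not replace the benchmark harness used for the numbers in the PR.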
From the benchmark tests submitted within the PR in corefxlab:
For slow span, the Slice operation is a bit more expensive (around 30%), but removing it in exchange for the IndexOf overload overhead still does not provide any meaningful performance gain.
It is important to note that Span is a small struct, so copying a span during a Slice operation is not as expensive as it would be for some of the larger structs built on top of it. It would be useful to see which other data types have a Slice operation and to measure the overhead of slicing for larger structs.
Fast Span:
public struct Span<T>
{
    private readonly ByReference<T> _pointer;
    private readonly int _length;
    // remaining members omitted
}
Slow Span:
public struct Span<T>
{
    private readonly Pinnable<T> _pinnable;
    private readonly IntPtr _byteOffset;
    private readonly int _length;
    // remaining members omitted
}
Very nice analysis. Thanks!