Open Bug 1371043 Opened 7 years ago Updated 3 months ago

Speed up MFBT's BinarySearch()

Categories

(Core :: MFBT, defect)

defect

Tracking

()

People

(Reporter: ehsan.akhgari, Unassigned)

References

Details

Attachments

(2 files)

While looking at bug 1366241, I spent some time researching the performance of binary search algorithms and came across a paper which was suggesting a branch-free implementation of it which was supposedly faster:

https://arxiv.org/abs/1509.05053

(See the branch-free binary search algorithm described there)

I experimented replacing MFBT's BinarySearch() algorithm with this idea.  I had a stupid microbenchmark which this patch could make around 30-40% faster locally, but I wasn't quite sure what to do with the patch after this point in terms of further performance testing.

I'm submitting it here for now mostly in case it will be of use for bug 1366241 although I'm mostly convinced that the best solution for that bug is to stop using binary search completely, but perhaps this may introduce some useful gains there?

This has passed a full green try run so it should be correct.
Submitting it for feedback mostly to get some idea around what you think we should do with this patch.  :-)
Attachment #8875476 - Flags: feedback?(nfroyd)
Comment on attachment 8875476 [details] [diff] [review]
Use a branch-free binary search algorithm for mozilla::BinarySearch()

Review of attachment 8875476 [details] [diff] [review]:
-----------------------------------------------------------------

I think we should probably commit it.  What did your microbenchmark look like?  It's possible to do a very unfair comparison of this algorithm with the previous implementation, so we should probably double-check that we're doing the right thing.

::: mfbt/BinarySearch.h
@@ +65,5 @@
>  
> +// Branch-less sign function.
> +// Returns -1 if its argument is negative, 1 if its argument is positive, and 0 if it is 0.
> +template<typename T>
> +T Sign(T aValue)

This function should live in mozilla::detail, rather than mozilla:: itself.

@@ +89,3 @@
>  
> +  // Branch-free binary-search algorithm based on the one described in
> +  // https://arxiv.org/abs/1509.05053.

May want to mention here that the compiler is expected to optimize the conditional operator to a conditional move, rather than branches, to provide the branch-free-ness.
Attachment #8875476 - Flags: feedback?(nfroyd) → feedback+
Attached patch microbenchmarkSplinter Review
This was the microbenchmark patch that I was using before.  You can probably see why I didn't attach it before.  For one thing, the std::binary_search version is just totally wrong (since that function just returns a bool!) so the comparison to that isn't entirely interesting.  But to answer your question, the benchmark was doing a million binary searches in nsHtml5ElementName::ELEMENT_HASHES looking for an *existing* value.  I also had a modification for a case where the search would be for a non-existing value always and it seemed like the speedups showed up a lot more consistently in the latter case (which to some extent does make sense since such searches would examine more of the array each time, and running more of the loop, which would be the case where the inefficiencies of the existing algorithm would show up more.)

At any rate, posting this microbenchmark patch to try gives this result:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4810b3266c12e477f7276aeb59d31e09bd2443d2

The TLDR being: faster on Win32, slower on Win64, about the same on OSX, slower on Linux.

If you want, modifying this patch to make it benchmark the search for non-existing elements case and posting it to try should be fairly simple.  But the actual reason why I wrote this was to do local profiling with it where I was getting much better and different results than I got on try...
(In reply to Nathan Froyd [:froydnj] from comment #2)
> ::: mfbt/BinarySearch.h
> @@ +65,5 @@
> >  
> > +// Branch-less sign function.
> > +// Returns -1 if its argument is negative, 1 if its argument is positive, and 0 if it is 0.
> > +template<typename T>
> > +T Sign(T aValue)
> 
> This function should live in mozilla::detail, rather than mozilla:: itself.

Heh, I was kind of blindly assuming that BinarySearchIf is also in detail and didn't realize it's exposed interface!  Now that I see it is used even more than BinarySearch itself is, I'm a bit more energized about this patch.  :-)

(But yeah I did mean to put Sign in detail, this was an oversight...)

> @@ +89,3 @@
> >  
> > +  // Branch-free binary-search algorithm based on the one described in
> > +  // https://arxiv.org/abs/1509.05053.
> 
> May want to mention here that the compiler is expected to optimize the
> conditional operator to a conditional move, rather than branches, to provide
> the branch-free-ness.

Right, will do.  FWIW I did verify that this indeed does happen, and in fact it is faster than the branch.  The paper does discuss the cmov instruction being a bit controversial so I didn't want to just blindly follow the crowd here.  :-)
Assignee: nobody → ehsan
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #3)
> The TLDR being: faster on Win32, slower on Win64, about the same on OSX,
> slower on Linux.

So will this make things faster in general or should we fix bug 1366241 in some other way?
Flags: needinfo?(ehsan)
(In reply to Olli Pettay [:smaug] from comment #5)
> (In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from
> comment #3)
> > The TLDR being: faster on Win32, slower on Win64, about the same on OSX,
> > slower on Linux.
> 
> So will this make things faster in general or should we fix bug 1366241 in
> some other way?

I think we should definitely fix that bug in some other way, using my suggestion there.

I do want to look into this again in some more detail to see whether I can reproduce the try server results and look into the why.  I have realized that we actually use this function in a fair number of places in the code, for example I have seen in profiles of Speedometer from BaselineScript::icEntryFromReturnAddress(), also in other profiles under NS_EscapeURL(), so I'm definitely motivated to get back to it, just need to find some spare time!
Flags: needinfo?(ehsan)
Assignee: ehsan → nobody
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: