Closed Bug 980934 Opened 6 years ago Closed 4 years ago

Huge performance hit from mozilla::SplayTree::checkCoherency in DEBUG builds

Categories

(Core :: MFBT, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: jseward, Assigned: njn)

Details

Attachments

(2 files)

mfbt/SplayTree.h appears to add a splay tree implementation.  In debug
builds, SplayTree::find, SplayTree::insert and SplayTree::remove call
SplayTree::checkCoherency in order to sanity check the resulting tree.
Unfortunately that walks the entire tree and so transforms what is
presumably an O(log N) or better operation to O(N) cost.  This makes
m-c extremely slow in debug mode, to the point of being unusable,
under some circumstances.

Non-debug builds do not show any of these perf problems.

I discovered this when running a debug build of Firefox with the SPS
unwinder and the new LUL unwind library.  The profiles show a lot of
cost in SplayTree::checkCoherency, with 50+ deep nested recursive
calls.  It is not easy to see where this frenzy of recursion
originates from, but one source I did see is
mozilla::OverflowChangedTracker(nsIFrame*).
OS: Linux → All
Hardware: x86_64 → All
Trying out debug {mochi,ref,crash}tests with the consistency check removed to see if it makes a big difference:

https://tbpl.mozilla.org/?tree=Try&rev=3efc309445f6
I also tried with a somewhat newer tree, to be able to compare the try run with inbound more reasonably:

https://tbpl.mozilla.org/?tree=Try&rev=05b1e98f1131

mochitest-1 appears to be outside the noise threshold--somewhere between 10-20% faster.  Everything else
looks like it is in the noise.

(Which isn't to say, of course, that we shouldn't fix the issue so developers could use debug builds more productively on their machines...)
FWIW, I'm locally working on some code that uses a SplayTree with around 45000 elements in it, and checkCoherency was making it unusably slow, causing my browser to hang for tens of seconds.
This problem is a bit bad because the worst case performance hit
can be huge, as Andrew and I observed independently, even if, as
Nathan observes, the average hit across {mochi,ref,crash}tests
isn't too bad.

A splay tree should be one of those things where, once we've done
some simple unit tests on it, we're pretty much sure it works and
there's no point in endlessly rewalking complete trees when used
"for real" in Gecko.  Is there some way we can just unit-test it
and rely on that to be adequate?
> Is there some way we can just unit-test it and rely on that to be adequate?

Yes, this is the right way to do it. Someone just needs to write mfbt/tests/TestSplayTree.cpp. There are plenty of other tests in mfbt/tests/ to serve as examples.
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
CC'ing mattwoodrow because he is (I think?) the original author of MFBT's SplayTree.
I just adapted it from the SplayTree in js/src/ds/, but this seems like a good change.
> I just adapted it from the SplayTree in js/src/ds/

Ah, so it was you who was responsible for the duplication :(
Comment on attachment 8464487 [details] [diff] [review]
(part 1) - Add mfbt/tests/TestSplayTree.cpp

Review of attachment 8464487 [details] [diff] [review]:
-----------------------------------------------------------------

Hooray!  r=me with the below problem fixed.

::: mfbt/SplayTree.h
@@ +171,5 @@
>  
> +  // For testing purposes only.
> +  void checkCoherency()
> +  {
> +    checkCoherency(mRoot, nullptr);

checkCoherency's body is currently #ifdef DEBUG *and* uses plain MOZ_ASSERT, rather than MOZ_RELEASE_ASSERT, so the test will do nothing in release builds.
Attachment #8464487 - Flags: review?(nfroyd) → review+
Attachment #8464489 - Flags: review?(nfroyd) → review+
> checkCoherency's body is currently #ifdef DEBUG *and* uses plain MOZ_ASSERT,
> rather than MOZ_RELEASE_ASSERT, so the test will do nothing in release
> builds.

Good catch. I'll remove the #ifdef DEBUG and convert them all to MOZ_RELEASE_ASSERT.
This landed ages ago. Not sure why it wasn't marked as landing on mozilla-central or auto-closed. So I'll close it now.
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.