Closed Bug 1352919 Opened 7 years ago Closed 6 years ago

Remove O(n^2) algorithm in hit testing tree updates

Categories

(Core :: Panning and Zooming, enhancement, P3)

enhancement

Tracking

()

RESOLVED DUPLICATE of bug 1457030

People

(Reporter: dvander, Assigned: dvander)

Details

(Whiteboard: [gfx-noted])

Attachments

(2 files)

When updating the hit testing tree, we spend a great deal of time recycling non-primary HitTestingTreeNodes, since we call RemoveElement() at the front of the vector.
Attached patch fixSplinter Review
This separates primary and non-primary nodes into separate lists, and makes the non-primary list a deque. If the old ordering wasn't important I can switch it to a vector or stack or something.
Attachment #8853843 - Flags: review?(bugmail)
Comment on attachment 8853843 [details] [diff] [review]
fix

Review of attachment 8853843 [details] [diff] [review]:
-----------------------------------------------------------------

Can you provide a sample layer tree that hits the O(n^2) behaviour? It's been a while since I paged in this code and having an example to walk through would help me review this. I recall when I wrote this code I thought that it would be pretty fast in the common case (hence the comment at [1]) but maybe we made changes to layer tree structure since then that invalidate assumptions I made.

[1] http://searchfox.org/mozilla-central/rev/990055a4902952e038cc350c9ffe1ac426d1c943/gfx/layers/apz/src/APZCTreeManager.cpp#407
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> Comment on attachment 8853843 [details] [diff] [review]
> fix
> 
> Review of attachment 8853843 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Can you provide a sample layer tree that hits the O(n^2) behaviour? It's
> been a while since I paged in this code and having an example to walk
> through would help me review this. I recall when I wrote this code I thought
> that it would be pretty fast in the common case (hence the comment at [1])
> but maybe we made changes to layer tree structure since then that invalidate
> assumptions I made.
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> 990055a4902952e038cc350c9ffe1ac426d1c943/gfx/layers/apz/src/APZCTreeManager.
> cpp#407

The O(n) part isn't the loop but the "RemoveElement" which has to move the whole list down. This bug only really gets bad with layers.force-active enabled, on Matt's DL test case there are over 20,000 layers.
Priority: -- → P3
Whiteboard: [gfx-noted]
I'd still like to see a layers dump for that page (maybe with 19,997 of the layers removed), if you have it. I just want to get a better idea of what the layer tree looks like. Plus it'll be good to have a record of it in this bug for future archeology.
Attached file layers.txt
Here's a snippet of the composite layer tree, it only contains the first ~200 color layers or so.
Thanks! I spent some time going over the code and I think I understand the problem. When we build mNodesToDestroy, we are iterating backwards through the existing hit-testing tree [1], because the hit-testing tree is optimized for backwards iteration. This puts the the nodes at the "end" of the tree (the last siblings) at the front of the nsTArray. Then, we walk the layer tree backwards and try to reuse nodes from the nsTArray. Since we're walking the layer tree backwards, the node order is going to closely match (in the common case, where there are few layer tree changes) the front-to-back order of nodes in the nsTArray. So all our searches for nodes ([2], [3]) walk the array from front-to-back. This means the searches find a hit quickly, but it also means that the RemoveElement calls happen at low array indices and that causes the entire array to shift up, which is the expensive part.

While your patch probably works fine for this case, I think it might do less well in other cases where the layer tree changes somewhat between calls to UpdateHitTestingTree. In those cases, we might end up repeatedly calling RemoveElement(n) on the deque for small but non-zero values of n. I'm not sure what the performance characteristics of deque are in that scenario - the cppreference.com documentation just says it's constant-time at the ends but linear anywhere else. If the performance is such that insertion/removal *near* the ends of the deque is fast, then we're probably okay with this patch. I don't know if we want to rely on that.

Another solution that I think would work just as well is to reverse the list after we have built it at [1]. We still want to build it with a backwards iterator because the HitTestingTreeNodes are optimized for that. But then after we have stuffed all the nodes into the nsTArray, we can just call Reverse() on it, which seems like it will be pretty fast [4]. And then we also reverse the direction of the searches at [2] and [3] so that they search the array from back to front rather than front to back. That way the RemoveElement calls will always remove things near the end of the list which should perform much better. We'll also want to reverse the destruction loop at [5]. I feel like this approach will more clearly preserve the existing semantics of the code, while still giving us good performance.

David, can you give this a try and see if it works for this scenario?

[1] http://searchfox.org/mozilla-central/rev/990055a4902952e038cc350c9ffe1ac426d1c943/gfx/layers/apz/src/APZCTreeManager.cpp#255
[2] http://searchfox.org/mozilla-central/rev/990055a4902952e038cc350c9ffe1ac426d1c943/gfx/layers/apz/src/APZCTreeManager.cpp#410
[3] http://searchfox.org/mozilla-central/rev/990055a4902952e038cc350c9ffe1ac426d1c943/gfx/layers/apz/src/APZCTreeManager.cpp#544
[4] http://searchfox.org/mozilla-central/rev/381a7b8f8a78b481336cfa384cb0a5536e217e4a/xpcom/ds/nsTArray.h#1986
[5] http://searchfox.org/mozilla-central/rev/990055a4902952e038cc350c9ffe1ac426d1c943/gfx/layers/apz/src/APZCTreeManager.cpp#326
Comment on attachment 8853843 [details] [diff] [review]
fix

Review of attachment 8853843 [details] [diff] [review]:
-----------------------------------------------------------------

As per our IRC discussion I'd like to see if the simpler approach I described also fixes the perf problem. However if it doesn't, or you don't have time to look into it and this needs fixing urgently, we can land this. As far as I can tell there are no problems with the patch as-is. The concern I had about deque perf is irrelevant because even in the scenario I was thinking of we will always remove from the end of the deque.
Attachment #8853843 - Flags: review?(bugmail) → review+
Forward-duping to Bas' patch which also fixes the O(n^2) behaviour.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: