If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Optimize nsAttrAndChildArray::IndexOfChild for pathological cases




8 years ago
5 years ago


(Reporter: Ehsan, Unassigned)


(Blocks: 1 bug, {perf})


Firefox Tracking Flags

(Not tracked)


nsAttrAndChildArray::IndexOfChild sucks, because many operations include repetitive calls to this function, and it can be slow in some cases.  For example, the last test case on bug 485446 contains an element with ~20000 child elements, and the test case actually spends 100% of the CPU time in nsAttrAndChildArray::IndexOfChild.

We can do better.

This is a known issue.  Bug 240884 tried to improve this situation for most cases using an MRU cache.  This doesn't work if the access pattern to the array is random, and is actually worse than a linear search for large arrays, because it hurts the CPU data cache because of the way that the array elements are accessed.

So, I'm proposing an alternative scheme for pathological cases like that.  The best we can do on an indexof implementation is O(1), using a hashtable.  I'm proposing that we switch to a hashtable for pathological cases.  We need to define "pathological" here, and I have a proposal for that:

A pathological case is detected inside IndexOfChild.  We can collect data about how many times a this operation is performed on an object at nearly no extra cost.  Then, we can define thresholds on the number of search operations performed, and the number of children of a node.  Let's say we decide to construct a hashtable if there are more than 50 child elements or if a search operation has occurred more than 100 times (these numbers need tweaking of course.)  The idea is that for nodes which have a large number of children, a search operation using a hashtable is very efficient, and the cost of constructing a hashtable [O(N)] will be paid off by future O(1) look-ups; also for cases where many number of search operations have been performed, we go ahead and pay the extra cost of constructing a hashtable in the hope that the clients perform more search operations, so that the cost of hashtable construction can be amortized into the total cost of search operations.  We can check for this case inside IndexOfChild, and if we detect a pathological case, we construct a hashtable and use that table from that point on for IndexOfChild searches.

There is a caveat for such a hashtable: insertions and deletions of nodes obviously mess up the hashtable indexes, and because the hashtable is keyed by the child addresses, in order to perform a fix-up, we have to traverse the entire hashtable once.  If we do the hashtable update inside InsertChildAt and RemoveChildAt, then we would be able to reuse the existing hashtable at the expense of making the best case performance of these functions O(N) instead of the current O(1).  If we do the hashtable update inside IndexOfChild, we would have to reconstruct the hashtable from scratch since we don't know which elements have been added or removed from the hashtable, but we keep the current performance characteristics of InsertChildAt and RemoveChildAt.  I think the latter approach makes sense, for two reasons:

1. Many web apps may add/remove children in a loop, and the former solutions hurts the performance of such cases.

2. We might even get lucky and not need to construct a hashtable at all (if we no longer face a pathological case, or if no client simply needs to call IndexOfChild any more), so deferring the hashtable update until we actually need it makes sense to me.

I'm going to make a patch implementing these ideas and attach it to this bug.
Keywords: perf
I think the right solution to this problem is to not use IndexOfChild for position comparisons.  See also bug 513169 (and especially the link in bug 513169 comment 0).
(In reply to comment #1)
> I think the right solution to this problem is to not use IndexOfChild for
> position comparisons.  See also bug 513169 (and especially the link in bug
> 513169 comment 0).

My tries with bug 513169 weren't fruitful...  Don't you think that this approach might be worth experimenting with?

At this point, I think I'm gonna actually start working on bug 240933 in order to eliminate one of the big places where this is a real performance problem for us.
> Don't you think that this approach might be worth experimenting with?

It is, if bug 513169 doesn't pan out, but it seems like a good bit more overhead, so I'd like to avoid it if we can.
Updating to reality: I won't have the time to work on this for the foreseeable future!
Assignee: ehsan.akhgari → nobody
I ran into some serious jank today and grabbed a profile that seems to blame this function:

I was loading configure.in on mxr and started an in-page find, and then everything hung for multiple seconds. Looks like it was doing a sync reflow (to highlight the search results, I'd guess).
Yes, but I suggest a separate bug on that; it's not clear to me that the caller there couldn't do much better...
You need to log in before you can comment on or make changes to this bug.