use an array instead of an nsTHashtable for nsContinuationStates
Categories
(Core :: Layout, enhancement)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox69 | --- | fixed |
People
(Reporter: heycam, Assigned: heycam)
Details
(Keywords: perf)
Attachments
(2 files)
Profiling a page load of https://en.wikipedia.org/wiki/Barack_Obama, I find ~140 ms spent under nsBidiPresUtils::RepositionInlineFrames. Some of that time goes to managing the nsContinuationStates hash table. My measurements show that switching to an array saves ~25 ms.
For reference, the attachment shows how often each total entry count occurred across all of the nsContinuationStates instances over the page load.
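To illustrate the shape of the change, here's a minimal sketch of an array-backed lookup. This is not the actual Gecko code; ContinuationState, LookupOrAdd, and the Frame key type are hypothetical stand-ins:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the real Gecko types.
struct Frame;
struct ContinuationState {
  const Frame* mFrame = nullptr;  // key: the frame this state describes
  bool mFirstVisualFrame = false;
  uint32_t mFrameCount = 0;
};

// Array-backed replacement for the hashtable: most instances hold only a
// handful of entries, so a linear scan beats paying the hashing overhead.
class ContinuationStates {
 public:
  // Returned reference is only valid until the next insertion.
  ContinuationState& LookupOrAdd(const Frame* aFrame) {
    for (ContinuationState& state : mStates) {
      if (state.mFrame == aFrame) {
        return state;
      }
    }
    ContinuationState state;
    state.mFrame = aFrame;
    mStates.push_back(state);
    return mStates.back();
  }

 private:
  std::vector<ContinuationState> mStates;
};
```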
Comment 1 (Assignee)•6 years ago
Comment 2 (Assignee)•6 years ago
Comment 3•6 years ago
This basically seems like a good change, except that I worry there could be pathological cases where we're dealing with a much larger number of nsContinuationStates entries, and the linearly-searched array will begin to perform poorly. I suspect that a large bidi page with lots of nested <span>s would tend to run into this problem, and we might regress performance significantly in such a case.
I wonder if it'd be worth trying to mitigate that, possibly by switching back to a hashtable representation if the array grows past some threshold size (a few hundred?). We do that for the list of frames in nsLineBox, for example.
Another possibility, quite tempting given that AFAICS we build the table early in nsBidiPresUtils::RepositionInlineFrames and don't modify it thereafter, would be to sort the entries if the size exceeds some modest threshold, and then use BinaryIndexOf to look up entries. That should allow lookups to remain tolerably fast even if the array grows to thousands of entries.
WDYT -- should we try to create and profile some worst-case scenarios here before moving forward? We often tend to perform rather badly on large bidi documents already, and it'd be sad if we regressed some of them further here, as an accidental side-effect of optimizing the common cases.
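A rough sketch of the sorted-array variant, assuming (as described above) that the table is fully built before any lookups happen. The names and threshold are illustrative, and std::sort/std::lower_bound stand in for the corresponding nsTArray machinery such as BinaryIndexOf:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Frame;
struct ContinuationState {
  const Frame* mFrame = nullptr;  // key
  // ... per-frame continuation data ...
};

class SortedContinuationStates {
  // Illustrative threshold; below it a plain linear scan is already fast.
  static constexpr size_t kSortThreshold = 32;

 public:
  void Add(const Frame* aFrame) {
    ContinuationState state;
    state.mFrame = aFrame;
    mStates.push_back(state);
  }

  // Call once after the table is fully populated, before lookups begin,
  // so that later lookups can binary-search a sorted array.
  void FinishBuilding() {
    if (mStates.size() > kSortThreshold) {
      std::sort(mStates.begin(), mStates.end(),
                [](const ContinuationState& a, const ContinuationState& b) {
                  return a.mFrame < b.mFrame;
                });
      mSorted = true;
    }
  }

  ContinuationState* Lookup(const Frame* aFrame) {
    if (mSorted) {
      auto it = std::lower_bound(mStates.begin(), mStates.end(), aFrame,
                                 [](const ContinuationState& s, const Frame* f) {
                                   return s.mFrame < f;
                                 });
      return (it != mStates.end() && it->mFrame == aFrame) ? &*it : nullptr;
    }
    // Small table: fall back to the linear scan.
    for (ContinuationState& state : mStates) {
      if (state.mFrame == aFrame) {
        return &state;
      }
    }
    return nullptr;
  }

 private:
  std::vector<ContinuationState> mStates;
  bool mSorted = false;
};
```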
Comment 4 (Assignee)•6 years ago
For this particular page, I found that a hybrid approach which switches to a HashMap past a threshold was very slightly slower, but not by much, regardless of which threshold I chose for the switch. But if we can construct pathological cases, then it sounds like it would be worth doing.
Switching to a sorted array sounds tempting as a low-cost way to keep pathological cases manageable.
I'll try creating a pathological case to see how that behaves.
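For illustration, here is what the hybrid approach could look like, with std::unordered_map standing in for the Gecko hash map; the threshold of 32 and all names are assumptions, not the actual patch:

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Frame;
struct ContinuationState {
  const Frame* mFrame = nullptr;  // key
  // ... per-frame continuation data ...
};

// Hybrid container: linear array while small, hash map once the entry count
// crosses a threshold, so pathological pages avoid O(n) lookups.
class HybridContinuationStates {
  static constexpr size_t kHashThreshold = 32;  // assumed value

 public:
  // Returned reference is only valid until the next insertion.
  ContinuationState& LookupOrAdd(const Frame* aFrame) {
    if (mUsingMap) {
      ContinuationState& state = mMap[aFrame];
      state.mFrame = aFrame;
      return state;
    }
    for (ContinuationState& state : mStates) {
      if (state.mFrame == aFrame) {
        return state;
      }
    }
    if (mStates.size() >= kHashThreshold) {
      // Migrate the existing entries, then serve this and all later
      // lookups from the hash map.
      for (const ContinuationState& state : mStates) {
        mMap[state.mFrame] = state;
      }
      mStates.clear();
      mUsingMap = true;
      ContinuationState& state = mMap[aFrame];
      state.mFrame = aFrame;
      return state;
    }
    ContinuationState state;
    state.mFrame = aFrame;
    mStates.push_back(state);
    return mStates.back();
  }

 private:
  std::vector<ContinuationState> mStates;
  std::unordered_map<const Frame*, ContinuationState> mMap;
  bool mUsingMap = false;
};
```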
Comment 5•6 years ago
I did a little bit of informal profiling here, trying to stress things: I took a complete plain-text copy of the Qur'an (in Arabic), and wrapped it in a few dozen nested <span>s. Loading the resulting file on my MBPro showed around 900ms under RepositionInlineFrames with current trunk code (using the hashtable).
With the linear-array patch, this jumped to nearly 6000ms.
I then added a patch to sort and binary-search the array when it exceeds 32 elements (a somewhat arbitrary number that just felt like the right sort of value). This brought us back down to about 1000ms under RepositionInlineFrames, so only a very slight regression from the hashtable version (while keeping the array-vs-hash win on more typical content).
So unless your testing indicates otherwise, I'd suggest we do something along those lines.
Comment 6 (Assignee)•6 years ago
Could you attach (or send me) the test case?
Comment 8 (Assignee)•6 years ago
Got the test case, thanks.
(In reply to Emilio Cobos Álvarez (:emilio) from comment #7)
> Bug 1342650 could be useful for this kind of stuff.
I was just thinking it would be nice to have such a type around!
I did some testing on Jonathan's test case:
- unpatched: ~510 ms
- array: ~2800 ms
- array, sorting if more than 32 elements: ~745 ms
- array, switching to hash map after 32 elements: ~520 ms
And for the Wikipedia page I was originally testing, I didn't see any difference between a sorted array and a hash map. So I think I'll go with switching to the hash map.
Interestingly, I got terrible performance when using mfbt/HashTable.h's HashMap, and normal-looking performance when using nsDataHashtable...
Comment 9 (Assignee)•6 years ago
A new hybrid array/hash-table type should probably go in mfbt/, and so would ideally use HashMap. But since I can't work out right now what the issue is with using HashMap here, I'll leave that for another time.
Comment 10 (Assignee)•6 years ago
Comment 11•6 years ago
Comment 12•6 years ago