Closed
Bug 105783
Opened 23 years ago
Closed 23 years ago
Make nsXULSortService greasy fast
Categories
(Core :: XUL, defect)
Core
XUL
Tracking
()
RESOLVED
FIXED
mozilla0.9.7
People
(Reporter: tingley, Assigned: tingley)
References
Details
(Keywords: perf)
Attachments
(1 file, 1 obsolete file)
29.91 KB,
patch
|
mozilla
:
review+
waterson
:
superreview+
|
Details | Diff | Splinter Review |
Filing this to get down some of the stuff I had noticed while trying to understand why building (opening) the bookmarks menu the first time takes so darn long. <template> and associated baggage isn't terribly hip these days, but I looked at the XUL template code a tiny bit for another bug, and maybe life's too short to understand both it and the outliner. I used my stanard 1000+ element bookmark file from bug 69009 and starting tracing (I still have no jprof). Building the bookmarks menu took 9.3 million ticks of the PR_Now() clock. The first million of these were spent in initial propogation [sic] through the rule network; the remainder was in content building. There's a *lot* going on in there, and I don't understand all of it. The obvious big nail is nsXULSortService::InplaceSort(), which consumes roughly 6.1 million ticks -- two thirds of the time. Much of that is split between two hot spots, each of which eat about 2.5 million ticks. The first is nsCollation::CompareString; in particular, the call at http://lxr.mozilla.org/seamonkey/source/content/xul/templates/src/nsXULSortService.cpp#936 CompareString has been identified elsewhere as a dog (bug 99383; wontfix), and implicated in the pokiness of the XUL directory viewer (bug xxxxx). bbaetz pointed out that the outliner doesn't use it -- it just calls CompareRawSortKey(). Coincidentally, the outliner has a reputation as being "fast". How it manages to do raw key comparisons and still sort properly is a mystery. Since the outliner apparently does it... The second problem area is in GetNodeValue(); the majority (1.8 million or so) is spent in the block beginning at http://lxr.mozilla.org/seamonkey/source/content/xul/templates/src/nsXULSortService.cpp#1332 (in the first of the two GetNodeValue() implementations) which reproduces the IndexOf() logic from RDFContainerUtilsImpl. Unfortunately, lightning doesn't strike twice -- the code is walking the arcs backwards as it should. It seems like we're just grovelling through a lot of arcs. Looking at this particular spot made me start thinking about a memoizing nsIRDFContainer implementation that would cache IndexOf() results. In fact, over the course of an insertion, GetNodeValue() is called repeatedly for the key node -- if nothing else, reusing this value should hopefully be easy.
Assignee | ||
Comment 1•23 years ago
|
||
cc'ing RDF people and miscellaneous dudes. That "bug xxxxx" is supposed to be bug 69185.
Comment 2•23 years ago
|
||
ccing hyatt, who never did answer my question about how outliner sorting works. When I was discussing bug 99383 on irc (with ftang, possibly, but I really can't remember), someone mentioned that mailnews generates its own sort key stuff somewhere. Does it sort on an extra (undisplayed) column, or something?
Assignee | ||
Comment 3•23 years ago
|
||
> In fact, over the course of an insertion, GetNodeValue() is called repeatedly
> for the key node -- if nothing else, reusing this value should hopefully be
> easy.
I take this back; the sortInfo already does this (cacheFirstNode).
Comment 4•23 years ago
|
||
bbaetz: outliner in the raw has no notion of sorting. rdfliner has a notion of sorting, which takes the good and leaves the bad from the XULSortService.
Assignee | ||
Comment 5•23 years ago
|
||
For the rdf container sorting case (as in bookmarks menu construction), there's certainly no reason why we should be suffering through the overhead of CompareString() -- all this nonsense about writing out the index as a string and then doing a textual comparison needs to go. I'm going to work on a patch along these lines.
Assignee | ||
Comment 6•23 years ago
|
||
Progress. I produced successive versions of the code which did the following: a) Special-cased natural order comparison in InplaceSort() to allow integer compares of the indexes, rather than writing them out to strings things like "000042" and feeding them into the nsCollation meatgrinder. b) "Partial caching": Same as a), but re-added caching (which didn't work for this case any more after my changes) of the "static" first element to avoid calling IndexOf() repeatedly on the key element when doing insertion. c) "Full caching": Same as a), but modified RDFContainerImpl to memoize IndexOf() results in a PLDHashTable. Results (1070 bookmarks): Initial Time: 8106634 With a) [integer compares]: 6058016 With b) [partial caching]: 5194734 With c) [full caching]: 3999085 Total time taken by the InplaceSort() calls dropped to a half a million ticks, a change of an order of magnitude or so. w00t. a) and b) are pretty simple stuff, though I have to clean the code up in a number of ways (and possibly undo any damage I may have done to some other form of sorting -- I was watching the Vikings game part of the time, and...). I think the caching RDF container is cool, but will need more work; none of the code is written that would be needed to keep it consistent if elements were removed, etc.. there's probably a couple good segfaults in the making. (And it doesn't behave differently for the other types of container. Oops. But who uses rdf:Alt?) Also, it's tempting to make it a separate implementation of nsIRDFContainer since the current one is widely used. I'll keep playing with this stuff.
Assignee | ||
Updated•23 years ago
|
Assignee | ||
Comment 7•23 years ago
|
||
At some point I stopped experimenting and started rewriting this thing. My goals are to both speed up the code as well as to clean it up, in particular eliminating the redundant APIs by mapping back to the contentSortInfo in the InsertContainerNode case. I started this wednesday night. At the moment I've only got the natural order sort code working again, but it's a full 10x faster than the original -- building the bookmark menu is down to 2870596 usec, of which calls to InsertContainerNode make up only 630843. My worry now is that I'm going to do this and then the code is going to be removed from the tree. In particular, I'm worried about bug 97062 (give the outliner a content model), in which hyatt states as a goal the removal of <tree> code from the build and replacing it entirely with outliner. So... before I spend the next week hacking on this, can someone give me an idea as to whether I'm wasting my time? Hyatt/Waterson, I'm especially interested in hearing from you. thanks
Comment 8•23 years ago
|
||
You won't catch me going anywhere near that code.
Comment 9•23 years ago
|
||
hyatt is the definitive master of XUL's future, but as I understand it, <tree> is here to stay, even if only castrated down to a <listbox> tag. (That said, I certainly wouldn't _mind_ seeing <tree> go, since it has its claws in just about every area of the frame construction and management process.) hyatt, what say you?
Assignee | ||
Comment 10•23 years ago
|
||
Hmm, now that I think about it, wasn't I originally looking at menu performance in this bug? Even if <tree> goes, templated menus etc are still (for now) using this code.
Assignee | ||
Comment 11•23 years ago
|
||
Working on this code is like having breakfast with the devil. However, I believe the end is in sight; if I can get bookmark separators to behave sanely I'll be about ready to put a patch up. Not tonight, though.
Assignee | ||
Comment 12•23 years ago
|
||
I have a much larger patch in my tree which is a rewrite of nearly all of this code. I'm still not happy with it, though, and I'm wary of trying to get review for 3000 line patches. In the mean time, I'm offering a shorter version which contains two significant speedups as well as some very basic cleanup. 1) Use integer math to compare sequence indexes. The old code took the result of IndexOf() (or rather equivalent code), stuffed it into an rdf literal, built a collation key from it, and then did a key comparison! This resulted in horrible perf due to the infamous CompareString(), and was also a bit goofy. To wedge this in without hacking the APIs all to pieces, I replaces the nsIRDFNode* params in GetNodeValue and CompareNodes with a simple SortNode that can hold different types of values. 2) Fix optimization for in-order inserts. There was code to do this already (see the lastWasLast, lastWasFirst checks in InsertContainerNode), but it was badly broken. In fact, the whole insertion sort loop was taking the index boundaries to be from |staticCount| to |staticCount+numChildren|, which is wrong -- the static children are a subset of the total, so |staticCount+numChildren| is larger than the number of elements we're sorting on. Cleaning up this logic makes in-order inserts (as in the case of menu building) much faster, since it will only compare against the last node as long as |lastWasLast| is set. 3) Cleanup: remove redundant #includes, explicit IID -> NS_GET_IID, fix spots that duplicated containerUtils code ("manual" IsSeq and IndexOf opserations), replace some redundant atoms with nsXULAtoms:: equivalents. There are still many strange and wonderous things in this file. However, the two functional changes speed up sorted numerical inserts (read: menu building) considerably. For the 1000-bookmark menu test, before and after times are approx: Total time | Time in InsertContainerNode() Before: 8,600,000 6,510,000 After: 2,750,000 675,000 So it's a 10x speedup on the actual sorting time; the rest is spent in template/rule propagation (I retract my snide use of [sic] in the initial comment now that this has been fixed!). I *believe* this patch also speeds up startup time, as it appears that something listens to rdf:bookmarks and builds its content before the first nav window appears. Although the integer math should speed up in-place sorting based on natural order, beyond that there isn't much here for other types of sorts (bbaetz: xul directory viewer builds in sorted order based on name, so don't expect much speedup from this patch). I plan on producing other patches to improve other cases (mainly through reuse of generated collation keys), as well as more code cleanup (rrr, those redundant APIs). Also, I've marked various dependent bugs (more or less for tracking purposes); none of them are fixed here. This new bugzilla attachment interface is slick.
Assignee | ||
Updated•23 years ago
|
Assignee | ||
Comment 13•23 years ago
|
||
Comment on attachment 57045 [details] [diff] [review] the short version hello, nsIRDFInt.
Attachment #57045 -
Attachment is obsolete: true
Assignee | ||
Comment 14•23 years ago
|
||
much cleaner version using nsIRDFInt and scrapping all the SortNode junk. slightly slower (QI overhead + literal creation?) -- about 2.9 million ticks compared to 2.75 million, but still a huge improvement.
Comment 15•23 years ago
|
||
I have a patche for #2 as part of the patch for bug 96108 (voidarray index bounds cleanup/inlining). Check the latest patch for 96108 and compare the changes in nsXULSortService.cpp. Also, I noticed in your changes for #2 you use numChildren. Note that above in the code numChildren is modified by subtracting staticCount from it(!). I side-stepped the issue by adding a realNumChildren that isn't modified.
Blocks: 96108
Assignee | ||
Comment 16•23 years ago
|
||
staticCount is subtracted from numChildren in two cases. The first is if staticCount is negative (=> sortStaticsLast was set) and it is added to numChildren. The second is if staticCount is *greater* than numChildren, which should never happen (this block makes no sense -- the net result seems to be that numChildren is set to 0.) In the first case, I think this is the correct behavior. If there are 10 children, with 3 statics, we would normally look at the last seven (left = 4, right = 10). But if we want static elements to be sorted last, the subtraction would force us to consider (left = 0, right = 7), so any inserted node came before the static elements (assuming they were already last....). Is this wrong?
Assignee | ||
Comment 17•23 years ago
|
||
Similarly (now looking at your changes in bug 96108), the check to set sortState->lastWasFirst needs to be fixed -- the last (most recently inserted) node is "first" if it is inserted at the leftmost index under consideration. This is actually at |staticCount|, one less than the original value for |left| (since we insert at |x-1| if |direction<0|). Other than that the changes are basically equivalent.
Comment 18•23 years ago
|
||
Looks good. Feel free (but not obliged) to move any of the other atoms into nsXULAtoms.
Comment 19•23 years ago
|
||
Comment on attachment 57073 [details] [diff] [review] v2 sr=waterson
Attachment #57073 -
Flags: superreview+
Comment 20•23 years ago
|
||
Comment on attachment 57073 [details] [diff] [review] v2 r=rjc
Attachment #57073 -
Flags: review+
Comment 21•23 years ago
|
||
Go for it. I'll remove/modify my patch for nsXULSortService in bug 96108
Assignee | ||
Comment 22•23 years ago
|
||
I haven't heard back from Marcia Knous about cvs access yet. Waterson, feel free to check this in.
Comment 23•23 years ago
|
||
If you like I can check it in when the tree opens
Assignee | ||
Comment 24•23 years ago
|
||
That would be great.
Assignee | ||
Comment 25•23 years ago
|
||
My cvs account has been activated; I'll check this in tonight.
Assignee | ||
Comment 26•23 years ago
|
||
Fix checked in. I'm tempted to leave this bug open to track the other things here I need to improve.
Comment 27•23 years ago
|
||
Resist temptation. Omnibus bugs bad; focused bugs good. :-]
Assignee | ||
Comment 28•23 years ago
|
||
Allright, then. Clearing blocks on a couple things that didn't get taken care of in this patch (actually, it's possible that fixing the indexing took care of sortStaticLast, but I'm not sure).
Component: XP Toolkit/Widgets: XUL → XUL
QA Contact: jrgmorrison → xptoolkit.widgets
You need to log in
before you can comment on or make changes to this bug.
Description
•