Open Bug 559396 Opened 12 years ago Updated 4 years ago
Logger files are slow in Firefox
231.98 KB, application/x-bzip2
From Silverwave (in comments from http://antennasoft.net/robcee/2010/04/13/lorentz-branch-diagram/):

"I have a large html file (9MB, 20k records in a table). FF takes 30sec to open it, Chromium takes 5secs. The filter on it uses js and is very fast in Chromium (10sec) but in FF > 5min :-( (tested with a new clean profile with no extensions)."

This bug is to investigate the performance problem. Test data to accompany this is to follow in this bug.

Further item to test: Does the file see the same slow performance in version 3.7a4?
Product: Firefox → Core
QA Contact: general → general
Provided in another blog comment:

"OK I will do this formally later but for now here is a link to the Test Data I have created. http://silverwav.wordpress.com/2010/04/14/firefox-speed-test/
Total Rows 20068 (Clips 6055, Default 14013). 12.6MB.
I will do some testing and update with my times."
From Silverwave:

___________________
Test Results (Linux)

Firefox 3.6.4 ubuntu reload: 0:37 0:37 0:37
Chromium 5.0. ubuntu reload: 0.05 0:05 0:05
7.4 Slower

Firefox 3.6.4 ubuntu filter: 2:07 2:05
Chromium 5.0. ubuntu filter: 0:03 0:03
Filter word: Platypus
42.3 Slower
______________________________
Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:18.104.22.168pre) Gecko/20100410 Ubuntu/10.04 (lucid) Namoroka/3.6.4pre – Build ID: 20100410093139
Chromium 5.0.376.0 (44292) Ubuntu
______________________________
Lucid Beta 2 64bit
Linux version 2.6.32-20-generic (buildd@yellow) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #29-Ubuntu SMP Fri Apr 9 20:35:00 UTC 2010 (Ubuntu 2.6.32-20.29-generic 22.214.171.124+drm33.2)
CPU0: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz stepping 07
8GB RAM
______________________________
Again:

___________________
Test Results – Win7

Firefox 3.6.3 win7 reload: 0:52 0:50 0:53 (7.4 x Slower)
Firefox 3.7.4 win7 reload: 1:02 0:56 0:54 (8.1 x Slower)
Chrome 5.0. win7 reload: 0.07 0:07 0:07

Firefox 3.6.4 win7 filter: 3:15 3:15 (48.7 x Slower)
Firefox 3.7.4 win7 filter: 0:33 0:34 0:34 (8.5 x Slower)
Chrome 5.0. win7 filter: 0:04 0:03 0:03

Machine: Windows 7 Prof 64Bit, 4GB RAM. HP 550 Core 2 Duo Laptop
This bug could have really used decent steps to reproduce (e.g. I see nothing obvious in the linked files that times the load or the filter). I added some timers to measure the pageload, and I get 7s in chrome (during which it's completely locked up) vs 52s for Gecko (which stays responsive throughout the load). Both browsers need about 5s to lay out the page non-incrementally, and we need about 2s to parse the page if not rendering it, so the main reason we're slower on the pageload is that we try to not lock up the browser while doing it. I can also confirm that the filter is much slower. Looking into that. Chances are, we want separate bugs here.
OK, more data.

First the pageload. As expected, about 75% under reflow. About 1/5 of this is computing the auto size of the table; the rest is table reflow. A large part of this (something like 1/5 of the total time for the pageload!) is taken up by all the nsTableFrame::InvalidateFrame calls that tables make. Just walking all the way up the tree on each one is a pain. roc, do any of our existing bugs cover this?

Apart from that, the main issue is just that we do multiple relayouts of the table, where Chrome presumably only does one, at least when loading from local disk. Again, this is a tradeoff to keep the browser responsive.

For the filter:

* nsTableRowGroupFrame::AdjustRowIndices is 60% of the time or so. Specifically, for every row set to display:none we walk through _all_ rows and renumber them. I'll file a separate bug on this.
* UndisplayedMap::AppendNodeFor is 16% of the time. Could we make this into a hashtable or a list-or-hash or something? Or if we can get rid of the NS_NOTREACHED check there, can we just prepend, or use a prclist?
* 9% of the time is spent actually running JS. Mostly seems to be DOM access. Doesn't trace well, but we only lose about 500ms to that. Total JS execution time is about 3s.
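To illustrate why renumbering every row on each display:none change hurts, here is a rough Python cost model. It is not Gecko code: `renumber`, `hide_one_at_a_time`, and `hide_batched` are hypothetical names, and the model assumes each renumbering pass visits every row exactly once.

```python
def renumber(rows_hidden):
    """Assign consecutive indices to visible rows.

    Returns (indices, visits), where hidden rows get None and
    visits is the number of rows walked in this pass.
    """
    indices = []
    next_index = 0
    for hidden in rows_hidden:
        if hidden:
            indices.append(None)
        else:
            indices.append(next_index)
            next_index += 1
    return indices, len(rows_hidden)


def hide_one_at_a_time(n_rows, to_hide):
    """Renumber all rows after every single hide (the slow pattern)."""
    hidden = [False] * n_rows
    indices = list(range(n_rows))
    visits = 0
    for row in to_hide:
        hidden[row] = True
        indices, walked = renumber(hidden)
        visits += walked
    return indices, visits


def hide_batched(n_rows, to_hide):
    """Mark all rows hidden first, then renumber once."""
    hidden = [False] * n_rows
    for row in to_hide:
        hidden[row] = True
    return renumber(hidden)
```

Hiding 500 of 1000 rows costs 500,000 row visits in the first variant and 1,000 in the second, with identical resulting indices. Whether a batched pass is practical inside the real frame code is a separate question that this model does not answer.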
Component: General → Layout: Tables
QA Contact: general → layout.tables
And if I work around the various not-staying-on-trace issues (due to the testcase getting className on Text nodes, due to our lack of IndexGetter, etc), then the JS runtime is about the same as chrome's. So looks like we were losing closer to 1500ms to the non-tracing business. Filed bug 560613 on AdjustRowIndices and bug 560616 on the undisplayed map.
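For the undisplayed map, a rough sketch of why append can dominate. This assumes, as the prepend/prclist suggestion implies, that append walks a singly linked list from the head to find the tail. `WalkingList` and `TailList` are hypothetical stand-ins, not the Gecko data structure.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class WalkingList:
    """Singly linked list whose append walks from the head to the tail."""

    def __init__(self):
        self.head = None
        self.hops = 0  # number of link traversals performed by append

    def append(self, value):
        node = Node(value)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:
            cur = cur.next
            self.hops += 1
        cur.next = node


class TailList:
    """Same list, but it remembers the tail so append never walks."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.hops = 0  # stays 0: append follows no links

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node


def to_list(linked):
    """Collect the values of either list type in order."""
    out = []
    cur = linked.head
    while cur is not None:
        out.append(cur.value)
        cur = cur.next
    return out
```

Appending n nodes to `WalkingList` takes (n-1)(n-2)/2 hops in total, so the cost is quadratic in the number of undisplayed children under one parent. `TailList` keeps the same order with no walking, at the price of one extra pointer and of dropping any per-node duplicate check done during the walk.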
Bug 539356 is the way to eliminate Invalidate overhead in reflow. Why are we doing multiple relayouts? If there are no input events, we shouldn't interrupt reflow, right?
Because we interrupt the parser on a timer. So we parse some stuff, lay it out, then parse more stuff, etc. Each time reflow is running to completion of whatever DOM is there, but then more nodes get added.
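A rough way to see the cost of parse-some, lay-out, parse-more: the sketch below counts rows laid out across all passes. It assumes each reflow pass costs time proportional to the number of rows parsed so far, which is a simplification. `total_layout_work` and the chunk sizes are hypothetical.

```python
def total_layout_work(total_rows, rows_per_chunk):
    """Sum the rows laid out over all reflow passes.

    Assumes the parser yields after every rows_per_chunk rows and that
    each reflow re-lays-out every row present in the DOM at that point.
    """
    if rows_per_chunk <= 0:
        raise ValueError("rows_per_chunk must be positive")
    work = 0
    parsed = 0
    while parsed < total_rows:
        parsed = min(parsed + rows_per_chunk, total_rows)
        work += parsed  # one full pass over everything parsed so far
    return work
```

With 20,000 rows, a single pass does 20,000 units of work, while yielding every 2,000 rows does 110,000, or 5.5 times as much. Smaller chunks keep the browser more responsive but push the total further up, which is the tradeoff described above.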
Depends on: dlbi
Oh right. And Chrome doesn't interrupt the parser on a timer at all? Does the HTML5 parser help here? With off-main-thread parsing, presumably we don't need to interrupt the parser on a timer anymore?
> Oh right. And Chrome doesn't interrupt the parser on a timer at all? I don't know. It certainly didn't in this testcase. > Does the HTML5 parser help here? Gives us chrome-like behavior: 8s pageload, no response during it.
Isn't the reflow interruptible?
Yes, but... About 10% of the time is spent on the parser thread. Of the time on the main thread, about 44% is under reflow, 35% is frame construction (half style stuff, half other), 15% is various DOM node creation and parser overhead. That 35% is certainly not interruptible. And of the reflow, about 2/3 is under nsTableOuterFrame::Reflow (which doesn't interrupt, really) and the rest under ClearFloats computing the intrinsic min-width of the table from ComputeSize (also not interruptible). So really, interrupting doesn't buy us much here.
I see. Still, I think the HTML5 parser is a big step in the right direction. If we wanted to improve the user experience further, we should look at making more of that stuff interruptible.
At this point I see about 5s for pageload (vs 8s for Chrome) and 2s for filter (vs 1.5s for Chrome). I'll do some profiling to see what remains during that filter step.
A bunch of the remaining time is a restyle and resulting frame construction. Which makes sense, since the testcase is changing display on various rows. Anyway, I tried it with stylo and that's comparable to Chrome in terms of performance, at first glance.