Closed Bug 67692 Opened 24 years ago Closed 23 years ago

large tables are slow resolving style

Categories

(Core :: CSS Parsing and Computation, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
Future

People

(Reporter: karnaze, Assigned: pierre)

References

()

Details

(4 keywords)

From bug 54542, Randell Jesup writes:

If you check out the mozilla nglayout stress test cases
http://www.mozilla.org/newlayout/testcases/stress/wbtbltxt.html and
http://www.mozilla.org/newlayout/testcases/stress/wbtblclr.html
you'll see that performance is MUCH worse than ns4.7, and it used to be better
than ns4.7.  (By much worse I mean 4-10 times worse on large tables on the
second, ns4.7 takes 4 seconds to layout on a PIII-800, Mozilla takes 40
seconds!)

I ran the 2nd stress test in the debugger and every time I broke, the following 
line in nsCSSFrameConstructor::TableProcessChild was being executed.  

aPresContext->ResolveStyleContextFor(&aChildContent, aParentStyleContext,                                                                           
                                     PR_FALSE,
                                     getter_AddRefs(childStyleContext));

Is it possible that the style system has gotten much slower over the last 
several months for this stress test?
FreeBSD 4.1 20010131xx

I just rechecked wbtblclr.html; time was 2:40.  Admittedly, this is debug build,
but that's about 4 times as slow as it used to be in a debug build.  I assume
this is style-sharing stuff that's slowing it down; I don't know if there's a
lot of new error-checking in the style code that might slow it down for debug
builds only.

Whomever is working on the Style sharing stuff to save memory should use these
stress test pages as part of their testcases.
I made a run in Quantify and could easily put the blame on a single routine. 
There are 80 million calls to the function 
StyleFontImpl::CalcFontDifference(nsFont const&,nsFont const&) during the load 
and that alone takes 54% of the time.

That function is called from StyleFontImpl::CalcDifference(StyleFontImpl 
const&)const which is called 40 million times and takes 58% of the time.

That function is called from 
StyleContextImpl::CalcStyleDifference(nsIStyleContext *,int&,int)const which is 
called 40 million times and takes 65% of the time.

Then going upward: 
StyleContextImpl::StyleDataMatches(...) 40 million times, 69% of the time.
StyleSetImpl::FindMatchingContext(...) 40 000 times, 75% of the time.

It seems as if we have a bad O(n^2) algorithm here in 
FindMatchingContext searching for a similar style. That should be the primary 
target for optimization. The secondary target should be the string comparison in 
 StyleFontImpl::CalcFontDifference, which alone takes half the time.

One method I could think of would be to hash a value for each style set (I see 
signs of a CRC already), and keep the style sets in a sorted list. Insertion 
would then take about log(n) time and finding a style set would also be log(n) 
which would get the time complexity down to O(n*log(n)). 

It could be a little bit slower for really small datasets but would be faster as 
soon as there are more than a few style sets. My experience from other similar 
situations is that the break even point can be real low, so it might not even be 
measurable.

Comments? Attinasi?
Keywords: perf
There already eqists a CRC on the StyleContextData, and it should keep us from
having to do brute-force matching, which is slow, in many cases. As Pierre
recently found, there is a problem with the CRC calculations, however, and it
limits the number of unique CRC values, making them far less effective. I
believe that having a really good CRC for each style struct should limit the
amount of time we have to call CalcStyleDifference, which is by its very nature,
slow.
Is there a bug on the CRC problem or do you have a reference to Pierre's 
findings? Even if the CRC wasn't quite correct I think there is a little too 
many calls. It is as if there were 10000 different styles, one for every table 
cell (I guess there might be) and each of them were compared to every other 
without having any help from the CRC at all. 

Btw, I don't understand why I didn't consider a hashtable in my last post. I 
must have been sleeping.
(Repeat of message from 54542)
Retested FreeBSD 4.1 20010206xx

Full optimize/no DEBUG build

For wbtblclr on a PIII-800, time is 1:40 (100 seconds)!
This is on the order of 5 times as bad as M18/NS6 was.


For wbtbltxt, time is 14 seconds.  This is only 40-50% worse than M18.
For reference again, NS4.75 on this same page is circa 6-7 seconds for either
color or plain(!)
(end of repeat)

This level of performance regression (even in a testcase) merits a high
priority, even this close to a milestone.  Upping severity; adding keyword
mozilla0.8, regression, testcase; OS -> All; Platform -> All
Severity: normal → major
OS: Windows NT → All
Hardware: PC → All
Added topperf
FWIW, I pulled just a couple of hours ago and retested the color table in 
Quantify and it seems to have gotten twice as fast suddenly. I think it might 
have been Pierre's checkin.

The style system isn't the biggest offender anymore, even though it shows up 
high. It is now "only" about 35% of the time with 1/3 of that (10% overall) 
being time used by memory allocations. 1/6 of the style time (5% overall) is 
used by StyleContextImpl::FindChildWithRules.
Depends on: 68030
After my big checkin last week that contained the fix for the CRC calculation, I 
am tempted to close this bug as fixed.  As Daniel noticed, the style system 
represents 35% of the time now, which is a bit high a for page but not completely 
out of bounds, especially considering the page itself.

On the Mac, wbtbltxt takes about the same time as NS4.7 (around 10 seconds) while 
wbtblclr still takes twice the time (around 35 seconds instead of 18).  It seems 
to me that we might be killed by the incremental reflow: the page first displays 
in 13-15 seconds then it takes 20 seconds where it does nothing but adjusting the 
scrollbar.

It doesn't seem to be a style problem anymore.  Back to karnaze.

Many thanks to Randell and Daniel for the metrics.
Assignee: pierre → karnaze
Keywords: mozilla0.9
does this bug need a new title????
Keywords: mozilla0.8.1
Keywords: mozilla0.8
Blocks: 71668
Target Milestone: --- → mozilla0.9
just offering another data point:

nighly build 2001031611 vs. ns 4.76
linux 2.2.19pre17 running on an amd k2-400 with 192 mb ram

wbtbltxt:  mozilla damn competitive with 4.76
ns 4.76 loads in (wall clock ) 1 minute 2 seconds
mozilla also loads in 1 minute 2 seconds  ( even with the incremental reflows. 
wow.) 

wbtblclr:  
ns 4.76 loads in 1 minute  15 seconds
mozilla loads in 2 minutes 36 seconds.  
QA Contact: ian → chrisd
table issue, qa -> chrisd
Please be careful - mention if timings are from an optimized build or not.

On my PIII/800, 20010319xx --enable_optimize --enable_debug (!) build,
FreeBSD4.1:

wbtbltxt.html: ~16s
wbtblclr.html: ~32s

ns4.75 (linux, running on FreeBSD):

wbtbltxt.html: ~6s
wbtblclr.html: ~7s

Note: don't count the download time - these are large documents, especially
wbtblclr.html.  Make them local, pre-cache them, or use a proxy.

I will try a non-debug build as well (building now), since a fair bit of extra
processing occurs in some cases under debug.  However, I still see ~2.5:1 for
text, ~5:1 for color slower.

Compared to my measurements of 2/6/2001 in this bug, text is slightly worse (14
up to 16), and color is significantly improved (100s down to 32 sec).  Note
those 2/6 numbers were no-debug, so it could change when I finish my no-debug
rebuild.  Certainly there's been a big improvement in color, and we're about
back to where we were or a bit better than before the regression.

We shouldn't pat ourselves on the back about handling these as well as ns4.7
does, though.  IE5.5 uses *much* less time, and on a slower machine (650MHz PIII
laptop):

webtbltxt.html: <1 sec
webtblclr.html: 1 sec.

and a spyglass-based browser I've worked on takes <<1 second for either text or
color.

If it's determined that the slowness is NOT style and the problem is still not
resolved, the base bug 54542 should be re-opened.
optimized, no-debug numbers:
text: 10 seconds (vs 6 for ns4.7)
color: 24 seconds (vs 7 for ns4.7)

PIII/800, FreeBSD4.1, mozilla pull/build 20010319xx  --enable-optimize
--disable-debug --disable-cpp-rtti --disable-pedantic --disable-mailnews
--with-pthreads

text is 1.6x longer, color is 3.5x longer.
Attinasi feels that wbtbltxt is a worst case test for the style system sharing. 
Rather than turn off style sharing to do better on this test at the expense of 
most real world pages, we are just going to have to live with the fact that we 
will do horribly on this test until the style system undergoes some fairly major 
re-architecting.

Pierre, we both came to conclusion that the style system is the cause of the 
slowdown even though you have said it only accounts for 35% of the time. When 
each cell uses the same background color, the time is 2.5X times faster on my 
machine. Also, table reflow is not the major culprit.

Marking future.
Assignee: karnaze → pierre
Target Milestone: mozilla0.9 → Future
Is this "fixed" by the style system rewrite? The test case is much faster now.
Definitely it's much improved with the rewrite. the wbtbltxt.html takes ~7s on my
Celeron 566 and the wbtblclr.html ~12s.
Using Linux nightly 2001061821.
This looks fixed to me, but I guess someone needs to run Quantify and compare
with previous results
The stlye system is now 7% of the page load time compared to 35% before.

I will mark this FIXED since noone has objected for the last couple of months. 
If there are more issues please file specific bugs.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Changing QA contact
QA Contact: chrisd → madhur
You need to log in before you can comment on or make changes to this bug.