641341 - CalculateHypotheticalBox can be slow when lines have lots of placeholders (only placeholders?) on them

Reporter

Description

•

14 years ago

On my mac, 50% of the total time is reflow (the rest is painting). Almost a third of the reflow time is CalculateHypotheticalBox. Almost half of that is self time; the rest is the nsLineBox::IndexOf call the line iterator constructor makes. Would it make any sense to search backwards within lines when searching back from the line cursor? Are we even hitting that here? The self time is the IsEmpty() calls in AreAllEarlierInFlowFramesEmpty, looks like. Not sure what we can do about those.

Mats Palmgren (inactive)

Comment 1

•

14 years ago

Attached patch wip (wdiff) — Details — Splinter Review

Using "lineBox->CachedIsEmpty()" is valid at this point, right? If so, then if it's true it follows that all frames on that line before the place-holder must be empty, so we can skip the whole AreAllEarlierInFlowFramesEmpty() loop. This makes the default 30x30 test about 9% faster in a local opt build for me (Linux x86-64).

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 2

•

14 years ago

(In reply to comment #1) > Using "lineBox->CachedIsEmpty()" is valid at this point, right? Yes.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

14 years ago

Yeah, 9% about matches my profiler data (1/2 * 1/3 * 0.5). Can we do something about the line iterator too? I'm not sure what; searching backwards in the line box won't actually do the right thing...

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 4

•

14 years ago

I think nsAbsoluteContainingBlock::Reflow could keep a line cursor containing the current line of kidFrame's placeholder. The line cursor should only need to advance forward.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 5

•

14 years ago

The thing is, in this case there aren't that many lines (30 at most). It's just that each line has lots of stuff on it... so for your average new abs pos kid, we have to look at 900/2 placeholders (not counting the walls!) to find the "right" line.

Mats Palmgren (inactive)

Comment 6

•

14 years ago

Attached patch wip8 — Details — Splinter Review

Fwiw, I tried something like that, this gave an additional 3% improvement.

Mats Palmgren (inactive)

Comment 7

•

14 years ago

Attached patch wip28 — Details — Splinter Review

Caching the last child frame on the LineCursor property gives a significant improvement. Also optimized the dual direction line iteration a bit, it won't give much for this test but I think it might be good when there are many lines and the cursor is near either end. When there is no line cursor (we start from the first line) is such a case.

Mats Palmgren (inactive)

Comment 8

•

14 years ago

Attached patch previous patches rolled into one — Details — Splinter Review

With all patches I get a 33% improvement in a local Linux x86-64 opt build.

Ahmed Aly

Comment 9

•

14 years ago

opera pass this test amazingly

Boris Zbarsky [:bzbarsky]

Reporter

Comment 10

•

14 years ago

Mats, are you actively working on this? If not, should someone else pick it up?

Mats Palmgren (inactive)

Comment 11

•

14 years ago

I'm not actively working on this. It's probably worth taking some of these patches, at least the first one. Feel free to pick it up.

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

14 years ago

Updated

•

13 years ago

Depends on: 682052

mdew

Comment 12

•

13 years ago

Is this bug similar to the slowness when visiting http://mazery.sjackson.net/ ?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 13

•

13 years ago

(In reply to mdew from comment #12) > Is this bug similar to the slowness when visiting > http://mazery.sjackson.net/ ? No, that seems to use canvas, not tables (and Firefox is a lot faster than Chrome on that page for me).

Randell Jesup [:jesup] (needinfo me)

Comment 14

•

13 years ago

See jprof profile (Fedora x64 Xeon) in https://bugzilla.mozilla.org/attachment.cgi?id=563041 See also bug 626927, though that has a different cause.

Keywords: perf

OS: Mac OS X → All

Hardware: x86 → All

Randell Jesup [:jesup] (needinfo me)

Comment 15

•

13 years ago

FYI, in my case 80% is ::Paint(), ~17% is Reflow. Of the 80%, 38% is PaintForFrame(), 24% nsIFrame::BuildDisplayListForStackingContext(), and 18.4% nsDisplayList::ComputeVisibilityForRoot()

Boris Zbarsky [:bzbarsky]

Reporter

Comment 16

•

13 years ago

Here's a thought. What about stashing the pointer to the line box on the abs pos frame, or on its placeholder, in a property? We could do that during reflow....

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 17

•

13 years ago

That's tricky because it has to be updated as the placeholder (or its ancestors) get pushed and pulled around.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 18

•

13 years ago

Ah, pulling and pushing ancestors, yeah. :(

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 19

•

13 years ago

Alternatively we could give nsInlineFrames and nsPlaceholderFrames a direct pointer to the line-box if and only if they're immediate children of the line box/block. Then you could find the line-box for any placeholder by walking up the ancestors looking for the pointer.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 20

•

13 years ago

That could end up with a fair amount of cost if there are lots of inlines. Would it make sense to cache the line box on the placeholder when reflowing the placeholder? By the time we need to calculate the hypothetical box, the placeholder has been reflowed, presumably, to get the horizontal position....

Randell Jesup [:jesup] (needinfo me)

Updated

•

13 years ago

Depends on: 691414

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 21

•

13 years ago

Yes!

Boris Zbarsky [:bzbarsky]

Reporter

Comment 22

•

13 years ago

OK, I'm going to try that. I think it should give us simpler code than Mats' last patch, with about the same speedup. It does rely on the invariant that we reflow the placeholder before the abs pos frame. If we think this is not reliable across incremental reflows, we can clear the cached line box somewhere...

Boris Zbarsky [:bzbarsky]

Reporter

Comment 23

•

13 years ago

Attached patch Perhaps like so (obsolete) — Details — Splinter Review

Attachment #564435 - Flags: review?(roc)

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

13 years ago

Assignee: nobody → bzbarsky

Priority: -- → P1

Comment 24

•

13 years ago

Try run for 90b41fd48aea is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=90b41fd48aea Results (out of 71 total builds): success: 66 warnings: 4 failure: 1 Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bzbarsky@mozilla.com-90b41fd48aea

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 25

•

13 years ago

(In reply to Boris Zbarsky (:bz) from comment #22) > OK, I'm going to try that. I think it should give us simpler code than > Mats' last patch, with about the same speedup. It does rely on the > invariant that we reflow the placeholder before the abs pos frame. I don't believe this invariant holds. If something inside an abs-pos frame is dirty, nsBlockFrame::ChildIsDirty does nothing to dirty the placeholder or its line, nor does PresShell::FrameNeedsReflow. Another approach could be to cache state in nsAbsoluteContainingBlock::Reflow that keeps track of which line we're on and the last placeholder in the line that we reflowed, and pass that state as a parameter to ReflowAbsoluteFrame and poke it into nsHTMLReflowState. Then CalculateHypotheticalBox can start its search for the placeholder from the last placeholder.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 26

•

13 years ago

> If something inside an abs-pos frame is dirty, nsBlockFrame::ChildIsDirty does nothing to > dirty the placeholder or its line, nor does PresShell::FrameNeedsReflow. But in that case the placeholder's line is valid, no? We only need to worry about the cache storing an incorrect line, and in that case the invariant I mentioned should hold and keep that from lasting until we get to this code.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 27

•

13 years ago

True.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 28

•

13 years ago

Comment on attachment 564435 [details] [diff] [review] Perhaps like so Review of attachment 564435 [details] [diff] [review]: ----------------------------------------------------------------- I think you should probably just make the cached line box a regular field of nsPlaceholderFrame. Almost all abs-pos placeholders will end up having that property. If we're worried about the word of space in placeholders for floats, we could make abs-pos placeholders a tiny subclass of nsPlaceholderFrame. ::: layout/generic/nsHTMLReflowState.cpp @@ +984,5 @@ > // all the frames before it are empty. In that case, it would > // have been just before this line. > // XXXbz the line box is not fully reflowed yet if our > // containing block is relatively positioned... > + bool found = false; Why did you move "found" out?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 29

•

13 years ago

Comment on attachment 564435 [details] [diff] [review] Perhaps like so > I think you should probably just make the cached line box a regular field of > nsPlaceholderFrame. Hmm. Yeah, fair. That probably actually uses less memory than the property in most cases. > Why did you move "found" out? By mistake. ;)

Attachment #564435 - Attachment is obsolete: true

Attachment #564435 - Flags: review?(roc)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 30

•

13 years ago

Attached patch Speed up CalculateHypotheticalBox for the case of a block containing a bunch of lines but absolutely no in-flows anywhere. — Details — Splinter Review

Attachment #564745 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Updated

•

13 years ago

Attachment #564745 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 31

•

13 years ago

Oh, can you add a comment to GetCachedLineBox indicating that it's only OK to call this if the caller ensures that the placeholder has been reflowed since any changes to the frame tree that could affect the line-box?

Boris Zbarsky [:bzbarsky]

Reporter

Comment 32

•

13 years ago

Yes, absolutely.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 33

•

13 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/0cd9ed297f73

Flags: in-testsuite-

Whiteboard: [need review]ietestdrive → ietestdrive

Target Milestone: --- → mozilla10

Marco Bonardo [:mak]

Comment 34

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/0cd9ed297f73

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Randell Jesup [:jesup] (needinfo me)

Comment 35

•

13 years ago

A rough test on my very fast Xeon went from 85 seconds (see bug 641340) to 82 seconds with this patch. So this may have helped a little, but it doesn't look like it helped a lot. (Were any before/after profiles taken? I could re-profile (I took one a few weeks ago) and compare.) From my previous test, on the same system Chrome takes 4.5 seconds (close to 20x faster). On my Win7 laptop (Lenovo W520 - also SandyBridge core i7 (2820QM in this case), but on the order of 1/2 as fast as the desktop generally) in IE9 it took 34 seconds, and on Aurora 9.0a2 (without this patch) it takes (drumroll) 339 seconds. beat 10x as long by 1 second! Given the lack of improvement on my desktop machine, I suspect this patch won't help much (not that it's bad).

Boris Zbarsky [:bzbarsky]

Reporter

Comment 36

•

13 years ago

> A rough test on my very fast Xeon went from 85 seconds (see bug 641340) to 82 seconds How many measurements did you take? This test is _very_ noisy and timing-dependent. > Were any before/after profiles taken? Yes, of course. Before the patch painting was about 50% of the total test time for the slice I profiled; after the patch it's closer to 60%. If we assume painting speed is unchanged and number of paints is unchanged, that means that "new time" = "old time"*50/60. Given that the maximum expected win was about 16% (see comment 0, which tells you exactly how much time the hotspot this bug is about was taking), that seems about right.

Ahmad Saleem

Comment 37

•

13 years ago

My test system (i7 2600k at stock) with FF Nightly (70e4de45a0d0)with new profile showed just 75 seconds and with this bug fix in build (38a487da2def}, I got only 54 seconds. Test is with 30*30 maze, performed only once, both with new profile.

mdew

Comment 38

•

13 years ago

Is this bug really "resolved" I'm not seeing firefox nightly even coming close chrome, and even IE. On a lowly AMD 4800+: IE9: 25 seconds (30x30) Chrome: 12 seconds (30x30) Firefox: 154 seconnds I've re-run these tests a few times over, getting pretty much the same results (give or take a few seconds).

Daniel Holbert [:dholbert] (PTO until 2/24)

Comment 39

•

13 years ago

Your nightly was built before this was landed on mozilla-central (this morning, Comment 34). Hopefully you'll see improved results in tomorrow's nightly.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 40

•

13 years ago

> Is this bug really "resolved" Yes. This bug wasn't "make it as fast as other browsers". It was "1/6 of the time we're spending is in this function; we can get rid of it". That's pretty obvious if you actually read comment 0 or comment 36. There are other bugs that cover other things that we're slow on in this testcase.

Frank Zimmer

Comment 41

•

13 years ago

For me this seems still not fixed : Safari 5.1.1 = 5,2 seconds Chrome 16.0.886.0 (Developer Build 101724 Windows) = 6.8s IE9 = 19s FF Nightly build October 25 (32 and 64 bit) = i did not hat time enough to wait for the end

Randell Jesup [:jesup] (needinfo me)

Comment 42

•

13 years ago

As mentioned in comment 40 by bz, this bug was to reduce the overhead of one part of the problem, which accounted only for about 1/6th of the runtime, so even if it had been reduced to 0 by this bug, the test would only be ~1/6th faster. See the other bugs in this group.

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

13 years ago

Depends on: 696175

Mats Palmgren (inactive)

Updated

•

13 years ago

Depends on: 700112

Boris Zbarsky [:bzbarsky]

Reporter

Comment 43

•

13 years ago

> It does rely on the invariant that we reflow the placeholder before the abs pos frame. Turns out this is not true for cases when the containing block of the abs pos frame is a rel-pos inline or a rel pos block splitting across columns, because the placeholder might get pushed to a later continuation while the abs pos frame is reflowed after the first continuation is done. Of course that means that in those case the layout ends up wrong with the old code, but that's probably better than crashing as in bug 696175. I just backed this out on inbound to fix bug 696175 for now. Once we finish up bug 524925 we can do abs pos reflow later and fix this again.

Status: RESOLVED → REOPENED

Priority: P1 → P3

Resolution: FIXED → ---

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

13 years ago

Depends on: 524925

Mats Palmgren (inactive)

Updated

•

13 years ago

Depends on: 730769

Mats Palmgren (inactive)

Updated

•

13 years ago

Depends on: 730771

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

13 years ago

Depends on: 746705

Mayank Bansal

Comment 44

•

13 years ago

this is the only benchmark that mnakes /firefox look bad in the latest Toms Hardware :Web browser Grand Prix. http://www.tomshardware.com/reviews/windows-7-chrome-20-firefox-13-opera-12,3228-8.html

Boris Zbarsky [:bzbarsky]

Reporter

Comment 45

•

13 years ago

Yes, we do know that...

CruNcher

Comment 46

•

12 years ago

Though accelerated Direct2D Performance seems to have improved significantly already also in the maze test still have to test on the IGP but it looks very nice in Multimedia tasks already much much better then some days back :)

CruNcher

Comment 47

•

12 years ago

I was restricting myself for a long time now to DX9 (layers) because of the bad Direct2D Performance seems slowly that isn't needed anymore also the frame drops seem to have lowered nicely and also the extensions are more stable in that regards :) I guess all the new Profiling tools introduced really pay of now :)

CruNcher

Comment 48

•

12 years ago

Also https://bugzilla.mozilla.org/show_bug.cgi?id=698297 shows very significant render improvements :)

Mayank Bansal

Comment 49

•

12 years ago

This has really improved within a few days . :) Very good performance. Not still comparable to other browsers, but much better than previous firefox performance. Congratulations! Which patch/set of patched made this happen?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 50

•

12 years ago

Bug 524925. Boris, should we wontfix this bug now? I think post-bug-524925 this isn't needed.

Mats Palmgren (inactive)

Comment 51

•

12 years ago

Chrome 23 (Canary) takes ~6 seconds for a 30x30 maze. Nightly 2012-08-02 takes ~16 seconds. There's no reflow methods showing up in a shark build. The results are: Running (Self) Symbol Name 1886.0ms 7.2% isalloc_validate 1040.0ms 3.9% ozone_size 664.0ms 2.5% ozone_free_definite_size 172.0ms 0.6% ozone_free 5.0ms 0.0% free 5.0ms 0.0% ozone_realloc 973.0ms 3.7% aa_render_shape 705.0ms 2.7% _ZL11SearchTableP12PLDHashTablePKvj15PLDHashOperator 524.0ms 2.0% memmove$VARIANT$sse42 455.0ms 1.7% _spin_lock$VARIANT$mp 443.0ms 1.6% resample_byte_h_4cpp_vector 436.0ms 1.6% arena_dalloc 368.0ms 1.4% nsRegion::SetToElements(unsigned int) 367.0ms 1.4% arena_malloc 305.0ms 1.1% mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems(nsDisplayList const&, mozilla::FrameLayerBuilder::Clip&, unsigned int) ... the rest is < 1% ... So I would say this bug is worksforme. I think we should still consider taking some of the suggested improvements though even though it doesn't impact this particular test anymore (at least the first patch seems worth it without adding much risk). BTW, is it reasonable that isalloc_validate takes this much time?

Mats Palmgren (inactive)

Comment 52

•

12 years ago

> Chrome 23 (Canary) takes ~6 seconds for a 30x30 maze. Note that this is after a Shift+Reload. If you close the dialog and start a new run without doing Shift+Reload Chrome will take a different (shorter) path through the maze, which takes 1.8 seconds. (this could of course influence our comparative score quite badly depending on how these sites perform their tests)

Mayank Bansal

Comment 53

•

12 years ago

with the latest nightly, a strange behaviour occurs. as soon as the maze starts solving,nightly is blazing fast.Faster than chrome But very quickly, it slows down. Thst is, after going maybe 25-30 steps, the solving slows down a lot. if you refresh the tab and start solving the maze again, it again start very fast but tapers very quickly.

Dennis Jakobsen

Comment 54

•

12 years ago

(In reply to mayankleoboy1 from comment #53) > with the latest nightly, a strange behaviour occurs. as soon as the maze > starts solving,nightly is blazing fast.Faster than chrome But very quickly, > it slows down. Thst is, after going maybe 25-30 steps, the solving slows > down a lot. > > if you refresh the tab and start solving the maze again, it again start very > fast but tapers very quickly. I can confirm this strange behaviour but only on Windows.. Linux runs fast all the way through. I'm sure you already know, but test speeds vary alot depending on maze. Would be nice with a configurable seed, so you could test the same maze on different browsers/builds.

Zéfling

Comment 55

•

12 years ago

The first maze generated is still same.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 56

•

12 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #50) > Bug 524925. Actually I meant bug 691651. We need to make it faster still. Bug 776190 tracks further improvements.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 57

•

12 years ago

Could people PLEASE stop spamming the bug, as well as every other bug related to maze solver? I really didn't need literally dozens of mails about this. :( We know it got faster, per comment 56. We know it's still quadratic as comment 53 describes. That's bug 691645, which is clearly hanging off bug 776190. Mats, I agree that it's still worth making some of the changes here. I'm not going to have time to drive them in. Will you? Resummarizing to the remaining issue, but it might make sense to file a new bug for it without all the baggage of this one...

Assignee: bzbarsky → nobody

Summary: IE Maze Solver testcase takes too long in CalculateHypotheticalBox → CalculateHypotheticalBox can be slow when lines have lots of placeholders (only placeholders?) on them

Target Milestone: mozilla10 → ---

Boris Zbarsky [:bzbarsky]

Reporter

Comment 58

•

12 years ago

And to be clear, I don't think we should overrotate on this, short of taking notes on how this could generally be done better for servo. So I'd also be ok with a wontfix.

Chris Peterson [:cpeterson]

Updated

•

11 years ago

Blocks: ietestdrive

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

9 years ago

Blocks: 1235321

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

BugBot [:suhaib / :marco/ :calixte]

Comment 59

•

2 years ago

The severity field for this bug is relatively low, S3. However, the bug has 16 votes.
:jfkthame, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jfkthame)

BugBot (nomail) [:suhaib / :marco/ :calixte]

Comment 60

•

2 years ago

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(jfkthame)

wip (wdiff) 14 years ago Mats Palmgren (inactive) 1.65 KB, patch		Details \| Diff \| Splinter Review
wip8 14 years ago Mats Palmgren (inactive) 5.46 KB, patch		Details \| Diff \| Splinter Review
wip28 14 years ago Mats Palmgren (inactive) 16.61 KB, patch		Details \| Diff \| Splinter Review
previous patches rolled into one 14 years ago Mats Palmgren (inactive) 16.05 KB, patch		Details \| Diff \| Splinter Review
Perhaps like so 13 years ago Boris Zbarsky [:bzbarsky] 13.69 KB, patch		Details \| Diff \| Splinter Review
Speed up CalculateHypotheticalBox for the case of a block containing a bunch of lines but absolutely no in-flows anywhere. 13 years ago Boris Zbarsky [:bzbarsky] 13.00 KB, patch	roc : review+	Details \| Diff \| Splinter Review