Closed Bug 1398448 Opened 7 years ago Closed 7 years ago

page loading is to slow if scrolling during page loading (not always reproduce, but 50/50)

Categories

(Core :: Layout, defect)

57 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- fixed

People

(Reporter: alice0775, Assigned: emilio)

References

()

Details

(Keywords: perf, regression)

Attachments

(3 files)

Attached file about:support
The problem reproducible in Nightly57, but not in 56b10


Steps to Reproduce:
1. Clear cache and everything
2. Scroll until page turn blank when something appear in content area
3. Repeat step 2 about 3 times

Actual Results:
scroll until page turn blank when something appear in content area: it takes about 60-70 sec
If wait and do nothing : it takes about 20 sec


Expected Results:
Should be same
screencast https://youtu.be/Ui3_oT_64fc
I am not confident 100%, regression window:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=3c698f2b2eed6e29d6346c798e6234bd5b2696c4&tochange=af5f4b6dbf7c2ccb16d020e98a520a30374da437

af5f4b6dbf7c	Emilio Cobos Álvarez — Bug 1398041: Make WipeContainingBlock consistent about reconstructing sync or async. r=bz
Blocks: 1398041
Keywords: perf, regression
Hmm, so you have stylo enabled, for which I think we don't coalesce reconstruct frame hints, so I'm not sure how that patch could possibly affect perf.

Also, I took a profile of loading that page and scrolling (http://www.ecma-international.org/ecma-262/index.html):

 https://perfht.ml/2gR8Oui

I only see WipeContainingBlock once on the whole profile, and it's not a big amount of time...

I'll try a bit more, but with stylo enabled I think my patch is mostly a no-op.
Here's another profile, with a bit of a longer pageload. Again, most of the CPU time is Reflow, and WipeContainingBlock appears only once:

  https://perfht.ml/2gTwgqU
I cannot get Gecko profiler log for some reason, filed Bug 1398460.


BTW,
Via local 64bit build(According to about:support, stylo is enabled)
m-i cset 0298cb04eb7a : able to reproduce (attempt 1 of 2 times)
m-i cset 0298cb04eb7a and backed out af5f4b6dbf7c(Bug 1398041) : not yet reproduce (attempt 0 of 5 times)
I got Gecko Profiler log https://perfht.ml/2xWcJKb with Nightly 57.0a1(2017-09-08) Build ID 20170908220146

It was about the following situation
    Open the page at 3sec- 
    Scroll up/down with scroll bar about -30sec
    Page turn to blank at about 30 to 100sec.
    Page is shown at about 105-110sec)
So something worth investigating is definitely going in that profile, but looking at the time spent in WipeContainingBlock there, I'm moderately sure it doesn't have to do with bug 1398041, at least in isolation.

I haven't been able to reproduce it, but just realized I was trying with the latest version of the spec, which is at http://www.ecma-international.org/ecma-262/index.html, but your screencast is with http://www.ecma-international.org/ecma-262/7.0/index.html, which is a different web page. So I'll try a bit harder.
Ok, that gives me a profile much more consistent with yours: https://perfht.ml/2gSx3rS
Over to layout, since that's all frame construction time.
Component: General → Layout
So I ran mozregression, and arrived to the same changeset as you. However, that was the first build that had the layout.css.servo.enabled set to true by default (not sure why).

Alice, can you try to reproduce the problem on an older build manually toggling layout.css.servo.enabled to "true"? (I could repro on a build from 06-09)

Also, can you try to repro on current nightly switching that to false?

I think this is a stylo-only issue, but I'd like you to confirm if that's not too much of a hassle.
Flags: needinfo?(alice0775)
The issue is not always reproducible.  The probability is 50/50 on my PC.

force layout.css.servo.enabled = false
  It takes within 20 seconds for the page loading to complete. (this is normal on my PC)
    Nightly(2017-09-09) BuildID=20170909100226
    Nightly(2017-09-08) BuildID=20170908220146


force layout.css.servo.enabled = true
  It takes 100-120 seconds for the page loading to complete.
  And the blank content is continued for very long period(20-50 seconds) after scroll.
    Nightly(2017-09-09) BuildID=20170909100226
    Nightly(2017-09-08) BuildID=20170908220146

  It takes within 20 seconds for the page loading to complete. (this is normal on my PC)
    Nightly(2017-09-08) BuildID=20170908100218
    Nightly(2017-09-07) BuildID=20170907100318
    Nightly(2017-09-06) BuildID=20170906100107
Flags: needinfo?(alice0775)
Summary: page loading is to slow if scrolling during page loading → page loading is to slow if scrolling during page loading (not always reproduce, but 50/50)
Attached file Change list.
So I took a closer look today, and my patch is definitely to blame.

The reason why this is more apparent with stylo is because stylo uses change hints for lazy frame construction text-nodes (which makes me think we should maybe make the change list stack capacity bigger).

So we got a change list that looks like the attachment. Each of those text nodes will trigger a WipeContainingBlock tearing down the whole body.

The reason I thought my patch wouldn't have an impact is because I thought before my changes there would be multiple change hints posted for the <body>, but of course that's not true.

With my patch:

 RecreateFrames(#text)
   WipeContainingBlock
     ReconstructFrames(body, Sync)
 RecreateFrames(#text)
   WipeContainingBlock
     ReconstructFrames(body, Sync)
 // ...

Without my patch:

 RecreateFrames(#text)
   WipeContainingBlock
     ReconstructFrames(body, Async)
 RecreateFrames(#text)
   No insertion point, bail.
 RecreateFrames(#text)
   No insertion point, bail.
 // ...

 RecreateFrames(body, Sync)

So, few things:

 * Reverting my patch is the obvious thing to do.
 * We may want to move all the "reconstruct an ancestor" bits to use InsertionKind::Async.
 * Longer term, I'd like to move us from ReconstructFrame hints to just use the lazy frame bits, that'd allow a single pass where we do frame construction, and we can use the lazy frame bits' presence to avoid the redundant reconstructs.
Boris, mind taking a quick look? See comment 12 for the analysis. I can also just revert the WipeContainingBlock stuff if you think the rest aren't worth it.
Flags: needinfo?(bzbarsky)
Comment on attachment 8906473 [details]
Bug 1398448: Always insert async when reconstructing ancestors to avoid pathological frame construction cases.

https://reviewboard.mozilla.org/r/178230/#review183296

Well, I'm glad to see my pessimism from bug 1395719 comment 17 is justified, or something.  ;)

I think we could use a comment somewhere explaining why async is OK in all these cases, and in particular explaining that during a frames flush an async reconstruct will post a new restyle, but restyle processing will keep going as long as there are posted restyles, so we will ensure we process the async thing before the flish ends.  Putting all this at every callsite is a bit too much; maybe do it where InsertionKind::Async is defined?

r=me

::: layout/base/nsCSSFrameConstructor.cpp:8647
(Diff revision 1)
>      nsIFrame* ancestorFrame = ancestor->GetPrimaryFrame();
>      if (ancestorFrame->GetProperty(nsIFrame::GenConProperty())) {
>        // XXXmats Can we recreate frames only for the ::after/::before content?
>        // XXX Perhaps even only those that belong to the aChild sub-tree?
>        LAYOUT_PHASE_TEMP_EXIT();
> -      RecreateFramesForContent(ancestor, aInsertionKind);
> +      RecreateFramesForContent(ancestor, InsertionKind::Async);

So we had used aInsertionKind here to get sync behavior in some cases to play nice with our frame tree state, I thought.  I guess it's actually ok because we never remove things from the frame tree state?

Or am I misrememering and we just left some cases sync as a matter of caution (and maybe because I had not recalled at the time how we avoid an extra event loop trip in the async case from restyling)?
Attachment #8906473 - Flags: review+
Flags: needinfo?(bzbarsky)
Pushed by ecoal95@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/034c6b09eb50
Always insert async when reconstructing ancestors to avoid pathological frame construction cases. r=bz
https://hg.mozilla.org/mozilla-central/rev/034c6b09eb50
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Assignee: nobody → emilio
You need to log in before you can comment on or make changes to this bug.