uninterruptible reflow at _calcMouseTargetRect@chrome://browser/content/tabbrowser.xml:7697:27 when moving the mouse

VERIFIED FIXED in Firefox 61

Status

()

defect
P1
normal
VERIFIED FIXED
2 years ago
6 months ago

People

(Reporter: rowbot, Assigned: mconley)

Tracking

(Blocks 2 bugs, {perf, stale-bug})

unspecified
Firefox 61
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox61 verified)

Details

(Whiteboard: [ohnoreflow][qf:p3][fxperf:p1])

Attachments

(3 attachments)

(Reporter)

Description

2 years ago
Here's the stack:

_calcMouseTargetRect@chrome://browser/content/tabbrowser.xml:7697:27
getMouseTargetRect@chrome://browser/content/tabbrowser.xml:7661:13
_callListener@chrome://browser/content/browser.js:8233:16
addListener@chrome://browser/content/browser.js:8211:5
set_label@chrome://browser/content/tabbrowser.xml:7645:13
updateStatusField@chrome://browser/content/browser.js:4512:7
_showDelayed/this._timer<@chrome://browser/content/browser.js:4921:7
setTimeout handler*_showDelayed@chrome://browser/content/browser.js:4920:19
handleEvent@chrome://browser/content/browser.js:4914:9
EventListener.handleEvent*update@chrome://browser/content/browser.js:4905:7
setOverLink@chrome://browser/content/browser.js:4467:5


This occurs for me when mousing out of the "File a bug" and "File a new bug anyway" links on the OhNoReflow Report page.
This happens when moving the mouse after the status panel content has changed. It's the result of bug 1356663 where we changed this code to not trigger a sync reflow immediately when the status panel content changed.
Blocks: 1356663
Summary: 3.85ms uninterruptible reflow at _calcMouseTargetRect@chrome://browser/content/tabbrowser.xml:7697:27 → uninterruptible reflow at _calcMouseTargetRect@chrome://browser/content/tabbrowser.xml:7697:27 when moving the mouse
Duplicate of this bug: 1358426
No longer blocks: 1356663
Component: Untriaged → General
Depends on: 1356663
Flags: qe-verify?
Priority: -- → P2
Another stack:

_calcMouseTargetRect@chrome://browser/content/tabbrowser.xml:7697:27
getMouseTargetRect@chrome://browser/content/tabbrowser.xml:7661:13
_callListener@chrome://browser/content/browser.js:8233:16
handleEvent/<@chrome://browser/content/browser.js:8225:9
handleEvent@chrome://browser/content/browser.js:8223:5
EventListener.handleEvent*_delayedStartup@chrome://browser/content/browser.js:1531:5
EventListener.handleEvent*onLoad@chrome://browser/content/browser.js:1239:5
onload@chrome://browser/content/browser.xul:1:1
(Assignee)

Updated

2 years ago
Whiteboard: [ohnoreflow][qf][photon-performance] → [ohnoreflow][qf:p1][photon-performance]
Priority: P2 → P3
Whiteboard: [ohnoreflow][qf:p1][photon-performance] → [ohnoreflow][qf:p1][reserve-photon-performance]
Whiteboard: [ohnoreflow][qf:p1][reserve-photon-performance] → [ohnoreflow][qf:p2][reserve-photon-performance]
Flags: qe-verify? → qe-verify+
QA Contact: adrian.florinescu
Assignee: nobody → kmaglione+bmo
Depends on: 1383367
Comment hidden (mozreview-request)
Status: NEW → ASSIGNED
Priority: P3 → P1

Comment 5

2 years ago
mozreview-review
Comment on attachment 8889021 [details]
Bug 1358712: Don't trigger reflows from MousePosTracker.

https://reviewboard.mozilla.org/r/160066/#review166192

::: browser/base/content/browser.js:8568
(Diff revision 1)
>  
> -    this._callListener(listener);
> +    let rect = await PromiseUtils.layoutFlushed(
> +      document, "layout",
> +      () => listener.getMouseTargetRect());
> +
> +    this._callListener(listener, rect);

This looks like it might regress bug 646038?
Attachment #8889021 - Flags: review-
(Assignee)

Comment 6

2 years ago
mozreview-review-reply
Comment on attachment 8889021 [details]
Bug 1358712: Don't trigger reflows from MousePosTracker.

https://reviewboard.mozilla.org/r/160066/#review166192

> This looks like it might regress bug 646038?

Certainly worth checking.
(Assignee)

Comment 7

2 years ago
mozreview-review
Comment on attachment 8889021 [details]
Bug 1358712: Don't trigger reflows from MousePosTracker.

https://reviewboard.mozilla.org/r/160066/#review166814

Clearing review until we're sure this doesn't re-open bug 646038.
Attachment #8889021 - Flags: review?(mconley)
(In reply to Dão Gottwald [::dao] from comment #5)
> Comment on attachment 8889021 [details]
> Bug 1358712: Don't trigger reflows from MousePosTracker.
> 
> https://reviewboard.mozilla.org/r/160066/#review166192
> 
> ::: browser/base/content/browser.js:8568
> (Diff revision 1)
> >  
> > -    this._callListener(listener);
> > +    let rect = await PromiseUtils.layoutFlushed(
> > +      document, "layout",
> > +      () => listener.getMouseTargetRect());
> > +
> > +    this._callListener(listener, rect);
> 
> This looks like it might regress bug 646038?

As far as I can tell, it does not. The behavior of the status panel seems to be identical with or without this patch.

Comment 9

2 years ago
mozreview-review
Comment on attachment 8889021 [details]
Bug 1358712: Don't trigger reflows from MousePosTracker.

https://reviewboard.mozilla.org/r/160066/#review166848

Okay, I'll give this another look.
Attachment #8889021 - Flags: review- → review?(dao+bmo)

Comment 10

2 years ago
mozreview-review
Comment on attachment 8889021 [details]
Bug 1358712: Don't trigger reflows from MousePosTracker.

https://reviewboard.mozilla.org/r/160066/#review171656

Finally was able to test this after figuring out that I need to replace PromiseUtils.layoutFlushed with BrowserUtils.promiseLayoutFlushed. Alas, this does regress bug 646038.
Attachment #8889021 - Flags: review?(dao+bmo) → review-
Keywords: stale-bug
Keywords: perf
(Assignee)

Comment 11

2 years ago
This should be for non-remote pages only, which should be rare enough these days that I feel pretty comfortable making this a qf:p3.
Whiteboard: [ohnoreflow][qf:p2][reserve-photon-performance] → [ohnoreflow][qf:p3][reserve-photon-performance]
Whiteboard: [ohnoreflow][qf:p3][reserve-photon-performance] → [ohnoreflow][qf:p3][reserve-photon-performance] [fxperf]
Whiteboard: [ohnoreflow][qf:p3][reserve-photon-performance] [fxperf] → [ohnoreflow][qf:p3][reserve-photon-performance] [fxperf:p3]
I've realized that comment 11 is just dead wrong. We can and do still hit this when hovering links in remote tabs as well. I'm setting this to fxperf:p1 and taking.
Assignee: kmaglione+bmo → mconley
Whiteboard: [ohnoreflow][qf:p3][reserve-photon-performance] [fxperf:p3] → [ohnoreflow][qf:p3][reserve-photon-performance] [fxperf:p1]
(Assignee)

Updated

a year ago
Whiteboard: [ohnoreflow][qf:p3][reserve-photon-performance] [fxperf:p1] → [ohnoreflow][qf:p3][fxperf:p1]
(Assignee)

Updated

a year ago
Attachment #8960312 - Flags: review?(dao+bmo)

Comment 14

a year ago
mozreview-review
Comment on attachment 8960312 [details]
Bug 1358712 - Get rid of synchronous layout flushes when calculating where to put the StatusPanel.

https://reviewboard.mozilla.org/r/229070/#review235490

::: browser/base/content/browser.js:8998
(Diff revision 1)
> +   *
> +   * We wait until after the next refresh driver tick to account for
> +   * the case that that removeListener called soon after addListener,
> +   * but before a refresh driver tick. This ensures that clean-up
> +   * actions that need to occur after removeListener happen _after_
> +   * any set-up actions that occurred for addListener.

Why are you assuming that the addListener callback still wants to run after removeListener was called? Seems like the promise should instead be rejected.

::: browser/base/content/browser.js:9060
(Diff revision 1)
> -      listener.onMouseLeave();
> +        functionsToCall.push(listener.onMouseLeave.bind(listener));
> +      }
> +    }
> +
> +    window.requestAnimationFrame(() => {
> +      for (let fn of functionsToCall) {

What's the rationale here? I don't think you can assume much about what kind of things onMouseEnter and onMouseLeave will do.
Attachment #8960312 - Flags: review?(dao+bmo)
(Assignee)

Comment 15

a year ago
mozreview-review-reply
Comment on attachment 8960312 [details]
Bug 1358712 - Get rid of synchronous layout flushes when calculating where to put the StatusPanel.

https://reviewboard.mozilla.org/r/229070/#review235490

> Why are you assuming that the addListener callback still wants to run after removeListener was called? Seems like the promise should instead be rejected.

I'm thinking mostly of the case where the StatusPanel value is set and then cleared - for example, if a page load completes, and the "Transferring..." StatusPanel message ends up being set to "" before the next refresh driver tick.

During my own testing, this was pretty frequent. It seems wrong to me to reject for such a frequent case.

And the reason I am assuming that addListener still wants to run after removeListener is because I'm assuming the addListener callback is going to set some state, like the StatusPanel does, and that the removeListener will then remove / unset that state. I'm really just trying to make sure that we get an ordered sequence of events here, regardless of how quickly the StatusPanel string changes.

Does that convince you? Or is there another alternative that I should consider?

> What's the rationale here? I don't think you can assume much about what kind of things onMouseEnter and onMouseLeave will do.

\_callListeners is being called from within a promiseDocumentFlushed callback. In that scope, we desperately need to avoid touching the DOM (bug 1441168 will hopefully make doing this impossible). Because we have no real knowledge about what onMouseEnter and onMouseLeave are doing, we can either:

1. Assume that they'll obey the invariant that the DOM must not be touched, or
2. Assume that the consumers will not obey that invarient, and queue up the functions to run when it's safe to touch the DOM instead

I went with 2 on this one.
Are the above explanations sufficient to drop the issues?
Flags: needinfo?(dao+bmo)
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #15)
> Comment on attachment 8960312 [details]
> Bug 1358712 - Get rid of synchronous layout flushes when calculating where
> to put the StatusPanel.
> 
> https://reviewboard.mozilla.org/r/229070/#review235490
> 
> > Why are you assuming that the addListener callback still wants to run after removeListener was called? Seems like the promise should instead be rejected.
> 
> I'm thinking mostly of the case where the StatusPanel value is set and then
> cleared - for example, if a page load completes, and the "Transferring..."
> StatusPanel message ends up being set to "" before the next refresh driver
> tick.
> 
> During my own testing, this was pretty frequent. It seems wrong to me to
> reject for such a frequent case.

Why does that seem wrong? I don't interpret "reject" as a failure per se.

> And the reason I am assuming that addListener still wants to run after
> removeListener is because I'm assuming the addListener callback is going to
> set some state, like the StatusPanel does, and that the removeListener will
> then remove / unset that state.

The inactive attribute can be set unconditionally after the removeListener call, it doesn't really matter whether the addListener callback ended up removing it.

> I'm really just trying to make sure that we
> get an ordered sequence of events here, regardless of how quickly the
> StatusPanel string changes.

But my point is that the StatusPanel doesn't really want further events after calling removeListener. Briefly setting the label and removing the inactive attribute seems like pointless work, even more so if this happens frequently.

> > What's the rationale here? I don't think you can assume much about what kind of things onMouseEnter and onMouseLeave will do.
> 
> \_callListeners is being called from within a promiseDocumentFlushed
> callback. In that scope, we desperately need to avoid touching the DOM (bug
> 1441168 will hopefully make doing this impossible). Because we have no real
> knowledge about what onMouseEnter and onMouseLeave are doing, we can either:
> 
> 1. Assume that they'll obey the invariant that the DOM must not be touched,
> or
> 2. Assume that the consumers will not obey that invarient, and queue up the
> functions to run when it's safe to touch the DOM instead
> 
> I went with 2 on this one.

Okay. Can you document this briefly?
Flags: needinfo?(dao+bmo)
(In reply to Dão Gottwald [::dao] from comment #17)
> 
> Why does that seem wrong? I don't interpret "reject" as a failure per se.
> 

async functions that might consume this API _do_ interpret rejections as a failure, and throw exceptions if they're not explicitly caught. We also have code in our test framework that tries to ensure that we don't have any uncaught rejections: https://searchfox.org/mozilla-central/rev/c217fbde244344fedfd07b57a740c694a456dbca/testing/xpcshell/head.js#1389

Would you prefer that each callpoint must catch() when talking to MousePosTracker?

> > And the reason I am assuming that addListener still wants to run after
> > removeListener is because I'm assuming the addListener callback is going to
> > set some state, like the StatusPanel does, and that the removeListener will
> > then remove / unset that state.
> 
> The inactive attribute can be set unconditionally after the removeListener
> call, it doesn't really matter whether the addListener callback ended up
> removing it.

Alright. I'll see if I can find a way of cancelling the addListener callback if removeListener is called for the same listener.

> Okay. Can you document this briefly?

Sure, will do, thanks.
Comment hidden (mozreview-request)
I've seen a bunch of review requests go by for you dao, so if you're overloaded, let me know and I'll redirect this.
(Assignee)

Updated

a year ago
Attachment #8960312 - Flags: review?(dao+bmo) → review?(gijskruitbosch+bugs)
I think Dao is busy - what do you think of this, Gijs?
(Assignee)

Updated

a year ago
Attachment #8960312 - Flags: review?(gijskruitbosch+bugs) → review?(felipc)
Hey felipe - do you have cycles to review this, or should I redirect it?
Flags: needinfo?(felipc)
Hey, sorry for the delay. I can review it today
Flags: needinfo?(felipc)
Comment hidden (mozreview-request)

Comment 26

a year ago
mozreview-review
Comment on attachment 8960312 [details]
Bug 1358712 - Get rid of synchronous layout flushes when calculating where to put the StatusPanel.

https://reviewboard.mozilla.org/r/229070/#review240356

Awesome, thanks for the change to onTrackingStarted
Attachment #8960312 - Flags: review?(felipc) → review+

Comment 27

a year ago
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d9110e28f361
Get rid of synchronous layout flushes when calculating where to put the StatusPanel. r=Felipe

Comment 28

a year ago
Backout by rgurzau@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e1d77f98515e
Backed out changeset d9110e28f361 for wpt mass failures on Linux QuantumRender, e.g. in /css/css-tables/height-distribution/computing-row-measure-1.html
Backed out changeset d9110e28f361 (bug 1358712) for wpt mass failures on Linux QuantumRender, e.g. in /css/css-tables/height-distribution/computing-row-measure-1.html

Backout link: https://hg.mozilla.org/integration/autoland/rev/e1d77f98515ee5dd3ac85f0850b259435d8d7bcc

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=d9110e28f3612bd40cfa21b29b0e05fc5afd10aa

Log link: https://treeherder.mozilla.org/logviewer.html#?job_id=172498430&repo=autoland&lineNumber=1972

Log snippet: 
[task 2018-04-07T21:29:17.334Z] 21:29:17     INFO - TEST-START | /css/css-grid/grid-definition/grid-change-fit-content-argument-001.html
[task 2018-04-07T21:29:17.562Z] 21:29:17     INFO - 
[task 2018-04-07T21:29:17.564Z] 21:29:17     INFO - TEST-PASS | /css/css-grid/grid-definition/grid-change-fit-content-argument-001.html | .grid 1 
[task 2018-04-07T21:29:17.565Z] 21:29:17     INFO - TEST-PASS | /css/css-grid/grid-definition/grid-change-fit-content-argument-001.html | .grid 2 
[task 2018-04-07T21:29:17.566Z] 21:29:17     INFO - TEST-UNEXPECTED-FAIL | /css/css-grid/grid-definition/grid-change-fit-content-argument-001.html | .grid 3 - assert_equals:
Flags: needinfo?(mconley)
So I think I know what's going on here.

TL;DR: lol, this is crazy.

Longer:

So it seems as if in certain circumstances (which are strangely easy to hit in automation with WebRender enabled but much harder locally), when we create a new remote <xul:browser>, it's possible for the content of that browser to finish loading and calculating before the parent has told the child how big the <xul:browser> is. That's because the parent has yet to run layout, so as far as its concerned, the <xul:browser> is of size 0,0.

This means that in TabChild::RecvShow, the TabChild is told that its PuppetWidget has no size, and layout is calculated appropriately. In the case of the test I'm currently working with, that means table cells get widths of 0 when they should be closer to 17. And so my test fails.

After the parent finally figures out how big the <xul:browser> is, it updates the TabChild, which then updates the dimensions of everything inside of it.

So I think the reflows that were being caused by the StatusPanel when updating browser status were keeping these tests passing by forcing sync layout in the parent and ensuring that the TabChild knew the right widget dimensions in time for the test to run.

Hilarious!
(Assignee)

Updated

a year ago
Depends on: 1454493
Thanks for the back-out - it looks like landing this would actually open us to potential web breakage because of the increased risk of web content opened in new tabs thinking that the viewport is too small early in their lifetime.

Hoping to fix that in bug 1454493.
Flags: needinfo?(mconley)
(Assignee)

Updated

a year ago
Duplicate of this bug: 1454493
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Updated

a year ago
Attachment #8969658 - Flags: review?(nika)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 37

a year ago
mozreview-review
Comment on attachment 8969658 [details]
Bug 1358712 - Force the frameloader to go through layout when content causes a TabParent to be created.

https://reviewboard.mozilla.org/r/238454/#review244250

::: dom/ipc/ContentParent.cpp:4744
(Diff revision 2)
>          aNewTabParent = frameLoader->GetTabParent();
> +        // At this point, it's possible the inserted frameloader hasn't gone through
> +        // layout yet. To ensure that the dimensions that we send down when telling the
> +        // frameloader to display will be correct (instead of falling back to a 10x10
> +        // default), we force layout if necessary to get the most up-to-date dimensions.
> +        TabParent::GetFrom(aNewTabParent)->ForceFrameLoaderLayoutIfNecessary();

Is there any chance you could add a mention of this bug to the comment so people can get context if they need it in the future?

::: dom/ipc/TabParent.h:146
(Diff revision 2)
>    void AddWindowListeners();
>  
> +  // Used when content is causing a TabParent to be created, and
> +  // needs to try forcing layout to flush in order to get accurate
> +  // dimensions for the content area.
> +  void ForceFrameLoaderLayoutIfNecessary();

This feels to me like it makes more sense on the nsFrameLoader, rather than here on the TabParent -- especially after Fission, when a single FrameLoader may have many TabParents.
Attachment #8969658 - Flags: review?(nika) → review+
(Assignee)

Comment 38

a year ago
mozreview-review-reply
Comment on attachment 8969658 [details]
Bug 1358712 - Force the frameloader to go through layout when content causes a TabParent to be created.

https://reviewboard.mozilla.org/r/238454/#review244250

> Is there any chance you could add a mention of this bug to the comment so people can get context if they need it in the future?

Sure thing.
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 41

a year ago
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/52471746531a
Force the frameloader to go through layout when content causes a TabParent to be created. r=Nika
https://hg.mozilla.org/integration/autoland/rev/a7fc392e48ab
Get rid of synchronous layout flushes when calculating where to put the StatusPanel. r=Felipe

Comment 42

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/52471746531a
https://hg.mozilla.org/mozilla-central/rev/a7fc392e48ab
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 61
Can you please provide some steps to reproduce this issue? Thank you!
Flags: needinfo?(mconley)
(In reply to Cristian Comorasu [:CristiComo] from comment #43)
> Can you please provide some steps to reproduce this issue? Thank you!

Hi CristiComo,

STR:

1. Install Oh no! Reflow! (Instructions: https://mikeconley.github.io/ohnoreflow/)
2. Open the add-on panel, and set the threshold to 0, and click "Not recording. Click to enable." to start recording.
3. Hover any link on any webpage such that the status panel shows up in the bottom corner.

ER:

The add-on should not report a reflow occurred when hovering a link. It will increment a counter in a badge in the add-on button when it observes a reflow (to see some example reflows, type in the URL bar - we reflow like crazy in there).

AR:

We reflow when hovering links.



The AR are true without the patch. With this patch, we successfully avoid reflowing.

Hopefully that's enough!
Flags: needinfo?(mconley)
I can confirm the browser respects the ER from comment 44. I tested using Fx 61.0b6 Developer Edition.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Blocks: 1509489
You need to log in before you can comment on or make changes to this bug.