LayoutActor.getGrids/node.getElementWithGrid() is the topmost hangs reported on BHR

RESOLVED FIXED in Firefox 63

Status

defect
P2
normal
RESOLVED FIXED
Last year
Last year

People

(Reporter: ochameau, Assigned: bradwerth)

Tracking

(Blocks 1 bug)

unspecified
Firefox 63
Dependency tree / graph

Firefox Tracking Flags

(firefox63 fixed)

Details

Attachments

(4 attachments)

This call:
  https://searchfox.org/mozilla-central/source/devtools/server/actors/layout.js#263
Seems to be the topmost reported hangs on BHR when you filter by "devtools/":
  https://arewesmoothyet.com/?category=all&durationSpec=128_512&mode=explore&payloadID=c8e925752fc94f78af9349665aad14c7&search=devtools%2F&thread=0
  On this particular BHR, for light hangs between 128 and 512ms, it represents at least 34% of the hangs.
(Layout Frame Inspector is the so far unreleased tool for showing layout boxes. This issue is about the Layout side panel in the Inspector.)

Maybe Brad has some ideas for this one?
Component: Layout Frame Inspector → Inspector
Flags: needinfo?(bwerth)
getElementWithGrids() repeatedly calls nsComputedDOMStyle::GetComputedStyle for every element in the DOM. Those calls will flush any pending restyles for the document, which is most likely the slow part. I can change the API to avoid the flush if directed by a parameter flag. This would cause the Layout panel to "miss" any items whose styles were changed to add "display:grid" before the panel was opened.
I've posted a patch which avoids the style flush. I'm not certain how to construct an appropriate testcase to check if this avoids the hangs noted in comment 0. I'll run the mochitests to make sure it doesn't break anything. I suspect it is possible to construct a case where the call does fail to find all the display:grid elements, but I doubt it would be something that any user would encounter in typical use.
Flags: needinfo?(bwerth)
Priority: -- → P2
I believe the patch will help, but I don't know how to check it against the original reported problem in Comment 0. Alex, can you advise me how to confirm this patch?
Flags: needinfo?(poirot.alex)
The only way to confirm the impact on BHR is to land and see how it reacts...

But the best is to be able to come up with an STR to reproduce these hangs, or slow code.
Unfortunately, BHR doesn't come with STR/webpages. We just know that the browser was stuck on such frame.

I tried to reproduced this by using the inspector against this test page:
  https://gridbyexample.com/examples/code/example29.html
I can see getGrids appear on some profile, but just a couple of ms, without the slow calls to getElementWithGrids.

May be you already had a STR that highlights this slow codepath?
If yes, I think that's enough to proceed!
Flags: needinfo?(poirot.alex)
The proposed patch may be ready to go. In my interactive testing with the patch applied, I can't get the grid highlighter to fail in the cases I was worrying about. Specifically, I was able to add "display:grid" to a rule and the Grid Inspector still picked up the change just fine. So since this patch can possibly make the getElementsWithGrid() call faster, and doesn't seem to give incorrect results, I'll start reviews and push for a landing.
Assignee: nobody → bwerth
Attachment #8996114 - Flags: review?(bzbarsky)
Summary: LayoutActor.getGrids/node.getElementWithGrids() is the topmost hangs reported on BHR → LayoutActor.getGrids/node.getElementWithGrid() is the topmost hangs reported on BHR
It would really be good to understand why this is causing problems.  Any time a style flush here would be costly, if we don't flush the next refresh tick will have the same costly flush, right?

Have you tried testing the performance of this code on large display:none subtrees?  I would expect those to be somewhat slow because then you get O(N^2) behavior in depth and whatnot.

> This would cause the Layout panel to "miss" any items whose styles were changed to add "display:grid" before the panel was opened.

No, just the ones changed after the last refresh tick.  Really hard to notice with manual testing.
Flags: needinfo?(bwerth)
Comment on attachment 8996114 [details]
Bug 1479508 Part 1: Move devtools-specific function definitions in Element.webidl into a better-documented partial interface.

https://reviewboard.mozilla.org/r/260366/#review267682

::: devtools/server/actors/layout.js:263
(Diff revision 1)
>      // Root node can be a #document object, which does not support getElementsWithGrid.
>      if (node.nodeType === nodeConstants.DOCUMENT_NODE) {
>        node = node.documentElement;
>      }
>  
> -    const gridElements = node.getElementsWithGrid();
> +    const gridElements = node.getElementsWithGrid(false);

This is the only caller, right?  As in, the API exists just for this callsite?

Why do we want to add the boolean arg instead of just changing the API behavior?

::: dom/base/Element.cpp:1563
(Diff revision 1)
>  {
>    // This helper function is passed to GetElementsByMatching()
>    // to identify elements with styling which will cause them to
>    // generate a nsGridContainerFrame during layout.
> -  auto IsDisplayGrid = [](Element* aElement) -> bool
> +  //
> +  // Depdending on the value of aFlushStyles, we provide one of two versions

"Depending"

::: dom/webidl/Element.webidl:75
(Diff revision 1)
>    [Throws, Pure]
>    HTMLCollection getElementsByTagNameNS(DOMString? namespace, DOMString localName);
>    [Pure]
>    HTMLCollection getElementsByClassName(DOMString classNames);
>    [ChromeOnly, Pure]
> -  sequence<Element> getElementsWithGrid();
> +  sequence<Element> getElementsWithGrid(optional boolean flushStyles = true);

This method really shouldn't be here, among the standardized parts of Element.  It should be on its own partial interface, with documentation....
Attachment #8996114 - Flags: review?(bzbarsky) → review-
In addition to resolving the issues raised in the review, I'll figure out how to make an appropriate performance test for this, and see if this fix has a verifiable impact.
Flags: needinfo?(bwerth)
(In reply to Alexandre Poirot [:ochameau] from comment #0)
> This call:
>  
> https://searchfox.org/mozilla-central/source/devtools/server/actors/layout.
> js#263
> Seems to be the topmost reported hangs on BHR when you filter by "devtools/":
>  
> https://arewesmoothyet.com/
> ?category=all&durationSpec=128_512&mode=explore&payloadID=c8e925752fc94f78af9
> 349665aad14c7&search=devtools%2F&thread=0
>   On this particular BHR, for light hangs between 128 and 512ms, it
> represents at least 34% of the hangs.

Could you give me some steps on how to use the BHR interface to reveal this data? I'm hoping that if I can analyze the BHR data adequately, it will help me understand this problem better.
Flags: needinfo?(poirot.alex)
(In reply to Brad Werth [:bradwerth] from comment #13)
> Could you give me some steps on how to use the BHR interface to reveal this
> data? I'm hoping that if I can analyze the BHR data adequately, it will help
> me understand this problem better.

Ah, I see how to do it now. By expanding the expensive items on the tree, we get to getElementsWithGrid from several nodes, and from there to nsComputedDOMStyle::GetComputedStyle. This call clearly is the source of the hang. ::GetComputedStyleNoFlush would very likely remove that hang, but as pointed out in comment 10, would only serve to push it somewhere else. Outside of the devtools Layout panel, but still hanging somewhere.

Still, though this patch wouldn't eliminate the hang, it may be useful to land it to move the hang out of the devtools. If it's outside of devtools, it might lead to an improved user experience since the devtools would appear more snappily and the hang afterwards might not be perceivable by the user.
Flags: needinfo?(poirot.alex)
> ::GetComputedStyleNoFlush would very likely remove that hang

All the time in the profile under nsComputedDOMStyle::GetComputedStyle is also under GetComputedStyleNoFlush, no?  That is, the flushing is not the problem; it's the actual computing the style.
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #15)
> > ::GetComputedStyleNoFlush would very likely remove that hang
> 
> All the time in the profile under nsComputedDOMStyle::GetComputedStyle is
> also under GetComputedStyleNoFlush, no?  That is, the flushing is not the
> problem; it's the actual computing the style.

Yes. You're right. The style flush (if any) is not the bottleneck -- it's the calculation of the computed style. Thank you for finding that and keeping me from going down the wrong path any further.

Alright, back to the beginning on this. Next steps I can see:

a) Optimize nsComputedDOMStyle::GetComputedStyleNoFlush. Admirable and would provide speedups across many browser workflows, but certainly very difficult if it's possible at all.

b) Modify the grid inspector to remove the UI that shows all elements with grids. This would be a significant change of the workflow for the grid inspector, because it would stop showing the use of grid across the page, and would instead only show the grid properties for the element currently selected in the inspector (if it uses grid). That's actually more consistent with the other inspector panels like the box model which appears immediately beneath the grid inspector. The box model shows the model for the currently selected element; not for all boxes on the page.

c) Change the the call to getGrids to occur asynchronously in such a way that it doesn't hang the content thread.

d) Live with this hang as a cost of having the grid inspector function as designed. The cost is only paid when the grid inspector is opened, or when the page is reflowed while the grid inspector is visible.

Gabe, what do you think of the options?
Flags: needinfo?(gl)
So the thing that's not clear to me is still why this is slow.  The profile shows us actually resolving styles.  We shouldn't need to do that in most cases, I would think.

My hunch is still that what's going on is style resolution over display:none subtrees.  _If_ that is the case, do we really want to look for grids in there?  If not, why don't we stop doing that?  For example skip subtrees rooted at display:none nodes or nodes for which HasServoData() is false.
And to put this into perspective, 2s as reported in comment 1 is _huge_.  On my laptop, getting the computed style for every single element in the single-page HTML spec (which does not have large display:none subtrees) takes about 600ms. 

Now if I style the body on that page as display:none that time goes up to 8000ms.  Which is why I was asking about display:none subtrees...
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #17)
> So the thing that's not clear to me is still why this is slow.  The profile
> shows us actually resolving styles.  We shouldn't need to do that in most
> cases, I would think.
> 
> My hunch is still that what's going on is style resolution over display:none
> subtrees.  _If_ that is the case, do we really want to look for grids in
> there?  If not, why don't we stop doing that?  For example skip subtrees
> rooted at display:none nodes or nodes for which HasServoData() is false.

Ah. I'm a slow learner, but I do learn. I'll try that and see what I can discover.
(C) sounds pretty good as an option to me (In reply to Brad Werth [:bradwerth] from comment #16)
> (In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from
> comment #15)
> > > ::GetComputedStyleNoFlush would very likely remove that hang
> > 
> > All the time in the profile under nsComputedDOMStyle::GetComputedStyle is
> > also under GetComputedStyleNoFlush, no?  That is, the flushing is not the
> > problem; it's the actual computing the style.
> 
> Yes. You're right. The style flush (if any) is not the bottleneck -- it's
> the calculation of the computed style. Thank you for finding that and
> keeping me from going down the wrong path any further.
> 
> Alright, back to the beginning on this. Next steps I can see:
> 
> a) Optimize nsComputedDOMStyle::GetComputedStyleNoFlush. Admirable and would
> provide speedups across many browser workflows, but certainly very difficult
> if it's possible at all.
> 
> b) Modify the grid inspector to remove the UI that shows all elements with
> grids. This would be a significant change of the workflow for the grid
> inspector, because it would stop showing the use of grid across the page,
> and would instead only show the grid properties for the element currently
> selected in the inspector (if it uses grid). That's actually more consistent
> with the other inspector panels like the box model which appears immediately
> beneath the grid inspector. The box model shows the model for the currently
> selected element; not for all boxes on the page.
> 
> c) Change the the call to getGrids to occur asynchronously in such a way
> that it doesn't hang the content thread.
> 
> d) Live with this hang as a cost of having the grid inspector function as
> designed. The cost is only paid when the grid inspector is opened, or when
> the page is reflowed while the grid inspector is visible.
> 
> Gabe, what do you think of the options?

(c) sounds pretty good to me.
Flags: needinfo?(gl)
Attachment #8996114 - Attachment is obsolete: true
Attachment #8996114 - Flags: review?(bzbarsky)
Attachment #8996889 - Flags: review?(bzbarsky)
Attachment #8996890 - Flags: review?(bzbarsky)
Comment on attachment 8996889 [details]
Bug 1479508 Part 2: Change GetElementsWithGrid to use a more conservative traversal that skips subtrees without frames.

https://reviewboard.mozilla.org/r/260876/#review267938

::: dom/base/Element.cpp:1591
(Diff revision 1)
> +
> +      if (aFunc(elem)) {
> +        aElements.AppendElement(elem);
> -    }
> +      }
> +
> +      if (elem->GetPrimaryFrame() && elem->HasServoData()) {

I'm quite sure that if it has a frame it has servo data: we can't create a frame without having style, which is stored in the servo data.

So just checking GetPrimaryFrame here is fine.

::: dom/base/Element.cpp:1599
(Diff revision 1)
> +        cur = cur->GetNextNode(this);
> +        continue;
> +      }
> +    }
> +
> +    // Either this isn't an element, or it has no frame or no servo data.

Please adjust this comment when the HasServoData() check is removed.
Attachment #8996889 - Flags: review?(bzbarsky) → review+
Comment on attachment 8996890 [details]
Bug 1479508 Part 3: Update test expectations.

https://reviewboard.mozilla.org/r/260878/#review267942

This should presumably get merged with part 2, to avoid having a non-passing-tests changeset.

::: dom/base/test/chrome/test_getElementsWithGrid.html:67
(Diff revision 1)
>      {id:"a", message:"'plain' grid container with display:grid"},
>      {id:"b", message:"display:subgrid inside display:grid (to be fixed in Bug 1240834)", todo:true},
>      {id:"c", message:"'plain' grid container with display:inline-grid"},
>      {id:"d", message:"display:subgrid inside display:inline-grid (to be fixed in Bug 1240834)", todo:true},
>      {id:"e", message:"grid container with visibility:hidden"},
> -    {id:"f", message:"grid container inside a display:none element"},
> +    {id:"f", message:"grid container inside a element"},

"an element"

::: dom/base/test/chrome/test_getElementsWithGrid.html:80
(Diff revision 1)
>    // Part 2: Look for all the grid elements starting from a non-root element.
>    let elementsFromNonRoot = document.getElementById("a_non_root_element").getElementsWithGrid();
> -  is(elementsFromNonRoot.length, 1, "Found expected number of elements from non-root element.");
>  
>    let targetsFromNonRoot = [
> -    {id:"f", message:"grid container inside a display:none element (from non-root element)"},
> +    {id:"f", message:"grid container inside a element (from non-root element)"},

"an element"
Attachment #8996890 - Flags: review?(bzbarsky) → review+
Comment on attachment 8996114 [details]
Bug 1479508 Part 1: Move devtools-specific function definitions in Element.webidl into a better-documented partial interface.

https://reviewboard.mozilla.org/r/260366/#review267944

::: dom/webidl/Element.webidl:299
(Diff revision 2)
> +  [ChromeOnly, Pure]
> +  sequence<Grid> getGridFragments();
> +  
> +  /**
> +   * Returns a sequence of all the descendent elements of this element
> +   * that have display:grid or display:inline-grid style.

and frames, right?
Attachment #8996114 - Flags: review+
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #24)

> I'm quite sure that if it has a frame it has servo data: we can't create a
> frame without having style, which is stored in the servo data.
> 
> So just checking GetPrimaryFrame here is fine.

Hmmm... does that mean the code could just do a GetPrimaryFrame()->isGridContainerFrame() and avoid the call to getComputedStyle() altogether? If that's acceptable, that would be a big performance improvement.
Flags: needinfo?(bzbarsky)
That seems pretty plausible, yes...  The idea is to show people the actual grids they have on their page, not the things that could be grids if they weren't hidden by something, right?
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #28)
> That seems pretty plausible, yes...  The idea is to show people the actual
> grids they have on their page, not the things that could be grids if they
> weren't hidden by something, right?

Yes, that's the only value in the API as it is used today. Right. I'll rework the code to simplify down to this assumption and ask you for one more round of reviews. Thank you for your patience.
Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/09d7d2f8fe4f
Part 1: Move devtools-specific function definitions in Element.webidl into a better-documented partial interface. r=bz
https://hg.mozilla.org/integration/autoland/rev/c42fa0ccd335
Part 2: Change GetElementsWithGrid to use a more conservative traversal that skips subtrees without frames. r=bz
https://hg.mozilla.org/integration/autoland/rev/cba3360692b8
Part 3: Update test expectations. r=bz
Backed out 3 changesets (bug 1479508) for devtools failures at devtools/client/inspector/grids/test/browser_grids_no_fragments.js

Backout: https://hg.mozilla.org/integration/autoland/rev/5cb6d926e60bc6e56da2971f167a965983354d5c

Failure push: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=cba3360692b8089e5850d1c111a8f57697fa554b

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=191960164&repo=autoland&lineNumber=2849

14:44:33     INFO - TEST-START | devtools/client/inspector/grids/test/browser_grids_no_fragments.js
14:44:34     INFO - GECKO(805) | console.log: "[DISPATCH]" "{\n  \"type\": \"UPDATE_GRIDS\",\n  \"grids\": []\n}"
14:44:34     INFO - GECKO(805) | console.log: "[DISPATCH]" "{\n  \"type\": \"CLEAR_FLEXBOX\"\n}"
14:44:34     INFO - GECKO(805) | console.log: "[DISPATCH]" "{\n  \"type\": \"UPDATE_OFFSET_PARENT\",\n  \"offsetParent\": null\n}"
14:44:35     INFO - GECKO(805) | console.log: "[DISPATCH]" "{\n  \"type\": \"UPDATE_LAYOUT\",\n  \"layout\": {\n    \"width\": 1264,\n    \"height\": 0,\n    \"position\": \"static\",\n    \"top\": \"auto\",\n    \"right\": \"auto\",\n    \"bottom\": \"auto\",\n    \"left\": \"auto\",\n    \"margin-top\": \"8px\",\n    \"margin-right\": \"8px\",\n    \"margin-bottom\": \"8px\",\n    \"margin-left\": \"8px\",\n    \"padding-top\": \"0px\",\n    \"padding-right\": \"0px\",\n    \"padding-bottom\": \"0px\",\n    \"padding-left\": \"0px\",\n    \"border-top-width\": \"0px\",\n    \"border-right-width\": \"0px\",\n    \"border-bottom-width\": \"0px\",\n    \"border-left-width\": \"0px\",\n    \"z-index\": \"auto\",\n    \"box-sizing\": \"content-box\",\n    \"display\": \"block\",\n    \"float\": \"none\",\n    \"line-height\": \"19.2px\",\n    \"autoMargins\": {},\n    \"from\": \"server1.conn22.child1/pagestyle29\",\n    \"isPositionEditable\": false\n  }\n}"
14:44:35     INFO - TEST-INFO | started process screencapture
14:44:35     INFO - TEST-INFO | screencapture: exit 0
14:44:35     INFO - Buffered messages logged at 14:44:33
14:44:35     INFO - Entering test bound 
14:44:35     INFO - Adding a new tab with URL: data:text/html;charset=utf-8,%0A%20%20%3Cstyle%20type%3D'text%2Fcss'%3E%0A%20%20%20%20%23wrapper%20%7B%20display%3A%20none%3B%20%7D%0A%20%20%20%20.grid%20%7B%20display%3A%20grid%3B%20grid-template-columns%3A%201fr%201fr%3B%20%7D%0A%20%20%3C%2Fstyle%3E%0A%20%20%3Cdiv%20id%3D%22wrapper%22%3E%0A%20%20%20%20%3Cdiv%20class%3D%22grid%22%3E%0A%20%20%20%20%20%20%3Cdiv%3Ecell1%3C%2Fdiv%3E%0A%20%20%20%20%20%20%3Cdiv%3Ecell2%3C%2Fdiv%3E%0A%20%20%20%20%3C%2Fdiv%3E%0A%20%20%20%20%3Cdiv%20class%3D%22grid%22%3E%0A%20%20%20%20%20%20%3Cdiv%3Ecell1%3C%2Fdiv%3E%0A%20%20%20%20%20%20%3Cdiv%3Ecell2%3C%2Fdiv%3E%0A%20%20%20%20%3C%2Fdiv%3E%0A%20%20%3C%2Fdiv%3E%0A
14:44:35     INFO - Buffered messages logged at 14:44:34
14:44:35     INFO - Tab added and finished loading
14:44:35     INFO - Opening the inspector
14:44:35     INFO - Opening the toolbox
14:44:35     INFO - Toolbox opened and focused
14:44:35     INFO - Selecting the layoutview sidebar
14:44:35     INFO - Buffered messages finished
14:44:35     INFO - TEST-UNEXPECTED-FAIL | devtools/client/inspector/grids/test/browser_grids_no_fragments.js | Uncaught exception - at chrome://mochitests/content/browser/devtools/client/inspector/grids/test/browser_grids_no_fragments.js:34 - TypeError: gridList is null
14:44:35     INFO - Stack trace:
14:44:35     INFO - @chrome://mochitests/content/browser/devtools/client/inspector/grids/test/browser_grids_no_fragments.js:34:3
14:44:35     INFO - Async*Tester_execTest/<@chrome://mochikit/content/browser-test.js:1103:34
14:44:35     INFO - async*Tester_execTest@chrome://mochikit/content/browser-test.js:1094:16
14:44:35     INFO - nextTest/<@chrome://mochikit/content/browser-test.js:996:9
14:44:35     INFO - SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<@chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:795:59
14:44:35     INFO - Leaving test bound 
14:44:35     INFO - Removing tab.
14:44:35     INFO - Waiting for event: 'TabClose' on [object XULElement].
14:44:35     INFO - Got event: 'TabClose' on [object XULElement].
14:44:35     INFO - Tab removed and finished closing
14:44:35     INFO - GECKO(805) | MEMORY STAT | vsize 4636MB | residentFast 575MB | heapAllocated 248MB
Flags: needinfo?(bwerth)
Attachment #8997625 - Flags: review?(gl)
The proposed Part 4 of the patch will remove the problematic test. I don't think there's a reason to "fix" the test, since it is designed to check correct behavior in a situation that can no longer occur.
Flags: needinfo?(bwerth)
Comment on attachment 8997625 [details]
Bug 1479508 Part 4: Remove a test that only exists to check grids in display:none subtrees -- something no longer targeted by the UI.

https://reviewboard.mozilla.org/r/261328/#review268488
Attachment #8997625 - Flags: review?(gl) → review+
Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/22e1e1559e39
Part 1: Move devtools-specific function definitions in Element.webidl into a better-documented partial interface. r=bz
https://hg.mozilla.org/integration/autoland/rev/905043dccccb
Part 2: Change GetElementsWithGrid to use a more conservative traversal that skips subtrees without frames. r=bz
https://hg.mozilla.org/integration/autoland/rev/4499e3d8db74
Part 3: Update test expectations. r=bz
https://hg.mozilla.org/integration/autoland/rev/89b76e834c7d
Part 4: Remove a test that only exists to check grids in display:none subtrees -- something no longer targeted by the UI. r=gl
You need to log in before you can comment on or make changes to this bug.