stylo: Huge hang on Facebook, spending most of my time in Stylo style recalculations

NEW
Assigned to

Status

()

Core
CSS Parsing and Computation
P1
normal
10 days ago
4 days ago

People

(Reporter: mconley, Assigned: emilio)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:p1][Stylo])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(2 attachments)

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:56.0) Gecko/20100101 Firefox/56.0

Build ID: 20170714100307

I have layout.css.servo.enabled set to true, and about:support says:

"Stylo 	true (enabled by user)"


Unfortunately, I don't have great STR for this one. I just remember being on Facebook and clicking on something[1], and then the main thread in the content process hanging for a very long time. I eventually had to kill the JS execution (via the content process hang notification bar).

Luckily I was profiling at the time. I got the last 3 seconds or so of the hang in the content process:

https://perfht.ml/2tbDYBR

We're spending an awful long time to style recalculations in our Stylo code here. We seem to be entering it via JS - probably in a loop.

BenWa at Facebook was able to decode some of the obfuscated JS that's running here. I'm needinfo'ing him to comment further if he's able regarding what code is actually running here.

This profile was initially examined during The Joy of Profiling: Episode 6[2].

[1] Could have been anything. Possible the button to access my messages. I really can't remember.
[2]: https://air.mozilla.org/the-joy-of-profiling-episode-6/
Flags: needinfo?(b56girard)
The time is spent in a function checking visibility. De-minified code is:
function isVisible(element: HTMLElement): boolean {
  // Look upward in the DOM until we either reach the document or
  // find something hidden.
  let elem = element;
  while (
    elem &&
    elem !== document &&
    Style.get(elem, 'visibility') != 'hidden' &&
    Style.get(elem, 'display') != 'none') {
    elem = elem.parentNode;
  }

  // If we reached the document, the element is visible.
  return elem === document;
}


Here we assume that once we've flushed then subsequent style queries will be fast. Guessing from the symbol names it looks like Stylo is doing the work lazily for each element and so we have to reflush lazily each time we walk up to the parent.

Looks like the you were either interacting with the Status composer or the a chat message tab.
Flags: needinfo?(b56girard)
Hey Manish - anything actionable in here?
Flags: needinfo?(manishearth)
Whiteboard: [qf]
Emilio/heycam understand restyles better
Flags: needinfo?(manishearth)
Flags: needinfo?(emilio+bugs)
Flags: needinfo?(cam)
(Assignee)

Comment 4

10 days ago
(In reply to Benoit Girard (:BenWa) from comment #1)
> The time is spent in a function checking visibility. De-minified code is:
> function isVisible(element: HTMLElement): boolean {
>   // Look upward in the DOM until we either reach the document or
>   // find something hidden.
>   let elem = element;
>   while (
>     elem &&
>     elem !== document &&
>     Style.get(elem, 'visibility') != 'hidden' &&
>     Style.get(elem, 'display') != 'none') {
>     elem = elem.parentNode;
>   }
> 
>   // If we reached the document, the element is visible.
>   return elem === document;
> }
> 
> 
> Here we assume that once we've flushed then subsequent style queries will be
> fast. Guessing from the symbol names it looks like Stylo is doing the work
> lazily for each element and so we have to reflush lazily each time we walk
> up to the parent.
> 
> Looks like the you were either interacting with the Status composer or the a
> chat message tab.

Yes... We should cache somewhere the style we resolve for the parent, otherwise this is going to be super-slow... Leaving ni? for now.
(Assignee)

Comment 5

10 days ago
Created attachment 8886674 [details]
testcase

FWIW this is only a perf problem if the element is actually non-visible, because we optimize out the styles in display: none subtrees.

Here's a test-case, that shows the problem: The first isVisible call only takes a few milliseconds. The second one is still running...
This doesn't appear to be a Stylo-only bug, btw. I just tested with Stylo disabled, and we get the same hang with the old style stuff too.
Comment hidden (mozreview-request)
(Assignee)

Comment 8

10 days ago
There's a proof-of-concept patch that fixes the issue... It has some drive-by cleanup, and it breaks invariants, in general, so even though I think it works as-is, I need to prove it and document the new invariants, if any :-)
Assignee: nobody → emilio+bugs
Flags: needinfo?(emilio+bugs)
(Assignee)

Comment 9

10 days ago
Comment on attachment 8886678 [details]
Bug 1381071: Cache lazily-resolved styles in the element data.

Worth having some feedback from Cam when I wake up tomorrow morning... :)

Cam, wdyt about this patch? I think this is the more straight-forward and elegant way to implement a fix for this. That being said, we could also store it in an out-of-band hashtable, or something...

We'd need to resolve eager pseudo-element styles to be correct too, I guess (or add a flag, etc), but we were doing that anyway before my refactor at bug 1379505...
Attachment #8886678 - Flags: feedback?(cam)
(Assignee)

Comment 10

8 days ago
So, I looked at this today again, and it's tricky to get working with our XBL setup (sigh), so I'm probably not going to get it done today.

That being said, the reason why you found this hang on Stylo and not on Gecko is probably https://github.com/servo/servo/pull/17751.

With that fixed, Stylo should be pretty much on par with Gecko. But I still think it's worth caching the style (because the testcase in the bug is really embarrasing).
Blocks: 1375906
Priority: -- → P1
Whiteboard: [qf] → [qf][Stylo]
Hmm, so bug 1366964 should cause us to do half the style resolution that we are doing here (since we won't re-resolve style when looking up the display value after we look up the visibility value), but that's not enough.

Storing the style we compute for elements in display:none subtrees (and their ancestors) when we call getComputedStyle on them sounds OK to me.  If we are restyling and encounter the root of a display:none subtree, will we clear element data for its descendants?  Or will we continue to restyle them?  Seems easier not to restyle them.  Though if we do (and we'd need to ensure we don't generate damage for them), we might be able to remove the inexact cached style context invalidation in nsComputedDOMStyle that bug 1366964 added, at the expense of potentially wastefully styling these elements if an nsComputedDOMStyle object no longer needs them.
Comment on attachment 8886678 [details]
Bug 1381071: Cache lazily-resolved styles in the element data.

f+ though I didn't check whether we're doing the right things with dirty bits etc.
Flags: needinfo?(cam)
Attachment #8886678 - Flags: feedback?(cam) → feedback+
Hm, at our stylo meetup in september, I remember concluding that it was really hard to keep track of whether styles in display:none subtrees are up to date. If you leave them cached, you either need to continue restyling them on subsequent restyles (which is potentially wasteful), or not restyle them, which can cause them to get stale.

IMO the easiest solution is a Gecko-side hashtable from element->stylecontext for elements in display:none subtrees. When we restyle, we clear the hashtable.
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #13)
> Hm, at our stylo meetup in september, I remember concluding that it was
> really hard to keep track of whether styles in display:none subtrees are up
> to date. If you leave them cached, you either need to continue restyling
> them on subsequent restyles (which is potentially wasteful), or not restyle
> them, which can cause them to get stale.

If we cleared them out if we would have to enter the display:none subtree to restyle, that should avoid most of the tracking trickiness.  It seems like that should work with the invalidation propagation too, as we'd propagate our dirty bit up out of the styled element in the display:none subtree, and once we come to restyle, we'll just drop that subtree's element data.

> IMO the easiest solution is a Gecko-side hashtable from
> element->stylecontext for elements in display:none subtrees. When we
> restyle, we clear the hashtable.

That might be easier, yes.
(Assignee)

Comment 15

8 days ago
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #13)
> Hm, at our stylo meetup in september, I remember concluding that it was
> really hard to keep track of whether styles in display:none subtrees are up
> to date. If you leave them cached, you either need to continue restyling
> them on subsequent restyles (which is potentially wasteful), or not restyle
> them, which can cause them to get stale.

It is not, you just need to always clear during the next restyle. And we mostly do that, we only need to clear them also when we don't recompute the style of the display: none element. WebKit and Blink also do this fwiw.

> IMO the easiest solution is a Gecko-side hashtable from
> element->stylecontext for elements in display:none subtrees. When we
> restyle, we clear the hashtable.

That makes us move all the "does the parent has style" logic to the parent context. Which is fine, but a lot harder.

Updated

7 days ago
Summary: Huge hang on Facebook, spending most of my time in Stylo style recalculations → stylo: Huge hang on Facebook, spending most of my time in Stylo style recalculations
(In reply to Emilio Cobos Álvarez [:emilio] from comment #15)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #13)
> > Hm, at our stylo meetup in september, I remember concluding that it was
> > really hard to keep track of whether styles in display:none subtrees are up
> > to date. If you leave them cached, you either need to continue restyling
> > them on subsequent restyles (which is potentially wasteful), or not restyle
> > them, which can cause them to get stale.
> 
> It is not, you just need to always clear during the next restyle. And we
> mostly do that, we only need to clear them also when we don't recompute the
> style of the display: none element.

How would we find and identify them? The naive approach would be to propagate the dirty descendant flag up from the display:none root after caching display:none styles in its descendants, but that would force a subsequent restyle when none is needed.

> 
> > IMO the easiest solution is a Gecko-side hashtable from
> > element->stylecontext for elements in display:none subtrees. When we
> > restyle, we clear the hashtable.
> 
> That makes us move all the "does the parent has style" logic to the parent
> context. Which is fine, but a lot harder.

I'm not sure what "the parent context" refers to here.

Are you referring to reusing existing styles when recursively resolving styles up the tree (resolve_style_internal etc)? Seems like that code could check the hashmap.

I'm certainly not opposed to caching on the elements if we have a clear and simple strategy to know when styles are up to date, which doesn't complicate the rest of the traversal invariants. I'm just not sure what that is yet.
Emilio and I discussed this a bit, and I thought about it more afterwards. I think we can make the in-tree thing work if we apply the following simple rule:

Every time a display:none root is restyled, we eagerly drop all style data from descendants.

Things that this handles:
* The display:block -> display:none case.
* the display:none -> display:block case, where we may have some unstyled elements deep in the subtree that we wouldn't otherwise be able to find. This is preferable to RestyleHint::subtree because that would cause us to compute restyle damage we don't need.
* The display:none -> display:none case (with either rematching or recascading), which may invalidate styles in the subtree.
Whiteboard: [qf][Stylo] → [qf:p1][Stylo]
You need to log in before you can comment on or make changes to this bug.