Open Bug 1425040 Opened 7 years ago Updated 8 months ago

Firefox is very slow entering/leaving Twitter “lightbox” effects when the DOM is large

Categories

(Core :: CSS Parsing and Computation, defect, P3)

59 Branch
x86_64
Linux
defect

Tracking

()

Performance Impact low
Tracking Status
platform-rel --- ?
firefox59 --- affected

People

(Reporter: jld, Unassigned)

Details

(Keywords: perf, Whiteboard: [platform-rel-Twitter])

Attachments

(1 file, 1 obsolete file)

Attached file ff-twitter-perf.tar (obsolete) —
STR (for me): go to Twitter, expand the infinite scroll enough times (hitting the End key repeatedly should work, but normally this is just the result of scrolling down and reading for a while), and then click on a photo or tweet to enter the “lightbox” mode.

This takes a while, depending on how large the stack of tweets has gotten.  I've seen the tab lock up for over 10 seconds.  Clicking outside the popover to dismiss it is also slow.

Chromium responds to these actions quickly (especially when repeated, but even the first time after loading the page it's much faster), but it's very slow when scrolling the page, so on balance Firefox is the less bad option for now, and whatever they're doing here may not be applicable to Gecko.

I'm attaching some profiles I took — the first two are entering/leaving a lightbox, and the second is the same with a slightly smaller page and the Stylo thread pool included.  It seems to be spending a lot of time thinking about how to restyle/re-layout content that hasn't visibly changed.
Keywords: perf
Whiteboard: [qf]
Priority: -- → P1
Flags: needinfo?(emilio)
Just so I can repro locally more easily, what size of DOM are we talking about?

On nightly with all the tweets I could load after a bunch of "load more", with:

  Array.from(document.querySelectorAll('.js-stream-item')).length

Reports 709, and it's still quite ok (not as fast as with a small dom of course, but...).
Flags: needinfo?(emilio) → needinfo?(jld)
From the profile it seems something is causing us to restyle the whole page, or at least a large amount of it, which looks suspicious...
Is this a local opt build by any chance? I see stuff there that really should get inlined on release builds, like Option::map and such.

Local opt builds use -O1 for Rust code for build speed apparently, and it's not the first report that I've got with people reporting bad styling performance on local opt builds because of that...
Priority: P1 → P3
(Side note: the STR sound a lot like bug 1368070, but that was fixed a while back (and fixed more thoroughly with stylo, and it sounds like stylo is enabled here, so it's unlikely to be the same thing.)

jld: can you see if you can still reproduce with a Nightly? (Per comment 3, it seems you might've just been hitting a known lack-of-optimization in local builds...)
platform-rel: --- → ?
Whiteboard: [qf] → [qf] [platform-rel-Twitter]
It was a local build, and I think I'd heard about the Rust -O1 feature at some point but I'd forgotten about it.  Thanks for the pointer.  Subjectively it's noticeably faster / less irritating with --enable-release, but (having sat down and profiled again) it still seems to block events for about 1s per 100 tweets in the main view to enter or leave thread view.

In one instance with events blocked for 3.7s, it looks like 1.1s is style (~40% of wall time in the parallelized part), 2.2s is C++ with `Reflow` in the method names, painting is negligible, and the rest seems to be content script.
Flags: needinfo?(jld)
This is the profile I described in the last comment; content process #4 is the one that's doing things.
Attachment #8936613 - Attachment is obsolete: true
Daniel, please take a further look at this bug and update the QF whiteboard tag. Thanks!
Flags: needinfo?(dholbert)
Sorry, that shortlink had the wrong process focused. Here's one with the correct process focused, zoomed to the long reflow:
  https://perfht.ml/2nXoPzr

I'll take a closer look this afternoon.
Before the long reflow, there's a long restyle from an "offsetWidth" query (979ms, approximately one second). Here's the profile zoomed to that range (note, you have to scroll down to see Content Process 4 and its style threads):
  https://perfht.ml/2o4LM37

This is the full-page-restyle that emilio mentioned in comment 2 (for an earlier profile).

I'm guessing that the subsequent expensive reflow is a result of this restyle. So the way to fix the slow relayout here is probably to avoid the slow restyle. Hence, reclassifying from Layout component to CSS.

(We're not *reconstructing* frames at least (or not very many, at least -- 3ms with nsCSSFrameConstructor in the stack -- so this isn't likely to be bug 1342220 regressing. There, frame-reconstruction was the expensive bit.)

Calling [qf:p3] for now.  We might want to increase the priority to p1 if this is easy to trigger in release builds as well. (Sounds like it's not as bad there.)  Until that's been demonstrated, this feels medium-priority.
Component: Layout → CSS Parsing and Computation
Flags: needinfo?(dholbert)
Whiteboard: [qf] [platform-rel-Twitter] → [qf:p3][qf:f61][platform-rel-Twitter]
Long restyle doesn't necessary trigger long reflow. We calculate the difference we need for downstream and if there is nothing necessary for reflow, reflow shouldn't happen at all.

So my guess is that some inherited property which requires reflow gets changed on root of a very large subtree. We need to investigate what the property is then we can further discuss what can we do with it.

I barely have an impression that Twitter hides and shows the scrollbar, which may be done via touching overflow property. That is a reset property so we shouldn't do a full page restyle, but IIRC that can definitely cause a large reflow (usually twice on non-Mac platforms, one for without scrollbar, one for with).
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #11)
> Long restyle doesn't necessary trigger long reflow. We calculate the
> difference we need for downstream and if there is nothing necessary for
> reflow, reflow shouldn't happen at all.
> 
> So my guess is that some inherited property which requires reflow gets
> changed on root of a very large subtree. We need to investigate what the
> property is then we can further discuss what can we do with it.
> 
> I barely have an impression that Twitter hides and shows the scrollbar,
> which may be done via touching overflow property. That is a reset property
> so we shouldn't do a full page restyle, but IIRC that can definitely cause a
> large reflow (usually twice on non-Mac platforms, one for without scrollbar,
> one for with).

Actually it's even worse, it can be up to 4 reflows:

 * one without scrollbars
 * one with scrollbars just on the inline axis.
 * one with scrollbars just on the block axis.
 * one with scrollbars on both axes.
Whiteboard: [qf:p3][qf:f61][platform-rel-Twitter] → [qf:p3][platform-rel-Twitter]
Performance Impact: --- → P3
Whiteboard: [qf:p3][platform-rel-Twitter] → [platform-rel-Twitter]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: