https://foofighters.com is especially slow in Firefox, compared to Chrome

RESOLVED FIXED in Firefox 52

Status

()

Core
DOM
RESOLVED FIXED
9 months ago
7 months ago

People

(Reporter: julienw, Assigned: bkelly)

Tracking

({regression})

unspecified
mozilla54
regression
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify -

Firefox Tracking Flags

(firefox51 unaffected, firefox52+ fixed, firefox-esr52- fixed, firefox53 fixed, firefox54 fixed)

Details

Attachments

(3 attachments)

(Reporter)

Description

9 months ago
STR:
1. Go to https://foofighters.com
2. Wait

=> The page implements a departure-like table. But this is a lot slower in Firefox than in Chrome to reach the end. When I say "a lot slower" it's really a 1:30 minute compared to 10 seconds. On Android this is several minutes even.

This is a profile with latest Nightly: https://perfht.ml/2muvkHK

I noticed we're especially slow at painting, with paints that can last up to 100ms on my computer. So I enabled "paint flashing" both in Chrome and Firefox. In Firefox we repaint the full table a lot, while in Chrome this never happens.
(Reporter)

Comment 1

9 months ago
Created attachment 8841461 [details]
foofighters.com.zip

Here is the saved page. It's not exactly the same (some images are missing) but the same issue appears.
(Reporter)

Comment 2

9 months ago
Mmm likely more Layout given the issue seems to be with the fact we're painting the full table.
Component: Graphics → Layout
Depends on: 929484
(Assignee)

Comment 3

9 months ago
FF51.0.1 does not show this behavior.  I think it might be due to bug 1300659.

If you look at a performance timeline for the site on FF51.0.1 you can see 25 consecutive setInterval() callbacks followed by ~50 requestAnimationFrame() callbacks.

In FF54 instead it fires a single setInterval() callback and then two requestAnimationFrame() callbacks, style, layout, and paint before the next setInterval() callback is executed.
Blocks: 1300659
(Assignee)

Comment 4

9 months ago
I'm a bit conflicted about what to do here.  Normally I think we want to prioritize painting over running timer callbacks.  In this case, though, painting is so much slower than the other work its basically saturating the main thread.  The site is depending on timers getting prioritized higher to force more work through.
Flags: needinfo?(bkelly)
(Assignee)

Comment 5

9 months ago
Obviously the site could avoid this situation by measuring how fast its running and flipping the correct number of letters every time its timer fires.  Or running a pure requestAnimationFrame() for its animation instead of mixing setInterval/requestAnimationFrame.  That would cause our slow painting to cause them to lose some animation steps (multiple letters flip at a time), but the overall animation would run in the same total time.
(Assignee)

Updated

9 months ago
See Also: → bug 1322328
(Assignee)

Comment 6

9 months ago
BTW, on my nexus 5x this site works a lot better in FF53 aurora compared to FF51 release.
(Assignee)

Comment 7

9 months ago
[Tracking Requested - why for this release]:

This is a user visible regression in sites using this animation technique.  I spoke with Boris about it and he thinks we should uplift a mitigation before we go to release next week.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
status-firefox51: --- → unaffected
status-firefox52: --- → affected
status-firefox53: --- → affected
status-firefox54: --- → affected
status-firefox-esr52: --- → affected
tracking-firefox52: --- → ?
tracking-firefox-esr52: --- → ?
Flags: needinfo?(bkelly)
Keywords: regression
(Assignee)

Comment 8

9 months ago
I have a patch that allows us to process more timer callbacks in RunTimeout().  Some results of timing how long it takes the site to get to the end of its animation:

FF 51.0.1 baseline:

no timer yielding = 10 seconds to complete

FF54 with different cap values:

1 allowed callback  = 33 seconds to complete
2 allowed callbacks = 16 seconds to complete
3 allowed callbacks = 15 seconds to complete
4 allowed callbacks = 14 seconds to complete
5 allowed callbacks = 14 seconds to complete

The difference between 10 seconds and 14 seconds is likely due to the yielding caused by ThrottledEventQueue.  Unfortunately, I can't easily modify that without completely disabling it.

With these values we still avoid jank on this test case:

https://people-mozilla.org/~bkelly/timer-flood/index.html

I'd like to move forward with this change even though it doesn't exactly match FF51 release behavior.  The 4 second difference here can be addressed by speeding up our layout/paint.
(Assignee)

Comment 9

9 months ago
Created attachment 8841632 [details] [diff] [review]
Run a limited number of timer callbacks in the same event loop runnable. r= smaug

Olli, this site has a significant animation slowdown due to my timer yielding changes.  This is occurring because the paints it triggers last 40ms to 60ms which means there is always a refresh driver runnable waiting to execute after a timer runs.  So it paints after every timer callback instead of coallescing changes between paints.

This patch mitigates the issue by allowing us to run a limited number of timers in the same runnable.  The limit defaults to 5, but is configurable by a pref.  (This will let us update the value in release if necessary without a point release.)

Long term we should make RunTimeout() do something like this based on a time budget instead of fixed limit.  Its hard to do, though, since RunTimeout() wants to know the number of timers up-front before any timers are invoked.  I don't think we could uplift such changes.

There is still some difference with release even with these patches due to the ThrottledEventQueue, but its greatly reduced.  I'd like to go with this patch as a compromise solution as it still allows us to survive timer floods, etc.
Attachment #8841632 - Flags: review?(bugs)

Updated

9 months ago
Attachment #8841632 - Flags: review?(bugs) → review+

Comment 10

9 months ago
That patch is a hack so we need to figure out how this all should work in different cases:
- How should timers be handled when refresh driver tick takes too much time,
- what the behavior should be when compositing is slow (I guess just run as many timers as possible)
- what if the next refresh driver tick is postponed too much because of code running between ticks.
(Assignee)

Comment 11

9 months ago
(In reply to Olli Pettay [:smaug] from comment #10)
> That patch is a hack so we need to figure out how this all should work in
> different cases:
> - How should timers be handled when refresh driver tick takes too much time,
> - what the behavior should be when compositing is slow (I guess just run as
> many timers as possible)
> - what if the next refresh driver tick is postponed too much because of code
> running between ticks.

What is a hack or not is a bit subjective when none of that is really spec'd.  Realistically sites should not expect precise timer behavior if they are saturating the main thread with work.

But, yes a time budget or something would be better.  I'll file a follow-up.
(Assignee)

Comment 12

9 months ago
Created attachment 8841760 [details] [diff] [review]
FF52 uplift patch

Verified this builds and works in FF52 locally.

Comment 13

9 months ago
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1a939a64f4b
Run a limited number of timer callbacks in the same event loop runnable. r=smaug
(Assignee)

Comment 14

9 months ago
Comment on attachment 8841632 [details] [diff] [review]
Run a limited number of timer callbacks in the same event loop runnable. r= smaug

Approval Request Comment
[Feature/Bug causing the regression]: Timer changes in bug 1300659.
[User impact if declined]: Some animations that depend on flooding the main thread run slower, although with fewer dropped frames.  This patch is a mitigation to allow a limited number of timers to coalesce while still preventing floods from kill fps.
[Is this code covered by automated tests?]: We have tests, but not for this specific condition.  Its somewhat timing dependent.
[Has the fix been verified in Nightly?]: I verified it locally in nightly and beta.  Has not merged to m-c yet, though.
[Needs manual test from QE? If yes, steps to reproduce]:  No
[List of other uplifts needed for the feature/fix]: Separate mozilla-beta patch attached to this bug.
[Is the change risky?]: Minimal risk
[Why is the change risky/not risky?]: Its changing a conditional in a small controlled way.  In addition, the patch includes a new preference to allow us to tweak the value in the future without requiring code changes.
[String changes made/needed]: None
Attachment #8841632 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 15

9 months ago
Comment on attachment 8841760 [details] [diff] [review]
FF52 uplift patch

See comment 14.  This is a separate patch for FF52 branches.
Attachment #8841760 - Flags: approval-mozilla-esr52?
Attachment #8841760 - Flags: approval-mozilla-beta?

Comment 16

9 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/e1a939a64f4b
Status: ASSIGNED → RESOLVED
Last Resolved: 9 months ago
status-firefox54: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment on attachment 8841632 [details] [diff] [review]
Run a limited number of timer callbacks in the same event loop runnable. r= smaug

work around an issue with some animations and timers, aurora53+
Attachment #8841632 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8841760 [details] [diff] [review]
FF52 uplift patch

work around an issue with animations and timers, beta/release 52+

This should be in 52 rc2
Attachment #8841760 - Flags: approval-mozilla-release+
Attachment #8841760 - Flags: approval-mozilla-esr52?
Attachment #8841760 - Flags: approval-mozilla-beta?
Attachment #8841760 - Flags: approval-mozilla-beta+
(Assignee)

Comment 19

9 months ago
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/c269adcb4e01ae92e215efe6232a5ab28a9c3949
status-firefox53: affected → fixed
(Assignee)

Comment 20

9 months ago
remote:   https://hg.mozilla.org/releases/mozilla-beta/rev/17d42c8d8f209c6217826a790c3efa1d1f057c19

remote:   https://hg.mozilla.org/releases/mozilla-release/rev/c9cd431edad441c7340a4d89daf30620a6e992e7
status-firefox52: affected → fixed
https://hg.mozilla.org/releases/mozilla-esr52/rev/17d42c8d8f20
status-firefox-esr52: affected → fixed
(Assignee)

Updated

9 months ago
Component: Layout → DOM
(Assignee)

Updated

9 months ago
Depends on: 1343877
(Assignee)

Updated

9 months ago
Depends on: 1343895
(Assignee)

Updated

9 months ago
Blocks: 1343912
Setting qe-verify- based on Ben's assessment on manual testing needs (see Comment 14).
Flags: qe-verify-
tracking-firefox52: ? → +
tracking-firefox-esr52: ? → -
I see talos improvements on aurora when this landed there:
== Change summary for alert #5291 (as of February 28 2017 14:11 UTC) ==

Improvements:

 11%  a11yr summary linux64 pgo e10s         461.2 -> 412.63
 10%  a11yr summary linux64 opt e10s         461.38 -> 413.4
  9%  a11yr summary osx-10-10 opt e10s       463.68 -> 420.38
  8%  a11yr summary windows7-32 pgo e10s     522.12 -> 479.32
  8%  a11yr summary windows8-64 pgo e10s     482.06 -> 442.58

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=5291
and it looks like tps improvements on trunk:
== Change summary for alert #5271 (as of February 27 2017 22:33 UTC) ==

Improvements:

  3%  tps summary linux64 opt          48.66 -> 47.39
  3%  tps summary windows8-64 pgo      50.38 -> 49.1

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=5271
(Assignee)

Comment 25

9 months ago
Joel, I still don't fully buy the talos changes based on the analysis I did in bug 1315895 comment 12.
(Assignee)

Updated

9 months ago
See Also: → bug 1315895
we see a11y improvements on beta as well, cool :)
did this not make it into the 54 merge to aurora last week?  I see an improvement, then a regression:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=%5Bmozilla-aurora,054905c426e16aca2e90a056152d32e9f9616511,1,1%5D&series=%5Bautoland,054905c426e16aca2e90a056152d32e9f9616511,0,1%5D&series=%5Bmozilla-inbound,054905c426e16aca2e90a056152d32e9f9616511,1,1%5D
Flags: needinfo?(bkelly)
(Assignee)

Comment 28

8 months ago
This landed two weeks, well before the merge.  See comment 16.  Note, I also tweaked this algorithm in bug 1343877.

However, as I tried to explain in comment 25, I still think these talos changes are a bit of a red herring.
Flags: needinfo?(bkelly)
Compared to Google Chrome on same website(https://foofighters.com), the performance seems still behind. Do we have any todos remaining in this bug ?
Flags: needinfo?(bkelly)
(Assignee)

Comment 30

8 months ago
We can speed up layout/rendering like in bug 929484.  As I mentioned in comment 9, though, my changes were just a mitigation and did not completely bring this site to parity.  I wrote more about the tradeoffs involved here in:

https://blog.wanderview.com/blog/2017/03/13/firefox-52-settimeout-changes/
Flags: needinfo?(bkelly)
(Assignee)

Comment 31

7 months ago
It seems my mitigation landed here is no longer effective for some reason.  The mechanism is still working, though, because I can see a change on this demo when modify the pref:

https://mozdevs.github.io/servo-experiments/experiments/tiles/

I also tested with the fix from bug 929484 and it didn't help either.

We probably need to re-profile this once bug 929484 hits nightly.
(Assignee)

Updated

7 months ago
Depends on: 1362412
(Assignee)

Updated

7 months ago
No longer blocks: 1343912
You need to log in before you can comment on or make changes to this bug.