Closed Bug 1877531 Opened 7 months ago Closed 7 months ago

High CPU usage on page with animation and ResizeObserver

Categories

(Core :: Layout, defect, P2)

Firefox 122
defect

Tracking

()

VERIFIED FIXED
124 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox122 + verified
firefox123 + verified
firefox124 + verified

People

(Reporter: maxbadryzlov, Assigned: emilio)

References

(Regression)

Details

(Keywords: perf-alert, regression)

Attachments

(5 files, 1 obsolete file)

Steps to reproduce:

Open a page with at least one animated element and one observed by ResizeObserver:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Document</title>
    <style>
      .animated {
        position: absolute;
        width: 12px;
        height: 12px;
        background-color: #000;
        animation: rotating-animation 0.5s infinite linear;
      }
      @keyframes rotating-animation {
        0% {
          opacity: 0;
        }
        100% {
          opacity: 1;
        }
      }
    </style>
  </head>
  <body tabindex="-1">
    <div class="animated"></div>
    <div id="observable"></div>

    <script>
      const resizeObserver = new ResizeObserver(() => {})
      resizeObserver.observe(document.getElementById('observable'))
    </script>
  </body>
</html>

Actual results:

After Firefox v122 CPU usage increased to ~9%-12% on tested (above) page. On real pages with more complicated layout, this value can go up to ~30%.

Expected results:

While on Firefox v121 CPU usage on tested page was around 1% (in about:processes).


###mozregression output:

18:16.76 INFO: Last good revision: 9ea04a2cc829da8ce1048f38bb015b869d9273aa
18:16.76 INFO: First bad revision: 95679f6ce2544b21ec9cfd0d19961058d60155d6
18:16.76 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=9ea04a2cc829da8ce1048f38bb015b869d9273aa&tochange=95679f6ce2544b21ec9cfd0d19961058d60155d6

Profiles:


Context:

The Bugbug bot thinks this bug should belong to the 'Core::Layout' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Layout
Product: Firefox → Core

Setting regressed by on bug 1807253 based on the bisection performed by the OP.

Keywords: regression
Regressed by: 1807253

:fredw, since you are the author of the regressor, bug 1807253, could you take a look?

For more information, please visit BugBot documentation.

Flags: needinfo?(fwang)

Bug 1807253 caused a regression that was fixed by bug 1867042.

I supposed it's a different problem here, will check.

Flags: needinfo?(fwang)
Attached file without-patch.perf

Data generated by perf with the attached testcase for 5s at https://hg.mozilla.org/mozilla-central/rev/9ea04a2cc829da8ce1048f38bb015b869d9273aa

Attached file with-patch.perf

Data generated by perf with the attached testcase for 5s at https://hg.mozilla.org/mozilla-central/rev/95679f6ce2544b21ec9cfd0d19961058d60155d6

Attachment #9377374 - Attachment mime type: application/octet-stream → text/plain
Attachment #9377375 - Attachment mime type: application/octet-stream → text/plain

I'm observing this high CPU usage today's trunk build and in Firefox 122.

Comparing the attached perf output (e.g. with https://profiler.firefox.com)

I see we spend significant time in nsRefreshDriver::Tick.

Without the patch, work is essentially in nsRefreshDriver::TickObserverArray:

nsHTMLScrollFrame::IsRootScrollFrameOfDocument() const
nsIFrame::GetTransformMatrix
nsLayoutUtils::GetTransformToAncestor
TransformGfxRectToAncestor
nsLayoutUtils::TransformFrameRectToAncestor
IsFrameScrolledOutOfView
nsIFrame::IsScrolledOutOfView
mozilla::dom::KeyframeEffect::CanThrottleIfNotVisible
mozilla::dom::KeyframeEffect::CanThrottle
mozilla::dom::KeyframeEffect::NotifyAnimationTimingUpdated
mozilla::dom::CSSAnimation::UpdateTiming
mozilla::dom::Animation::Tick
mozilla::dom::CSSAnimation::Tick
mozilla::dom::AnimationTimeline::Tick
mozilla::dom::DocumentTimeline::MostRecentRefreshTimeUpdated
nsRefreshDriver::TickObserverArray

With the patch, it now essentially spends time in three different functions:

  • nsRefreshDriver::DetermineProximityToViewportAndNotifyResizeObservers (19%)
  • nsRefreshDriver::TickObserverArray (13%)
    • mozilla::PresShell::DoFlushPendingNotifications (1.2%)
    • mozilla::a11y::NotificationController::WillRefresh (7.1%)
    • mozilla::dom::DocumentTimeline::MostRecentRefreshTimeUpdated (1.6%) -- same stack as above
  • nsViewManager::ProcessPendingUpdates (66%)

@emilio: I plan to debug more but I will likely need to guidance here. IIUC the new FlushLayoutForWholeBrowsingContextTree call is what is causing most of the extra work here (removing it allows to recover performance). However, that's what the spec says we should do, so not sure. Maybe we can do some optimizations in case we have animations or so.

Flags: needinfo?(emilio)

Here's a profile of nightly with the test-case then the test-case but without the resize observer: https://share.firefox.dev/3Umrlzp

So the profiler makes clear that what's going is that the refresh driver goes at full speed rather than being throttled to update each 100ms if only compositor animations are going on.

I wonder if not flushing animations from ResizeObserver handling would help (also whether it'd be correct, given transform animations, it seems you'd get throttled resizes for transform animations), but maybe that's what was going on before?

The other question is "why did your patch in particular cause this".

My guess is that this code used to prevent the flush before your patch, and now flushes. The flush triggers an animation update which schedules a next tick, which flushes again, etc.

Flags: needinfo?(emilio)

This code runs unconditionally from "update the rendering". It flushing
throttled animations kinda defeats the point of throttled animations.

This was kind of papered before bug 1807253 because we didn't
unconditionally flush if there were resize observers.

Assignee: nobody → emilio
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Pushed by fwang@igalia.com:
https://hg.mozilla.org/integration/autoland/rev/aef6359f8095
Don't flush throttled animations from DetermineProximityToViewportAndNotifyResizeObservers. r=fredw
Attachment #9377586 - Attachment description: WIP: Bug 1877531 - Don't flush throttled animations from DetermineProximityToViewportAndNotifyResizeObservers. → Bug 1877531 - Don't flush throttled animations from DetermineProximityToViewportAndNotifyResizeObservers. r=emilio
Attachment #9377586 - Attachment is obsolete: true
Attachment #9377587 - Flags: approval-mozilla-beta?

Uplift Approval Request

  • Needs manual QE test: yes
  • Risk associated with taking this patch: Medium
  • Fix verified in Nightly: no
  • Steps to reproduce for manual QE testing: Open two non-overlapping windows: one with testcase https://bugzilla.mozilla.org/show_bug.cgi?id=1877531#c0 and one with about:processes. CPU usage for the testcase should be 1-2% not around 9-12%.
  • User impact if declined: High CPU regression (> 10%) on document using animation and ResizeObserver.
  • Is Android affected?: yes
  • Code covered by automated testing: no
  • String changes made/needed: None
  • Explanation of risk level: This layout flushing code could be called for each rendering update and changing animation flushing may affect subsequent steps. However, this only changes behavior when the document contains a resize observer and an animation.
Flags: qe-verify+
Attachment #9377589 - Flags: approval-mozilla-release?

Uplift Approval Request

  • Code covered by automated testing: no
  • Is Android affected?: yes
  • Explanation of risk level: This layout flushing code could be called for each rendering update and changing animation flushing may affect subsequent steps. However, this only changes behavior when the document contains a resize observer and an animation.
  • String changes made/needed: None
  • Risk associated with taking this patch: Medium
  • Needs manual QE test: yes
  • User impact if declined: High CPU regression (> 10%) on document using animation and ResizeObserver.
  • Steps to reproduce for manual QE testing: Open two non-overlapping windows: one with testcase https://bugzilla.mozilla.org/show_bug.cgi?id=1877531#c0 and one with about:processes. CPU usage for the testcase should be 1-2% not around 9-12%.
  • Fix verified in Nightly: no

@Emilio: I landed your patch, let's see if it passes CI and if reporter can confirm the fix when it arrives in nightly. Writing a test based on refreshDriverHasPendingTick turned out to be difficult, so let's just rely on the manual test. I prepared some patches to uplift this to beta/release.

Emilio, could you set a priority/severity on this bug? I guess that the impact of this bug depends on how often on the web people hit this bug? Is that something we hit on major websites? Thanks

Flags: needinfo?(emilio)

It kinda defeats optimizations we have had for a long time for accelerated animations. So I think it's rather important.

Severity: -- → S2
Flags: needinfo?(emilio)
Priority: -- → P2
Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch

Uplift Approval Request

  • Steps to reproduce for manual QE testing: Open two non-overlapping windows: one with testcase https://bugzilla.mozilla.org/show_bug.cgi?id=1877531#c0 and one with about:processes. CPU usage for the testcase should be 1-2% not around 9-12%.
  • User impact if declined: High CPU regression (> 10%) on document using animation and ResizeObserver.
  • Code covered by automated testing: no
  • Explanation of risk level: This layout flushing code could be called for each rendering update and changing animation flushing may affect subsequent steps. However, this only changes behavior when the document contains a resize observer and an animation.
  • Is Android affected?: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: yes
  • String changes made/needed: None
  • Risk associated with taking this patch: Medium

Uplift Approval Request

  • User impact if declined: High CPU regression (> 10%) on document using animation and ResizeObserver.
  • Steps to reproduce for manual QE testing: Open two non-overlapping windows: one with testcase https://bugzilla.mozilla.org/show_bug.cgi?id=1877531#c0 and one with about:processes. CPU usage for the testcase should be 1-2% not around 9-12%.
  • Code covered by automated testing: no
  • Explanation of risk level: This layout flushing code could be called for each rendering update and changing animation flushing may affect subsequent steps. However, this only changes behavior when the document contains a resize observer and an animation.
  • Fix verified in Nightly: yes
  • Needs manual QE test: yes
  • Is Android affected?: yes
  • String changes made/needed: None
  • Risk associated with taking this patch: Medium

@Max: The fix is now integrated in the latest Nightly build. I checked your reduced test case, but can you please double check with your own pages?

Flags: needinfo?(maxbadryzlov)
QA Whiteboard: [qa-triaged]

Yes, can confirm that high CPU usage has disappeared. Thank you.

Flags: needinfo?(maxbadryzlov)

Reproduced the initial issue on Firefox 123.0b4 on Ubuntu 22.04 x64.

Verified as fixed using the latest Nightly 124.0a1 (20240201173838) on macOS 13, Ubuntu 22.04 x64 and Windows 10 x64.

Attachment #9377587 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9377589 - Flags: approval-mozilla-release? → approval-mozilla-release+

Verified as fixed on Firefox 123.0b7 and Firefox 122.0.1 - CPU usage is around 1-2% - tested on macOS 13, Ubuntu 22.04 x64 and Windows 10 x64.

Status: RESOLVED → VERIFIED
Flags: qe-verify+

(In reply to Pulsebot from comment #11)

Pushed by fwang@igalia.com:
https://hg.mozilla.org/integration/autoland/rev/aef6359f8095
Don't flush throttled animations from
DetermineProximityToViewportAndNotifyResizeObservers. r=fredw

== Change summary for alert #41401 (as of Thu, 08 Feb 2024 07:43:13 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
4% twitch loadtime macosx1015-64-shippable-qr fission warm webrender 713.04 -> 687.94 Before/After
2% twitch LastVisualChange linux1804-64-shippable-qr fission warm webrender 5,993.79 -> 5,846.58 Before/After
2% twitch LastVisualChange linux1804-64-shippable-qr fission warm webrender 5,994.37 -> 5,867.90 Before/After

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=41401

Keywords: perf-alert
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: