Closed Bug 1754810 Opened 2 years ago Closed 10 months ago

SMIL causing 60Hz refresh ticks despite nothing visibly animated

Categories

(Core :: SVG, defect)

defect

Tracking

()

VERIFIED FIXED
113 Branch
Tracking Status
firefox113 --- fixed

People

(Reporter: florian, Assigned: emilio)

References

(Blocks 1 open bug)

Details

(Keywords: perf:resource-use, power)

Attachments

(2 files)

I noticed that the treeherder log viewer (eg https://treeherder.mozilla.org/logviewer?job_id=366960865&repo=try&lineNumber=2184) causes 60Hz refresh driver ticks with "Tick reasons: HasObservers (SMIL animations [Style])".
https://share.firefox.dev/3rDDi4W

Nothing seems animated on this page (at least nothing gets repainted at 60Hz), and I haven't managed to find the animation in its source code. Maybe it's something in a visibility:hidden node, in which case it would be the same issue as bug 1737881.

I've done a little bit of debugging here. The animation is from the SVG defined at: https://github.com/mozilla-frontend-infra/react-lazylog/blob/962d294b18dd657a0c84ab05062a6bd8339c8180/src/components/Loading/index.jsx

I can't find any trace of this in the DOM using the devtools.

This <Loading/> React component is used only at https://github.com/mozilla-frontend-infra/react-lazylog/blob/962d294b18dd657a0c84ab05062a6bd8339c8180/src/components/LazyLog/index.jsx#L687 which is when the log is loading and not a single line of it has been received yet. So I assume as soon as we get any data from the log, this part of the DOM gets removed.

So this probably isn't the same issue as bug 1737881. Here we keep a refresh observer for an animation from DOM nodes that no longer exist. I don't know if the DOM nodes need to be removed at a specific time to reproduce this bug.

(In reply to Florian Quèze [:florian] from comment #1)

I've done a little bit of debugging here. The animation is from the SVG defined at: https://github.com/mozilla-frontend-infra/react-lazylog/blob/962d294b18dd657a0c84ab05062a6bd8339c8180/src/components/Loading/index.jsx

I can't find any trace of this in the DOM using the devtools.

It does temporarily get added to the DOM, for what it's worth. I can see mozilla::dom::SVGAnimationElement::BindToTree and UnbindFromTree being called during pageload here. (note SVGAnimationElement in the type name -- this is for the <animate> element itself.)

(In reply to Florian Quèze [:florian] from comment #1)

Here we keep a refresh observer for an animation from DOM nodes that no longer exist.

Indeed. Probably the SMILAnimationController.cpp should unregister as a refresh observer in this case.

(With the caveat that I'm not at all familiar with the animation code,) is it sufficient to check if the table becomes empty in SMILAnimationController::UnregisterAnimationElement, and if so, call StopSampling until such time as any element again gets registered? In simple testing, this seems to work to avoid the constant ticks here.

Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Severity: -- → S3

Toggling ni=me to finish off the patch here, as discussed on phab.

Flags: needinfo?(dholbert)
Blocks: power-usage
No longer blocks: 1756269
Depends on: 1756269

This just came up again in bug 1762219, and I confirmed that the patch in this bug fixes that page.

Thanks. I'll try to circle back to this soon.

(This is code that nobody's looked at closely in a long time, so it requires some dusting-off / paging-back-in & edge-case exploration to be sure we correctly handle things & don't inadvertently leave the page in a state where subsequently-added SMIL animations would be broken or would use a bogus/reset timeline.)

FTR, I can still reproduce the bug by navigating to any gitlab merge request, e.g. https://invent.kde.org/plasma/kwin/-/merge_requests/2243. Firefox floods the wayland compositor with frame callbacks.

Oh, never mind, I thought that the relevant patch has been merged.

I'm unable to confirm the existence of this bug.

I'm on Fedora 37 Firefox 104.0. Mesa Intel(R) UHD Graphics (CML GT2) on Wayland.

I've confirmed in about:support that it's starting as wayland:

  • Window Protocol: wayland

Confirmed window is being painted and kwin_wayland usage very low.

I'm unable to confirm the existence of this bug.

The bug is still present in Firefox Nightly 106.0a1 20220822190304.

I'm not sure what the best way to test for this is, but I came across this report and tried this (on Firefox 103.0.2):

  1. in KDE System Settings -> Workspace Behavior -> Desktop Effects, I enabled "Show Paint" (and set Meta+P as its toggle)
  2. pressed Meta+P, everything started colorfully strobing at 60 FPS (this may hurt your eyes, or worse)
  3. opened the gitlab merge request linked above
  4. let my mouse cursor sit still, the 60 FPS (repainting) strobing kept happening
  5. set widget.wayland.vsync.enabled to false, no clear immediate change, so I restarted Firefox
  6. going back to that gitlab merge request, and letting my cursor sit still again, no more strobing!
  7. pressed Meta+P to disable the colored repaint rectangles

My eyes still hurt from the experiment - I would probably prefer that instead of the repainted rectangles
being colored, they would have FPS counters (above some minimum rectangle size), but it did its job anyway.

My hope right now is that serving Firefox requests is why kwin_wayland was temporarily freezing up for me
sometimes, but I wasn't able to reliably repro that other issue even before I came across this report, or
touched widget.wayland.vsync.enabled at all, so maybe it's completely unrelated, we'll see I guess.

Btw, as a quick update to my last comment on this, I have since resolved the problem in the last paragraph, and it was unrelated.

Not clear on what was causing it (best guess is ZFS and/or shared memory) but I was swapping way more than I should've, without ever using up all the RAM (starting around 24/32 GiB or so, and that's counting everything the kernel is reporting, not just userspace RSS etc.).

It also is strange that it only affects Plasma/Wayland and not Plasma/X11, but I did catch some perf samples suggesting that shared memory (backing Wayland surfaces, I assume - IIRC the trace involved pixman) being accessed caused a page fault, the kernel went to allocate physical memory for it, and started swapping instead (again, despite there being unused physical memory).

Virtually all my freezing issues (which used to be up to 30s if not more) resolved themselves after:

sudo sysctl vm.swappiness=0 vm.watermark_scale_factor=100

(the second value is so that it starts swapping before oomkill kicks in, other machines may need other values, or you can use vm.swappiness=1, that itself helps a bit, it was just good enough for me - I also had to disable earlyoom, I should look into configuring it so it's also less eager, or just pretend I don't need it)


This wasn't that relevant to Firefox, but I hope you'll excuse my off-topic if it may help someone coming across this issue randomly (or if you ever see users reporting weird slowdowns when switching from X11 to Wayland, maybe they were already swapping and it wasn't getting in the way as much etc.).

Flags: needinfo?(emilio)
Assignee: jfkthame → emilio
Flags: needinfo?(emilio)
Flags: needinfo?(dholbert)
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3c1861de3ee2
Centralize sampling decisions in SMILAnimationController. r=longsonr
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 113 Branch

Can you confirm that you don't see this anymore (just in case there's other stuff to fix)?

Going with WAYLAND_DEBUG=1 on the latest nightly to https://invent.kde.org/plasma/kwin/-/merge_requests/2243 goes idle properly for me (once scrollbars fade etc, as expected), but would be good to confirm from you as well :)

Flags: needinfo?(vlad.zahorodnii)

I can confirm here - awesome, thanks a lot!

Status: RESOLVED → VERIFIED

Yep, works fine.

Flags: needinfo?(vlad.zahorodnii)
Status: VERIFIED → REOPENED
Resolution: FIXED → ---

(In reply to Darkspirit from comment #21)

Reopening as requested via https://chat.mozilla.org/#/room/#nightly:mozilla.org.

It'd be best not to do that. Skimming the backscroll in #Nightly that resulted in the reopening, it looks like we've got a user @neondark who's still reproducing this bug (or something like this bug), and they wanted to add a comment here to reflect that. (Thanks, @neondark!)

Let's use a new bug report to track the issue that neondark is seeing, rather than reopening this one (even if it appears to be the same issue), given that:

  • we had a patch land here, several months/several versions ago (and we've got status flags set to track specifically when that patch landed).
  • Based on comment 19 - comment 20, the patch that landed did fix the issue (or some manifestation of the issue) for some users at least.
  • Process-wise, it gets quite complicated to reason about (and represent in bugzilla) which nightlies/releases have which patches, if we reopen this bug and land more patches here, multiple months/versions later than when the first patch landed.
  • "bugs are cheap" as they say. We're not going to run out of bug IDs anytime soon. :)
Status: REOPENED → RESOLVED
Closed: 1 year ago10 months ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: