Open Bug 1684482 Opened 4 years ago Updated 2 months ago

Mouse events can be delayed by a couple of seconds (scroll, click)

Categories

(Core :: Graphics: WebRender, defect, P3)

defect

Tracking

()

REOPENED
Tracking Status
firefox84 --- affected
firefox85 --- affected
firefox86 --- affected

People

(Reporter: whimboo, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: perf, power)

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:86.0) Gecko/20100101 Firefox/86.0 ID:20201229095620

I updated to Big Sur yesterday, and as of today I can see lags from 1 or more seconds when switching between tabs. Here a gecko profile from a single tab switch:

https://share.firefox.dev/3rzYZke

As it looks like most of the time is spent in mozilla::layers::APZCTreeManager::GetTargetAPZC.

Actually this doesn't seem to be related to switching tabs only, but to all mouse events. It doesn't matter where I click the event seems to be delayed and as such causing the UI to show the lags.

Here the stack:

__psynch_mutexwait
_pthread_mutex_firstfit_lock_slow
mozilla::layers::APZCTreeManager::GetTargetAPZC(mozilla::gfx::PointTyped<mozilla::ScreenPixel, float> const&)
mozilla::layers::APZCTreeManager::ReceiveInputEvent(mozilla::InputData&)
{virtual override thunk({offset(-16)}, mozilla::layers::APZCTreeManager::ReceiveInputEvent(mozilla::InputData&))}
mozilla::layers::APZInputBridge::ReceiveInputEvent(mozilla::WidgetInputEvent&)
nsBaseWidget::DispatchInputEvent(mozilla::WidgetInputEvent*)
-[ChildView mouseUp:]
[..]
Summary: Switching between tabs is delayed by a couple of seconds → Mouse events can be delayed by a couple of seconds

Note that after a restart of Firefox I cannot see the problem anymore. I will keep an eye on it to figure out how to trigger that misbehavior.

I hit this problem several times over the last days. I don't have specific reproduction steps yet, but it seems that the performance slightly decreases over time. Given that I haven't used the instance of Firefox a lot over the last days, it might be related to hibernation.

Markus or Steven, do you have any idea?

Flags: needinfo?(spohl.mozilla.bugs)
Flags: needinfo?(mstange.moz)
Keywords: perf
Whiteboard: [fxperf]
Flags: needinfo?(tnikkel)
Flags: needinfo?(spohl.mozilla.bugs)
Flags: needinfo?(mstange.moz)
Component: Graphics: Layers → Panning and Zooming

This might still be graphics, if it has something to do with hit testing. But that stack seems to point to something in the apz code.

Note that when Firefox is under such a condition the CPU load goes up to about 400% when I eg scroll on pages like https://github.com/facebook/watchman/releases.

Blocks: power-usage
Keywords: power
Summary: Mouse events can be delayed by a couple of seconds → CPU load up to 400% causing mouse events to be delayed by a couple of seconds

I actually used Instruments now to get a native trace of what's going on, and as it looks like all the CPU time is spent on 8 Rayon threads. All of these actually contain stack frames for webrender::glyph_rasterizer::GlyphRasterizer::flush_glyph_requests::process_glyph.

Not sure why this differs from the recording of the Gecko profiler. Maybe it cannot handle those threads?

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
No longer blocks: power-usage
Whiteboard: [fxperf]
Flags: needinfo?(tnikkel)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #6)

Not sure why this differs from the recording of the Gecko profiler. Maybe it cannot handle those threads?

If you add "WrWorker" to the thread filter then these threads do show up in the Gecko profiler.

I've filed bug 1686358 on the fact that this performance problem was able to slow down mouse handling.

Given that there are a couple of dependent issues here, lets reopen the bug so that I can verify it's all fixed when work on dependencies is done.

Status: RESOLVED → REOPENED
Component: Panning and Zooming → Graphics: WebRender
Depends on: 1686358, 1680855
Resolution: DUPLICATE → ---

Here profile from just now when switching tabs: https://share.firefox.dev/2XzbyyD

[Tracking Requested - why for this release]: It's a regression in 86 and causes more than 1s delays when switching tabs.

(In reply to Markus Stange [:mstange] from comment #9)

I've filed bug 1686358 on the fact that this performance problem was able to slow down mouse handling.

Do we know when the causing change landed landed?

Flags: needinfo?(mstange.moz)
Keywords: regression

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from bug 1680855 comment #11)

I can constantly see this problem with switching to a pinned Github tab. After more than 5 minutes not selecting this tab and doing other work in the browser, this tab always blocks for 1-2s.

In this case the re-rasterization is expected. The fact that it's slow is not expected, and I'll probably be fixing that in bug 1681346 or bug 1683975.

Depends on: 1681346, 1683975

Actually the 400% CPU load isn't there anymore since bug 1680855 got fixed. As such updating the summary.

Summary: CPU load up to 400% causing mouse events to be delayed by a couple of seconds → Mouse events can be delayed by a couple of seconds (scroll, click)
Severity: -- → S3
Priority: -- → P3

Changing the priority to p2 as the bug is tracked by a release manager for the current nightly.
See What Do You Triage for more information

Priority: P3 → P2

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #14)

Actually the 400% CPU load isn't there anymore since bug 1680855 got fixed. As such updating the summary.

You morphed the bug in such a way that it's no longer a regression, so I'm removing the regression keyword and resetting the tracking flags and priority.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #12)

[Tracking Requested - why for this release]: It's a regression in 86 and causes more than 1s delays when switching tabs.

No, it only takes 1 second if you're running with DMD.

It's also not a regression, it's a problem that became worse with Big Sur because the system font takes longer to create.

(In reply to Markus Stange [:mstange] from comment #9)

I've filed bug 1686358 on the fact that this performance problem was able to slow down mouse handling.

Do we know when the causing change landed landed?

The causing change for bug 1686358? No, I don't think we even know whether the cached hit testing tree ever worked properly.

Flags: needinfo?(mstange.moz)
Keywords: regression
Priority: P2 → P3

(In reply to Markus Stange [:mstange] from comment #16)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #14)

Actually the 400% CPU load isn't there anymore since bug 1680855 got fixed. As such updating the summary.

You morphed the bug in such a way that it's no longer a regression, so I'm removing the regression keyword and resetting the tracking flags and priority.

The actual behavior in Firefox is still the same, only the CPU load dropped significantly. So I don't understand why this is no longer a regression when I don't see that behavior in Firefox 85 DevEdition. And that was also the case when not running with DMD enabled.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #12)

[Tracking Requested - why for this release]: It's a regression in 86 and causes more than 1s delays when switching tabs.

No, it only takes 1 second if you're running with DMD.

So you are saying that with the fixes for bug 1680855 the delay should no longer be that long when not running a DMD build? I can try that today.

Flags: needinfo?(mstange.moz)

That's right, with the fixes that have landed now, you should be seeing equivalent behavior in Beta and Nightly without DMD. If Nightly is still slower, then there's some other bug left that I wasn't aware of.

Flags: needinfo?(mstange.moz)

Without DMD enabled I haven't seen the problem today. So I will wait for the known dependencies to be fixed, and then check again. Thanks!

Shall we modify the bug summary to include DMD builds? CC'ing Andrew so that he is also aware of it.

It's not a bug that things are slower with DMD. DMD is expected to slow down allocations, and creating fonts in WebRender does a lot of allocation.

(In reply to Markus Stange [:mstange] from comment #20)

It's not a bug that things are slower with DMD. DMD is expected to slow down allocations, and creating fonts in WebRender does a lot of allocation.

Ok, so it means that I basically would have to wait until the dependent bugs have been fixed. Right now I would have to restart Firefox that often that I cannot afford running it with DMD enabled. I wonder why I haven't seen that before, and maybe it's due to the Big Sur update and your filed bug 1683975 only?

Flags: needinfo?(mstange.moz)

I had the same problem yesterday but NOT with DMD enabled: https://share.firefox.dev/2NLsEaS

It wasn't multiple seconds but half a second. And after switching between several different tabs back and forth the problem went away, but came back at a later time. Not sure if something in that profile is helpful.

Here another profile for a really sluggish scrolling on a Phabricator page: https://share.firefox.dev/3dIUBuH

Yeah, same problem as before - it's waiting for the window server. And it's not like we're giving it a ton of work to do, the "WR OS Compositor frame" markers on the Renderer thread show only 33 layers.

Flags: needinfo?(mstange.moz)

Oh wait, when I said "same problem as before" I meant same problem as in bug 1692828 comment 4. This new profile does not show the Big Sur font problem, it shows a slow window server problem.

I see. I will have a look at bug 1692828 and check what's keeping the WindowServer busy under such a condition.

See Also: → 1692828
See Also: → 1717410
See Also: 1717410
Depends on: 1758555
No longer depends on: 1681346
You need to log in before you can comment on or make changes to this bug.