Hung Firefox starting in 106 (main thread APZ waiting on GPU process)
Categories
(Core :: Graphics, defect, P1)
People
(Reporter: jrmuizel, Assigned: jrmuizel)
There are a number of reports of Firefox hanging, often associated with media playback:
https://www.reddit.com/r/firefox/comments/yfkplj/pages_with_media_freeze_firefox_after_update_to/
https://www.reddit.com/r/firefox/comments/yefgrr/still_experiencing_hanging_with_ff_10602/
and an associated crash report: https://crash-stats.mozilla.org/report/index/3b3df888-5d5f-479c-96c2-fce6d0221028
We don't have too much information about what's going on yet, but all the reports I've seen have been on Intel Xe GPUs.
Comment 3•2 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #0)
We don't have too much information about what's going on yet, but all the reports I've seen have been on Intel Xe GPUs.
As per my bug report (bug 1798050), the bug also happens on integrated AMD graphics.
Comment 4•2 years ago (Assignee)
Another crash report showing us getting stuck waiting for the GPU process:
https://crash-stats.mozilla.org/report/index/500f0fee-442d-49eb-b98d-fb5400221029#allthreads
Comment 5•2 years ago
Can the tool https://github.com/b0bh00d/crash-firefox be made to crash the GPU process? If so, would it make sense to ask folks experiencing this issue on Reddit to crash the GPU process instead so we can see what it's doing at the time of the hang?
Comment 6•2 years ago
(In reply to Botond Ballo [:botond] from comment #5)
Can the tool https://github.com/b0bh00d/crash-firefox be made to crash the GPU process? If so, would it make sense to ask folks experiencing this issue on Reddit to crash the GPU process instead so we can see what it's doing at the time of the hang?
I think so (I haven't tried it), but (1) this would require some way to tell which PID is the GPU process while Firefox is hung, and (2) I haven't run into this hang in just under 3 weeks.
Also (as the reporter of bug 1791938), this only occurs very intermittently for me, which suggests it may not have the same cause as bug 1798050. (It is still possible they're related.)
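For point (1), here is a rough Python sketch of how the GPU process might be picked out of a list of Firefox processes. It rests on an assumption that is not confirmed anywhere in this bug: that Firefox child processes are started with -contentproc and that the last command-line argument is the process type (for example tab or gpu). The function name find_gpu_pids and the sample PIDs are hypothetical; the (pid, argv) pairs would have to be collected by hand, for instance from a process viewer that shows command lines.

```python
def find_gpu_pids(processes: list[tuple[int, list[str]]]) -> list[int]:
    """Return PIDs whose command line ends with the 'gpu' process type.

    ASSUMPTION: Firefox child processes carry '-contentproc' and end their
    command line with a process-type string such as 'tab' or 'gpu'. Verify
    this against the Firefox version in use before relying on it.
    """
    return [
        pid
        for pid, argv in processes
        if "-contentproc" in argv and argv[-1] == "gpu"
    ]


# Hypothetical snapshot of (pid, argv) pairs; real command lines are longer.
snapshot = [
    (4100, ["firefox.exe"]),
    (4236, ["firefox.exe", "-contentproc", "tab"]),
    (4312, ["firefox.exe", "-contentproc", "gpu"]),
]
gpu_pids = find_gpu_pids(snapshot)
```

The resulting PID is what a crash-injection tool would need as its target; whether crash-firefox accepts a child PID is untested, as noted above.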
Comment 7•2 years ago
The bug is marked as tracked for firefox106 (release), tracked for firefox107 (beta) and tracked for firefox108 (nightly). We have limited time to fix this; the soft freeze is in 10 days. However, the bug still isn't assigned.
:bhood, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit the auto_nag documentation.
Comment 9•2 years ago
This looks like a dupe of bug 1791938.
Comment 10•2 years ago (Assignee)
Here are two GPU process crashes:
https://crash-stats.mozilla.org/report/index/3553982f-3197-460d-bd80-bdb9e0221101
https://crash-stats.mozilla.org/report/index/eaeb8abf-7307-461a-8c9b-431650221101
Unfortunately, everything looks normal in those.
However, in https://crash-stats.mozilla.org/report/index/f398b407-4a11-4c90-83e3-54f290221102#allthreads we see that we're waiting in mozilla::gfx::DeviceManagerDx::CreateCompositorDevices().
Comment 11•2 years ago
Reviewing the reports/crashes, I am as certain as I can be, without reproducing it myself, that this would be fixed by bug 1792115.
Comment 13•2 years ago (Assignee)
Here's another crash report that says the same thing: https://crash-stats.mozilla.org/report/index/4c036fd5-ce09-4fe4-b4fd-1c2440221102
Comment 14•2 years ago (Assignee)
I've been able to reproduce a similar hang using dxcap -forcetdr. I also confirmed that the hang does not happen in beta 107.
Comment 15•2 years ago
My history of hangs/no hangs also lines up perfectly with the timeline in bug 1792115.
Comment 16•2 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #13)
https://crash-stats.mozilla.org/report/index/4c036fd5-ce09-4fe4-b4fd-1c2440221102
https://crash-stats.mozilla.org/report/index/f398b407-4a11-4c90-83e3-54f290221102
The driver in report #1 is fairly old (30.0.101.1069, Nov 11, 2021), while the one in #2 is from July 2022 (31.0.101.3251), so not very old. Updating to 31.0.101.3790 may not help here.
https://crash-stats.mozilla.org/report/index/3553982f-3197-460d-bd80-bdb9e0221101
https://crash-stats.mozilla.org/report/index/eaeb8abf-7307-461a-8c9b-431650221101
Both of these crashed on the latest AMD Adrenalin 22.10.3 driver from 10/28/2022, so unless that driver itself has a bug, the driver is likely not the cause.
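The Intel version comparison in this comment can be sketched in Python. This is only an illustration, assuming that these driver version strings order by their dotted numeric components; the helper name parse_driver_version is hypothetical, and the version strings are the ones quoted in this comment.

```python
def parse_driver_version(version: str) -> tuple[int, ...]:
    """Split a dotted driver version such as '31.0.101.3251' into integers."""
    return tuple(int(part) for part in version.split("."))


# Versions quoted in this comment.
report_1 = parse_driver_version("30.0.101.1069")
report_2 = parse_driver_version("31.0.101.3251")
suggested = parse_driver_version("31.0.101.3790")

# Tuples compare component by component, so the build number is only
# consulted when the first three components are equal.
report_2_is_older = report_2 < suggested
```

Comparing integer tuples rather than raw strings avoids mistakes when a component has a different number of digits (for example 101.999 versus 101.3251).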
Comment 17•2 years ago (Assignee)
Looking at the telemetry more closely: in 105 we'd get about 1.5-2M device resets per day. In 106 that number has dropped to 0, presumably because we hang instead of recording the resets.
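The inference here can be written down as a small check. This is a sketch of the reasoning only, not of any real telemetry tooling: the function name looks_like_lost_reporting and the min_baseline threshold are hypothetical, and the only numbers used are the ones quoted in this comment.

```python
def looks_like_lost_reporting(previous_daily: float, current_daily: float,
                              min_baseline: float = 1000.0) -> bool:
    """Flag a metric that had a substantial baseline and then fell to zero.

    A drop to exactly zero from a large baseline is more likely a reporting
    problem (for example, the recording code path is never reached) than a
    genuine disappearance of the underlying event.
    """
    return previous_daily >= min_baseline and current_daily == 0


# Lower bound of the 105 range quoted above, and the 106 value.
suspicious = looks_like_lost_reporting(1_500_000, 0)
```

A check like this only raises suspicion; the hang explanation itself comes from the crash reports linked in the earlier comments.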
Comment 18•2 years ago
We have a point release out (106.0.4) with the fix from bug 1792115. We would appreciate it if people who run into this grab that update and confirm the issue is addressed. Thanks!