Clicking Trigger Device Reset in about:support causes Firefox to lock up (Windows)
Categories
(Core :: Graphics, defect)
Tracking
()
Flag | Tracking | Status |
---|---|---|
relnote-firefox | --- | 106+ |
firefox-esr102 | --- | unaffected |
firefox105 | --- | unaffected |
firefox106 | + | verified |
firefox107 | + | verified |
People
(Reporter: ahale, Assigned: aosmond)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression)
Attachments
(1 file)
48 bytes, text/x-phabricator-request | RyanVM: approval-mozilla-release+
Repro steps (Windows only):
- Go to about:support
- Click Trigger Device Reset
- Firefox gets stuck on a mutex in DeviceManagerDx::CreateCompositorDevices.
Comment 1•2 years ago
Set release status flags based on info from the regressing bug 1789309
Comment 2•2 years ago (Reporter)
Verified with mozregression...
5:58.56 INFO: Narrowed integration regression window from [99d74587, dada2510] (3 builds) to [b702eea8, dada2510] (2 builds) (~1 steps left)
5:58.56 INFO: No more integration revisions, bisection finished.
5:58.56 INFO: Last good revision: b702eea846c970a343da94b5461596e22e75426e
5:58.56 INFO: First bad revision: dada2510963e85ddb4e02d94257f5f6c4e6b577e
5:58.56 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b702eea846c970a343da94b5461596e22e75426e&tochange=dada2510963e85ddb4e02d94257f5f6c4e6b577e
Comment 3•2 years ago (Reporter)
Tried reproducing the device loss using Ctrl+Shift+Win+B (the GPU reset shortcut on Windows) and got the following results:
- Dell XPS 15 9510 laptop (Intel+NVIDIA): System hard-locked
- Custom built gaming desktop (Ryzen 7 5800X, Radeon RX 6900 XT): Windows froze for a second and then all open windows flickered once; Firefox Nightly recovered just fine.
So this may be limited to a mutex lock issue when clicking Trigger Device Reset in about:support.
Comment 4•2 years ago
Bob, this is an S2 regression in 106. Is it going to impact our user base on release? If so, can we have this bug investigated and assigned rapidly? Thanks.
Comment 5•2 years ago
Andrew, you are the author of the regressing change (see comment 2). Can we get this corrected before Fx106 goes out the door on 18 Oct?
Comment 6•2 years ago
Tracking as this is set to S2 and we lack context for the fix with respect to timing within our release schedule.
Comment 7•2 years ago
The bug is marked as tracked for firefox106 (beta). We have limited time to fix this; the soft freeze is in 9 days. However, the bug still isn't assigned.
:bhood, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply back out the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Comment 8•2 years ago (Reporter)
A user reported a much more common trigger mechanism for this app freeze in https://bugzilla.mozilla.org/show_bug.cgi?id=1793964 and I'm concerned we may want to try to fix this for 106.
Comment 9•2 years ago (Assignee)
It is a shame we don't have tsan on Windows. Either we are locking two different locks in two different orders, or we are trying to lock the same lock twice. Both require a code audit to figure out, I guess.
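For readers unfamiliar with the first hypothesis (two locks taken in two different orders), here is a minimal sketch of the hazard and one common way to avoid it. It is not Firefox code: `Counter` and `Move` are hypothetical names, and the standard library's `std::mutex` stands in for whatever lock types the real code uses.

```cpp
#include <mutex>

// Hypothetical pair of objects, each guarded by its own mutex.
struct Counter {
  std::mutex lock;
  int value = 0;
};

// Moves `amount` from one counter to the other while holding both locks.
// If each caller locked `from` first and `to` second, two threads running
// Move(a, b, ...) and Move(b, a, ...) could each hold one lock and wait
// forever for the other. std::lock acquires both mutexes using a
// deadlock-avoidance algorithm, so the argument order no longer matters.
void Move(Counter& from, Counter& to, int amount) {
  if (&from == &to) {
    return;  // Locking the same std::mutex twice would be undefined behavior.
  }
  std::lock(from.lock, to.lock);
  // adopt_lock: the guards take ownership of the already-held mutexes and
  // release them when they go out of scope.
  std::lock_guard<std::mutex> holdFrom(from.lock, std::adopt_lock);
  std::lock_guard<std::mutex> holdTo(to.lock, std::adopt_lock);
  from.value -= amount;
  to.value += amount;
}
```

In C++17, `std::scoped_lock guard(from.lock, to.lock);` expresses the same thing in one line. The other standard remedy is to document a single global lock order and audit every call site against it.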
Comment 10•2 years ago (Assignee)
Well that's one I can fix:
https://searchfox.org/mozilla-central/rev/e94c6cb9649bfe4e6a3888460f41bcd4fe30a6ca/gfx/thebes/DeviceManagerDx.cpp#1049
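As a rough illustration of the second hypothesis (locking the same lock twice on one thread), the sketch below shows one common way such a double lock is avoided. It is an assumption-laden sketch, not the actual patch: `DeviceManagerSketch` and all of its members are hypothetical, and `std::mutex` stands in for the real lock type.

```cpp
#include <mutex>

// Hypothetical device manager; every name here is illustrative only.
class DeviceManagerSketch {
 public:
  // Public entry point: takes the lock, then delegates to the helper.
  bool CreateDevices() {
    std::lock_guard<std::mutex> guard(mLock);
    return CreateDevicesLocked(guard);
  }

  // Resets and recreates the device under a single lock acquisition.
  bool ResetAndReacquire() {
    std::lock_guard<std::mutex> guard(mLock);
    mHasDevice = false;
    ++mResetCount;
    // Calling CreateDevices() here would lock mLock a second time on the
    // same thread. For std::mutex that is undefined behavior and in
    // practice usually a hang, so the lock-free helper is called instead.
    return CreateDevicesLocked(guard);
  }

  bool HasDevice() {
    std::lock_guard<std::mutex> guard(mLock);
    return mHasDevice;
  }

  int ResetCount() {
    std::lock_guard<std::mutex> guard(mLock);
    return mResetCount;
  }

 private:
  // Takes a reference to the caller's guard as proof that mLock is held.
  bool CreateDevicesLocked(const std::lock_guard<std::mutex>&) {
    mHasDevice = true;
    return true;
  }

  std::mutex mLock;
  bool mHasDevice = false;
  int mResetCount = 0;
};
```

Passing the guard by reference costs nothing at run time and makes it hard to call the helper without holding the lock. A recursive mutex would also avoid the hang, but it tends to hide the re-entrancy rather than remove it.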
Comment 11•2 years ago (Assignee)
Comment 12•2 years ago
Comment 13•2 years ago
At this point in the cycle, it's unlikely we can get this fixed in 106, which ships in a few days: we have already built and QAed our release candidate and have no driver for an RC2. I can, however, include it in our planned 106 dot release that ships November 1.
Comment 14•2 years ago
bugherder
I managed to reproduce this issue on a 2022-09-22 Nightly build on Windows 10 using the STR from the Description. Verified as fixed on Nightly 108.0a1 (build ID: 20221020215126) on Windows 10.
Comment 17•2 years ago (Assignee)
Comment on attachment 9298263 [details]
Bug 1792115 - Avoid a double lock in DeviceManagerDx::MaybeResetAndReacquireDevices.
Beta/Release Uplift Approval Request
- User impact if declined: May crash or freeze the parent process if the GPU process crashes or hits a device reset
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The change is trivial/surgical, has been tested in nightly/beta for a while now, and will, with as much certainty as I am capable of, make things unarguably better.
- String changes made/needed:
- Is Android affected?: Unknown
Comment 18•2 years ago (Assignee)
Given accumulating evidence from bug 1798099, I would also state that this is sufficient for a dot release driver.
Comment 19•2 years ago (Assignee)
(In reply to Andrew Osmond [:aosmond] (he/him) from comment #18)
Given accumulating evidence from bug 1798099, I would also state that this is sufficient for a dot release driver.
My justification for a dot release would be based on our telemetry:
https://firefoxgraphics.github.io/telemetry/#view=tdrs
While only 0.02% of user sessions experience device resets, those who do experience them hit them a lot (17.4 per session). Those users are much more likely to hit this and experience serious disruptions.
Comment 20•2 years ago
Comment on attachment 9298263 [details]
Bug 1792115 - Avoid a double lock in DeviceManagerDx::MaybeResetAndReacquireDevices.
Approved for 106.0.4.
Comment 21•2 years ago
bugherder uplift
Hello,
Would there be a way to verify this fix on release on Windows 10? There's no "Trigger Device Reset" button in about:support; is there a pref I can tweak to make the button appear? I tried the steps from Comment 3 but it doesn't reproduce for me.
Thank you.
Comment 23•2 years ago (Assignee)
Comment 24•2 years ago
Andrew, could QA trigger this using the device reset button in about:support as well?
Comment 25•2 years ago (Assignee)
They could if it was available, but it is only available on nightly and dev edition builds:
https://searchfox.org/mozilla-central/rev/49011d374b626d5f0e7dc751a8a57365878e65f1/toolkit/content/aboutSupport.js#582
Comment 26•2 years ago (Assignee)
As an alternative, I believe you can go to about:support, open the Web Developer tools, go to Console, type windowUtils.triggerDeviceReset(), and hit Enter to do it manually. The code is still present in the release builds, just not the button.
It worked with the STR in Comment 26. I could reproduce the hang on 106.0.3. It no longer reproduces in 106.0.4 and seems to be working just fine. Thank you so much for the information, Andrew, really appreciate your help! Confirmed as fixed on Firefox 106.0.4 (build ID: 20221102214123) on Windows 10.