Closed Bug 1795574 Opened 1 year ago Closed 1 year ago

Firefox Freezes and is Unresponsive

Categories

(Core :: Widget: Gtk, defect)

Firefox 107
defect

Tracking

()

RESOLVED FIXED
108 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox106 --- unaffected
firefox107 + fixed
firefox108 + fixed

People

(Reporter: alanhdu, Assigned: emilio)

References

(Regression)

Details

(Keywords: hang, regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:107.0) Gecko/20100101 Firefox/107.0

Steps to reproduce:

Unclear to me -- I will just be browsing the internet normally when the browser suddenly freezes completely. I have not figured out a repeatable trigger.

Actual results:

The browser totally locks up and is unresponsive. I cannot scroll, type, right-click, or interact with the browser in any way (that includes the Firefox chrome, not just the web content).

I then get an "Unresponsive Application" popup from GNOME -- I have tried "waiting" for Firefox to unfreeze, but this does not seem to ever happen (or at least, it did not happen within 1 minute).

Looking at htop, I do not see much CPU usage (in fact, Firefox seems to use almost 0 CPU when this happens).

Some (potentially) helpful observations:

To my very uninformed eye, it looks like there's some kind of deadlock? Almost all of the threads are waiting on mutexes/futexes (although to be fair, I don't know if that's just what happens if you take any snapshot of a Firefox process).

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

I have the exact same issue on Arch Linux with GNOME 42.4 on Wayland.
I did a mozregression test and so far I have this list of commits: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=9142cc0a7a33673956230464363b370193f693ba&tochange=27435a91e32d6a93d921d0ee33bf9bd5b05440d3
I am not entirely sure that the culprit is included in this range, because the bug only appears semi-randomly as explained.
This commit in particular caught my attention, although it should only affect Wayland: https://hg.mozilla.org/integration/autoland/rev/57fae052b3080e1fc08f610fb1baae5f79b7ae5e

Regressed by: 1695227

:stransky, since you are the author of the regressor, bug 1695227, could you take a look? Also, could you set the severity field?

For more information, please visit auto_nag documentation.

Flags: needinfo?(stransky)

I can repro this on Thunderbird for some reason, but not on Firefox:

(gdb) bt
#0  0x00007fec9269c6f0 in  () at /usr/lib/libc.so.6
#1  0x00007fec926a2d91 in pthread_mutex_lock () at /usr/lib/libc.so.6
#2  0x0000557b21906a9e in mozilla::detail::MutexImpl::lock() ()
#3  0x00007fec8a0f7fa5 in mozilla::WaylandVsyncSource::DisableVsync() () at /home/emilio/thunderbird/libxul.so
#4  0x00007fec881af3f0 in mozilla::gfx::VsyncSource::UpdateVsyncStatus() () at /home/emilio/thunderbird/libxul.so
#5  0x00007fec8a0ca97a in mozilla::VsyncDispatcher::UpdateVsyncStatus() () at /home/emilio/thunderbird/libxul.so
#6  0x00007fec8a0c9e5c in mozilla::VsyncDispatcher::RemoveVsyncObserver(mozilla::VsyncObserver*) () at /home/emilio/thunderbird/libxul.so
#7  0x00007fec8a316d9e in mozilla::VsyncRefreshDriverTimer::StopTimer() () at /home/emilio/thunderbird/libxul.so
#8  0x00007fec8a30e99a in nsRefreshDriver::EnsureTimerStarted(nsRefreshDriver::EnsureTimerStartedFlags) () at /home/emilio/thunderbird/libxul.so
#9  0x00007fec8a34c750 in mozilla::PresShell::SetIsActive(bool, bool) () at /home/emilio/thunderbird/libxul.so
#10 0x00007fec8b26b4c7 in nsDocShell::ActivenessMaybeChanged() () at /home/emilio/thunderbird/libxul.so
#11 0x00007fec8b292cfd in std::_Function_handler<void (mozilla::dom::BrowsingContext*), mozilla::dom::BrowsingContext::DidSet(std::integral_constant<unsigned long, 2ul>, mozilla::dom::ExplicitActiveStatus)::$_12>::_M_invoke(std::_Any_data const&, mozilla::dom::BrowsingContext*&&) () at /home/emilio/thunderbird/libxul.so
#12 0x00007fec8b2360ce in mozilla::dom::BrowsingContext::PreOrderWalkVoid(std::function<void (mozilla::dom::BrowsingContext*)> const&) () at /home/emilio/thunderbird/libxul.so
#13 0x00007fec8b23d4ed in mozilla::dom::BrowsingContext::DidSet(std::integral_constant<unsigned long, 2ul>, mozilla::dom::ExplicitActiveStatus) () at /home/emilio/thunderbird/libxul.so
#14 0x00007fec8b290028 in void mozilla::dom::syncedcontext::FieldValues<mozilla::dom::BrowsingContext::BaseFieldValues, 68ul>::EachIndexInner<mozilla::dom::syncedcontext::Transaction<mozilla::dom::BrowsingContext>::Apply(mozilla::dom::BrowsingContext*, bool)::{lambda(auto:1)#1}&, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul, 7ul, 8ul, 9ul, 10ul, 11ul, 12ul, 13ul, 14ul, 15ul, 16ul, 17ul, 18ul, 19ul, 20ul, 21ul, 22ul, 23ul, 24ul, 25ul, 26ul, 27ul, 28ul, 29ul, 30ul, 31ul, 32ul, 33ul, 34ul, 35ul, 36ul, 37ul, 38ul, 39ul, 40ul, 41ul, 42ul, 43ul, 44ul, 45ul, 46ul, 47ul, 48ul, 49ul, 50ul, 51ul, 52ul, 53ul, 54ul, 55ul, 56ul, 57ul, 58ul, 59ul, 60ul, 61ul, 62ul, 63ul, 64ul, 65ul, 66ul, 67ul>(std::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul, 7ul, 8ul, 9ul, 10ul, 11ul, 12ul, 13ul, 14ul, 15ul, 16ul, 17ul, 18ul, 19ul, 20ul, 21ul, 22ul, 23ul, 24ul, 25ul, 26ul, 27ul, 28ul, 29ul, 30ul, 31ul, 32ul, 33ul, 34ul, 35ul, 36ul, 37ul, 38ul, 39ul, 40ul, 41ul, 42ul, 43ul, 44ul, 45ul, 46ul, 47ul, 48ul, 49ul, 50ul, 51ul, 52ul, 53ul, 54ul, 55ul, 56ul, 57ul, 58ul, 59ul, 60ul, 61ul, 62ul, 63ul, 64ul, 65ul, 66ul, 67ul>, mozilla::dom::syncedcontext::Transaction<mozilla::dom::BrowsingContext>::Apply(mozilla::dom::BrowsingContext*, bool)::{lambda(auto:1)#1}&) () at /home/emilio/thunderbird/libxul.so
#15 0x00007fec8b23022c in mozilla::dom::syncedcontext::Transaction<mozilla::dom::BrowsingContext>::Apply(mozilla::dom::BrowsingContext*, bool) () at /home/emilio/thunderbird/libxul.so
#16 0x00007fec8b22fa02 in mozilla::dom::syncedcontext::Transaction<mozilla::dom::BrowsingContext>::Commit(mozilla::dom::BrowsingContext*) () at /home/emilio/thunderbird/libxul.so
#17 0x00007fec885f8628 in mozilla::dom::BrowsingContext::SetExplicitActive(mozilla::dom::ExplicitActiveStatus) () at /home/emilio/thunderbird/libxul.so
#18 0x00007fec8b2c6ee3 in mozilla::AppWindow::RecomputeBrowsingContextVisibility() () at /home/emilio/thunderbird/libxul.so
#19 0x00007fec8b2c7a46 in mozilla::AppWindow::WidgetListenerDelegate::OcclusionStateChanged(bool) () at /home/emilio/thunderbird/libxul.so
#20 0x00007fec8a122437 in nsWindow::NotifyOcclusionState(mozilla::widget::OcclusionState) () at /home/emilio/thunderbird/libxul.so
#21 0x00007fec8a0f7613 in mozilla::WaylandVsyncSource::IdleCallback() () at /home/emilio/thunderbird/libxul.so
#22 0x00007fec8a0f81e9 in mozilla::WaylandVsyncSource::SetupFrameCallback(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)::$_1::__invoke(void*) () at /home/emilio/thunderbird/libxul.so
#23 0x00007fec91621042 in  () at /usr/lib/libglib-2.0.so.0
#24 0x00007fec9162081b in g_main_context_dispatch ()
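
Reading the trace from the bottom up: the frame callback installed by SetupFrameCallback runs IdleCallback, which reports an occlusion change to the window, and that notification works its way back into DisableVsync, which then blocks in pthread_mutex_lock. The trace does not show which lock is already held, so the following is only a sketch under the assumption that the same non-recursive mutex is still held further up the stack. All names here (`VsyncSourceSketch`, `mOnOcclusion`, `RunIdleCallbackSketch`) are hypothetical and this is not the actual Gecko code or the actual patch.

```cpp
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a vsync source; not the real WaylandVsyncSource.
class VsyncSourceSketch {
 public:
  // aOnOcclusion plays the role of the occlusion listener chain
  // (frames #20 down to #4 in the trace above).
  explicit VsyncSourceSketch(std::function<void()> aOnOcclusion)
      : mOnOcclusion(std::move(aOnOcclusion)) {}

  // Re-entered from the listener, like DisableVsync() in frame #3.
  void DisableVsync() {
    std::lock_guard<std::mutex> lock(mMutex);
    mVsyncEnabled = false;
  }

  bool IsVsyncEnabled() {
    std::lock_guard<std::mutex> lock(mMutex);
    return mVsyncEnabled;
  }

  // Deadlock-prone shape, kept as a comment because running it would
  // typically hang (relocking a std::mutex on the same thread is undefined
  // behavior):
  //   std::lock_guard<std::mutex> lock(mMutex);
  //   mOnOcclusion();  // re-enters DisableVsync(), which locks mMutex again
  //
  // Safer shape: decide under the lock, call out only after releasing it.
  void IdleCallback() {
    bool shouldNotify = false;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      if (!mOccluded) {
        mOccluded = true;
        shouldNotify = true;
      }
    }  // mMutex is released here.
    if (shouldNotify) {
      mOnOcclusion();
    }
  }

 private:
  std::mutex mMutex;
  std::function<void()> mOnOcclusion;
  bool mVsyncEnabled = true;
  bool mOccluded = false;
};

// Demonstration: the listener re-enters the source without hanging.
bool RunIdleCallbackSketch() {
  VsyncSourceSketch* self = nullptr;
  VsyncSourceSketch source([&self] { self->DisableVsync(); });
  self = &source;
  source.IdleCallback();
  return !source.IsVsyncEnabled();
}
```

The design choice in the sketch is to keep the critical section limited to state updates and to run any callback that might re-enter the object only after the lock has been dropped.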

And add lock annotations too, fixing relevant issues.
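
For readers unfamiliar with lock annotations, here is a minimal sketch of what they look like, assuming Clang's thread safety analysis (enabled with -Wthread-safety). It uses the raw Clang attributes rather than whatever macros the actual patch uses, and every name in it (`TSA`, `AnnotatedMutex`, `AutoLock`, `VsyncStateSketch`) is hypothetical. On other compilers the macro expands to nothing, so the code still builds but is not checked.

```cpp
#include <mutex>

// Expand to a Clang thread-safety attribute when available, else to nothing.
#if defined(__clang__)
#  define TSA(x) __attribute__((x))
#else
#  define TSA(x)
#endif

// A mutex type the analysis can reason about (hypothetical wrapper).
class TSA(capability("mutex")) AnnotatedMutex {
 public:
  // The bodies are excluded from analysis because std::mutex itself is not
  // necessarily annotated; callers still see the acquire/release contract.
  void Lock() TSA(acquire_capability()) TSA(no_thread_safety_analysis) {
    mImpl.lock();
  }
  void Unlock() TSA(release_capability()) TSA(no_thread_safety_analysis) {
    mImpl.unlock();
  }

 private:
  std::mutex mImpl;
};

// RAII guard that acquires in the constructor and releases in the destructor.
class TSA(scoped_lockable) AutoLock {
 public:
  explicit AutoLock(AnnotatedMutex& aMutex) TSA(acquire_capability(aMutex))
      : mMutex(aMutex) {
    mMutex.Lock();
  }
  ~AutoLock() TSA(release_capability()) { mMutex.Unlock(); }

 private:
  AnnotatedMutex& mMutex;
};

class VsyncStateSketch {
 private:
  AnnotatedMutex mMutex;
  // Reading or writing mEnabled without holding mMutex is flagged.
  bool mEnabled TSA(guarded_by(mMutex)) = true;

  // Callers must already hold mMutex.
  void SetEnabledLocked(bool aEnabled) TSA(requires_capability(mMutex)) {
    mEnabled = aEnabled;
  }

 public:
  // Callers must NOT hold mMutex; calling this with the lock held is the
  // kind of re-entrant locking the analysis can report at compile time.
  void Disable() TSA(locks_excluded(mMutex)) {
    AutoLock lock(mMutex);
    SetEnabledLocked(false);
  }

  bool IsEnabled() TSA(locks_excluded(mMutex)) {
    AutoLock lock(mMutex);
    return mEnabled;
  }
};
```

With annotations like these, a call to `Disable()` made while `mMutex` is visibly held in the same function produces a compiler warning instead of a hang at run time. The analysis is intraprocedural, so a re-entry that passes through many unannotated frames is not caught automatically.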

Assignee: nobody → emilio
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b536635c42f1
Avoid deadlock in Wayland vsync. r=rmader
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 108 Branch

[Tracking Requested - why for this release]: seems like a severe issue we should fix asap on beta

:emilio could you add a beta uplift request when you're ready?

Flags: needinfo?(emilio)

Comment on attachment 9298826 [details]
Bug 1795574 - Avoid deadlock in Wayland vsync. r=stransky

Beta/Release Uplift Approval Request

  • User impact if declined: Hangs
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Minimizing all windows (or something similar) in GNOME Wayland or Plasma Wayland probably does it.
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Relatively simple refactoring with extra static analysis.
  • String changes made/needed: none
  • Is Android affected?: No
Flags: needinfo?(emilio)
Attachment #9298826 - Flags: approval-mozilla-beta?
Flags: qe-verify+

Comment on attachment 9298826 [details]
Bug 1795574 - Avoid deadlock in Wayland vsync. r=stransky

Approved for 107.0b2.

Attachment #9298826 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Set release status flags based on info from the regressing bug 1695227

QA Whiteboard: [qa-triaged]

Hello! Tried to verify this today with Firefox 107.0b2 and Firefox 108.0a1 (20221019211615) (Wayland Window Protocol) on Ubuntu 22.04.01 LTS. It seems that I cannot reproduce the issue with Firefox 107.0b2 after minimizing or opening/closing multiple tabs and windows. However, I can still reproduce this issue, or at least a similar one, with Firefox 108.0a1 (20221019211615) after opening/closing multiple tabs with random webpages.
The issue occurs very randomly after Nightly has been open for some time and unfortunately, I don't have any STR to reproduce it. I have also observed that opening and closing https://meet.jit.si/ can sometimes trigger the issue. I also have a profiler capture which I hope can help. Should we file another issue for this or reopen this one?
Attaching a screen recording as well. If more information is needed please let me know. Thank you!

Flags: needinfo?(emilio)

Separate bug please, there's no code related to this patch running in there afaict, thank you!

Flags: needinfo?(emilio)
Flags: needinfo?(stransky)

I tried to verify this issue now that bug 1796392 was fixed on Nightly.

Unfortunately, I cannot reproduce this particular issue with Firefox 107.0a1 (2022-10-15) on Ubuntu 22 Wayland (GNOME version 42.4). I have tried to minimize/reopen all Firefox windows, switch between windows, surf the web for some time, and open/close multiple tabs/windows, but Firefox does not become unresponsive. I have also tried the above-mentioned steps with Firefox 107.0b4 and 108.0a1 (2022-10-24) and the issue does not reproduce.

Can someone that could reliably reproduce this issue verify the fix on Firefox latest beta and latest Nightly? Thank you in advance!

I'm using Nightly as my daily driver. While I haven't been able to reliably reproduce them in a fresh profile, I can confirm that the hangs used to happen in my main profile multiple times a day around 2022-10-17 when I filed bug 1795741 and I haven't seen any since updating to a nightly containing these patches. No idea about beta, sorry.

Having filed the duplicate bug 1796046, I can confirm that it's fixed in the 18th nightly.

Blocks: 1786247