This bug was filed from the Socorro interface and is report bp-845c5136-b9b2-4d3d-a874-534dd0170810. ============================================================= Crashing Thread (0) Frame Module Signature Source 0 ntdll.dll RtlEnterCriticalSection 1 xul.dll mozilla::MonitorAutoLock::MonitorAutoLock(mozilla::Monitor&) obj-firefox/dist/include/mozilla/Monitor.h:78 2 xul.dll <lambda_f3a3d6898d1d18cace97913c9d9a8661>::operator() storage/mozStorageService.cpp:949 3 xul.dll mozilla::storage::Service::Observe(nsISupports*, char const*, char16_t const*) storage/mozStorageService.cpp:944 4 xul.dll nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverList.cpp:112 5 xul.dll nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) xpcom/ds/nsObserverService.cpp:295 6 xul.dll mozilla::ShutdownXPCOM(nsIServiceManager*) xpcom/build/XPCOMInit.cpp:894 7 xul.dll ScopedXPCOMStartup::~ScopedXPCOMStartup() toolkit/xre/nsAppRunner.cpp:1464 8 xul.dll mozilla::UniquePtr<ScopedXPCOMStartup, mozilla::DefaultDelete<ScopedXPCOMStartup> >::reset(ScopedXPCOMStartup*) obj-firefox/dist/include/mozilla/UniquePtr.h:345 9 xul.dll XREMain::XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4800 10 xul.dll XRE_main(int, char** const, mozilla::BootstrapConfig const&) toolkit/xre/nsAppRunner.cpp:4867 11 firefox.exe NS_internal_main(int, char**, char**) browser/app/nsBrowserApp.cpp:309 12 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:115 13 firefox.exe __scrt_common_main_seh f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253 14 kernel32.dll BaseThreadInitThunk 15 ntdll.dll RtlUserThreadStart this crash signature is newly showing up in firefox 56 - so far only from windows users on 64bit versions of firefox.
I believe this is happening due to the lack of a kung fu death grip. More specifically, our xpcom-shutdown-threads logic is: - Remove all the observers, and therefore the only references the Service holds that keep itself alive. Each Connection that exists also holds a reference to the service, so we are alive only as long as there exists a live Connection instance. - Now access our own state, iterating over all of the currently registered Connections. We keep looping if any were in the process of performing an AsyncClose(). This has been dangerous for a really long time. However, it's only crash-worthy if no Connection instances exist. We've been doing better at shutdown in a variety of places, which ironically helps make us more likely to crash. I'm going to add a death grip because it is indeed preferable to have all of the observers removed before we spin our event loop.
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Created attachment 8896103 [details] [diff] [review] storage::Service needs a death grip when removing strong observer references nsObserverService::RemoveObserver has a comment about "stupid consumers" and some existing death-grip logic to avoid infinite recursion. Clearly it should have gone a step further and done NS_ProxyRelease(..., aAlwaysProxy=true)!
Attachment #8896103 - Flags: review?(bkelly)
Comment on attachment 8896103 [details] [diff] [review] storage::Service needs a death grip when removing strong observer references Review of attachment 8896103 [details] [diff] [review]: ----------------------------------------------------------------- Seems reasonable given the comment. I'm not sure RemoveObserver() could safely use NS_ProxyRelease(). Not sure how that would play with things like xpcom shutdown where observers spin the event loop, etc. Also, it would probably trigger a lot more allocations since I think observers get added/removed a lot. (Although maybe that would be in the noise.)
Attachment #8896103 - Flags: review?(bkelly) → review+
Thanks for the review! The NS_ProxyRelease comment was meant as a joke, although does not read as one sufficiently. I'll amend the commit comment before pushing.
https://hg.mozilla.org/integration/mozilla-inbound/rev/f8282594dc5aeb01df502a388b492106a2c4ae35 Bug 1389279 - storage::Service needs a death grip when removing strong observer references. r=bkelly
Status: ASSIGNED → RESOLVED
Last Resolved: 9 months ago
status-firefox57: ? → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Hi :asuth, Since this is a new regression, do you think this is worth uplifting to Beta56?
(In reply to Gerry Chang [:gchang] from comment #8) > Since this is a new regression, do you think this is worth uplifting to > Beta56? Yes, crash stats is showing at least some shutdown crashes due to this on 56b, and the fix is simple enough and has baked enough that I'll ask for approval now.
Comment on attachment 8896103 [details] [diff] [review] storage::Service needs a death grip when removing strong observer references [Feature/Bug causing the regression]: No direct bug. This is a longstanding issue that was not made obvious before because many mozStorage consumers were bad at cleaning up. They've gotten better. [User impact if declined]: Rare-ish shutdown crashes. We're at 4 right now, but it's quite possible that users with lots of (legacy) extensions exacerbate the problem, so the crash rate could be much higher on release if not taken. [Is this code covered by automated tests?]: Not directly; there may have been an intermittent orange or two that was this. Tests would reliably crash on poisoned memory if they reliably exercised this. However, this was a dumb/naive bug and it's not clear the testing effort is worth it given that we'll never remove the kung fu death grip. [Has the fix been verified in Nightly?]: It has baked on nightly. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: No other uplifts. [Is the change risky?]: No. [Why is the change risky/not risky?]: We're adding a strong refcount at a point when we 100% know our object is not destroyed. We hold it through the period where we previously had a use-after-free risk. Totes safe. [String changes made/needed]: No string changes.
Attachment #8896103 - Flags: approval-mozilla-beta?
That is 4 crashes/week.
Comment on attachment 8896103 [details] [diff] [review] storage::Service needs a death grip when removing strong observer references Fix for crash that may become worse on release. Let's take this fix for 56 beta 5.
Attachment #8896103 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
status-firefox56: affected → fixed
You need to log in before you can comment on or make changes to this bug.