Crash in [@ nsMemoryReporterManager::RegisterReporterHelper]
Categories
(Core :: XPConnect, defect, P2)
Tracking
()
People
(Reporter: philipp, Assigned: mccr8)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
17.35 KB,
image/png
|
Details |
This bug is for crash report bp-7b417e22-0f1e-419b-927e-7545e0200803.
Top 10 frames of crashing thread:
0 xul.dll nsMemoryReporterManager::RegisterReporterHelper xpcom/base/nsMemoryReporterManager.cpp
1 xul.dll mozilla::RegisterWeakMemoryReporter xpcom/base/nsMemoryReporterManager.cpp:2670
2 xul.dll mozilla::URLPreloader::URLPreloader js/xpconnect/loader/URLPreloader.cpp:95
3 xul.dll static mozilla::URLPreloader::GetSingleton js/xpconnect/loader/URLPreloader.cpp:81
4 xul.dll mozilla::ScriptPreloader::InitCache js/xpconnect/loader/ScriptPreloader.cpp:439
5 xul.dll static mozilla::ScriptPreloader::GetChildSingleton js/xpconnect/loader/ScriptPreloader.cpp:148
6 xul.dll static mozilla::ScriptPreloader::GetSingleton js/xpconnect/loader/ScriptPreloader.cpp:107
7 xul.dll mozJSComponentLoader::ObjectForLocation js/xpconnect/loader/mozJSComponentLoader.cpp:756
8 xul.dll mozJSComponentLoader::Import js/xpconnect/loader/mozJSComponentLoader.cpp:1276
9 xul.dll mozilla::dom::module_getter::ModuleGetter dom/base/ChromeUtils.cpp:576
the volume of this crash signature is visibly increasing at the start of the 80.0b cycle and accounting for 1.7% of browser crash in beta 2. crashing installations are on windows and crashing with MOZ_CRASH(CrashIfRefcountIsZero: refcount is zero)
.
Assignee | ||
Comment 1•5 years ago
|
||
It looks like this is all the URLPreloader. The problem is that we're registering it as a memory reporter in the ctor, so the refcount is zero. This can cause some internal problems in the memory reporter manager, so we assert against it. dthayer has been doing some work on this class recently, but at a quick glance this issue would seem to be at least a few years old. Maybe Doug's changes made us use it more?
The fix would be to add an init method that does the registration, and only call it after there has been an addref.
Assignee | ||
Comment 2•5 years ago
|
||
I see a decent amount of these crashes in 78, but it does appear to have increased quite a bit in 80.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
In the parent process, URLPreloader::InitInternal() will call AddObserver with the preloader, which will normally ensure that there's at least one reference. (I'm not sure how it works in the child process.)
These crashes all have ShutdownProgress of xpcom-shutdown. During XPCOM shutdown, it looks like we shut down the observer service, then clear ClearOnShutdown pointers like the singleton, then eventually shut down the JS engine. Presumably at some point before the last step some JS runs and does something that accesses the preloader. It looks like nsStringBundleBase::ParseProperties() is accessing it while we're shutting down a thread.
It looks like URLPreloader::sInitialized is not cleared after the preloader is initially shut down, so we end up recreating it instead of erroring out in URLPreloader::ReadURI.
Assignee | ||
Comment 4•5 years ago
|
||
Another fun wrinkle. After the observer service shuts down, it still exists, so you can get the service, so this line will succeed: nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
. (Note that this will cause a null deref if you call it a little later, after Services has shut down!)
Then on the next line it does this: obs->AddObserver(this, DELAYED_STARTUP_TOPIC, false);
. Now that will fail, because mShuttingDown
is true, but the URLPreloader doesn't check the return value, so it thinks it has successfully initialized, and it registers as a weak memory reporter without being addrefed. Maybe that's the immediate cause of this crash.
I'm still not sure how the weak register call works in the child process.
Assignee | ||
Comment 5•5 years ago
|
||
I'm not sure what I was searching for before, but it looks like about 300 results of these crashes are happening via mozJSComponentLoader::ObjectForLocation, and only about 60 are happening via nsStringBundleBase (the latter only on 79 and older), which is more in line with what I'd expect.
Assignee | ||
Comment 6•5 years ago
|
||
I'm not really sure what should be done if we're in shutdown and the ScriptPreloader has already gone away and we're recreating it. The callers of ScriptPreloader::GetSingleton() aren't set up to deal with there not being one, and ScriptPreloader::InitCache() doesn't seem to be set up to deal with there not being a URL preloader. I'm not sure if it will be okay with a URL preloader that hasn't been successfully initialized. This code also looks old, so I'm not sure where things have started going wrong.
Assignee | ||
Comment 7•5 years ago
|
||
I'll move this to XPConnect, because I think the main problem is things going wrong when we recreate the preloader during shutdown.
Reporter | ||
Comment 8•5 years ago
|
||
interestingly, so far there are no crashes in 80.0b3.
Assignee | ||
Comment 9•5 years ago
|
||
Yeah, I'm not sure why this is suddenly happening so much. The interesting part of the stack would be in JS, but of course we don't have that in the crash report.
Assignee | ||
Comment 10•5 years ago
|
||
The 78 crash involves l10n, so maybe the 80 crash also involves l10n in a way that is less obvious, and the localizations aren't updated in the freshly-rolled out beta 3? I dunno how any of that really works.
Assignee | ||
Comment 11•5 years ago
|
||
80b4 has a crash now: bp-9c4a4c18-6ec0-42e4-9af5-cf2e00200805
It looks a little different than the other 80 crashes. It is also after xpcom-shutdown. We're in ScriptCacheParent::Recv__delete__, and trying to put some cached script information into the preloader, but the preloader has gone away so we're recreating it. (ScriptCacheChild::ActorDestroy has the same issue.) I'm still not sure why we're seeing so many of these crashes on beta.
Assignee | ||
Comment 12•5 years ago
|
||
For this particular case, it looks like we can just not do anything if the preloader doesn't exist.
Updated•5 years ago
|
Reporter | ||
Comment 13•5 years ago
|
||
when taking a closer look at the crash data, it appears that we start getting crash reports from a version just once its successor was getting released (this is also what caught me off-guard in comment #8).
so it's likely something related to an update that's triggering this problem.
version | release date | crashes starting |
---|---|---|
80.0b1 | 2020-07-29 | 2020-07-31 |
80.0b2 | 2020-07-31 | 2020-08-03 |
80.0b3 | 2020-08-03 | 2020-08-05 |
80.0b4 | 2020-08-05 | ? |
Reporter | ||
Comment 15•5 years ago
|
||
this dll correlation involving the BITS service also indicates the crash commonly happens around the time of an update:
(80.22% in signature vs 03.77% overall) Module "BitsProxy.dll" = true [96.69% vs 08.36% if platform_pretty_version = Windows 10]
Assignee | ||
Comment 16•5 years ago
|
||
Between this bug 1656974, maybe something is causing shutdown to take a lot longer. The crashes here look like shutdown races that were latent, and maybe JS code running very late in shutdown when it wouldn't have before is the issue.
Comment 17•5 years ago
|
||
So far it looks like the changes from bug 1657231 made this signature go away in 80.0b7.
Assignee | ||
Comment 18•5 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #17)
So far it looks like the changes from bug 1657231 made this signature go away in 80.0b7.
I think it is too soon to say. The pattern we've seen before is that the crashes only show up when the next beta rolls out. So we need to see what happens to b7 when b8 is released. (See comment 13.)
Comment 19•5 years ago
|
||
Doh, right, I somehow forgot about that yesterday...
Comment 20•4 years ago
|
||
80.0b8 shipped on Friday, and still no crashes from 80.0b7 on crash-stats.
Updated•4 years ago
|
Comment 22•4 years ago
|
||
No crashes in 80.0b8 or 80.0rc1. Can we declare victory here?
Assignee | ||
Comment 23•4 years ago
|
||
Sure!
Updated•4 years ago
|
Description
•