Crash in nsGlobalWindowOuter::PrepareForProcessChange due to GetRemoteOuterWindowProxy failing
Categories
(Core :: DOM: Window and Location, defect)
Tracking
()
People
(Reporter: bradwerth, Unassigned)
References
Details
(Keywords: stalled)
Crash Data
Attachments
(1 obsolete file)
Crashed just now: https://crash-stats.mozilla.org/report/index/319e0958-2c94-496f-8e26-ac7cf0230424
Looks like nsGlobalWindowOuter::PrepareForProcessChange will null dereference the BrowsingContext if the window associated with the proxy has been unlinked.
Reporter | ||
Comment 1•2 years ago
|
||
Perhaps this means that the proxy should addref the window to prevent it from being unlinked? I don't know. I can write a patch that will prevent this specific crash, but I don't know if this is merely covering up a larger problem.
Reporter | ||
Comment 2•2 years ago
|
||
Comment 3•2 years ago
|
||
Andrew, do you know if this is the correct solution?
Comment 4•1 year ago
|
||
mWindowProxy is a weak reference that is supposed to get cleared when we unlink the outer window, so I don't think that should be happening.
The crash you linked doesn't look like a null deref of a BC.
The crash reason is MOZ_CRASH(PrepareForProcessChange GetRemoteOuterWindowProxy)
, which means that the call to GetRemoteOuterWindowProxy failed. That in turn looks like it means that RemoteObjectProxyBase::GetOrCreateProxyObject() failed to return a proxy object, which can fail for a variety of reasons, like failing to allocate a JS object.
Updated•1 year ago
|
Reporter | ||
Comment 5•1 year ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #4)
The crash reason is
MOZ_CRASH(PrepareForProcessChange GetRemoteOuterWindowProxy)
, which means that the call to GetRemoteOuterWindowProxy failed. That in turn looks like it means that RemoteObjectProxyBase::GetOrCreateProxyObject() failed to return a proxy object, which can fail for a variety of reasons, like failing to allocate a JS object.
Ah, you're right, I wasn't reading the crash report correctly. I'm not sure how to fixup GetOrCreateProxyObject
. I'll take myself off this Bug and abandon the patch.
Updated•1 year ago
|
Reporter | ||
Updated•1 year ago
|
Comment 6•1 year ago
|
||
Yeah, this stuff is pretty nasty. I think the basic issue is that it is hard to know what to even do if there is a failure, because we're in the middle of some complex mutating operation we can't really roll back. So we just crash.
Updated•1 year ago
|
Comment 7•1 year ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 10 content process crashes on release
:edgar, could you consider increasing the severity of this top-crash bug?
For more information, please visit BugBot documentation.
Comment 8•1 year ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #4)
The crash reason is
MOZ_CRASH(PrepareForProcessChange GetRemoteOuterWindowProxy)
, which means that the call to GetRemoteOuterWindowProxy failed. That in turn looks like it means that RemoteObjectProxyBase::GetOrCreateProxyObject() failed to return a proxy object, which can fail for a variety of reasons, like failing to allocate a JS object.
FWIW, the first 5 reports I clicked on all have JSOutOfMemory: Reported
set.
IIUC the block that runs the nsAutoScriptBlocker::~nsAutoScriptBlocker()
is here in void nsFrameLoaderOwner::ChangeRemotenessCommon
. - which is a void
function. The nsGlobalWindowOuter::PrepareForProcessChange
function just returns if we find no outer window which is probably equally unexpected but then crashes apparently on OOM. If it is fine to just return in the first case, wouldn't it be fine to do so also in the OOM case (and maybe crash later due to it or not)? And if not, it would be nice to at least mark these crashes as OOM to make it easier to ignore them.
Reporter | ||
Comment 9•1 year ago
|
||
It does seem that the JSOutOfMemory: Reported
state is potentially a temporary state since it can be recovered from during GC. Indeed in at least some of the crash reports with JSOutOfMemory: Reported
in the crash annotations, there appears to be plenty of available memory.
If this is correct, then maybe it's possible for RemoteObjectProxyBase::GetOrCreateProxyObject
or something further up the call stack to detect the failed allocation and then force a synchronous GC and try again.
increasing increasing the severity to S2 as it is a top-crash.
Comment 11•1 year ago
|
||
Andrew, is there anything we can do here, based on new findings from comment 8 and comment 9? Thanks!
Updated•1 year ago
|
Comment 12•1 year ago
|
||
Those are some interesting ideas, but I'm not sure if they will help.
The current code crashes instead of bailing out because we're in the middle of a multi-step process to replace one window proxy with another, where we've cleared out some references in an outer window and a browsing context, so we're not in a valid state. If we just keep going, we'll probably crash on a null deref later. With some refactoring, maybe we could attempt to create the new proxy object before we start clearing things out, but I'm not sure how weird it would be if we give up and leave that floating around.
As for GCing, I think we already do a "last ditch" GC whenever we fail to allocate a JS object, so I don't think an extra GC will help. (I'm not super familiar with that mechanism so I might be wrong.)
I could try pushing the crash into RemoteObjectProxyBase::GetOrCreateProxyObject() when !aTransplantTo, as the callers that pass in a non-null value all crash on failure. The failure could be happening in an unexpected place. I could also make the MOZ_ASSERT(!aTransplantTo) into a release assert to eliminate the chance there is some weirdness going on there. My observation of bug 1405521, where bz landed a tremendous amount of instrumentation to try to understand a crash in a similar-ish bit of DOM code makes me worry that this could be a bit of a wild goose chase, but moving the crash would be pretty simple so we might as well try it.
Comment 13•1 year ago
|
||
Peter, any ideas? I haven't looked at this code in a few years so maybe I'm overlooking something. Thanks.
Comment 14•1 year ago
|
||
My new crash points have been hit 3 times now on Nightly.
bp-8d589a3d-c468-4e01-a7d8-209800230917 GOCPO failed at JS_SetImmutablePrototype
bp-3cd2051b-98d8-4fc5-ba41-a84da0230917 GOCPO failed at NewProxyObject
bp-ada12b07-3711-4bf3-8b73-bd6bd0230914 GOCPO failed at NewProxyObject
So, yeah it looks like it is failing after trying to allocate some JS stuff.
Updated•1 year ago
|
Comment 15•1 year ago
|
||
Based on the topcrash criteria, the crash signatures linked to this bug are not in the topcrash signatures anymore.
For more information, please visit BugBot documentation.
Updated•5 months ago
|
Description
•