Crash in xpc::NativeGlobal
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
People
(Reporter: ting, Assigned: jonco)
References
(Regression, )
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
47 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr68+
|
Details | Review |
This bug was filed from the Socorro interface and is report bp-0e195650-7323-4eb6-8741-654022160802. ============================================================= #9 with 0731 Nightly on Windows, 18 crashes from 14 installations.
Comment 1•8 years ago
|
||
20 crashes for 0801 nightly. Stack: xpc::NativeGlobal `anonymous namespace'::MessageEventRunnable::DispatchDOMEvent `anonymous namespace'::MessageEventRunnable::WorkerRun mozilla::dom::workers::WorkerRunnable::Run nsThread::ProcessNextEvent NS_ProcessNextEvent mozilla::dom::workers::WorkerPrivate::DoRunLoop `anonymous namespace'::WorkerThreadPrimaryRunnable::Run nsThread::ProcessNextEvent NS_ProcessNextEvent mozilla::ipc::MessagePumpForNonMainThreads::Run MessageLoop::RunHandler MessageLoop::Run nsThread::ThreadFunc ... I looked at the minidump and find native is 0 at https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/js/xpconnect/wrappers/WrapperFactory.cpp#l633 so we are likely getting the new native from https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/js/xpconnect/wrappers/WrapperFactory.cpp#l642 We crashed trying to dereference native (address 0xfffc000030828d40) Andrea, Boris, any idea?
![]() |
||
Comment 2•8 years ago
|
||
> and find native is 0 at https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/js/xpconnect/wrappers/WrapperFactory.cpp#l633
Uh... That is quite odd. That means the global is not a Web IDL global. But I believe the only non-webidl globals we have are messagemanager globals. And I somewhat doubt those are doing worker stuff.
If you have a minidump and can try debugging this a bit, I'd be interested in the following pieces of information:
1) In `anonymous namespace'::MessageEventRunnable::DispatchDOMEvent, what is this->mBehavior?
2) In xpc::NativeGlobal, what are obj->getClass()->name and obj->getClass()->flags? If you have to just walk data structures and can't call functions, I believe you can replace obj->getClass() with reinterpret_cast<const js::shadow::Object*>(obj)->group->clasp to get this info.
Between those bits of info, we should at least know what sort of thing obj is or might be.
Comment 3•8 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #2) > > and find native is 0 at https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/js/xpconnect/wrappers/WrapperFactory.cpp#l633 > > Uh... That is quite odd. That means the global is not a Web IDL global. > But I believe the only non-webidl globals we have are messagemanager > globals. And I somewhat doubt those are doing worker stuff. > > If you have a minidump and can try debugging this a bit, I'd be interested > in the following pieces of information: > > 1) In `anonymous namespace'::MessageEventRunnable::DispatchDOMEvent, what > is this->mBehavior? > > 2) In xpc::NativeGlobal, what are obj->getClass()->name and > obj->getClass()->flags? If you have to just walk data structures and can't > call functions, I believe you can replace obj->getClass() with > reinterpret_cast<const js::shadow::Object*>(obj)->group->clasp to get this > info. > > Between those bits of info, we should at least know what sort of thing obj > is or might be. Unfortunately the minidump does not contain any heap data so I can't look into those object members. However we know that mBehavior != ParentThreadUnchangedBusyCount because we run to here: https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/dom/workers/WorkerPrivate.cpp#l832 Also according to https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/dom/workers/WorkerPrivate.cpp#l701 this is a worker without window. The crash I'm looking at is bp-6192e0d1-5e2c-46f2-ae70-b6a182160803 About 12 crash reports has .pdf URL so might be related.
![]() |
||
Comment 4•8 years ago
|
||
> because we run to here: https://hg.mozilla.org/mozilla-unified/annotate/465d150bc8be/dom/workers/WorkerPrivate.cpp#l832
OK. So we're not making the call on line 826? In that case sounds like we're running on the worker thread, ok.
In that case, having aTarget->GetParentObject() is completely expected. The "worker without a window" comment is talking about the case when aTarget == aWorkerPrivate, which would come through line 826.
But then, modulo memory corruption or something like that, JS::CurrentGlobalOrNull(aCx) should be the global of the worker (or _maybe_ the worker debugger, though that seems unlikely based on the stack aboce) which should certainly not return null from UnwrapDOMObjectToISupports, since WorkerGlobalScope and WorkerDebuggerGlobalScope quite clearly inherit from nsISupports.
Comment 5•8 years ago
|
||
As seen in bug 1291190, there could be some memory corruption-y issue here, as some users are getting a bunch of different crashes.
![]() |
||
Comment 6•8 years ago
|
||
Memory corruption could cause this pretty easily, fwiw. UnwrapDOMObjectToISupports will return null if the JSClass doesn't have JSCLASS_IS_DOMJSCLASS in its flags or if mDOMObjectIsISupports is false in the DOMJSclass. So a single memory bitflip can make the whole thing go off the rails...
Comment 7•8 years ago
|
||
A fix for bug 1291190 landed in probably the 8-7 Nightly, and this crash persists, so I guess I was wrong about this being related.
Comment 8•8 years ago
|
||
Crash volume for signature 'xpc::NativeGlobal': - nightly (version 51): 97 crashes from 2016-08-01. - aurora (version 50): 195 crashes from 2016-08-01. - beta (version 49): 12 crashes from 2016-08-02. - release (version 48): 4 crashes from 2016-07-25. - esr (version 45): 9 crashes from 2016-05-02. Crash volume on the last weeks (Week N is from 08-22 to 08-28): W. N-1 W. N-2 W. N-3 - nightly 4 13 78 - aurora 69 96 23 - beta 5 6 0 - release 2 0 1 - esr 0 0 0 Affected platforms: Windows, Mac OS X Crash rank on the last 7 days: Browser Content Plugin - nightly #321 - aurora #28 - beta #4633 - release #5598 - esr
Comment hidden (typo) |
Comment 10•8 years ago
|
||
Looks like 1291190 is still on going.
Comment 11•7 years ago
|
||
Kan-Ru, what should we do here? It looks like a low but non-zero volume of these crashes persists :/
Updated•7 years ago
|
Comment 12•6 years ago
|
||
Closing because no crashes reported for 12 weeks.
Comment 13•6 years ago
|
||
There are still some crashes so reopen it.
Comment 14•5 years ago
|
||
Bughunter can reproduce on Windows with http://auravault.com/
Firefox 72.0.1 : bp-d3426af3-4b37-466b-bf98-e8ec40200109
Nightly opt crashes but it fails to submit the crash report.
![]() |
||
Comment 15•5 years ago
|
||
Bob, thank you! That site reproduces for me in a debug m-c build as well, and I have an rr trace. Debugging now.
![]() |
||
Comment 16•5 years ago
|
||
So the basic issue (and for my future reference this is the "firefox-860" rr trace locally) is that we are doing this in ScriptLoader::FinishDynamicImport
:
989 MOZ_ALWAYS_TRUE(jsapi.Init(aRequest->mDynamicPromise));
but aRequest->mDynamicPromise
is null. In a debug build this crashes when it tries to assert !js::IsCrossCompartmentWrapper
at the top of AutoJSAPI::Init
, while in an opt build it would then pass null to xpc::NativeGlobal
and crash somewhere in there. Crashing on the obj = JS::GetNonCCWObjectGlobal(obj);
line totally makes sense in that case. Note that this is not the same crash location as earlier in the bug, but maybe crashreporter was messing things up back then.
Anyway, looking at the reports in crashreporter, all the recent ones are coming via ScriptLoader::FinishDynamicImport
. That code dates back to bug 1342012, which is much newer than this bug report. Chances are, a new bug should have been filed instead of reopening this one, but here we are...
Anyway, In ScriptLoader::FinishDynamicImport
we have aResult == NS_BINDING_ABORTED
, so the dynamic import got canceled (e.g. by the user hitting stop, or navigating away from the page or whatever).
And the reason the promise is null is that we cleared it out by calling FinishDynamicImport
from ScriptLoader::ParsingComplete
, because aTerminated
was true. So we're ending up calling FinishDynamicImport
twice for the same ModuleLoadRequest
and that blows up.
Presumably we should either stop canceling things in ScriptLoader::ParsingComplete
or handle the case when they were canceled there in later callsites to FinishDynamicImport
or in FinishDynamicImport
itself.
![]() |
||
Comment 17•5 years ago
|
||
e.g. by the user hitting stop, or navigating away from the page or whatever
And since I know I did none of those things, more likely by the iframe involved being removed from the DOM or the page calling window.stop()
on it. Which happened before parsing completed, given the rest of the debugging info.
![]() |
||
Comment 18•5 years ago
|
||
https://pernos.co/debug/R5l4wBNaRDUpvwG4zaWn4w/index.html#f{m[BjdG,AQsm_,t[iw,Q6E_,f{e[BjdG,AQsZ_,s{abG1yFwAA,bAYk,oD/ixFA,uD+1UwA___ is this crash in pernosco: the two notebook entries correspond to the two calls to ScriptLoader::FinishDynamicImport
Updated•5 years ago
|
Assignee | ||
Comment 19•5 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #18)
Thanks for debugging this.
I think we need to clear the list of dynamic import requests in ScriptLoader::ParsingComplete when aTerminated is true, like we do for all the other kinds of requests.
I wasn't able to write a crashtest for this. Using window.stop() works, but makes the test harness hang waiting for the load event to fire. I couldn't get a version to work using iframes.
Assignee | ||
Comment 20•5 years ago
|
||
![]() |
||
Comment 21•5 years ago
|
||
I think we need to clear the list of dynamic import requests in ScriptLoader::ParsingComplete when aTerminated is true, like we do for all the other kinds of requests.
It's not entirely clear to me that we really need to clear anything there, but OK.
Using window.stop() works, but makes the test harness hang waiting for the load event to fire
Did you call stop()
on a toplevel window, or on a subframe that contained the dynamic module load?
I couldn't get a version to work using iframes.
The crashing site is crashing because it navigates a subframe with a pending dynamic module load, right? Did that not reproduce the crash in our test harness?
Assignee | ||
Comment 22•5 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #21)
It turns out I didn't understand how to do iframes properly. window.stop() in an iframe does work. I'll update the patch.
Comment 23•5 years ago
|
||
Pushed by jcoppeard@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/260d440770c6 Clear the list of dynamic import requests after cancelling them in ScriptLoader::ParsingComplete as we do for other requests r=bzbarsky
Comment 24•5 years ago
|
||
bugherder |
Updated•5 years ago
|
Comment 25•5 years ago
|
||
Initial Nightly data is looking good so far. What do you think about nominating this for Beta and ESR approval, Jon?
Assignee | ||
Comment 26•5 years ago
|
||
Comment on attachment 9120445 [details]
Bug 1291535 - Clear the list of dynamic import requests after cancelling them in ScriptLoader::ParsingComplete as we do for other requests r?bzbarsky
Beta/Release Uplift Approval Request
- User impact if declined: Possible crash when using import() in some situations.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This is a small and simple change and is covered by tests.
- String changes made/needed:
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: This fixes a crash.
- User impact if declined: Possible crash when using import() in some situations.
- Fix Landed on Version: 74
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This is a small and simple change and is covered by tests.
- String or UUID changes made by this patch:
Comment 27•5 years ago
|
||
Comment on attachment 9120445 [details]
Bug 1291535 - Clear the list of dynamic import requests after cancelling them in ScriptLoader::ParsingComplete as we do for other requests r?bzbarsky
Fixes a lingering crash with dynamic module imports. Approved for 73.0b8 and 68.5esr.
Comment 28•5 years ago
|
||
bugherder uplift |
Comment 29•5 years ago
|
||
bugherder uplift |
Updated•3 years ago
|
Description
•