Closed
Bug 1412391
Opened 8 years ago
Closed 5 years ago
Crash in js::shadow::Object::slotRef
Categories
(Core :: XPConnect, defect, P1)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: wsmwk, Assigned: sfink)
Details
(4 keywords)
Crash Data
bp-76bb3aba-c643-490f-b13c-1f99e0171027 56.0a1 20170728100358
I wasn't at keyboard at the time. I don't crash much - maybe once per month. Previous crash was a week ago different signature bp-311849ad-3ab9-4aaf-9275-8d0480171020
0 xul.dll js::shadow::Object::slotRef(unsigned __int64) obj-firefox/dist/include/jsfriendapi.h:586
1 xul.dll xpc::JSXrayTraits::construct(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&, js::Wrapper const&) js/xpconnect/wrappers/XrayWrapper.cpp:949
2 xul.dll xpc::XrayWrapper<js::CrossCompartmentWrapper, xpc::JSXrayTraits>::construct(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/xpconnect/wrappers/XrayWrapper.cpp:2471
3 xul.dll js::Proxy::construct(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/Proxy.cpp:501
4 xul.dll js::proxy_Construct(JSContext*, unsigned int, JS::Value*) js/src/proxy/Proxy.cpp:750
5 xul.dll InternalConstruct js/src/vm/Interpreter.cpp:572
6 xul.dll js::ConstructFromStack(JSContext*, JS::CallArgs const&) js/src/vm/Interpreter.cpp:598
7 xul.dll Interpret js/src/vm/Interpreter.cpp:3056
8 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:409
9 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:487
10 xul.dll js::Wrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/Wrapper.cpp:169
11 xul.dll js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/CrossCompartmentWrapper.cpp:359
12 xul.dll js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/Proxy.cpp:481
13 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:451
14 xul.dll Interpret js/src/vm/Interpreter.cpp:3064
15 xul.dll js::RunScript(JSContext*, js::RunState&) js/src/vm/Interpreter.cpp:409
16 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:487
17 xul.dll js::CallSelfHostedFunction(JSContext*, JS::Handle<js::PropertyName*>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>) js/src/vm/SelfHosting.cpp:1678
18 xul.dll AsyncFunctionResume js/src/vm/AsyncFunction.cpp:192
19 xul.dll AsyncFunctionAwaitPromiseReactionJob js/src/builtin/Promise.cpp:882
20 xul.dll PromiseReactionJob js/src/builtin/Promise.cpp:962
21 xul.dll js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) js/src/vm/Interpreter.cpp:469
22 xul.dll JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) js/src/jsapi.cpp:2948
23 xul.dll mozilla::PromiseJobRunnable::Run() xpcom/base/CycleCollectedJSContext.cpp:209
24 xul.dll mozilla::dom::Promise::PerformMicroTaskCheckpoint() dom/promise/Promise.cpp:535
25 xul.dll mozilla::CycleCollectedJSContext::AfterProcessTask(unsigned int) xpcom/base/CycleCollectedJSContext.cpp:362
Not a high crash rate, in fact rare. And most are version 56[1], but not all. For example
bp-984d1f6d-aad3-4b43-ad27-a3df70171023 57.0b9
bp-9351123d-9978-4f00-8f9d-bdf920171022 58.0a1
[1] https://crash-stats.mozilla.com/signature/?product=Firefox&signature=js%3A%3Ashadow%3A%3AObject%3A%3AslotRef&date=%3E%3D2017-10-20T15%3A05%3A01.000Z&date=%3C2017-10-27T15%3A05%3A01.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-version&_sort=-date&page=1
Comment 1•8 years ago
|
||
Crashes come from many different sources that call js::shadow::Object::slotRef (mostly from DOM code, such as DOMGlobalHasProtoAndIFaceCache and Unwrap*() methods.
Most of the crashes are wildptrs. 1 in the last 3 months is a ...bad0badN crash (GC poison). about 10 are UAF e5 crashes.
Crashes go *way* back, but maybe only for some of the callers of slotRef(). We might consider making slotRef() a non-terminal in processing crashreports, but I'll leave that thought for people who look into this (and if they think there are multiple causes depending on caller, or if this is simply a landmine laid in this routine that lots of different calling paths can trip over.
some of the crashes are also nullptr's, but it's unclear if that's by chance or not.
Group: core-security
Comment 2•8 years ago
|
||
The specific crash in comment 0 is in this code in JSXrayTraits::construct:
JS::RootedObject holder(cx, self.ensureHolder(cx, wrapper));
if (self.getProtoKey(holder) == JSProto_Function) {
on the "if" line. getProtoKey does:
int32_t key = js::GetReservedSlot(holder, SLOT_PROTOKEY).toInt32();
return static_cast<JSProtoKey>(key);
and the crash is at 0x8. So "holder" was null. As in, we OOMed or so, similar to bug 1407414. The crash report does say "System Memory Use Percentage: 98" and claims only 300MB of RAM are free...
Looking at the general crash reports for Object::slotRef, a bunch of them are near-null from JSXrayTraits::construct. Some are near-null from WindowBinding::Wrap but it's hard to say what source that corresponds to without doing some digging. I so wish crash-stats showed the mercurial changeset. :( I see one with near-null in ModuleEnvironmentObject::lookupImport.
I do see a bunch of wild-reads under UnwrapPossiblyNotInitializedDOMObject from things like DocumentFragmentBinding::_finalize and TextBinding::_finalize and MessageEventBinding::_finalize and DataTransferBinding::_finalize and NodeListBinding::DOMProxyHandler::finalize and so on. Those are called from either CycleCollectedJSRuntime::JSObjectsTenured or FinalizeTypedArenas<JSObject>.
I am going to file a new bug on the JSXrayTraits::construct crash; that one is a guaranteed near-null, I believe. I'm pretty worried about those binding finalization crashes, though. Those suggest that the pointer to the thing we're finalizing is problematic, I think.
Flags: needinfo?(sphink)
Comment 3•8 years ago
|
||
I filed bug 1414461 for the ensureHolder OOM thing.
Updated•8 years ago
|
Group: core-security → dom-core-security
| Assignee | ||
Comment 4•8 years ago
|
||
I looked at a sampling of the crash reports. I saw the _finalize ones. But I also saw https://crash-stats.mozilla.com/report/index/44db6088-00fc-4d07-aebf-3e6bc0171119 that is crashing in UnwrapDOMObject<MediaQueryList> which is not being called during finalization. Are partially or uninitialized DOM objects leaking out somewhere?
One potential source might be bug 1418528, where we CheckedUnwrap something, get access denied, and then use the resulting nullptr anyway. Though I don't think that's very easy to trigger.
There's also https://crash-stats.mozilla.com/report/index/c360692d-7340-4811-8c06-0b9590171119 that looks like it's coming from an ObjectValue(nullptr). If I'm really lucky, the patch in bug 1366083 will smoke out some of those, though they'll have to reach a GC to be caught, and the information provided won't really be more than what we already have in this bug. I'm also doing a try push that checks for ObjectValue(nullptr) being created, though I don't expect that to catch anything here. It might be interesting to see how much slowdown it provides, and perhaps try it out on beta.
Flags: needinfo?(sphink)
Comment 5•8 years ago
|
||
> Are partially or uninitialized DOM objects leaking out somewhere?
That would be pretty bizarre, but maybe...
> that looks like it's coming from an ObjectValue(nullptr)
Can we do a diagnostic assert (so nightly and early beta only) on ObjectValue(nullptr) to smoke those out?
Comment 6•8 years ago
|
||
(for comment 5)
In the interest of having this assigned so it looks from the outside like it's progressing, I'm giving it to you, Steve. We can change that as appropriate later. Thanks for understanding :)
Assignee: nobody → sphink
Flags: needinfo?(sphink)
Priority: -- → P1
Comment 7•8 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #6)
> (for comment 5)
>
> In the interest of having this assigned so it looks from the outside like
> it's progressing, I'm giving it to you, Steve. We can change that as
> appropriate later. Thanks for understanding :)
I realize this will sound snarkier than it should, but what's the purpose of looking busy here?
Also, I'm quite interesting in a response to bz's question as to whether some diagnostic asserts will help us here (comment 5).
Flags: needinfo?(overholt)
Comment 8•8 years ago
|
||
(In reply to Frederik Braun [:freddyb] from comment #7)
> (In reply to Andrew Overholt [:overholt] from comment #6)
> > (for comment 5)
> >
> > In the interest of having this assigned so it looks from the outside like
> > it's progressing, I'm giving it to you, Steve. We can change that as
> > appropriate later. Thanks for understanding :)
>
> I realize this will sound snarkier than it should, but what's the purpose of
> looking busy here?
Oh, I was jokingly referring to the new "sec-high bugs should have an assignee" process. (Not snarky; good question)
Flags: needinfo?(overholt)
Comment 9•7 years ago
|
||
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #5)
> > Are partially or uninitialized DOM objects leaking out somewhere?
>
> That would be pretty bizarre, but maybe...
>
> > that looks like it's coming from an ObjectValue(nullptr)
>
> Can we do a diagnostic assert (so nightly and early beta only) on
> ObjectValue(nullptr) to smoke those out?
I think this question is for you, Steve. Can you give us a reply?
| Assignee | ||
Comment 10•7 years ago
|
||
Implemented bz's suggestion in comment 5 over in bug 1366083. Hopefully that will smoke something out.
Flags: needinfo?(sphink)
Comment 11•7 years ago
|
||
(In reply to Steve Fink [:sfink] [:s:] from comment #10)
> Implemented bz's suggestion in comment 5 over in bug 1366083. Hopefully that
> will smoke something out.
Great! I'm not sure I understand the fix here :-)
The crash signature in bug 1366083 shows no new reports, as of recently. I suppose that's great.
But how does that affect the crash here? ;) What's left to be done? :-)
Flags: needinfo?(sphink)
| Assignee | ||
Comment 12•7 years ago
|
||
(In reply to Frederik Braun [:freddyb] from comment #11)
> (In reply to Steve Fink [:sfink] [:s:] from comment #10)
> > Implemented bz's suggestion in comment 5 over in bug 1366083. Hopefully that
> > will smoke something out.
>
> Great! I'm not sure I understand the fix here :-)
> The crash signature in bug 1366083 shows no new reports, as of recently. I
> suppose that's great.
>
> But how does that affect the crash here? ;)
It doesn't. But based on previous analysis in this bug, it is hoped that the MOZ_DIAGNOSTIC_ASSERT will fire before this bug has a chance to manifest, bringing us (much) closer to the root cause. Release builds won't do the assertion, however, so we would expect this signature to still show up there.
> What's left to be done? :-)
Finding and fixing the bug. There's still not enough to go on. (That said, this bug has more info than many similar crashes.)
On the other hand, it appears that the volume has dropped since February, if I'm reading the graph here right.
Flags: needinfo?(sphink)
Comment 13•7 years ago
|
||
> Finding and fixing the bug. There's still not enough to go on. (That said,
> this bug has more info than many similar crashes.)
>
> On the other hand, it appears that the volume has dropped since February, if
> I'm reading the graph here right.
Looking at a 3-month report in crash-stats, and at the graph tab, I don't see any real change overall.
Comment 14•7 years ago
|
||
Wow, so all I found was only one report for the diagnostic assert: https://crash-stats.mozilla.com/report/index/c95c61b5-36c5-431d-af2f-e72f60180428
| Assignee | ||
Comment 15•7 years ago
|
||
(In reply to Frederik Braun [:freddyb] from comment #14)
> Wow, so all I found was only one report for the diagnostic assert:
> https://crash-stats.mozilla.com/report/index/c95c61b5-36c5-431d-af2f-
> e72f60180428
Ok, so that would be JSScript::getObject(jsbytecode*) returning nullptr. Which could be pc+1 pointing to a bad index, a corrupt object table in the script, mangled offsets that record where the various tables start, or something upstream could have written a nullptr into the table.
I looked into adding a release assertion for when the objects get inserted into the table, but all of those are already guarded by nullptr checks.
But I'm still not seeing other reports for the diagnostic assert, so setObject(nullptr) does not seem to be a large source of issues here.
| Reporter | ||
Comment 18•5 years ago
|
||
I can't common on Fennec, but there are no crashes for current versions of Firefox and THunderbird in the past 6 months. https://crash-stats.mozilla.org/signature/?signature=js%3A%3Ashadow%3A%3AObject%3A%3AslotRef&date=%3E%3D2019-10-23T12%3A32%3A00.000Z&date=%3C2020-04-23T12%3A32%3A00.000Z
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
Comment 19•5 years ago
|
||
Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit auto_nag documentation.
Keywords: stalled
Updated•2 years ago
|
Group: dom-core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•