Closed
Bug 957006
Opened 11 years ago
Closed 11 years ago
Crash in mozilla::Maybe<nsAutoString>::~Maybe() (or sometimes [@ EnterBaseline ], or something else) a minute or so after loading a Wired article in Fennec
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
DUPLICATE
of bug 952321
People
(Reporter: kbrosnan, Assigned: djvj)
References
(Blocks 1 open bug, )
Details
(Keywords: crash, reproducible, sec-moderate, Whiteboard: [native-crash])
Crash Data
Attachments
(1 file)
1.34 KB,
patch
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-ebc8b6e8-7560-4267-b3e1-b0a8b2131231.
=============================================================
CCing Rob Miller and dholbert as they submitted crashes maybe they can expand on what they were doing. URLs are all the Wired website.
http://www.wired.com/wiredenterprise/2014/01/coke-iot/
http://www.wired.com/opinion/2013/01/wiretap-backdoors/
wyciwyg://4/http://www.wired.com/threatlevel/2013/12/better-data-security-nail-polish/
wyciwyg://5/http://www.wired.com/design/2013/12/this-is-what-emoji-look-like-irl/?cid=co16436394#slideid-383861
Does not seem to be device specific. Crashes happen on Android API 15+ (Android 4.0 ICS). Looks to be new to Firefox 29.
Comment 1•11 years ago
|
||
The stack is kind of sketchy, but it looks a little more DOM-y to me. The third frame is mozilla::dom::DocumentBinding::get_readyState.
Component: XPCOM → DOM
Comment 2•11 years ago
|
||
Sure, that's the DOMString coming of the stack.
But why should its destructor crash?
Comment 3•11 years ago
|
||
From the crash report, it looks like we're hitting a SIGILL at address 0x818d6a20 when trying to call the destructor on a string stored in a Maybe, which sounds memory-corruption-y, but maybe I'm paranoid.
Comment 4•11 years ago
|
||
I crash whenever I try to load this URL:
http://www.wired.com/threatlevel/2013/12/better-data-security-nail-polish/
in Nightly (on Android). I just tried twice and crashed both times.
The first crash had the signature mentioned here:
bp-332d798f-3c42-47a5-843f-a0f892140107
and the second had a different signature, "[@ XPC_WN_NoHelper_Resolve ]"
bp-cc13689a-3a2d-44cf-a201-0e0af2140107
(Probably doesn't matter, but FWIW: in both cases, I loaded the page by simply typing "wired threadlevel" into my awesomebar and tapping the result from my history)
Reporter | ||
Updated•11 years ago
|
Whiteboard: [native-crash]
Updated•11 years ago
|
Comment 5•11 years ago
|
||
Since we are seeing reliable steps we are tracking for now and look forward to seeing more development on this issue.
Comment 6•11 years ago
|
||
Daniel - Can you look into this? We're not sure who the owner of this bug should be?
Flags: needinfo?(dholbert)
Comment 7•11 years ago
|
||
Here's what I can glean from the stacktrace and looking at code a bit:
* Level 2 of the backtrace, "mozilla::dom::DocumentBinding::get_readyState", is in autogenerated bindings code. It lives in $OBJ/dom/bindings/DocumentBinding.cpp, and it looks like this:
> static bool
> get_readyState(JSContext* cx, JS::Handle<JSObject*> obj, nsIDocument* self, JSJitGetterCallArgs args)
> {
> DOMString result;
> self->GetReadyState(result);
> if (!xpc::NonVoidStringToJsval(cx, result, args.rval())) {
> return false;
> }
> return true;
>
* We must be getting to Level 1 of the backtrace, "mozilla::Maybe<nsAutoString>::~Maybe()", due to the local var "DOMString result" going out of scope. (The class DOMString has "Maybe<nsAutoString> mString;" as a member-var.)
http://mxr.mozilla.org/mozilla-central/source/dom/bindings/DOMString.h#137
Beyond that, I don't really know what this code is doing (and I don't have an android dev/test environment set up currently.)
bz knows a lot about the bindings code and DOMString.h (based on hg blame). bz, do you have any ideas here?
Flags: needinfo?(dholbert) → needinfo?(bzbarsky)
Comment 8•11 years ago
|
||
Nope. See comment 2. :(
What should happen here is that the GetReadyState() call calls "nsIDocument::GetReadyState(nsAString& aReadyState) const", which will invoke "DOMString::operator nsString&()" to be able to make the call. This will assert that the Maybe<nsAutoString> is not constructed yet, then call construct() on it and return its .ref(), which is the nsAutoString&.
Then GetReadyState will write to that nsAString&, like so:
aReadyState.Assign(NS_LITERAL_STRING("loading"));
(or whichever stat it's in). Then we unwind back to the binding code and call NonVoidStringToJsval. This lands us in "bool NonVoidStringToJsval(JSContext* cx, mozilla::dom::DOMString& str, JS::MutableHandleValue rval)", which checks HasStringBuffer, which is false in this case, and calls "NonVoidStringToJsval(cx, str.AsAString(), rval)". That passes the actual nsAutoString& in, which calls the nsAString& overload of NonVoidStringToJsval, and then grabs the string data and sticks it in a JSString, etc.
None of this should lead to a crash in ~Maybe, though.... and I have no idea why this is Android-specific. :(
Flags: needinfo?(bzbarsky)
Comment 9•11 years ago
|
||
OK, I got an Android build environment configured and tried loading this in a debug Fennec build.
I got an abort with this appearing in logcat:
F/MOZ_Assert(22693): Assertion failure: typeAlloc->isArgument(), at /Users/dholbert/builds/fennec/mozilla/js/src/jit/LinearScan.cpp:595
F/libc (22693): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 22782 (Analysis Helper)
I/Gecko (22693): [22693] WARNING: NS_ENSURE_TRUE(siteSpecificUA) failed: file /Users/dholbert/builds/fennec/mozilla/dom/base/Navigator.cpp, line 297
I/DEBUG (22527): unexpected waitpid response: n=22782, status=00000b00
I/DEBUG (22527): ptrace detach from 22782 failed: No such process
I/DEBUG (22527): debuggerd committing suicide to free the zombie!
I/DEBUG (23008): debuggerd: Dec 4 2013 23:15:41
I/ActivityManager( 1363): Process org.mozilla.fennec_dholbert (pid 22693) has died.
Comment 10•11 years ago
|
||
Disregard the assertion in comment 9 -- that's apparently an unrelated recent JS regression that's popping up everywhere. Filed bug 958732 for that.
With that hacked around (so that I can load the wired page), I'm unable to reproduce this in my debug Fennec build, unfortunately. :-/
Comment 11•11 years ago
|
||
...and now I can't reproduce it in my Nightly, either.
So either this was fixed over the last several days, or the wired page changed, or this is just mysterious about when it reproduces.
Hopefully it's the first of those options; I guess we can see from watching crash-stats.
Comment 12•11 years ago
|
||
Ah! I spoke too soon. A little while (maybe 30 seconds to a minute?) after the last time I loaded the Wired article in Android Nightly, it crashed, though with a different signature: [@ EnterBaseline ]
bp-f252e034-6ebc-4ba3-b1cb-0264f2140110
But then I loaded it a few more times, and waited, and was unable to crash. So, this is looking a bit finicky in when it reproduces. (if that last crash was even part of this bug)
Comment 13•11 years ago
|
||
Still no luck reproducing in a debug build on my phone (Nexus 4), though I hit this in Nightly again:
https://crash-stats.mozilla.com/report/index/d62d4ea5-136b-4fed-bb87-46b012140111
Is there any way I can run my Fennec debug build with my Nightly profile?
(I also tried to reproduce on a few-years-old 10" Samsung Tablet, using standard Firefox Nightly, and was unable to reproduce there.)
Comment 14•11 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #13)
> Still no luck reproducing in a debug build on my phone (Nexus 4), though I
> hit this in Nightly again:
> https://crash-stats.mozilla.com/report/index/d62d4ea5-136b-4fed-bb87-
> 46b012140111
>
> Is there any way I can run my Fennec debug build with my Nightly profile?
Not easily. If your phone is rooted, you can copy the profile folder the the sdcard and then try to run the Fennec debug build with "-P" pointing to the sdcard folder. If you are rooted, we can try to do that via IRC.
I'm curious why you think the profile is key. I might be able to make an add-on to move around profiles too.
Comment 15•11 years ago
|
||
(In reply to Mark Finkle (:mfinkle) from comment #14)
> Not easily. If your phone is rooted
(It's not rooted, unfortunately).
> I'm curious why you think the profile is key.
Because I'm only aware (so far) of this issue reproducing on this phone, using Nightly, with my current profile.
If it weren't profile-dependent, I'd expect others to have been able to reproduce this (please chime in if you have been able to), and I'd expect to be able to reproduce it with my own local Fennec build* (but I can't), and I'd expect to be able to reproduce it in Nightly on other devices (but I can't; I've tried 2 other devices -- a 7" tablet and a 10" tablet).
* It's possible that it's formfactor-dependent and opt-dependent, I suppose. On Monday, I'll maybe try a local opt Fennec build on my phone and see if it reproduces with that.
Comment 16•11 years ago
|
||
Scratch that -- it's not dependent on my profile at all.
I just installed Nightly on my girlfriend's Android phone, and loaded the Wired testcase, and after about a minute of letting the page sit, it crashed. We restarted Nightly, and after another minute of viewing the article, it crashed again.
(Not sure why I haven't been able to repro on my Android tablets; maybe the content is viewport-size-dependaet, or something. FWIW, those STR -- load the Wired page & let it sit for a minute -- reproduces the bug for me at least 30% of the time on my phone.)
Her crashes didn't have the ~Maybe() signature, though (nor have any of my own crashes from today).
Her crashes:
bp-55be3ba8-b3b5-4580-9216-04b622140112 [@ EnterBaseline ]
bp-fa7d7b6c-c4b6-450a-b9d2-710e42140112 [@ xpc::WrapperFactory::WrapForSameCompartment(JSContext*, JS::Handle<JSObject*>) ]
I submitted 7 crash reports from my own phone today from loading the Wired article. Six of those were [@ EnterBaseline ], e.g. bp-14618933-dd54-447a-bf96-1452c2140112 and bp-42deede3-45db-47cc-8b85-020472140112. And the remaining one was [@ PR_SetThreadPrivate | js::jit::IonContext::~IonContext() ]: bp-c8b1600f-a5e7-4aca-a604-1bb632140112
So, it looks like this has a lot of different signatures, and a lot of them are JS-related. So I'm starting to think this is a bug in the JS engine.
Given that this is reproducible, maybe we should track down when this regressed. (if it's a regression.) mfinkle, do you know if we have any Android-focused QA folks who could help with that?
Flags: needinfo?(mark.finkle)
Keywords: qawanted,
regressionwindow-wanted
Updated•11 years ago
|
Updated•11 years ago
|
Summary: crash in mozilla::Maybe<nsAutoString>::~Maybe() → Crash in mozilla::Maybe<nsAutoString>::~Maybe() (or sometimes [@ EnterBaseline ], or something else) a minute or so after loading a Wired article in Fennec
Comment 17•11 years ago
|
||
(FWIW: My phone is a Nexus 4 w/ Android 4.4.2, and my girlfriend's is a Moto X running Android 4.4. So, we know it's at least reproducible on those models/versions; probably others as well, though apparently not on tablets as far as I've seen.)
Keywords: reproducible
Comment 18•11 years ago
|
||
Not sure if it's helpful to connect this to all the various signatures, but given that EnterBaseline comes up often, let's connect it to that in our stats and analysis. ;-)
That said, given that our stackwalker does not deal well with JIT frames at this time, it's very much possible that we see varying signatures with crashes in JS JIT code.
Crash Signature: [@ mozilla::Maybe<nsAutoString>::~Maybe()] → [@ mozilla::Maybe<nsAutoString>::~Maybe()]
[@ EnterBaseline ]
Comment 19•11 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #16)
> Given that this is reproducible, maybe we should track down when this
> regressed. (if it's a regression.) mfinkle, do you know if we have any
> Android-focused QA folks who could help with that?
NI'ing Aaron and Kevin. We might be able to get some QA help from them or the Softvision folks.
Flags: needinfo?(mark.finkle)
Flags: needinfo?(kbrosnan)
Flags: needinfo?(aaron.train)
Comment 20•11 years ago
|
||
NI'ing Naveed since this now looks like a JS crash
Flags: needinfo?(nihsanullah)
Comment 21•11 years ago
|
||
None of the above aforementioned URL's are reproducing the crash for me on my Nexus 4 (Android 4.4.2), HTC One (Android 4.4), nor my Galaxy SIV (Android 4.4) and LG Nexus 5 (Android 4.4) nor my Asus Nexus 7 (2013)
Flags: needinfo?(aaron.train)
Reporter | ||
Comment 22•11 years ago
|
||
I would like to see if bug 957475 resolves this. It is sounding kinda similar.
Flags: needinfo?(kbrosnan)
Comment 23•11 years ago
|
||
(In reply to Kevin Brosnan [:kbrosnan] from comment #22)
> I would like to see if bug 957475 resolves this. It is sounding kinda
> similar.
Not sure, as that bug affects 27 and 28 and this one is marked to only affect 29.
Updated•11 years ago
|
tracking-fennec: ? → 29+
Updated•11 years ago
|
Assignee: nobody → nicolas.b.pierron
Flags: needinfo?(nihsanullah)
Comment 25•11 years ago
|
||
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #23)
> (In reply to Kevin Brosnan [:kbrosnan] from comment #22)
> > I would like to see if bug 957475 resolves this. It is sounding kinda
> > similar.
>
> Not sure, as that bug affects 27 and 28 and this one is marked to only
> affect 29.
FWIW: I installed Aurora on my phone and got it to crash once, after a few cycles of [load wired URL from comment 4, wait a minute]. Crash report: bp-a6decb48-cd31-40c7-b339-729c22140117
(I tried to get it to crash again, but couldn't, so this might be less reproducible on Aurora.)
Comment 26•11 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #25)
> FWIW: I installed Aurora on my phone and got it to crash once, after a few
> cycles of [load wired URL from comment 4, wait a minute]. Crash report:
> bp-a6decb48-cd31-40c7-b339-729c22140117
Eric, do you have any idea what might have cause the crash reported in this crash report?
I would not be able to investigate before Monday.
Flags: needinfo?(efaustbmo)
Comment 27•11 years ago
|
||
I looked at this briefly on Friday. I couldn't glean that much from the crashdump.
Here's what I was able to reconstruct:
We are crashing in GetObjectClass() in jsfriendapi.h. This is called from JSObject::is<ProxyObject>(), invoked on line 1508 of IonCaches.cpp in tryAttachProxy().
The actual dereference that crashes is the deref of the TypeObject to get the class.
The assembly the runs (very rougly paraphrased and pared):
reg1 <-- arg
reg1 <-- *(reg1 + 4) /* succeeds, but yields garbage (0xe000df40)
reg2 <-- *(reg1) /* kaboom */
I suspect a bad GC interaction. Use after free, maybe? The object pointer is good enough to dereference once, but contains a bad value, and this is the first time we try to inspect anything going on inside the object in the cache.
If the values inside were bogus, but readable, that would explain how it failed all the shape guards on the Ic to get to the fallback path safely, as well.
I do not own (or have access to for the next few days) a phone with which I could debug this further.
Nicolas, does that give you enough to spring forward with?
Flags: needinfo?(efaustbmo)
Comment 29•11 years ago
|
||
I'm seeing this on this page:
http://css-tricks.com/thought-process-front-end-problem/
Comment 30•11 years ago
|
||
I tried to reproduce it locally with the 5 URLs listed above[1] on an Unagi without any crash.
[1] http://people.mozilla.org/~npierron/shortcuts.html
Flags: needinfo?(nicolas.b.pierron)
Reporter | ||
Comment 31•11 years ago
|
||
This was reported against Firefox for Android. Not Firefox OS.
Comment 32•11 years ago
|
||
(In reply to Kevin Brosnan [:kbrosnan] from comment #31)
> This was reported against Firefox for Android. Not Firefox OS.
I noticed, but:
- I do not have any Android device to test on.
- The JS engine is the same on Firefox for Android and Firefox OS.
- I am reporting a fact.
The only things which differ between the 2 architectures are:
- the GC settings.
- the user agent.
- the number of cores (?)
And other differences are not going to matter, unless this is not a JS Engine issue.
Reporter | ||
Comment 33•11 years ago
|
||
Reproduces rather quickly on Firefox for Android Nexus 5, are you able to get a device from Paris IT or someone else in the office? Else we should get a device to you.
Comment 34•11 years ago
|
||
(In reply to Kevin Brosnan [:kbrosnan] from comment #33)
> Reproduces rather quickly on Firefox for Android Nexus 5, are you able to
> get a device from Paris IT or someone else in the office? Else we should get
> a device to you.
Kannan, can you reproduce and do you know how to debug/investigate Firefox for Android issues?
Flags: needinfo?(kvijayan)
Assignee | ||
Comment 35•11 years ago
|
||
I have a nexus 5 with firefox on it. I'll try to repro it locally.
Flags: needinfo?(kvijayan)
Comment 36•11 years ago
|
||
(In reply to Kannan Vijayan [:djvj] from comment #35)
> I have a nexus 5 with firefox on it. I'll try to repro it locally.
Were you able to reproduce this bug?
Should I assign it to you?
Flags: needinfo?(kvijayan)
Updated•11 years ago
|
Assignee: nicolas.b.pierron → kvijayan
Flags: needinfo?(nicolas.b.pierron)
Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(kvijayan)
Assignee | ||
Comment 37•11 years ago
|
||
Reproduces for me. Now I just need to figure out how to GDB my phone. I suspect because it's not rooted. I'd prefer not to root my phone.
Assignee | ||
Comment 38•11 years ago
|
||
Got a stack trace, from a Feb 01 build (doesn't reproduce on trunk, so I just went back about a month):
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 17982]
0x82ba05b0 in js::jit::TypedOrValueRegister::dataValue (this=0x77fde3f8)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/RegisterSets.h:171
171 JS_ASSERT(hasValue());
(gdb) bt 10
#0 0x82ba05b0 in js::jit::TypedOrValueRegister::dataValue (this=0x77fde3f8)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/RegisterSets.h:171
#1 0x82ba0700 in js::jit::TypedOrValueRegister::valueReg (this=0x77fde3f8)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/RegisterSets.h:210
#2 0x82c3852c in js::jit::GetPropertyIC::tryAttachDOMProxyUnshadowed (this=0x88c23090,
cx=0x8bf2a540, ion=0x88c23000, obj=..., name=..., resetNeeded=true, returnAddr=0x8c574e14,
emitted=0x77fde54b)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/IonCaches.cpp:1455
#3 0x82c38c16 in js::jit::GetPropertyIC::tryAttachProxy (this=0x88c23090, cx=0x8bf2a540,
ion=0x88c23000, obj=..., name=..., returnAddr=0x8c574e14, emitted=0x77fde54b)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/IonCaches.cpp:1521
#4 0x82c39976 in js::jit::GetPropertyIC::tryAttachStub (this=0x88c23090, cx=0x8bf2a540,
ion=0x88c23000, obj=..., name=..., returnAddr=0x8c574e14, emitted=0x77fde54b)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/IonCaches.cpp:1666
#5 0x82c39b6a in js::jit::GetPropertyIC::update (cx=0x8bf2a540, cacheIndex=0, obj=..., vp=...)
at /home/kvijayan/Checkouts/mozilla-inbound/js/src/../../js/src/jit/IonCaches.cpp:1702
#6 0x86233d18 in ?? ()
#7 0x86233d18 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
Updated•11 years ago
|
Group: core-security
Assignee | ||
Comment 39•11 years ago
|
||
I'm reasonably sure the scratch reg that gets pulled up here is going to come from random memory, since the |payload| register is the second field in ValueOperand, and |isFloat_| is the second field in AnyRegister.
There's a remote possiblity this could be controlled, by engineering regalloc to put a critical value into the clobbered register, and then controlling the clobbering to get access to raw memory. Hard to pull off, but there's definitely a shadow of an opening for that kind of attack.
Marking sec-moderate because although attack vector is tricky to exploit, the consequences of a successful exploit are arbitrary memory access.
Whiteboard: [native-crash] → [native-crash] [sec-moderate]
Assignee | ||
Comment 40•11 years ago
|
||
This fixes the crash on my setup (tried three runs, all loaded fine). Can someone apply this patch to a broken build and confirm that it fixes the crash?
I'm using the URL from Comment 4 to test.
Assignee | ||
Updated•11 years ago
|
Attachment #8378633 -
Flags: review?(efaustbmo)
Comment 41•11 years ago
|
||
sec-moderate is a keyword not a whiteboard tag.
Keywords: sec-moderate
Whiteboard: [native-crash] [sec-moderate] → [native-crash]
Comment 42•11 years ago
|
||
That failure looks an awful lot like bug 952321. See bug 952321 comment 35 and following...
Comment 43•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] (reviews will be slow; ask someone else) from comment #42)
> That failure looks an awful lot like bug 952321. See bug 952321 comment 35
> and following...
In fact, I think the patch that landed there should obviate the one attached here.
If not, let me know.
Assignee | ||
Comment 44•11 years ago
|
||
Confirmed dup.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Comment 45•11 years ago
|
||
Comment on attachment 8378633 [details] [diff] [review]
fix-bug-957006.patch
Cancelling review, as bug marked a dupe.
Attachment #8378633 -
Flags: review?(efaustbmo)
Updated•11 years ago
|
Reporter | ||
Updated•10 years ago
|
Keywords: regressionwindow-wanted
Updated•9 years ago
|
Group: core-security → core-security-release
Updated•9 years ago
|
Blocks: shutdownkill
No longer blocks: shutdownkill
Updated•8 years ago
|
Group: core-security-release
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•