Closed
Bug 1008107
Opened 11 years ago
Closed 11 years ago
crash in JS::Heap<JSObject*>::set(JSObject*)
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
VERIFIED
FIXED
mozilla34
People
(Reporter: cork, Assigned: billm)
References
Details
(Keywords: crash, topcrash)
Crash Data
Attachments
(1 file, 1 obsolete file)
934 bytes,
patch
|
bzbarsky
:
review+
lmandel
:
approval-mozilla-aurora+
lmandel
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-495cea4f-a7ba-4d72-ad50-8c72a2140509.
=============================================================
Crashes on first restart after running firefox updater, I've had three different crashes, but this is the most common one.
Additional crash reports:
bp-d45fc2cf-48ce-4823-8e2f-9aae02140509
bp-7767d5ff-2b24-4862-aa22-cca822140508
![]() |
||
Comment 1•11 years ago
|
||
So the relevant part of the stack here is that sandbox_finalize calls SandboxPrivate::ForgetGlobalObject, which calls nsWrapperCache::ClearWrapper, which calls nsWrapperCache::SetWrapperJSObject(nullptr).
This ends up doing mWrapper = nullptr, which then ends up crashing at address 0x28.
The old value of mWrapper is presumably the thing being finalized, fwiw.
![]() |
||
Comment 2•11 years ago
|
||
Also, this is a 64-bit build, fwiw. Still 0x28 is fairly far from 0...
![]() |
||
Comment 3•11 years ago
|
||
Oh, the actual crash is claimed to be on line 241 of RootingApi.h, which is this:
241 } else if (js::GCMethods<T>::needsPostBarrier(ptr)) {
Comment 4•11 years ago
|
||
I can't see what's causing it to access that address.
Having thought about it, I don't see anything wrong with setting a Heap<JSObject*> to nullptr in a finalizer. It will check whether the previous value was in the nursery (in GCMethods<JSObject*>::needsPostBarrer()), but this should always be false in this situation: when we are in a finalizer, a major GC is running, hence a minor GC must have happened first. In this minor GC the object will have been marked because of Heap<>'s postbarrier, so the object now must be tenured. So nulling the pointer will not do anything special when it sees the previous value was not in the nursery.
FWIW: If this helps.
Same crash Signature on Nightly Windows 32bit builds also.
Crash report ID: bp-6cecca4d-5ea3-41cc-912a-d596f2140519
Not the same Crash Address: 0x1759253c, but the Crash Thread also claims the Source to be in RootingAPI.h.
![]() |
||
Comment 6•11 years ago
|
||
That crash is not on a near-0 address; it's an EXCEPTION_ILLEGAL_INSTRUCTION at 0x1759253c.
It also has a totally different stack.... Please file a new bug on it?
Flags: needinfo?(Event.H04iz0n)
Comment 7•11 years ago
|
||
[Tracking Requested - why for this release]: This is listed today as a top crasher on Fx32, moving up in rank several places from last week.
Comment 8•11 years ago
|
||
It looks like it is crashing on this line in sandbox_finalize:
static_cast<SandboxPrivate *>(sop)->ForgetGlobalObject();
Do you have any ideas, Gabor?
Flags: needinfo?(gkrizsanits)
Comment 9•11 years ago
|
||
Sorry but I don't have a clue. The code looks right to me and I don't know enough about the GC internals to have any good guesses. Apart from the random memory corruption there are two special cases about the lifetime of sandbox globals that _might_ be related and I would check them out. One is the JunkScope (maybe bobby's latest patch will make a difference then) the other is the NukeSandbox which can be triggered from js, and basically strips off all the external reference to the compartment, and with that kind of forcing a GC.
Flags: needinfo?(gkrizsanits)
Comment 11•11 years ago
|
||
gabor, what guarantees sop is non-null? The crash seems to be null+offset, not any random stuff.
Could we add a null check?
Flags: needinfo?(gkrizsanits)
Comment 12•11 years ago
|
||
Based on Bug 990729 (which landed for FF33), billm might know some of this code too.
Comment 13•11 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #11)
> gabor, what guarantees sop is non-null? The crash seems to be null+offset,
> not any random stuff.
> Could we add a null check?
Random in a sense that no one should ever set that private to null. We set it to SandboxPrivate when the sandbox is created and then the finalize call deleting it but does not reset the private slot back to null... So even if the finalize would be called twice it would not be null (which is arguably bad behavior but that's another story, finalize should never be called twice anyway)
So it can be null only if someone calls JS_SetPrivate on the object without knowing what it actually nulls out (leaving a memory leak behind) or some other random code nulls out this field.
Flags: needinfo?(gkrizsanits)
Comment 14•11 years ago
|
||
While it pains me to ship a top crash, I don't see any activity in this bug that demonstrates that we're going to be able to fix this in the 32 cycle.
Juan/Kairo - This is marked as a top crash for 32. Just how bad does the current data say this crash is? Do we need to continue to pursue a fix for the RC that we're going to build on Monday?
![]() |
||
Comment 15•11 years ago
|
||
Hrm, 70% of the 32.0b7 crashes with this signature are within the first minute of launch, so considered "startup". Also, it's #10 on the topcrash list with 1.26% of crashes with this version right now: https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/32.0b7
I'm not happy with shipping this one.
Flags: needinfo?(kairo)
Comment 16•11 years ago
|
||
Do we have any idea when this started to happen (more often) ?
![]() |
||
Comment 17•11 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #16)
> Do we have any idea when this started to happen (more often) ?
This signature was not in the top 300 crashes in 32.0b4 but is even #4 in 32.0b5 - Sylvestre listed all changesets for that period in http://release.mozilla.org/statistics/32/2014/08/09/fx-32-b4-to-b5.html - bug 1044193 perhaps (just guessing because it has "heap" in the summary)?
Comment 18•11 years ago
|
||
After speaking with Kairo, I'm setting this back to affected as it seems we really do need to pursue a fix, even if we have to take it in RC1.
jst/overholt - Olli has been active here but I don't know if he's the right owner. Can one of you help find a good owner who can make progress on this bug in the limited time that we have left?
Comment 19•11 years ago
|
||
I have been trying to reproduce this problem with the information from one user who says that he/she is able to reproduce the crash while previewing a comment on reddit.com using RES (add-on). No luck yet. I will try to contact the user from the information left in the crash report.
Flags: needinfo?(jbecerra)
Assignee | ||
Comment 20•11 years ago
|
||
I think we should just add a null check here. I looked at the address of SandboxPrivate::mWrapper in my own 64-bit trunk build and it's at 0x20. It's conceivable that there's an extra word on one of the base classes in beta. Also, the code Boris pointed out in comment 3 is the first time we read out SandboxPrivate::mWrapper, so it makes sense that we would crash there if sop is null.
I also looked at xpc::CreateSandboxObject (http://hg.mozilla.org/releases/mozilla-beta/annotate/7f45b3841df0/js/xpconnect/src/Sandbox.cpp#l1049). The sandbox JS object is allocated on line 1084. We set the private on line 1139. In between, there are a couple error paths; if we return on any of those, we'll end up in sandbox_finalize with a null sop.
Why those calls would fail at startup is kind of mysterious. But as long as we have those return paths, we need a null check.
Assignee | ||
Comment 21•11 years ago
|
||
Not sure who should review this with Bobby on vacation. It sounds like we want this soon. Can one of you guys look it over and check that what I said above makes sense?
Attachment #8476912 -
Flags: review?(bzbarsky)
Attachment #8476912 -
Flags: review?(bugs)
Assignee | ||
Comment 22•11 years ago
|
||
Oops, I reversed the branch. Also adding Gabor in case he's around.
Attachment #8476912 -
Attachment is obsolete: true
Attachment #8476912 -
Flags: review?(bzbarsky)
Attachment #8476912 -
Flags: review?(bugs)
Attachment #8476918 -
Flags: review?(gkrizsanits)
Attachment #8476918 -
Flags: review?(bzbarsky)
Attachment #8476918 -
Flags: review?(bugs)
![]() |
||
Comment 23•11 years ago
|
||
Comment on attachment 8476918 [details] [diff] [review]
sandbox-nullcheck v2
I can do it, but s/NS_IF_RELEASE/NS_RELEASE/?
Attachment #8476918 -
Flags: review?(gkrizsanits)
Attachment #8476918 -
Flags: review?(bzbarsky)
Attachment #8476918 -
Flags: review?(bugs)
Attachment #8476918 -
Flags: review+
Assignee | ||
Comment 24•11 years ago
|
||
Comment 25•11 years ago
|
||
Bill's working on this so cleaning up the nobody.
Assignee: nobody → wmccloskey
Flags: needinfo?(overholt)
Flags: needinfo?(jst)
Assignee | ||
Comment 26•11 years ago
|
||
Comment on attachment 8476918 [details] [diff] [review]
sandbox-nullcheck v2
Approval Request Comment
[Feature/regressing bug #]: unknown
[User impact if declined]: crashes
[Describe test coverage new/current, TBPL]: some runs on inbound
[Risks and why]: Very low risk. Just adding a null check. Worst that can happen is leaks.
[String/UUID change made/needed]: None.
Attachment #8476918 -
Flags: approval-mozilla-beta?
Attachment #8476918 -
Flags: approval-mozilla-aurora?
Comment 27•11 years ago
|
||
Comment on attachment 8476918 [details] [diff] [review]
sandbox-nullcheck v2
Approved for aurora and beta with the understanding that this will land after the build with this patch is shown to be green on inbound.
Attachment #8476918 -
Flags: approval-mozilla-beta?
Attachment #8476918 -
Flags: approval-mozilla-beta+
Attachment #8476918 -
Flags: approval-mozilla-aurora?
Attachment #8476918 -
Flags: approval-mozilla-aurora+
Comment 28•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Updated•11 years ago
|
Flags: qe-verify+
![]() |
||
Comment 30•11 years ago
|
||
Judging from data, this seems to be fixed on 32.0b9
Comment 31•11 years ago
|
||
In Socorro there are 0 crashes with [@ JS::Heap<JSObject*>::set(JSObject*)] signature in the last 2 weeks for 34.0a2: https://crash-stats.mozilla.com/query/?product=Firefox&version=Firefox%3A34.0a2&range_value=2&range_unit=weeks&date=09%2F11%2F2014+08%3A00%3A00&query_search=signature&query_type=contains&query=JS%3A%3AHeap%3CJSObject*%3E%3A%3Aset%28JSObject*%29&reason=&release_channels=&build_id=&process_type=any&hang_type=any
Although, for 33.0b, 17 crashes are present on Windows:
- Firefox 33 beta 2: 4 crashes
- Firefox 33 beta 1: 13 crashes
Reports: https://crash-stats.mozilla.com/report/list?signature=JS%3A%3AHeap%3CJSObject%2A%3E%3A%3Aset%28JSObject%2A%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&version=Firefox%3A33.0b&hang_type=any&date=2014-09-11+08%3A00%3A00&range_value=2#tab-reports
Bill, could you please give me your opinion regarding the crashes on 33 beta 1 and 2?
Flags: needinfo?(wmccloskey)
Assignee | ||
Comment 32•11 years ago
|
||
This is a very different crash than the ones the patch here was designed to address. If the volume justifies it, I suggest we file a new bug. Otherwise, let's just ignore it.
Flags: needinfo?(wmccloskey)
Comment 33•11 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #32)
> This is a very different crash than the ones the patch here was designed to
> address. If the volume justifies it, I suggest we file a new bug. Otherwise,
> let's just ignore it.
Thanks for your input!
The volume is still low: https://crash-stats.mozilla.com/report/list?signature=JS%3A%3AHeap%3CJSObject%2A%3E%3A%3Aset%28JSObject%2A%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&version=Firefox%3A33.0b&hang_type=any&date=2014-09-16+10%3A00%3A00&range_value=2#tab-reports
I will keep an eye on Socorro for a few days and I will file a bug if needed.
Status: RESOLVED → VERIFIED
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•