Closed Bug 837370 Opened 13 years ago Closed 7 years ago

crash in TypeConstraintFreeze::newType @ js::types::TypeCompartment::addPendingRecompile

Categories

(Core :: JavaScript Engine, defect)

21 Branch
defect
Not set
critical

Tracking

()

RESOLVED WONTFIX
mozilla22
Tracking Status
firefox20 --- unaffected
firefox21 + fixed
firefox22 --- unaffected
firefox23 --- unaffected

People

(Reporter: scoobidiver, Unassigned)

Details

(Keywords: crash, regression, steps-wanted, Whiteboard: [native-crash][leave open])

Crash Data

Attachments

(3 files)

It started spiking in 21.0a1/20130201. The regression range for the spike is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=20bbf73921f4&tochange=50cf5bbcb180 Signature js::types::TypeCompartment::addPendingRecompile(JSContext*, js::types::RecompileInfo const&) More Reports Search UUID bb1d4014-843d-4995-9662-2a06b2130202 Date Processed 2013-02-02 07:51:24 Uptime 16673 Last Crash 6.3 weeks before submission Install Age 16.3 hours since version was first installed. Install Time 2013-02-01 15:34:29 Product Firefox Version 21.0a1 Build ID 20130201030927 Release Channel nightly OS Windows NT OS Version 6.1.7601 Service Pack 1 Build Architecture x86 Build Architecture Info GenuineIntel family 6 model 42 stepping 7 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0x4 App Notes AdapterVendorID: 0x1002, AdapterDeviceID: 0x6760, AdapterSubsysID: 3673103c, AdapterDriverVersion: 8.882.2.0 Has dual GPUs. GPU #2: AdapterVendorID2: 0x8086, AdapterDeviceID2: 0x0116, AdapterSubsysID2: 3673103c, AdapterDriverVersion2: 8.882.2.0D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ Processor Notes sp-processor09.phx1.mozilla.com_15950:2008 EMCheckCompatibility True Adapter Vendor ID 0x1002 Adapter Device ID 0x6760 Total Virtual Memory 4294836224 Available Virtual Memory 3489337344 System Memory Use Percentage 35 Available Page File 10577408000 Available Physical Memory 4386545664 Frame Module Signature Source 0 mozjs.dll js::types::TypeCompartment::addPendingRecompile js/src/jsinfer.cpp:2794 1 mozjs.dll TypeConstraintFreeze::newType js/src/jsinfer.cpp:1677 2 mozjs.dll js::types::TypeCompartment::resolvePending js/src/jsinferinlines.h:1150 3 mozjs.dll js::types::TypeSet::addType js/src/jsinferinlines.h:1469 4 mozjs.dll js::types::TypeCompartment::markSetsUnknown js/src/jsinfer.cpp:2953 5 mozjs.dll js::SetClassAndProto js/src/jsobj.cpp:2878 6 mozjs.dll JSObject::ReserveForTradeGuts js/src/jsobj.cpp:1787 7 mozjs.dll JSObject::swap js/src/jsobj.cpp:2019 8 mozjs.dll JS_TransplantObject js/src/jsapi.cpp:1602 9 xul.dll xpc::TransplantObject js/xpconnect/wrappers/WrapperFactory.cpp:639 10 xul.dll mozilla::dom::ReparentWrapper dom/bindings/BindingUtils.cpp:1417 11 xul.dll nsNodeUtils::CloneAndAdopt content/base/src/nsNodeUtils.cpp:532 12 xul.dll nsNodeUtils::Adopt content/base/src/nsNodeUtils.h:185 13 xul.dll nsIDocument::AdoptNode content/base/src/nsDocument.cpp:6368 14 xul.dll nsDocument::AdoptNode content/base/src/nsDocument.cpp:6250 15 xul.dll AdoptNodeIntoOwnerDoc content/base/src/nsINode.cpp:1278 16 xul.dll nsINode::ReplaceOrInsertBefore content/base/src/nsINode.cpp:1847 17 xul.dll mozilla::dom::NodeBinding::insertBefore obj-firefox/dom/bindings/NodeBinding.cpp:524 18 xul.dll mozilla::dom::NodeBinding::genericMethod obj-firefox/dom/bindings/NodeBinding.cpp:1424 19 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:390 20 mozjs.dll js::Invoke js/src/jsinterp.cpp:437 21 mozjs.dll js::CrossCompartmentWrapper::call js/src/jswrapper.cpp:631 22 mozjs.dll proxy_Call js/src/jsproxy.cpp:3015 23 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:383 24 mozjs.dll js::Interpret js/src/jsinterp.cpp:2376 25 mozjs.dll js::RunScript js/src/jsinterp.cpp:339 26 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:404 27 mozjs.dll js::Invoke js/src/jsinterp.h:131 28 mozjs.dll js_fun_apply js/src/jsfun.cpp:965 29 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:390 30 mozjs.dll js::Interpret js/src/jsinterp.cpp:2376 31 mozjs.dll js::RunScript js/src/jsinterp.cpp:339 32 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:404 33 mozjs.dll js::Invoke js/src/jsinterp.h:131 34 mozjs.dll mozjs.dll@0xb70d0 More reports at: https://crash-stats.mozilla.com/report/list?signature=js%3A%3Atypes%3A%3ATypeCompartment%3A%3AaddPendingRecompile%28JSContext*%2C+js%3A%3Atypes%3A%3ARecompileInfo+const%26%29
It's #5 top crasher in today's build.
Keywords: topcrash
Tracking since it's a top crasher and assigning to David to go through the regression range to try and find a culprit for this sudden explosion in crash volume.
Assignee: general → dvander
Nothing in the regression window looks suspect, but it's hard to say since this is fairly core code. Soccorro unfortunately isn't great for answering questions like, "what addresses does this crash at the most", but eyeballing the reports available it seems like the common crash is at 0x4, so a NULL deref in RecompileInfo::compileOuput(). Nicolas, do you know what that means or how the compile output information could be NULL?
Flags: needinfo?(nicolas.b.pierron)
Just by looking at the code, it seems that cx->compartment->types.constrainedOutputs is NULL, and when we use the operator [] on it we access the pointer to the beginning which is at te beginning of the Vector (behind the vtable). The only case where this might happen is in the sweep case where we reset constrainedOutputs. Maybe the validity is not yet right as we might be waiting for a parallel compilation. This is still some blind-guess of what is going on.
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #4) > Just by looking at the code, it seems that > cx->compartment->types.constrainedOutputs is NULL, and when we use the > operator [] on it we access the pointer to the beginning which is at te > beginning of the Vector (behind the vtable). In other words. When we call the operator[] on cx->compartment->types.constrainedOutputs, we have an offset of 4/8 on x86/x64, because it reads the pointer which is stored just after the TempAllocPolicy class (not the vtable) from which it inherits. Currently, just by looking at offsets of the inlined functions, this is the only location which match the displacement. So, I am quite confident about the symptoms and we should look on how the list of constraint output can be reset without getting rid of all compiled code. PS: That's awful how sleeping can disable you ability to read your own writing …
It seems to be related to Google+ or Gmail according to comments.
OS: Windows 7 → All
Whiteboard: [native-crash]
This is now #2 on Aurora 21.
After talking with Brian and Nicolas, we think this may be caused by a cross-compartment mismatch. I don't see anything particularly incriminating in the regression range. Bill, do you? We're going to try sanity checking against NULL and see what happens.
https://hg.mozilla.org/try/rev/236dd1d066bd This patch does some NULL checking and cross-compartment release-asserts. If it looks okay on try we should throw it at nightly for a day or two to see what happens
Status: NEW → ASSIGNED
Bug 836312 is in the range. We could try backing it out to see if there's any effect. Backing it out shouldn't have much effect on performance. However, there's nothing about that patch that seems likely to trigger this problem.
Crash Signature: [@ js::types::TypeCompartment::addPendingRecompile(JSContext*, js::types::RecompileInfo const&)] → [@ js::types::TypeCompartment::addPendingRecompile(JSContext*, js::types::RecompileInfo const&)] [@ js::types::CompilerOutput::isValid() ]
I've been trying, unsuccessfully, to reproduce this crash in Firefox Aurora 21.0a2 on Windows 7. I've been using GMail and Google+ as the comments indicate but no crashes experienced in the last couple of hours. Are they any more specific leads I can follow?
(In reply to Bill McCloskey (:billm) from comment #11) > Bug 836312 is in the range. We could try backing it out to see if there's > any effect. Backing it out shouldn't have much effect on performance. > > However, there's nothing about that patch that seems likely to trigger this > problem. Considering this is a top-crash & the spike coincides with this landing I think we should go ahead & back out Bug 836312 on Aurora(only) and continue our investigation on central based on :dvander's approach in comment# 10 to see if either helps . We should be able to easily identify if the backout really helped based on crash-stats.Bill can you or :jonco please help with the backout here?
(In reply to bhavana bajaj [:bajaj] from comment #13) > Considering this is a top-crash & the spike coincides with this landing I > think we should go ahead & back out Bug 836312 on Aurora(only) and continue > our investigation on central based on :dvander's approach in comment# 10 to > see if either helps . > > We should be able to easily identify if the backout really helped based on > crash-stats.Bill can you or :jonco please help with the backout here? Anything QA can do to assist in this effort?
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #14) > (In reply to bhavana bajaj [:bajaj] from comment #13) > > Considering this is a top-crash & the spike coincides with this landing I > > think we should go ahead & back out Bug 836312 on Aurora(only) and continue > > our investigation on central based on :dvander's approach in comment# 10 to > > see if either helps . > > > > We should be able to easily identify if the backout really helped based on > > crash-stats.Bill can you or :jonco please help with the backout here? > > Anything QA can do to assist in this effort?
Keywords: qawanted
(In reply to Bill McCloskey (:billm) from comment #15) > Sorry, I forgot about this. Backed out on Aurora: There are still crashes in 21.0a2/20130308.
True, it's still around in current Aurora builds, but crash frequency seems to have decreased.
Attached patch diagnostic patchSplinter Review
Brian, this is the diagnostic patch we talked about. It tries to stop the NULL derefs and also tell us whether cross-compartment violations are happening.
Attachment #723533 - Flags: review?(bhackett1024)
Comment on attachment 723533 [details] [diff] [review] diagnostic patch Review of attachment 723533 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jsinfer.cpp @@ +2312,5 @@ > + if (!co) { > + if (script->compartment() != cx->compartment) > + MOZ_CRASH(); > + return; > + } Unfortunately I think there are a bunch more places where cross compartment pointers could cause TI data to be malformed. To be comprehensive these checks should be at API boundaries.
Attachment #723533 - Flags: review?(bhackett1024) → review+
There's a second improvement between 21.0a2/20130314 and 20130316. The working range is: http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=a64ef5e8030c&tochange=8f067a1714bf
(In reply to Scoobidiver from comment #23) > There's a second improvement between 21.0a2/20130314 and 20130316. It's a red herring because Aurora L10N builds are no longer updated (bug 851720).
(In reply to Brian Hackett (:bhackett) from comment #20) > Unfortunately I think there are a bunch more places where cross compartment > pointers could cause TI data to be malformed. To be comprehensive these > checks should be at API boundaries. Yeah, understandable. I was told we can't do that because it crashes too much in practice, or there was some perf cost or something. https://hg.mozilla.org/integration/mozilla-inbound/rev/f2706052a3bf
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
I think that patch is only diagnostic, so this bug should be kept open, right?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [native-crash] → [native-crash][leave open]
dvander, do the diagnostics help here?
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #28) > dvander, do the diagnostics help here? My guess would be that this helps identifying the kind of issues that we have, just by looking at the graph that is reported here: https://crash-stats.mozilla.com/report/list?signature=js%3A%3Atypes%3A%3ATypeCompartment%3A%3AaddPendingRecompile%28JSContext*%2C+js%3A%3Atypes%3A%3ARecompileInfo+const%26%29
Adding needinfo on :dvander to help with comment# 28/29.
Flags: needinfo?(dvander)
My crash-stats fu isn't good enough to see whether the diagnostics helped. Am I supposed to be able to see my MOZ_CRASH's somewhere? But these crashes have completely gone away on nightly, so the non-diagnostic part of the patch did something. I'll tear out the diagnostics.
Flags: needinfo?(dvander)
I'm confused, it doesn't look like they've gone away on nightly. The link in comment 29 still shows crashes such as https://crash-stats.mozilla.com/report/index/dda76994-a9cc-44ed-abfd-4f2df2130402 But in this case at least the crash is right at the beginning of the function, at http://hg.mozilla.org/releases/mozilla-aurora/annotate/28ff0e110387/js/src/jsinfer.cpp#l2800 so either `info` or `cx` is null (debugging the minidump directly would show you in the inlined function what was actually being dereferenced).
When I look at the top crashers for 22, addPendingRecompile is at 0.06%
(In reply to David Anderson [:dvander] from comment #33) > When I look at the top crashers for 22, addPendingRecompile is at 0.06% For both crash signatures, crashes stopped in 22.0a1/20130320. The working range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e23e43a2c14e&tochange=8156df33b757 It seems either the diagnostic patch has fixed these crashes or another patch in the working range did it.
How can mere diagnostics fix the crash? Could the signature have shifted, i.e. are we crashing from a different signature instead (bsmedberg said that shouldn't be the case)? Or is the diagnostics stuff changing timing and influencing what's going on?
The patch was a combination of papering over the crash, as well as some diagnostics to try and determine why the crash was happening. Apparently, the diagnostics were inconclusive. I'll make a follow-up patch to remove them shortly.
Ah, makes sense. The papering-over part should get uplifted to 21 as well in that case.
(In reply to David Anderson [:dvander] from comment #36) > The patch was a combination of papering over the crash Can you write it without the diagnostic part and uplift it to Beta where it's currently #4 top browser crasher?
Adding needsinfo on :dvander to help with comment #37,38. :dvander, we have already gone to build with 21.0b2 targeting release today. 21.0b3 will be going to build Tuesday, Apr 16 early morning PT. Can you please help get a patch ready asap so that it can get into our next beta ?
Flags: needinfo?(dvander)
Attachment #736018 - Flags: review?(bhackett1024)
Flags: needinfo?(dvander)
Attachment #736018 - Flags: review?(bhackett1024) → review+
Attachment #736023 - Flags: review?(bhackett1024) → review+
:dvander, can you please request the beta patch for uplift and land by Monday EOD/Tue Morning before we go-to-build with our Fx21 beta 4(targeted Tuesday Morning)?Thanks !
Setting need-info for comment #42
Flags: needinfo?(dvander)
Comment on attachment 736023 [details] [diff] [review] patch for beta [Approval Request Comment] Bug caused by (feature/regressing bug #): unknown User impact if declined: specific top crash signature Testing completed (on m-c, etc.): yes Risk to taking this patch (and alternatives if risky): none String or IDL/UUID changes made by this patch:
Attachment #736023 - Flags: approval-mozilla-beta?
Flags: needinfo?(dvander)
Comment on attachment 736023 [details] [diff] [review] patch for beta low risk patch, lnown to resolve a top-crasher by nightly testing. Approving uplift on beta. Please make sure to land by Tuesday(tomorrow) Morning to get this into our next beta going to build tomorrow.Thanks!
Attachment #736023 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Do you plan to land the patch to remove the diagnostic part in Aurora and the trunk?
Flags: needinfo?(dvander)
Keywords: topcrash
Flags: needinfo?(dvander)
Crash Signature: [@ js::types::TypeCompartment::addPendingRecompile(JSContext*, js::types::RecompileInfo const&)] [@ js::types::CompilerOutput::isValid() ] → [@ js::types::TypeCompartment::addPendingRecompile(JSContext*, js::types::RecompileInfo const&)] [@ js::types::CompilerOutput::isValid() ] [@ js::types::TypeCompartment::addPendingRecompile] [@ js::types::CompilerOutput::isValid ]
Assignee: dvander → nobody
Closing because no crash reported since 12 weeks.
Status: REOPENED → RESOLVED
Closed: 12 years ago7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: