Closed Bug 893184 Opened 6 years ago Closed 6 years ago

Intermittent jsreftest.html?test=ecma_5/JSON/parse-array-gc.js | application crashed [@ js::gc::Cell::tenuredZone() const] after "Assertion failure: addr % CellSize == 0, at ../../../js/src/gc/Heap.h:1027"

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla26
Tracking Status
firefox24 --- unaffected
firefox25 --- fixed
firefox26 --- fixed

People

(Reporter: RyanVM, Assigned: jonco)

References

Details

(Keywords: assertion, crash, intermittent-failure)

Crash Data

Attachments

(1 file, 2 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=25224823&tree=Mozilla-Inbound

Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test jsreftest on 2013-07-12 09:59:05 PDT for push 1c4f15e06065
slave: talos-mtnlion-r5-033

10:05:28     INFO -  REFTEST TEST-START | file:///builds/slave/talos-slave/test/build/tests/jsreftest/tests/jsreftest.html?test=ecma_5/JSON/parse-array-gc.js
10:05:28     INFO -  REFTEST TEST-LOAD | file:///builds/slave/talos-slave/test/build/tests/jsreftest/tests/jsreftest.html?test=ecma_5/JSON/parse-array-gc.js | 1101 / 6561 (16%)
10:05:28     INFO -  ++DOMWINDOW == 138 (0x14e6e7a78) [serial = 2168] [outer = 0x10f661798]
10:05:28     INFO -  WARNING: No permission to access camera: file ../../../dom/camera/DOMCameraManager.cpp, line 84
10:05:28     INFO -  852563: IdValuePair::value should be initialized to avoid GC sequence-point issues
10:05:28     INFO -  Note: You must run this test under valgrind to be certain it passes
10:05:28     INFO -  Tests complete
10:05:30     INFO -  Assertion failure: addr % CellSize == 0, at ../../../js/src/gc/Heap.h:1027
10:05:41  WARNING -  TEST-UNEXPECTED-FAIL | file:///builds/slave/talos-slave/test/build/tests/jsreftest/tests/jsreftest.html?test=ecma_5/JSON/parse-array-gc.js | Exited with code 1 during test run
10:05:41     INFO -  INFO | automation.py | Application ran for: 0:04:13.114511
10:05:41     INFO -  INFO | zombiecheck | Reading PID log: /var/folders/sd/07fbcfhn15386646cl87f8hh00000w/T/tmpuTLioapidlog
10:05:55  WARNING -  PROCESS-CRASH | file:///builds/slave/talos-slave/test/build/tests/jsreftest/tests/jsreftest.html?test=ecma_5/JSON/parse-array-gc.js | application crashed [@ js::gc::Cell::tenuredZone() const]
10:05:55     INFO -  Crash dump filename: /var/folders/sd/07fbcfhn15386646cl87f8hh00000w/T/tmpu2Odp7/minidumps/C5FCC72A-2EF7-416D-AA6D-A88F7E54F903.dmp
10:05:55     INFO -  Operating system: Mac OS X
10:05:55     INFO -                    10.8.0 12A269
10:05:55     INFO -  CPU: amd64
10:05:55     INFO -       family 6 model 42 stepping 7
10:05:55     INFO -       8 CPUs
10:05:55     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
10:05:55     INFO -  Crash address: 0x0
10:05:55     INFO -  Thread 0 (crashed)
10:05:55     INFO -   0  XUL!js::gc::Cell::tenuredZone() const [Heap.h:1c4f15e06065 : 993 + 0x0]
10:05:55     INFO -      rbx = 0x00007fff7dacac68   r12 = 0x0000000106939e00
10:05:55     INFO -      r13 = 0xcdcdcdcdcdcdcdcd   r14 = 0x000000010693a168
10:05:55     INFO -      r15 = 0x000000014a87eea0   rip = 0x0000000103725e7b
10:05:55     INFO -      rsp = 0x00007fff5fbfa0a0   rbp = 0x00007fff5fbfa0b0
10:05:55     INFO -      Found by: given as instruction pointer in context
10:05:55     INFO -   1  XUL!MarkInternal<JSAtom> [String.h:1c4f15e06065 : 429 + 0x7]
10:05:55     INFO -      rbx = 0x000000000000002b   r12 = 0x0000000106939e00
10:05:55     INFO -      r13 = 0xcdcdcdcdcdcdcdcd   r14 = 0x000000010693a168
10:05:55     INFO -      r15 = 0x000000014a87eea0   rip = 0x00000001035bb82d
10:05:55     INFO -      rsp = 0x00007fff5fbfa0c0   rbp = 0x00007fff5fbfa100
10:05:55     INFO -      Found by: call frame info
10:05:55     INFO -   2  XUL!JSScript::markChildren(JSTracer*) [jsscript.cpp:1c4f15e06065 : 2742 + 0xa]
10:05:55     INFO -      rbx = 0x000000000000002b   r12 = 0x0000000106939e00
10:05:55     INFO -      r13 = 0x0000000120ee5128   r14 = 0x000000010693a168
10:05:55     INFO -      r15 = 0x0000000103c0cd1d   rip = 0x00000001037f00fc
10:05:55     INFO -      rsp = 0x00007fff5fbfa110   rbp = 0x00007fff5fbfa140
10:05:55     INFO -      Found by: call frame info
10:05:55     INFO -   3  XUL!MarkInternal<JSScript> [Marking.cpp:1c4f15e06065 : 190 + 0xa]
10:05:55     INFO -      rbx = 0x000000010693a168   r12 = 0x0000000106939e00
10:05:55     INFO -      r13 = 0x0000000120ee5128   r14 = 0x000000010693a168
10:05:55     INFO -      r15 = 0x00007fff5fbfa1c8   rip = 0x00000001035b49ac
10:05:55     INFO -      rsp = 0x00007fff5fbfa150   rbp = 0x00007fff5fbfa190
10:05:55     INFO -      Found by: call frame info
10:05:55     INFO -   4  XUL!MarkRangeConservatively [RootMarking.cpp:1c4f15e06065 : 232 + 0xe]
10:05:55     INFO -      rbx = 0x000000010693a168   r12 = 0x0000000120ee5128
10:05:55     INFO -      r13 = 0x0000000120ee5000   r14 = 0x0000000000000002
10:05:55     INFO -      r15 = 0x0000000120ee5128   rip = 0x0000000103600bdc
10:05:55     INFO -      rsp = 0x00007fff5fbfa1a0   rbp = 0x00007fff5fbfa230
10:05:55     INFO -      Found by: call frame info
Crash Signature: [@ js::gc::Cell::tenuredZone()]
Jon, this seems to have gone up in frequency recently. Any chance you could take a look?
Flags: needinfo?(jcoppeard)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #15)

Sure, I'll look into it.
Assignee: general → jcoppeard
Status: NEW → ASSIGNED
Flags: needinfo?(jcoppeard)
Blocks: 905216
Attached patch shared-script-atoms (obsolete) — Splinter Review
The problem here seems to be that the shared script data's atoms are not initialized stright away and so can end up being traced by a GC before they are set.

Since we're pretending that the memory is a HeapPtrAtom array, we should really be calling the constructor/destructor of that class on every element.  Actually we could get away without calling the constructor and just zeroing the memory at the moment, but this is saner.

I recorded the number of atoms in the SharedScriptData struct which also means we can implement atoms() without having to pass any parameters.
Attachment #790704 - Flags: review?(till)
Severity: normal → critical
Summary: Intermittent jsreftest.html?test=ecma_5/JSON/parse-array-gc.js | Exited with code 1 during test run | application crashed [@ js::gc::Cell::tenuredZone() const] after "Assertion failure: addr % CellSize == 0, at ../../../js/src/gc/Heap.h:1027" → Intermittent jsreftest.html?test=ecma_5/JSON/parse-array-gc.js | application crashed [@ js::gc::Cell::tenuredZone() const] after "Assertion failure: addr % CellSize == 0, at ../../../js/src/gc/Heap.h:1027"
Comment on attachment 790704 [details] [diff] [review]
shared-script-atoms

Review of attachment 790704 [details] [diff] [review]:
-----------------------------------------------------------------

Urgh, thanks for catching this.

I wonder if we should turn JSScript->natoms and maybe ->length into getters that consult the SharedScriptData. natoms is only used in a few places, with markChildren being the only perf-critical one. Do you think that the added overhead would be an issue? njn would be delighted, I'm sure. :)
Attachment #790704 - Flags: review?(till) → review+
My original patch caused occasional crashes on MacOS 10.6 mochitest, in the SharedScriptData destructor which ends up calling pre-barriers on the contents of the HeapPtrAtoms.  The crash happened when one of the atoms contained a garbage value.

I don't think it's strictly necessary to do this, and we weren't doing it before anyway, so in the interests of fixing this bug I took out that part of the patch and landed it.

I'll raise a further bug to work out what's going on with this and apply the tidyups Till suggested above.

https://hg.mozilla.org/integration/mozilla-inbound/rev/560f18fc3ae3
https://hg.mozilla.org/mozilla-central/rev/560f18fc3ae3
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Please request an Aurora uplift when you feel this has baked sufficiently long :)
(In reply to TinderboxPushlog Robot from comment #61)
> RyanVM
> https://tbpl.mozilla.org/php/getParsedLog.php?id=26785326&tree=Mozilla-
> Inbound
> Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test jsreftest on
> 2013-08-20 14:19:53
> revision: 01d3658a980c
> slave: talos-mtnlion-r5-084
> 
> TEST-UNEXPECTED-FAIL |
> file:///builds/slave/talos-slave/test/build/tests/jsreftest/tests/jsreftest.
> html?test=ecma_5/JSON/parse-array-gc.js | Exited with code 1 during test run
> PROCESS-CRASH |
> file:///builds/slave/talos-slave/test/build/tests/jsreftest/tests/jsreftest.
> html?test=ecma_5/JSON/parse-array-gc.js | application crashed [@
> JSScript::markChildren(JSTracer*)]
> Return code: 256

This is post-comment 55 :(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla26 → ---
So I added this small change post-review because I thought it was innocuous, but in fact it ended up breaking things :(

This is a patch to revert it.

I never managed to get the actual failure to reproduce, but it's pretty clear from the crash that this is the culprit.
Attachment #793492 - Flags: review?(till)
Comment on attachment 793492 [details] [diff] [review]
revert-bogus-script-marking-change

Review of attachment 793492 [details] [diff] [review]:
-----------------------------------------------------------------

While I can't think of a reason for `code` to be NULL while `natoms` is > 0, it most certainly is never right to call MarkScriptData with a NULL pointer as the second argument. :)
Attachment #793492 - Flags: review?(till) → review+
I had to back this out while investigating another failure. Re-landed.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e1b2f6bd92c4
This appears to crash FFOS on Nexus 4 on startup. Backing out allows the N4 to boot.
By "this" do you mean only the first patch that's made m-c (and was known to be buggy - hence the follow-up that's currently on inbound), or do you mean both patches? If the former, can you boot if you apply the patch from comment 63?
The first patch and the first patch plus follow up both crash.
(gdb) bt
#0  new_ (natoms=4, srcnotesLength=<optimized out>, codeLength=61, 
    cx=0xb6a23320) at ../../../gecko/js/src/gc/Barrier.h:132
#1  JSScript::fullyInitFromEmitter (cx=0xb6a23320, script=..., bce=0xbe8e6250)
    at ../../../gecko/js/src/jsscript.cpp:1923
#2  0xb5d7ecca in js::frontend::EmitFunctionScript (cx=0xb6a23320, 
    bce=0xbe8e6250, body=<optimized out>)
    at ../../../gecko/js/src/frontend/BytecodeEmitter.cpp:2583
#3  0xb5d7f0c0 in EmitFunc (cx=0xb6a23320, bce=0xbe8e6598, pn=0xb2ea7880)
    at ../../../gecko/js/src/frontend/BytecodeEmitter.cpp:4555
#4  0xb5d7d8f8 in js::frontend::EmitTree (cx=0xb6a23320, bce=0xbe8e6598, 
    pn=0xb2ea7880) at ../../../gecko/js/src/frontend/BytecodeEmitter.cpp:5632
#5  0xb5d78cf8 in js::frontend::CompileScript (cx=0xb6a23320, 
    alloc=<optimized out>, scopeChain=..., evalCaller=..., options=..., 
    chars=0xb2e6f000, length=112375, source_=0x0, staticLevel=0, 
    extraSct=0xbe8e7408)
    at ../../../gecko/js/src/frontend/BytecodeCompiler.cpp:361
#6  0xb5c1b7e2 in JS::Evaluate (cx=0xb6a23320, obj=..., options=..., 
    chars=0xb2e6f000, length=112375, rval=0xbe8e74f0)
    at ../../../gecko/js/src/jsapi.cpp:5203
#7  0xb5c1ba38 in JS::Evaluate (cx=0xb6a23320, obj=..., options=..., 
    bytes=0xb2e53000 "\n\n\n\n\nvar std_isFinite = isFinite;\nvar std_isNaN = isNaN;\nvar std_Array_indexOf = ArrayIndexOf;\nvar std_Array_join = Array.prototype.join;\nvar std_Array_push = Array.prototype.push;\nvar std_Array_shift"..., 
    length=0, rval=0xbe8e74f0) at ../../../gecko/js/src/jsapi.cpp:5239
#8  0xb5bdfc0a in JSRuntime::initSelfHosting (this=<optimized out>, 
    cx=0xb6a23320) at ../../../gecko/js/src/vm/SelfHosting.cpp:762
#9  0xb5c31e9e in js::NewContext (rt=0xb3004000, 
    stackChunkSize=<optimized out>) at ../../../gecko/js/src/jscntxt.cpp:201
#10 0xb54aece4 in XPCJSContextStack::GetSafeJSContext (this=0xb6a108e0)
    at ../../../../gecko/js/xpconnect/src/XPCJSContextStack.cpp:150
#11 0xb53d4cfe in nsScriptSecurityManager::GetSafeJSContext (
    this=<optimized out>)
    at ../../../gecko/caps/src/nsScriptSecurityManager.cpp:270
#12 0xb53d55ba in nsScriptSecurityManager::Init (this=0xb6add880)
    at ../../../gecko/caps/src/nsScriptSecurityManager.cpp:2332
#13 0xb53d7734 in nsScriptSecurityManager::GetScriptSecurityManager ()
    at ../../../gecko/caps/src/nsScriptSecurityManager.cpp:2413
#14 0xb51bdb3a in nsContentUtils::Init ()
    at ../../../../gecko/content/base/src/nsContentUtils.cpp:372
#15 0xb50846b6 in nsLayoutStatics::Initialize ()
    at ../../../gecko/layout/build/nsLayoutStatics.cpp:163
#16 0xb50841f6 in Initialize ()
    at ../../../gecko/layout/build/nsLayoutModule.cpp:405
#17 0xb58277e4 in nsComponentManagerImpl::KnownModule::Load (this=0xb6a8a180)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:770
#18 0xb5827f88 in nsFactoryEntry::GetFactory (this=0xb6a89320)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1778
#19 0xb5828284 in CreateInstanceByContractID (aResult=0xbe8e76b8, aIID=..., 
    aDelegate=0x0, aContractID=0xb6afe8c0 "@mozilla.org/moz/jsloader;1", 
    this=0xb6a25700)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1093
#20 nsComponentManagerImpl::CreateInstanceByContractID (this=0xb6a25700, 
    aContractID=0xb6afe8c0 "@mozilla.org/moz/jsloader;1", aDelegate=0x0, 
    aIID=..., aResult=0xbe8e76b8)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1048
#21 0xb58291fc in GetServiceByContractID (result=0xbe8e7734, aIID=..., 
    aContractID=0xb6afe8c0 "@mozilla.org/moz/jsloader;1", this=0xb6a25700)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1453
#22 nsComponentManagerImpl::GetServiceByContractID (this=0xb6a25700, 
    aContractID=0xb6afe8c0 "@mozilla.org/moz/jsloader;1", aIID=..., 
    result=0xbe8e7734)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1367
#23 0xb5829440 in nsGetServiceFromCategory::operator() (this=0xbe8e7758, 
    aIID=..., aInstancePtr=0xbe8e7734)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:171
#24 0xb580c258 in nsCOMPtr_base::assign_from_helper (this=0xbe8e7748, 
    helper=<optimized out>, iid=<optimized out>)
    at ../../../gecko/xpcom/glue/nsCOMPtr.cpp:110
#25 0xb58276aa in operator= (rhs=..., this=0xbe8e7748)
    at ../../dist/include/nsCOMPtr.h:727
#26 nsComponentManagerImpl::LoaderForExtension (this=0xb6a25700, aExt=...)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1489
#27 0xb5827782 in nsComponentManagerImpl::KnownModule::EnsureLoader (
    this=0xb6afe680)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:747
#28 0xb58277c0 in nsComponentManagerImpl::KnownModule::Load (this=0xb6afe680)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:758
#29 0xb5827f88 in nsFactoryEntry::GetFactory (this=0xb6aff150)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1778
#30 0xb5828284 in CreateInstanceByContractID (aResult=0xbe8e7810, aIID=..., 
    aDelegate=0x0, 
    aContractID=0xb3003f70 "@mozilla.org/browser/directory-provider;1", 
    this=0xb6a25700)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1093
#31 nsComponentManagerImpl::CreateInstanceByContractID (this=0xb6a25700, 
    aContractID=0xb3003f70 "@mozilla.org/browser/directory-provider;1", 
    aDelegate=0x0, aIID=..., aResult=0xbe8e7810)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1048
#32 0xb58291fc in GetServiceByContractID (result=0xbe8e7864, aIID=..., 
    aContractID=0xb3003f70 "@mozilla.org/browser/directory-provider;1", 
    this=0xb6a25700)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1453
#33 nsComponentManagerImpl::GetServiceByContractID (this=0xb6a25700, 
    aContractID=0xb3003f70 "@mozilla.org/browser/directory-provider;1", 
    aIID=..., result=0xbe8e7864)
    at ../../../gecko/xpcom/components/nsComponentManager.cpp:1367
#34 0xb580cbfa in nsGetServiceByContractID::operator() (this=<optimized out>, 
    aIID=<optimized out>, aInstancePtr=0xbe8e7864)
    at ../../../gecko/xpcom/glue/nsComponentManagerUtils.cpp:246
#35 0xb580c216 in nsCOMPtr_base::assign_from_gs_contractid (this=0xbe8e7888, 
    gs=..., iid=<optimized out>) at ../../../gecko/xpcom/glue/nsCOMPtr.cpp:92
#36 0xb581f9f6 in nsCOMPtr (gs=<optimized out>, this=0xbe8e7888)
    at ../../dist/include/nsCOMPtr.h:620
#37 nsDirectoryService::RegisterCategoryProviders (this=0xb6a1a540)
    at ../../../gecko/xpcom/io/nsDirectoryService.cpp:487
#38 0xb5811490 in NS_InitXPCOM2 (result=0xb6a1b154, 
    binDirectory=<optimized out>, appFileLocationProvider=<optimized out>)
    at ../../../gecko/xpcom/build/nsXPComInit.cpp:554
#39 0xb4f2967a in ScopedXPCOMStartup::Initialize (this=0xb6a1b154)
    at ../../../gecko/toolkit/xre/nsAppRunner.cpp:1191
#40 0xb4f2c966 in XREMain::XRE_main (this=0xbe8e7994, argc=1, 
    argv=<optimized out>, aAppData=<optimized out>)
    at ../../../gecko/toolkit/xre/nsAppRunner.cpp:3927
#41 0xb4f2cac0 in XRE_main (argc=1, argv=0xbe8e9b44, aAppData=0x221f8, 
    aFlags=<optimized out>) at ../../../gecko/toolkit/xre/nsAppRunner.cpp:4133
#42 0x0000b300 in do_main (argv=0xbe8e9b44, argc=1)
    at ../../../gecko/b2g/app/nsBrowserApp.cpp:168
#43 main (argc=<optimized out>, argv=<optimized out>)
    at ../../../gecko/b2g/app/nsBrowserApp.cpp:261
(In reply to Michael Wu [:mwu] from comment #71)

By any chance, do you have the details of the exception that caused that crash?
Fixed pointer alignment issue that caused crashes on ARM and re-landed.

https://hg.mozilla.org/integration/mozilla-inbound/rev/698bf87a8347
Is there a plan to add N4 to tbpl?   This could have been caught before it hit m-c :-/
(In reply to Michael Vines [:m1] [:evilmachines] from comment #77)
> Is there a plan to add N4 to tbpl?   This could have been caught before it
> hit m-c :-/

There's work going on to add the JB emulator to TBPL - bug 885630 . No idea if that would've caught this. Testing on actual devices is somewhat hard so we largely rely on the emulator now.
https://hg.mozilla.org/mozilla-central/rev/698bf87a8347

Round 3! As before, please request Aurora and Beta uplift when you feel this has sufficiently baked on trunk :)
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
(In reply to Michael Wu [:mwu] from comment #78)

This one was an alignment issue.  Would it not be possible to configure one of the existing emulator targets to abort on unaligned access?  I thought most ARM platforms were normally set up this way anyway.
We use qemu, and I'm not sure if that supports exceptions on unaligned access. http://stackoverflow.com/a/4032289 says no, but that's from 2010. I haven't found any better answers on Google.
Final patch, as pushed.
Attachment #790704 - Attachment is obsolete: true
Attachment #793492 - Attachment is obsolete: true
Attachment #796010 - Flags: review+
Comment on attachment 796010 [details] [diff] [review]
shared-scipt-atoms

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 679940 - split JSScript into a per-compartment portion and a shareable runtime-global portion
User impact if declined: Possible crashes
Testing completed (on m-c, etc.): On central since 22nd.
Risk to taking this patch (and alternatives if risky): Low
String or IDL/UUID changes made by this patch: None
Attachment #796010 - Flags: approval-mozilla-aurora?
Attachment #796010 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.