Closed Bug 853154 Opened 11 years ago Closed 11 years ago

IonMonkey: Crash [@ AtomizeAndCopyChars] or [@ js::Atomize]

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox21 - fixed
firefox22 - fixed
firefox23 --- verified
firefox-esr17 --- unaffected

People

(Reporter: gkw, Assigned: sstangl)

References

Details

(Keywords: crash, regression, testcase)

Crash Data

Attachments

(2 files, 1 obsolete file)

Attached file stack
x = this
__defineGetter__("eval", Array.reduce)
try {
    (function() {
        x.eval
    })()
} catch (e) {}

asserts js threadsafe opt shell on mozilla-central changeset e23e43a2c14e with --ion-parallel-compile=on --thread-count=33 --no-jm --ion-eager at AtomizeAndCopyChars.

Thanks to Nicolas for helping to diagnose this issue - it is likely to be just a null crash. He mentions that we should not use the cx in the codegen, only in IonBuilder.

This testcase crashes only once every 30-odd tries on a particular Linux configuration - due to its instability, not marking for autoBisect nor JSBugMon. I couldn't get this to crash in gdb, but I could get Linux to generate a coredump then have gdb inspect the coredump to get the backtrace.
To fix this:

1. All script targets (in this case, the target of an LCallKnown) must be generated and added as CompilerRoots during initial MIR construction

2. We should have a mechanism that guarantees that the backend (i.e., CompileBackEnd in the current codebase, and OptimizeMIR(), GenerateLIR(), and GenerateCode() after Bug 850070 lands) cannot call functions that allocate from the heap.
(After f2f discussion) This likely goes all the way back to bug 774253, when off thread compilation with Ion was added in IonMonkey, which means it should only affect IonMonkey.

This may also be the potential source of some intermittent oranges.

Since this is likely to only be a null crash, it might not be needed to be rushed into 20, but would be good to backport to Aurora 21.
> This may also be the potential source of some intermittent oranges.

Bug 842974 shows an intermittent orange with js::Atomize on the stack, but I'm not sure if it's related.
Crash Signature: [@ AtomizeAndCopyChars] [@ js::Atomize]
Summary: IonMonkey: Crash [@ AtomizeAndCopyChars] → IonMonkey: Crash [@ AtomizeAndCopyChars] or [@ js::Atomize]
Based on description this looks like a long standing issue(since ionmonkey landed) and very specific to a particular linux configuration where it is hard to repro consistently.Hence not tracking for release but will consider a low risk uplift based on risk.
Sean, based on your comments in comment 1, what would be a good next step?
Flags: needinfo?(sstangl)
Reproduced again on changeset 55f9e3e3dae7 (tested on 32-bit opt threadsafe deterministic shell on a 32-core computer, also crashes once every 30-50ish runs):

Program terminated with signal 11, Segmentation fault.
#0  AtomizeAndCopyChars<(js::AllowGC)1> (ib=js::DoNotInternAtom, length=17, tbchars=0xf58edf98, cx=0x0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsatom.cpp:293
293     /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsatom.cpp: No such file or directory.
(gdb) bt
#0  AtomizeAndCopyChars<(js::AllowGC)1> (ib=js::DoNotInternAtom, length=17, tbchars=0xf58edf98, cx=0x0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsatom.cpp:293
#1  js::Atomize (cx=0x0, bytes=0x851655b "ArrayStaticReduce", length=17, ib=js::DoNotInternAtom) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsatom.cpp:389
#2  0x080c10b9 in JSFunction::initializeLazyScript (this=0xf59304c0, cx=0x0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsfun.cpp:1108
#3  0x084dfb26 in getOrCreateScript (cx=0x0, this=<optimized out>) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsfun.h:218
#4  js::ion::CodeGenerator::visitCallKnown (this=0xf47006a0, call=0x98d5fc0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/ion/CodeGenerator.cpp:1398
#5  0x0838ea28 in js::ion::LCallKnown::accept (this=0x98d5fc0, visitor=0xf47006a0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/ion/LIR-Common.h:821
#6  0x084d8f07 in js::ion::CodeGenerator::generateBody (this=0xf47006a0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/ion/CodeGenerator.cpp:2232
#7  0x084db185 in js::ion::CodeGenerator::generate (this=0xf47006a0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/ion/CodeGenerator.cpp:4511
#8  0x0832db9d in js::ion::GenerateCode (mir=0x98d2a60, lir=0x98d5bf8, maybeMasm=0x0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/ion/Ion.cpp:1179
#9  0x0832e00c in js::ion::CompileBackEnd (mir=0x98d2a60, maybeMasm=0x0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/ion/Ion.cpp:1198
#10 0x081b2d8d in js::WorkerThread::handleIonWorkload (this=0x98dd9f0, state=...) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsworkers.cpp:391
#11 0x081b2f07 in js::WorkerThread::threadLoop (this=0x98dd9f0) at /tmp/abtmp-55f9e3e3dae7-bpHeHb/compilePath/js/src/jsworkers.cpp:431
#12 0xf774f5b3 in ?? () from /usr/lib/i386-linux-gnu/libnspr4.so
#13 0xf776ed4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#14 0xf752ed3e in clone () from /lib/i386-linux-gnu/libc.so.6
(gdb) x/i $pc
=> 0x8086864 <js::Atomize(JSContext*, char const*, size_t, js::InternBehavior)+788>:    mov    (%edx),%esi
(gdb) x/b $edx
0x0:    Cannot access memory at address 0x0
(gdb) x/b $esi
0xf58edf98:     Cannot access memory at address 0xf58edf98
(gdb)
Attached patch Add Script as CompilerRoot (obsolete) — Splinter Review
Gary, could you try applying this patch and seeing whether the issue is still reproducible?
Attachment #734794 - Flags: feedback?(gary)
Flags: needinfo?(sstangl)
Comment on attachment 734794 [details] [diff] [review]
Add Script as CompilerRoot

This does fix the crash - I verified this on the specific machine with and without the patch.
Attachment #734794 - Flags: feedback?(gary) → feedback+
This bug is basically a duplicate of Bug 819797, but the solution to that bug was wrong and caused this new error.
Blocks: 819797
This patch generates and roots any target script during the IonBuilder phase. The problem with the current version is that when parallel compilation is enabled, the JSContext is NULL after MIR is constructed, so we can't generate a script during visitCallKnown().

The annoying duplication is because the entirety of makeCallHelper() was duplicated into MCallOptimize.cpp for parallel array -- we can take the hit for now and try to merge the paths later.
Attachment #734794 - Attachment is obsolete: true
Attachment #734929 - Flags: review?(hv1989)
Comment on attachment 734929 [details] [diff] [review]
Add Script as CompilerRoot [v2]

Review of attachment 734929 [details] [diff] [review]:
-----------------------------------------------------------------

This looks correct.
- Can you add an assertion that checks if cx is non-NULL in getOrCreateScript? (Will make the test deterministic)?
- Can you add testcase?
- Can you add follow-up bug, since we have other places where we use GetIonContext()->cx? Not sure if they are definitely wrong, but we should atleast look into them and try to remove them out of Codegen.
Attachment #734929 - Flags: review?(hv1989) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c7ab35260ef

Added an assertion but not a testcase (we didn't have one that's reasonable). The assertion should sufficiently prevent this exact bug in the future.

Disregard the horrible tabbing in jsfun.h -- apparently we're missing vim directives in a number of files. I'll add them now and clean it up.
Assignee: general → sstangl
Status: NEW → ASSIGNED
No longer blocks: 774253
Blocks: 860077
https://hg.mozilla.org/mozilla-central/rev/3c7ab35260ef
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Comment on attachment 734929 [details] [diff] [review]
Add Script as CompilerRoot [v2]

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 819797
User impact if declined: (Rare) NULL dereferences if a GC occurs during compilation.
Testing completed (on m-c, etc.): fuzz, m-c.
Risk to taking this patch (and alternatives if risky): None
String or IDL/UUID changes made by this patch: None
Attachment #734929 - Flags: approval-mozilla-beta?
Attachment #734929 - Flags: approval-mozilla-aurora?
Comment on attachment 734929 [details] [diff] [review]
Add Script as CompilerRoot [v2]

Since the fix is low risk and well understood, I think we can take on all branches.
Attachment #734929 - Flags: approval-mozilla-beta?
Attachment #734929 - Flags: approval-mozilla-beta+
Attachment #734929 - Flags: approval-mozilla-aurora?
Attachment #734929 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/b136a9dbc9fe

This doesn't apply cleanly enough to beta for me to comfortably uplift.
> This doesn't apply cleanly enough to beta for me to comfortably uplift.

Setting needinfo.
Flags: needinfo?(sstangl)
Could not use --ion-parallel-compile=on --thread-count=33 options, I get "Error: Invalid long option: --ion-paralel-compile=on". I built IonMonkey from js/src form the cangeset mention in bug description using the --enable-debug --disable-optimize configure options.
Do I need to use other options when building?
Flags: needinfo?(gary)
yes, you need to make a thread safe build before the --ion-parallel-compile will be available.

https://developer.mozilla.org/en-US/docs/SpiderMonkey/Build_Documentation
Flags: needinfo?(gary)
Flags: in-testsuite?
Ubuntu 12.10 x64
I used thread safe options to build js (from source 4be469df6d23) and run the test from the bug description with the mentioned options and it doesn't assert.
Marking evrified for firefox 23
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: