Closed Bug 602333 Opened 14 years ago Closed 14 years ago

Crash in [@ JSC::X86Assembler::setRel32]

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- beta8+

People

(Reporter: cpearce, Assigned: dvander)

References

Details

(4 keywords, Whiteboard: [jmorange][fixed-in-tracemonkey])

Crash Data

Attachments

(5 files, 7 obsolete files)

2.42 KB, patch
Details | Diff | Splinter Review
57.71 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
38.44 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
7.72 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
1.02 KB, patch
dmandelin
: review+
Details | Diff | Splinter Review
Recently we've seen a number of crashes in the media mochitests in content/media/test which have been in JSC::X86Assembler::setRel32.

Examples:

Bug 597050, crash in content/media/test/test_new_audio.html
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/09/18 14:15:30
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284844530.1284846996.15464.gz#err5

ASSERTION FAILED: offset == static_cast<int32_t>(offset)
(/builds/slave/mozilla-central-linux64-debug/build/js/src/assembler/assembler/X86Assembler.h:2251 static void JSC::X86Assembler::setRel32(void*, void*))


Thread 0 (crashed)
 0  libxul.so!JSC::X86Assembler::setRel32 [X86Assembler.h:6066d938a0dd : 2251 + 0x30]
    rbx = 0x00000028   r12 = 0x0438adb0   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de416f8   rsp = 0xdc7a1e30   rbp = 0xdc7a1e50
 1  libxul.so!JSC::X86Assembler::linkJump [X86Assembler.h:6066d938a0dd : 2116 + 0x15]
    rbx = 0x00000028   r12 = 0x0438adb0   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de613f9   rsp = 0xdc7a1e60   rbp = 0xdc7a1e80
 2  libxul.so!JSC::AbstractMacroAssembler<JSC::X86Assembler>::linkJump [AbstractMacroAssembler.h:6066d938a0dd : 518 + 0x17]
    rbx = 0x00000028   r12 = 0x0438adb0   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de61426   rsp = 0xdc7a1e90   rbp = 0xdc7a1eb0
 3  libxul.so!JSC::LinkBuffer::link [LinkBuffer.h:6066d938a0dd : 103 + 0x17]
    rbx = 0x00000028   r12 = 0x0438adb0   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de61453   rsp = 0xdc7a1ec0   rbp = 0xdc7a1ee0
 4  libxul.so!GetPropCompiler::generateStub [PolyIC.cpp:6066d938a0dd : 1169 + 0x20]
    rbx = 0x00000028   r12 = 0x0438adb0   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de6a365   rsp = 0xdc7a1ef0   rbp = 0xdc7a25f0
 5  libxul.so!GetPropCompiler::update [PolyIC.cpp:6066d938a0dd : 1248 + 0x10]
    rbx = 0x031ff508   r12 = 0xf0132620   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de6c55d   rsp = 0xdc7a2600   rbp = 0xdc7a2650
 6  libxul.so!js::mjit::ic::CallProp [PolyIC.cpp:6066d938a0dd : 2306 + 0xb]
    rbx = 0x031ff508   r12 = 0xf0132620   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1de65689   rsp = 0xdc7a2660   rbp = 0xdc7a27c0
 7  0x7f4b1f7e8722
    rbx = 0x031ff508   r12 = 0xf0132620   r13 = 0x00000000   r14 = 0xffffffff
    r15 = 0xf0119b60   rip = 0x1f7e8723   rsp = 0xdc7a27d0   rbp = 0xdc7a2850

Examples from bug 599647:
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1286389121.1286391362.22197.gz
Rev3 Fedora 12x64 tracemonkey debug test mochitests-1/5 on 2010/10/06 11:18:41

http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1285874389.1285876673.27153.gz
Rev3 Fedora 12x64 tracemonkey debug test mochitests-1/5 on 2010/09/30 12:19:49

I've only seen this on x64 Linux so far.
A relative jump from the inline buffer low in the virtual address space to a PIC high in the address space is not possible on x64. However, we don't currently detect this occurrence. This is the most likely cause.
Hardware: x86 → x86_64
Severity: normal → critical
Keywords: crash
Summary: Crash in JSC::X86Assembler::setRel32 → Crash in [@ JSC::X86Assembler::setRel32]
(Weird, why doesn't it print 64-bit values for those registers?)

Would be nice to figure out how to stress test and reproduce this.
(In reply to comment #2)
> Would be nice to figure out how to stress test and reproduce this.

Here's how I do it.

If you apply this patch, you'll change most of the media mochitests so that they individually loop forever. If you then run the media mochitest which this crash shows up on (content/media/test/test_new_audio.html), the test will loop until you see the crash. You'll probably want to run the test concurrently in multiple tabs, and you'll want to change the mochitest --timeout command line option, else it will be killed after 3 minutes or so. The command line I use for mochitest is this:

(from $objdir/_tests/testing/)
python mochitest/runtests.py --appname=$objdir/dist/bin/firefox --test-path=$srcdir/content/media/test/test_new_audio.html --timeout=100000 --autorun --console-level=DEBUG --log-file=$srcdir/mochitest.log --file-level=INFO
(In reply to comment #3)
> Created attachment 481398 [details] [diff] [review]
> Patch to stress test media mochitest
> 
> (In reply to comment #2)
> > Would be nice to figure out how to stress test and reproduce this.
> 
> Here's how I do it.

In fact I just updated my tree, and I can now reproduce the crash this way. I opened test_new_audio in about 25 tabs, and I hit this crash in under 2 minutes.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1286421348.1286423056.11488.gz#err1
Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5 on 2010/10/06 20:15:48
s: talos-r3-fed64-048
Bug 599645 is tracking this assertion-failure during test_media_selection.html (and now test_seek_out_of_range.html, when I noted the log from this bug's comment 5 there just now).

It looks like this is all the same underlying bug -- I'm just mentioning it here since that one has a number of logs already posted.
O frabjous day!

For you, at least, if not for me.  My month-plus neglected Firefox tab set reliably triggers this with the mozilla-central 4788083ce564 revision (not *just* that revision, just being precise), and I have it in a debugger.

#5  0x00007f3c17360937 in JSC::X86Assembler::setRel32 (from=0x7f3b9744a475, to=0x7f3c187f90af)
    at /home/jwalden/moz/2/js/src/assembler/assembler/X86Assembler.h:2253
2250	    static void setRel32(void* from, void* to)
2251	    {
2252	        intptr_t offset = reinterpret_cast<intptr_t>(to) - reinterpret_cast<intptr_t>(from);
2253	        ASSERT(offset == static_cast<int32_t>(offset));
2254	
2255	        js::JaegerSpew(js::JSpew_Insns,
2256	                       ISPFX "##setRel32 ((from=%p)) ((to=%p))\n", from, to);
2257	        setInt32(from, offset);

(gdb) p/x offset
$3 = 0x813aec3a
(gdb) bt
#4  <signal handler called>
#5  0x00007f3c17360937 in JSC::X86Assembler::setRel32 (from=0x7f3b9744a475, to=0x7f3c187f90af)
    at /home/jwalden/moz/2/js/src/assembler/assembler/X86Assembler.h:2253
#6  0x00007f3c17388b4b in JSC::X86Assembler::linkJump (code=0x7f3b9744a468, from=..., to=0x7f3c187f90af)
    at /home/jwalden/moz/2/js/src/assembler/assembler/X86Assembler.h:2117
#7  0x00007f3c17388c33 in JSC::AbstractMacroAssembler<JSC::X86Assembler>::linkJump (code=0x7f3b9744a468, jump=..., target=...)
    at /home/jwalden/moz/2/js/src/assembler/assembler/AbstractMacroAssembler.h:523
#8  0x00007f3c17388b7e in JSC::LinkBuffer::link (this=0x7fffa422a1a0, jump=..., label=...)
    at /home/jwalden/moz/2/js/src/assembler/assembler/LinkBuffer.h:113
#9  0x00007f3c17396715 in GetPropCompiler::generateStub (this=0x7fffa422a3e0, holder=0x7f3c04b69e38, shape=0x7f3c022b8660)
    at /home/jwalden/moz/2/js/src/methodjit/PolyIC.cpp:1120
#10 0x00007f3c17396c5e in GetPropCompiler::update (this=0x7fffa422a3e0) at /home/jwalden/moz/2/js/src/methodjit/PolyIC.cpp:1201
#11 0x00007f3c17390d0e in js::mjit::ic::CallProp (f=..., pic=0x7f3bfb635a40) at /home/jwalden/moz/2/js/src/methodjit/PolyIC.cpp:2263

The only bad news is that tab set is a (pretty-printed) 4.1MB sessionstore.js file, and it's fragile in the face of efforts to reduce it.  But reduction is definitely possible if truly necessary: when I first started reducing (before pulling out a debugger) the file was 6.9MB.
There's a testcase in bug 606288 which reproduces this crash reliably.
jst hit this on linux x64 today. Sean, potentially easy fix, can we add our own branchWithPatch, that either (a) clobbers one of the unused registers, or (b) loads the offset out of a memory table at the bottom of the script?
David says we want to fix this for Firefox 4. Marking a blocker.
blocking2.0: --- → final+
(In reply to comment #10)
> jst hit this on linux x64 today. Sean, potentially easy fix, can we add our own
> branchWithPatch, that either (a) clobbers one of the unused registers, or (b)
> loads the offset out of a memory table at the bottom of the script?

All of our code uses near, relative jumps -- there is no such thing as a 64-bit relative jump, so we would have to extend the Assembler to handle near, absolute indirect jumps (clobbering %r11, the scratchRegister), then always patch that value.

But doing absolute jumps in this manner has a significant performance penalty. To mitigate the damage, we may want different 'flavors' of branchWithPatch code, which know whether the jump is local to the buffer or possibly external.
Attached patch diagnostic patch (obsolete) — Splinter Review
Yesterday Brendan managed to get a crash in gdb, where an IC jump was completely bogus. It was low enough to make this bug a possible culprit, though it's much less likely to happen on OS X than it is on Linux. Another possibility for that crash is that a GC failed to unpatch an IC, but freed the code pointer.

This patch will crash at 0xC0DE when we're about to patch an IC jump with an invalid offset.
Attachment #489627 - Flags: review?(sstangl)
Brendan, this is the candidate for your crash.
Keywords: dogfood
Attachment #489627 - Flags: review?(sstangl) → review+
Taking. We should fix this sooner rather than later, whether or not it's happening on Mac in the wild.
Assignee: general → dvander
Status: NEW → ASSIGNED
blocking2.0: final+ → beta8+
Verified by one of Brendan's crashes that this is happening on OS X. Yuck. Patch in a day or two, the plan of attack is to extend code buffers with padding at the end, enough padding to add an absolute jump for every patchable jump. Then we'll add FarJump/CodeLocationFarJump structures to the assembler which will describe how to patch both locations.

This way, most of the time when address spaces are in range, there will be no performance hit. Once we start overflowing addresses, we'll start using indirect jumps.
Attached patch WIP v1 (obsolete) — Splinter Review
Sean, does this seem okay to you? There's a new assembler function that lets you wrap jumps as extended branches. Right before assembling, the buffer is extended with absolute jumps.
Attachment #489627 - Attachment is obsolete: true
Attachment #490793 - Flags: feedback?(sstangl)
Whiteboard: [orange][jmorange]
Attached patch part 1: framework for the fix (obsolete) — Splinter Review
A few CallIC spots that need FarJumps are included in this patch. To get the other ICs working needs a lot of refactoring, so that will be a separate patch.
Attachment #490793 - Attachment is obsolete: true
Attachment #490926 - Flags: review?(sstangl)
Attachment #490793 - Flags: feedback?(sstangl)
Attachment #490926 - Flags: review?(dmandelin)
Question: What is the "correction" value for?
The far jump generates code like:

1.   jne  _end_of_buffer

_end_of_buffer:
2.   mov rax, <absolute address>
3.   jmp rax

The DataLabelPtr is for patching the absolute address, the "correction" (poorly named, sorry) scoots back X bytes to get the start address of the thunk, to patch #1 -> #2.
Comment on attachment 490926 [details] [diff] [review]
part 1: framework for the fix

Can we rename farJumpCorrectionDistance to FarJump_StartOffsetFromTargetData or something like that? And add a comment to the value (if comment is good, name can be shorter)? Otherwise looks good.

And can you file a followup on fixing the OOM-handling for vectors in BaseAssembler?
Attachment #490926 - Flags: review?(dmandelin) → review+
Comment on attachment 490926 [details] [diff] [review]
part 1: framework for the fix

Are we sure that FarJumpCorrectionDistance is actually a constant? Can we assert that it is?

> Allocation mem = systemAlloc(size_t(4294967291));

This should be a #define, since it occurs multiple times and demands context to be understood. This value is 0xFFFFFFFB, which seems arbitrary to me. Naming the value would clarify it.

Looks good.
Attachment #490926 - Flags: review?(sstangl) → review+
You surely meant const size_t, right?  ;-)  More debuggable and all that.
The oversize allocation is just a hack so the problem is really easy to testcase and debug. It'll get removed before checkin so x64 users don't allocate 81234834GB of RAM with the method jit.
This patch is based off the refactoring that Chris and Jacob did for ARM. The goal is to begin moving all of the gross #ifdefs out of the IC code and compiler, and put it behind an interface.

PICLabels now exist for all platforms, with a distinction: if labels are not needed, a single version with hardcoded offsets is returned instead. When trying to mutate this version, it just asserts that the incoming value matches the hardcoded constant.
Attachment #491052 - Flags: review?(cdleary)
Attached patch part 3: fix getprop type guards (obsolete) — Splinter Review
Fixes inline type guards on GETPROP
Attachment #491067 - Flags: review?(sstangl)
Comment on attachment 491052 [details] [diff] [review]
part 2: refactor PICLabels to work on all platforms

I'm pretty sure that constant prop won't happen on x86 because of the dynamic-static initialization (though it is a neat trick!) -- usually static constructors run before |_main|. Can we use macro expansion for the private members of the *Labels structs, so that they form bitfield members |#if defined JS_HAS_IC_LABELS| but |static const int|s otherwise (which would require static initializers in the cpp file)? That way, we'd definitely get constant prop and the getters could still access members by the same name. This was the plan I had in mind for our MQ, but didn't try to work through it.

- Looks like some FarJump updates ended up in this patch -- maybe you want these to be part of the former patch on commit?
- Might want to get rid of the offset-getting helpers in the compilers now that it's a no-ifdef-required interface.

-    PICLabels labels;
+#if !defined JS_HAS_IC_LABELS
+    static GetPropLabels getPropLabels_;

Nit: positive reasoning on the "then" branch is easier to read when there's an "else", so I would flip-flop these blocks.
Attachment #491052 - Flags: review?(cdleary) → review+
Attached patch WIP (obsolete) — Splinter Review
I'm ditching the FarJump approach. It solved the problem nicely - no perf cost, and the type system prevented invalid links at compile-time. But it was a ton of code motion, probably too risky for b8. It also bloated the IC sizes, and masm.far() made the code look really gross.

Instead, let's just disable ICs that can't be patched. It seems really rare, and if it starts impacting performance, we can either gradually re-introduce the original patch, or work toward something fancier, like relocating code or recompilation. I'll file a follow-up bug.

This first patch reintroduces some of the safer refactorings, and makes it so that whenever we want to patch an IC stub, its size is retained in addition to its address.
Attachment #490926 - Attachment is obsolete: true
Attachment #491052 - Attachment is obsolete: true
Attachment #491067 - Attachment is obsolete: true
Attachment #491071 - Attachment is obsolete: true
Attachment #491386 - Attachment is obsolete: true
Attachment #491460 - Flags: review?(sstangl)
Attachment #491052 - Flags: review?(sstangl)
Attachment #491067 - Flags: review?(sstangl)
Attachment #491071 - Flags: review?(sstangl)
For the record, the FarJump/NearJump distinction could be useful on ARM. We don't need it for the next release but it'd be useful to consider in the future.

ARM has relative branches with a range of ±32MB. (That's 24 significant bits in the 'B' instruction, with an alignment restriction of 4 bytes.) Currently, we emit all branches using "LDR pc, =<address>", which loads the address from memory directly into the PC. This will be slower than 'B', and takes twice as much space for each branch. (With PICs, we could have quite a few branches.)

Bug 586297 aims to re-enable 'B' branches by allowing the back-end to handle the patching, but if you're planning to introduct FarJump/NearJump distinctions then ARM could probably just use that mechanism instead.
Comment on attachment 491460 [details] [diff] [review]
part 1: correctly bound code blocks

> namespace js { namespace mjit { struct JITScript; } }

Should this only be defined if JS_METHODJIT?

Everything else looks good.
Attachment #491460 - Flags: review?(sstangl) → review+
Comment on attachment 491461 [details] [diff] [review]
part 2: the fix. disable unpatchable ICs

canRelinkJump() is defined in X86Assembler.h, but it can be specialized for x86 in the same way that it is specialized for ARM, saving some calculations. Specialization between the Intel architectures usually involves #ifdef magic, or defining functions in the MacroAssemblerX86*.h files.

> //#define STRESS_JSC_ALLOCATOR

Can we prefix it with 'DEBUG_' and add a comment?

> if (hasLastStringStub && !buffer.verifyRange(lastStringStub))
>     return disable(cx, "code memory is out of range");
> if ((shouldPatchInlineTypeGuard() || shouldPatchUnconditionalClaspGuard()) &&
>     !buffer.verifyRange(cx->fp()->jit())) {
>          return disable(cx, "code memory is out of range");
> }

No less clear with one if().

Looks good. Excited to end these crashes.
Attachment #491461 - Flags: review?(sstangl) → review+
http://hg.mozilla.org/tracemonkey/rev/54fb9d61096a
http://hg.mozilla.org/tracemonkey/rev/bbc8aeb55bae

I kept the two ifs separate, it's not a very complicated expression but it's verbose, and the formatting gets weird.
Whiteboard: [orange][jmorange] → [orange][jmorange][fixed-in-tracemonkey]
This crash seems all but gone. There's a few stragglers though.

It looks like the range checks aren't complete, it's necessary to always test the range to the script in addition to the last generated stub.
Attachment #492369 - Flags: review?(dmandelin)
Attachment #492369 - Flags: review?(dmandelin) → review+
http://hg.mozilla.org/mozilla-central/rev/bbc8aeb55bae
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Brendan got another crash here, the caller was GetElementIC::attachGetProp. It looks like I missed this check when widening the ranges in part 3. Follow-up fix shortly.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #492869 - Flags: review?(dmandelin) → review+
http://hg.mozilla.org/tracemonkey/rev/e5c6b0151c7b
http://hg.mozilla.org/mozilla-central/rev/160e6bf9752e
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
(In reply to comment #2)
> (Weird, why doesn't it print 64-bit values for those registers?)

That's a bug in minidump_stackwalk that I apparently introduced and totally forgot to fix, sorry.
Blocks: 639967
Crash Signature: [@ JSC::X86Assembler::setRel32]
Whiteboard: [orange][jmorange][fixed-in-tracemonkey] → [jmorange][fixed-in-tracemonkey]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: