Closed Bug 1135707 Opened 9 years ago Closed 9 years ago

Crash [@ js::jit::Instruction::skipPool] with unaligned write

Categories

(Core :: JavaScript Engine, defect)

ARM
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox38 --- affected
firefox39 --- affected
firefox40 --- affected
firefox41 --- fixed
firefox-esr38 --- affected
b2g-master --- affected

People

(Reporter: decoder, Assigned: jonco)

References

Details

(Keywords: crash, regression, testcase, Whiteboard: [jsbugmon:update] requires --arm-asm-nop-fill jsshell option, not in Firefox)

Crash Data

Attachments

(1 file)

The following testcase crashes on mozilla-central revision 09f4968d5f42 (build with --enable-optimize --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --target=i686-pc-linux-gnu --enable-arm-simulator --disable-debug, run with --arm-asm-nop-fill=1 --ion-eager):

gczeal(14);
var s = '';
var it = Proxy.create({
    get: function (receiver, name) {
            s += '?';
            return function () 'it';
    }
});
var iterator_fn = Proxy.createFunction({}, function () {
    return it;
});
var obj = Proxy.create({
    get: function (receiver, name) {
        return iterator_fn;
    }
});
for (var v of obj)
    s += v;



Backtrace:

Program received signal SIGSEGV, Segmentation fault.
js::jit::Instruction::skipPool (this=0xd83dbeec) at js/src/jit/arm/Assembler-arm.cpp:2848
#0  js::jit::Instruction::skipPool (this=0xd83dbeec) at js/src/jit/arm/Assembler-arm.cpp:2848
#1  0x08433cb3 in InstructionIterator (i_=0xd83dbeec, this=0xffffc260) at js/src/jit/arm/Assembler-arm.cpp:3027
#2  js::jit::Assembler::GetPointer (instPtr=0xd83dbeec <error: Cannot access memory at address 0xd83dbeec>) at js/src/jit/arm/Assembler-arm.cpp:719
#3  0x083497fb in js::jit::JitFrameIterator::checkInvalidation (this=this@entry=0xffffc4c0, ionScriptOut=ionScriptOut@entry=0xffffc2b0) at js/src/jit/JitFrames.cpp:156
#4  0x0834bbc6 in js::jit::JitFrameIterator::ionScript (this=0xffffc4c0) at js/src/jit/JitFrames.cpp:2234
#5  0x0834cab6 in js::jit::SnapshotIterator::SnapshotIterator (this=0xffffc310, iter=...) at js/src/jit/JitFrames.cpp:1703
#6  0x0834db15 in js::jit::InlineFrameIterator::resetOn (this=0xffffc4d8, iter=0xffffc4c0) at js/src/jit/JitFrames.cpp:2315
#7  0x0834de02 in js::jit::GetPcScript (cx=cx@entry=0x938de88, scriptRes=scriptRes@entry=0xffffc7e0, pcRes=pcRes@entry=0xffffc850) at js/src/jit/JitFrames.cpp:1570
#8  0x0819ee0a in JSContext::currentScript (this=0x938de88, ppc=0xffffc850, allowCrossCompartment=JSContext::DONT_ALLOW_CROSS_COMPARTMENT) at js/src/jscntxtinlines.h:462
#9  0x0819b45b in GetNonexistentProperty (vp=..., id=..., obj=..., cx=<optimized out>, receiver=...) at js/src/vm/NativeObject.cpp:1775
#10 NativeGetPropertyInline<(js::AllowGC)1> (vp=..., id=..., receiver=..., obj=..., cx=0x938de88) at js/src/vm/NativeObject.cpp:1888
#11 js::NativeGetProperty (cx=0x938de88, obj=..., receiver=..., id=..., vp=...) at js/src/vm/NativeObject.cpp:1906
#12 0x081aef95 in js::GetProperty (cx=cx@entry=0x938de88, obj=obj@entry=..., receiver=receiver@entry=..., id=id@entry=..., vp=vp@entry=...) at js/src/vm/NativeObject.h:1411
#13 0x0819c20e in GetProperty (vp=..., name=<optimized out>, receiver=..., obj=..., cx=0x938de88) at js/src/jsobj.h:846
#14 js::GetProperty (cx=0x938de88, v=..., v@entry=..., name=name@entry=..., vp=vp@entry=...) at js/src/vm/Interpreter.cpp:3570
#15 0x0842f6a9 in js::jit::Simulator::softwareInterrupt (this=0x938d230, instr=0x93eb564) at js/src/jit/arm/Simulator-arm.cpp:2158
#16 0x0842f7ad in js::jit::Simulator::decodeType7 (this=this@entry=0x938d230, instr=instr@entry=0x93eb564) at js/src/jit/arm/Simulator-arm.cpp:3258
#17 0x0842faec in js::jit::Simulator::instructionDecode (this=this@entry=0x938d230, instr=instr@entry=0x93eb564) at js/src/jit/arm/Simulator-arm.cpp:4177
#18 0x084321d4 in execute<false> (this=0x938d230) at js/src/jit/arm/Simulator-arm.cpp:4232
#19 js::jit::Simulator::callInternal (this=this@entry=0x938d230, entry=entry@entry=0xf70aaeb8 "\377\377\377\352\360O-\351\377\377\377\352\004\320M\342\377\377\377\352\020\212-\355\377\377\377\352\r\200\240\341\377\377\377\352h\220\235\345\377\377\377\352\r\260\240\341\377\377\377\352t\240\235\345\377\377\377", <incomplete sequence \352>) at js/src/jit/arm/Simulator-arm.cpp:4320
#20 0x084323e6 in js::jit::Simulator::call (this=<optimized out>, entry=entry@entry=0xf70aaeb8 "\377\377\377\352\360O-\351\377\377\377\352\004\320M\342\377\377\377\352\020\212-\355\377\377\377\352\r\200\240\341\377\377\377\352h\220\235\345\377\377\377\352\r\260\240\341\377\377\377\352t\240\235\345\377\377\377", <incomplete sequence \352>, argument_count=<optimized out>, argument_count@entry=8) at js/src/jit/arm/Simulator-arm.cpp:4403
#21 0x0829085a in EnterBaseline (cx=cx@entry=0x938de88, data=...) at js/src/jit/BaselineJIT.cpp:122
#22 0x082bce29 in js::jit::EnterBaselineMethod (cx=cx@entry=0x938de88, state=...) at js/src/jit/BaselineJIT.cpp:154
#23 0x08196e75 in js::RunScript (cx=cx@entry=0x938de88, state=...) at js/src/vm/Interpreter.cpp:434
#24 0x0819d7e2 in js::ExecuteKernel (cx=cx@entry=0x938de88, script=script@entry=..., scopeChainArg=..., thisv=..., type=type@entry=js::EXECUTE_GLOBAL, evalInFrame=evalInFrame@entry=..., result=result@entry=0x0) at js/src/vm/Interpreter.cpp:650
#25 0x0819dae7 in js::Execute (cx=cx@entry=0x938de88, script=script@entry=..., scopeChainArg=..., rval=rval@entry=0x0) at js/src/vm/Interpreter.cpp:687
#26 0x084d2260 in ExecuteScript (cx=0x938de88, obj=..., scriptArg=..., rval=0x0) at js/src/jsapi.cpp:4007
#27 0x084d22ed in JS_ExecuteScript (cx=<optimized out>, cx@entry=0x938de88, obj=..., obj@entry=..., scriptArg=scriptArg@entry=...) at js/src/jsapi.cpp:4029
#28 0x0804b680 in RunFile (compileOnly=false, file=0x943f3e0, filename=0xffffd489 "min.js", obj=..., cx=0x938de88) at js/src/shell/js.cpp:453
#29 Process (cx=cx@entry=0x938de88, obj_=<optimized out>, filename=0xffffd489 "min.js", forceTTY=forceTTY@entry=false) at js/src/shell/js.cpp:586
#30 0x08059549 in ProcessArgs (op=0xffffd130, obj_=<optimized out>, cx=0x938de88) at js/src/shell/js.cpp:5504
#31 Shell (envp=<optimized out>, op=0xffffd130, cx=0x938de88) at js/src/shell/js.cpp:5744
#32 main (argc=4, argv=0xffffd2c4, envp=0xffffd2d8) at js/src/shell/js.cpp:6083
eax	0xd83dbeec	-667042068
ebx	0x9324398	154289048
ecx	0xf5e480d0	-169574192
edx	0x0	0
esi	0x2	2
edi	0xffffc310	-15600
ebp	0xffffc2f8	4294951672
esp	0xffffc234	4294951476
eip	0x8423639 <js::jit::Instruction::skipPool()+9>
=> 0x8423639 <js::jit::Instruction::skipPool()+9>:	mov    (%eax),%ecx
   0x842363b <js::jit::Instruction::skipPool()+11>:	xor    %edx,%edx


Marked s-s because this spills out an error about an unaligned write and the test also uses gczeal.
Flags: needinfo?(mrosenberg)
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:bisect]
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
Whiteboard: [jsbugmon:bisect] → [jsbugmon:]
Whiteboard: [jsbugmon:] → [jsbugmon:update,bisect]
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:bisect]
JSBugMon: Cannot process bug: Unable to automatically reproduce, please track manually.
Whiteboard: [jsbugmon:bisect] → [jsbugmon:]
Whiteboard: [jsbugmon:] → [jsbugmon:update,bisect]
Whiteboard: [jsbugmon:update,bisect] → [jsbugmon:update]
JSBugMon: Bisection requested, result:
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/bcbc9f1cfb10
user:        Jon Coppeard
date:        Thu Aug 14 11:52:24 2014 +0100
summary:     Bug 650161 - Implement compacting GC for JSObjects r=terrence

This iteration took 227.768 seconds to run.
Terrence, it looks like this is a regression from compacting GC.  Can you look at this or should we wait until Jon gets back?
Flags: needinfo?(terrence)
An intersection of compacting and arm looks right up Jon's alley. Let's just wait until he gets back.
Flags: needinfo?(terrence) → needinfo?(jcoppeard)
This depends on passing the --arm-asm-nop-fill flag.  This is a JIT feature to test constant pools, which are used on ARM.  The comments say:

    // Insert a number of NOP instructions between each requested instruction at
    // all locations at which a pool can potentially spill. This is useful for
    // checking that instruction locations are correctly referenced and/or
    // followed.

I think Marty is the best person to investigate this.
Flags: needinfo?(jcoppeard)
I emailed Marty about this bug, in case he doesn't check his needinfos.
Naveed, is there someone who could look at this?
Flags: needinfo?(marty.rosenberg) → needinfo?(nihsanullah)
Marty is no longer paid staff at Mozilla. Sean can you please take a look?
Assignee: nobody → sstangl
Flags: needinfo?(nihsanullah)
Any updates for this?
Flags: needinfo?(sstangl)
Al: thanks for setting needinfo; setting me as Assignee doesn't issue a notification. I'll take a look.
This seems to be a GC bug. At least, I have looked through the ARM code and haven't seen anything strange happening. The --arm-asm-nop-fill argument triggers GC issues not because of extra branches in the ARM codegen, but because it dramatically increases the size of the generated code, which changes when gczeal(14) decides to collect.

The referenced object to which the crashing Slots store is performed is correctly emitted, and the location of the object is correctly stored in the data relocations buffer. What appears to be happening is:

1) Close to the crash, we call js::Lambda(), which allocates a new object and triggers JS_GC_ZEAL compaction.
2) We overwrite the Slots pointer via overlay->forwardTo(), and it points to the tenured object slots.
3) Immediately after that, GCRuntime::releaseRelocatedArenasWithoutUnlocking() paves over the forwarded Slots pointer with JS_MOVED_TENURED_PATTERN.
4) Nothing attempts to actually update the baked-in nursery pointer in the ARM jitcode.
5) We load 0x49494949 from the slots and crash writing to it.

I'm unfamiliar with GC internals. Jon, could you take a look? I imported a patch in Bug 1157934 that allows for ARM disassembly. If you run with that patch applied, right before the crash you should see lines like:

> 0x557853ac       e3060040       movw r0, #24640
> 0x557853b0       e3450a26       movt r0, #23078

The address of the nursery object that was tenured is then given by ((23078 << 16) | 24640). The numbers obviously change every run, so rr is useful.
Flags: needinfo?(jcoppeard)
(In reply to Sean Stangl [:sstangl] from comment #12)

Thanks for looking into this!

Do you know by what path that pointer ends up in the code?

At the moment we discard all JIT code when we do a compacting GC so I'm not sure why that's not happening here.  Also nursery pointers should get updated automatically for minor GC, so again that should be happening.

The size of the generated code shouldn't affect when we run zeal collections unless it allocates more GC things as it's triggered by the number of allocations.

I will try and spend some time digging into this tomorrow.
Flags: needinfo?(jcoppeard)
I'm a bit out of my depth with this.  

We are triggering a GC with two jit activations on the stack.  We do OSI at the start of GC, which gives spew like:

    [IonInvalidate] Invalidating all frames for GC
    [IonInvalidate] BEGIN invalidating activation
    [IonInvalidate] #1 exit frame @ 0xf69ffd2c
    [IonInvalidate] #2 Optimized JS frame @ 0xf69ffdd0, testcase.js:3 (fun: 0xf6753260, script: 0xf6747280, pc 0xf7c5ab48)
    [IonInvalidate]   ion script 0xf660ebc0, jit code 0xf6755380
    [IonInvalidate]    ! Invalidate ionScript 0xf660ebc0 (inv count 1) -> patching osipoint 0xf7c5a9a0
    [IonInvalidate] END invalidating activation
    [IonInvalidate] Invalidating all frames for GC
    [IonInvalidate] BEGIN invalidating activation
    [IonInvalidate] #1 exit frame @ 0xf69ffe8c
    [IonInvalidate] #2 Optimized JS frame @ 0xf69fff30, testcase.js:1 (fun: (nil), script: 0xf67470d0, pc 0xf7c601e8)
    [IonInvalidate]   ion script 0xf6636000, jit code 0xf6755470
    [IonInvalidate]    ! Invalidate ionScript 0xf6636000 (inv count 1) -> patching osipoint 0xf7c5f658
    [IonInvalidate] END invalidating activation

Looking at the subsequent simulator output, when we return to the first jit code activation we resume at the given PC (0xf7c5ab48) and proceed until we hit the OSI patch point when we jump to the invalidation trampoline, and everything works fine.

However when we return to the second activation, we don't resume at the PC shown in the spew (0xf7c601e8) but somewhere else in the jit code.  We continue running but don't hit the OSI patch point and we crash because we make use of a stale JSObject pointer (the object has been relocated).

Note that we don't bother to update pointers in invalidated jit code because should not be accessed.
Who is the best person to investigate this further?  It has stalled out for a few weeks, and this bug is from back in February so it would be good to get some movement here.
Flags: needinfo?(jcoppeard)
I'll take another look at this tomorrow.  To me it looks like the problem is that we are executing invalidated jitcode, but actually I don't understand enough about how OSI works to say for sure what's supposed to happen here.
Flags: needinfo?(jcoppeard)
I think the problem is that we miscalculate the ion cache rejoin branch when run with the --arm-asm-nop-fill=1 option.  

This option inserts a NOP before every instruction.  The current implementation of IonCache::rejoinLabel() ends up skipping NOP instructions when advancing from the initial jump location (see Instruction::skipPool() and its use of InstIsBNOP()).

This leads to a situation where the rejoin label is one instruction past the OSI patch point label of the associated safepoint.  If we invalidate the jit code and patch this OSI point it has no effect because we never execute the patched instruction.
Here's a patch to remove the calculation of the rejoin label and replace it with use of a Label to work out where the next instruction ends up.

This passes jit-tests locally, I'm still waiting for try results.
Assignee: sstangl → jcoppeard
Flags: needinfo?(sstangl)
Attachment #8606309 - Flags: review?(jdemooij)
Group: javascript-core-security
Comment on attachment 8606309 [details] [diff] [review]
bug1135707-rejoin-label

Review of attachment 8606309 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks! Storing the CodeLocationLabel seems worth it to get rid of the complexity.
Attachment #8606309 - Flags: review?(jdemooij) → review+
I don't think this is security-sensitive as the crash relies on the --arm-asm-nop-fill option which is only present in the shell.  The only way this could happen without this is if we ever deliberately follow a jump to an ion cache with a NOP instruction, and we don't do that.

dveditz, can we unmark this as s-s?
Flags: needinfo?(dveditz)
Blocks: 650161
Group: core-security, javascript-core-security
Flags: needinfo?(dveditz)
Keywords: sec-critical
Whiteboard: [jsbugmon:update] → [jsbugmon:update] requires --arm-asm-nop-fill jsshell option, not in Firefox
https://hg.mozilla.org/mozilla-central/rev/9571f765357d
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: