Closed Bug 834678 Opened 11 years ago Closed 11 years ago

crash in js::SPSInstrumentation

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla29
Tracking Status
firefox27 --- unaffected
firefox28 + verified
firefox29 + verified
firefox-esr24 --- unaffected
b2g18 --- unaffected
b2g-v1.1hd --- unaffected
b2g-v1.2 --- unaffected
b2g-v1.3 --- fixed
b2g-v1.3T --- fixed
b2g-v1.4 --- fixed

People

(Reporter: ted, Assigned: djvj)

References

()

Details

(Keywords: crash, sec-moderate)

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is 
report bp-cd6cb111-da18-48f7-b276-278cb2130125 .
============================================================= 

I crashed while running the demo in the URL field. I can't reproduce it now, but it looks like it's the profiler crashing inside some IonMonkey callback.

All I remember doing was loading the demo page and clicking "Start".
The stack trace is:
Frame 	Module 	Signature 	Source
0 	mozjs.dll 	js::SPSInstrumentation<js::ion::MacroAssembler,js::ion::Register>::leave 	js/src/vm/SPSProfiler.h:420
1 	mozjs.dll 	js::ion::IonInstrumentation::leave 	js/src/ion/IonInstrumentation.h:34
2 	mozjs.dll 	js::ion::CodeGenerator::visitFunctionBoundary 	js/src/ion/CodeGenerator.cpp:4941
3 	mozjs.dll 	js::ion::LFunctionBoundary::accept 	js/src/ion/LIR-Common.h:3478
4 	mozjs.dll 	js::ion::CodeGenerator::generateBody 	js/src/ion/CodeGenerator.cpp:1711
5 	mozjs.dll 	js::ion::CodeGenerator::generate 	js/src/ion/CodeGenerator.cpp:3556
6 	mozjs.dll 	js::ion::CompileBackEnd 	js/src/ion/Ion.cpp:1018
7 	mozjs.dll 	js::WorkerThread::threadLoop 	js/src/jsworkers.cpp:321
8 	mozjs.dll 	js::WorkerThread::ThreadMain 	js/src/jsworkers.cpp:289
9 	nspr4.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:395
10 	nspr4.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:90
11 	msvcr100.dll 	_callthreadstartex 	f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c:314
12 	msvcr100.dll 	_threadstartex 	f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c:292
13 	kernel32.dll 	BaseThreadInitThunk 	
14 	ntdll.dll 	__RtlUserThreadStart 	
15 	ntdll.dll 	_RtlUserThreadStart
Assignee: nobody → general
Component: Gecko Profiler → JavaScript Engine
OS: Windows NT → Windows 7
I can reliably reproduce this on a local Linux x86-64 debug (non-optimized) build by loading the following URL:
https://groups.google.com/forum/#!topic/comp.lang.c++.moderated/OQcEmJVMV4E

Apparently I filed bug 917401 about crashing on that same URL a while ago. I have the profiler extension installed so profiling is happening (clearly).

My Linux stack:
#0  0x00007ffff26a5f62 in js::SPSInstrumentation<js::jit::MacroAssembler, js::jit::Register>::leave (this=0x7fffb93bfa50, pc=0x7fffbb00e9d1 "Q\006", masm=..., 
    scratch=...) at /build/mozilla-central/js/src/vm/SPSProfiler.h:356
#1  0x00007ffff268578f in js::jit::IonInstrumentation::leave (this=
    0x7fffb93bfa50, masm=..., reg=...)
    at /build/mozilla-central/js/src/jit/IonInstrumentation.h:33
#2  0x00007ffff2686bcd in js::jit::MacroAssembler::leaveSPSFrame (this=
    0x7fffb93bf050)
    at /build/mozilla-central/js/src/jit/IonMacroAssembler.h:937
#3  0x00007ffff26bb31b in js::jit::MacroAssembler::callWithABI<void*> (this=
    0x7fffb93bf050, fun=@0x7ffffffebd48: 0x7ffff26a20a2, result=
    js::jit::MacroAssemblerX64::GENERAL)
    at /build/mozilla-central/js/src/jit/IonMacroAssembler.h:880
#4  0x00007ffff26ef34a in js::jit::CodeGenerator::emitOOLTestObject (this=
    0x7fffb93bf000, objreg=..., ifEmulatesUndefined=0x7fffbbd6fd28, 
    ifDoesntEmulateUndefined=0x7fffbbd6fda8, scratch=...)
    at /build/mozilla-central/js/src/jit/CodeGenerator.cpp:388
#5  0x00007ffff279f871 in js::jit::OutOfLineTestObject::accept (this=
    0x7fffb92693f8, codegen=0x7fffb93bf000)
    at /build/mozilla-central/js/src/jit/CodeGenerator.cpp:426
#6  0x00007ffff27edd02 in js::jit::OutOfLineCodeBase<js::jit::CodeGenerator>::generate (this=0x7fffb92693f8, codegen=0x7fffb93bf000)
    at /build/mozilla-central/js/src/jit/shared/CodeGenerator-shared.h:493
#7  0x00007ffff287ccdd in js::jit::CodeGeneratorShared::generateOutOfLineCode (
    this=0x7fffb93bf000)
    at /build/mozilla-central/js/src/jit/shared/CodeGenerator-shared.cpp:99
#8  0x00007ffff2880bca in js::jit::CodeGeneratorX86Shared::generateOutOfLineCode (this=0x7fffb93bf000)
    at /build/mozilla-central/js/src/jit/shared/CodeGenerator-x86-shared.cpp:319
#9  0x00007ffff27040a5 in js::jit::CodeGenerator::generate (this=
    0x7fffb93bf000) at /build/mozilla-central/js/src/jit/CodeGenerator.cpp:5806
#10 0x00007ffff2713086 in js::jit::GenerateCode (mir=0x7fffb9b17248, lir=
    0x7fffbbd6f5c8, maybeMasm=0x0)
    at /build/mozilla-central/js/src/jit/Ion.cpp:1475
#11 0x00007ffff2713105 in js::jit::CompileBackEnd (mir=0x7fffb9b17248, 
    maybeMasm=0x0) at /build/mozilla-central/js/src/jit/Ion.cpp:1494
#12 0x00007ffff2713d46 in js::jit::IonCompile (cx=0x7fffbd05aa80, script=
    0x7fffbabc3a00, baselineFrame=0x7ffffffec348, osrPc=0x0, constructing=
    false, executionMode=js::SequentialExecution)
    at /build/mozilla-central/js/src/jit/Ion.cpp:1710
#13 0x00007ffff27145d7 in js::jit::Compile (cx=0x7fffbd05aa80, script=
    0x7fffbabc3a00, osrFrame=0x7ffffffec348, osrPc=0x0, constructing=false, 
    executionMode=js::SequentialExecution)
    at /build/mozilla-central/js/src/jit/Ion.cpp:1879
#14 0x00007ffff27150b5 in js::jit::CompileFunctionForBaseline (cx=
    0x7fffbd05aa80, script=0x7fffbabc3a00, frame=0x7ffffffec348, 
    isConstructing=false) at /build/mozilla-central/js/src/jit/Ion.cpp:2050
#15 0x00007ffff26352bc in js::jit::EnsureCanEnterIon (cx=0x7fffbd05aa80, stub=
    0x7fffb8e95f20, frame=0x7ffffffec348, script=0x7fffbabc3a00, pc=
    0x7fffbb00e9a9 "T", jitcodePtr=0x7ffffffec270)
    at /build/mozilla-central/js/src/jit/BaselineIC.cpp:766
#16 0x00007ffff263599a in js::jit::DoUseCountFallback (cx=0x7fffbd05aa80, stub=
    0x7fffb8e95f20, frame=0x7ffffffec348, infoPtr=0x7ffffffec2f8)
    at /build/mozilla-central/js/src/jit/BaselineIC.cpp:931
#17 0x00007fffe534c8e5 in ?? ()
#18 0x00007ffffffec290 in ?? ()
#19 0x00007ffffffec2c0 in ?? ()
#20 0x00007ffff574f2c0 in RunOnceScriptPrologueInfo ()
   from /build/debug-mozilla-central/dist/bin/libxul.so

I get 7 more frames of what I assume are JIT code and then the stack ends.
The same profile + URL doesn't seem to crash my nightly build FWIW.
djvj, can someone look into this issue?
Could this be similar to that issue you fixed earlier this week?
Flags: needinfo?(kvijayan)
My local build is from 2581b84e0ca, which looks like it's from Monday.
The first stack trace reflects what I fixed earlier this week.

The second stack trace seems unrelated, and a different issue.  SPS::Leave on a emitOOLTestObject is unlikely to be happening around function boundaries where that issue might show up.  Also, the "pc" at the bottom of the stack, according to the traceback, points to JSOP_POP (ord('Q') == 81), followed by a JSOP_GOTO.  Seems like right before a loopedge to me.  Not sure why we'd be emitting instrumentation for that...

I'd investigate the second stack trace as a different issue.
Flags: needinfo?(kvijayan)
Assignee: general → kvijayan
Tried to reproduce this with a local build (linux x86-64, debug, no-optimize), but no luck.

I tried a more recent revision as well as the one you specified (2581b84e0ca).  Is there anything special about the profile I could replicate?

Also, do you know if this is an ASSERT hitting or a random crash?
Flags: needinfo?(ted)
I believe you have to have the profiler extension installed and running to reproduce the crash. There's an assertion being hit, I missed that in all the output:
Assertion failure: frame->script->code <= pc && pc < frame->script->code + frame->script->length, at /build/mozilla-central/js/src/vm/SPSProfiler.h:357

I can try rebuilding at a newer revision to see if that still reproduces.
Flags: needinfo?(ted)
Keywords: sec-moderate
Thanks Ted!  Got it reproducing.  The issue is an OutOfLineCode block (OutOfLineTestObject) with a pc that doesn't match the script.  Not sure how it's happening as the OOL objects get their scripts and PCs assigned from the LBlock where they are added.  This suggests that the LBlock's script and pc are messed up somehow.

To be on the safe side, requesting sec-moderate on the bug.
Group: core-security
I think I found the problem.

In the handling for visitTest, if the MTest takes an MCompare as an operand, we emit a fused LInstruction for both of the instructions (e.g. LTestOAndBranch or LEmulatesUndefinedAndBranch).  This LInstruction is emitted with the MCompare as the pairing instruction.

However, the block containing the MCompare doesn't have to be the same as the block containing the MTest.  It's in fact not infrequent for this to be the case.

This leads us to emitting a LEmulatesUndefinedAndBranch on an LBlock which is paired with a MCompare that is contained in an MBasicBlock which doesn't correspond to the LBlock.

Tentative solution: pair the branching instructions with the MTest instead, and have the codegen for the MInstruction use getOperand(0) to obtain the MCompare.  Will write this up tomorrow.
Attached patch bug-834678.patchSplinter Review
Tentative patch.  Builds.  Haven't been able to test it yet as I'm upgrading my Linux dev box.
Attached patch bug-834678.patchSplinter Review
Didn't manage to do this on Friday since I spent most of the day upgrading my linux box across 3 major releases of Ubuntu.

Patch fixes bug, passes on jit-tests.  Running on try right now.  Problem and solution is described in comment 11.

To help eagerly catch these sorts of issues, I added a helpful assert to CodeGeneratorShared::addOutOfLineCode that ensures that the OOL code's PC is contained within the OOL code's JSScript.  This assert triggered in a couple of other OOL code sites, namely lowering of MCheckOverRecursed and MInterruptCheck.  The LInstructions for those were being added without associating them with their corresponding MInstruction, so the lastPC_ wasn't getting set properly before the addOutOfLineCode call.  These cases have been fixed to accord with the newly added ASSERT.
Attachment #8344264 - Flags: review?(jdemooij)
Comment on attachment 8344264 [details] [diff] [review]
bug-834678.patch

Review of attachment 8344264 [details] [diff] [review]:
-----------------------------------------------------------------

Good find!
Attachment #8344264 - Flags: review?(jdemooij) → review+
https://hg.mozilla.org/mozilla-central/rev/39219e33ec40
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Is this trunk only? If not, how far back does it go?
Your comment 7 seems to indicate this was a December regression, Kannan. Is that true?
Flags: needinfo?(kvijayan)
Yes.  The fixed issue originates in december.
Flags: needinfo?(kvijayan)
Before or after things branched on December 9?
Pinging so I can get this settled on whether it was before December 9 branching.
Flags: needinfo?(kvijayan)
Per discussion with Ryan, 28 is affected as the problem landed on 12/04/2013.
Thanks for the re-ping.  Sorry about letting this slip.  Ironically been deep in some profiling work recently.  Is the question settled now?
Presumably this wants an uplift nom at this point :)
Comment on attachment 8344264 [details] [diff] [review]
bug-834678.patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
Not reducible to a single regressing bug - a combination of factors.
 
User impact if declined: 
Unexpected crashes when profiling turned on.

Testing completed (on m-c, etc.): 
Green on m-c for more than a month.

Risk to taking this patch (and alternatives if risky): 
Low.

String or IDL/UUID changes made by this patch:
N/A
Attachment #8344264 - Flags: approval-mozilla-aurora?
Flags: needinfo?(kvijayan)
Comment on attachment 8344264 [details] [diff] [review]
bug-834678.patch

Surely you meant beta ;)
Attachment #8344264 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
Comment on attachment 8344264 [details] [diff] [review]
bug-834678.patch

I asked the same question. You are faster than I. ;)
Attachment #8344264 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I misread the rapid release schedule table :(  Thanks for fixing!
Ioana, can you please make sure this gets verified in the next Beta?
Keywords: verifyme
I tried to reproduce this issue for verification purposes on Firefox 28.0a1 (12/04), then tested the same on Firefox 28.0b7 (02/27). I need a confirmation that what I encountered is actually this bug before moving on to Firefox 29.

I tried out all the links and details specified in the above comments, but I only got results with (re)loading the link in comment to with the profiler started. In this case, I get the following in the console, then Firefox closes (no crash reporter or anything else, the process just closes):
Assertion failure: containsPC(pc), at /builds/slave/m-cen-l64-st-an-d-000000000000/build/js/src/jsscript.h:742

Program ./firefox (pid = 3329) received signal 11.
Stack:
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02a53629]
UNKNOWN [/lib/x86_64-linux-gnu/libpthread.so.0 +0x0000fcb0]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02588ddc]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x0266086b]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02af2e15]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02af2aef]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02aba30b]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02ae81b2]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02ae785a]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02af0710]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02af8292]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02ad47ca]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x025ef3c5]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x025f0dc4]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x025f19dc]
UNKNOWN [/home/ioanabudnar/Desktop/firefox/libxul.so +0x02a7c3e5]
UNKNOWN 0x7f3f1241032d
Sleeping for 300 seconds.
Type 'gdb ./firefox 3329' to attach your debugger to this thread.
Done sleeping...

This works fine on beta (no assertion, no other issues).
Flags: needinfo?(kvijayan)
Yeah, that looks like the right symptoms for the bug.  It is a bug in the profiling code, so it will only reproduce with the profile turned on, and it's an issue with incorrect bytecode PCs being used with the profiler, so the containsPC assertion is also on target.
Flags: needinfo?(kvijayan)
I also verified this is fixed on Mozilla/5.0 (X11; Linux x86_64; rv:29.0) Gecko/20100101 Firefox/29.0.
Status: RESOLVED → VERIFIED
Keywords: verifyme
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: