Crashes after turning debugging off because JM releases running code

RESOLVED FIXED

Status

()

defect
--
critical
RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: sfink, Assigned: dvander)

Tracking

({crash})

unspecified
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking2.0 final+)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(3 attachments)

When testing my patch for bug 610793, I was using a modified version of Firebug 1.7 to set a breakpoint within a simple javascript function. Here's my test html file:

<head>
<script>
  var x = 1; var y = 7;



  for (x = 2; x < 100; ++x)
  {

    y = 3;

  }

z = 1/3;
</script>
</head>
<body>
</body>

I set a breakpoint on the 'for' line, and let firebug stop at that line. Then I do ctrl-Q to exit the browser. In my DEBUG build with patched-Firefox, patched-Firebug, I would get an assertion failure when it attempted to emit a warning about a null or undefined value. The assertion said that the pc was out of range:

Assertion failure: script->code <= pc && pc < script->code + script->length, at /home/sfink/src/.TM-3/js/src/jsopcode.cpp:5118

From looking at the ops it was executing, it seemed like it was on the line 'z=1/3', though I'm not sure why that would warn.

I then ran with an unpatched DEBUG build of tracemonkey Firefox, and an unpatched version of Firebug 1.7 (which is still in development; I have svn revision 8390). It gave me a segmentation fault when doing the same thing. Unfortunately, a Jaegermonkey ate my stack, so I couldn't tell what it was doing:

FTS0: ReferenceError: FBTrace is not defined 
FTS0: fbs.breakIntoDebugger called chrome://browser/content/browser.xul-@-48123 returning 1 

Program received signal SIGSEGV, Segmentation fault.
0x00007fffe8401416 in ?? ()
(gdb) up
#1  0x00007fffe6813068 in ?? ()
(gdb) 
#2  0x0000000000000000 in ?? ()

I attempted to reproduce with unpatched tracemonkey Firefox and Firebug 1.6, but that combo is unable to set breakpoints, probably because it is missing the async API for turning the debugger on that allows it to work with JM.

I haven't looked too far into this yet, but it appears as if the debugMode may be getting turned off (not just queued up) with debugged stack frames active, and the code is recompiled without proper stackpatching happening. (Either not being attempted at all, or not working right.)

But that's somewhat speculative at this point.

In my patched source, I was able to trigger it a little more easily by moving the pc range check assertion up into a commonly used stubs:: callout (so I didn't need to rely on a warning to trip over the assertion.)
I have no idea what I'm talking about, do I?

That assertion is talking about the PC in terms of the pointer into the bytecode stream, not JITted instructions. So this has more to do with a JM frame's register store getting the wrong PC in it. I don't understand how the PC syncing works for stub calls, though, but I wanted to suppress any wrong trails I left in that initial comment.
blocking2.0: --- → final+
while it's possible this is a bug in jsd, it's more likely to be a bug in your own code or js.
Assignee: nobody → general
Severity: normal → critical
Component: JavaScript Debugging APIs → JavaScript Engine
Keywords: crash, stackwanted
QA Contact: jsd → general
This bug is present in an unmodified tracemonkey checkout, so it is not a bug in my code. As to whether it is a bug in jsd or js, I agree that it is more likely to be a bug in js. Or if it turns out to be a bug in jsd, then the js side is still missing an assertion.

I've been chasing my tail around coming up with steps to reproduce, until I finally realized that my firebug modifications are irrelevant -- what matters is your profile. The problem seems to only occur when a breakpoint is loaded from <profile>/firebug/breakpoints.json. It must be getting set before debugMode is on or some other preparation has been done.

Exact steps to reproduce:
1. Create a fresh profile (unnecessary, but never mind)
2. Install firebug-1.7 into it:
  mkdir ~/firebug
  cd ~/firebug
  svn checkout http://fbug.googlecode.com/svn/branches/firebug1.7
  mkdir <profile>/extensions
  ln -s $HOME/firebug/branches/firebug1.7 firebug@software.joehewitt.com
  (I'll admit I haven't tested these exact instructions)
3. Fire up the browser
4. Navigate to the attached crash.html page (file:/// URI is fine)
5. Click the bug in the bottom right to enable firebug for the page
6. Reload to see the script panel
7. Set a breakpoint on the 'for' line by clicking to the left of it

If you like, you can now reload again and see it break on that line. It won't matter; you're not going to see the crash on this run.

8. Exit out of the browser. (When the bug is active, this will trigger a crash.)
9. Start the browser again and go back to that page. This time, it will remember your breakpoint settings (which seems to be the trigger of the problem.)
10. When you are sitting on the breakpoint, Ctrl-Q to exit the browser.

You will see the crash. The exact crash seems 100% reproducible from run to run, but perturbing anything will change what the exact crash is. It usually shows up as either an assertion failure (pc is out of script->code..script->code+length, or perhaps script->main..script->main+length, depending on which assertion you hit) or a segmentation fault. There will be JM frames on the stack. Often (always?), the first JM frame's return address will be invalid. The one I looked at closely would point into the middle of an instruction if I decompiled the surrounding region. I've even managed to capture a bunch of a run in gdb's reverse debugging record target, but I don't know enough about how gdb unwinds x86_64 stack frames to figure out whether the return address was patched within the scope of my recorded execution. (I want a memory location that I could set a watch on.)

You can make the crash go away for one run by stopping the browser, deleting <profile>/firebug/*.json, and restarting.

Hopefully, I can pinpoint this now that I've figured out the real STR. (Having its occurrence tied to saved profile settings was sending me down all kinds of wrong ratholes.) I just wanted to record the STR before I continued debugging.
Blocks: 610793
Ugh. This took me way too long. I shouldn't really tackle things this messy yet; I don't have a firm enough grasp of the JS engine.

I take back what I said about it being stable and reproducible. It usually reproduces, but from run to identical run, it sometimes won't, and when it won't, it will disappear for a whole bunch of consecutive runs. Which makes it much more challenging to pin down. (The profile thing is probably mostly irrelevant, other than another source of random perturbation.)

The problem is that JM ends up writing some new machine code over a memory region that is still live on the stack. So when you return up the stack into that JITted code, you land in the middle of the new code.

It does this when you're sitting at a breakpoint and you exit. This results in jsd_DebuggerOff(), which calls _destroyJSDContext(), which releases JSD's 'dumbContext'. And that ends up throwing out all the compiled code, some of which is still executing.

I'm attaching a sample stack from my 'dumbbt' tool (see <http://infomonkey.cdleary.com/questions/29/how-can-i-see-a-c-stack-trace-above-mjitted-stack-frames>). It also contains a regular gdb backtrace, which has proper filename:lineno entries until it gets lost in jitted code.
Keywords: stackwanted
Summary: Crashes after turning debugging off → Crashes after turning debugging off because JM releases running code
Assignee: general → dvander
Status: NEW → ASSIGNED
Comment on attachment 492497 [details] [diff] [review]
don't allocate JIT code across compartments

Thanks, Steve, for debugging this down to exactly what was going wrong. JSD's dummy context and compartment bites us again, this time fooling us into generating code into the wrong compartment's executable pool.
Attachment #492497 - Flags: review?(dmandelin)
Attachment #492497 - Flags: review?(dmandelin) → review+
http://hg.mozilla.org/mozilla-central/rev/c8b632c50cd0
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.