Closed Bug 1015339 Opened 9 years ago Closed 8 years ago

Dumping a JS stack crashes with ion code on the stack

Categories

(Core :: JavaScript Engine: JIT, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla37

People

(Reporter: bzbarsky, Assigned: jandem)

Details

Attachments

(5 files, 1 obsolete file)

<jandem-away> jesup: yeah, the problem is that FormatFrame is bogus for Ion frames
<jandem-away> bz: well it's the iter.computeThis in FormatFrame; that doesn't work
for Ion frames
FormatFrame calls FrameIter::computeThis() and since bug 716647 part 3 that function will now assert under FrameIter::abstractFramePtr() instead of returning early for Ion frames.

This should be reproducible in the browser by calling DumpJSStack() in gdb, but it'd be nice if we had a shell function for that too, so that the fuzzers would have caught this a long time ago.
Flags: needinfo?(shu)
The typical situation in Gecko code is that we don't have a cx in the frame where we want the stack.  DumpJSStack() has the benefit of locating the right cx (usually).

Should DumpJSStack() be calling js_DumpBacktrace instead of the APIs it calls now?
The easiest fix is just to rematerialize Ion frames when trying to dump them.

I don't know what people want to do about js_DumpBacktrace and adding a
DumpJSStack for the shell, so I didn't do anything about that.

js_DumpBacktrace seems like it should be preferred for simple backtraces.

DumpJSStack though, with its ability to print locals and args and this props
and all that jazz, is a different animal. FormatFrame should be rewritten to
operate on AbstractFramePtrs, but I don't have time to do that at the moment.
Attachment #8429757 - Flags: feedback?(jdemooij)
Flags: needinfo?(shu)
Comment on attachment 8429757 [details] [diff] [review]
Spot fix DumpJSStack.

Review of attachment 8429757 [details] [diff] [review]:
-----------------------------------------------------------------

I think it's simpler to skip the computeThis call for Ion frames.

Rematerializing the frames feels too heavy-weight to get a stack trace in gdb.
Attachment #8429757 - Flags: feedback?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #5)
> Comment on attachment 8429757 [details] [diff] [review]
> Spot fix DumpJSStack.
> 
> Review of attachment 8429757 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think it's simpler to skip the computeThis call for Ion frames.
> 
> Rematerializing the frames feels too heavy-weight to get a stack trace in
> gdb.

That's the dichotomy between js_DumpBacktrace and DumpJSStack I was talking about. js_DumpBacktrace is nice and simple for just a stack trace. DumpJSStack on the other hand prints all this detailed info. If speed is a concern, I say call js_DumpBacktrace.
OK, but then we definitely need a shell function for this.

That may be a good idea anyway, so that the fuzzers can hammer the materialize-Ion-frames code without entering debug mode.
Boris, how used to the "DumpJSStack" name are the Gecko devs? What do you think about the following rename?

js_DumpBacktrace -> js_DumpSimpleBacktrace
DumpJSStack -> js_DumpDetailedBacktrace
> how used to the "DumpJSStack" name are the Gecko devs?

Good question.  I expect most of them use it via something like https://wiki.mozilla.org/Gecko:Debugging_Tools#Dumping_the_JS_Stack or the equivalent bit in the in-tree .gdbinit

So renaming would probably work if you just updated the .gdbinit.

But why bother?  We can introduce a js_DumpDetailedBacktrace and just have the existing DumpJSStack() call it, right?
(In reply to Boris Zbarsky [:bz] from comment #9)
> > how used to the "DumpJSStack" name are the Gecko devs?
> 
> Good question.  I expect most of them use it via something like
> https://wiki.mozilla.org/Gecko:Debugging_Tools#Dumping_the_JS_Stack or the
> equivalent bit in the in-tree .gdbinit
> 
> So renaming would probably work if you just updated the .gdbinit.
> 
> But why bother?  We can introduce a js_DumpDetailedBacktrace and just have
> the existing DumpJSStack() call it, right?

Yeah, certainly can.
A simpler lighterweight spotfix.
Attachment #8447351 - Flags: review?(bzbarsky)
Comment on attachment 8447351 [details] [diff] [review]
Spot fix FormatStackDump.

r=me
Attachment #8447351 - Flags: review?(bzbarsky) → review+
This is still crashing in exactly the same function and in the same exact way, but with a different operation on the frame iterator.  Now it's:

  arg = iter.unaliasedActual(i, DONT_CHECK_ALIASING);

leading to:

  MOZ_ASSERT(hasUsableAbstractFramePtr());
Naveed, can we find someone to fix this, please?
Flags: needinfo?(nihsanullah)
Patch today or tomorrow. This is exactly why I repeatedly asked for a shell function :(
Flags: needinfo?(nihsanullah) → needinfo?(jdemooij)
Attached patch imported patch shell-backtrace2 (obsolete) — Splinter Review
This is not that shell function, but it's what I have lying around in my queue. It calls js_DumpBacktrace. For this bug, I think you want FormatStackDump.
Assignee: nobody → sphink
Status: NEW → ASSIGNED
I suppose while I'm here, I ought to implement something. r? Waldo because I figure he's likely to object to my hacky interface.

I don't know how to test this properly. I'll attach a very basic test file that doesn't crash.

I will probably not be around to finish this up, so someone else should take this if it's useful.

(And I need to unassign myself from this bug.)
Attachment #8461016 - Flags: review?(jwalden+bmo)
Attachment #8461002 - Attachment is obsolete: true
Assignee: sphink → jdemooij
Could this change be backported to Firefox 31 ESR when fixed?

Yesterday I was trying to troubleshoot an issue where a website that I am working on crashes Firefox 31.0. (It runs fine in Firefox 30.0.) Due to this issue, I cannot print the JS stack trace in gdb to know what is causing the crash.

I have tested Firefox 32 builds and the issue is fixed in that branch. Normally I would just wait for FF32 to be released (currently scheduled for September 2), but much to my dismay, I discovered that Firefox 31.0 was the branch point for a new ESR line, and I am experiencing the same crash in Firefox 31.0 ESR.
(In reply to Daniel Trebbien from comment #21)
> Yesterday I was trying to troubleshoot an issue where a website that I am
> working on crashes Firefox 31.0. (It runs fine in Firefox 30.0.) Due to this
> issue, I cannot print the JS stack trace in gdb to know what is causing the
> crash.

Have you tried js_DumpBacktrace(cx) (comment 2)?
Steve, thanks for the patch to add the testing functions! Will see if I can repro with that.
> Have you tried js_DumpBacktrace(cx)

Typically this is needed in Gecko code which may not have a cx.  DumpJSStack finds the right one for you as needed.
Attached patch Fix + testSplinter Review
Requesting review from Waldo as he's also reviewing the other patch (on which this patch depends).

Patch uses MagicValue(JS_OPTIMIZED_OUT) if we can't get the argument (when the frame is an Ion frame), and then displays it as "[unavailable]". As this is just a debugging function, I don't want to spend too much time on it.

I verified the jit-test asserted with --ion-eager --ion-offthread-compile=off and the fuzzers can use this test to know about these new functions.
Attachment #8461516 - Flags: review?(jwalden+bmo)
Flags: needinfo?(jdemooij)
(In reply to Nicolas B. Pierron [:nbp] from comment #22)
> (In reply to Daniel Trebbien from comment #21)
> > Yesterday I was trying to troubleshoot an issue where a website that I am
> > working on crashes Firefox 31.0. (It runs fine in Firefox 30.0.) Due to this
> > issue, I cannot print the JS stack trace in gdb to know what is causing the
> > crash.
> 
> Have you tried js_DumpBacktrace(cx) (comment 2)?

I did. In lldb, I get:

error: warning: Stopped in a C++ method, but 'this' isn't available; pretending we are in a generic context
error: use of undeclared identifier 'js_DumpBacktrace'
error: 1 errors parsing expression

When I use gdb compiled from source (http://stackoverflow.com/questions/20553784/gdb-on-mac-10-9-fails-with-not-in-executable-format-file-format-not-recognized/24918436#24918436)
the `cx' variable is not available:

(gdb) bt
#0  js::ObjectImpl::numFixedSlots () at jsfriendapi.h:563
#1  js::shadow::Object::numFixedSlots () at :668
#2  js::CallObject::aliasedVarFromArguments () at :572
#3  0x00000001030db59a in ArgGetter (cx=<value temporarily unavailable, due to optimizations>) at /builds/slave/rel-m-rel-osx64_bld-0000000000/build/js/src/vm/ArgumentsObject.cpp:308
#4  0x000000005fbe43f0 in ?? ()
Previous frame inner to this frame (gdb could not unwind past this frame)
Current language:  auto; currently c++
(gdb) print cx
$1 = <value temporarily unavailable, due to optimizations>
Waldo do you think you can review these patches this week or should I ask somebody else?
Attachment #8461016 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8461516 [details] [diff] [review]
Fix + test

More than two weeks => switching review to shu.
Attachment #8461516 - Flags: review?(jwalden+bmo) → review?(shu)
Attachment #8461516 - Flags: review?(shu) → review+
Pushed with API change suggested by Waldo. You now have to do getBacktrace({ thisprops: true }) instead of getBacktrace("thisprops").

https://hg.mozilla.org/integration/mozilla-inbound/rev/549b4ef82544
Which versions of Firefox will include this change?  Is it going to be included in the next Firefox 31.0 ESR release?
This morning I tried out the web site where I was seeing the crash in Firefox 31.0 and 31.0esr, now using Firefox 31.1.0esr, and I still see the crash.  Unfortunately, I haven't been able to prepare a minimal test case.  I have isolated a particular line of JavaScript that "fixes" the crash when commented out, but this does not indicate where the crash might be occurring because the line enables a lot of other code, and attaching the JS debugger fixes the issue.

Attaching to the Firefox process in lldb, I see:

> (lldb) p (void)DumpJSStack()
> error: Error resuming inferior the 2 time: "Resume timed out.".
> Couldn't execute function; result was eExecutionSetupError

.. at the EXC_BAD_ACCESS.

Using gdb I see:

> (gdb) print (void)DumpJSStack()
> 
> Program received signal EXC_BAD_ACCESS, Could not access memory.
> Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000020
> 0x00000001031247d3 in js_CallContextDebugHandler ()
> The program being debugged was signaled while in a function called from GDB.
Part 2:

https://hg.mozilla.org/integration/mozilla-inbound/rev/4307416e17ed

Sorry for the delay, forgot to push this sooner.
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/4307416e17ed
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
You need to log in before you can comment on or make changes to this bug.