Closed Bug 519719 Opened 10 years ago Closed 10 years ago

TM: crash [@ JS_GetFrameThis] - SynthesizeFrame passes partly-uninitialized JSStackFrame to callHook

Categories

(Core :: JavaScript Engine, defect, P2, blocker)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- beta5-fixed

People

(Reporter: davida, Assigned: jorendorff)

References

Details

(Keywords: crash, Whiteboard: [crashkill]fixed-in-tracemonkey[firebug-p1])

Crash Data

Attachments

(1 file, 3 obsolete files)

bz sez it's related to firebug.

Report ID
        	Date Submitted
    
    
bp-2957a8ba-365f-4784-b985-83ebc2090930	9/30/09	8:27 AM
5A392FEE-A972-48FC-A4AD-14C6B1C87115	9/29/09	10:26 PM
bp-e670e956-c927-42ee-8903-935dc2090929	9/29/09	10:08 PM
bp-9669dd85-099b-44dc-8cac-2c08a2090929	9/29/09	7:55 PM
bp-37083bdd-14fe-4954-be4d-dbb692090929	9/29/09	6:41 PM
bp-ac636e77-bba5-442b-a4cd-b3a4f2090929	9/29/09	6:02 PM
bp-4ccbb440-d72b-4675-a392-f944c2090929	9/29/09	5:33 PM
bp-fb0f45b9-4703-4bf2-9602-8cd332090929	9/29/09	4:33 PM
bp-7d9ffafc-7cd7-4a3d-b3f1-07bc12090929	9/29/09	4:27 PM
bp-baa81903-4312-43ca-81fd-593fe2090929	9/29/09	2:33 PM
bp-1fb87891-b68e-42ee-b9ae-25ab62090929	9/29/09	2:27 PM
bp-fa64b2b5-7ce2-47f8-9ee5-519862090929	9/29/09	12:54 PM
bp-6f1dd6d9-619f-4849-9b8a-b2a2d2090929	9/29/09	11:46 AM
bp-cff4cab8-27fe-4776-bd99-f14472090929	9/29/09	10:14 AM
0  	 	@0x801f0fc3  	
1 	libmozjs.dylib 	JS_GetFrameThis 	js/src/jsdbgapi.cpp:1144
2 	XUL 	_callHook 	js/jsd/jsd_step.c:133
3 	XUL 	jsd_FunctionCallHook 	js/jsd/jsd_step.c:285
4 	libmozjs.dylib 	SynthesizeFrame 	js/src/jstracer.cpp:5271
5 	libmozjs.dylib 	LeaveTree 	js/src/jstracer.cpp:6355

the relevant JS_GetFrameThis code looks sorta like this:

1143     if (JSVAL_IS_NULL(fp->thisv) && fp->argv)
1144         fp->thisv = OBJECT_TO_JSVAL(js_ComputeThis(cx, JS_TRUE, fp->argv));

I wish we had steps to reproduce, but David says it just randomly happens as he browses.  The stacks are running XBL ctors.

ccing some folks who've been in this code recently.  But why are we ending up inside this hook anyway while leaving trace?  That seems really odd to me; can that code really handle arbitrary js being reentered under it (which a call hook can certainly do)?

I assume this happened as a result of the most recent tracemonkey merge...  Maybe it's worth running firefox + firebug on m-c through valgrind?
Flags: blocking1.9.2?
Which version of Firebug? I guess 1.5a25.

I don't understand how you can end up in call hook by accident. 

Open Firebug Tracing, Firebug > Firebug Icon Menu > Open Tracing 
Options > FBS_STEP

If you see anything before the crash post it. If not you will have to run from the OS command line after creating firefox about:config option 
extensions.firebug-tracing-service.DBG_toOSConsole
set true.
> I don't understand how you can end up in call hook by accident. 

I'm not sure what you mean by "by accident".  We're ending up in a call hook because we're reconstructing a js stack after exiting from jitted code.
Summary: lots of crashes pointing to JS_GetFrameThis → lots of crashes [@ JS_GetFrameThis]
(In reply to comment #3)
> > I don't understand how you can end up in call hook by accident. 
> 
> I'm not sure what you mean by "by accident".  We're ending up in a call hook
> because we're reconstructing a js stack after exiting from jitted code.

Ok, Firebug is only involved because we turned on jsd. Our call hook is called callCallHook in the C++ code ;-).
Right.  This is purely jsd/js internal; no Firebug code obviously involved.
I just hit bp-0b6f2d8d-7bc3-415c-ad15-9112d2091008

No Firebug, but I did install and use Venkman earlier today.
Just to point out: dolske and ascher must be doing something else special because otherwise firebug users would be hitting this all the time.
David's on 1.9.2, I'm on 1.9.3. Quite possible this is a regression since 3.5; I'm assuming there are not many people testing Firebug on these versions.
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
there are a few of us using firebug on 1.9.2+. I haven't hit any crashes in my daily use of Firebug 1.5 on 1.9.2, but I may not be hitting exercising it heavily.

David, any steps to reproduce those crashes above?
Keywords: crash, crashreportid
i claim this is a problem with the caller of jsd, if it can't safely handle reentrancy, then it shouldn't be calling jsd which is allowed to perform arbitrary function calls.
Assignee: nobody → general
Component: JavaScript Debugging APIs → JavaScript Engine
QA Contact: jsd → general
Timeless is right. Taking.
Assignee: general → jorendorff
LeaveTree:
>        /*
>         * Synthesize a stack frame and write out the values in it using the
>         * type map pointer on the native call stack.
>         */
>        SynthesizeFrame(cx, *fi, callee);
>        int slots = FlushNativeStackFrame(cx, 1 /* callDepth */,
>                                          (*callstack)->get_typemap(),
>                                          stack, cx->fp, 0);

SynthesizeFrame:
>#ifdef DEBUG
>    // Initialize argv[-1] to a known-bogus value so we'll catch it if
>    // someone forgets to initialize it later.
>    newifp->frame.argv[-1] = JSVAL_HOLE;
>#endif

Poison!

Later in SynthesizeFrame we call the callHook. But argv[-1] will not be properly populated until after SynthesizeFrame returns and FlushNativeStackFrame is called.

It only gets worse from here. Every time SynthesizeFrame calls the callHook, it passes a stack in various stages of undress.

To answer bz's query: We make no attempt to call the callHook while actually on trace. But we call it here, when exiting a trace, in order to try to keep a much weaker API promise that callHook calls will come in FIFO order. As always, breaking the API turns out to have been a mistake; the fix is to honor the API by disabling the JIT if there's a callHook.
Keywords: crashreportid
Attached patch WIP 1 - just a test (obsolete) — Splinter Review
jsdbgapi.cpp needs a mechanism for disabling the JIT runtime-wide.  Suppose I add a field rt->debuggerInhibitsJIT, which the jsdbgapi.cpp functions set and clear appropriately.

I assume changing TRACING_ENABLED to query both rt->debuggerInhibitsJIT and the cx->options is no good.

If that is correct, here is the alternative. Add a field cx->jitEnabled. Change TRACING_ENABLED to test it. Calls to JS_{Set,Toggle}Options that set JSOPTION_JIT normally set cx->jitEnabled, but not if rt->debuggerInhibitsJIT. And any time rt->debuggerInhibitsJIT becomes true, we iterate over all contexts, clearing acx->jitEnabled everywhere. (If acx is on trace, we can use operationCallbackFlag perhaps?)

Brendan or Igor, comments?
Summary: lots of crashes [@ JS_GetFrameThis] → TM: crash [@ JS_GetFrameThis] - SynthesizeFrame passes partly-uninitialized JSStackFrame to callHook
Sounds good (para 3 in comment 14. Can't believe we were unprotected...

/be
Attached patch WIP 2 - works, likely slow (obsolete) — Splinter Review
I went ahead and made TRACING_ENABLED slower in this patch.

This isn't thread-safe but I could make it thread-safe pretty easily (rtLock around the relevant JSRuntime fields). The reason I'm not done with this already is that I'm worried about thread-safety, if I go the route of comment 14 para 3.
Attachment #410062 - Attachment is obsolete: true
Possibilities:

0. This cx->jitEnabled field is unnecessary; don't do it. If you set debug hooks while other threads are on trace, you might end up with hooks not being called properly for a while.

1. Add cx->jitEnabled, for speed, but leave it non-thread-safe. You might get the above symptoms. Also, if you happen to JS_SetOptions while another thread is modifying runtime-wide debug hooks, you could end up with the JIT disabled in your context when it shouldn't be.

2. Add cx->jitEnabled and make it thread-safe using the request model. In a request on cx, you can read or write cx->jitEnabled; when a runtime-wide debug hook change forces us to modify all contexts, do the GC rendezvous dance.

3. Add cx->jitEnabled and make it semi-thread-safe. Write only in rtLock but read without locking. Enter rtLock in every JS_SetOption call. When walking all contexts, hold rtLock the whole time.
os breakdown
  19 JS_GetFrameThis Mac OS X 10.6.2 10C540
   6 JS_GetFrameThis Mac OS X 10.6.1 10B504
   1 JS_GetFrameThis Mac OS X 10.5.8 9L30
   1 @0x0 | JS_GetFrameThis Mac OS X 10.6.2 10C540

distribution of all versions where the JS_GetFrameThis crash was found on 20091112-crashdata.csv
  23 Firefox 3.6b2
   2 Firefox 3.6b1
   2 Firefox 3.5.5
ETA: Tomorrow. Probability: 75%.

Plan: Take comment 17 option 0, perf-test that, and if there is no measurable hit, push it. If there is a measurable hit or if option 0 is shot down on general principle, I'll get advice from Brendan on how to proceed. Hopefully option 3, which is easy to implement.

(I'm more optimistic about option 0 now than I was when I wrote comment 14 paragraph 2.)
Didn't make it. Tomorrow for sure.
Looking like this will still make it today, Jason?
Yeah. WIP 2 turned out to be about a 1.6% slowdown, so I'm having to do a cx->jitEnabled flag. It'll make it today.
Attached patch WIP 2a - still 0.8% slower (obsolete) — Splinter Review
Attachment #412133 - Attachment is obsolete: true
Attached patch v3Splinter Review
** TOTAL **:           -                 853.4ms +/- 0.7%   851.6ms +/- 0.5%
Attachment #413551 - Attachment is obsolete: true
Attachment #413569 - Flags: review?(mrbkap)
Comment on attachment 413569 [details] [diff] [review]
v3

I like it.
Attachment #413569 - Flags: review?(mrbkap) → review+
http://hg.mozilla.org/tracemonkey/rev/2d7ebcd4da6c
Whiteboard: fixed-in-tracemonkey
No longer blocks: 521844
Duplicate of this bug: 521844
Looks if reports such as http://crash-stats.mozilla.com/report/index/9d43e1eb-a739-42e9-b400-c118f2091118 on Mac are the same crash? Users reported crashing when popping out Facebook chats and with the Blackboard sites that are used by universities. IF this made the merge into B4 I assume we would see those crashes go away?
Looks like this didn't make the merge into beta4, although I'd been assuming that it would (oops).
Bug 530567 is topcrash #4 (#2 on Mac) for F3.6b4 if you add its signatures together.  I'm glad we have a fix here.
Whiteboard: fixed-in-tracemonkey → [crashkill]fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/2d7ebcd4da6c
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Duplicate of this bug: 530198
Blocks: 532114
Duplicate of this bug: 531837
Whiteboard: [crashkill]fixed-in-tracemonkey → [crashkill]fixed-in-tracemonkey[firebug-p1]
Crash Signature: [@ JS_GetFrameThis]
You need to log in before you can comment on or make changes to this bug.