Closed Bug 621716 Opened 14 years ago Closed 14 years ago

crash [@ js::LoopProfile::profileOperation][@ js::LoopProfile::decide][@ js::LoopProfile::profileLoopEdge(JSContext*, unsigned int&)] [@ js::LoopProfile::profileOperation(JSContext*, JSOp)][@ js::LoopProfile::decide(JSContext*)] methodjit for DOM workers

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: marcia, Assigned: billm)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(1 file)

Seen while reviewing crash stats. New crash which started showing up using the 2010122800 builds. Link to crashes, which are primarily Mac but one Linux so far: http://tinyurl.com/29haczj

Possible pushlog regression range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=24b63f638579&tochange=e928817fb4e9

Frame 	Module 	Signature [Expand] 	Source
0 	XUL 	js::LoopProfile::profileOperation 	js/src/jstracer.cpp:16590
1 	XUL 	js::Interpret 	js/src/jsinterp.cpp:2654
2 	XUL 	js::MonitorTracePoint 	js/src/jstracer.cpp:16572
3 	XUL 	RunTracer 	js/src/methodjit/InvokeHelpers.cpp:964
4 		@0x119793494 	
5 	XUL 	js::mjit::JaegerShot 	js/src/methodjit/MethodJIT.cpp:745
6 	XUL 	js::Invoke 	js/src/jsinterp.cpp:654
7 	XUL 	js::ExternalInvoke 	js/src/jsinterp.cpp:858
8 	XUL 	JS_CallFunctionValue 	js/src/jsinterp.h:962
9 	XUL 	nsXPCWrappedJSClass::CallMethod 	js/src/xpconnect/src/xpcwrappedjsclass.cpp:1697
10 	XUL 	nsXPCWrappedJS::CallMethod 	js/src/xpconnect/src/xpcwrappedjs.cpp:588
11 	XUL 	PrepareAndDispatch 	xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_darwin.cpp:153
12 	XUL 	XUL@0xe051ea 	
13 	XUL 	nsCompressedMap::Lookup 	nsUnicharUtils.cpp:168
Regression. Should block.
blocking2.0: --- → ?
In the #2 spot today for new crashes is another similar stack, [@ js::LoopProfile::decide ]. Stacks looks like this:

Frame 	Module 	Signature [Expand] 	Source
0 	XUL 	js::LoopProfile::decide 	js/src/jstracer.cpp:1403
1 	XUL 	js::LoopProfile::profileOperation 	js/src/jstracer.cpp:16610
2 	XUL 	js::Interpret 	js/src/jsinterp.cpp:2654
3 	XUL 	js::MonitorTracePoint 	js/src/jstracer.cpp:16572
4 	XUL 	RunTracer 	js/src/methodjit/InvokeHelpers.cpp:964
5 		@0x11ad87534 	
6 	XUL 	js::mjit::JaegerShot 	js/src/methodjit/MethodJIT.cpp:745
7 	XUL 	js::Invoke 	js/src/jsinterp.cpp:654
8 	XUL 	js::ExternalInvoke 	js/src/jsinterp.cpp:858
9 	XUL 	JS_CallFunctionValue 	js/src/jsinterp.h:962
10 	XUL 	nsXPCWrappedJSClass::CallMethod 	js/src/xpconnect/src/xpcwrappedjsclass.cpp:1697
11 	XUL 	nsXPCWrappedJS::CallMethod 	js/src/xpconnect/src/xpcwrappedjs.cpp:588
12 	XUL 	PrepareAndDispatch 	xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_darwin.cpp:153
13 	XUL 	XUL@0xe051ea 	
14 	XUL 	-[MacApplicationDelegate handleAppleEvent:withReplyEvent:] 	toolkit/xre/MacApplicationDelegate.mm:324
15 	XUL 	js::AutoCompartment::enter 	js/src/jswrapper.cpp:361
16 	XUL 	JS_EnterCrossCompartmentCall 	js/src/jsapi.cpp:1165
17 		@0x12141323f 	
18 	libSystem.B.dylib 	libSystem.B.dylib@0x6bf9 	
19 	libnspr4.dylib 	PR_Unlock 	nsprpub/pr/src/pthreads/ptsynch.c:237
20 	XUL 	nsDOMFireEventRunnable::Run 	dom/src/threads/nsDOMWorker.cpp:1008
21 	XUL 	nsDOMWorkerRunnable::Run 	dom/src/threads/nsDOMThreadService.cpp:493
22 	XUL 	nsThreadPool::Run 	xpcom/threads/nsThreadPool.cpp:221
23 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:626
24 	XUL 	NS_ProcessNextEvent_P 	nsThreadUtils.cpp:250
25 	XUL 	nsThread::ThreadFunc 	xpcom/threads/nsThread.cpp:278
26 	libnspr4.dylib 	_pt_root 	nsprpub/pr/src/pthreads/ptthread.c:187
27 	libSystem.B.dylib 	libSystem.B.dylib@0x3a535 	
28 	libSystem.B.dylib 	libSystem.B.dylib@0x3a3e8 	
29 	libnspr4.dylib 	PR_JoinThread 	nsprpub/pr/src/pthreads/ptthread.c:577

Should this one be a new bug or is it related to this one?
Yes, these two signatures probably have the same underlying cause. This started happening when we turned on the methodjit for DOM workers. All the stacks seem to contain nsDOMWorkerRunnable::Run, and there's not much else in the regression range anyway.
blocking2.0: ? → beta9+
Changing to all since I see a bunch of Windows crashes as well: http://tinyurl.com/2eto2zu
OS: Mac OS X → All
Hardware: x86 → All
Attached patch possible fixSplinter Review
This patch may fix the problem. It adds a couple calls to AbortProfiling so that they match AbortRecording calls.

I think the one that may be causing the problem is in ResetJITImpl. The problem here is that we call tm->flush(), which rewinds tm->dataAlloc. This is where profiles are allocated from. If we are currently profiling when this happens, tm->profile will continue to point into the now invalid tm->dataAlloc area. The AbortProfiling call NULLs out tm->profile in this case.

I'd like to land this on m-c to see if the crashes go away.
Assignee: general → wmccloskey
Status: NEW → ASSIGNED
Attachment #500114 - Flags: review?(dmandelin)
Attachment #500114 - Flags: review?(dmandelin) → review+
I'm landing this tentatively to see if the crashes go away.

http://hg.mozilla.org/tracemonkey/rev/67c778fa2403
http://hg.mozilla.org/mozilla-central/rev/aa25ead36903
It looks like this didn't get fixed. The patches above are still correct, so I'm going to leave them in.

The correlation data isn't present right now, but from manual inspection I've noticed that all the crashes happen with the Ghostery extension, which doesn't seem to be all that common overall. I'll install it and see if I can reproduce the crash.

Also, I realized that the patch to move TraceMonitor from ThreadData to JSCompartment hasn't landed on mozilla-central yet. It's possible that the profiler isn't happy if a worker moves from one thread to another. I'll need to look over the code and see what effect this might have.
(In reply to comment #7)
> Also, I realized that the patch to move TraceMonitor from ThreadData to
> JSCompartment hasn't landed on mozilla-central yet. It's possible that the
> profiler isn't happy if a worker moves from one thread to another. I'll need to
> look over the code and see what effect this might have.

Maybe it would be a good idea to just export that one over right away, since people also want it to test perf.
(In reply to comment #8)
> Maybe it would be a good idea to just export that one over right away, since
> people also want it to test perf.

OK, done. Let's see if the problem goes away.
Crashes in js::LoopProfile::decide and js::LoopProfile::profileOperation became the #1 and #2 topcrashes on Windows and Mac starting on the December 28 builds, and stayed around on December 29.  It looks like they're gone in today's builds, but I'd give it another few hours to be sure.  (http://dbaron.org/mozilla/crashes-by-build has some links that may be useful.)
Keywords: topcrash
Summary: Firefox 4.0b9pre crash in [@ js::LoopProfile::profileOperation ] → Firefox 4.0b9pre crash in [@ js::LoopProfile::profileOperation ] [@ js::LoopProfile::decide]
I still don't see it today. Given the number of crashes in the last two days, I think it's safe to say that it's fixed now.

Here's what I think may have been happening. The methodjit stores a pointer to the LoopProfile inside its trace IC. Say a profile is created on thread A, and then the script moves to thread B. Along with profile in the trace IC, we save the current "flush epoch"; we only use the profile if this number hasn't changed (meaning no intervening flushes). But the flush epoch of threads A and B are maintained independently. So when the script moves to thread B, it may see the same flush epoch and use the profile stored in its trace IC. But thread A may have flushed, making the profile invalid.

With the patch from bug 584860, the flush epoch is stored in the compartment instead of the thread. So everything works correctly.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
other signatures matching this time frame for both firefox and thunderbird are js::LoopProfile::profileLoopEdge(JSContext*, unsigned int&)
js::LoopProfile::profileOperation(JSContext*, JSOp) 

gone from thunderbird nightlies also.
v.fixed based on crash-stats
Blocks: 615723
Status: RESOLVED → VERIFIED
Summary: Firefox 4.0b9pre crash in [@ js::LoopProfile::profileOperation ] [@ js::LoopProfile::decide] → crash [@ js::LoopProfile::profileOperation][@ js::LoopProfile::decide][@ js::LoopProfile::profileLoopEdge(JSContext*, unsigned int&)] [@ js::LoopProfile::profileOperation(JSContext*, JSOp)][@ js::LoopProfile::decide(JSContext*)] methodjit for DOM workers
As per today's meeting, beta 9 will be a time-based release. Marking these all betaN+. Please move it back to beta9+ if  you believe it MUST be in the next beta (ie: trunk is in an unshippable state without this)
No longer blocks: 615723
blocking2.0: beta9+ → betaN+
Keywords: crash, topcrash
yeah, this still looks fixed.

https://crash-stats.mozilla.com/report/list?signature=js%3A%3ALoopProfile%3A%3AprofileOperation
Blocks: 615723
Keywords: crash, topcrash
Crash Signature: [@ js::LoopProfile::profileOperation] [@ js::LoopProfile::decide] [@ js::LoopProfile::profileLoopEdge(JSContext*, unsigned int&)] [@ js::LoopProfile::profileOperation(JSContext*, JSOp)] [@ js::LoopProfile::decide(JSContext*)]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: