Last Comment Bug 660778 - Stack Overflow Crash [@ js::gc::Cell::arenaHeader() | js::gc::Cell::address()] | ASSERTION: JSEventListener has wrong script context?: 'stack && NS_SUCCEEDED(stack- >Peek(&cx)) && cx && GetScriptContextFromJSContext(cx) == mContext'
: Stack Overflow Crash [@ js::gc::Cell::arenaHeader() | js::gc::Cell::address()...
Status: VERIFIED FIXED
fixed-in-tracemonkey
: assertion, crash
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: -- critical (vote)
: mozilla7
Assigned To: Bill McCloskey (:billm)
:
Mentors:
http://www.justin.tv/steven_bonnell_i...
: 637796 662634 (view as bug list)
Depends on: 663532
Blocks: 532972 602225 651445
  Show dependency treegraph
 
Reported: 2011-05-31 03:50 PDT by Bob Clary [:bc:]
Modified: 2012-03-07 07:31 PST (History)
16 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
fixed
+
fixed


Attachments
crash report xp (233.34 KB, text/plain)
2011-05-31 03:54 PDT, Bob Clary [:bc:]
no flags Details
patch (10.10 KB, patch)
2011-06-01 11:27 PDT, Bill McCloskey (:billm)
gal: review+
asa: approval‑mozilla‑beta+
Details | Diff | Review

Description Bob Clary [:bc:] 2011-05-31 03:50:44 PDT
1. http://www.justin.tv/steven_bonnell_ii/b/286904322
2. Stack Overflow Crash WinXP debug nightly

Associated Socorro Signature UnmarkGrayChildren

https://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&query=UnmarkGrayChildren&reason_type=contains&date=05%2F31%2F2011%2003%3A38%3A02&range_value=1&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=UnmarkGrayChildren

which shows mostly windows but one mac.

in vs2005 debugger:

inline uintptr_t
Cell::address() const
{
    uintptr_t addr = uintptr_t(this);
    JS_ASSERT(addr % Cell::CellSize == 0);
=>    JS_ASSERT(Chunk::withinArenasRange(addr));
    return addr;
}

		addr	118147992	unsigned int
		this	0x070acb98	const js::gc::Cell * const

>	mozjs.dll!js::gc::Cell::address()  Line 442 + 0x4 bytes	C++
 	mozjs.dll!js::gc::Cell::arenaHeader()  Line 449 + 0x8 bytes	C++
 	mozjs.dll!js::gc::AssertValidColor(const void * thing=0x070acb98, unsigned int color=1)  Line 508 + 0x8 bytes	C++
 	mozjs.dll!js_GCThingIsMarked(void * thing=0x070acb98, unsigned int color=1)  Line 565 + 0xd bytes	C++
 	xul.dll!xpc_IsGrayGCThing(void * thing=0x070acb98)  Line 155 + 0xc bytes	C++
 	xul.dll!UnmarkGrayChildren(JSTracer * trc=0x0012b564, void * thing=0x070acb98, unsigned int kind=0)  Line 583 + 0x15 bytes	C++
 	mozjs.dll!js::gc::Mark<JSObject>(JSTracer * trc=0x0012b564, JSObject * thing=0x070acb98)  Line 124 + 0x1d bytes	C++
 	mozjs.dll!js::gc::MarkKind(JSTracer * trc=0x0012b564, void * thing=0x070acb98, unsigned int kind=0)  Line 317 + 0xd bytes	C++
 	mozjs.dll!js::gc::MarkValueRaw(JSTracer * trc=0x0012b564, const js::Value & v={...})  Line 341 + 0x1b bytes	C++
 	mozjs.dll!js::gc::MarkValueRange(JSTracer * trc=0x0012b564, js::Value * beg=0x050a26f8, js::Value * end=0x050a2700, const char * name=0x00a843ec)  Line 357 + 0xd bytes	C++
 	mozjs.dll!js::gc::MarkValueRange(JSTracer * trc=0x0012b564, unsigned int len=1, js::Value * vec=0x050a26f8, const char * name=0x00a843ec)  Line 364 + 0x1b bytes	C++
 	mozjs.dll!fun_trace(JSTracer * trc=0x0012b564, JSObject * obj=0x070acc40)  Line 1974 + 0x2d bytes	C++
 	mozjs.dll!js::gc::MarkChildren(JSTracer * trc=0x0012b564, JSObject * obj=0x070acc40)  Line 665 + 0x10 bytes	C++
 	mozjs.dll!JS_TraceChildren(JSTracer * trc=0x0012b564, void * thing=0x070acc40, unsigned int kind=0)  Line 753 + 0xd bytes	C++
 	xul.dll!UnmarkGrayChildren(JSTracer * trc=0x0012b564, void * thing=0x070acc40, unsigned int kind=0)  Line 590 + 0x12 bytes	C++
...


Note that automation shows this same signature in aurora WinXP/Win7 for other justin.tv urls though not this one. 

For aurora, try http://www.justin.tv/the_jesus_chatline/b/286335617 and maybe reload to trigger the crash


Note we also get

###!!! ASSERTION: JSEventListener has wrong script context?: 'stack && NS_SUCCEEDED(stack-
>Peek(&cx)) && cx && GetScriptContextFromJSContext(cx) == mContext', file c:/work/mozilla/
builds/nightly/mozilla/dom/src/events/nsJSEventListener.cpp, line 220

###!!! ASSERTION: JSEventListener has wrong script context?: 'stack && NS_SUCCEEDED(stack-
>Peek(&cx)) && cx && GetScriptContextFromJSContext(cx) == mContext', file c:/work/mozilla/
builds/aurora/mozilla/dom/src/events/nsJSEventListener.cpp, line 220

See also bug 650519
Comment 1 Bob Clary [:bc:] 2011-05-31 03:54:57 PDT
Created attachment 536267 [details]
crash report xp
Comment 2 Gregor Wagner [:gwagner] 2011-05-31 20:43:42 PDT
Seems like custom JSTracers have problems not with over-recursion.
Comment 3 Bill McCloskey (:billm) 2011-06-01 11:27:17 PDT
Created attachment 536674 [details] [diff] [review]
patch

I haven't been able to reproduce the problem on this site, but I think this patch should fix the problem. It just checks for a stack overflow in UnmarkGray, and if it runs out of stack, it requires a GC to run before the next CC.
Comment 4 Andreas Gal :gal 2011-06-01 11:31:36 PDT
Comment on attachment 536674 [details] [diff] [review]
patch

This is a great fix! I didn't think of just forcing a GC instead of dealing with a delayed marking stack.
Comment 5 Gregor Wagner [:gwagner] 2011-06-01 12:48:19 PDT
 static void
 UnmarkGrayChildren(JSTracer *trc, void *thing, uint32 kind)
 {
+    int stackDummy;
+    if (!JS_CHECK_STACK_SIZE(trc->context->stackLimit, &stackDummy)) {

Maybe a stupid question but are we sure that stackLimit corresponds to the current thread?
Comment 6 Bill McCloskey (:billm) 2011-06-01 13:46:25 PDT
(In reply to comment #5)
> Maybe a stupid question but are we sure that stackLimit corresponds to the
> current thread?

Good question. The context that we're using is somehow derived from some kind of per-thread data in XPConnect, so I think it should be for the current thread. Whether it's actually been set properly is another matter. I see a bunch of callers to JS_SetNativeStackQuota, so I think we're probably safe.

Unfortunately, the patch is leaking on tryserver, so I'll need to sort this out.
Comment 7 Bill McCloskey (:billm) 2011-06-03 13:10:26 PDT
http://hg.mozilla.org/tracemonkey/rev/85083c2d1f32
Comment 8 Bill McCloskey (:billm) 2011-06-07 10:55:04 PDT
*** Bug 637796 has been marked as a duplicate of this bug. ***
Comment 9 Andrew McCreight [:mccr8] 2011-06-16 09:04:26 PDT
I believe that before my patch for Bug 663532 landed that the cycle collector never actually invoked the garbage collector, so we should make sure that that patch (and its followup that fixes a leak that my patch introduced...) ends up in the same release as this one.
Comment 10 Bob Clary [:bc:] 2011-06-24 18:57:25 PDT
This is fixed on mozilla-central right? Merged on June 6 by cdleary ?
http://hg.mozilla.org/mozilla-central/rev/85083c2d1f32
Comment 11 Mark Banner (:standard8) 2011-07-05 01:29:48 PDT
(In reply to comment #10)
> This is fixed on mozilla-central right? Merged on June 6 by cdleary ?
> http://hg.mozilla.org/mozilla-central/rev/85083c2d1f32

It looks to me like it landed and this should probably be marked fixed.

Given comment 9, is it too much for this to be pushed forward onto the Gecko 6 branch? or could it be possible? I'm asking this as it appears to be a top crasher for Thunderbird at the moment (#2).
Comment 12 Andrew McCreight [:mccr8] 2011-07-05 09:54:08 PDT
As I say above, note that you probably need Bug 663532 and Bug 664506, too, as the cycle collector wasn't working quite right, in a way that this bug depends on, as far as I can tell.
Comment 13 Bill McCloskey (:billm) 2011-07-05 09:55:33 PDT
http://hg.mozilla.org/mozilla-central/rev/85083c2d1f32
Comment 14 Bill McCloskey (:billm) 2011-07-05 09:55:51 PDT
*** Bug 662634 has been marked as a duplicate of this bug. ***
Comment 15 Scoobidiver (away) 2011-07-10 11:32:10 PDT
It is #9 top crasher in 6.0.
Comment 16 Bill McCloskey (:billm) 2011-07-10 12:01:21 PDT
Should we try to get this into 6?
Comment 17 Sheila Mooney 2011-07-11 14:39:30 PDT
We don't need to track this but request approval for 6 and it will get triaged.
Comment 18 Asa Dotzler [:asa] 2011-07-12 14:26:30 PDT
Comment on attachment 536674 [details] [diff] [review]
patch

Please land on both Aurora and Beta and please, in the future, please include a risk assessment with approval requests. Thanks.
Comment 19 Andrew McCreight [:mccr8] 2011-07-12 15:35:34 PDT
Comment on attachment 536674 [details] [diff] [review]
patch

This patch is already in Aurora, so I'm removing the approval flag.
Comment 20 Andrew McCreight [:mccr8] 2011-07-13 07:48:59 PDT
http://hg.mozilla.org/releases/mozilla-beta/rev/61bff9064c10
Comment 21 Dão Gottwald [:dao] 2011-07-13 07:55:04 PDT
(In reply to comment #20)
> http://hg.mozilla.org/releases/mozilla-beta/rev/61bff9064c10

You're supposed to update the status flags when landing stuff on branches.
Comment 22 Andrew McCreight [:mccr8] 2011-07-13 07:57:37 PDT
I thought I was supposed to wait until the landing actually stuck, but thanks, I'll set them immediately in the future.
Comment 23 AndreiD[QA] 2011-07-14 04:37:45 PDT
Verified on:
Mac: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:6.0) Gecko/20100101 Firefox/6.0
Windows 7: Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/6.0
Windows XP: Mozilla/5.0 (Windows NT 5.1; rv:6.0) Gecko/20100101 Firefox/6.0
Linux: Mozilla/5.0 (X11; Linux i686; rv:6.0) Gecko/20100101 Firefox/6.0

There are no errors/crashes while playing the videos / link provided in the description
Comment 24 Scoobidiver (away) 2011-07-14 05:00:05 PDT
(In reply to comment #23)
> Verified on:
> Mac: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:6.0) Gecko/20100101
> Firefox/6.0
> Windows 7: Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/6.0
> Windows XP: Mozilla/5.0 (Windows NT 5.1; rv:6.0) Gecko/20100101 Firefox/6.0
> Linux: Mozilla/5.0 (X11; Linux i686; rv:6.0) Gecko/20100101 Firefox/6.0
Which browser.startup.homepage_override.buildID? Firefox 6.0b1 or b2?
Comment 25 AndreiD[QA] 2011-07-14 05:05:04 PDT
> Which browser.startup.homepage_override.buildID? Firefox 6.0b1 or b2?
Build 2
Comment 26 :Ehsan Akhgari (busy, don't ask for review please) 2011-08-16 10:18:24 PDT
It was not clear whether this landed before the last Aurora merge or not, so now we don't exactly know whether this exists on Beta or not.  Somebody from the js team should confirm.
Comment 27 Andrew McCreight [:mccr8] 2011-08-16 10:37:28 PDT
I think it is did.  It is present in Aurora as of this morning.  I don't know if that is before or after the merge. http://hg.mozilla.org/releases/mozilla-aurora/file/37edafab8b33/xpcom/base/nsCycleCollector.cpp

If xpcom/base/nsCycleCollector.cpp has a call to NeedCollect(), then this patch has been applied.
Comment 28 Andrew McCreight [:mccr8] 2011-08-16 10:37:59 PDT
"I think it is did. "   This should be "I think it is."
Comment 29 christian 2011-09-15 13:40:53 PDT
mNeedGCBeforeCC is on mozilla-beta, marking this as fixed

Note You need to log in before you can comment on or make changes to this bug.