Closed Bug 366975 Opened 18 years ago Closed 18 years ago

js_NewGCThing asserts !flbase[flindex] because it assumes GC stays locked across last-ditch GC

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dbaron, Assigned: igor)

References

Details

(Keywords: fixed1.8.0.12, fixed1.8.1.4, Whiteboard: [patch][need testcase])

Attachments

(5 files, 3 obsolete files)

I'm seeing the first of the:
            JS_ASSERT(!flbase[flindex]);
assertions in js_NewGCThing quite frequently with leak-monitor and a large saved session.  The chain of events causing the assertion is the following:

1. we call js_NewGCThing while rt->gcMallocBytes >= rt->gcMaxMallocBytes + cx->thread-gcMallocBytes

2. We skip allocating from the free list because of this, set doGC to true, and call js_GC(cx, GC_LAST_DITCH)

3. js_GC does the GC, then *unlocks* the GC lock and calls rt->gcCallback(cx, JSGC_END).

4. leak-monitor's GC callback for JSGC_END sometimes allocates GC things, and sometimes triggers an additional GC.  In the past I believe this was considered OK.

5. we continue in js_NewGCThing, get a thing from arenaList->freeList, and then assert, because cx->thread->gcFreeLists have been added to during the JSGC_END callbacks.
Attached patch possible patch (obsolete) — Splinter Review
I've been running with this for about a day and I haven't noticed any problems.
Assignee: general → dbaron
Whiteboard: [patch]
> 4. leak-monitor's GC callback for JSGC_END sometimes allocates GC things, and
> sometimes triggers an additional GC.  In the past I believe this was considered
> OK.

It is not OK if the leak monitor calls JS_GC() directly or indirectly since it  violates an assumption that newborn array survives the last ditch GC  that a few places in SM relies on for GC safty. On the other hand if the leak monitor triggers an extra GC through just allocating more GC things, then it is OK. 


How does the leak monitor triggers GC?
Attachment #251594 - Flags: review?(igor) → review+
It triggers GC by calling JS_GC.

That said, if you're not holding the GC lock, what's to prevent something else from triggering JS_GC in a much more roundabout way?
(In reply to comment #3)
> It triggers GC by calling JS_GC.

Then we have a bug. It should not call the GC when the GC is the last ditch. I think we can fix this via checking a recursive invocation of the ditch GC in JS_GC and friends. 

> 
> That said, if you're not holding the GC lock, what's to prevent something else
> from triggering JS_GC in a much more roundabout way?

GC is cooperative and it would not run if other JS threads are running. As such the code can rely on the fact that a sequence of GC-allocations would not collect GC things rooted through newborn arrays even if the last ditch GC is triggered multiple times. 

This is fragile and hopefully the statick GC analyzer would allow to get rid of newborns even in time for 1.9, but for now we have to support that.  
(In reply to comment #2)
> On the other hand if the leak monitor
> triggers an extra GC through just allocating more GC things, then it is OK. 

I am stupid. It is not OK to call any GC-allocations from the JSGC_END since it would replace the context of newborn array.
Comment on attachment 251594 [details] [diff] [review]
possible patch

Sorry, I have to minus the patch since the assertion is the right one. Effectively it is a consequence of the requirement that no new objects should be allocated during that last ditch GC.
Attachment #251594 - Flags: review+ → review-
(In reply to comment #4)
> > That said, if you're not holding the GC lock, what's to prevent something else
> > from triggering JS_GC in a much more roundabout way?
> 
> GC is cooperative and it would not run if other JS threads are running.

This is called the request model, FYI anyone new to SpiderMonkey or its threaded mode of operation.

JSGC_END is used by XPConnect too -- it does deferred releases from that hook, and those could allocate new GC-things.  In the last ditch case, this all must be deferred further.

/be

(In reply to comment #7)
> JSGC_END is used by XPConnect too -- it does deferred releases from that hook,
> and those could allocate new GC-things.  In the last ditch case, this all must
> be deferred further.

Should we fix implementations of JSGC_END to avoid JS_GC() and GC allocations under the last ditch? Or should we patch the GC to support that?
(In reply to comment #8)
> (In reply to comment #7)
> > JSGC_END is used by XPConnect too -- it does deferred releases from that hook,
> > and those could allocate new GC-things.  In the last ditch case, this all must
> > be deferred further.
> 
> Should we fix implementations of JSGC_END to avoid JS_GC() and GC allocations
> under the last ditch? Or should we patch the GC to support that?

If we had automatic LIFO-allocated newborn roots, we wouldn't need to forbid either GC or allocation from JSGC_END. We shouldn't try to change embeddings at this point, we should automate inside SpiderMonkey to cope.

/be
(In reply to comment #9)
> We shouldn't try to change embeddings at
> this point, we should automate inside SpiderMonkey to cope.

OK, then I patch the code to allow allocations and GC calls from the last ditch GC.
Assignee: dbaron → igor
Attached patch Fix (obsolete) — Splinter Review
The patch extends David's patch by preserving atoms and weak roots during execution of JSGC_END callback. To avoid code duplication I moved newborn, lastInternalReasult and lastAtom into a single JSWealRoots struct. That triggered changes in a few files causing the patch bloat. 

To root temporaries the patch adds new temp local root, JSTVU_WEAK_ROOTS. To save source bloat I redefine JS_PUSH_TEMP_ROOT* macros. The new version requires much less source lines.
Attachment #251594 - Attachment is obsolete: true
Attachment #251823 - Flags: review?(brendan)
(In reply to comment #12)
> Created an attachment (id=251823) [details]
> Fix

The patch also changes the routing of a temporary generator list from marker to own JSTVU_GEN_LIST to reduce the code size.
Attached patch Fix v2 (obsolete) — Splinter Review
Here is a smaller patch without JS_PUSH_TEMP_ROOT refactoring. That can be done when tailoring SM for the static GC analyzer.
Attachment #251921 - Flags: review?(brendan)
Comment on attachment 251921 [details] [diff] [review]
Fix v2

>@@ -2693,21 +2707,18 @@ js_GC(JSContext *cx, JSGCInvocationKind 
>     rt = cx->runtime;
> #ifdef JS_THREADSAFE
>     /* Avoid deadlock. */
>     JS_ASSERT(!JS_IS_RUNTIME_LOCKED(rt));
> #endif
> 
>     if (gckind == GC_LAST_DITCH) {
>-        /* The last ditch GC preserves all atoms and cx->newborn. */
>+        /* The last ditch GC preserves all atoms and newborn. */

Nit: "and newborns" -- or "and weak roots."

>+            memcpy(&savedWeakRoots, &cx->weakRoots, sizeof(JSWeakRoots));

Use struct assignment, let the compiler optimize?

>+            JS_PUSH_TEMP_ROOT_WEAK_COPY(cx, &savedWeakRoots, &tvr);

At this point the JS_PUS_TEMP_ROOT_XXX pattern hurts readability, compared to JS_PUSH_TEMP_WEAK_ROOTS, but I don't have a good suggestion here, because breaking symmetry hurts too, and the COPY or "saved" notion is important to convey too. Just noting, if you have a better idea for naming here, please propose.


>+typedef struct JSWeakRoots {
>+
>+    /* Most recently created things by type, members of the GC's root set. */
>+    JSGCThing           *newborn[GCX_NTYPES];
>+
>+    /* Atom root for the last-looked-up atom on this context. */
>+    JSAtom              *lastAtom;
>+
>+    /* Root for the result of the most recent js_InternalInvoke call. */
>+    jsval               lastInternalResult;
>+
>+} JSWeakRoots;

Nit: prevailing style does not add blank lines around struct body lines.

r=me with nits picked. Good patch, thanks for fixing this.

/be
Attachment #251921 - Flags: review?(brendan) → review+
(In reply to comment #15)
> >+            JS_PUSH_TEMP_ROOT_WEAK_COPY(cx, &savedWeakRoots, &tvr);
> 
> At this point the JS_PUS_TEMP_ROOT_XXX pattern hurts readability, compared to
> JS_PUSH_TEMP_WEAK_ROOTS, but I don't have a good suggestion here, because
> breaking symmetry hurts too, and the COPY or "saved" notion is important to
> convey too. 

JS_PUSH_TEMP_ROOT_WEAK_COPY was the best variant that I came with. Perhaps we can replace JS_PUSH_TEMP_ROOT_ prefix by JS_PUSH_TR_ or JS_PUSH_TVR_ (tvr is such a good name!). But that is for another patch. 
(In reply to comment #16)
> JS_PUSH_TEMP_ROOT_WEAK_COPY was the best variant that I came with. Perhaps we
> can replace JS_PUSH_TEMP_ROOT_ prefix by JS_PUSH_TR_ or JS_PUSH_TVR_ (tvr is
> such a good name!). But that is for another patch. 

Indeed -- good idea too (_TVR).

/be
Attached patch Fix v3Splinter Review
Patch to commit with nits addressed.
Attachment #251823 - Attachment is obsolete: true
Attachment #251921 - Attachment is obsolete: true
Attachment #251956 - Flags: review+
Attachment #251823 - Flags: review?(brendan)
I committed the patch from comment 18 to the trunk:

Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.307; previous revision: 3.306
done
Checking in jsarray.c;
/cvsroot/mozilla/js/src/jsarray.c,v  <--  jsarray.c
new revision: 3.106; previous revision: 3.105
done
Checking in jsatom.c;
/cvsroot/mozilla/js/src/jsatom.c,v  <--  jsatom.c
new revision: 3.89; previous revision: 3.88
done
Checking in jscntxt.c;
/cvsroot/mozilla/js/src/jscntxt.c,v  <--  jscntxt.c
new revision: 3.102; previous revision: 3.101
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.140; previous revision: 3.139
done
Checking in jsfun.c;
/cvsroot/mozilla/js/src/jsfun.c,v  <--  jsfun.c
new revision: 3.175; previous revision: 3.174
done
Checking in jsgc.c;
/cvsroot/mozilla/js/src/jsgc.c,v  <--  jsgc.c
new revision: 3.201; previous revision: 3.200
done
Checking in jsgc.h;
/cvsroot/mozilla/js/src/jsgc.h,v  <--  jsgc.h
new revision: 3.60; previous revision: 3.59
done
Checking in jsinterp.c;
/cvsroot/mozilla/js/src/jsinterp.c,v  <--  jsinterp.c
new revision: 3.323; previous revision: 3.322
done
Checking in jsiter.c;
/cvsroot/mozilla/js/src/jsiter.c,v  <--  jsiter.c
new revision: 3.57; previous revision: 3.56
done
Checking in jsnum.c;
/cvsroot/mozilla/js/src/jsnum.c,v  <--  jsnum.c
new revision: 3.76; previous revision: 3.75
done
Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v  <--  jsobj.c
new revision: 3.323; previous revision: 3.322
done
Checking in jsregexp.c;
/cvsroot/mozilla/js/src/jsregexp.c,v  <--  jsregexp.c
new revision: 3.132; previous revision: 3.131
done
Checking in jsxml.c;
/cvsroot/mozilla/js/src/jsxml.c,v  <--  jsxml.c
new revision: 3.138; previous revision: 3.137
done
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
I think we should apply the patch to the branches for general stability.
Flags: blocking1.8.1.2?
Flags: blocking1.8.0.10?
This patch is a bit big and late to take as a stability patch in this round, but we'll try to get it in early for the next round of updates. Unless there's some serious easily-exploited security facet to this that we're not understanding.
Flags: wanted1.8.1.x+
Flags: wanted1.8.0.x+
Flags: blocking1.8.1.2?
Flags: blocking1.8.0.10?
Flags: in-testsuite-
Hernan Badenes reported the bug with firefox 1.8.1 as well so I suggest to include the fix into the next release.

On 22/03/07, Hernan Badenes wrote:
...
> In an ajax project I am participating, the application started (with some
> recent changes) to crash Firefox, both win and linux and from last released
> versions. The application does very heavy use of JS. In some moment, when we
> are creating lot of objects and manipulating iframes and dom trees,
> sometimes firefox simply crashes. We found the explosion here:
> 
>     Assertion failure: !flbase[flindex], at jsgc.c:1442
> 
> which, after searching in the web, led us this this bug report: 
>     https://bugzilla.mozilla.org/show_bug.cgi?id=366975
Flags: blocking1.8.1.4?
Flags: blocking1.8.0.12?
Flags: blocking1.8.1.4?
Flags: blocking1.8.1.4+
Flags: blocking1.8.0.12?
Flags: blocking1.8.0.12+
This is a 1.8 version of the trunk patch that required 3 trivial merges by hand.
Attachment #260663 - Flags: review+
Attachment #260663 - Flags: approval1.8.1.4?
Here is a proof of triviality of hand-merge into 1.8.1 codebase.
The non-trivial changes to the patch when porting to 1.8.0 came from 2 sources:

1. On 1.8.0 the weak roots are cleared in ForceGC, not as a part of js_GC implementation.

2. On 1.8.0 the last ditch is indicated via "gcflags & GC_ALREADY_LOCKED".

The rest hand-merges were trivial.
Attachment #260669 - Flags: review?(brendan)
Attachment #260669 - Flags: approval1.8.0.12?
Attachment #260669 - Flags: review?(brendan) → review+
Comment on attachment 260663 [details] [diff] [review]
Fix v3 for 1.8 branch

approved for 1.8.1.4, a=dveditz for release-drivers
Attachment #260663 - Flags: approval1.8.1.4? → approval1.8.1.4+
Comment on attachment 260669 [details] [diff] [review]
Fix v3 for 1.8.0 branch

approved for 1.8.0.12, a=dveditz for release-drivers
Attachment #260669 - Flags: approval1.8.0.12? → approval1.8.0.12+
I committed the patch from comment to MOZILLA_1_8_BRANCH:

Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.214.2.36; previous revision: 3.214.2.35
done
Checking in jsarray.c;
/cvsroot/mozilla/js/src/jsarray.c,v  <--  jsarray.c
new revision: 3.58.2.25; previous revision: 3.58.2.24
done
Checking in jsatom.c;
/cvsroot/mozilla/js/src/jsatom.c,v  <--  jsatom.c
new revision: 3.65.4.7; previous revision: 3.65.4.6
done
Checking in jscntxt.c;
/cvsroot/mozilla/js/src/jscntxt.c,v  <--  jscntxt.c
new revision: 3.60.4.13; previous revision: 3.60.4.12
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.80.4.21; previous revision: 3.80.4.20
done
Checking in jsfun.c;
/cvsroot/mozilla/js/src/jsfun.c,v  <--  jsfun.c
new revision: 3.117.2.30; previous revision: 3.117.2.29
done
Checking in jsgc.c;
/cvsroot/mozilla/js/src/jsgc.c,v  <--  jsgc.c
new revision: 3.104.2.32; previous revision: 3.104.2.31
done
Checking in jsgc.h;
/cvsroot/mozilla/js/src/jsgc.h,v  <--  jsgc.h
new revision: 3.33.4.11; previous revision: 3.33.4.10
done
Checking in jsinterp.c;
/cvsroot/mozilla/js/src/jsinterp.c,v  <--  jsinterp.c
new revision: 3.181.2.86; previous revision: 3.181.2.85
done
Checking in jsiter.c;
/cvsroot/mozilla/js/src/jsiter.c,v  <--  jsiter.c
new revision: 3.17.2.28; previous revision: 3.17.2.27
done
Checking in jsnum.c;
/cvsroot/mozilla/js/src/jsnum.c,v  <--  jsnum.c
new revision: 3.66.6.5; previous revision: 3.66.6.4
done
Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v  <--  jsobj.c
new revision: 3.208.2.51; previous revision: 3.208.2.50
done
Checking in jsregexp.c;
/cvsroot/mozilla/js/src/jsregexp.c,v  <--  jsregexp.c
new revision: 3.99.2.21; previous revision: 3.99.2.20
done
Checking in jsxml.c;
/cvsroot/mozilla/js/src/jsxml.c,v  <--  jsxml.c
new revision: 3.50.2.65; previous revision: 3.50.2.64
done
Keywords: fixed1.8.1.4
I committed the patch from comment 26 to MOZILLA_1_8_0_BRANCH:


Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.214.2.11.2.13; previous revision: 3.214.2.11.2.12
done
Checking in jsarray.c;
/cvsroot/mozilla/js/src/jsarray.c,v  <--  jsarray.c
new revision: 3.58.2.10.2.12; previous revision: 3.58.2.10.2.11
done
Checking in jsatom.c;
/cvsroot/mozilla/js/src/jsatom.c,v  <--  jsatom.c
new revision: 3.65.4.1.4.3; previous revision: 3.65.4.1.4.2
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.80.4.2.2.11; previous revision: 3.80.4.2.2.10
done
Checking in jsfun.c;
/cvsroot/mozilla/js/src/jsfun.c,v  <--  jsfun.c
new revision: 3.117.2.7.2.14; previous revision: 3.117.2.7.2.13
done
Checking in jsgc.c;
/cvsroot/mozilla/js/src/jsgc.c,v  <--  jsgc.c
new revision: 3.104.2.3.2.18; previous revision: 3.104.2.3.2.17
done
Checking in jsgc.h;
/cvsroot/mozilla/js/src/jsgc.h,v  <--  jsgc.h
new revision: 3.33.12.4; previous revision: 3.33.12.3
done
Checking in jsinterp.c;
/cvsroot/mozilla/js/src/jsinterp.c,v  <--  jsinterp.c
new revision: 3.181.2.17.2.31; previous revision: 3.181.2.17.2.30
done
Checking in jsnum.c;
/cvsroot/mozilla/js/src/jsnum.c,v  <--  jsnum.c
new revision: 3.66.6.1.2.1; previous revision: 3.66.6.1
done
Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v  <--  jsobj.c
new revision: 3.208.2.12.2.25; previous revision: 3.208.2.12.2.24
done
Checking in jsregexp.c;
/cvsroot/mozilla/js/src/jsregexp.c,v  <--  jsregexp.c
new revision: 3.99.2.7.2.7; previous revision: 3.99.2.7.2.6
done
Checking in jsstr.c;
/cvsroot/mozilla/js/src/jsstr.c,v  <--  jsstr.c
new revision: 3.108.2.5.2.7; previous revision: 3.108.2.5.2.6
done
Checking in jsxml.c;
/cvsroot/mozilla/js/src/jsxml.c,v  <--  jsxml.c
new revision: 3.50.2.15.2.35; previous revision: 3.50.2.15.2.34
done
Keywords: fixed1.8.0.12
I got a feedback from Hernan Badenes that with the bug fixed their application no longer crashes with FF 2.0.0.4. Since the application changed since the initial report, it does not mean that the committed fix addressed the issue, but it adds confidence.
Whiteboard: [patch] → [patch][need testcase]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: