Closed Bug 401687 Opened 13 years ago Closed 13 years ago

Stop refcounting JS objects in the cycle collector

Categories

(Core :: XPConnect, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla1.9beta1

People

(Reporter: peterv, Assigned: peterv)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 4 obsolete files)

Spun off from bug 379718 which contains the discussion that led to this patch.
Flags: blocking1.9?
Attached patch v2.6 (obsolete) — Splinter Review
Here's the latest version of my patch. Unfortunately this started leaking again, and I haven't been able to figure it out yet. I am seeing some weirdness were calling JS_TraceRuntime doesn't seem to hand back all the objects that the GC marks before JSGC_MARK_END.
Yep, this is a beta blocker.
Flags: blocking1.9? → blocking1.9+
I'm now also seeing "Fault in cycle collector: valid grey node after scanning" before the EM restart.
Why doesn't JSContextParticipant (in nsXPConnect.cpp) call js_TraceContext, like the nsJSContext tracing code used to?  Isn't there a bunch of stuff reachable from a context other than its global object?

While I'm there, in JSContextParticipant, you have:
        // For now we won't unlink a JSContext, especially because there's no
        // way to root/unroot it.
I'd make this stronger: you should say that you are not permitted to
touch the pointer because the Root and Unroot methods don't ensure that
it's valid.
I like the change to move the UnmarkPurple calls earlier.  We don't need to hold off on them, especially now that we properly handle the suppression of placement in the purple buffer during scanning.  But I don't see why you need to add a purple color to balance it out?  Why not just call UnmarkPurple immediately instead of setting the color to purple?  Then you can avoid the check in visitNode, etc.

I think you want to preserve the purple buffer as long as possible in case something fails, though, since you'll want to scan the same pointers later.  So perhaps UnmarkPurple shouldn't do Forget(), but instead, you should Forget all the pointers in mBuf later, once you know the scan was able to allocate all the memory it needed, but before mScanInProgress is set to PR_FALSE (which thus makes Suspect/Forget live again).  (Once we're done scanning, the collection can't fail, right?)

But this chunk seems backwards:
+        mScanInProgress = PR_FALSE;
+
+        ScanRoots(graph);
...and could probably cause a bunch of problems.
(In reply to comment #4)
> Why doesn't JSContextParticipant (in nsXPConnect.cpp) call js_TraceContext,
> like the nsJSContext tracing code used to?  Isn't there a bunch of stuff
> reachable from a context other than its global object?

That "bunch of stuff" forms real roots that are present only when the native stack contains JS_BeginRequest/JS_EndRequest pair for the given context. That is, when JS interpreter executes some script or inside some other JS API. Everything reachable from these roots should be already colored black during the initial invocation of JS_GC. On the other hand the global object from JSContext can form a garbage circle when the JSContext instance is outside the request. Hence JSContextParticipan must inform the cycle collector only about it, not the rest of roots. 
There is a problem with cx->requestDepth usage in the patch. Any call to JS_SuspendRequest will temporary set cx->requestDepth to 0. Thus the native stack may look:

Cycle collection invocation code
...
JS_SuspendRequest
...
JS_BeginRequest

In that case the global object is still a root. Yet the patch may collect it as it just checks for cx->requestDepth, not the number of nested 
JS_BeginRequest calls. 
(In reply to comment #4)
> Why doesn't JSContextParticipant (in nsXPConnect.cpp) call js_TraceContext,
> like the nsJSContext tracing code used to?  Isn't there a bunch of stuff
> reachable from a context other than its global object?

Actually, this seems like a valid point. Because cycle collection adds a request to one context we need to trace everything in a context from the cycle collector participant. That seems to fix my shutdown leaks. New patch coming up that has that and a couple of other small fixes.

(In reply to comment #7)
> There is a problem with cx->requestDepth usage in the patch. Any call to
> JS_SuspendRequest will temporary set cx->requestDepth to 0.

Ok, we'll need to find a solution for that.

I'll try to address the other comments tomorrow.
Attached patch v2.7 (obsolete) — Splinter Review
Attachment #286703 - Attachment is obsolete: true
I think you should initialize mJSRoots.ops to null in the constructor rather than in BeginCycleCollection -- then you can assert that it's null in BeginCycleCollection, right?

Why is mScopes the one piece that needs to be ifndef XPCONNECT_STANDALONE?  In any case, it should be ifdef-ed in nsXPConnect::FinishCycleCollection too.


I find the whole business of doing last minute marking in JSGC_MARK_END (in XPCCycleCollectGCCallback) a bit scary.  It's certainly bad for extensibility, since only one callback can do so, if it depends on what's already been marked, and that callback has to be the top callback so that it runs first.  (Other users seem to use it for freelist building.)

Speaking of extensibility, I'm sure this doesn't help python integration.  Although I suppose we could eventually just chain all the languages GC's from callbacks from each other, and then go into cycle collection once we've nested inside all of them.  But it seems saner to actually split the GC in half.  (Will this approach extend sanely to multiple language runtimes?  I haven't thought it through fully.)

You might want to explain the gCollect variable in comments a little bit.  It seems to me that it's intended to suppress repeated cycle collection when the JS GC repeats (due, e.g., to context destruction) but the cycle collector didn't collect anything the previous time through.  But I'm not sure if that's the intent.

The gCollections returned from nsXPConnect::Collect seems to be 1 smaller than the number of times DoCollect was called if DoCollect actually returns false.  Is that really what you intended?

And for what it's worth, igor actually had me convinced in comment 6.  Are you confident he's wrong?
Comment on attachment 286762 [details] [diff] [review]
v2.7

I think it is wrong to treat any root but the global object in JSContext specially. 

Based on the theory that garbage cycles that involves JSContext can only happen via its global object when the context is outside JS request it should not matter if one calls js_TraceContext(cx) or JS_CallTracer(cx->globalObject, JSTRACE_OBJECT). That is, if the patch changes js_TraceRuntime to ignore just the global object when the option is set (like it was in the patch 2.6) and replaces 
  js_TraceContext(trc, acx); 
calls by 
  if (acx->globalObject) js_CallTracer(trc, acx->globalObject, JSTRACE_OBJECT);
including the case of JSContextParticipant::Traverse, then everything should continue to work except that less unnecessary tracing will be done. 

If so, then the patch does not need to even touch js/src/*.[ch]. Instead it can add before JS_GC() in

>+    JSContext *cx = mCycleCollectionContext->GetJSContext();
>+    gOldJSGCCallback = JS_SetGCCallback(cx, XPCCycleCollectGCCallback);
>+    JS_GC(cx);
>+    JS_SetGCCallback(cx, gOldJSGCCallback);
>+    gOldJSGCCallback = nsnull;

code like:

    vector globalsThatCanBeGarbage;

    JSContext *iter = nsnull, *acx;
    while((acx = JS_ContextIterator(GetJSRuntime(), &iter)))
    {
        if (!context_inside_request(acx)) {
            globalsThatCanBeGarbage.add(acx->globalObject);
            acx->globalObject = NULL;
        }
    }
    
Here globalsThatCanBeGarbage would be precisely the set of grey global objects and should be treated as such. Then after the collection is done the JS contexts that held the collected globals can be destroyed and if the global survives GC, 
 it must be reinstalled into the original JSContext.
(In reply to comment #10)
> I find the whole business of doing last minute marking in JSGC_MARK_END (in
> XPCCycleCollectGCCallback) a bit scary.  It's certainly bad for extensibility,
> since only one callback can do so, if it depends on what's already been marked,
> and that callback has to be the top callback so that it runs first.  (Other
> users seem to use it for freelist building.)

The right thing to do would be to split JS_GC into a set of calls like it was proposed in bug 379718 comment 30. The problem is that it is not straightforward and would requires a cleanup in JS to do it in a sane way. So as a temporary solution I suggest to use the callback-based approach from Peter's patches and then address the modularity issue.



> 
> Speaking of extensibility, I'm sure this doesn't help python integration. 
> Although I suppose we could eventually just chain all the languages GC's from
> callbacks from each other, and then go into cycle collection once we've nested
> inside all of them.  But it seems saner to actually split the GC in half. 
> (Will this approach extend sanely to multiple language runtimes?  I haven't
> thought it through fully.)
> 
> You might want to explain the gCollect variable in comments a little bit.  It
> seems to me that it's intended to suppress repeated cycle collection when the
> JS GC repeats (due, e.g., to context destruction) but the cycle collector
> didn't collect anything the previous time through.  But I'm not sure if that's
> the intent.
> 
> The gCollections returned from nsXPConnect::Collect seems to be 1 smaller than
> the number of times DoCollect was called if DoCollect actually returns false. 
> Is that really what you intended?
> 
> And for what it's worth, igor actually had me convinced in comment 6.  Are you
> confident he's wrong?
> 

(In reply to comment #12)
> > And for what it's worth, igor actually had me convinced in comment 6.  Are you
> > confident he's wrong?

The bug in the patch from comment 2.6 is that it does not call the tracer in JSContextParticipant on the global object when the context does not belong to the request. This should result precisely in the shutdown leak. 
(In reply to comment #10)
> And for what it's worth, igor actually had me convinced in comment 6.  Are you
> confident he's wrong?

I am seeing JS contexts with requestDepth == 0 that do hold things other than the globalObject (so far I've only seen them holding things through weak roots).

(In reply to comment #13)
> The bug in the patch from comment 2.6 is that it does not call the tracer in
> JSContextParticipant on the global object when the context does not belong to
> the request. This should result precisely in the shutdown leak. 

v2.6 always traces the global object from a context, so I'm not sure what you mean.
(In reply to comment #14)
> (In reply to comment #10)
> > And for what it's worth, igor actually had me convinced in comment 6.  Are you
> > confident he's wrong?
> 
> I am seeing JS contexts with requestDepth == 0 that do hold things other than
> the globalObject (so far I've only seen them holding things through weak
> roots).

This is a misfeature of JS_GC. It clears the weak roots only for the current context while keeping the weak roots in the rest of places. This may be a bug but it could be be related to some issues with embeddings that does not follow the request model in SpiderMonkey. For example, for the browser this was fixed only recently, see bug 396452 and its dependency tree. It can also be related to the support of recursive invocation of JS_GC.  

To Brendan: could you comment on this?

To workaround it I suggest before calling JS_GC in the patch to iterate over contexts and call JS_ClearNewbornRoots on any JSContext with 0 request depth.

> > The bug in the patch from comment 2.6 is that it does not call the tracer in
> > JSContextParticipant on the global object when the context does not belong to
> > the request. This should result precisely in the shutdown leak. 
> 
> v2.6 always traces the global object from a context, so I'm not sure what you
> mean.

Sorry, I misread some code. But then the explanation for the shutdown leaks is the uncleared newborn stuff.
> I think you should initialize mJSRoots.ops to null in the constructor rather
> than in BeginCycleCollection -- then you can assert that it's null in
> BeginCycleCollection, right?

Done.

> Why is mScopes the one piece that needs to be ifndef XPCONNECT_STANDALONE?  In
> any case, it should be ifdef-ed in nsXPConnect::FinishCycleCollection too.

Yes, thanks for spotting that. The scopes only hold their script object principal if XPCONNECT_STANDALONE is not defined.

> I find the whole business of doing last minute marking in JSGC_MARK_END (in
> XPCCycleCollectGCCallback) a bit scary.

Yes, it's not optimal. I'd rather split up the GC into mark and sweep, but it seems non-trivial to do that.

> Speaking of extensibility, I'm sure this doesn't help python integration. 

I'm not sure how it makes it worse? Either python's GC can be split up into mark and sweep, then they could use BeginCollection to do the marking and EndCollection to do the sweeping. Or they need to do refcounting (like they currently need to). Note that the current pyxpcom code seems to use refcounting.

> You might want to explain the gCollect variable in comments a little bit.  It
> seems to me that it's intended to suppress repeated cycle collection when the
> JS GC repeats (due, e.g., to context destruction) but the cycle collector
> didn't collect anything the previous time through.  But I'm not sure if that's
> the intent.

Yes, this came up while debugging the shutdown leaks. In that case things were kept alive (through the weakroots) and so we weren't collecting anything, but trying anyway on each context destruction. I'm not convinced that we don't want to try to do cycle collection on every nested GC, but seeing it happen 10 times in a row without actually collecting anything looked bad.

> The gCollections returned from nsXPConnect::Collect seems to be 1 smaller than
> the number of times DoCollect was called if DoCollect actually returns false. 
> Is that really what you intended?

Well, it returns the number of collections that did collect things. That way, failing to do cycle collection or doing 1 collection that doesn't collect anything both return 0, and we can skip doing more collections. It does mean that if we do multiple collections and the last one doesn't collect anything we'll still try to do one more if the max hasn't been reached.

>         if (!context_inside_request(acx)) {
>             globalsThatCanBeGarbage.add(acx->globalObject);
>             acx->globalObject = NULL;
>         }

This looks scary to me, are we sure that nothing under JS_GC (eg code called from the gc callback) will try to use a context's global object?
(In reply to comment #16)
> >         if (!context_inside_request(acx)) {
> >             globalsThatCanBeGarbage.add(acx->globalObject);
> >             acx->globalObject = NULL;
> >         }
> 
> This looks scary to me, are we sure that nothing under JS_GC (eg code called
> from the gc callback) will try to use a context's global object?

Anything that looks for a global object from a GC callback is broken since there is no control over which JSContext will execute the callback. I.e. the code can either looks for all globals or none at all. I looked through the code implementing the GC callback and see no such constructions.
> Why not just call UnmarkPurple
> immediately instead of setting the color to purple? 

Done.

> I think you want to preserve the purple buffer as long as possible in case
> something fails, though, since you'll want to scan the same pointers later.  So
> perhaps UnmarkPurple shouldn't do Forget(), but instead, you should Forget all
> the pointers in mBuf later, once you know the scan was able to allocate all the
> memory it needed, but before mScanInProgress is set to PR_FALSE (which thus
> makes Suspect/Forget live again).

Ok, I think that works. Done.
Attached patch v2.8 (obsolete) — Splinter Review
Addresses most of the comments. Also uses igor's proposal for weakroots and globals.
Attachment #286762 - Attachment is obsolete: true
Attachment #286860 - Flags: review?(dbaron)
Igor, could you take a look at the solution for the weakroots and the context globals? I don't think this solves the JS_SuspendRequest problem you raised?
(In reply to comment #20)
> Igor, could you take a look at the solution for the weakroots and the context
> globals? 

That code should also check for the current context with 1 as requestDepth in the same way as JSContextParticipant does. Otherwise it is OK.

> I don't think this solves the JS_SuspendRequest problem you raised?

To address the JS_SuspendRequest problem it is sufficient to add another counter to JSContext that would be increased in JS_BeginRequest and decreased in JS_EndRequest. Then replacing the JSContext.requestDepth checks in the patch by the check against the new counter will fix the problem.
For what it's worth, I find the singular vs. plural issues in
HoldScriptObject/DropScriptObjects/NS_DROP_JS_OBJECTS/NS_HOLD_JS_OBJECTS
a bit confusing, though I'm not sure what to do about it.

I haven't looked very closely at the DEBUG_CC stuff, but that's not
critical.

Why are you removing the bodies of nsXPConnect::Root, Unroot, and
Unlink?  Those seem likely to be useful.

Could you add comments explaining why you're doing the stuff with
mClearedGlobalObjects?  I don't follow.

Why is XPCJSRuntime::AddXPConnectRoots needed?  Aren't all these objects
traced by TraceXPConnectRoots?  Why do we need to note them if they're
also traced?

XPCJSRuntime::RestoreContextGlobals looks potentially O(N^2) in the
number of contexts.  Shouldn't it use a hash table?  Or can you
guarantee that JS_ContextIterator returns the same order (and thus not
need to iterate)?  Likewise for GetUnsetContextGlobal, which I'd expect
to be O(1) rather than O(N).

I'm a little confused about what should be on XPCVariant vs.
XPCTraceableVariant.  Is it possible for an XPCVariant that's not an
XPCTraceableVariant to participate in a cycle?

Is the interaction with XPConnect's deferred releases OK?  I haven't
thought much about it, and it seems like it should be OK, I think.

I'm a little confused by XPCWrappedNative::NoteTearoffs?  Who owns what,
exactly?

In nsCycleCollector.cpp, mFollowupCollection could be ifdef DEBUG_CC.

In ExplainLiveExpectedGarbage, you still set mScanInProgress to false
before the call to ScanRoots, when I think it should be after.

In hindsight, I think ForgetPurple is probably a bad idea -- probably
better to just remove from the purple buf at the same time as the
UnmarkPurple call -- otherwise things could get confused in the future.

In any case, if igor's happy with it, I think this is in good enough
shape to land, so r+sr=dbaron ... we need the testing more than further
poking.
Attachment #286860 - Flags: superreview+
Attachment #286860 - Flags: review?(dbaron)
Attachment #286860 - Flags: review+
Attached patch v2.9Splinter Review
Addressing comments. Igor, I named the new requestDepth-like member "outstandingRequests". We can rename it if you prefer another name. I haven't yet added the comment about the cleared-global stuff. Will add that in a followup. I'll also post my answers to David's questions (probably tomorrow).
Attachment #286860 - Attachment is obsolete: true
Attachment #287019 - Flags: superreview+
Attachment #287019 - Flags: review+
A version of this patch landed at 2007-11-01 15:51 and subsequent windows builds are randomly crashing for me (using my normal profile):
- 20071101_1536_firefox-3.0a9pre.en-US.win32 : no random crashing
- 20071101_1552_firefox-3.0a9pre.en-US.win32 : random crashing.
Only landing was from this bug.

My last three breakpad ids (of course, hourlies have no symbols)
- http://crash-stats.mozilla.com/report/index/f494feba-88e2-11dc-8752-001a4bd43ed6
- http://crash-stats.mozilla.com/report/index/d23e6cc0-88e2-11dc-badd-001a4bd43ed6
- http://crash-stats.mozilla.com/report/index/c34fa382-88e2-11dc-8e1a-001a4bd43ef6
confirmed, this patch is causing firefox to crash (under linux) 
So yes, patch v2.9 landed earlier today, per request from peterv...
please provide link to breakpad reports if you could. Unfortunately all the reports in comment 24 seem to be lacking stacks :(
(In reply to comment #27)
> please provide link to breakpad reports if you could. Unfortunately all the
> reports in comment 24 seem to be lacking stacks :(

That's because there are no symbols on Socorro for hourly builds.
FYI: xpcshell core dump when shutdown.

Core was generated by `./xpcshell'.
Program terminated with signal 5, Trace/breakpoint trap.
#0  JS_Assert (s=0xb7ebe097 "!rt->deflatedStringCacheLock", 
    file=0xb7ebdb80 "/home/solar/mozilla/js/src/jsstr.c", ln=2349)
    at /home/solar/mozilla/js/src/jsutil.c:63
63          abort();
(gdb) bt
#0  JS_Assert (s=0xb7ebe097 "!rt->deflatedStringCacheLock", 
    file=0xb7ebdb80 "/home/solar/mozilla/js/src/jsstr.c", ln=2349)
    at /home/solar/mozilla/js/src/jsutil.c:63
#1  0xb7e8b631 in js_InitRuntimeStringState (cx=0x8094710)
    at /home/solar/mozilla/js/src/jsstr.c:2349
#2  0xb7ddd541 in js_NewContext (rt=0x808eec0, stackChunkSize=8192)
    at /home/solar/mozilla/js/src/jscntxt.c:297
#3  0xb7dc8024 in JS_NewContext (rt=0x808eec0, stackChunkSize=8192)
    at /home/solar/mozilla/js/src/jsapi.c:990
#4  0xb789b82a in XPCJSContextStack::GetSafeJSContext (this=0x808eb88, 
    aSafeJSContext=0xbfb87110)
    at /home/solar/mozilla/js/src/xpconnect/src/xpcthreadcontext.cpp:206
#5  0xb786b407 in XPCCallContext (this=0xbfb870f4, 
    callerLanguage=XPCContext::LANG_NATIVE, cx=0x0, obj=0x0, funobj=0x0, 
    name=0, argc=4294967295, argv=0x0, rval=0x0)
    at /home/solar/mozilla/js/src/xpconnect/src/xpccallcontext.cpp:93
#6  0xb78659af in nsXPConnect::Collect (this=0x8095160)
    at /home/solar/mozilla/js/src/xpconnect/src/nsXPConnect.cpp:490
#7  0xb7d614b7 in nsCycleCollector::Collect (this=0x806e8b8, aTryCollections=5)
    at /home/solar/mozilla/xpcom/base/nsCycleCollector.cpp:2094
#8  0xb7d61589 in nsCycleCollector::Shutdown (this=0x806e8b8)
    at /home/solar/mozilla/xpcom/base/nsCycleCollector.cpp:2249
#9  0xb7d615c0 in nsCycleCollector_shutdown ()
---Type <return> to continue, or q <return> to quit---
    at /home/solar/mozilla/xpcom/base/nsCycleCollector.cpp:2667
#10 0xb7cd71ea in NS_ShutdownXPCOM_P (servMgr=0x0)
    at /home/solar/mozilla/xpcom/build/nsXPComInit.cpp:785
#11 0xb7dab7c7 in NS_ShutdownXPCOM (svcMgr=0x0)
    at /home/solar/mozilla/xpcom/stub/nsXPComStub.cpp:175
#12 0x0804c234 in main (argc=0, argv=0xbfb873c8, envp=0xbfb873cc)
    at /home/solar/mozilla/js/src/xpconnect/shell/xpcshell.cpp:1459
Current language:  auto; currently c
(gdb) 
Comment on attachment 287019 [details] [diff] [review]
v2.9

>Index: js/src/jscntxt.h
>===================================================================
>RCS file: /Users/peterv/source/cvs.mozilla.org/cvsroot/mozilla/js/src/jscntxt.h,v
>retrieving revision 3.165
>diff -u -p -r3.165 jscntxt.h
>--- js/src/jscntxt.h	18 Sep 2007 01:22:20 -0000	3.165
>+++ js/src/jscntxt.h	1 Nov 2007 13:03:19 -0000
>@@ -742,6 +742,8 @@ struct JSContext {
> #ifdef JS_THREADSAFE
>     JSThread            *thread;
>     jsrefcount          requestDepth;
>+    /* Same as requestDepth but ignoring JS_SuspendRequest/JS_ResumeRequest */
>+    jsrefcount          outstandingRequests;
>     JSScope             *scopeToShare;      /* weak reference, see jslock.c */
>     JSScope             *lockedSealedScope; /* weak ref, for low-cost sealed

Nit: SpiderMonkey style dictates a blank line before full-line comment and the period at the end of the comment text. 


>+class JSContextParticipant : public nsCycleCollectionParticipant
>+{
...
>+    NS_IMETHODIMP Traverse(void *n, nsCycleCollectionTraversalCallback &cb)
>+    {
>+        JSContext *cx = static_cast<JSContext*>(n);
>+
....
>+
>+        void* globalObject;
>+        if(cx->globalObject)
>+            globalObject = cx->globalObject;
>+        else
>+            globalObject = nsXPConnect::GetRuntime()->GetUnsetContextGlobal(cx);
>+

When cx->globalObject is not null here, JS_GC should treat it as a root and mark/trace it coloring any XPCOM object reachable from it as black. It can be asserted via assert(!JS_IsAboutToBeFinalized(cx, cx->globalObject). So why it is necessary to register it with the the cycle collector?
(In reply to comment #30)
> When cx->globalObject is not null here, JS_GC should treat it as a root and
> mark/trace it coloring any XPCOM object reachable from it as black. It can be
> asserted via assert(!JS_IsAboutToBeFinalized(cx, cx->globalObject). So why it
> is necessary to register it with the the cycle collector?

Yeah, it isn't strictly necessary. I'll make that DEBUG_CC-only, we want it in that case to get better graphs/debugging.
Javascript are slower than firefox2 - opera - safari.

A lot of javascript menù are slow with the last nightly build.
On my forum you can see left side menù that, with firefox3, is slow.

LINK: http://hmetaluni10.altervista.org/forum/faq.php
(In reply to comment #32)
> A lot of javascript menù are slow with the last nightly build.

You'll need to give a buildid, afaik this patch isn't yet in any nightly build and I'd be surprised if it causes the symptoms you're describing.

(In reply to comment #29)
> FYI: xpcshell core dump when shutdown.

I'll look into this, but it looks unrelated to the other crashes.
I decided to leave this patch in for the nightlies. I know it sucks that we'll have a bad nightly, but we've not been able to reproduce this yet and we need stack traces (and hopefully steps to reproduce). If this turns out to be a hard to fix bug we can back the patch out, else we should try to fix it before beta IMHO.
This seems to cause crashes especially in extensions that use the alert slider like forecastfox and yahoo and gmail notifiers.
Ok, that seems to get us somewhere. I just crashed after installing ForecastFox:

#0  0x01237d9e in nsXPCWrappedJS::Unlink (this=0x18a49bf0) at mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:470
#1  0x01237fe5 in nsXPCWrappedJS::~nsXPCWrappedJS (this=0x18a49bf0) at mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:457
#2  0x0123728e in nsXPCWrappedJS::Release (this=0x18a49bf0) at mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:241
#3  0x0049c0ae in nsXPTCStubBase::Release (this=0x18a49be0) at mozilla/xpcom/reflect/xptcall/src/xptcall.cpp:65
#4  0x0122e9f5 in XPCJSRuntime::GCCallback (cx=0xb0f7460, status=JSGC_END) at mozilla/js/src/xpconnect/src/xpcjsruntime.cpp:822
#5  0x0b54e48b in DOMGCCallback (cx=0xb0f7460, status=JSGC_END) at mozilla/dom/src/base/nsJSEnvironment.cpp:3403
#6  0x0120b0a6 in XPCCycleCollectGCCallback (cx=0xb0f7460, status=JSGC_END) at mozilla/js/src/xpconnect/src/nsXPConnect.cpp:438
#7  0x0033a114 in js_GC (cx=0xb0f7460, gckind=GC_NORMAL) at mozilla/js/src/jsgc.c:2780
#8  0x002f2e1f in JS_GC (cx=0xb0f7460) at mozilla/js/src/jsapi.c:2381
#9  0x0120af8e in nsXPConnect::Collect (this=0x112a840) at mozilla/js/src/xpconnect/src/nsXPConnect.cpp:504
#10 0x0049b65a in nsCycleCollector::Collect (this=0x6c9000, aTryCollections=1) at mozilla/xpcom/base/nsCycleCollector.cpp:2098
#11 0x0049b6f4 in nsCycleCollector_collect () at mozilla/xpcom/base/nsCycleCollector.cpp:2651
#12 0x0b54e290 in nsJSContext::CC () at mozilla/dom/src/base/nsJSEnvironment.cpp:3261
#13 0x0b54e397 in nsJSContext::CCIfUserInactive () at mozilla/dom/src/base/nsJSEnvironment.cpp:3302
#14 0x0b55051b in nsJSContext::Notify (this=0xb0f72e0, timer=0x186dce50) at mozilla/dom/src/base/nsJSEnvironment.cpp:3324
#15 0x0048e6d1 in nsTimerImpl::Fire (this=0x186dce50) at mozilla/xpcom/threads/nsTimerImpl.cpp:403
#16 0x0048e8d5 in nsTimerEvent::Run (this=0x1843b330) at mozilla/xpcom/threads/nsTimerImpl.cpp:487
#17 0x0048a6bd in nsThread::ProcessNextEvent (this=0x11147d0, mayWait=1, result=0xbfffc64c) at mozilla/xpcom/threads/nsThread.cpp:490
#18 0x0042fb5d in NS_ProcessNextEvent_P (thread=0x11147d0, mayWait=1) at nsThreadUtils.cpp:227
#19 0x0aeea12b in nsXULWindow::ShowModal (this=0x18a5e2e0) at mozilla/xpfe/appshell/src/nsXULWindow.cpp:398
#20 0x0aee39f7 in nsContentTreeOwner::ShowAsModal (this=0x18a60460) at mozilla/xpfe/appshell/src/nsContentTreeOwner.cpp:524
#21 0x0ad5270e in nsWindowWatcher::OpenWindowJSInternal (this=0x1163e50, aParent=0x18029650, aUrl=0x18a5e0a8 "chrome://forecastfox/content/options.xul", aName=0xbfffccb8 "options", aFeatures=0xbfffcd0c "chrome,modal", aDialog=1, argv=0x0, aCalledFromJS=1, _retval=0xbfffcd80) at mozilla/embedding/components/windowwatcher/src/nsWindowWatcher.cpp:944
#22 0x0ad52994 in nsWindowWatcher::OpenWindowJS (this=0x1163e50, aParent=0x18029650, aUrl=0x18a5e0a8 "chrome://forecastfox/content/options.xul", aName=0xbfffccb8 "options", aFeatures=0xbfffcd0c "chrome,modal", aDialog=1, argv=0x0, _retval=0xbfffcd80) at mozilla/embedding/components/windowwatcher/src/nsWindowWatcher.cpp:483
...
Attached patch Crash fix v1 (obsolete) — Splinter Review
We want this for sure, still tracking down another crash I saw.
Attached patch Crash fix v1Splinter Review
After rebuilding I can't reproduce the second crash I was seeing. Jonas or Johnny, could whoever gets to this first please r/sr.
Attachment #287086 - Attachment is obsolete: true
Attachment #287096 - Flags: superreview?
Attachment #287096 - Flags: review?(jonas)
is it included in the last nightly?
I filed bug 402208 on a cycle collector fault I'm seeing with ForecastFox.
(In reply to comment #35)
> This seems to cause crashes especially in extensions that use the alert slider
> like forecastfox and yahoo and gmail notifiers.
> 
Actually I cannot get crashes with ymail notifier this.  So far only see crashes with forecastfox which seem to relate to the alert slider.

I will try the crash fix patch and see what transpires.
Attachment #287096 - Flags: superreview?
Attachment #287096 - Flags: superreview+
Attachment #287096 - Flags: review?(jonas)
Attachment #287096 - Flags: review+
Around the time this was checked in yesterday trace_malloc_maxheap jumped a bunch (>10%) on bm-xserve11
http://build-graphs.mozilla.org/graph/query.cgi?tbox=bm-xserve11.build.mozilla.org&testname=trace_malloc_maxheap&autoscale=1&size=&days=7&units=bytes&ltype=&points=&showpoint=&avg=1

I didn't see any corresponding jump, not even a little, on fxdbug-linux-tbox.build.mozilla.org or fxdbug-win32-tb which were the only other machines I could find running the test.
Another bug with javascript, 
sometimes javascript menu's chars dosen't appear correctly.

Screen:
http://img465.imageshack.us/img465/6070/charwj6.png

LINK:
http://hmetaluni10.altervista.org/forum/faq.php
(In reply to comment #43)
> Another bug with javascript, 
> sometimes javascript menu's chars dosen't appear correctly.

Which is again unrelated to this bug. Please stop spamming this bug.
Do we want to land this?
It already landed...
Main patch landed at 2007-11-01 15:51, Solaris bustage fix at 2007-11-01 18:09, and a crash fix at 2007-11-02 08:48.
I still need to do some work (answer david's questions and fix some comments), but let's close this to get it off the M9 radar.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
(In reply to comment #15)
> > I am seeing JS contexts with requestDepth == 0 that do hold things other than
> > the globalObject (so far I've only seen them holding things through weak
> > roots).
> 
> This is a misfeature of JS_GC. It clears the weak roots only for the current
> context while keeping the weak roots in the rest of places.

Indeed, an ancient design botch that we are afraid to change for fear of breaking some embedding counting on a weak root on a non-GC'ing context.

If we can make a better GC API for Mozilla 2 / ActionMonkey or even if needed for 1.9, and opt out of the old backward-compatible API, we should.

/be
Depends on: 402404
Depends on: 402379
see comment #29
Attachment #287372 - Flags: review+
Attachment #287372 - Flags: superreview?(peterv)
Attachment #287372 - Flags: review?(peterv)
Attachment #287372 - Flags: review+
Attachment #287372 - Flags: superreview?(peterv)
Attachment #287372 - Flags: review?(peterv)
Attachment #287372 - Flags: review?(igor)
Comment on attachment 287372 [details] [diff] [review]
xpcshell crash fix

I better fix would be to initialize the string hash in JS_NewRuntime, not during initialization of the first JSContext. It would avoid init/deinit sequences in xpcshell. But in the current form the fix is also OK.
Attachment #287372 - Flags: review?(igor) → review+
(In reply to comment #50)
> Created an attachment (id=287372) [details]
> xpcshell crash fix

Lets create a new bug for that.
Depends on: 402535
(In reply to comment #42)
> Around the time this was checked in yesterday trace_malloc_maxheap jumped a
> bunch (>10%) on bm-xserve11
> http://build-graphs.mozilla.org/graph/query.cgi?tbox=bm-xserve11.build.mozilla.org&testname=trace_malloc_maxheap&autoscale=1&size=&days=7&units=bytes&ltype=&points=&showpoint=&avg=1

Actually, that seems to correspond with the checkin for bug 401722.
(In reply to comment #52)
> (In reply to comment #50)
> > Created an attachment (id=287372) [details] [details]
> > xpcshell crash fix
> 
> Lets create a new bug for that.
> 

OK, please see bug 402653.
Blocks: 403678
Blocks: 378742
Depends on: 410036
Depends on: 397435
Depends on: 431906
You need to log in before you can comment on or make changes to this bug.