Closed Bug 55426 Opened 23 years ago Closed 23 years ago

<shaver> we're leaking roots in the JS component loader

Categories

(Core :: XPConnect, defect, P3)

x86
Linux
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: jag+mozbugs, Assigned: jband_mozilla)

Details

(Keywords: js1.5)

Attachments

(3 files)

Shut down app shell component {18c2f989-b09f-11d2-bcde-00805f0e1353},
rv=0x00000000
Shut down app shell component {33e569b0-40f8-11d4-9a41-000064657374},
rv=0x00000000
killing plugin host
CanUnload_enumerate: skipping native
PREF_Cleanup()
JS engine warning: leaking GC root 'rel:nsSample.js' at location 0x81193a4
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x81ca3a8
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x8116698
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x81d61e8
JS engine warning: leaking GC root 'rel:nsDictionary.js' at location 0x81f6dec
JS engine warning: leaking GC root 'rel:jsconsole-clhandler.js' at location
0x81dc0e4
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x81f6e20
JS engine warning: leaking GC root 'rel:remoteControl.js' at location 0x81c505c
JS engine warning: leaking GC root 'rel:nsSidebar.js' at location 0x81a5954
JS engine warning: leaking GC root 'rel:nsXmlRpcClient.js' at location 0x824724c
JS engine warning: leaking GC root 'nsXPCWrappedNative::mJSObj' at location
0x814fed0
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x8215ff0
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x81a45e0
JS engine warning: leaking GC root 'rel:nsFilePicker.js' at location 0x81890b4
JS engine warning: leaking GC root 'nsXPCWrappedJS::mJSObj' at location
0x81890e8
JS engine warning: 15 GC roots remain after destroying the last JSContext.
                   These roots may point to freed memory, and objects reachable
                   through them have not been finalized.
<mccabe> jag: we know about 'em.  jband has dug through them extensively, and
they're a side effect of some pretty intractable dependency problems in
xpcom/xpconnect
<mccabe> jag: jband found it was safer to leak these roots rather than have
finalizers called at inappropriate times.  Small leak-at-shutdown rather than
crash.
In case jband will be fixing these dependencies as part of his massive XPConnect
rearchitecture, I'm reassigning to him.
Assignee: shaver → jband
Hold on. Lots of those are actually unrooted before the JSRuntime is shutdown; 
i.e. all the 'rel:xxx'.

I think that mccabe's root warning stuff is in the wrong place. It fires on last 
JSContext destruction. I think it should fire only when the JSRuntime is 
destroyed - specifically in js_FinishGC. There are other embeddings out there 
that will be annoyed by these false positives too. For instance, there are 
server-like embeddings that can frequently go in an out of states where no 
JSContexts are alive, yet they may have long-lived roots intentionaly hanging 
off the JSRuntime.

I'll attach a patch to move the root leak printing code.
Status: NEW → ASSIGNED
Attached patch proposed fixSplinter Review
Above fix just moves the place mccabe's code gets invoked. I also changed the 
text a little for (I hope) better clarity and compactness.
r=mccabe.

(with fix to spelling of JSRunimte)

Thanks for the root lifetime clarification.
a=brendan@mozilla.org.

Jband, thanks for pointing out the "back from zero contexts in the runtime" 
case, and my neglected child JS_RemoveRootRT.  I should have thought of that 
when I reviewed mccabe's patch, especially since I'm helping some embedders who 
make contexts and destroy them eagerly, and transition through zero contexts 
back to non-zero all the time.

/be
Keywords: js1.5
checked in my fix for the root leak warning system (with spelling fix mccabe 
noted).
I was thinking about this the other day... When xpconnect shuts down it marks 
all the wrappers so that calls will not go through in either direction. What it 
does not do is remove the gc root it holds on each of the JSObjects wrapped for 
calling from native code (i.e. xpcwrappedjs objects). I can't think of any 
reason to not remove those roots. At this point all the wrapped natives are 
marked and no finalization calls (if gc were to happen) can make it through. The 
dtor in xpcwrappedjs will not itself remove the root after this point in 
shutdown. This bit of cleanup will clear away some noise if nothing else.

I'll attach a patch.
a=brendan@mozilla.org on that latest patch.

/be
r=mccabe.  Would it make sense to put some of your last comments from this bug
in SystemIsBeingShutDown?  Also, it looks like nsXPConnect.cpp:180 needs to
change to match the new signature.
Checked in. Added a comment at the new JS_RemoveRootRT call site.

mccabe: SystemIsBeingShutDown is a method on three different classes. Only one 
of them changed. I'm sure the fact that this tripped you up is somehow 
indicative that something is wrong with my choice of names in the code.

marking bug FIXED
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
I'm trying to understand what this most recent patch does.  Is it unrooting the
JS objects for nsXPCWrappedJS objects that still exist at shutdown time?  If so,
could it make a large leak that was leaked through an nsXPCWrappedJS object and
its JS root look like the leak was only a single nsXPCWrappedJS?
I see your comment about GC not happening after removing these roots.  So all
this does is reduce noise?  A reason not to remove the roots is so that we can
see that things were leaked because the roots were leaked.  What am I missing?
xpconnect is just a service. It gets shutdown when the service manager and 
component manager are done with it - specifically when the xpconnect dll's 
module is released.

At that point xpconnect cuts all connections (in both directions) between JS and 
native objects. It does not delete referenced wrappers because that risks a 
crash. It just makes it so that no calls can pass through the wrappers or use 
xpconnect services in their calls to the wrappers. 

For wrapped natives (called from JS) this means that no JS calls can get to the 
natives and that finalization of the JSObject part of the wrapper does not 
result in a call of Release on the native. At one point there were experiments 
with doing that Release call automatically when xpconnect shuts down. But the 
reality was that the whole mozilla system is so unstable at this arbitrary point 
during xpcom shutdown that the Release calls and propagated dtors were trying to 
call all sorts of xpcom services that were by then dead or to which those 
objects held dangling pointers. We decided that some leaking here at shutdown 
time was better than crashing. There is just too much unpredictability.

For wrapped JSObjects (what you asked about)... At xpconnect shutdown time we 
are making the promise to stop all calls through xpconnect into JS. Any 
wrappers that exist at this point are alive because some native object holds a 
reference on the wrapper - we can't kill the wrapper. The change in question 
recognises that the underlying JSObject is no longer reachable via xpconnect so 
there is no reason for xpconnect to force it to be rooted. There are subsequent 
gc calls as the JSRuntime service is released and the JSRuntime is itself 
destroyed. Freeing these roots may well allow very large graphs of JSObjects to 
be cleanly finalized. My concern was that none of those finalizations would 
propagate back through xpconnect and touch native objects at this time of 
unpridictability in the diverse native mozilla code.

Again, we have a *hard* shutdown ordering problem. It *might* be that all native 
reference holders of wrappedJS objects will eventually Release those references. 
But, xpconnect needs to do its shutdown when its time comes. The *real* root is 
up the line in some native somewhere. This change just cuts that connection 
cleanly and allows the JS stuff to do its cleanup. In fact, the nsXPCWrappedJS 
wrappers may end up not leaking at all if the native reference holder later 
Releases them. But by that time it is too late to clear the roots in JS. The old 
code would hold those roots while the JS engine shutdown. Now they are freed.

Did this answer your questions?
BTW if you want to see counts on live wrappers and stuff at xpconnect shutdown 
then compile xpconnect with XPC_DUMP_AT_SHUTDOWN defined...
http://lxr.mozilla.org/seamonkey/source/js/src/xpconnect/src/xpcprivate.h#103

(maybe you already knew this?)

The browser is actually pretty clean here. First run is messy - mostly because 
of JS Component registration. Runs with mail/news and (especially) composer 
started are messy.
My concern is that this could make it harder to detect leaks.  The presence of
leaked roots on the list of leaked roots quickly brings my attention to whatever
is holding the root.  If there is a GC run after this happens, then objects that
have been leaked (and contributing to bloat) through the lifetime of the browser
run will suddenly be freed right before we get leak stats (although the leaked
nsXPCWrappedJS will remain).

It might make it even harder to detect leaks that involve circular references
since it could break the circularity and free all the leaked memory, including
the nsXPCWrappedJS, right before shutdown.  (See, for example, what I described
in bug 56703.)
I disagree. This is removing noise from the JS root set. As you noted the leaked 
nsXPCWrappedJS count remains. The number in the bloat/leak logs represents the 
final count - taking into account that some are Released after xpconnect has 
shutdown. If these xpconnect owned roots were left in place then they would show 
*both* real leaks and roots that would have been cleared in Releases after 
xpconnect shutdown. Such a count is tainted by noise.

After xpconnect shutdown, the nsXPCWrappedJS will not be free'd via any path 
through xpconnect. Though, I admit, there may be non-xpconnect paths through 
which reference circularity could exist and still be broken here.

If you want to detect such things then add an 'else' block to the IsValid() 
check in ~nsXPCWrappedJS...
http://lxr.mozilla.org/seamonkey/source/js/src/xpconnect/src/xpcwrappedjs.cpp#25
7
...Put a breakpoint there and see if the post-xpconnect-shutdown destruction is 
coming as a result of a JS finialization somewhere.

Or put your own #ifdef'd NS_ASSERTION(isValid(), "boo") there or in 
nsXPCWrappedJS::Release

I think the more common cases may be JS component services what happen to be 
released after xpconnect shutdown.
Marking Verified -
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.