<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Assignee

Updated

•

20 years ago

Blocks: 281988

No longer depends on: 281988

Assignee

Comment 3

•

20 years ago

Minimal testcase is in the URL bar.... For some reason, we end up with: ###!!! ASSERTION: nsGenericElement's event listener manager hash not empty at shutdown!: 'sEventListenerManagersHash.entryCount == 0', file ../../../../mozilla/content/base/src/nsGenericElement.cpp, line 804 and leak the world when we load that testcase. I've confirmed that disabling the XPCNativeWrapper stuff (by just hacking NativeInterface2JSObject to never do it) avoids the leak. :( We really don't want to ship 1.8b2 with this. Debugging now...

Assignee: nobody → dbradley

URL: data:<a onclick="random text">

Component: General → XPConnect

Flags: blocking1.8b2?

Keywords: mlk

Product: Firefox → Core

QA Contact: general → pschwartau

Blake Kaplan (:mrbkap) (inactive)

Assignee

Updated

•

20 years ago

Assignee: dbradley → bzbarsky

Priority: -- → P1

Target Milestone: --- → mozilla1.8beta2

Asa Dotzler [:asa]

Updated

•

20 years ago

Flags: blocking1.8b2? → blocking1.8b2+

Updated

•

20 years ago

Flags: blocking1.8b2+ → blocking1.8b2?

Asa Dotzler [:asa]

Updated

•

20 years ago

Flags: blocking1.8b2? → blocking1.8b2+

Assignee

Updated

•

20 years ago

URL: data:<a onclick="random text"> → data:text/html,<a onclick="random text">

Comment 4

•

20 years ago

Attached patch Leak fix, I think. — Details — Splinter Review

I don't fully understand this yet, but this is IMO what we want regardless of whether this really does fix the leak (which it appears to do with the trivial testcase in this bug). What the current code does is to mark mNativeWrapper when a wrapped native is marked, and that makes the nw stay around as long as the wn stays around. I don't see a reason for that, if all script references to a nw are gone then we might as well let the nw go, even if the wn is still referenced (in a different scope, most likely). I'm going to check this in now, there's noone around to r/sr, so bz, please r+sr after the fact, and back this out if there's something really wrong with this. I want testing for this in tomorrows build, and I want to see how this affects the leak numbers ASAP. With some luck this'll make the leak go away and we can ship with this even if we don't yet fully understand all aspects of this leak etc.

Updated

•

20 years ago

Attachment #184387 - Flags: superreview?(bzbarsky)

Attachment #184387 - Flags: review?(bzbarsky)

Comment 5

•

20 years ago

Probable leak fix checked in, a=asa (over IRC).

Frank Wein [:mcsmurf]

Comment 6

•

20 years ago

The first cycle with the leak fix (double-checked with build log CVS output that the fix is in) still leaks like before.

Comment 7

•

20 years ago

We need to understand the leak to fix it. Also, I think we want that mark of the nw from its wn, or auto-magically created nw's will be ephemeral. That's bloaty and (if you can stick a property on an nw, as you can for certain ids) arguably user-hostile. Did anyone get a GC-heap dump that matched the one bz saw, where the BackstagePass object was implicated, and certain prototype chains had one too many Object protos near the end? /be

Assignee

Comment 8

•

20 years ago

jst, I think comment 7 is right -- we want to keep the XPCNativeWrapper alive just like we keep XPCWrappedNatives alive if they have random user-defined props on them.... So this is what we found yesterday, just so we all have the same info. Note that I was testing with the minimal testcase in this bug, so we may also have other issues (given that jst's change fixes this testcase but not tinderbox). 1) The leak on this testcase went away if every time XPCNativeWrapper::GetNewOrUsed was called I had it create a new wrapper _and_ set the new wrapper on the XPCWrappedNative. If I just had it create a new wrapper every time but left the wrapped native alone if it already had an mNativeWrapper, we still leaked. This is probably the issue that jst's patch tries to address... 2) I beat things about the head and shoulders till we stopped leaking the JS runtime service (through XBL and XUL), so I got a printout of the remaining roots at runtime shutdown. I also hacked XBL to use slightly more informative root names. This is what I get at shutdown: JS engine warning: leaking GC root ' -- chrome://global/content/bindings/scrollbar.xml#scrollbar -- nsXBLProtoImplMethod::mJSMethodObject' at 0x85a7d64 JS engine warning: leaking GC root 'res->input' at 0x876d7a4 JS engine warning: leaking GC root 'initScrollbar -- chrome://global/content/bindings/scrollbar.xml#scrollbar -- nsXBLProtoImplMethod::mJSMethodObject' at 0x85a80a4 JS engine warning: leaking GC root 'res->input' at 0x8598624 JS engine warning: 4 GC roots remain after destroying the JSRuntime. These roots may point to freed memory. Objects reachable through them have not been finalized. So the leaking roots are two regexp-related things and the initScrollbar method and constructor of the scrollbar binding.... I'm not really sure where to go with that. The address 0x85a80a4 does not appear in the gc heap dump that I (or emacs i-search, and I did just search for 85a80a4) can see. What _does_ appear in the dump, is this: 08473ef0 object 0x87712d8 Window via global object(Window). I'm not really sure what that means, exactly. Is the global object an implicit GC root or something? Still trying to sort out how I can use this gc heap dump... jst and I were also a little concerned about the calls to JS_NewObject for XPCNativeWrappers whose parent is null (i.e. the wrapper for the window's wrapped native). We tried setting the parent to the window's JSObject (the one off the wrapped native), but that didn't seem to help.

Assignee

Comment 9

•

20 years ago

I posted the full output of a leaking build at http://web.mit.edu/bzbarsky/www/gc_log.txt (and a gzipped version at http://web.mit.edu/bzbarsky/www/gc_log.txt.gz if people want to download it via "save link as" -- the file is about 4MB). In addition to the GC heap dump, this file contains two more types of debug output: Lines like: Created wrapper: '[object Window @ 0x87712d8]', object is: 0x8473ef0 mean we created an XPCWrappedNative for a Window object, the address of the XPCWrappedNative is 0x87712d8 and the address of its flat JSObject is 0x8473ef0. Lines like: Created extra-wrapper: 0x8474f20, for: [object Window @ 0x87712d8] mean we created an XPCNativeWrapper for the XPCWrappedNative at address 0x87712d8. The address of the XCPNativeWrapper JSObject is 0x8474f20. The gc dump is also somewhat hacked; I changed the path-dumping in gc_dump_thing as follows: while (next) { path = JS_sprintf_append(path, "%s(%s).", next->name, gc_object_class_name(next->thing)); + if ((*js_GetGCThingFlags(next->thing) & GCF_TYPEMASK) == GCX_OBJECT) { + path = JS_sprintf_append(path, "[was %p]", + (JSObject*)next->thing); + } next = next->next; } in the hopes of getting addresses for things in the path. Note that this puts the address after the '.' because I was lazy... So a line like: 08473ef0 object 0x87712d8 Window via global object(Window).[was 8473ef0] would mean that the JSObject at 0x08473ef0 (with private data 0x87712d8, so this is exactly our Window object above) had on the path the JSObject at 0x08473ef0, if I understand what this code is doing right... I didn't include the actual log XPCOM_MEM_LEAK_LOG creates, since all it tells me is that we leak a window and a document and everything that entails. Note that this is all without jst's patch. I'll try applying that now and seeing whether I can create a minimal leak testcase with it applied.

Comment 10

•

20 years ago

(In reply to comment #7) > We need to understand the leak to fix it. Also, I think we want that mark of > the nw from its wn, or auto-magically created nw's will be ephemeral. That's > bloaty and (if you can stick a property on an nw, as you can for certain ids) > arguably user-hostile. What id's would that be? I can't imagine we would ever want anyone to be able to store anything on these wrappers. What am I missing?

Comment 11

•

20 years ago

Attached patch Fix parenting of wrappers, and followup fix to the previous diff that caused access to deleted memory in some cases. — Details — Splinter Review

Attachment #184427 - Flags: superreview?(bzbarsky)

Attachment #184427 - Flags: review?(bzbarsky)

Comment 12

•

20 years ago

Attached patch Same as above w/ more newborn object protection. — Details — Splinter Review

Attachment #184428 - Flags: superreview?(bzbarsky)

Attachment #184428 - Flags: review?(bzbarsky)

Assignee

Comment 13

•

20 years ago

Comment on attachment 184428 [details] [diff] [review] Same as above w/ more newborn object protection. r+sr=bzbarsky; requesting approval for yet more attempted leak-fixing.

Attachment #184428 - Flags: superreview?(bzbarsky)

Attachment #184428 - Flags: superreview+

Attachment #184428 - Flags: review?(bzbarsky)

Attachment #184428 - Flags: review+

Attachment #184428 - Flags: approval1.8b2?

Benjamin Smedberg

Comment 14

•

20 years ago

Comment on attachment 184428 [details] [diff] [review] Same as above w/ more newborn object protection. per deerpark meeting, any reviewed patch for this bug is auto-approved

Attachment #184428 - Flags: approval1.8b2? → approval1.8b2+

Comment 15

•

20 years ago

Additional change landed now too.

Comment 16

•

20 years ago

Attached patch Make sure references from native wrappers are cleared too when JS_ClearScope() is called inbetween document loads. — Details — Splinter Review

Bz's the man, he figured this out (assuming that this *really* does fix the leak, and we think we know why it would now). What's happening here is that we end up with a wrapped native for a content window and a native wrapper for that WN. Chrome code accesses content.document, which ends up wrapping the content document and sticking it in the property cache on the NW. If the window is then closed, we end up clearing the scope, but leaving the NW around with a reference to the document. At least before my earlier changes, the cycle was something along these lines: window > window::mNativeWrapper > nsHTMLDocument > child nodes > nsEventListenerManager > nsJSEventListener > nsJSContext > window With my earlier changes the cycle must be different, but I haven't been able to reproduce a leak w/ my earlier changes so I can't say for sure what the cycle really is. This patch is going in, r+sr=bzbarsky (over IRC).

Updated

•

20 years ago

Attachment #184466 - Flags: superreview+

Attachment #184466 - Flags: review+

Comment 17

•

20 years ago

Great work, sorry I wasn't around to help. jst, I'm still wondering (comment 7): don't we want to mark mNativeWrapper from XPCWrappedNative::MarkBeforeJSFinalize? And therefore not need a finalize hook in XPCNativeWrapper? At least toString may be overridden on the nw, and it seems like a waste to let the GC destroy all the "faulted-in" ids on an busy nw for a live wn. /be

Frank Wein [:mcsmurf]

Comment 18

•

20 years ago

balsa is green again, still leaks like before.

Assignee

Comment 19

•

20 years ago

So in the range in which leaks went up, the following patches went in: Bug 294808 Bug 294831 Bug 294630 Bug 294815 Bug 281988 -- first patch only (so wrappers NOT enabled at that point) Bug 258650 I'm pretty sure the first three things on that list don't leak. Bug 294815 could, in theory, have issues... Bug 281988 could have issues if wrappers are being created manually or if somehow code was getting wrappers because it had a null script filename... or if there is a bug in the rest of the code somewhere, but I don't see one. Bug 258650 is probably safe. There _are_ some manual XPCNativeWrapper creations in the code -- see <http://lxr.mozilla.org/seamonkey/search?string=XPCNativeWrapper%28>. The only ones I see in Firefox code are pageinfo and the DOMWillOpenModalDialog handler in tabbrowser. I'll try removing the latter and seeing what happens, I guess. It really would be nice if I could reproduce the leak at this point...

Assignee

Comment 20

•

20 years ago

I think I've finally reproduced this. Balsa is reregistering components on every single startup (why? I have no idea). If I remove my compreg.dat and then start, I do see a leak. It's not leaking through the wrappers added in bug 281988, though...

Comment 21

•

20 years ago

Attached patch Also clear the scope of every explicitly created nw, and put back the marking of the "primary" nw when wn's are marked. — Details — Splinter Review

Attachment #184556 - Flags: superreview?(bzbarsky)

Attachment #184556 - Flags: review?(bzbarsky)

Updated

•

20 years ago

Attachment #184427 - Flags: superreview?(bzbarsky)

Attachment #184427 - Flags: review?(bzbarsky)

Updated

•

20 years ago

Attachment #184387 - Flags: superreview?(bzbarsky)

Attachment #184387 - Flags: review?(bzbarsky)

Jennifer Sherie Case

Comment 22

•

20 years ago

Regarding #294893 [Core]-Major leakage on Linux balsa Dep GTK1 (gcc 3.4) [Lin] Attempt 4.... GTK2+ is out, why haven't they updated? I have noticed this recently myself, 512MB of ram and fresh reboot say, I am using near 32% of it, then loads FF and it goes up to near 80% use. Does not even release it either. Has to do a Cold reboot to clear it out. Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b2) Gecko/20050524 Firefox/1.0+ ID:2005052421 One last note, when I control T (open new Tab) then try to control v (paste) tonew tab, it does not default to the URL like it did in the previous build I had. Ok Graphical Representation here [url=http://images5.theimagehosting.com/Memory1.png]With FF Loaded[/url] [url=http://images5.theimagehosting.com/Memory2.png]Without FF loaded and a Reboot[/url] would have shown here, but for some reason they are the size of my Display 1280x1024 Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b2) Gecko/20050525 Firefox/1.0+ ID:2005052509 Since I did this, the Prometheus Build is what had the problem.

Assignee

Comment 23

•

20 years ago

Comment on attachment 184556 [details] [diff] [review] Also clear the scope of every explicitly created nw, and put back the marking of the "primary" nw when wn's are marked. > Index: js/src/xpconnect/src/XPCNativeWrapper.cpp + JS_ClearScope(d->cx, (JSObject*)entry->key); ::JS_ClearScope, please. >Index: js/src/xpconnect/src/xpcjsruntime.cpp >+ mExplicitNativeWrapperMap(XPCNativeWrapperMap::newMap(XPC_NATIVE_WRAPPER_MAP_SIZE)), You need to delete this map in ~XPCJSRuntime. You also need to null-check this map in newXPCJSRuntime(). r+sr=bzbarsky with those fixed.

Attachment #184556 - Flags: superreview?(bzbarsky)

Attachment #184556 - Flags: superreview+

Attachment #184556 - Flags: review?(bzbarsky)

Attachment #184556 - Flags: review+

Assignee

Comment 24

•

20 years ago

So I've been debugging the leak on startup and registration. We seem to be leaking two chrome windows -- the hidden window and the main browser window. I can sorta tell them apart, since the hidden window is a ChromeWindow with an HTMLDocument hanging off it: 086bc088 object 0x86e3b60 HTMLDocument via XPCWrappedNativeProto::mJSProtoObject(XPC_WN_ModsAllowed_Proto_JSClass @ 0x081c65f0).outerWidth getter(Function @ 0x08653580).__proto__(Function @ 0x08653578).__proto__(Function @ 0x08502f20).__proto__(Object @ 0x08502f50).__proto__(Object @ 0x08502e88).__parent__(ChromeWindow @ 0x08502e48).document(HTMLDocument @ 0x086bc088). 08502e48 object 0x851cbf0 ChromeWindow via XPCWrappedNativeProto::mJSProtoObject(XPC_WN_ModsAllowed_Proto_JSClass @ 0x081c65f0).outerWidth getter(Function @ 0x08653580).__proto__(Function @ 0x08653578).__proto__(Function @ 0x08502f20).__proto__(Object @ 0x08502f50).__proto__(Object @ 0x08502e88).__parent__(ChromeWindow @ 0x08502e48). The other window we're leaking is: 081c65f8 object 0x85ad1c0 ChromeWindow via atom(AnyName @ 0x0812c438).constructor(Function @ 0x0821e930).__proto__(Function @ 0x081c66d8).__proto__(Object @ 0x081c6c10).__proto__(Object @ 0x081c6640).__parent__(ChromeWindow @ 0x081c65f8). This is the root cause of the leaks; leaking this window means we leak various stuff that entrains the chrome window as the compilation scope (you can see that above). This is the only path via which this window is marked. The leak log says "(1 references; 0 from COMPtrs)", so the only ref to it is held from the XPCWrappedNative we're leaking here. Brendan, what's this atom(AnyName) thing that claims to be a GCX_OBJECT in its typemask (otherwise it wouldn't get an "@ pointer" thing to go with it)? More importantly, why's it acting like a root? Fwiw, I checked with auto-wrapping completely disabled (and there's no manual wrapping in Firefox outside of page info, which I did not open), and the startup leak was still there....

Assignee

Comment 25

•

20 years ago

Attached patch Don't pin object atoms, since they can entrain things — Details — Splinter Review

We (brendan, jst, I) have no idea why this wasn't a problem before, but this AnyName atom is what's entraining the non-hidden chrome window... Just landed this patch, with r+a=brendan, sr=jst.

Attachment #184577 - Flags: superreview+

Attachment #184577 - Flags: review+

Assignee

Comment 26

•

20 years ago

And it looks like that patch causes crashes during standard class init -- we manage to somehow lose our just-constructed proto for the AnyName class to GC.

Assignee

Comment 27

•

20 years ago

So the reason for the crashes is that js_GetAnyName returns rt->atomState.lazy.anynameAtom if that's set. If we don't pin the atom, that can get GC'd, and then we're returning bogus stuff. Since that atom is also the proto for the class, as far as I can tell, bad things happen when we try to js_SetClassPrototype in JS_InitClass. We have to be initing the class not for the first time, but that's ok -- we have plenty of global objects around. Brendan, this is sounding like an all-you kinda thing...

Assignee

Comment 28

•

20 years ago

Attached patch Better, stronger, faster — Details — Splinter Review

This just nulls out the atomstate entries when the atom is finalized. With this change, I don't leak and don't crash, even with TOO_MUCH_GC.

Assignee

Comment 29

•

20 years ago

Comment on attachment 184597 [details] [diff] [review] Better, stronger, faster brendan, whatcha think?

Attachment #184597 - Flags: review?(brendan)

Daniel C

Comment 30

•

20 years ago

First balsa build after bz' checking. RLk:816B Lk:307KB MH:9.48MB A:259K

Daniel C

Comment 31

•

20 years ago

The following is for comparative purposes (presumably to be used in working out whether this should be "Status -> RESOLVED, Resolution -> FIXED" or not.) The previous values before this bug was filed according to http://forums.mozillazine.org./viewtopic.php?t=271385&postdays=0&postorder=asc&postsperpage=15&start=15 (and backed-up by bug 281988 comment 144) were: RLk:780B Lk:308KB MH:8.00MB A:247K

Assignee

Comment 32

•

20 years ago

Marking fixed, again pending Brendan's review. Remaining issues, if any, should get separate bugs.

Status: NEW → RESOLVED

Closed: 20 years ago

Resolution: --- → FIXED

Victor Luchits

Comment 33

•

20 years ago

Looks like both beast and pacifica tinderboxes crash on TP after this commit

Comment 34

•

20 years ago

Attached patch Make finalize hooks order independent to fix crash. — Details — Splinter Review

Updated

•

20 years ago

Attachment #184615 - Flags: superreview?(bzbarsky)

Attachment #184615 - Flags: review?(bzbarsky)

Assignee

Updated

•

20 years ago

Attachment #184615 - Flags: superreview?(bzbarsky)

Attachment #184615 - Flags: superreview+

Attachment #184615 - Flags: review?(bzbarsky)

Attachment #184615 - Flags: review+

Peter van der Woude [:Peter6]

Comment 35

•

20 years ago

finalize hook change landed.

Comment 36

•

20 years ago

Bug 295634 for random crashes on Win after last part landed

Comment 37

•

20 years ago

Comment on attachment 184597 [details] [diff] [review] Better, stronger, faster Great, thanks for covering. /be

Attachment #184597 - Flags: review?(brendan)

Attachment #184597 - Flags: review+

Attachment #184597 - Flags: approval1.8b2+