Closed Bug 296987 Opened 20 years ago Closed 19 years ago

Increased leakage on Linux balsa Dep GTK1 (gcc 3.4)

Categories

(Toolkit :: Application Update, defect)

x86
Linux
defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: Peter6, Assigned: bugs)

References

Details

(Keywords: memory-leak, perf, regression, Whiteboard: [ETA ?])

Attachments

(1 file, 1 obsolete file)

Landing of bug 296868 may have cause memleak on Balsa 20050606 14:11:00 pdt , before landing RLk:816B Lk:278KB MH:8.28MB A:216K 20050606 15:23:00 pdt , after landing RLk:75.2KB <-- Lk:1.46MB <-- MH:8.22MB A:215K also minor Ts regr. on Pacifica
OS -> Linux Severity -> Major Requesting blocking aviary1.1
Severity: normal → major
Flags: blocking-aviary1.1?
OS: Windows 2000 → Linux
brendan: Any chance you can look into this (i'll ask because you also debugged the leak in Bug 294893 together with some other people), graph of the leak can be found under http://build-graphs.mozilla.org/graph/query.cgi?tbox=balsa&testname=trace_malloc_leaks&autoscale=1&size=&units=bytes&ltype=&points=&showpoint=&avg=1&days=30 (the leak is even a bit higher than the one fixed).
Assignee: nobody → bugs
Has this been fixed by Bug 294893? Leakage is almost back to the level noted in Comment #0. RLk:6.84KB Lk:346KB MH:7.64MB A:213K
(In reply to comment #4) > Has this been fixed by Bug 294893? > Leakage is almost back to the level noted in Comment #0. No it's not been fixed. This leak was introduced by something else (the software update stuff).
Checkins that might have fixed this leak (also there is still something left): Bug 299302 and Bug 85207, Bug 299302 is more likely so (i skiped the other checkins in the timeframe since those didn't fix the leak to 99,9%).
The reason this is leaking is that the timer manager creates a reference cycle: function TimerManager() { const nsITimer = Components.interfaces.nsITimer; this._timer = Components.classes["@mozilla.org/timer;1"] .createInstance(nsITimer); var timerInterval = getPref("getIntPref", PREF_APP_UPDATE_TIMER, 5000); this._timer.initWithCallback(this, timerInterval, nsITimer.TYPE_REPEATING_SLACK); } Now the TimerManager JS object keeps alive the XPCWrappedNative for the nsITimer object, which keeps alive the C++ timer, which keeps alive the nsXPCWrappedJS that reflects the TimerManager into C++, which make the TimerManager JS object a GC root so the whole thing can't go away. If the TimerManager is a service, this code could use a third object implementing nsITimerCallback and getting the manager via getService as needed; that would break the cycle... That all said, there is other toolkit code that does similar things -- both toolkit/mozapps/extensions/content/update.xml and toolkit/content/widgets/preferences.xml are XBL bindings that keep a timer in a member variable and pass the binding itself as the nsITimerCallback; those bindings, together with whatever other resources they hold on to would also leak, but that requires opening up the update or preferences UI, which balsa doesn't do.
One other note. Fixing the timer cycle doesn't fix all the leaks -- the UpdateService is somehow leaked via the TimerManager, probably because the last GC we do GCs the TimerManager and just drops the nsXPCWrappedJS to the UpdateService; the UpdateService doesn't get GCed at that point yet. Ideally, the TimerManager would observe shutdown and drop its pointers to timer callbacks when it happens.
(In reply to comment #8) > Ideally, > the TimerManager would observe shutdown and drop its pointers to timer > callbacks when it happens. Would that make real leaks not show up in the leak logs?
Not sure... At the moment, what's happening is that TimerManager holds a JS ref to the UpdateService. Both are services implemented in JS. At shutdown, we shut down the service manager, then we're able to GC the TimerManager. But it was holding a ref to an XPCWrappedNative for an nsXPCWrappedJS for the UpdateService. So UpdateService was a GC root for this GC cycle, and didn't get GC'ed. At least as far as shaver and I can tell this is what's happening, since the TimerManager itself is not listed as being leaked. Oddly, the nsXPCWrappedJS for the UpdateService _is_ leaked according to the refcount log...
If a finalizer or GC callback removes a root, rt->gcLevel will be bumped and the GC will restart. It sounds like some XPConnect root created using GC callbacks is being removed or cleared without rt->gcLevel being bumped. bz, does that make sense at all? /be
So comment 8 after the "--" part is wrong. The reason the UpdateService leaks is the following: We leak an nsXPCWrappedJS. The ptr to this is 0x082CD890. The log output from my build says: Created wrapped native [xpconnect wrapped nsITimerCallback @ 0x827df40 (native @ 0x82cd890)], flat JSObject is 0x82d9dc8 This is the nsXPCWrappedJS getting wrapped in an XPCWrappedNative to go back out to JS; as long as the JSObject at 0x82d9dc8 is alive, this nsXPCWrappedJS object will stay alive. The JSObject is kept alive via the following GC path: 082d9dc8 object 0x827df40 XPCWrappedNative_NoHelper via nsXPCWrappedJS::mJSObj[nsIObserver,0x82cd890,0x82d9a30](Object @ 0x082d9a30).isDownloading getter(Function @ 0x082861c0).__proto__(Function @ 0x08285778).__proto__(Object @ 0x082857a8).__parent__(BackstagePass @ 0x08285770).TimerManager(Function @ 0x082d9428).prototype(Object @ 0x082d9890)._timers(Object @ 0x082d9898).background-update-timer(Object @ 0x082da1e0).callback(XPCWrappedNative_NoHelper @ 0x082d9dc8). So we do in fact have a cycle here: TimerManager.prototype -> _timer -> callback -> XPCWrappedNative JSObject -> XPCWrappedNative -> nsXPCWrappedJS -> UpdateService jsobject -> BackstagePass (scope object?) -> TimerManager -> TimerManager.prototype.
I filed bug 300023 on the issue starting with comment 8 so that this bug can stay focused on the timer cycle.
Flags: blocking-aviary1.1? → blocking1.8b4+
TimerManager not only holds the timer, but a whole bucket o' objs impl'ing nsITimerCallback (including the UpdateService)
Comment on attachment 189707 [details] [diff] [review] remove the xpcom-shutdown listener, too >+ gOS.removeObserver(this, "xpcom-shutdown"); Removing the observer seems a little unusual; are you sure the observer service handles removal during notification correctly? (I'd hope so, but...)
Whiteboard: SWAG: 2d
It's safe to remove an observer during the notification (from Darin). This patch has been checked in... it looks like Lk on Balsa went down around 7/9 and went up again on 7/23... not sure why on 7/23, that's probably unrelated.
I don't see this bug# in the CVS log... when was this patch checked in? Note that the Rlk number on balsa as of today is still about 8 times higher than it was before the bug 296868 landing (3556 vs 816).... But perhaps someone else checked something leaky in in the meantime, and we just didn't notice.
OK, I bet the remaining Rlk is bug 300833
Depends on: 300833
(In reply to comment #19) > OK, I bet the remaining Rlk is bug 300833 Also see bug 297312 comment 30. I checked the tinderbox logs of the build that showed the increased memory usage, and I did not see a NEW-LEAKS table. (bsmedberg also said on IRC that he doesn't see any potential code that could leak) Although a few days later, these 2 checkins: http://tinderbox.mozilla.org/bonsai/cvsquery.cgi?module=PhoenixTinderbox&date=explicit&mindate=1122300300&maxdate=1122304319 Showed this in the log: --NEW-LEAKS-----------------------------------leaks------leaks%----------------------- nsVoidArray 24 50.00%
bug 297312 caused a bug during startup and bug 302206 fixed that bug... I suspect bug 297312 for the increase in leakage on the day it was checked in and bug 302206 for the corresponding decrease when it was checked in. It appears there may have been a slight increase as well between the two.
Blocks: 290390
Whiteboard: SWAG: 2d → [ETA ?]
Flags: blocking1.8b5+ → blocking1.8b5-
This got fixed.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Product: Firefox → Toolkit
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: