Last Comment Bug 658738 - (bc-leaks) [meta] We seem to be leaking hundreds of windows until shutdown during browser-chrome tests
(bc-leaks)
: [meta] We seem to be leaking hundreds of windows until shutdown during browse...
Status: RESOLVED FIXED
[MemShrink:P2]
: meta, mlk
Product: Firefox
Classification: Client Software
Component: General (show other bugs)
: Trunk
: All All
: -- major with 8 votes (vote)
: ---
Assigned To: Dão Gottwald [:dao]
:
:
Mentors:
http://tinderbox.mozilla.org/Firefox/...
Depends on: LifetimeTesting chromeyfix 651643 663401 664116 664519 664549 664672 666221 668000 668972 669178 670463 671543 672470 674486 675412 677996 680647 680649 680686 683128 683320 683953 684803 hueyfix 712614 716692 717191 718536 719849 720985 722479 728426 728541 728606 734155 734172 734554 749361 750269 751085 751086 751100 751334 754800 754801 754803 759707 759708 759709 759711 759713 759715 759716 759717 759718 759720 759754 766291 766506 767896 768422 768935 768936 769646 770542 772382 772383 772384 772385 772778 773216 773387 773653 773661 775380 775393 776554 778433 782269 782858 932867 932880 932891 932898 933551
Blocks: SeaMonkey_bc-leaks mlk-fx6 ZombieCompartments 671421 726521 728638
  Show dependency treegraph
 
Reported: 2011-05-20 20:47 PDT by Boris Zbarsky [:bz] (still a bit busy)
Modified: 2014-05-03 04:39 PDT (History)
41 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Log as a zip file to make sure it doesn't go away (1.33 MB, application/zip)
2011-05-20 20:48 PDT, Boris Zbarsky [:bz] (still a bit busy)
no flags Details
candidate list (3.14 KB, text/plain)
2011-05-26 09:19 PDT, Dão Gottwald [:dao]
no flags Details
remove event listeners in tests [Checked in: Comment 17] (92.61 KB, patch)
2011-06-07 05:25 PDT, Dão Gottwald [:dao]
no flags Details | Diff | Splinter Review
clean out each test's scope [Checked in: Comment 27] (5.53 KB, patch)
2011-06-09 04:10 PDT, Dão Gottwald [:dao]
gavin.sharp: review+
Details | Diff | Splinter Review
URIs (6.66 KB, text/plain)
2011-06-10 02:28 PDT, Dão Gottwald [:dao]
no flags Details
reset TabContextMenu.contextTab when manually calling TabContextMenu.updateContextMenu [Checked in: Comment 33] (6.55 KB, patch)
2011-06-13 02:24 PDT, Dão Gottwald [:dao]
no flags Details | Diff | Splinter Review
script for building the URI list (583 bytes, text/plain)
2011-06-13 03:53 PDT, Dão Gottwald [:dao]
no flags Details
script for building the URI list (623 bytes, text/plain)
2011-06-22 13:41 PDT, Dão Gottwald [:dao]
no flags Details
URIs (10.04 KB, text/plain)
2011-06-22 13:42 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.67 KB, text/plain)
2011-07-05 12:19 PDT, Dão Gottwald [:dao]
no flags Details
schedule final GC [Checked in: Comment 43] (860 bytes, patch)
2011-07-05 13:29 PDT, Dão Gottwald [:dao]
gavin.sharp: review+
Details | Diff | Splinter Review
URIs (9.76 KB, text/plain)
2011-07-06 01:41 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.57 KB, text/plain)
2011-07-06 03:38 PDT, Dão Gottwald [:dao]
no flags Details
more GC/CC [Checked in: Comment 62] (1.12 KB, patch)
2011-07-21 09:15 PDT, Dão Gottwald [:dao]
gavin.sharp: review+
Details | Diff | Splinter Review
URIs (85.71 KB, text/plain)
2011-07-25 07:17 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.50 KB, text/plain)
2011-07-25 07:18 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.54 KB, text/plain)
2011-07-31 11:44 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.44 KB, text/plain)
2011-08-02 05:06 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.54 KB, text/plain)
2011-08-08 02:49 PDT, Dão Gottwald [:dao]
no flags Details
script for building the URI list (604 bytes, text/plain)
2011-08-16 03:46 PDT, Dão Gottwald [:dao]
no flags Details
leak statistics (before backout) (7.00 KB, text/plain)
2011-08-19 10:49 PDT, Tim Taubert [:ttaubert]
no flags Details
leak statistics (after backout) (4.21 KB, text/plain)
2011-08-19 10:49 PDT, Tim Taubert [:ttaubert]
no flags Details
leak statistics (before backout) (15.71 KB, text/plain)
2011-08-19 11:57 PDT, Tim Taubert [:ttaubert]
no flags Details
leak statistics (after backout) (11.06 KB, text/plain)
2011-08-19 11:57 PDT, Tim Taubert [:ttaubert]
no flags Details
URIs (9.49 KB, text/plain)
2011-08-22 01:53 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.59 KB, text/plain)
2011-09-08 03:30 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.49 KB, text/plain)
2011-09-23 07:15 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.44 KB, text/plain)
2011-09-28 08:22 PDT, Dão Gottwald [:dao]
no flags Details
URIs (9.63 KB, text/plain)
2011-12-20 13:42 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.53 KB, text/plain)
2012-01-09 11:16 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.53 KB, text/plain)
2012-01-11 01:57 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.48 KB, text/plain)
2012-01-16 15:46 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.43 KB, text/plain)
2012-01-19 03:17 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.28 KB, text/plain)
2012-01-24 08:50 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.38 KB, text/plain)
2012-02-18 14:21 PST, Dão Gottwald [:dao]
no flags Details
URIs (9.28 KB, text/plain)
2012-03-08 09:01 PST, Dão Gottwald [:dao]
no flags Details

Description Boris Zbarsky [:bz] (still a bit busy) 2011-05-20 20:47:19 PDT
I was looking at http://tinderbox.mozilla.org/Firefox/1305941364.1305945463.28696.gz for unrelated reasons, and noticed that at the end of the test there are about 500 DOMWINDOW objects around, and only a few docshells.  So I grabbed the log and looked at the urls of the windows being destroyed _after_ the hidden window.  The top offenders of the 500 or so windows left at that point are:

  339 about:blank
   96 chrome://browser/content/browser.xul
   84 about:addons

I tried a quick test of just opening and closing windows locally and the browser.xul windows go away.

Oh, during that whole test run about 186 chrome://browser/content/browser.xul windows go away.

And windows are going away through the run as we go, so it's not like _all_ windows leak.  Just a bunch of them.  Looking at the log, we seem to have both inner and outer browser.xul windows around at the end there.

The big question: is this just test code holding on to windows?  Or is this test code calling code that would normally run in our UI and holds on to windows?

Again, all the windows are _closed_ (no docshell) but the inner and outer window objects are hanging around at shutdown.

It'd be nice if someone familiar with the browser tests took a look at which tests are opening the windows that leak or something.  Do we have any other ways of finding their allocation points?
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2011-05-20 20:48:21 PDT
Created attachment 534189 [details]
Log as a zip file to make sure it doesn't go away

Though I bet any Moth log will show this.
Comment 2 Boris Zbarsky [:bz] (still a bit busy) 2011-05-20 20:49:06 PDT
Oh, and Gavin, Dao, please cc whoever might be the right people for this?
Comment 3 Robert Sayre 2011-05-20 21:04:59 PDT
This is totally unacceptable and we will fix. Assigning to me.
Comment 4 Boris Zbarsky [:bz] (still a bit busy) 2011-05-20 21:15:06 PDT
So to be clear, since I realized some stuff from comment 0 was a bit fuzzy:

1)  I can't reproduce the problem locally running a debug build for about a minute opening and closing random windows and tabs.  Things go away as they should when I do that.

2)  About half the browser.xul windows the Moth tests opened were around at browserchrome test shutdown.  The other half went away before that.
Comment 5 Dão Gottwald [:dao] 2011-05-22 01:41:47 PDT
(In reply to comment #0)
> The big question: is this just test code holding on to windows?

That's very likely the case. Browser chrome tests all run in the same window. A good portion of them will open new windows, do stuff and close them again.
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2011-05-23 23:48:10 PDT
> A good portion of them will open new windows, do stuff and close them again.

Yes, but do they then hold references to those windows in perpetuity?
Comment 7 Dão Gottwald [:dao] 2011-05-23 23:54:52 PDT
They might add a load event listener and not remove it.
Comment 8 Boris Zbarsky [:bz] (still a bit busy) 2011-05-24 00:19:04 PDT
Sure.  The question is whether that's happening or whether we have something going on that could bite in normal browsing too.
Comment 9 Dão Gottwald [:dao] 2011-05-24 00:22:37 PDT
It's certainly happening.
Comment 10 Boris Zbarsky [:bz] (still a bit busy) 2011-05-24 00:23:58 PDT
OK, can we stop it happening and see whether the behavior goes away?
Comment 11 Justin Dolske [:Dolske] 2011-05-24 14:50:05 PDT
I'm perhaps jumping ahead, but is it possible to add a test (or modify the test harness?) so that when this is fixed is stays fixed? EG, something makes noise and turns orange if a test (or real bug!) causes a window to leak to shutdown.
Comment 12 Boris Zbarsky [:bz] (still a bit busy) 2011-05-25 22:24:48 PDT
The hidden window is always alive until shutdown.  So we'd need to do something smart....
Comment 13 David Baron :dbaron: ⌚️UTC-8 2011-05-25 23:06:00 PDT
We can assert that, after a GC, there are no longer any windows that don't have a Components object (since we remove the Components object when the window is closed).  That's the heuristic that Leak Monitor uses.  Unfortunately, we haven't gotten back to cleanly meeting that even for simple testcases since Firefox 2.
Comment 14 Dão Gottwald [:dao] 2011-05-26 09:11:44 PDT
Out of 748 tests, I found 53 tests where addEventListener occurs more often than removeEventListener. This is a naive way to count and very likely inaccurate, but the scale seems reasonable.
Comment 15 Dão Gottwald [:dao] 2011-05-26 09:19:47 PDT
Created attachment 535361 [details]
candidate list
Comment 16 Dão Gottwald [:dao] 2011-06-07 05:25:33 PDT
Created attachment 537771 [details] [diff] [review]
remove event listeners in tests
[Checked in: Comment 17]
Comment 17 Dão Gottwald [:dao] 2011-06-07 09:02:31 PDT
Comment on attachment 537771 [details] [diff] [review]
remove event listeners in tests
[Checked in: Comment 17]

http://hg.mozilla.org/mozilla-central/rev/2e45e66a9200
Comment 18 Dão Gottwald [:dao] 2011-06-07 09:23:38 PDT
Doesn't look like this made a difference.
Comment 20 Dão Gottwald [:dao] 2011-06-09 08:51:10 PDT
(In reply to comment #19)
> Created attachment 538233 [details] [diff] [review] [review]
> clean out each test's scope
> 
> this seems to get the number of windows alive at the end of the test run
> down from 542 to 287

I should add that I don't actually expect the const -> var changes in this patch (so that 'delete' can kill the references) to contribute to the lower DOM window count. It's just something I wanted to rule out as a factor.
Comment 21 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-06-09 09:18:05 PDT
Comment on attachment 538233 [details] [diff] [review]
clean out each test's scope
[Checked in: Comment 27]

Can you put the destroy()/nulling of .scope in nextTest (after the duration stuff), rather than in the waitForWindowsState callback?
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2011-06-09 10:00:04 PDT
Hmm.  Is it expected that the test scopes will hang around like that?
Comment 23 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-06-09 12:31:57 PDT
(In reply to comment #22)
> Hmm.  Is it expected that the test scopes will hang around like that?

The test array is kept around for the duration of the run, so yes.
Comment 24 Boris Zbarsky [:bz] (still a bit busy) 2011-06-09 15:19:49 PDT
Can we pop things out of that array instead?  Seems cleaner....
Comment 25 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-06-09 15:38:19 PDT
The test objects keep track of the test results, so no. This patch clears the stuff that needs to be/can be cleared, I'm not sure I understand why you think it's "unclean".
Comment 26 Boris Zbarsky [:bz] (still a bit busy) 2011-06-09 15:42:36 PDT
Just generally, clearing properties seems like something that someone could forget to do in other circumstances.  In this case it sounds like that's a non-issue.
Comment 27 Dão Gottwald [:dao] 2011-06-10 00:08:11 PDT
Comment on attachment 538233 [details] [diff] [review]
clean out each test's scope
[Checked in: Comment 27]

http://hg.mozilla.org/mozilla-central/rev/443c7cd5bdf8
Comment 28 Dão Gottwald [:dao] 2011-06-10 02:28:24 PDT
Created attachment 538468 [details]
URIs
Comment 29 Phil Ringnalda (:philor) 2011-06-10 07:46:55 PDT
Apparently mobile-browser-chrome was depending on things not being cleaned up, since that patch took it from two permaoranges to fifteen.
Comment 30 Dão Gottwald [:dao] 2011-06-10 08:56:12 PDT
(In reply to comment #29)
> Apparently mobile-browser-chrome was depending on things not being cleaned
> up, since that patch took it from two permaoranges to fifteen.

Sounds like these mobile tests are just broken, as the scopes get cleared after the tests have declared that they're finished.
Comment 31 Dão Gottwald [:dao] 2011-06-10 09:43:03 PDT
(In reply to comment #30)
> (In reply to comment #29)
> > Apparently mobile-browser-chrome was depending on things not being cleaned
> > up, since that patch took it from two permaoranges to fifteen.
> 
> Sounds like these mobile tests are just broken, as the scopes get cleared
> after the tests have declared that they're finished.

Apparently it's still 2 failures with my patch: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1307717565.1307718056.14042.gz
Comment 32 Phil Ringnalda (:philor) 2011-06-10 18:31:29 PDT
Yep, I can't read: between the suite just being turned on on -inbound (after the regressing patch landed) and the merges in both directions and the total failure rate, I missed by one push, and it was the merge after you. Sorry for the distraction.
Comment 33 Dão Gottwald [:dao] 2011-06-13 02:24:31 PDT
Created attachment 538835 [details] [diff] [review]
reset TabContextMenu.contextTab when manually calling TabContextMenu.updateContextMenu
[Checked in: Comment 33]

http://hg.mozilla.org/mozilla-central/rev/bbbc80a2bd8c
Comment 34 Dão Gottwald [:dao] 2011-06-13 03:53:16 PDT
Created attachment 538848 [details]
script for building the URI list

putting this here since I already lost it once
Comment 35 Johnny Stenback (:jst, jst@mozilla.com) 2011-06-21 13:37:39 PDT
Dão, I know some dependent bugs have landed here, can you summarize where we're at and what's left to do? Thanks!
Comment 36 Dão Gottwald [:dao] 2011-06-22 13:40:08 PDT
The dependent bugs have helped here and there, but on balance the window count is actually growing as people land more tests.

I just disabled sessionstore tests on Try and the number went from ~330 to 194. The URIs affected by this are about:blank (130 -> 66) and chrome://browser/content/browser.xul (79 -> 13). I now need to figure out if sessionstore code or the tests are to blame.
Comment 37 Dão Gottwald [:dao] 2011-06-22 13:41:37 PDT
Created attachment 541165 [details]
script for building the URI list
Comment 38 Dão Gottwald [:dao] 2011-06-22 13:42:38 PDT
Created attachment 541166 [details]
URIs
Comment 39 Dão Gottwald [:dao] 2011-07-05 12:19:27 PDT
Created attachment 544020 [details]
URIs
Comment 40 Dão Gottwald [:dao] 2011-07-05 13:29:25 PDT
Created attachment 544046 [details] [diff] [review]
schedule final GC
[Checked in: Comment 43]

This shaves off some noise. about:blank still fluctuates slightly and pluginInstallerWizard.xul remains random as ever.
Comment 41 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-07-05 14:34:07 PDT
The installtrigger windows may be http://hg.mozilla.org/mozilla-central/annotate/4df47729b74d/browser/base/content/browser.js#l740 , though that does try to take care of releasing the references when the notifications are removed, so I'm not sure why they'd be leaking until the end of the run.
Comment 42 Jesse Ruderman 2011-07-05 23:21:14 PDT
(In reply to comment #40)
> Created attachment 544046 [details] [diff] [review] [review]
> schedule final GC
> 
> This shaves off some noise. about:blank still fluctuates slightly and
> pluginInstallerWizard.xul remains random as ever.

Should probably do CC and MP in addition to the precise GC. (Look at what about:memory does.)
Comment 43 Dão Gottwald [:dao] 2011-07-06 01:07:15 PDT
Comment on attachment 544046 [details] [diff] [review]
schedule final GC
[Checked in: Comment 43]

http://hg.mozilla.org/mozilla-central/rev/58101c64c83c
Comment 44 Dão Gottwald [:dao] 2011-07-06 01:41:36 PDT
Created attachment 544169 [details]
URIs

http://example.com/browser/toolkit/components/console/hudservice/tests/browser/test-error.html is new, introduced by the TM merge (6840fbf4dcdd).
Comment 45 Dão Gottwald [:dao] 2011-07-06 03:37:24 PDT
Don't leak test-console.html in browser_webconsole_bug_580001_closing_after_completion.js:
http://hg.mozilla.org/mozilla-central/rev/7cbeab4cc868
Interestingly, this also seems to have removed test-error.html from the list.
Comment 46 Dão Gottwald [:dao] 2011-07-06 03:38:11 PDT
Created attachment 544183 [details]
URIs
Comment 47 Jonathan Kew (:jfkthame) 2011-07-06 06:52:53 PDT
(In reply to comment #43)
> Comment on attachment 544046 [details] [diff] [review] [review]
> schedule final GC (checked in)
> 
> http://hg.mozilla.org/mozilla-central/rev/58101c64c83c

I filed bug 669617 for an apparently-new intermittent mochitest-chrome crash that occurred on the push following this one; I don't know if it's related in any way but the timing looks a bit suspicious.
Comment 48 Dão Gottwald [:dao] 2011-07-08 09:18:13 PDT
The four places.xul instances originate from these tests:

browser_410196_paste_into_tags.js
browser_416459_cut.js
browser_library_batch_delete.js
browser_library_left_pane_commands.js
Comment 49 Dão Gottwald [:dao] 2011-07-09 01:37:30 PDT
The number of docshells is now 55 (previously 7) starting with <http://hg.mozilla.org/mozilla-central/rev/a447e63943e1>.
Comment 50 Jesse Ruderman 2011-07-09 12:49:01 PDT
... which was backed out (along with the patch it was a bustage fix for).
Comment 51 Dão Gottwald [:dao] 2011-07-15 11:53:27 PDT
(In reply to comment #11)
> I'm perhaps jumping ahead, but is it possible to add a test (or modify the
> test harness?) so that when this is fixed is stays fixed? EG, something
> makes noise and turns orange if a test (or real bug!) causes a window to
> leak to shutdown.

Has anyone ideas on how to achieve this?

The list is slowly growing every few days, which makes this bug an uphill battle that I'd have to keep fighting for multiple months.
Comment 52 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-07-15 12:05:58 PDT
In Bug 633670 I'm going to implement some code to test to see that a window has been GCd.  Plan is to work on this at the beginning of next week.
Comment 53 Jesse Ruderman 2011-07-15 13:15:27 PDT
We can make new leaks turn b-c orange even before all the dependencies are fixed. For example, we could make it turn orange if the number of leaked windows changes.
Comment 54 Dão Gottwald [:dao] 2011-07-21 09:15:05 PDT
Created attachment 547413 [details] [diff] [review]
more GC/CC
[Checked in: Comment 62]

Cuts away more about:blank and pluginInstallerWizard.xul (down to 1). Calling garbageCollect only once also helps, but not quite as much.
Comment 55 Jesse Ruderman 2011-07-21 12:21:15 PDT
The most effective way to clear memory is to fire several memory-pressure notifications separated by trips to the event loop, perhaps with some schedulePreciseGC tossed in.  See my comments in bug 654041.

Ideally, the mochitest shutdown sequence would make it possible to see how many windows remain after each memory-pressure notification.  This way, we can file bugs when windows take multiple GCs/CCs to clean up (which often means we're one additional edge away from a leak).
Comment 56 Dão Gottwald [:dao] 2011-07-21 15:15:42 PDT
(In reply to comment #55)
> The most effective way to clear memory is to fire several memory-pressure
> notifications separated by trips to the event loop, perhaps with some
> schedulePreciseGC tossed in.  See my comments in bug 654041.

I don't think we're interested in flushing caches here.
Comment 57 Nicholas Nethercote [:njn] 2011-07-21 15:47:48 PDT
(In reply to comment #56)
> 
> I don't think we're interested in flushing caches here.

Then do multiple GC+CC runs via the event loop.
Comment 58 Dão Gottwald [:dao] 2011-07-22 11:24:13 PDT
(In reply to comment #57)
> (In reply to comment #56)
> > 
> > I don't think we're interested in flushing caches here.
> 
> Then do multiple GC+CC runs via the event loop.

Doesn't seem to make a difference.
Comment 59 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-07-23 20:16:52 PDT
Comment on attachment 547413 [details] [diff] [review]
more GC/CC
[Checked in: Comment 62]

I'm kind of lost as to what the goal is here. Have we determined that the window leaks that this patch "fixes" are not real issues that come up outside of the test harness (e.g. a problem with our GC/CC heuristics)? Test-harness only changes that hide them are only useful if we can say with confidence that they're not "real" issues, and I'm not sure that we can do that.
Comment 60 Jesse Ruderman 2011-07-23 22:43:34 PDT
The goal is to get the relatively minor "requires multiple GCs/CCs to clean up" bugs out of our way so we can start ratcheting down on *something*.

I'd like these bugs to be noted in the log.  But they probably shouldn't turn the tree orange, because they would be intermittently caught (due to heuristics such as GC timing, CC timing, and conservative scanning).

Also, winutils.garbageCollect triggers a CC, while schedulePreciseGC does not. So we clearly need at least one of the added calls.
Comment 61 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-07-24 18:14:25 PDT
Comment on attachment 547413 [details] [diff] [review]
more GC/CC
[Checked in: Comment 62]

OK, that makes sense. The comment before this code should probably be adjusted - this doesn't really result in a "more accurate" window count, just a different one (omitting windows that would have eventually been picked up anyways - i.e. we're hiding less serious issues so that we can focus on more serious ones).
Comment 62 Dão Gottwald [:dao] 2011-07-25 03:12:59 PDT
Comment on attachment 547413 [details] [diff] [review]
more GC/CC
[Checked in: Comment 62]

http://hg.mozilla.org/mozilla-central/rev/b3149b080c65
Comment 63 Dão Gottwald [:dao] 2011-07-25 07:17:51 PDT
Created attachment 548179 [details]
URIs
Comment 64 Dão Gottwald [:dao] 2011-07-25 07:18:53 PDT
Created attachment 548180 [details]
URIs
Comment 65 Dão Gottwald [:dao] 2011-07-30 06:24:49 PDT
about:addons leak was due to an undeclared variable:
http://hg.mozilla.org/mozilla-central/rev/040446311029
Comment 66 Dão Gottwald [:dao] 2011-07-31 11:44:00 PDT
Created attachment 549675 [details]
URIs
Comment 67 Dão Gottwald [:dao] 2011-08-02 05:06:04 PDT
Created attachment 550047 [details]
URIs
Comment 68 Dão Gottwald [:dao] 2011-08-08 02:49:09 PDT
Created attachment 551409 [details]
URIs

Something somewhere made a huge difference. about:blank went from 100 to 31, browser.xul from 78 to 8.

test-console.html and test-console-api.html are new, regressed by <http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=eef25ec2d58e>.
Comment 69 Dão Gottwald [:dao] 2011-08-08 03:04:10 PDT
> Something somewhere made a huge difference. about:blank went from 100 to 31,
> browser.xul from 78 to 8.

Something in http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=0fbdefb0a9b5
Comment 70 Dão Gottwald [:dao] 2011-08-08 03:30:19 PDT
(In reply to Dão Gottwald [:dao] from comment #69)
> > Something somewhere made a huge difference. about:blank went from 100 to 31,
> > browser.xul from 78 to 8.
> 
> Something in
> http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=0fbdefb0a9b5

<http://hg.mozilla.org/integration/fx-team/pushloghtml?changeset=6f1affa4bb5e>, so I guess bug 651643.
Comment 71 Dão Gottwald [:dao] 2011-08-09 01:08:10 PDT
(In reply to Dão Gottwald [:dao] from comment #70)
> (In reply to Dão Gottwald [:dao] from comment #69)
> > > Something somewhere made a huge difference. about:blank went from 100 to 31,
> > > browser.xul from 78 to 8.
> > 
> > Something in
> > http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=0fbdefb0a9b5
> 
> <http://hg.mozilla.org/integration/fx-team/pushloghtml?changeset=6f1affa4bb5e>,
> so I guess bug 651643.

After trying to verify this, I found out that bug 666475 did it.
Comment 72 Dão Gottwald [:dao] 2011-08-09 03:27:37 PDT
(In reply to Dão Gottwald [:dao] from comment #71)
> (In reply to Dão Gottwald [:dao] from comment #70)
> > (In reply to Dão Gottwald [:dao] from comment #69)
> > > > Something somewhere made a huge difference. about:blank went from 100 to 31,
> > > > browser.xul from 78 to 8.
> > > 
> > > Something in
> > > http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=0fbdefb0a9b5
> > 
> > <http://hg.mozilla.org/integration/fx-team/pushloghtml?changeset=6f1affa4bb5e>,
> > so I guess bug 651643.
> 
> After trying to verify this, I found out that bug 666475 did it.

Err, I mixed up the logs.
Comment 73 Dão Gottwald [:dao] 2011-08-16 03:46:50 PDT
Created attachment 553428 [details]
script for building the URI list
Comment 74 Tim Taubert [:ttaubert] 2011-08-19 10:49:08 PDT
Created attachment 554470 [details]
leak statistics (before backout)

I wrote a little script to narrow down which tests are leaking DOMWindows and docShells. It parses the build log, records created DOMWindows and docShells per test and counts them as "leaked" if they get removed after the test suite shutdown.

I compared the builds log from before and after the backout of bug 662812. I hope that can help a bit finding the leak sources.
Comment 75 Tim Taubert [:ttaubert] 2011-08-19 10:49:42 PDT
Created attachment 554471 [details]
leak statistics (after backout)
Comment 76 Dão Gottwald [:dao] 2011-08-19 11:54:52 PDT
(In reply to Tim Taubert [:ttaubert] from comment #74)
> Created attachment 554470 [details]
> leak statistics (before backout)
> 
> I wrote a little script to narrow down which tests are leaking DOMWindows
> and docShells. It parses the build log, records created DOMWindows and
> docShells per test and counts them as "leaked" if they get removed after the
> test suite shutdown.

Note that some windows aren't leaked, they just exist as part of the browser window that all tests run in. This is probably the case for browser_bug597218.js, for instance, which I guess causes panorama to load its iframe.
Comment 77 Tim Taubert [:ttaubert] 2011-08-19 11:57:08 PDT
Created attachment 554493 [details]
leak statistics (before backout)

assigned urls to the leaked windows
Comment 78 Tim Taubert [:ttaubert] 2011-08-19 11:57:53 PDT
Created attachment 554494 [details]
leak statistics (after backout)

assigned urls to the leaked windows
Comment 79 Tim Taubert [:ttaubert] 2011-08-19 11:59:22 PDT
(In reply to Dão Gottwald [:dao] from comment #76)
> Note that some windows aren't leaked, they just exist as part of the browser
> window that all tests run in. This is probably the case for
> browser_bug597218.js, for instance, which I guess causes panorama to load
> its iframe.

Yeah, I already noticed that. Thanks for the hint!
Comment 80 Dão Gottwald [:dao] 2011-08-22 01:53:04 PDT
Created attachment 554804 [details]
URIs
Comment 81 Dão Gottwald [:dao] 2011-09-08 03:30:14 PDT
Created attachment 559090 [details]
URIs
Comment 82 Dão Gottwald [:dao] 2011-09-23 07:15:49 PDT
Created attachment 562034 [details]
URIs

Since the previous update, places.xul and bookmarkProperties.xul disappeared, the browser.xul count grew by 1.
Comment 83 Dão Gottwald [:dao] 2011-09-28 08:22:29 PDT
Created attachment 563078 [details]
URIs
Comment 84 Dão Gottwald [:dao] 2011-11-11 06:46:34 PST
(In reply to Dão Gottwald [:dao] from comment #83)
> Created attachment 563078 [details]
> URIs

For the record, this is still up to date.
Comment 85 Nicholas Nethercote [:njn] 2011-12-06 14:28:34 PST
Dão, what's the status of this bug?  Looks like there are a few dependent bugs still, but not that many.  Are you working on them?  Maybe this should be downgraded to a MemShrink:P2?
Comment 86 Jesse Ruderman 2011-12-06 18:02:09 PST
Well, how important is it for the tree to turn orange when we introduce a big new leak bug?  Getting the leaks to zero is the most straightforward way to do that, but not the only way.
Comment 87 Dão Gottwald [:dao] 2011-12-07 05:45:50 PST
I don't really care if this is a P1 -- I think bug 633670 and/or bug 683953 should be, though, so we can stop monitoring this manually.

The biggest remaining offender is bug 675412, by the way, which seems to be stalled.
Comment 88 Dão Gottwald [:dao] 2011-12-20 13:42:57 PST
Created attachment 583272 [details]
URIs

We've gone from 127 to 615 windows. Most of the new ones are about:addons instances.
Comment 89 Dão Gottwald [:dao] 2011-12-20 16:33:39 PST
(In reply to Dão Gottwald [:dao] from comment #88)
> Created attachment 583272 [details]
> URIs
> 
> We've gone from 127 to 615 windows. Most of the new ones are about:addons
> instances.

I've narrowed down the regression range to cbb0233c7ba8 (good) and be0fffc6b0d1 (bad).
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=cbb0233c7ba8&tochange=be0fffc6b0d1
Comment 90 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-12-20 16:38:08 PST
The hotfix addon work is in that range.
Comment 91 David Baron :dbaron: ⌚️UTC-8 2011-12-21 05:21:52 PST
I confirmed from mozilla-inbound tinderbox logs that https://tbpl.mozilla.org/?branch=mozilla-inbound&rev=0ab6831c77bc is good (before the regression) based on the Win debug log at https://tbpl.mozilla.org/php/getParsedLog.php?id=8074372&tree=Mozilla-Inbound .  Still looking for the nearest non-deleted debug mochitest-other tinderbox log after the push khuey suspects (which is https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=e5d8d2fb987d ).

Maybe we should have a separate bug on the regression, though?
Comment 92 Dão Gottwald [:dao] 2011-12-21 05:25:40 PST
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #90)
> The hotfix addon work is in that range.

Further narrowed down to 61a46d539c69 (good) and be0fffc6b0d1 (bad):

> be0fffc6b0d1 Michael Wu — Bug 710029 - Assorted build system tweaks for B2G/Gonk. r=khuey
> 26cce2f95d49 Brian Hackett — Fix lingering uses of int32 types, no bug.
> c5026431768d Doug Sherk — Bug 711579: fix WGL context creation without robustness r=Bas
> b895916227b2 Doug Sherk — Bug 711226: separate desktop and ES 2.0 symbol loading r=bjacob Symbols were amalgamated into a single table which worked up until recently. This patch separates them so that there is a common list, then two separate ones for desktop and OpenGL ES 2.0.
> 9939bcd259a3 Doug Sherk — Bug 708207: implement WebGL's getShaderPrecisionFormat r=bjacob
> de66e7bd2b98 Brian Hackett — Add interface for accessing PC counter information from chrome code, bug 687134. r=sfink,waldo
> 5db46b0c2f14 Jonas Sicking — Back out fb4d12d2a2da, bug 701772, due to leaks.
> fbf296760b0f Steffen Wilberg — Bug 710064: Make the 'Update Add-ons Automatically' checkbox state depend on the extensions.update.enabled pref, in addition to the existing extensions.update.autoUpdateDefault pref. r=mossop

Bug 710064 / http://hg.mozilla.org/mozilla-central/rev/fbf296760b0f added a pref observer to extensions.js without ever removing it, so I guess that's the culprit.
Comment 93 Dão Gottwald [:dao] 2011-12-21 05:31:22 PST
(In reply to David Baron [:dbaron] from comment #91)
> I confirmed from mozilla-inbound tinderbox logs that
> https://tbpl.mozilla.org/?branch=mozilla-inbound&rev=0ab6831c77bc is good
> (before the regression) based on the Win debug log at
> https://tbpl.mozilla.org/php/getParsedLog.php?id=8074372&tree=Mozilla-
> Inbound .  Still looking for the nearest non-deleted debug mochitest-other
> tinderbox log after the push khuey suspects (which is
> https://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?changeset=e5d8d2fb987d ).
> 
> Maybe we should have a separate bug on the regression, though?

filed bug 712614
Comment 94 Justin Dolske [:Dolske] 2011-12-21 11:39:37 PST
(In reply to Dão Gottwald [:dao] from comment #88)

> We've gone from 127 to 615 windows. Most of the new ones are about:addons
> instances.

Can we add a leaked-window threshold check (eg, < 130) to prevent further slips like this? Experience from previous "drive X to 0" efforts seems to be that it (1) takes a long time to actually reach 0 and (2) in the meantime more regressions will slip in.
Comment 95 Dão Gottwald [:dao] 2012-01-09 11:16:48 PST
Created attachment 587055 [details]
URIs

194 times browser.xul (instead of 8)
Comment 96 Dão Gottwald [:dao] 2012-01-11 01:57:04 PST
Created attachment 587629 [details]
URIs
Comment 97 Tim Taubert [:ttaubert] 2012-01-12 10:14:28 PST
(In reply to Justin Dolske [:Dolske] from comment #94)
> Can we add a leaked-window threshold check (eg, < 130) to prevent further
> slips like this? Experience from previous "drive X to 0" efforts seems to be
> that it (1) takes a long time to actually reach 0 and (2) in the meantime
> more regressions will slip in.

I just uploaded a new patch to bug 683953 which implements a threshold and makes the tree go orange if we leak more than that. Unfortunately, the counts vary heavily on different platforms, see bug 683953 comment #13 for some stats.
Comment 98 Marco Bonardo [::mak] 2012-01-12 11:12:18 PST
(In reply to Tim Taubert [:ttaubert] from comment #97)
> Unfortunately, the
> counts vary heavily on different platforms

not a big deal, having a threshold per platform is better than not having one.
Comment 99 Dão Gottwald [:dao] 2012-01-16 15:46:24 PST
Created attachment 589056 [details]
URIs
Comment 100 Dão Gottwald [:dao] 2012-01-19 03:17:02 PST
Created attachment 589821 [details]
URIs
Comment 101 Dão Gottwald [:dao] 2012-01-24 08:50:32 PST
Created attachment 591119 [details]
URIs
Comment 102 Serge Gautherie (:sgautherie) 2012-02-12 22:25:25 PST
http://hg.mozilla.org/mozilla-central/rev/b2d50f779d6c
Bug 658738 - split browser_394759.js into three tests
Comment 103 Dão Gottwald [:dao] 2012-02-18 14:21:32 PST
Created attachment 598580 [details]
URIs
Comment 104 neil@parkwaycc.co.uk 2012-02-28 04:39:05 PST
Would it be possible (maybe via adding something to nsIWindowUtils) to track the number of DOM windows that exist, so that the test harness can directly implicate each test as it runs? At some point tests would default to not leaking, but old tests could possibly be grandfathered in with some sort of limited opt-out?
Comment 105 Dão Gottwald [:dao] 2012-03-08 09:01:34 PST
Created attachment 604090 [details]
URIs
Comment 106 Andrew McCreight [:mccr8] 2012-05-30 08:08:07 PDT
Ed, I'm curious what spurred this large list of new bugs.  Are these new leaks, or have they just not been filed before?
Comment 107 Ed Morley [:emorley] 2012-05-30 08:12:54 PDT
They are the tests that are regularly causing us to go over the threshold and thus cause bug 754804. They just haven't been filed until now. Dão has been driving the shutdown leaks threshold related bugs for a while now, thought he could do with a second pair of hands :-)
Comment 108 Dão Gottwald [:dao] 2014-05-03 04:39:01 PDT
I think this has outlived its usefulness as a tracking bug let alone as a bug where actual work happens. New leaky tests can be (and have been) dealt with in other bugs that don't need to block this one.

Note You need to log in before you can comment on or make changes to this bug.