Last Comment Bug 438922 - Ensure that mailnews unit tests shutdown observers gracefully
: Ensure that mailnews unit tests shutdown observers gracefully
Status: VERIFIED FIXED
: mlk
Product: MailNews Core
Classification: Components
Component: Backend (show other bugs)
: Trunk
: All All
: -- normal (vote)
: Thunderbird 3.0b3
Assigned To: Mark Banner (:standard8)
:
:
Mentors:
: 462955 (view as bug list)
Depends on: 384339
Blocks: 426615 442026 mailnewsleak
  Show dependency treegraph
 
Reported: 2008-06-12 13:31 PDT by Håkan Waara
Modified: 2009-07-12 09:55 PDT (History)
7 users (show)
standard8: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Let the biff manager be shutdown by the account manager (checked in) (2.25 KB, patch)
2008-06-12 15:00 PDT, Håkan Waara
mozilla: review+
Details | Diff | Splinter Review
Shutdown gracefully on end of all tests v1 (7.46 KB, patch)
2008-06-14 12:54 PDT, Håkan Waara
standard8: review-
Details | Diff | Splinter Review
Graceful test shutdown v2 (6.22 KB, patch)
2008-06-25 10:28 PDT, Håkan Waara
no flags Details | Diff | Splinter Review
Updated partial patch (3.52 KB, patch)
2009-04-20 08:24 PDT, Mark Banner (:standard8)
no flags Details | Diff | Splinter Review
Updated patch (5.49 KB, patch)
2009-04-28 08:16 PDT, Mark Banner (:standard8)
no flags Details | Diff | Splinter Review
Updated patch v2 (5.59 KB, patch)
2009-06-09 05:50 PDT, Mark Banner (:standard8)
no flags Details | Diff | Splinter Review
[checked in] Part 1 - shutdown resource and fix addrbook, bayesian-spam-filter, import and mime tests (3.94 KB, patch)
2009-06-13 13:12 PDT, Mark Banner (:standard8)
mozilla: review+
Details | Diff | Splinter Review
[checked in] Part 2 - base, compose, db/msgdb, imap, local, news with some cleanup (2.38 KB, patch)
2009-06-15 06:26 PDT, Mark Banner (:standard8)
mozilla: review+
Details | Diff | Splinter Review
[checked in] Part 3 - gloda (249 bytes, patch)
2009-06-20 14:07 PDT, Mark Banner (:standard8)
mozilla: review+
Details | Diff | Splinter Review

Description Håkan Waara 2008-06-12 13:31:27 PDT
When working on my import test, I noticed this on shutdown of a very simple test:

"###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!"

The problem in this case is that observers of the xpcom-shutdown notification are trigger JS components to do things. But at this point, XPCOM is half-shutdown -- the JS window is gone, for example, hence the assertion.

To make our testcases sane, I want to move the Account Manager so it doesn't do its Shutdown() on xpcom-shutdown, but earlier. But we might need to audit other observers at the same time, to ensure nothing breaks.
Comment 1 Håkan Waara 2008-06-12 13:36:25 PDT
Here's a full list of all observers and what they listen for (I grepped for AddObserver, and looked at each instance in mailnews/ and mail):

nsMsgBiffManager 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsMapiSupport 
  profile-after-change, 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsAbDirectoryDataSource 
  profile-do-change, 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsMsgAccountManager 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID, 
  quit-application, 
  ABOUT_TO_GO_OFFLINE_TOPIC,
  session-logout, 
  profile-before-change 

nsMsgRDFDataSource
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

gFolderCharsetObserver (nsFolderCharsetObserver)
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsPalmSyncSupport
  NS_XPCOM_SHUTDOWN_OBSERVER_ID, 
  em-action-requested, 
  profile-after-change

nsIMAPHostSessionList 
  profile-before-change, 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsMsgShutdownService
  quit-application-requested

nsMsgComposeService
  quit-application, 
  profile-do-change
Comment 2 Håkan Waara 2008-06-12 14:27:31 PDT
I started analyzing every one of these observers, and what they do when they receive the xpcom-shutdown notification. Luckily, I didn't need to analyze all of them before I found a simple fix for the account manager.

* The account manager runs its own ::Shutdown() on two notifications: profile-before-change and the xpcom-shutdown. 
* If it has already shutdown, ::Shutdown() returns early and does nothing. 
* As one can see on http://developer.mozilla.org/en/docs/Observer_Notifications profile-before-change is always fired before xpcom-shutdown, so listening for xpcom-shutdown is redundant!

The reason we hit the case where the account manager shuts down too late (xpcom-shutdown) in only the testcases, is because our test environment doesn't issue any profile-* notifications.

So my proposed way to fix this is like this:

* Make our test environment issue profile-* notifications, both when it's loaded and is shutting down (in a tail.js file)
* Change nsMsgAccountManager to never listen on xpcom-shutdown since it 1) doesn't work and 2) is redundant

As a bonus, I found out that nsMsgBiffManager also redundantly listens to xpcom-shutdown. Its shutdown is already ensured by the nsMsgAccountManager on profile-before-change (which is *before* the notification it's observing).
Comment 3 David :Bienvenu 2008-06-12 14:39:42 PDT
Hakan, that all sounds reasonable (though Standard8 will have to comment on the test environment issue).
Comment 4 Håkan Waara 2008-06-12 15:00:44 PDT
Created attachment 324867 [details] [diff] [review]
Let the biff manager be shutdown by the account manager (checked in)

This patch removes the biffmanager as an shutdown-observer. nsMsgAccountManager::Shutdown() already calls nsMsgBiffManager::Shutdown() before.
Comment 5 David :Bienvenu 2008-06-12 15:12:20 PDT
Comment on attachment 324867 [details] [diff] [review]
Let the biff manager be shutdown by the account manager (checked in)

thx, makes sense.
Comment 6 Mark Banner (:standard8) 2008-06-13 03:32:57 PDT
(In reply to comment #2)
> The reason we hit the case where the account manager shuts down too late
> (xpcom-shutdown) in only the testcases, is because our test environment doesn't
> issue any profile-* notifications.

That's right, what I think is worth making clear here is once you get to the xpcom-shutdown stage you can't createInstance/getService. If subFolders of an account haven't been initialised, then we'll try and initialise them on shutdown as part of writing the folder cache. Not ideal, but given we don't hit this on normal app shutdown (even for new accounts), I guess subFolders are normally initialised and its not an issue.

> So my proposed way to fix this is like this:
> 
> * Make our test environment issue profile-* notifications, both when it's
> loaded and is shutting down (in a tail.js file)

Although I'm not sure I really want to, it does sound like a good idea to issue profile-* notifications when we exit xpcshell.

I think for startup it may be better to provide utility functions for profile-* notifications. Looking at where we currently use profile-* notifications on startup, I don't think there's any that we really need to do by default.
Comment 7 Håkan Waara 2008-06-14 12:54:32 PDT
Created attachment 325126 [details] [diff] [review]
Shutdown gracefully on end of all tests v1

This patch makes us issue all the shutdown notifications that Thunderbird-the-app does, and in the right order. I've created a new testcase (test_createMailAccount) that crashes without the fix (which is why I started work on this bug in the first place).

All other tests run fine, except for one:

test_sendMailMessage.js (in compose) has a problem with shutting down NSS when these notifications are issued. The good news is that since we're now exercising more code than before on shutdown of all of these tests, this actually exposes a problem that hasn't been seen before!  

I'll file a bug separate bug on it, and have commented out the shutdown() tail_compose.js for now due to it.
Comment 8 Mark Banner (:standard8) 2008-06-17 06:12:27 PDT
Comment on attachment 325126 [details] [diff] [review]
Shutdown gracefully on end of all tests v1

diff --git a/mailnews/addrbook/test/unit/head_addrbook.js b/mailnews/addrbook/test/unit/head_addrbook.js
--- a/mailnews/addrbook/test/unit/head_addrbook.js
+++ b/mailnews/addrbook/test/unit/head_addrbook.js
@@ -1,5 +1,6 @@
 // Import the main scripts that mailnews tests need to set up and tear down
 do_import_script("mailnews/test/resources/mailDirService.js");
+do_import_script("mailnews/test/resources/mailTestUtils.js");

I'd prefer the shutdown code to be in a separate script file (mailnews/test/resource/mailShutdown.js?) imported straight into the tail file. Skip the shutdown function and call postShutdownNotifications() from the global part of the new file.

\ No newline at end of file

Please fix all of these to have newlines.

I had a look at the NSS shutdown problem as well. I tried running the test with check-interactive so that I could catch it in xcode, but the crash didn't occur. So maybe that's allowing the network to clean up properly somehow.

I think we need to determine what is happening wrt the NSS case before we commit this patch. I'm concerned that otherwise we'll commit our test case implementers to extra work they really don't want to do.
Comment 9 Håkan Waara 2008-06-17 08:52:17 PDT
(In reply to comment #8)
> I had a look at the NSS shutdown problem as well. I tried running the test with
> check-interactive so that I could catch it in xcode, but the crash didn't
> occur. So maybe that's allowing the network to clean up properly somehow.

Note that you can't use check-interactive to test this way, because check-interactive will run the tail stuff *before* dropping into the console. So that's why you're not seeing any errors, and I guess you're lucky that nothing bad happens...
Comment 10 Håkan Waara 2008-06-25 10:28:32 PDT
Created attachment 326729 [details] [diff] [review]
Graceful test shutdown v2

Turns out the NSS shutdown assertion had a really simple fix. It expected a profile-change-net-teardown notification before the profile-change-teardown. This is what makes it shut down network stuff. This is also issued by XRE live.

This patch includes your other two nits: newlines at the end of files, and a new "mailShutdown.js" file for this stuff.
Comment 11 Kent James (:rkent) 2008-06-25 11:05:59 PDT
I'm not real happy to see functionality added back to the tail, when we worked earlier to remove that. Bug 384339 has shown little forward motion lately, so adding stuff to the tail breaks check-interactive for us again, which makes it more difficult to setup a debug environment for unit tests.

Let me propose a compromise. For now, add function postShutdownNotifications() in the head, and add the call to postShutdownNotifications directly to tests that need it. Make sure that postShutdownNotifications() can be called twice. Open a bug, blocked by 384339, that adds the postShutdownNotifications() call to the tail, so that at some future time we can quit adding that to the tests.
Comment 12 Håkan Waara 2008-06-26 04:35:50 PDT
(In reply to comment #11)

I agree that breaking check-interactive would suck, and I guess explicitly adding the shutdown call to all run_test()s isn't that bad (yet) since there are only 20-ish testcases so far.

About your proposed solution: I'm not sure how or why you want the shutdown to be callable twice? The whole point of it is that is it is saying to core "we're shutting down for good now, please do whatever cleanup is necessary".
Comment 13 Kent James (:rkent) 2008-06-26 11:16:54 PDT
(In reply to comment #12)

> About your proposed solution: I'm not sure how or why you want the shutdown to
> be callable twice? The whole point of it is that is it is saying to core "we're
> shutting down for good now, please do whatever cleanup is necessary".
> 
All I mean by callable twice is that you don't need to remove any existing calls to shutdown procedures, in the tests themselves, when in the future you add the same call in the tail. At that point, shutdown will be called twice. Make sure that doesn't force you to change all existing tests to remove the now redundant shutdown call.
Comment 14 Håkan Waara 2008-06-26 19:01:57 PDT
(In reply to comment #13)
> (In reply to comment #12)
> 
> > About your proposed solution: I'm not sure how or why you want the shutdown to
> > be callable twice? The whole point of it is that is it is saying to core "we're
> > shutting down for good now, please do whatever cleanup is necessary".
> > 
> All I mean by callable twice is that you don't need to remove any existing
> calls to shutdown procedures, in the tests themselves, when in the future you
> add the same call in the tail. At that point, shutdown will be called twice.
> Make sure that doesn't force you to change all existing tests to remove the now
> redundant shutdown call.

No, when we solve this the right way, we're gonna have to remove the extra shutdown calls I don't see a way to make the shutdown work two times in a row; it doesn't make any sense. It's just the price we have to pay for this workaround.
Comment 15 Mark Banner (:standard8) 2008-06-27 03:08:41 PDT
Comment on attachment 326729 [details] [diff] [review]
Graceful test shutdown v2

So I think this patch is reasonable, but we need to fix bug 384339 first, so therefore cancelling review for the time being.
Comment 16 Mark Banner (:standard8) 2009-04-20 08:24:37 PDT
Created attachment 373655 [details] [diff] [review]
Updated partial patch

I was taking a look at the patch on bug 384339 earlier and did an minor (incomplete) update of this patch - this version ensures we release the dir provider, though its not very pretty. Just parking it here for when bug 384339 gets resolved.
Comment 17 Mark Banner (:standard8) 2009-04-28 08:16:54 PDT
Created attachment 374909 [details] [diff] [review]
Updated patch

This does all the tail files, and so we get a much clearer picture, although at the moment there is a crash on shutdown of one of the news tests - I'll investigate that later.

Still waiting on bug 384339 before we can submit this.
Comment 18 Mark Banner (:standard8) 2009-06-09 05:50:38 PDT
Created attachment 382282 [details] [diff] [review]
Updated patch v2

This patch has an added gc() call before we do the final shutdown stage - this lets any globals that have been freed be cycle collected and hence give cleaner shutdown leak figures.
Comment 19 Mark Banner (:standard8) 2009-06-13 02:58:55 PDT
Now that bug 384339 has review and should be checked in soon, I'm going to take this bug and drive the patch in once that lands.
Comment 20 Mark Banner (:standard8) 2009-06-13 13:05:19 PDT
*** Bug 462955 has been marked as a duplicate of this bug. ***
Comment 21 Mark Banner (:standard8) 2009-06-13 13:12:45 PDT
Created attachment 383121 [details] [diff] [review]
[checked in] Part 1 - shutdown resource and fix addrbook, bayesian-spam-filter, import and mime tests

I've decided to land the fixes for this in several parts. Probably two, but maybe three.

First I'd like to land this part - it provides a shutdown resource that is designed to shutdown mailnews cleanly for the xpcshell-tests.

It also hooks this into the tail files for addrbook, bayesian-spam-filter, import and mime tests. These directories have no leaks reported on my mac.

I'd like to land this and have it run for a day or so, so that we can check there aren't any intermittent failures caused by the initial clean up at least.


The second part will be to land the tail files for the remaining directories hooking them up to the shutdown code as well. Some of these directories do have reported leaks. I've raised bug 498147 to track these reported leaks, and I'll raise individual bugs on the leaks when I do this second part. Note: I believe most of them are mainly issues with the tests, rather than issues in mailnews.

Depending on what success I have with the gloda, landing the tail file for gloda may be a third part. I think I have seen random hanging with the shutdown and the gloda tests, but I need to test some more to be sure.
Comment 22 Mark Banner (:standard8) 2009-06-14 07:04:04 PDT
Comment on attachment 383121 [details] [diff] [review]
[checked in] Part 1 - shutdown resource and fix addrbook, bayesian-spam-filter, import and mime tests

Checked in: http://hg.mozilla.org/comm-central/rev/e0c7adaa855f

There are some intermittent address book leaks on autocomplete - I had seen them occasionally on my build but forgotten before pushing this patch. I've raise bug 498212 to cover those leaks.

The SeaMonkey boxes haven't cycled yet. I'll check in on them later.
Comment 23 Mark Banner (:standard8) 2009-06-15 06:26:03 PDT
Created attachment 383258 [details] [diff] [review]
[checked in] Part 2 - base, compose, db/msgdb, imap, local, news with some cleanup

Time for part 2. This adds a tail file for the remaining test directories apart from gloda - I still need to do a bit more testing on that locally.

There are two additional test fixes in here. The test_enumerator_cleanup.js change delays cleaning up until after OnStopCopy has been allowed to return and complete; this allows more items to be released. The test_nsIMsgFolderListenerLocal.js change frees up some global variables before the test finishes (this reduces but doesn't totally fix the reported leaks).

I've already filed bugs for the leaks that are remaining after this patch. Some of these are generic so we don't have to keep on adding to them. See https://bugzilla.mozilla.org/showdependencytree.cgi?id=498147&hide_resolved=1 for the list of bugs.
Comment 24 Mark Banner (:standard8) 2009-06-16 03:45:01 PDT
Comment on attachment 383258 [details] [diff] [review]
[checked in] Part 2 - base, compose, db/msgdb, imap, local, news with some cleanup

Checked in: http://hg.mozilla.org/comm-central/rev/62a62b6ca564

Leak reports are all looking roughly like what I was expecting them to look like.
Comment 25 Mark Banner (:standard8) 2009-06-20 14:07:02 PDT
Created attachment 384246 [details] [diff] [review]
[checked in] Part 3 - gloda

I've now done some more testing with the gloda test directory having a tail file. My previous thoughts about hanging are unfounded, but I've noticed that test_index_messages.js and test_index_messages_in_bulk.js can sometimes pause with 0 CPU usage for a minute or two and then will continue.

With this patch in, the only reported leaks are in test_mime_emitter.js - with the patch on bug 467356, the reported leaks are then reduced to 4 bytes of an nsVoidArray. I've already got a potential candidate for that and I'll deal with it in a separate bug.
Comment 26 Mark Banner (:standard8) 2009-06-21 09:59:44 PDT
Comment on attachment 384246 [details] [diff] [review]
[checked in] Part 3 - gloda

Checked in: http://hg.mozilla.org/comm-central/rev/356486fdbaeb
Comment 27 Mark Banner (:standard8) 2009-06-21 10:00:40 PDT
All tail files checked in, I'll raise a bug for the nsVoidArray leak once the tinderboxes have cycled and I've checked all platforms.
Comment 28 Serge Gautherie (:sgautherie) 2009-07-12 09:55:22 PDT
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090711 SeaMonkey/2.1a1pre] (home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/47bfcd275ede
 +http://hg.mozilla.org/comm-central/rev/291cbe3374b9)

V.Fixed

Note You need to log in before you can comment on or make changes to this bug.