Closed Bug 438922 Opened 16 years ago Closed 15 years ago

Ensure that mailnews unit tests shutdown observers gracefully

Categories

(MailNews Core :: Backend, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
Thunderbird 3.0b3

People

(Reporter: hwaara, Assigned: standard8)

References

Details

(Keywords: memory-leak)

Attachments

(4 files, 5 obsolete files)

When working on my import test, I noticed this on shutdown of a very simple test:

"###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!"

The problem in this case is that observers of the xpcom-shutdown notification are trigger JS components to do things. But at this point, XPCOM is half-shutdown -- the JS window is gone, for example, hence the assertion.

To make our testcases sane, I want to move the Account Manager so it doesn't do its Shutdown() on xpcom-shutdown, but earlier. But we might need to audit other observers at the same time, to ensure nothing breaks.
Here's a full list of all observers and what they listen for (I grepped for AddObserver, and looked at each instance in mailnews/ and mail):

nsMsgBiffManager 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsMapiSupport 
  profile-after-change, 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsAbDirectoryDataSource 
  profile-do-change, 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsMsgAccountManager 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID, 
  quit-application, 
  ABOUT_TO_GO_OFFLINE_TOPIC,
  session-logout, 
  profile-before-change 

nsMsgRDFDataSource
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

gFolderCharsetObserver (nsFolderCharsetObserver)
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsPalmSyncSupport
  NS_XPCOM_SHUTDOWN_OBSERVER_ID, 
  em-action-requested, 
  profile-after-change

nsIMAPHostSessionList 
  profile-before-change, 
  NS_XPCOM_SHUTDOWN_OBSERVER_ID

nsMsgShutdownService
  quit-application-requested

nsMsgComposeService
  quit-application, 
  profile-do-change
I started analyzing every one of these observers, and what they do when they receive the xpcom-shutdown notification. Luckily, I didn't need to analyze all of them before I found a simple fix for the account manager.

* The account manager runs its own ::Shutdown() on two notifications: profile-before-change and the xpcom-shutdown. 
* If it has already shutdown, ::Shutdown() returns early and does nothing. 
* As one can see on http://developer.mozilla.org/en/docs/Observer_Notifications profile-before-change is always fired before xpcom-shutdown, so listening for xpcom-shutdown is redundant!

The reason we hit the case where the account manager shuts down too late (xpcom-shutdown) in only the testcases, is because our test environment doesn't issue any profile-* notifications.

So my proposed way to fix this is like this:

* Make our test environment issue profile-* notifications, both when it's loaded and is shutting down (in a tail.js file)
* Change nsMsgAccountManager to never listen on xpcom-shutdown since it 1) doesn't work and 2) is redundant

As a bonus, I found out that nsMsgBiffManager also redundantly listens to xpcom-shutdown. Its shutdown is already ensured by the nsMsgAccountManager on profile-before-change (which is *before* the notification it's observing).
Hakan, that all sounds reasonable (though Standard8 will have to comment on the test environment issue).
This patch removes the biffmanager as an shutdown-observer. nsMsgAccountManager::Shutdown() already calls nsMsgBiffManager::Shutdown() before.
Assignee: nobody → hwaara
Status: NEW → ASSIGNED
Attachment #324867 - Flags: review?
Attachment #324867 - Flags: review? → review?(bienvenu)
Comment on attachment 324867 [details] [diff] [review]
Let the biff manager be shutdown by the account manager (checked in)

thx, makes sense.
Attachment #324867 - Flags: review?(bienvenu) → review+
Attachment #324867 - Attachment description: Let the biff manager be shutdown by the account manager → Let the biff manager be shutdown by the account manager (checked in)
(In reply to comment #2)
> The reason we hit the case where the account manager shuts down too late
> (xpcom-shutdown) in only the testcases, is because our test environment doesn't
> issue any profile-* notifications.

That's right, what I think is worth making clear here is once you get to the xpcom-shutdown stage you can't createInstance/getService. If subFolders of an account haven't been initialised, then we'll try and initialise them on shutdown as part of writing the folder cache. Not ideal, but given we don't hit this on normal app shutdown (even for new accounts), I guess subFolders are normally initialised and its not an issue.

> So my proposed way to fix this is like this:
> 
> * Make our test environment issue profile-* notifications, both when it's
> loaded and is shutting down (in a tail.js file)

Although I'm not sure I really want to, it does sound like a good idea to issue profile-* notifications when we exit xpcshell.

I think for startup it may be better to provide utility functions for profile-* notifications. Looking at where we currently use profile-* notifications on startup, I don't think there's any that we really need to do by default.
This patch makes us issue all the shutdown notifications that Thunderbird-the-app does, and in the right order. I've created a new testcase (test_createMailAccount) that crashes without the fix (which is why I started work on this bug in the first place).

All other tests run fine, except for one:

test_sendMailMessage.js (in compose) has a problem with shutting down NSS when these notifications are issued. The good news is that since we're now exercising more code than before on shutdown of all of these tests, this actually exposes a problem that hasn't been seen before!  

I'll file a bug separate bug on it, and have commented out the shutdown() tail_compose.js for now due to it.
Attachment #325126 - Flags: review?(bugzilla)
Summary: Audit mailnews shutdown observers → Ensure that mailnews tests shutdown observers gracefully
Comment on attachment 325126 [details] [diff] [review]
Shutdown gracefully on end of all tests v1

diff --git a/mailnews/addrbook/test/unit/head_addrbook.js b/mailnews/addrbook/test/unit/head_addrbook.js
--- a/mailnews/addrbook/test/unit/head_addrbook.js
+++ b/mailnews/addrbook/test/unit/head_addrbook.js
@@ -1,5 +1,6 @@
 // Import the main scripts that mailnews tests need to set up and tear down
 do_import_script("mailnews/test/resources/mailDirService.js");
+do_import_script("mailnews/test/resources/mailTestUtils.js");

I'd prefer the shutdown code to be in a separate script file (mailnews/test/resource/mailShutdown.js?) imported straight into the tail file. Skip the shutdown function and call postShutdownNotifications() from the global part of the new file.

\ No newline at end of file

Please fix all of these to have newlines.

I had a look at the NSS shutdown problem as well. I tried running the test with check-interactive so that I could catch it in xcode, but the crash didn't occur. So maybe that's allowing the network to clean up properly somehow.

I think we need to determine what is happening wrt the NSS case before we commit this patch. I'm concerned that otherwise we'll commit our test case implementers to extra work they really don't want to do.
Attachment #325126 - Flags: review?(bugzilla) → review-
(In reply to comment #8)
> I had a look at the NSS shutdown problem as well. I tried running the test with
> check-interactive so that I could catch it in xcode, but the crash didn't
> occur. So maybe that's allowing the network to clean up properly somehow.

Note that you can't use check-interactive to test this way, because check-interactive will run the tail stuff *before* dropping into the console. So that's why you're not seeing any errors, and I guess you're lucky that nothing bad happens...
Attached patch Graceful test shutdown v2 (obsolete) — Splinter Review
Turns out the NSS shutdown assertion had a really simple fix. It expected a profile-change-net-teardown notification before the profile-change-teardown. This is what makes it shut down network stuff. This is also issued by XRE live.

This patch includes your other two nits: newlines at the end of files, and a new "mailShutdown.js" file for this stuff.
Attachment #325126 - Attachment is obsolete: true
Attachment #326729 - Flags: review?(bugzilla)
I'm not real happy to see functionality added back to the tail, when we worked earlier to remove that. Bug 384339 has shown little forward motion lately, so adding stuff to the tail breaks check-interactive for us again, which makes it more difficult to setup a debug environment for unit tests.

Let me propose a compromise. For now, add function postShutdownNotifications() in the head, and add the call to postShutdownNotifications directly to tests that need it. Make sure that postShutdownNotifications() can be called twice. Open a bug, blocked by 384339, that adds the postShutdownNotifications() call to the tail, so that at some future time we can quit adding that to the tests.
(In reply to comment #11)

I agree that breaking check-interactive would suck, and I guess explicitly adding the shutdown call to all run_test()s isn't that bad (yet) since there are only 20-ish testcases so far.

About your proposed solution: I'm not sure how or why you want the shutdown to be callable twice? The whole point of it is that is it is saying to core "we're shutting down for good now, please do whatever cleanup is necessary".
Blocks: 442026
(In reply to comment #12)

> About your proposed solution: I'm not sure how or why you want the shutdown to
> be callable twice? The whole point of it is that is it is saying to core "we're
> shutting down for good now, please do whatever cleanup is necessary".
> 
All I mean by callable twice is that you don't need to remove any existing calls to shutdown procedures, in the tests themselves, when in the future you add the same call in the tail. At that point, shutdown will be called twice. Make sure that doesn't force you to change all existing tests to remove the now redundant shutdown call.
(In reply to comment #13)
> (In reply to comment #12)
> 
> > About your proposed solution: I'm not sure how or why you want the shutdown to
> > be callable twice? The whole point of it is that is it is saying to core "we're
> > shutting down for good now, please do whatever cleanup is necessary".
> > 
> All I mean by callable twice is that you don't need to remove any existing
> calls to shutdown procedures, in the tests themselves, when in the future you
> add the same call in the tail. At that point, shutdown will be called twice.
> Make sure that doesn't force you to change all existing tests to remove the now
> redundant shutdown call.

No, when we solve this the right way, we're gonna have to remove the extra shutdown calls I don't see a way to make the shutdown work two times in a row; it doesn't make any sense. It's just the price we have to pay for this workaround.
Depends on: 384339
Comment on attachment 326729 [details] [diff] [review]
Graceful test shutdown v2

So I think this patch is reasonable, but we need to fix bug 384339 first, so therefore cancelling review for the time being.
Attachment #326729 - Flags: review?(bugzilla)
Product: Core → MailNews Core
Summary: Ensure that mailnews tests shutdown observers gracefully → Ensure that mailnews unit tests shutdown observers gracefully
Attached patch Updated partial patch (obsolete) — Splinter Review
I was taking a look at the patch on bug 384339 earlier and did an minor (incomplete) update of this patch - this version ensures we release the dir provider, though its not very pretty. Just parking it here for when bug 384339 gets resolved.
Attached patch Updated patch (obsolete) — Splinter Review
This does all the tail files, and so we get a much clearer picture, although at the moment there is a crash on shutdown of one of the news tests - I'll investigate that later.

Still waiting on bug 384339 before we can submit this.
Attachment #373655 - Attachment is obsolete: true
Attachment #326729 - Attachment is obsolete: true
Attached patch Updated patch v2 (obsolete) — Splinter Review
This patch has an added gc() call before we do the final shutdown stage - this lets any globals that have been freed be cycle collected and hence give cleaner shutdown leak figures.
Attachment #374909 - Attachment is obsolete: true
Now that bug 384339 has review and should be checked in soon, I'm going to take this bug and drive the patch in once that lands.
Assignee: hwaara → bugzilla
OS: Mac OS X → All
Hardware: x86 → All
Target Milestone: --- → Thunderbird 3.0b4
Blocks: mailnewsleak
I've decided to land the fixes for this in several parts. Probably two, but maybe three.

First I'd like to land this part - it provides a shutdown resource that is designed to shutdown mailnews cleanly for the xpcshell-tests.

It also hooks this into the tail files for addrbook, bayesian-spam-filter, import and mime tests. These directories have no leaks reported on my mac.

I'd like to land this and have it run for a day or so, so that we can check there aren't any intermittent failures caused by the initial clean up at least.


The second part will be to land the tail files for the remaining directories hooking them up to the shutdown code as well. Some of these directories do have reported leaks. I've raised bug 498147 to track these reported leaks, and I'll raise individual bugs on the leaks when I do this second part. Note: I believe most of them are mainly issues with the tests, rather than issues in mailnews.

Depending on what success I have with the gloda, landing the tail file for gloda may be a third part. I think I have seen random hanging with the shutdown and the gloda tests, but I need to test some more to be sure.
Attachment #382282 - Attachment is obsolete: true
Attachment #383121 - Flags: review?(bienvenu)
Attachment #383121 - Flags: review?(bienvenu) → review+
Comment on attachment 383121 [details] [diff] [review]
[checked in] Part 1 - shutdown resource and fix addrbook, bayesian-spam-filter, import and mime tests

Checked in: http://hg.mozilla.org/comm-central/rev/e0c7adaa855f

There are some intermittent address book leaks on autocomplete - I had seen them occasionally on my build but forgotten before pushing this patch. I've raise bug 498212 to cover those leaks.

The SeaMonkey boxes haven't cycled yet. I'll check in on them later.
Attachment #383121 - Attachment description: Part 1 - shutdown resource and fix addrbook, bayesian-spam-filter, import and mime tests → [checked in] Part 1 - shutdown resource and fix addrbook, bayesian-spam-filter, import and mime tests
Blocks: 498302
Blocks: 498303
Blocks: 498304
Blocks: 498307
Blocks: 498321
Blocks: 498323
No longer blocks: 498212
Time for part 2. This adds a tail file for the remaining test directories apart from gloda - I still need to do a bit more testing on that locally.

There are two additional test fixes in here. The test_enumerator_cleanup.js change delays cleaning up until after OnStopCopy has been allowed to return and complete; this allows more items to be released. The test_nsIMsgFolderListenerLocal.js change frees up some global variables before the test finishes (this reduces but doesn't totally fix the reported leaks).

I've already filed bugs for the leaks that are remaining after this patch. Some of these are generic so we don't have to keep on adding to them. See https://bugzilla.mozilla.org/showdependencytree.cgi?id=498147&hide_resolved=1 for the list of bugs.
Attachment #383258 - Flags: review?(bienvenu)
Attachment #383258 - Flags: review?(bienvenu) → review+
Comment on attachment 383258 [details] [diff] [review]
[checked in] Part 2 - base, compose, db/msgdb, imap, local, news with some cleanup

Checked in: http://hg.mozilla.org/comm-central/rev/62a62b6ca564

Leak reports are all looking roughly like what I was expecting them to look like.
Attachment #383258 - Attachment description: Part 2 - base, compose, db/msgdb, imap, local, news with some cleanup → [checked in] Part 2 - base, compose, db/msgdb, imap, local, news with some cleanup
No longer blocks: 498302, 498303, 498304, 498307, 498321, 498323
Flags: in-testsuite+
I've now done some more testing with the gloda test directory having a tail file. My previous thoughts about hanging are unfounded, but I've noticed that test_index_messages.js and test_index_messages_in_bulk.js can sometimes pause with 0 CPU usage for a minute or two and then will continue.

With this patch in, the only reported leaks are in test_mime_emitter.js - with the patch on bug 467356, the reported leaks are then reduced to 4 bytes of an nsVoidArray. I've already got a potential candidate for that and I'll deal with it in a separate bug.
Attachment #384246 - Flags: review?(bienvenu)
Attachment #384246 - Flags: review?(bienvenu) → review+
Comment on attachment 384246 [details] [diff] [review]
[checked in] Part 3 - gloda

Checked in: http://hg.mozilla.org/comm-central/rev/356486fdbaeb
Attachment #384246 - Attachment description: Part 3 - gloda → [checked in] Part 3 - gloda
All tail files checked in, I'll raise a bug for the nsVoidArray leak once the tinderboxes have cycled and I've checked all platforms.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Target Milestone: Thunderbird 3.0b4 → Thunderbird 3.0b3
No longer blocks: 465376
No longer blocks: 462955
Keywords: mlk
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090711 SeaMonkey/2.1a1pre] (home, optim default) (W2Ksp4)
(http://hg.mozilla.org/mozilla-central/rev/47bfcd275ede
 +http://hg.mozilla.org/comm-central/rev/291cbe3374b9)

V.Fixed
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: