I can get into a deadlocked state, where Mozilla hangs, when closing a 3-pane window in the background. Testcase and steps coming up.
To reproduce the deadlock:
1. Mail yourself a link to that attachment, like <http://bugzilla.mozilla.org/attachment.cgi?id=122164&action=view>.
2. In the mail 3-pane, load that message and click on the link.
3. Click on the browser window to bring it to the front. The window is showing an alert (as a sheet, on Mac).
4. Click the close box (the red button) of the 3-pane window, which is now in the background. Note that Mac OS X allows you to close windows that are not in the foreground.
5. Note that the browser deadlocks.
Created attachment 122166
Sampler data showing what the threads are doing

So the main thread is in a modal event loop for the alert() and, on that stack, is tearing down the 3-pane window. During teardown, nsMsgAccountManager::cleanupOnExit() tries to pump the event queue of the current thread while cleaning up the inbox and emptying the trash. Presumably, because we're in a nested event loop, that's pumping the wrong event queue, and no progress is made.
Would this be related to (same root cause as) bug 200006?
Not really, although it is another modal event loop issue.
I've stepped through this in the debugger. Note this doesn't normally hang for me. I had to force cleanupInboxOnExit=1 in nsMsgAccountManager::cleanupOnExit to see the problem. But if I force that, I see this problem on every platform I've tried so far: OSX and WinXP. So it seems like it's probably a cross-platform thing. (When is cleanupInboxOnExit normally true?)

At the time the 3-pane mail window is closed, there's a stack of two event loops running on the main thread: the original one and the one that services the alert window. nsMsgAccountManager is pumping the most recent queue (created by the alert window) and waiting for GetCleanupInboxInProgress() to go to false. But that never happens until the older queue is exposed for event processing, which can never happen while the alert window is up. Not unless you do awful things to it with a debugger, anyway.

On my first foray into the older queue, there happened to be a single event already waiting. But that event (whatever it was; I didn't check) wasn't enough. It was only after my second (third?) debugger-driven trip into the older queue that I found additional events which, when processed, caused inbox processing to finish up. So it seems mailbox cleanup is posting events on the older queue. Usually this happens when someone caches a queue for later use. That may be an error. It's certainly problematic in this case. This wants looking into.

Another solution, as Simon points out, would be to disable closing a background window in this case. I'm afraid that effectively this means ignoring the close widget on any window at any time that a modal window is showing, even (especially) if that modal window is hosted by some other, seemingly unrelated window. That seems like a harsh usability issue, and I'd like to see the cached event queue side of things looked into.
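To make the mechanism above concrete, here is a toy model in Python. It is not Mozilla code: EventQueueStack, pump_until and the state dictionary are hypothetical stand-ins for the thread's nested event queues, the pumping loop in cleanupOnExit, and GetCleanupInboxInProgress(). The spin limit is also an assumption, added only so the sketch terminates where the real loop would hang.

```python
from collections import deque


class EventQueueStack:
    """Hypothetical stand-in for one thread's stack of nested event queues."""

    def __init__(self):
        self.queues = [deque()]  # index 0 is the original (oldest) queue

    def push(self):
        """Open a nested queue, as a modal alert would."""
        self.queues.append(deque())
        return self.queues[-1]

    def current(self):
        """Return the most recently pushed queue."""
        return self.queues[-1]


def pump_until(queue, done, max_spins=1000):
    """Process events from `queue` until done() is true.

    The real loop would spin forever; max_spins bounds it so the
    sketch can report the hang instead of reproducing it.
    """
    for _ in range(max_spins):
        if done():
            return True
        if queue:
            event = queue.popleft()
            event()
    return done()


state = {"cleanup_in_progress": True}
stack = EventQueueStack()

cached = stack.current()  # a queue reference cached before the alert appears
stack.push()              # the alert opens a nested modal event loop

# The event that would finish inbox cleanup lands on the cached, older queue.
cached.append(lambda: state.update(cleanup_in_progress=False))

# Cleanup pumps only the newest queue, so the flag never clears.
finished = pump_until(stack.current(), lambda: not state["cleanup_in_progress"])
print("cleanup finished:", finished)
print("events stranded on older queue:", len(cached))
```

Pumping the older queue by hand (pump_until(cached, ...)) lets the cleanup finish, which corresponds to the debugger-driven trips into the older queue described above.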
Building on the above comment, the code that caches the old event queue and causes the deadlock is the set of proxies set up in nsImapProtocol::SetupSinkProxy. It strikes me that proxies should in general post to the currently active queue, not whatever queue was active when they were constructed. Sadly, there's no support in the event queue code for that. However, when I hacked up a build that did that anyway, it fixed this deadlock. It would probably be a good thing to put some version of this change into the codebase, but only when we're ready to see something really scary.
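The difference between the two proxy behaviours can be sketched with another toy model in Python. This is not the hacked build mentioned above, and none of these names exist in Mozilla: CachingProxy, CurrentQueueProxy and run_scenario are hypothetical, and the queue stack is just a list of deques. The only point is where the posting target gets resolved: at construction time or at post time.

```python
from collections import deque


class CachingProxy:
    """Hypothetical proxy that remembers the queue active at construction."""

    def __init__(self, queues):
        self._target = queues[-1]

    def post(self, event):
        self._target.append(event)


class CurrentQueueProxy:
    """Hypothetical proxy that looks up the active queue on every post."""

    def __init__(self, queues):
        self._queues = queues

    def post(self, event):
        self._queues[-1].append(event)


def run_scenario(proxy_cls, max_spins=100):
    """Return True if the posted event is processed by the nested loop."""
    queues = [deque()]         # original event queue
    proxy = proxy_cls(queues)  # proxy is set up before the alert exists
    queues.append(deque())     # modal alert pushes a nested queue

    done = []
    proxy.post(lambda: done.append(True))

    # Only the newest queue is pumped while the alert is up.
    # The bound is an assumption so the sketch terminates.
    for _ in range(max_spins):
        if done:
            return True
        if queues[-1]:
            queues[-1].popleft()()
    return bool(done)


print("caching proxy makes progress:", run_scenario(CachingProxy))
print("current-queue proxy makes progress:", run_scenario(CurrentQueueProxy))
```

In this model only the proxy that resolves its target at post time gets its event processed while the nested loop is running. Whether that is safe in general for real proxies is a separate question that the model does not address.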
Bug 170962 apparently is about this same problem, under Windows 2000. Simon, Dan -- if you agree, I'd recommend duping that one to this, since this has the more advanced troubleshooting in place, and changing this bug's platform/os to All/All.
*** Bug 170962 has been marked as a duplicate of this bug. ***
Can you test this on Mac? Windows WFM: I cannot reproduce bug 150006 or bug 170962 using their steps on SM trunk, build Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9b4pre) Gecko/2008020803 SeaMonkey/2.0a1pre
Severity: normal → critical
QA Contact: esther
I don't seem to be able to reproduce comment #3 on trunk or branch. CC:ing Karsten to let him have a try as well...
I can't reproduce either. SM OS 10.4.11 debug trunk and branch. Closing as WFM, then.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → WORKSFORME