Closed Bug 204006 Opened 21 years ago Closed 16 years ago

Deadlock when closing mail 3-pane if a dialog is up

Categories

(SeaMonkey :: MailNews: Message Display, defect)

PowerPC
macOS
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: sfraser_bugs, Unassigned)

References

Details

(Keywords: hang)

Attachments

(2 files)

I can get into a deadlocked state, where Mozilla hangs, when closing a 3-pane
window in the background. Testcase and steps coming up.
To reproduce the deadlock:
1. Mail yourself a link to that attachment, such as
   <http://bugzilla.mozilla.org/attachment.cgi?id=122164&action=view>
2. In the mail 3-pane, load that message, and click on the link.
3. Click on the browser window to bring it to the front. The window is showing
   an alert (as a sheet, on Mac).
4. Click the close box (the red button) of the 3-pane window, which is now in
   the background. Note that Mac OS X allows you to close windows that are not
   in the foreground.
5. Note that the browser deadlocks.
So the main thread is in a modal event loop for the alert() and, on that
stack, is tearing down the 3-pane window. During teardown,
nsMsgAccountManager::cleanupOnExit() tries to pump the event queue of the
current thread while cleaning up the inbox and emptying the trash. Presumably,
because we're in a nested event loop, that's pumping the wrong event queue, and
no progress is made.
Keywords: hang
Would this be related to (same root cause as) bug 200006?
Not really, although it is another modal event loop issue.
I've stepped through this in the debugger. Note this doesn't normally hang for
me. I had to force cleanupInboxOnExit=1 in nsMsgAccountManager::cleanupOnExit to
see the problem. But if I force that, I see this problem on every platform I've
tried so far: OSX and WinXP. So it seems like it's probably a cross-platform
thing. (When is cleanupInboxOnExit normally true?)

At the time the 3-pane mail window is closed, there's a stack of two event loops
running on the main thread: the original one and the one that services the alert
window. nsMsgAccountManager is pumping the most recent queue (created by the
alert window) and waiting for GetCleanupInboxInProgress() to go to false. But
that never happens until the older queue is exposed for event processing, which
can never happen while the alert window is up. Not unless you do awful things to
it with a debugger, anyway.
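
The situation described above can be modelled in a few lines. This is only a sketch under stated assumptions: `EventQueueStack` and `cleanup_finishes` are hypothetical names invented for illustration and are not the real Mozilla event queue API, and the pump loop is bounded by `max_spins` so that the model terminates instead of hanging the way the real code does.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the main thread's stack of event queues.
using Event = std::function<void()>;

struct EventQueueStack {
    std::vector<std::deque<Event>> queues;  // back() is the newest queue

    void push_queue() { queues.emplace_back(); }

    // Run every event currently waiting in the queue at `index`.
    void pump(std::size_t index) {
        auto& q = queues[index];
        while (!q.empty()) {
            Event e = std::move(q.front());
            q.pop_front();
            e();
        }
    }
};

// Returns true if the "cleanup in progress" flag clears within `max_spins`
// pumps of the queue selected by `pump_index`.
bool cleanup_finishes(std::size_t pump_index, int max_spins) {
    EventQueueStack stack;
    stack.push_queue();  // index 0: the original main-thread queue
    stack.push_queue();  // index 1: the queue created for the modal alert

    bool cleanup_in_progress = true;
    // The event that completes cleanup is posted to the OLDER queue.
    stack.queues[0].push_back([&] { cleanup_in_progress = false; });

    for (int i = 0; i < max_spins && cleanup_in_progress; ++i) {
        stack.pump(pump_index);
    }
    return !cleanup_in_progress;
}
```

Calling `cleanup_finishes(1, 1000)` pumps only the alert's queue and never sees the completion event, while `cleanup_finishes(0, 1000)` pumps the older queue and finishes on the first pass. That matches what forcing the older queue to run from the debugger showed.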

On my first foray into the older queue there happened to be a single event
already waiting. But that event (whatever it was; I didn't check) wasn't enough.
It was only after my second (third?) debugger-driven trip into the older queue
that I found additional events which, when processed, caused inbox processing to
finish up. So it seems mailbox cleanup is posting events on the older queue.
Usually this happens when someone caches a queue for later use. That may be an
error. It's certainly problematic in this case. This wants looking into.

Another solution, as Simon points out, would be to disable closing a background
window in this case. I'm afraid that effectively this means ignoring the close
widget on any window at any time that a modal window is showing, even
(especially) if that modal window is hosted by some other, seemingly unrelated
window.

That seems like a harsh usability issue and I'd like to see the cached event
queue side of things looked into.
Building on the above comment, the code which is caching the old event queue and
causing the deadlock is the set of proxies created in
nsImapProtocol::SetupSinkProxy. It strikes me that proxies should in general
post to the current active queue, not whatever queue was active when they were
constructed. Sadly there's no support in the event queue code for that.

However, when I hacked up a build that did that anyway, it fixed this deadlock.
It would probably be a good thing to put some version of this change into the
codebase, but only when we're ready to see something really scary.
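
To illustrate the difference between the two proxy behaviours, here is a minimal sketch. It is not the code in nsImapProtocol::SetupSinkProxy or the hacked build; `QueueStack`, `CachingProxy`, `CurrentQueueProxy` and `cleanup_finishes_with` are hypothetical names made up for this example, and the whole thing runs on one thread with plain containers.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

using Event = std::function<void()>;
using Queue = std::deque<Event>;

// Hypothetical model of a thread's stack of event queues.
struct QueueStack {
    std::vector<Queue> queues;  // back() is the currently active queue
};

// Remembers whichever queue was active when the proxy was constructed.
struct CachingProxy {
    QueueStack& stack;
    std::size_t target;  // index fixed at construction time
    explicit CachingProxy(QueueStack& s)
        : stack(s), target(s.queues.size() - 1) {}
    void post(Event e) { stack.queues[target].push_back(std::move(e)); }
};

// Looks up the active queue each time an event is posted.
struct CurrentQueueProxy {
    QueueStack& stack;
    explicit CurrentQueueProxy(QueueStack& s) : stack(s) {}
    void post(Event e) { stack.queues.back().push_back(std::move(e)); }
};

// Builds the proxy before a nested queue is pushed, posts the completion
// event through it, then pumps only the newest queue once.
template <typename Proxy>
bool cleanup_finishes_with() {
    QueueStack stack;
    stack.queues.emplace_back();  // original queue
    Proxy proxy(stack);           // proxy set up before the alert appears
    stack.queues.emplace_back();  // modal alert pushes a nested queue

    bool in_progress = true;
    proxy.post([&] { in_progress = false; });

    Queue& newest = stack.queues.back();
    while (!newest.empty()) {
        Event e = std::move(newest.front());
        newest.pop_front();
        e();
    }
    return !in_progress;
}
```

With `CachingProxy` the completion event lands on the older queue and pumping the newest queue does nothing; with `CurrentQueueProxy` it lands on the queue that is actually being pumped. The sketch does not attempt to model what happens to events still sitting in a nested queue when that queue goes away.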
Product: Browser → Seamonkey
Bug 170962 apparently is about this same problem, under Windows 2000.  
Simon, Dan -- if you agree, I'd recommend duping that one to this, since this 
has the more advanced troubleshooting in place, and changing this bug's 
platform/os to All/All.
*** Bug 170962 has been marked as a duplicate of this bug. ***
Assignee: sspitzer → mail
Can you test this on Mac?

Windows WFM: I cannot reproduce bug 150006 or bug 170962 using their steps on SM trunk, Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9b4pre) Gecko/2008020803 SeaMonkey/2.0a1pre
Severity: normal → critical
QA Contact: esther
I don't seem to be able to reproduce comment #3 on trunk or branch. CC:ing Karsten to let him have a try as well...
I can't reproduce either.
SM OS 10.4.11 debug trunk and branch.
Closing as WFM, then.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME