Last Comment Bug 711787 - Deadlock in nsImapProtocol::CanHandleUrl caused by calling nsISocketTransport::IsAlive off the socket transport thread
: Deadlock in nsImapProtocol::CanHandleUrl caused by calling nsISocketTransport...
Product: MailNews Core
Classification: Components
Component: Networking: IMAP (show other bugs)
: unspecified
: x86 All
-- critical (vote)
: Thunderbird 12.0
Assigned To: David :Bienvenu
: 715090 (view as bug list)
Depends on: 711786 712363 733731
Blocks: 713253
  Show dependency treegraph
Reported: 2011-12-17 19:32 PST by Brian Smith (:briansmith, :bsmith, use NEEDINFO?)
Modified: 2012-03-07 06:01 PST (History)
16 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---

Making nsSocketTransport::IsAlive concurent-robust for SSL sockets (1.33 KB, patch)
2011-12-19 07:55 PST, Honza Bambas (:mayhemer)
no flags Details | Diff | Splinter Review
move isAlive to imap thread (2.34 KB, patch)
2011-12-19 09:48 PST, David :Bienvenu
standard8: review+
standard8: approval‑comm‑aurora+
Details | Diff | Splinter Review

Description User image Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2011-12-17 19:32:47 PST
+++ This bug was initially created as a clone of Bug #704984 +++

See bug 704984 comment 25 and other comments in that bug.

See bug 711786 for discussion on whether it is allowed to call nsISocketTransport::IsAlive off the socket transport thread.

I looked at the code in nsIMAPProtocol that calls IsAlive(). It is very similar, at least in concept, to the HTTP connection manager. There are many IMAP connections open to the same server and this code is looking for one that is alive and that isn't busy doing something else, to send the IMAP "request" to.

In HTTP, this happens all in the socket transport thread; in IMAP, this happens on the main thread.

To me, it seems very wrong that the IMAP code is executing off the socket transport thread. At the same time, I don't have any clue about how much effort it would be to make the IMAP execute on the socket transport thread.
Comment 1 User image Honza Bambas (:mayhemer) 2011-12-18 10:05:58 PST
I don't think we can do any fixes in IMAP protocol implementation like this in 2 days before Aurora merge.

We need a quick fix.

I think the quick fix would be to re-enable nsSocketTransport::IsAlive() be accessible on the main thread too (bug 711786).  I'll comment on that bug.
Comment 2 User image David :Bienvenu 2011-12-19 07:44:08 PST
I can try to fix this in the imap code - I'm not convinced that changing nsSocketTransport::IsAlive is easier or safer at this point.
Comment 3 User image Honza Bambas (:mayhemer) 2011-12-19 07:55:10 PST
Created attachment 582828 [details] [diff] [review]
Making nsSocketTransport::IsAlive concurent-robust for SSL sockets

For now leaving w/o an r?.  If David comes with a better fix for IMAP, we may obsolete this patch.

However, this patch is a more general solution that could fix any other inappropriate calls from any other protocol.
Comment 4 User image David :Bienvenu 2011-12-19 08:13:32 PST
I've got a potential fix coded and am rebuilding so I can test it.
Comment 5 User image David :Bienvenu 2011-12-19 09:48:46 PST
Created attachment 582862 [details] [diff] [review]
move isAlive to imap thread

This is what I was thinking for the imap code - it moves the check for isAlive to the imap thread, and retries the url if that check fails. This means we detect the socket failure later, which is sub-optimal in the case of all the cached connections going bad at once - in that case, we try the url on each of the cached connections until we find out that each is bad, and then we open a new connection. But in theory, it should work. It's a bit hard to test because it's hard to make all the cached connections to go bad at once.
Comment 6 User image David :Bienvenu 2011-12-19 10:38:31 PST
I've just been running with a breakpoint set in the retry code, and I've hit it a couple times and the retry seemed to work, so this patch works at least somewhat.
Comment 7 User image Honza Bambas (:mayhemer) 2011-12-19 10:54:22 PST
Comment on attachment 582862 [details] [diff] [review]
move isAlive to imap thread

For anyone interested, here soon (an hour or two up) will be a try build containing this patch, so check if it helps fixing the deadlock please:

(For reference, the try run:
Comment 8 User image Mark Banner (:standard8) 2012-01-04 04:06:28 PST
Comment on attachment 582862 [details] [diff] [review]
move isAlive to imap thread

Ok, lets get this out and give it a try.
Comment 9 User image David :Bienvenu 2012-01-04 08:44:04 PST pushed to trunk. If NSS/Necko decides to keep the ability to call isAlive from the UI thread, then we could think about reverting this, but the deadlock makes the trunk fairly unusable for me.
Comment 10 User image Mark Banner (:standard8) 2012-01-17 14:03:52 PST
Checked into aurora:
Comment 11 User image David :Bienvenu 2012-01-20 07:22:04 PST
*** Bug 715090 has been marked as a duplicate of this bug. ***
Comment 12 User image Tuukka Tolvanen (sp3000) 2012-03-07 06:01:22 PST
this seems to hang me when switching folders, bug 733731

Note You need to log in before you can comment on or make changes to this bug.