
hang when switching folders on IMAP server

RESOLVED FIXED in Thunderbird 13.0

Status

MailNews Core
Networking: IMAP
--
critical
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: Tuukka Tolvanen (sp3000), Assigned: Bienvenu)

Tracking

({hang, regression})

Trunk
Thunderbird 13.0
x86_64
Linux
hang, regression

Thunderbird Tracking Flags

(thunderbird11- fixed, thunderbird12+ fixed)

Details

(Whiteboard: [gs][gssolved], URL)

Attachments

(2 attachments)

(Reporter)

Description

5 years ago
Frequently enough (about 1 time in 10), when I hit space or n and say yes to the prompt sheet to advance to the next folder with unread messages (or just click on a folder to switch to it), Thunderbird hangs. The server is Exchange over IMAP.

This has been hanging on central/central since sometime in January, I think. It's a bit hard to narrow down because reproducibility is somewhat random.

Actually, the STR (steps to reproduce) could be something like: read some messages, idle for a minute, switch to a folder with no unread messages, hit space, and say yes to advance. Regression range:

2012-01-04-03-00-26-comm-central good 0/4
http://hg.mozilla.org/mozilla-central/rev/200a8d1fb452
http://hg.mozilla.org/comm-central/rev/fd9f0ac2bcaf

2012-01-05-03-00-25-comm-central bad 2/2
http://hg.mozilla.org/mozilla-central/rev/4795500b7c1d
http://hg.mozilla.org/comm-central/rev/6ad18d15c741
(Reporter)

Comment 1

5 years ago
(gdb) bt
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007ffff7bca36f in _L_lock_1145 ()
   from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007ffff7bca2ba in __pthread_mutex_lock (mutex=0x7ffff6d48288)
    at pthread_mutex_lock.c:101
#3  0x00007ffff62876f9 in PR_Lock ()
   from /usr/lib/x86_64-linux-gnu/libnspr4.so
#4  0x00007ffff6287d15 in PR_EnterMonitor ()
   from /usr/lib/x86_64-linux-gnu/libnspr4.so
#5  0x00007ffff6278501 in PR_CEnterMonitor ()
   from /usr/lib/x86_64-linux-gnu/libnspr4.so
#6  0x00007ffff23b7983 in nsImapProtocol::PseudoInterruptMsgLoad (
    this=0x7fffc95e9000, aImapFolder=0x7fffcbab3440, 
    aMsgWindow=0x7fffd6ad4f70, interrupted=0x7ffffffef70f)
    at comm-central/mailnews/imap/src/nsImapProtocol.cpp:1279
#7  0x00007ffff2371c64 in nsImapIncomingServer::PseudoInterruptMsgLoad (
    this=0x7fffd6ac9be0, aImapFolder=0x7fffcbab3440, 
    aMsgWindow=0x7fffd6ad4f70, interrupted=0x7ffffffef70f)
    at comm-central/mailnews/imap/src/nsImapIncomingServer.cpp:2246
#8  0x00007ffff23eaac4 in nsImapService::GetMessageFromUrl (
    this=0x7fffd0c87e20, aImapUrl=0x7fffb6ef8c00, aImapAction=268435480, 
    aImapMailFolder=0x7fffcbab3440, aImapMessage=0x7fffcbab35f8, 
    aMsgWindow=0x7fffd6ad4f70, aDisplayConsumer=0x7fffd23b3908, 
    aConvertDataToText=false, aURL=0x0)
    at comm-central/mailnews/imap/src/nsImapService.cpp:1084
#9  0x00007ffff23ea641 in nsImapService::FetchMessage (this=0x7fffd0c87e20, 
    aImapUrl=0x7fffb6ef8c00, aImapAction=268435480, 
    aImapMailFolder=0x7fffcbab3440, aImapMessage=0x7fffcbab35f8, 
    aMsgWindow=0x7fffd6ad4f70, aDisplayConsumer=0x7fffd23b3908, 
    messageIdentifierList=..., aConvertDataToText=false, 
    aAdditionalHeader=..., aURL=0x0)
    at comm-central/mailnews/imap/src/nsImapService.cpp:1046
#10 0x00007ffff23e7728 in nsImapService::DisplayMessage (this=0x7fffd0c87e20, 
    aMessageURI=0x7fffb6b84e88 "imap-message://domain%5Cuser@host.company.com/folder#86789", aDisplayConsumer=0x7fffd23b3908, aMsgWindow=0x7fffd6ad4f70, 
    aUrlListener=0x0, aCharsetOverride=0x0, aURL=0x0)
    at comm-central/mailnews/imap/src/nsImapService.cpp:580

(not sure why I seem to be getting system nspr? anyhow, it hangs in m.o nightlies as well)
(Reporter)

Comment 2

5 years ago
reverting this seems to help:

changeset:   9107:75840841cc21
user:        David Bienvenu <bienvenu@nventure.com>
date:        Wed Jan 04 08:40:48 2012 -0800
summary:     fix deadlock in imap ssl calling isAlive, r=standard8, bug 711787
 mailnews/imap/src/nsImapProtocol.cpp |  27 ++++++++++++++-------------
 1 files changed, 14 insertions(+), 13 deletions(-)
Blocks: 711787
Severity: normal → critical
tracking-thunderbird11: --- → ?
Keywords: hang, regression
(Assignee)

Comment 3

5 years ago
Have you changed the number of connections to cache with that server from the default of 5?

Next time this happens, we need more of the main thread's stack and info from the other threads' stacks. In particular, there should be another thread doing something with the IMAP protocol object.

We can't revert the other fix since that hang was a lot worse. I've never seen this hang myself.
(Reporter)

Comment 4

5 years ago
max_cached_connections is 4; timeout 29
(Reporter)

Comment 5

5 years ago
Created attachment 603846 [details]
(gdb) thread apply all bt

this box has the default 5 cached connections fwiw
Is the following an effective recovery procedure when the problem occurs?
  Go to Work Offline mode, go back to Work Online mode, then open the folder again.

Do you have automatic new mail checking enabled? Do you have use of the IDLE command enabled?
Is the problem less frequent when IDLE command use is disabled?
(Assignee)

Comment 7

5 years ago
ah, thx. Ok, thread 30 is trying to retry a url, probably because the server or network dropped a connection (or perhaps a loadgroup was cancelled which killed the connection):

#5  0x00007ffff24c7ae5 in (anonymous namespace)::DispatchSyncRunnable (
    r=0x7fffc511b780)
    at comm-central/mailnews/imap/src/nsSyncRunnableHelpers.cpp:308
#6  0x00007ffff24cab48 in ImapServerSinkProxy::PrepareToRetryUrl (
    this=0x7fffc507d740, a1=0x7fffc51ec000, a2=0x7fffcb8febf0)
    at comm-central/mailnews/imap/src/nsSyncRunnableHelpers.cpp:464
#7  0x00007ffff2463e6b in nsImapProtocol::RetryUrl (this=0x7fffcd9b1800)
    at comm-central/mailnews/imap/src/nsImapProtocol.cpp:1876
#8  0x00007ffff2461e19 in nsImapProtocol::ImapThreadMainLoop (
    this=0x7fffcd9b1800)
    at comm-central/mailnews/imap/src/nsImapProtocol.cpp:1361
#9  0x00007ffff2460b31 in nsImapProtocol::Run (this=0x7fffcd9b1800)

and at just the wrong time, the UI thread is trying to find a connection it can use, which causes contention over a PR_CEnterMonitor(this) on the protocol object. I'll have to think about this. I think the imap thread shouldn't be holding onto the monitor when calling into the UI thread with the sink proxy runnable calls, which might not be too hard to fix.
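
The contention described above can be sketched in portable C++. This is only an illustration of the general pattern (release the monitor before making a blocking call into the UI thread), not the actual Thunderbird patch. `UiQueue`, `Protocol`, `RunScenario`, and all member names are hypothetical stand-ins, and `std::mutex` is assumed here in place of the NSPR monitor.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the UI thread's event queue.
class UiQueue {
 public:
  // Queue a task and block the caller until the UI thread has run it.
  void DispatchSync(std::function<void()> task) {
    std::packaged_task<void()> job(std::move(task));
    std::future<void> done = job.get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push(std::move(job));
    }
    ready_.notify_all();
    done.wait();
  }

  // Block until at least one task is queued, without running it.
  void WaitForJob() {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return !jobs_.empty(); });
  }

  // Run one queued task on the calling (UI) thread.
  void RunOne() {
    std::packaged_task<void()> job;
    {
      std::unique_lock<std::mutex> lock(mutex_);
      ready_.wait(lock, [this] { return !jobs_.empty(); });
      job = std::move(jobs_.front());
      jobs_.pop();
    }
    job();
  }

 private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::packaged_task<void()>> jobs_;
};

// Hypothetical stand-in for a protocol object shared by two threads.
class Protocol {
 public:
  explicit Protocol(UiQueue& ui) : ui_(ui) {}

  // Worker-thread side: copy state under the monitor, then release the
  // monitor BEFORE making the blocking call into the UI thread.
  void RetryUrl() {
    std::string url;
    {
      std::lock_guard<std::mutex> lock(monitor_);
      url = currentUrl_;
    }  // monitor released here
    ui_.DispatchSync([this, &url] { retriedUrl_ = url; });
  }

  // UI-thread side: needs the same monitor.
  bool PseudoInterrupt() {
    std::lock_guard<std::mutex> lock(monitor_);
    interrupted_ = true;
    return interrupted_;
  }

  const std::string& RetriedUrl() const { return retriedUrl_; }

 private:
  UiQueue& ui_;
  std::mutex monitor_;
  std::string currentUrl_ = "imap://example.invalid/folder";
  std::string retriedUrl_;
  bool interrupted_ = false;
};

// The calling thread plays the UI thread.
bool RunScenario() {
  UiQueue ui;
  Protocol protocol(ui);
  std::thread worker([&protocol] { protocol.RetryUrl(); });
  ui.WaitForJob();  // the worker is now blocked inside DispatchSync
  // If the worker still held the monitor, this call would never return.
  bool interrupted = protocol.PseudoInterrupt();
  ui.RunOne();  // complete the worker's sync dispatch
  worker.join();
  return interrupted && protocol.RetriedUrl() == "imap://example.invalid/folder";
}
```

If `RetryUrl` kept its lock for the duration of `DispatchSync`, the `PseudoInterrupt` call in `RunScenario` would block while the worker waited on the UI thread, which is the same shape as the hang in the stacks above.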
(Assignee)

Comment 8

5 years ago
I suspect we don't need to use the monitor at all in this method. All the member vars we're accessing should be safe to access from the imap thread. The server object protects accesses to connections with its own monitor.
(Assignee)

Comment 9

5 years ago
Created attachment 604568 [details] [diff] [review]
possible fix

I'll request a try server build w/ this patch, but if you can do your own builds, here's the patch.
Assignee: nobody → dbienvenu
tracking-thunderbird11: ? → -
tracking-thunderbird13: --- → ?
(Reporter)

Comment 10

5 years ago
Comment on attachment 604568 [details] [diff] [review]
possible fix

yeah, this fixes it for me, didn't get any other issues in the past day with it
Attachment #604568 - Flags: feedback+
(Assignee)

Updated

5 years ago
Attachment #604568 - Flags: review?(neil)

Updated

5 years ago
Attachment #604568 - Flags: review?(neil) → review+
(Assignee)

Comment 11

5 years ago
http://hg.mozilla.org/comm-central/rev/a9f0e769a175
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 13.0
(Assignee)

Updated

5 years ago
status-thunderbird13: --- → fixed
tracking-thunderbird12: --- → ?
tracking-thunderbird13: ? → ---
(Assignee)

Comment 12

5 years ago
Comment on attachment 604568 [details] [diff] [review]
possible fix

[Approval Request Comment]
User impact if declined: occasional hangs
Testing completed (on c-c, etc.): try server build fixed issue for reporter
Risk to taking this patch (and alternatives if risky): slight risk of race conditions though hangs are much more likely
Attachment #604568 - Flags: approval-comm-aurora?
Attachment #604568 - Flags: approval-comm-aurora? → approval-comm-aurora+

Updated

5 years ago
Comment on attachment 604568 [details] [diff] [review]
possible fix

[Triage Comment]
This already landed in time for 13. The requests should have let it go into 12, but somehow I missed that. So a=me for comm-beta.
Attachment #604568 - Flags: approval-comm-aurora+ → approval-comm-beta+
Checked in:

http://hg.mozilla.org/releases/comm-beta/rev/151697a4635b
status-thunderbird12: --- → fixed
status-thunderbird13: fixed → ---
tracking-thunderbird12: ? → +
Duplicate of this bug: 739251
Duplicate of this bug: 738562
Duplicate of this bug: 738930

Updated

5 years ago
Summary: hang switching folders → hang when switching folders on IMAP server

Comment 18

5 years ago
Is it possible to apply this patch to 11.0?
Comment on attachment 604568 [details] [diff] [review]
possible fix

[Triage Comment]
a=Standard8, as per drivers meeting we've decided to spin an 11.0.1 for this and another issue.

I've landed it on comm-release already:

http://hg.mozilla.org/releases/comm-release/rev/832c448e5d0a
Attachment #604568 - Flags: approval-comm-release+
status-thunderbird11: --- → fixed
Duplicate of this bug: 739251
(Assignee)

Updated

5 years ago
Duplicate of this bug: 739781
I've marked the topic in https://getsatisfaction.com/mozilla_messaging/topics/problems_with_mozilla_11_0 to reflect solution in 11.0.1
Whiteboard: [gs][gssolved]

Updated

5 years ago
Duplicate of this bug: 740298

Updated

5 years ago
Blocks: 739997

Updated

5 years ago
Duplicate of this bug: 739997
(Assignee)

Updated

5 years ago
Duplicate of this bug: 728740
Duplicate of this bug: 739688
Duplicate of this bug: 713624