Closed Bug 554898 Opened 14 years ago Closed 12 years ago

TB 100% CPU usage if connection breaks down during IMAP

Categories

(MailNews Core :: Networking: IMAP, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: arik, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: perf, Whiteboard: [closeme 2012-02-21][lame-network][dupeme?])

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.1.8) Gecko/20100227 Lightning/1.0b1 Thunderbird/3.0.3

I am seeing 100% CPU usage with Thunderbird 3.0.3 when IMAP connection problems occur.

E.g. when after sending a message via smtp, TB tries to save the message to an IMAP sent folder that is not reachable. The error message is very slow to appear.

Reproducible: Always

Steps to Reproduce:
1. Sent a message with an IMAP server that is not reachable
2. message is sent
3. TB locks up at 100% cpu usage saying "Copying Message to Sent folder..."
Related to/duplicate of MailNews Core bug 477517?
Depends on: 477517
(In reply to comment #1)
> Related to/duplicate of MailNews Core bug 477517?

Arik do you timeout too at some point ?
Keywords: perf
To my memory, yes eventually. Though normally I would not wait for 2 minutes(?) with a system that is entirely locked up. I usually kill TB with the task manager if at all possible - or a cold reset.

Would you like me to provoke the error to verify the timeout?
(In reply to comment #3)
> To my memory, yes eventually. Though normally I would not wait for 2 minutes(?)
> with a system that is entirely locked up. I usually kill TB with the task
> manager if at all possible - or a cold reset.
> 
> Would you like me to provoke the error to verify the timeout?

If you know how to provoke it yes please, and do describe how you do that.
To reproduce the problem, I did the following:

1. Created a message with large attachment (2.5 Mb)
2. Sent the message via Imap (took roughly one minute) - processor load remained at 100% throughout and computer was slightly sluggish but still usable
3. allowed TB to start saving the message to IMAP and then cut the wireless router

After cutting the connection, CPU load remained at 100% for ~45 sec and timed out after ~60 sec.

This is not quite as bad as the problem I am reporting but it might just be a problem of attachment size. I.e. what is TB doing during the smtp send and the beginning of the IMAP save that takes so much CPU? Content-transfer encoding? If so, larger attachments would take longer to encode...
(In reply to comment #5)
> 3. allowed TB to start saving the message to IMAP and then cut the wireless router
> After cutting the connection, CPU load remained at 100% for ~45 sec and timed
> out after ~60 sec.

Can you do quick check of IMAP level flow of above?
(0) Get DebugView. See bug 402793 comment #6 and bug pointed in it.
    Compose a mail with big attachment, Send Later.
    Move mail in Outbox of Local Folders to other local mail folder(say Test)
(1) Start DebugView, Stop Capture, Clear Display
(2) Start Tb with;
    SET NSPR_LOG_MODULES=imap:5
    SET NSPR_LOG_FILE=WinDebug
> https://wiki.mozilla.org/MailNews:Logging
(3) At DebugView, Clear Display, Start Capture
(4) Copy(upload) the mail in Test to Sent folder of IMAP.
    APPEND "Sent", uploading mail data starts. 
(5) Cut the wiresless router.

(Q1) Does similar problem occur?

(6) After a while, Stop Capture at Debug View, Save log(say Log-1).
(7) Terminate Tb.

Change mailnews.tcptimeout to 300(probably 100sec is set as default) via Config Editor, restar Tb.
Do above step (1) to step (7) with mailnews.tcptimeout=300. (say Log-2)
(Q2) Does similar problem occur?
(Q3) Phenomenon becomes like next? 
> After cutting the connection, CPU load remained at 100% for ~200 sec and timed
> out after ~300 sec.
(Q4) Is there difference of (t2-t1) between Log-1 ad Log-2?
> t1 : timestamp of APPEND command
> t2 : timestamp of error due to cut of wireless router

Note: There is no need to attach log files.
Component: General → Networking: IMAP
Product: Thunderbird → MailNews Core
QA Contact: general → networking.imap
Arik, there are 4 questions for you in comment 6. Can you please reply to help wada's inquiry? THanks.
Whiteboard: [closeme 2011-02-11]
Wayne, sorry for the delay. I was writing on a thesis.

(Q1) Does similar problem occur?
Q1: Yes and no. When copying from a local folder to an imap folder, CPU load is still at 100% but after cutting the connection, it seems to go down more quickly. (maybe 5-15 sec). It appears the time it takes for the CPU load to return to near idle is related to how long TB was permitted to send before the connection was cut. This may however just be coincidence...

Q2: as above, no apparent change
Q3: no CPU load returned to normal after 5-15 sec. When copying messages no obvious timeout message seems to be displayed. Or maybe I missed it? In any case it is not a dialog box that need to be confirmed as when sending messages.

Q4: I am not sure how the t2 timestamp would look like. Maybe I cut the debug monitoring too early? How long after the connection has been cut should this appear?

On my debug, i see lots of lines of encored data (gibberish) and then:
00011608	20.29994011	[2504] 4884[3b14d00]: YCRMeAMk0TRz6ukShwBcyXrqCySgDst77qmlpsaG6a0Lq+81TIOlmJdj3MpKeADpeIoDzIby 	
00011609	26.30607033	[1100] m_nMSBtnAllowedAttempt is -1	
00011610	33.32316208	[1100] m_nMSBtnAllowedAttempt is -1	
00011611	45.47875977	[4052] SNACNP Attached!  	
00011612	45.48591614	[1100] m_nMSBtnAllowedAttempt is -1	
00011613	45.84761047	[2504] 4884[3b14d00]: 418d800:mail.espci.fr:S-Sent:SendData: clearing IMAP_CONNECTION_IS_OPEN	
00011614	45.84923553	[4052] SNACNP::NPGetCaps::WNNC_NET_TYPE	
00011615	45.85208893	[2504] 4884[3b14d00]: 418d800:mail.espci.fr:S-Sent:TellThreadToDie: close socket connection	
00011616	45.85337448	[4052] SNACNP::NPGetCaps::WNNC_USER	
00011617	45.85481262	[4052] SNACNP::NPGetCaps::WNNC_CONNECTION	
00011618	46.35280609	[4052] SNACNP::NPGetCaps::WNNC_ENUMERATION	
00011619	46.35597610	[2504] 0[182a140]: creating protocol instance to retry queued url:imap://afunke@mail.espci.fr:993/appendmsgfromfile>/Sent	
00011620	46.35752487	[4052] SNACNP::NPGetCaps::WNNC_ADMIN	
00011621	46.35993576	[4052] SNACNP::NPGetCaps::WNNC_DIALOG	
00011622	46.85476303	[2504] 0[182a140]: retrying  url:imap://afunke@mail.espci.fr:993/appendmsgfromfile>/Sent	
00011623	46.85891342	[2504] 0[182a140]: 4960000:mail.espci.fr:NA:SetupWithUrl: clearing IMAP_CONNECTION_IS_OPEN
Whiteboard: [closeme 2011-02-11]
This may or may not be related to this bug:

Identical symptoms seem to occur if messages are moved from an imap server located on the LAN to an external IMAP server via a relatively slow upload link, e.g. 300kbit. Both connections using SSL.
please retest after version 10 comes out and please post results
Whiteboard: [closeme 2012-02-21]
RESOLVED INCOMPLETE due to lack of response to the previous comment. If you feel this change was made in error, please respond to this bug with your reasons why.
Status: UNCONFIRMED → RESOLVED
Closed: 12 years ago
Resolution: --- → INCOMPLETE
Depends on: 683651
Whiteboard: [closeme 2012-02-21] → [closeme 2012-02-21][lame-network][dupeme?]
See Also: → 1754158
See Also: 1754158
You need to log in before you can comment on or make changes to this bug.