Hang while retrieving pop3 mail



11 years ago
9 years ago


(Reporter: justdave, Assigned: Bienvenu)


Dependency tree / graph
Bug Flags:
blocking-thunderbird3.0a2 +

Firefox Tracking Flags

(Not tracked)



(6 attachments)

I haven't been able to check my mail on one of my pop3 accounts for the last two days because Shredder keeps hanging when it gets to a specific message.

Attached are a process sample of Thunderbird during the hang, and the relevant section of a pop3 protocol trace.
This is using version 3.0a2pre (2008051603)

Comment 3

11 years ago
yuck, I'll see if I can figure out how to turn this into a reproducible test case.
Assignee: nobody → bienvenu
Given that this keeps you from getting your mail, marking as blocking the next alpha. 
Flags: blocking-thunderbird3.0a2+
Following is last part of log.
> RECV: bWF0aW9uIGNvbmZpZGVudGlhbGx5IHRvIHNlY3VyaXR5DQp2ZW5kb3JzIChpbmNsdWRpbmcgY29t
> RECV: (null)
> Entering NET_ProcessPop3 512
> POP3: Entering state: 19
> RECV: cGV0aXRvcnMpIHdobyBoYXZlIGEgdnVsbmVyYWJpbGl0eSBwcm90ZWN0aW9uIG9yDQptaXRpZ2F0
> RECV: YXJkaW5nIHRvIDNDb20gYXQgcG9zdG1hc3RlckAzY29tLmNvbS4gDQo=
> RECV: .

Following log usually follows after last "RECV: ." (==POP3 level indicator of end of mail data)
> RECV: (null)
> POP3: Entering state: 15
> log of DELE if not "leave message on server"
> log of RETR(and DELE) if more mails should be downloaded
> log of QUIT if all mail is downloaded
( See http://kb.mozillazine.org/Session_logging_for_mail/news )

(A) "RECV: (null)" after each data block except last one is probably indicator of "end of a TCP data segment".
(B) "RECV: (null)" after last "RECV: ." is perhaps indicator of end of all TCP data segments(==end of last TCP data segment).
( See http://en.wikipedia.org/wiki/Transmission_Control_Protocol )

Lack of POP3 level log of (B) indicates that Tb(TCP level) is still waiting for something, and I saw same situation in some other bugs.  

See following pages, and get socket level log.
> http://kb.mozillazine.org/Session_logging_for_mail/news
> http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#pop
> http://www.mozilla.org/projects/netlib/http/http-debugging.html
> http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/Deployment_Guide-en-US/sso-ov.html#sso-config-firefox-troubleshoot
> SET NSPR_LOG_MODULES=pop3:5,nsSocketTransport:5
I don't have a way to reproduce at the moment...  I went and logged in via webmail and deleted the message in question so TB could get the rest of my mail.

Comment 7

11 years ago
My guess is that we're looping forever in GetMsgTextFromStream, perhaps because of a problem with the base64 decoding, given that we may not be passing in the whole encoded string, since we're just trying to generate some body text for the new mail alert. I tried to reproduce this by sending a message with the same body, but didn't see the problem (perhaps because the headers weren't the same length, and we basically take the first 2K of the message, and try to parse that, as I remember...I'll poke around the code, but I could be way off...

Comment 8

11 years ago
I wonder if we're in nsMsgDBFolder::decodeMsgSnippet, and the process sampler doesn't know about that symbol...
Posted file pop3+socket log
Had this happen again today with a different message on a different account.  Here's the socket-level log requested from comment 5.
2 mails are downloaded. (No DELE because "Leave Messages on Server") 
  Mail-1(RETR 2) : A spam mail
  Mail-2(RETR 3) : Same mail body as one in log of Comment #2 by Dave Miller

>(RETR 2)
>(Last part of data block just before last block)
> 00000933 19.91032410 [3344] 0[3c4788]: RECV: ...(Last line in a block)
> 00000934 19.91039848 [3344] 0[3c4788]: RECV: (null)
>(Last part of last data block)
> 00000986 20.11697006 [3344] 0[3c4788]: RECV: .
> 00000987 20.12289429 [3344] 0[3c4788]: RECV: (null)
> 00000988 20.12299538 [3344] 0[3c4788]: POP3: Entering state: 15	

>(RETR 3)
> 00001090 20.14998436 [3344] 0[3c4788]: RECV: .
> 00001091 20.15243912 [3344] 0[3c4788]: RECV: (null)
> 00001092 20.15296364 [3344] 0[3c4788]: POP3: Entering state: 15

No time lag is seen between "RECV: <last data line>" and "RECV: (null)" when non last data block. But time lag(60 msec or 30 msec) is seen between "RECV: ." and following "RECV: (null), Entering state: 15" in both "RETR 2" and "RETR 3".
This indicates that last "Get of data line" request which produces last "RECV :(null) & POP3: Entering state: 15" is issued after mail data processing by detection of end of mail data("RECV .").

To Dave Miller(bug opener):
Your "hang" is "CPU 100% == loop, then unresponsive"?

Comment 11

11 years ago
I reconstructed the message shown in the pop3 log and tweaked my copy of TB to call GetMsgTextFromStream on it and I could not reproduce any hang. I didn't try it on the mac, though - I could try that. 
Original mail by Dave Miller has a following header(no space after ":".)
> Received: ... (Many Received: headers)
> X-TM-IMSS-Message-ID:<7169c8dd00007668@gardena.com>
> Received: from ext1.gardena.com ... (First Received: header)
When I added the header in my "Unsent Messages" folder by hand, Thunderbird sent is as-is and copy in Sent folder has no space after ":". But in downloaded mail, spaces is seen(X-TM-IMSS-Message-ID: <7169c8dd00007668@gardena.com>). This indicates my SMTP server or POP3 server inserts space after ":".

David, can this affect on problem re-creation test?
(In reply to comment #10)
> To Dave Miller(bug opener):
> Your "hang" is "CPU 100% == loop, then unresponsive"?

Had this happen to me again today, ironically with another message from the same sender as before to the same mailing list.  The attachment includes a raw pull from the pop3 server by hand in addition to the log of Thunderbird attempting to grab it before hanging.
Another one, different sender, different pop3 account (same ISP).

The main thing I see in common so far is they all seem to be base64 encoded.  This one has the missing space after the colon on the X-TMSS-Message-Id header, but the last one didn't, so that's probably a red herring.
To David Bienvenu: Problem of Bug 411988 ?

Comment 17

11 years ago
WADA - that's what Dan suspected as well, except that I don't see that function in Dave's stack sampling. Maybe the cause is similar.
I've seen mention that the stack symbols are sometimes incomplete on the nightly builds... if someone can tell me what I need to put in my .mozconfig to get full debug symbols included I can build one with full symbols and resample it.  I've been moving the problem messages into another folder on the server via the webmail client, so I can just move one of them back into the inbox to reproduce again at will.

Comment 19

11 years ago
Dave, how do you feel about doing a full debug build and running under gdb when you can reproduce this problem?

ac_add_options --enable-debug
ac_add_options --disable-optimize
ok, build is prepped and ready.  I don't know my way around gdb though.  find me on irc when  you can walk me through it.
ok, with real symbols available, it turns out the hang was in nsMsgSearchTerm::MatchArbitraryHeader

which makes this a dupe.
Closed: 11 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 411988
Bug 337903 and Bug 419062 are probably reports of same problem as this bug.
You need to log in before you can comment on or make changes to this bug.