hang in nsMsgSearchTerm::MatchArbitraryHeader while getting mail

RESOLVED FIXED in mozilla1.9.1a1

Status

--
critical
RESOLVED FIXED
11 years ago
3 years ago

People

(Reporter: tuukka.tolvanen, Assigned: tuukka.tolvanen)

Tracking

({hang, regression})

Trunk
mozilla1.9.1a1
x86
Linux
hang, regression
Dependency tree / graph
Bug Flags:
blocking-thunderbird3.0a2 +
blocking-thunderbird3 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Assignee)

Description

11 years ago
tbird linux trunk a few hours ago

I'm stuck in the loop at http://mxr.mozilla.org/mozilla/source/mailnews/base/search/src/nsMsgSearchTerm.cpp#725 when fetching a particular message off pop. It's spinning on one line while m_headersSize == 0, and buf is stuck with the last line seen as GetNextFilterLine returns -1. Bug 132340 changed some stuff around these parts so it might have caused/triggered this, Joshua has some idea about where the problem is...
(Assignee)

Comment 1

11 years ago
Created attachment 296613 [details]
probably the offending message

this is probably the offending message; I get stuck on "Content-Transfer-Encoding: base64" afaict.
Could you check whether this is an actual regression (from that bug) ?
Flags: blocking-thunderbird3?
(Assignee)

Comment 3

11 years ago
yes, I verified that receiving attachment 296613 [details] from pop.gmail doesn't hang with attachment 290632 [details] [diff] [review] reverted, but does with it applied.

Updated

11 years ago
Keywords: regression
(Assignee)

Comment 4

11 years ago
http://mxr.mozilla.org/mozilla/source/mailnews/base/search/src/nsMsgBodyHandler.cpp#107

Is GetNextLine's buf inout? If not, I'd like to truncate it unconditionally at the start to get to a known state there rather than only after length == -1 from GetNext*Line. Is it possible to get length == -1 after having built up content in buf in GetNextLine's loop, such that truncating it after that point would drop valid content? If so, the return value could currently be != buf.Length() with valid buf content unless the condition after the loop applies and corrects it; either buf.Length() should be returned or buf and nextLine length tracked separately or something.
(Assignee)

Comment 5

11 years ago
Also, I wonder what happens in GetNextLine's loop if !m_Filtering && !m_db, afaict it'd just loop there forever...
(Assignee)

Comment 6

11 years ago
well, ok, !eatThisLine may break that
(Assignee)

Comment 7

11 years ago
Created attachment 297361 [details] [diff] [review]
patch1

Assuming buf in GetNextLine is out only, avoid possibly appending lines to its incoming value [1] by truncating it first. Track length/eof(-1)ness of buf separately from that of incoming lines to make sure eof is returned when all we got was eof [2], and eof is not returned when we got something before eof [3].

[1] by ApplyTransformations; telling if that is a real risk or not is too much for my little brain to handle all at once, same applies to the other points
[2] if !m_isMultipart && m_base64part, even with buf truncated an empty line and length 0 would be returned despite not having gotten any line
[3] if the loop accumulated data into buf before hitting eof, unless !m_isMultipart && m_base64part, -1 would be returned
Assignee: nobody → tuukka.tolvanen
Status: NEW → ASSIGNED
Attachment #297361 - Flags: superreview?(bienvenu)
Attachment #297361 - Flags: review?(dmose)

Comment 8

11 years ago
Comment on attachment 297361 [details] [diff] [review]
patch1

Thanks for digging into this patch.  As you've noted, the code here is painful to wrap one's head around, and I think we need to try and continue to improve that.  So...

> 
> PRInt32 nsMsgBodyHandler::GetNextLine (nsCString &buf)
> {
>   PRInt32 length = 0;
>+  PRInt32 outLength = -1;               // length of valid data in buf

If we want to stick with this structure, which may be ok, I think the above vars probably want more explicit names such as untransformedLength and transformedLength.

>   PRBool eatThisLine = PR_FALSE;
>   nsCAutoString nextLine;
> 
>+  buf.Truncate();
>+

I suspect this is OK, but I also don't have a super high degree of confidence here.  Like you, I _think_ GetNextLine's argument is an out param only, but I wouldn't be willing to actually bet money on that.  Is this change required to fix this bug, or orthogonal to it?

>   } while (eatThisLine && length >= 0);  // if we hit eof, make sure we break out of this loop. Bug #:

One of the things that makes this loop hard to read is that we're not really testing anything about length at all here, but rather about eof-ness.  This may be better structured as something other than do/while.  And/or it might also help to explicitly break out of the loop when -1 is detected.  Can you play around and try and come up with something clearer?
  
>   // For non-multipart messages, the entire message minus headers is encoded
>   // ApplyTransformations can only decode a part
>-  if (!m_isMultipart && m_base64part)
>+  if (outLength >= 0 && !m_isMultipart && m_base64part)

It's not clear to me why the check on outLength is necessary here.  A comment in the code might help...
Attachment #297361 - Flags: review?(dmose) → review-

Comment 9

10 years ago
Comment on attachment 297361 [details] [diff] [review]
patch1

clearing review request since dmose already minused
Attachment #297361 - Flags: superreview?(bienvenu)
This should block Thunderbird 3, since it's both a regression and a hang.
Flags: blocking-thunderbird3? → blocking-thunderbird3+
Flags: blocking-thunderbird3.0a2?
Given that there's not even a single DUP of this and no new commentary here since 3.0a1, I wouldn't block 3.0a2 on this.  It'd be nice to have, of course.
Flags: wanted-thunderbird3.0a2+
Flags: blocking-thunderbird3.0a2?
Flags: blocking-thunderbird3.0a2-
Duplicate of this bug: 434091
(In reply to comment #11)
> Given that there's not even a single DUP of this

Heh, how things change with the proper debugging. :)
(Assignee)

Comment 14

10 years ago
I should get to refreshing the patch next week :)
Indeed; given the painful failure mode here, marking as blocking 3.0a2.
Flags: wanted-thunderbird3.0a2+
Flags: blocking-thunderbird3.0a2-
Flags: blocking-thunderbird3.0a2+
Just as a point of info, the attached patch does fix the hang for me.
(Assignee)

Comment 17

10 years ago
Created attachment 324226 [details] [diff] [review]
patch2 (checked in)

(In reply to comment #8)
> (From update of attachment 297361 [details] [diff] [review])

> >   PRInt32 length = 0;
> >+  PRInt32 outLength = -1;               // length of valid data in buf
> 
> If we want to stick with this structure, which may be ok, I think the above
> vars probably want more explicit names such as untransformedLength and
> transformedLength.

I opted for comments that should motivate the naming better, hope that's ok.

> >+  buf.Truncate();

> I suspect this is OK, but I also don't have a super high degree of confidence
> here.  Like you, I _think_ GetNextLine's argument is an out param only, but I
> wouldn't be willing to actually bet money on that.  Is this change required to
> fix this bug, or orthogonal to it?

Orthogonal, looks like; dropped. I've noticed no ill effects from it since January, though.

> >   } while (eatThisLine && length >= 0);  // if we hit eof, make sure we break out of this loop. Bug #:
> 
> One of the things that makes this loop hard to read is that we're not really
> testing anything about length at all here, but rather about eof-ness.  This may
> be better structured as something other than do/while.  And/or it might also
> help to explicitly break out of the loop when -1 is detected.  Can you play
> around and try and come up with something clearer?

Rearranged a bit. Then there's the crufty comments and oddly shaped conditions, but, meh.

> >   // For non-multipart messages, the entire message minus headers is encoded
> >   // ApplyTransformations can only decode a part
> >-  if (!m_isMultipart && m_base64part)
> >+  if (outLength >= 0 && !m_isMultipart && m_base64part)
> 
> It's not clear to me why the check on outLength is necessary here.  A comment
> in the code might help...

That's the check that ultimately fixes the hang, by avoiding transforming eof into "" and 0 indicating more data to be had ad infinitum. Moved to an early return, perhaps it documents more cleanly there.
Attachment #297361 - Attachment is obsolete: true
Attachment #324226 - Flags: review?(dmose)
I'm setting the in-testsuite? flag because I think this would be a useful test case for when we have a fake pop3 server - we have a message to test against and its important that we don't regress the code.

Tuukka - I'm not expecting you to do the test case in this instance - especially as we don't yet have a fake pop3 server (which Gary and/or myself are going to write soon) - I just think this bug is an ideal candidate for using as a basis for a test once we do have the fake pop3 server.
Flags: in-testsuite?
Blocks: 337903

Comment 19

10 years ago
I can supply several test mails which I've sequestered on my ISP server, just tell me to whom I should forward them.  It isn't anything I would want to drop on someone without warning.

Updated

10 years ago
Whiteboard: [needs review]
I've been running with patch2 for a couple weeks now, and haven't had any issues.  I can throw my test mails at it and they get through, too.

Updated

10 years ago
Whiteboard: [needs review] → [needs review dmose]
Comment on attachment 324226 [details] [diff] [review]
patch2 (checked in)

r+sr=dmose; thanks for the fix!
Attachment #324226 - Flags: superreview+
Attachment #324226 - Flags: review?(dmose)
Attachment #324226 - Flags: review+
Checking in nsMsgBodyHandler.cpp;
/cvsroot/mozilla/mailnews/base/search/src/nsMsgBodyHandler.cpp,v  <--  nsMsgBodyHandler.cpp
new revision: 1.34; previous revision: 1.33
done
Status: ASSIGNED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED

Updated

10 years ago
Attachment #324226 - Attachment description: patch2 → patch2 (checked in)

Updated

10 years ago
Whiteboard: [needs review dmose]

Updated

10 years ago
Target Milestone: --- → mozilla1.9.1a1
Product: Core → MailNews Core

Comment 23

10 years ago
This FIXED bug is flagged with in‑testsuite?   It would be great if assignee or someone else can clear the flag if a test is not appropriate.  And if appropriate, create a test and plus the flag to finish off the bug.

Updated

9 years ago
Duplicate of this bug: 337903
Flags: in-testsuite?
You need to log in before you can comment on or make changes to this bug.