Closed Bug 973316 Opened 10 years ago Closed 10 years ago

TEST-UNEXPECTED-FAIL + PROCESS-CRASH | xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js

Categories

(MailNews Core :: MIME, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 31.0

People

(Reporter: mkmelin, Assigned: jcranmer)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files)

Seen on Linux x64 Opt

https://tbpl.mozilla.org/php/getParsedLog.php?id=34733807&tree=Thunderbird-Trunk&full=1#error1

TEST-UNEXPECTED-FAIL | /builds/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js | test failed (with xpcshell return code: -11), see following log:
>>>>>>>
TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1)
<<<<<<<
mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/thunderbird/tinderbox-builds/comm-central-linux64/1392462351/thunderbird-30.0a1.en-US.linux-x86_64.crashreporter-symbols.zip
PROCESS-CRASH | /builds/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js | application crashed [@ nsDependentCString::nsDependentCString(char const*)]
It only seems to happen on linux 64bit and the stack is:
Thread 0 (crashed)
 0  libxul.so!nsDependentCString::nsDependentCString(char const*) [nsCharTraits.h:1ba07b762e2e : 505 + 0x9]
    rbx = 0x00007fffaeab1048   r12 = 0x00007fffaeab1040
    r13 = 0x00007f1a479eb640   r14 = 0x00007f1a56499460
    r15 = 0x00007f1a4b445880   rip = 0x00007f1a551415dd
    rsp = 0x00007fffaeab0e70   rbp = 0x00007fffaeab1030
    Found by: given as instruction pointer in context
 1  libxul.so!NS_MsgStripRE(char const**, unsigned int*, char**) [nsMsgUtils.cpp:1ba07b762e2e : 652 + 0x16]
    rbx = 0x00007fffaeab1048   r12 = 0x00007fffaeab1040
    r13 = 0x00007f1a479eb640   r14 = 0x00007f1a56499460
    r15 = 0x00007f1a4b445880   rip = 0x00007f1a5637975c
    rsp = 0x00007fffaeab0e80   rbp = 0x00007fffaeab1030
    Found by: call frame info
 2  libxul.so!nsParseMailMessageState::InternSubject(message_header*) [nsParseMailbox.cpp:1ba07b762e2e : 1234 + 0x4]
    rbx = 0x00007f1a466d3400   r12 = 0x00007fffaeab1050
    r13 = 0x00007fffaeab1168   r14 = 0x0000000000001003
    r15 = 0x00007f1a4b445880   rip = 0x00007f1a56465dd1
    rsp = 0x00007fffaeab1040   rbp = 0x00007fffaeab1080
    Found by: call frame info
 3  libxul.so!nsParseMailMessageState::FinalizeHeaders() [nsParseMailbox.cpp:1ba07b762e2e : 1466 + 0xe]
    rbx = 0x00007f1a466d3400   r12 = 0x00007f1a466d3480
    r13 = 0x00007fffaeab1168   r14 = 0x0000000000001003
    r15 = 0x00007f1a4b445880   rip = 0x00007f1a56469497
    rsp = 0x00007fffaeab1090   rbp = 0x00007fffaeab1310
    Found by: call frame info
 4  libxul.so!nsParseMailMessageState::ParseFolderLine(char const*, unsigned int) [nsParseMailbox.cpp:1ba07b762e2e : 676 + 0x7]
    rbx = 0x00007f1a466d3400   r12 = 0x0000000000000002
    r13 = 0x00007f1a551828fd   r14 = 0x00007fffaeab1520
    r15 = 0x00007fffaeab1506   rip = 0x00007f1a56469d47

May it be related to patch http://hg.mozilla.org/comm-central/rev/c7ecb1bd06ae that last did changes in nsParseMailMessageState::FinalizeHeaders , just before the InterSubject call? Even though it does not seem to touch the subject variable. May there be some in-memory variable aligning problem only affecting 64bit?
I looked into this since the tests were very erratic to actually work. Turns out, we have a nice heap overflow in here.

nsParseMsgMailState uses non-null terminated strings to pass into NS_MsgStripRE. NS_MsgStripRE assumes its strings are null-terminated in places. I'd blame my attempts to nsCString-ify nsIMimeConverter, but the old (and, well, current) implementation still used string in the xpidl, so the original versions were still wrong.

I've pushed an attempted fix to try: <https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=09eca902f9a3>.
Sob sob sob sob.

The attempted fix(es) didn't work. The real problem is that nsParseMsgMailState thinks it's giving out nul-terminated strings (it tries hard to), but it fails to do so. If I add an assert that fails if we give out a non-null-terminated string, this crash is so persistently common that I can actually debug it.

nsParseMsgMailState only manages to null-terminate if buf points to a \r or a \n character. Which it always will, unless the last header line consists of a header line containing only FWS. Confirmation that this is what is occuring in the tests is obvious when you look at code coverage results: <http://www.tjhsst.edu/~jcranmer/c-ccov/mailnews/local/src/nsParseMailbox.cpp.html> (a branch that is necessary for null termination that is not always taken!?).

I once played with rewriting this code in JS. After all the effort I've gone through to get this damn thing to pass a newly-added assertion that it only hands out null-terminated strings, I'm going to move that from "fun side experiment" to "serious attempts to make a reviewable patch and get it checked in:" the C-style string handling here is baroque, full of errors (this makes, by my count, the 5th patch in as many years fixing a string overrun in this code).

Newer try push: <https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=c53367503e8f>.
This adds the code that satisfies the checks and also fixes other potential buffer overruns (buf_end isn't safely dereferenceable, although it requires a very precise message to hit such a scenario due to the growth strategy of the underlying buffer).
Assignee: nobody → Pidgeot18
Status: NEW → ASSIGNED
Attachment #8410683 - Flags: review?(kent)
May this cause dataloss, or it crashes immediately by buffer overflow (access to somebody elses memory)?
IRC description:

11:44:29 AM - rkent: jcranmer: can you give me a short description of what is the key to your patch fixing the current crash? I haven;t tried to figure that out yet, still trying to recall how all of this was supposed to work.
11:45:51 AM - jcranmer: okay
11:46:30 AM - jcranmer: the key issue is that the only way the header gets null terminated is if buf points to \r or \n at the end (line 1118 or so)
11:47:55 AM - jcranmer: the error scenario is if you the last line of the header blocks is nothing but whitespace
11:48:36 AM - jcranmer: in that scenario, when the code hits the if(header) {} if (*buf == '\r' || *buf == '\n') {} checks
11:48:51 AM - jcranmer: buf is pointing to buf_end, which is random uninitialized memory
11:49:09 AM - jcranmer: well, actually, it's not random uninitialized memory
11:49:22 AM * jcranmer thinks
11:49:35 AM - jcranmer: uh... maybe it is
11:49:55 AM - jcranmer: I think I was following the wrong pointer
11:50:09 AM - jcranmer: in any case, wherever buf is pointing to, it isn't a CR or a LF character
11:50:33 AM - jcranmer: so the null terminator is never added
11:50:48 AM - jcranmer: and we pass around a string to everybody that is assumed to be null terminated that isn't
11:50:57 AM - jcranmer: and very very many bad things happen
11:51:35 AM - jcranmer: the main addition block on lines 1092-1102 fixes the main error
11:52:18 AM - jcranmer: I added the latter MOZ_ASSERT to guarantee null-termination, and the first MOZ_ASSERT is a precondition for the well-working of the entire code block
11:53:29 AM - jcranmer: (namely, if there is no trailing CRLF to the header block, then the code to add the null terminator doesn't have anything to land on and safely change)
11:53:35 AM - Archae|mobile has left the room (Quit: Bye).
11:54:05 AM - jcranmer: the change on line 1110 is used to fix an issue found in another test
11:54:25 AM - jcranmer: where value runs out of bounds and everything blows up
11:54:40 AM - jcranmer: [well, same test, other test data]
11:55:25 AM - jcranmer: the other two changes were visual inspections to try to ensure that we don't ever dereference buf_end
11:56:22 AM - jcranmer: [the test data that causes the second failure is a completely empty message header, so the attempt to strip the whitespace fails]
Comment on attachment 8410683 [details] [diff] [review]
Fix the buffer overruns

Review of attachment 8410683 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/local/src/nsParseMailbox.cpp
@@ +1055,5 @@
>  
>  SEARCH_NEWLINE:
>      // move past any non terminating characters, rewriting them if folding white space
>      // exists
> +    while (buf < buf_end && *buf != '\r' && *buf != '\n')

This is reverting a change from bug 877831. I agree that the reversion is correct, and it was the later code in that patch (that checked for negative header length) that actually fixed the crash.

@@ +1106,5 @@
>      if (header)
>      {
>        value = colon + 1;
>        // eliminate trailing blanks after the colon
> +      while (value < (buf - writeOffset) && (*value == ' ' || *value == '\t'))

Note for reference: this change really is a fix for the same issue that bug 877831 was trying to address. After this fix, the later code "if (header->length < 0)" is no longer needed, but there is no harm in keeping it in. What is happening is that, when there are both folded lines as well as a blank header, these trailing blanks after the colon are incorporated into the folding.

@@ +1114,5 @@
>        header->length = buf - header->value - writeOffset;
>        if (header->length < 0)
>          header->length = 0;
>      }
>      if (*buf == '\r' || *buf == '\n')

Rather than do your --buf above, which strikes me as odd, couldn't you instead add here "|| buf == buf_end"? It seems to me that the real issue here is that buf == buf_end is also a terminating condition, and we need to allow for that possibility at this point.
Attachment #8410683 - Flags: feedback?(Pidgeot18)
Comment on attachment 8413263 [details] [diff] [review]
RKJ variant using buf>=buf_end as terminal condition

Here is my variant to the patch. Although I prefer this version, I think that either this or the previous version is acceptable. I'll let jcranmer make the final call, but I'll r+ his patch.
Attachment #8413263 - Attachment description: RJV variant using buf>=buf_end as terminal condition → RKJ variant using buf>=buf_end as terminal condition
Attachment #8410683 - Flags: review?(kent) → review+
Attachment #8410683 - Flags: feedback?(Pidgeot18)
https://hg.mozilla.org/comm-central/rev/9ee372f0a6b8

After discussion on IRC, the problem with the alternative version of the patch, using buf_end as a separate error condition, is that *buf_end isn't necessarily valid memory to stomp on, although the growth strategy of nsByteArray makes that hard to execute.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 31.0
See Also: → 1051616
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: