Closed Bug 1408175 Opened 4 years ago Closed 3 years ago

Crash in MsgStripQuotedPrintable

Categories

(MailNews Core :: Backend, defect)

defect
Not set
critical

Tracking

(thunderbird58 fixed, thunderbird59 fixed)

VERIFIED FIXED
Thunderbird 59.0
Tracking Status
thunderbird58 --- fixed
thunderbird59 --- fixed

People

(Reporter: wsmwk, Assigned: jorgk-bmo)

Details

(4 keywords, Whiteboard: [regression: ~2017-09-26])

Crash Data

Attachments

(1 file, 1 obsolete file)

#1 crash for nightly 

crash first appears in 58.0a1 build 20170926030204  3de52e42-1d5b-4518-98d8-9e6450170929 	

report bp-bf6c97df-aa8e-4b0e-a642-1565f0171012.
 0 	xul.dll	MsgStripQuotedPrintable(unsigned char*)	C:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/base/util/nsMsgUtils.cpp:1595
1 	xul.dll	nsMsgDBFolder::decodeMsgSnippet(nsTSubstring<char> const&, bool, nsTString<char>&)	C:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/base/util/nsMsgDBFolder.cpp:5683
2 	xul.dll	nsMsgDBFolder::GetMsgTextFromStream(nsIInputStream*, nsTSubstring<char> const&, unsigned int, unsigned int, bool, bool, nsTSubstring<char>&, nsTSubstring<char>&)	C:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/base/util/nsMsgDBFolder.cpp:5623
3 	xul.dll	nsMsgDBFolder::GetMsgPreviewTextFromStream(nsIMsgDBHdr*, nsIInputStream*)	C:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/base/util/nsMsgDBFolder.cpp:5758
4 	xul.dll	nsImapMailFolder::ParseMsgHdrs(nsIImapProtocol*, nsIImapHeaderXferInfo*)	C:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/imap/src/nsImapMailFolder.cpp:2948
5 	xul.dll	`anonymous namespace'::SyncRunnable2<nsIImapServerSink, nsIImapProtocol* __ptr64, bool* __ptr64>::Run	C:/builds/moz2_slave/tb-c-cen-w64-ntly-000000000000/build/mailnews/imap/src/nsSyncRunnableHelpers.cpp:147
6 	xul.dll	nsThread::ProcessNextEvent(bool, bool*)	xpcom/threads/nsThread.cpp:1037
Looking at the stack, we see nsImapMailFolder.cpp:2948. That code reads:
      inputStream->ShareData(msgHdrs, strlen(msgHdrs));
      GetMessageHeader(msgKey, getter_AddRefs(msgHdr));
      if (msgHdr) {
        nsCOMPtr<nsIInputStream> stream(do_QueryInterface(inputStream));
        GetMsgPreviewTextFromStream(msgHdr, stream); <=== 2948.
      }

That line was changed here a few days before:
https://hg.mozilla.org/comm-central/rev/4ff3727aec0f13b8a9912b7fadab5fee26249302#l4.14

Boris, anything springs to your mind here?

If the crash had started two days later, I would have blamed it on bug 1403393 and friends which adopted stream related Necko changes. But given that the crash is already present on 26th Sept, we can rule that out.
Flags: needinfo?(bzbarsky)
The change quoted in comment 0 looks ok to me.

Looking at the actual callstack, we end up in MsgStripQuotedPrintable which takes "unsigned char *src" and assumes it's null-terminated, looping until it hits src[srcIdx] == 0.  We end up crashing when trying to write to dest[destIdx++].  Note that dest == src.

On the way to MsgStripQuotedPrintable we go through nsMsgDBFolder::decodeMsgSnippet which is doing:

    MsgStripQuotedPrintable((unsigned char *) aMsgSnippet.get());

where aMsgSnippet is an nsCString.

The caller of decodeMsgSnippet is GetMsgTextFromStream, which does:

    decodeMsgSnippet(NS_ConvertUTF16toUTF8(encoding), !reachedEndBody, msgText);

where msgText came from:

  nsAutoCString msgText;

and then various append calls... This all looks legit to me, assuming searchfox is not out of date on comm-central.

If you wanted to be sure, you could try backing out bug 1371893 (since m-c still has stringstream inheriting from inputstream) and see whether that helps.  But I would be pretty surprised if it does.
Flags: needinfo?(bzbarsky)
Thanks for the detailed analysis. I landed bug 1371893 since bug 1401171 was landed which had grabbed the original part 4 of bug 1371699 (attachment 8876568 [details] [diff] [review]). But by the looks of it that was pre-mature since it affected nsIMultiplexInputStream and not nsIStringInputStream. Hard to decide whether I should try a backout.

Wayne, if I backout bug 1371893, how fast do we get results?
Flags: needinfo?(vseerror)
(In reply to Jorg K (GMT+2) from comment #3)
> ...
> Wayne, if I backout bug 1371893, how fast do we get results?

Within 1-4 days.  Avg is 1 crash per day, prox every other buildid.   No crashes so far with an email. Based on extensions I installed, I'd say there are 4-5 different people crashing.

Interesting tidbits about the crashes
- one is russian locale, with russian langpack installed bp-bf6c97df-aa8e-4b0e-a642-1565f0171012  (comment 0)
- different person with en-US locale has ruendict@russia.ru installed bp-2eb535a9-c267-48a9-b389-afc600171010 (https://addons.mozilla.org/en-US/thunderbird/addon/russian-spellchecking-dic-3703/ ?)
- 90% are a specific build of win10 (except one linux), but only 24% of nightly crash reports are that win10 build. (from 11 crash reports [1])
- a couple appear to have no addons installed

[1] https://crash-stats.mozilla.com/signature/?signature=MsgStripQuotedPrintable&date=%3E%3D2017-10-06T18%3A46%3A00.000Z&date=%3C2017-10-13T18%3A46%3A00.000Z&_columns=date&_columns=build_id&_columns=platform_version&_columns=install_time&_columns=useragent_locale&_sort=-date&page=1
Flags: needinfo?(vseerror)
OK, I'll back it out as an experiment, however, very hard to understand why bug 1371893 should be the cause.
Flags: needinfo?(jorgk)
I backed it out, leaving NI to check results and reland.
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #2)
> Looking at the actual callstack, we end up in MsgStripQuotedPrintable which
> takes "unsigned char *src" and assumes it's null-terminated, looping until
> it hits src[srcIdx] == 0.

Agreed.

> The caller of decodeMsgSnippet is GetMsgTextFromStream, which does:

There is a change from Bug 1402750 (push date	2017-09-25 23:35 +0000):
https://hg.mozilla.org/comm-central/rev/337c6fe6e0e7#l6.62
| -      msgHeaders.Append(NS_LITERAL_CSTRING("\r\n"));
| +      msgHeaders.AppendLiteral("\r\n");

Comment to 'AppendLiteral':
| // AppendLiteral must ONLY be applied to an actual literal string.
| // Do not attempt to use it with a regular char* pointer, or with a char
| // array variable. Use Append or AppendASCII for those.

Comment to 'nsTSubstring'
| * nsTSubstring is an abstract string class. From an API perspective, this
| * class is the root of the string class hierarchy. It represents a single
| * contiguous array of characters, which may or may not be null-terminated.

If I count that together and still speculate that a server supplies an article min LF line breaks, this could be the explanation for overwriting the NULL termination and finally for the crash.
Thanks for looking at it, but bug 1402750 was motived by bug 870698:
https://hg.mozilla.org/mozilla-central/rev/c81b52d58ea4
"\r\n" is a literal string.

Where do you see the problem? I agree that bug 1402750 landed an bit of refactoring. I'm happy to agree that I messed up somewhere.
(In reply to Jorg K (GMT+2) from comment #8)
> Thanks for looking at it, but bug 1402750 was motived by bug 870698:

> Where do you see the problem?

This is just a wild guess from me.

It would fit in time with the appearance of this bug.
But unfortunately I can not check this because my server automatically repairs the line breaks.

Possibly Comment 6 already shows an effect.
(In reply to Jorg K (GMT+2) from comment #6)
> I backed it out, leaving NI to check results and reland.
Still crashing with bug 1371893 backed out on 2017-10-14:
https://crash-stats.mozilla.com/report/index/3871e630-7c2d-4155-bf46-7faf40171018 - 20171015040550 - 2017-10-15
https://crash-stats.mozilla.com/report/index/99d9b4fa-9810-4de4-8bb7-79cee0171017 - 20171017030201 - 2017-10-17
I will re-land bug 1371893 now.
Flags: needinfo?(jorgk)
bp-7583742f-ed8a-4c09-8749-bdbd40171019 buildid 20171016040337 onno crashed. He has no other notable crash history.

Different person (the only other one with an email address)
218cf4f4-ad08-4532-a8d7-9f3c90170929 2017-09-29 MsgStripQuotedPrintable   58.0a1
f67e4e9f-702c-4d2c-a75f-dbac20170917 2017-09-17 MimeMultipart_parse_eof   57.0a1 

We will need to see what shakes out in a few days as users get updated to current versions of newly resumed nightly builds.
Crashes continue up through the latest builds.
Ray has crashed at least twice - bp-9956b3ec-2fb2-4ba2-994f-b028f0171030 bp-ac222de9-761e-4b38-ba92-9946b0171022
#1 crash for 58 beta, so this must block 59 release. Ideally we'd push a patch into a 58 beta.  (#2 crash for nightly)

examined last month of crashes - almost every user has no prior crashes so this is definitely a regression, not a case where a signature morphed.

Walt crashed once - bp-c263d1af-29cf-4e98-a4e3-69fd00171104 with 58.0a1
Hardware: Unspecified → All
Any chance to get a reproducible case? I've looked at the code again, like Boris already did in comment #2, and don't see anything obvious.
Eric, operating on the string buffer of an nsCString here
https://dxr.mozilla.org/comm-central/rev/cea2dc7d3190d13ec294a567282b8a6bb64934ce/mailnews/base/util/nsMsgDBFolder.cpp#5703
 MsgStripQuotedPrintable((unsigned char *) aMsgSnippet.get());
is clearly hacky, but is it also crashy?
Flags: needinfo?(erahm)
> Any chance to get a reproducible case?

I don't have high hopes, even though at least two users crash frequently. But I will continue to try.
FWIW The only solid common theme in crash comments is downloading mail.
> is clearly hacky, but is it also crashy?

In theory it could be.  For example, aMsgSnippet could be pointing into immutable memory, and you're casting away const.

You should really be using BeginWriting() to get a writable pointer, at the very least.
Boris, I fully agree with your comment #19.

Would you be so kind as to look at this patch for me. I noticed that passing a char* wasn't really required.
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Flags: needinfo?(erahm)
Attachment #8934553 - Flags: review?(bzbarsky)
Comment on attachment 8934553 [details] [diff] [review]
1408175-MsgStripQuotedPrintable.patch

>+          buf.SetLength(--bufLength);

Why not just "bufLength - 1"?  bufLength is unused after this point, and it would be a lot clearer what's going on...

r=me

Whether this fixes the crashes, we'll see.
Attachment #8934553 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #21)
> Why not just "bufLength - 1"?  bufLength is unused after this point, and it
> would be a lot clearer what's going on...
I'll do that.

> r=me
Thanks!

> Whether this fixes the crashes, we'll see.
Indeed.
Addressed comment.
Attachment #8934553 - Attachment is obsolete: true
Attachment #8934579 - Flags: review+
If nightly is working then I will have affected users give it a shot.
Otherwise, will need a try build.
Whiteboard: [regression: ~2017-09-26]
Also, if we manage to reenable nightly automatic updates (disabled because of on add-ons issues), we should know the effectiveness of the fix within a day or two via nightly users.
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/0810e7646b06
pass nsCString to MsgStripQuotedPrintable() instead of a raw char pointer. r=bz
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
It will be in tomorrow's Daily.
Target Milestone: --- → Thunderbird 59.0
Comment on attachment 8934579 [details] [diff] [review]
1408175-MsgStripQuotedPrintable.patch (v1b)

Uplift won't do harm. If it still crashes, we might even get more information.
Attachment #8934579 - Flags: approval-comm-beta+
No crashes with 12-06 and 12-07 builds, plus I have two testers who used nightly and report success. Also no new crashes according to topcrash. So I think it is killed.  Thanks for all the diagnosis and patch.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.