Closed Bug 1016524 Opened 10 years ago Closed 7 years ago

crash in MimeMultipart_parse_eof with multipart/related. kid->clazz accessed after free (double reference to child)

Categories

(MailNews Core :: MIME, defect)

x86
All
defect
Not set
critical

Tracking

(thunderbird30 wontfix, thunderbird31 wontfix, thunderbird32 wontfix, thunderbird34 wontfix, thunderbird35 wontfix, thunderbird36 wontfix, thunderbird_esr31 wontfix, thunderbird_esr38 wontfix, thunderbird_esr5257+ verified, thunderbird57 fixed, thunderbird58 fixed)

VERIFIED FIXED
Thunderbird 58.0
Tracking Status
thunderbird30 --- wontfix
thunderbird31 --- wontfix
thunderbird32 --- wontfix
thunderbird34 --- wontfix
thunderbird35 --- wontfix
thunderbird36 --- wontfix
thunderbird_esr31 --- wontfix
thunderbird_esr38 --- wontfix
thunderbird_esr52 57+ verified
thunderbird57 --- fixed
thunderbird58 --- fixed

People

(Reporter: wsmwk, Assigned: jorgk-bmo)

References

()

Details

(6 keywords, Whiteboard: [regression:TB30.0a2][TB 57 beta => TB 52.5 ESR])

Crash Data

User Story

https://support.mozilla.org/en-US/questions/1138586 (allcargo)
https://support.mozilla.org/en-US/questions/1113325 (anthian)
https://support.mozilla.org/en-US/questions/1152766 (vilger)
https://support.mozilla.org/en-US/questions/1069076 (avalon512)
https://support.mozilla.org/en-US/questions/1160425 (AnneWu)
https://support.mozilla.org/en-US/questions/1086634 (pramitmozilla)
https://support.mozilla.org/t5/Thunderbird/Thunderbird-Crashes-continually-after-every-seconds/m-p/1386740#M147516
http://forums.mozillazine.org/viewtopic.php?f=39&t=3034383 (max64, friend of)

Attachments

(3 files, 4 obsolete files)

#5 crash for Tbird 30.0b1. Also exists in TB31.0a1 and TB32.0a1 ( bp-62871bd9-a23d-40bf-8221-444212140527 )

This bug was filed from the Socorro interface and is 
report bp-781871ed-c5a7-4942-af32-10edd2140523.
=============================================================

0	xul.dll	MimeMultipart_parse_eof	mailnews/mime/src/mimemult.cpp
1	xul.dll	MimeMultipartRelated_parse_eof	mailnews/mime/src/mimemrel.cpp
2	xul.dll	MimeMultipart_finalize	mailnews/mime/src/mimemult.cpp
3	xul.dll	MimeMultipartRelated_finalize	mailnews/mime/src/mimemrel.cpp
4	xul.dll	mime_free	mailnews/mime/src/mimei.cpp
5	xul.dll	MimeContainer_finalize	mailnews/mime/src/mimecont.cpp
6	xul.dll	MimeMessage_finalize	mailnews/mime/src/mimemsg.cpp
7	xul.dll	mime_free	mailnews/mime/src/mimei.cpp
8	xul.dll	mime_display_stream_complete	mailnews/mime/src/mimemoz2.cpp
9	xul.dll	nsStreamConverter::OnStopRequest(nsIRequest *,nsISupports *,tag_nsresult)	mailnews/mime/src/nsStreamConverter.cpp
10	xul.dll	nsMsgProtocol::OnStopRequest(nsIRequest *,nsISupports *,tag_nsresult)	mailnews/base/util/nsMsgProtocol.cpp
11	xul.dll	nsMailboxProtocol::OnStopRequest(nsIRequest *,nsISupports *,tag_nsresult)	mailnews/local/src/nsMailboxProtocol.cpp
presumably a regression.
no TB31.0a2 crashes, but probably has the potential given there are TB32.0a1 crashes
A message that is loaded when this crash happens would be extremely useful.
will try to get testcase message. but low crash count will make that difficult.

earliest crash is 30.0a2 20140411013247 bp-f238bdfb-c4dc-4128-98aa-a112d2140421
0	xul.dll	MimeMultipart_parse_eof	mailnews/mime/src/mimemult.cpp
1	xul.dll	MimeMultipartRelated_parse_eof	mailnews/mime/src/mimemrel.cpp
2	xul.dll	MimeMultipart_finalize	mailnews/mime/src/mimemult.cpp
3	xul.dll	MimeMultipartRelated_finalize	mailnews/mime/src/mimemrel.cpp
4	xul.dll	mime_free	mailnews/mime/src/mimei.cpp
5	xul.dll	MimeContainer_finalize	mailnews/mime/src/mimecont.cpp
6	xul.dll	MimeMessage_finalize	mailnews/mime/src/mimemsg.cpp
7	xul.dll	mime_free	mailnews/mime/src/mimei.cpp
8	xul.dll	mime_display_stream_complete	mailnews/mime/src/mimemoz2.cpp
9	xul.dll	nsStreamConverter::OnStopRequest(nsIRequest *,nsISupports *,tag_nsresult)	mailnews/mime/src/nsStreamConverter.cpp
10	xul.dll	nsImapCacheStreamListener::OnStopRequest(nsIRequest *,nsISupports *,tag_nsresult)	mailnews/imap/src/nsImapProtocol.cp

At least one user, hughes, has multiple crash signatures. 
For example  mime_parse_stream_complete bp-bb711a68-610d-464c-a352-21c5e2140328  bug 844647, which is waiting on bug 622092, which is waiting on a testcase
Whiteboard: [regression:TB30.0a2]
Blocks: TB31found
Keywords: testcase-wanted
#8 crash for TB31 - not quite high enough for crazy concern, so removing tracking request.  But we should reevalute in a couple weeks.

And a patch of course would be nice. Sadly (but not unusual) I still don't have testcase, nor what caused the regression.
Still #8 crash. But still no testcase
(In reply to Wayne Mery (:wsmwk) from comment #5)
> Still #8 crash. But still no testcase

This might get fixed by jsmime when we switch to it.
(In reply to Ludovic Hirlimann [:Usul] from comment #6)
> This might get fixed by jsmime when we switch to it.

Perhaps, but I'm not sure we can wait for version 38 with jsmime. Still #8.  The good news is it is not a topcrash for anything other than release builds. The bad news is one of the four users I've been in contact with has left Thunderbird - a high percentage, but not yet a statistically significant sample.
I don't forsee getting a testcase
FWIW, "some" users report this occurs during message deletes
I think this is a escalation of bug 547621 / bug 781049
(In reply to Wayne Mery (:wsmwk) from comment #10)
> I think this is a escalation of bug 547621 / bug 781049

In bug comment 9 Irving writes:
this is starting to look related to the fix in bug 478175 - if we never call MimeMultipartRelated_parse_eof(), we would double-free the "head" object of the multipart/related in MimeMultipartRelated_finalize() because there's a pointer to that part in both the children array and in ->headobj.  I haven't been able to recreate a situation where this happens, so this might not be the problem, but it's what I have for now.


For some class of users, 10-20% of their crashes are  	MimeMultipart_finalize.
For example alex 
--------------------------------------
 MimeMultipart_parse_eof bp-82737905-cdf0-4df7-800b-dea192141230
 0 	xul.dll	MimeMultipart_parse_eof	mailnews/mime/src/mimemult.cpp
1 	xul.dll	MimeMultipartRelated_parse_eof	mailnews/mime/src/mimemrel.cpp
2 	xul.dll	MimeMultipart_finalize	mailnews/mime/src/mimemult.cpp
3 	xul.dll	MimeMultipartRelated_finalize	mailnews/mime/src/mimemrel.cpp
4 	xul.dll	mime_free	mailnews/mime/src/mimei.cpp
5 	xul.dll	MimeContainer_finalize	mailnews/mime/src/mimecont.cpp 

http://hg.mozilla.org/releases/comm-esr31/annotate/7777d8694ab2/mailnews/mime/src/mimemult.cpp#l101
hg@0    96 static void
hg@0    97 MimeMultipart_finalize (MimeObject *object)
hg@0    98 {
hg@0    99  MimeMultipart *mult = (MimeMultipart *) object;
hg@0   100
bugzilla@10318   101  object->clazz->parse_eof(object, false);

http://hg.mozilla.org/releases/comm-esr31/annotate/7777d8694ab2/mailnews/mime/src/mimemrel.cpp#l1004
hg@0   997  MimeMultipartRelated *relobj = (MimeMultipartRelated *) obj;
asutherland@1946   998  MimeContainer *cont = (MimeContainer *)obj;
hg@0   999  int status = 0;
hg@0  1000  MimeObject *body;
hg@0  1001  char* ct;
hg@0  1002  const char* dct;
hg@0  1003
hg@0  1004  status = ((MimeObjectClass*)&MIME_SUPERCLASS)->parse_eof(obj, abort_p);

http://hg.mozilla.org/releases/comm-esr31/annotate/7777d8694ab2/mailnews/mime/src/mimemult.cpp#l697
hg@0   687  /* Now call parse_eof for our active child, if there is one.
hg@0   688   */
hg@0   689  if (cont->nchildren > 0 &&
hg@0   690    (mult->state == MimeMultipartPartLine ||
hg@0   691     mult->state == MimeMultipartPartFirstLine))
hg@0   692  {
hg@0   693    MimeObject *kid = cont->children[cont->nchildren-1];
bienvenu@2746   694    NS_ASSERTION(kid, "not expecting null kid");
hg@0   695    if (kid)
hg@0   696    {
hg@0   697      int status = kid->clazz->parse_eof(kid, abort_p);

--------------------------------------
 MimeMultipart_finalize bp-931be6c4-9080-489c-9496-3146a2141229
Frame	Module	Signature	Source
0 		@0xdbfc4589	
1 	xul.dll	MimeMultipart_finalize	mailnews/mime/src/mimemult.cpp
2 	xul.dll	MimeMultipartRelated_finalize	mailnews/mime/src/mimemrel.cpp
3 	xul.dll	mime_free	mailnews/mime/src/mimei.cpp
4 	xul.dll	MimeContainer_finalize	mailnews/mime/src/mimecont.cpp
5 	xul.dll	MimeMessage_finalize	mailnews/mime/src/mimemsg.cpp
6 	xul.dll	mime_free	mailnews/mime/src/mimei.cpp 

http://hg.mozilla.org/releases/comm-esr31/annotate/7777d8694ab2/mailnews/mime/src/mimei.cpp#l290
hg@0   283 mime_free (MimeObject *object)
hg@0   284 {
hg@0   285# ifdef DEBUG__
hg@0   286  int i, size = object->clazz->instance_size;
ehsan@13324   287  uint32_t *array = (uint32_t*) object;
hg@0   288# endif /* DEBUG */
hg@0   289
hg@0   290  object->clazz->finalize(object);

http://hg.mozilla.org/releases/comm-esr31/annotate/7777d8694ab2/mailnews/mime/src/mimemrel.cpp#l232
hg@0   221  if (relobj->file_buffer)
hg@0   222  {
bugzilla@10318   223    relobj->file_buffer->Remove(false);
mconley@13187   224    relobj->file_buffer = nullptr;
hg@0   225  }
asutherland@1946   226  
asutherland@1946   227  if (relobj->headobj) {
asutherland@1946   228    mime_free(relobj->headobj);
mconley@13187   229    relobj->headobj = nullptr;
asutherland@1946   230  }
hg@0   231
hg@0   232  ((MimeObjectClass*)&MIME_SUPERCLASS)->finalize(obj);

http://hg.mozilla.org/releases/comm-esr31/annotate/7777d8694ab2/mailnews/mime/src/mimemult.cpp#l101
hg@0    97 MimeMultipart_finalize (MimeObject *object)
hg@0    98 {
hg@0    99  MimeMultipart *mult = (MimeMultipart *) object;
hg@0   100
bugzilla@10318   101  object->clazz->parse_eof(object, false);
MimeMultipart_finalize probably has same cause, eg. bp-e81ac8a4-9672-47ea-ac09-a83232150414 (Andrew)
Crash Signature: [@ MimeMultipart_parse_eof] → [@ MimeMultipart_finalize] [@ MimeMultipart_parse_eof]
Removing myslef on all the bugs I'm cced on. Please NI me if you need something on MailNews Core bugs from me.
http://mxr.mozilla.org/comm-central/source/mailnews/mime/src/mimemult.cpp#697

694     NS_ASSERTION(kid, "not expecting null kid");
695     if (kid)
696     {
697       int status = kid->clazz->parse_eof(kid, abort_p);


Looks like kid->clazz is null for some reason
It would be great to kill this - I've sent Magnus some sample messages.

also  mime_free | MimeContainer_finalize, bug 547621 bp-a1de9aab-5f6a-40ef-9b2d-854a22150927
Crash address of 0x5a5a5a72 indicates that kid or kid->clazz is probably not null, but has that magic code that usually means freed memory. Not sure if uninitialized has the same value.
(In reply to Kent James (:rkent) from comment #18)
> *** Bug 1211845 has been marked as a duplicate of this bug. ***

That bug has sample messages that demonstrate the crash for that user (hidden for security reasons), but I cannot reproduce the crash using those messages.

I might try replacing the pointers with smart pointers (which initialize to zero) and add null checks. That might solve this, though not if the problem is an extra Release() somewhere.
Any action I can do to help track this down?
(In reply to Geoffrey Walton from comment #20)
> Any action I can do to help track this down?

What might help is for someone who can reproduce this reliably (such as yourself) to try to figure out the conditions that are necessary for the crash to occur.

You could start by trying to reproduce MY results, which are roughly:

1) Create a new profile, and add a new account to force Local Folders to appear.
2) Create a test folder under Local Folders.
3) Load the offending message as a file, and copy that tile to the "test" folder.
4) Open the offending message in the test folder.

For me, that did not crash. Try it.

If it crashes, try to get in a situation where this does not crash (such as install a different copy of Thunderbird, or install Thunderbird on a different machine, or disable any AV you have running, or extensions).

If it does not crash, try to make it crash, for example by trying the same series of operations on your existing profile.
ran thunderbierd -p to get profile manager.
Created new profile.
Added new imap acount.
Copied A.msf (1124KB)and A (5KB) to C:\Users\Geoffrey\AppData\Roaming\Thunderbird\Profiles\sqniz1t7.Test\ImapMail\just107.justhost.com\INBOX.sbd

Opened TB
Went to folder A
Nothing seen in there.

Created it in Local Folder and 3 mails dispplayed not 6.

got this error

bp-5092e055-9df0-4d34-b7b6-70da62151014

Will try disabling extns etc to see if I can get it to read without crashing

AdapterDeviceID: 0x0a16
AdapterDriverVersion: 10.18.10.3574
AdapterSubsysID: 227e103c
AdapterVendorID: 0x8086
Add-ons: %7Be2fda1a4-762b-4020-b5ad-a41df1933103%7D:4.3b2,%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D:41.0
AvailablePageFile: 5273358336
AvailablePhysicalMemory: 5638922240
AvailableVirtualMemory: 3801198592
BIOS_Manufacturer: Insyde
BlockedDllList: 
BreakpadReserveAddress: 35454976
BreakpadReserveSize: 67108864
BuildID: 20150908063005
CrashTime: 1444842516
EMCheckCompatibility: true
FramePoisonBase: 00000000f0de0000
FramePoisonSize: 65536
InstallTime: 1444131514
Notes: AdapterVendorID: 0x8086, AdapterDeviceID: 0x0a16, AdapterSubsysID: 227e103c, AdapterDriverVersion: 10.18.10.3574
D2D- D2D1.1- D2D1.1+ D2D+ DWrite- DWrite+ D3D11 Layers- D3D11 Layers+ 
ProductID: {3550f703-e582-4d05-9a08-453d09bdfdc6}
ProductName: Thunderbird
ReleaseChannel: beta
SafeMode: 0
SecondsSinceLastCrash: 199884
StartupTime: 1444842488
SystemMemoryUsePercentage: 33
TelemetryEnvironment: {"build":{"applicationId":"{3550f703-e582-4d05-9a08-453d09bdfdc6}","applicationName":"Thunderbird","architecture":"x86","buildId":"20150908063005","version":"41.0","vendor":null,"platformVersion":"41.0","xpcomAbi":"x86-msvc","hotfixVersion":null},"partner":{"distributionId":null,"distributionVersion":null,"partnerId":null,"distributor":null,"distributorChannel":null,"partnerNames":[]},"system":{"memoryMB":8122,"isWow64":true,"cpu":{"count":4,"vendor":null,"family":null,"model":null,"stepping":null,"extensions":["hasMMX","hasSSE","hasSSE2","hasSSE3","hasSSSE3","hasSSE4_1","hasSSE4_2"]},"os":{"name":"Windows_NT","version":"6.3","servicePackMajor":0,"servicePackMinor":0,"installYear":2015,"locale":"en-GB"},"hdd":{"profile":{"model":"HGST HTS541010A9E680","revision":"JA0OA710"},"binary":{"model":"HGST HTS541010A9E680","revision":"JA0OA710"},"system":{"model":"HGST HTS541010A9E680","revision":"JA0OA710"}},"gfx":{"D2DEnabled":true,"DWriteEnabled":true,"adapters":[{"description":"Intel(R) HD Graphics Family","vendorID":"0x8086","deviceID":"0x0a16","subsysID":"227e103c","RAM":null,"driver":"igdumdim64 igd10iumd64 igd10iumd64 igdumdim32 igd10iumd32 igd10iumd32","driverVersion":"10.18.10.3574","driverDate":"4-24-2014","GPUActive":true}],"monitors":[{"screenWidth":1366,"screenHeight":768,"refreshRate":60,"pseudoDisplay":false}]}},"settings":{"addonCompatibilityCheckEnabled":true,"blocklistEnabled":true,"isDefaultBrowser":null,"e10sEnabled":false,"telemetryEnabled":true,"isInOptoutSample":false,"locale":"en-US","update":{"channel":"beta","enabled":true,"autoDownload":true},"userPrefs":{"browser.cache.disk.capacity":358400,"devtools.debugger.remote-enabled":false}},"profile":{"creationDate":16722},"addons":{"activeAddons":{"{e2fda1a4-762b-4020-b5ad-a41df1933103}":{"blocklisted":false,"description":"An integrated calendar for Thunderbird","name":"Lightning","userDisabled":false,"appDisabled":false,"version":"4.3b2","scope":1,"type":"extension","foreignInstall":false,"hasBinaryComponents":false,"installDay":16722,"updateDay":16722}},"theme":{"id":"{972ce4c6-7e08-4474-a285-3208198ce6fd}","blocklisted":false,"description":"The default theme.","name":"Default","userDisabled":false,"appDisabled":false,"version":"41.0","scope":4,"foreignInstall":false,"hasBinaryComponents":false,"installDay":16714,"updateDay":16714},"activePlugins":[{"name":"Shockwave Flash","version":"19.0.0.185","description":"Shockwave Flash 19.0 r0","blocklisted":false,"disabled":false,"clicktoplay":false,"mimeTypes":["application/x-shockwave-flash","application/futuresplash"],"updateDay":16699}],"activeGMPlugins":{},"activeExperiment":{},"persona":null}}
Theme: classic/1.0
Throttleable: 1
TotalPageFile: 9858867200
TotalPhysicalMemory: 8516689920
TotalVirtualMemory: 4294836224
URL: 
Vendor: 
Version: 41.0
Winsock_LSP: MSAFD Tcpip [TCP/IP] : 2 : 1 : %SystemRoot%\system32\mswsock.dll 
 MSAFD Tcpip [UDP/IP] : 2 : 2 :  
 MSAFD Tcpip [RAW/IP] : 2 : 3 : %SystemRoot%\system32\mswsock.dll 
 MSAFD Tcpip [TCP/IPv6] : 2 : 1 :  
 MSAFD Tcpip [UDP/IPv6] : 2 : 2 : %SystemRoot%\system32\mswsock.dll 
 MSAFD Tcpip [RAW/IPv6] : 2 : 3 :  
 RSVP TCPv6 Service Provider : 2 : 1 : %SystemRoot%\system32\mswsock.dll 
 RSVP TCP Service Provider : 2 : 1 :  
 RSVP UDPv6 Service Provider : 2 : 2 : %SystemRoot%\system32\mswsock.dll 
 RSVP UDP Service Provider : 2 : 2 :  
 MSAFD RfComm [Bluetooth] : 2 : 1 : %SystemRoot%\system32\mswsock.dll
useragent_locale: en-US

This report also contains technical information about the state of the application when it crashed.
Believe last zip provided only contained 1 file.
(In reply to Kent James (:rkent) from comment #19)
> (In reply to Kent James (:rkent) from comment #18)
> > *** Bug 1211845 has been marked as a duplicate of this bug. *** 
> That bug has sample messages that demonstrate the crash for that user
> (hidden for security reasons), but I cannot reproduce the crash using those
> messages.
FTR, that is Geoffrey's bug

(In reply to Kent James (:rkent) from comment #21)
> (In reply to Geoffrey Walton from comment #20)
> > Any action I can do to help track this down?
> ...
> If it crashes, try to get in a situation where this does not crash (such as
> install a different copy of Thunderbird, or install Thunderbird on a
> different machine, or disable any AV you have running, or extensions).

Geoffrey, how many iterations did you attempt and were you able to create a situation where it did not crash?
Flags: needinfo?(geoffrey.walton)
Sorry still unable to open those messages without crashing.

Used hoard and squirrel clients and could read the messages using both so it does look like it is a problem with TB or TB and how it interacts with other applications.

I have recieved other emails from the senders of orig problem emails and they display without problem.
Flags: needinfo?(geoffrey.walton)
Kent, we seem to be lacking new information. How about we throw up a try build with your idea in comment 19 and run it past the several reporters I have?


(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #16)
> It would be great to kill this - I've sent Magnus some sample messages.

FTR I sent Kent the same samples, from Alex and Dave. Alex still has a ton of crashes.

And curiously, some users had another crash signature in 38.1.0/38.2.0. But this signature no longer exists in 38.3.0. What the heck did we do that made it stop? (rhetorical question)
alex nsStreamConverter::OnStopRequest(nsIRequest*, nsISupports*, nsresult) bp-2835c8e0-6f38-4091-9c49-ede762150813
boceion  nsStreamConverter::OnStopRequest(nsIRequest*, nsISupports*, nsresult) bp-8ce3b55c-34f5-41c6-b24b-e0e392150727
Flags: needinfo?(rkent)
I looked at this again, still cannot reproduce, and my suggestions in comment 19 don't really work. Pretty sure that kid->clazz is being doubly released.

There is no easy way to fix this without a developer being able to reproduce. I cannot.
Flags: needinfo?(rkent)
like Kent, I am unable to reproduce with testcase from bug 1211845. And I still don't have a testcase from a user that outright crashes.

This morning I did teamviewer with a user, who can easily reproduce, but not quite 100%. Dave's sample message in local folders has multipart/related [1].  Typically, it does NOT crash when just clicking on the message after entering a folder. It DOES crash when scrolling the messages list, i.e. viewing 2-3 other messages, before hitting the crashing message.

I also haven't had anyone test that version 3 patch in bug 478175 causes this.

[1] Content-Type: multipart/related;
    boundary="_005_372E891255F4064CA7D264E3B2C0F5A442E0EA22GLEXCH1goldline_";
    type="multipart/alternative"

--_005_372E891255F4064CA7D264E3B2C0F5A442E0EA22GLEXCH1goldline_
Content-Type: multipart/alternative;
    boundary="_000_372E891255F4064CA7D264E3B2C0F5A442E0EA22GLEXCH1goldline_"

--_000_372E891255F4064CA7D264E3B2C0F5A442E0EA22GLEXCH1goldline_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Summary: crash in MimeMultipart_parse_eof → crash in MimeMultipart_parse_eof with multipart/related
Since Wayne CC'ed me on the bug I've downloaded the mailbox in attachment 8673791 [details]. On a local build I can't see a crash.

(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #29)
> It DOES crash when scrolling the messages list, i.e.
> viewing 2-3 other messages, before hitting the crashing message.
So which one is the "crashing message"?

I tried to look at attachment 8670197 [details] from bug 1211845 but I'm not authorised.
(In reply to Jorg K (GMT+1) from comment #30)
> Since Wayne CC'ed me on the bug I've downloaded the mailbox in attachment
> 8673791 [details]. On a local build I can't see a crash.
> 
> (In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment
> #29)
> > It DOES crash when scrolling the messages list, i.e.
> > viewing 2-3 other messages, before hitting the crashing message.
> So which one is the "crashing message"?

Emailed you two of Dave's

 
> I tried to look at attachment 8670197 [details] from bug 1211845 but I'm not
> authorised.

cc you on the bug.  If you can't view it let me know.
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #31)
> Emailed you two of Dave's
I received those, but they don't crash when viewing them in TB 47 (Earlybird). Also tried in TB 38.6. No crash.

> > I tried to look at attachment 8670197 [details] from bug 1211845 but I'm not
> > authorised.
> cc you on the bug.  If you can't view it let me know.
Thanks, but still no access.
(In reply to Jorg K (GMT+1) from comment #32)
>..
> 
> > > I tried to look at attachment 8670197 [details] from bug 1211845 but I'm not
> > > authorised.
> > cc you on the bug.  If you can't view it let me know.
> Thanks, but still no access.

just emailed to you
Geoffrey, 
Which antivirus software do you use?
If you have other examples that crash please send me a copy. 

For the record, jorgk was unable to reproduce from comment #32.
Finding reliable testcase continues to be incredibly difficult.
Flags: needinfo?(geoffrey.walton)
Observed this crash as well, with TB 38.6.0: https://crash-stats.mozilla.com/report/index/5200e0cf-6929-4b81-b553-80d892160815

TB always crashes when I view one of two mails from a particular thread (which I can't publish). Do you still need a mail to reproduce this issue? If you can tell me what to look for, I can try to narrow the mail down to a publishable crasher mail.
(In reply to oliver.gerlich from comment #35)
> Do you still need a mail to reproduce this issue?
> If you can tell me what to look for, I can try to narrow the mail down to a
> publishable crasher mail.

Yes, please do.  Some users report just clicking on a message doesn't caue a crash - it requires navigating through 2-3 messages.
1. Create a new folder crashtests in Local Folders.
2. Copy crashing messages (2-3 if possible) into crashtests folder. (Turn off message display pane using F8, to avoid crashing when copying)
3. Turn on message display, F8, and test check again that accessing the messages in crashtests cause you to crash.
4. find crashtests file in   Mail\Local Folders   in your profile https://support.mozilla.org/en-US/kb/profiles-tb
5. zip and email me the file

--
One reporter, Ian, says his crashing stopped after removing anti-virus Vipre Business Premium
Flags: needinfo?(geoffrey.walton) → needinfo?(oliver.gerlich)
Well what can I say, today the messages don't cause a crash anymore. Yesterday TB crashed 4 or 5 times, each time when viewing one of these particular messages; today the messages are displayed just fine, both in the original folder and also in the crashtests folder. Sorry, guess I can't help with this issue any more.
from bug 1211845 comment 7

When running with the attached messages, I never reach the code where it crashes. Whenever I get the stack in the same state, obj->closed_p is true at this point, and the method returns:

static int
MimeMultipart_parse_eof (MimeObject *obj, bool abort_p)
{
  MimeMultipart *mult = (MimeMultipart *) obj;
  MimeContainer *cont = (MimeContainer *) obj;

  if (obj->closed_p) return 0;
Flags: needinfo?(oliver.gerlich)
Summary: crash in MimeMultipart_parse_eof with multipart/related → crash in MimeMultipart_parse_eof with multipart/related. kid->clazz is being doubly released?
There are a fair number of linux crashes.
Oddly rare for Mac, but there are some
bp-6967ee26-7484-43a9-b1e4-38d8c2160920
bp-ce59dada-b38f-4b1a-8ca1-bb37f2160902
OS: Windows NT → All
Depends on: 1310467
Had this problem again, and could cure it this time by freeing up space on /tmp/.

So if /tmp is completely full ("df /tmp/" says 0 bytes are available), viewing these particular mails crashes TB in MimeMultipart_parse_eof(). It doesn't matter if the mail is viewed in preview pane or in a separate window. Once I free up some space in /tmp, I can view the mail without problems. This also explains why I couldn't reproduce the problem as stated in comment 37 - probably I had freed up /tmp space in between.

On this system /tmp is a separate (smallish) partition which sometimes has no space left (due to history files from Konsole terminal). Also, the crash doesn't happen with all mails. The crashing mails are somewhat huge (80 and 350 kb), with HTML and lots of quoted mails and with embedded images. Mails from the same thread which are shortened (quoted mails and images were removed) can be viewed even when /tmp is full.
Maybe decoding the images etc. requires space in /tmp and fails silently if there is no space?
Blocks: 825513
Hi ,
- i've got this problem too ;) - what can i do to help you ? 
 
User was working on machine and TB something about 2 years , everething was fine till yesterday .
I've tried everething :

TB with safe mode ,running as administrator , with windows safe mode , with and without AV , install several versions 40-50b clearing all directories in AppData/{Local,Roaming}/Thunderbird 

TB start crash after downloading all messages and keeps crash after 20-30s of running . 

I've notice message that are 'blank' (it shows properly on other machines ) and if i show source everething is there - it was message that was replied and forwarded multiple times to multiple users . 

Below crash reports . User works on roundcube now ;) 

https://crash-stats.mozilla.com/report/index/44adc873-224e-4866-ac7f-9d42d2170104
https://crash-stats.mozilla.com/report/index/77d31b68-efc8-4005-9ead-201ea2170104
https://crash-stats.mozilla.com/report/index/3a658584-8575-4286-9be0-5a14a2170104
https://crash-stats.mozilla.com/report/index/aafd9972-4501-466e-a61f-e837e2170104
https://crash-stats.mozilla.com/report/index/01275181-fce9-4cf6-adff-acdb82170104
https://crash-stats.mozilla.com/report/index/db18251e-30e3-403f-b485-15cd12170104
https://crash-stats.mozilla.com/report/index/5451e925-59bb-490c-a875-498d92170104
If you can reliably reproduce the problem on a certain mailbox, it would be good to zip up this mailbox and made it available to the developers. You don't have to attach it here, you can upload it to a file share host (like WeTransfer.com) and let us know. We won't publish any of the e-mail you give us.

Without a reproducible case we won't be able to fix this.

It' always the same crash at mimemult.cpp:697 which reads:
  int status = kid->clazz->parse_eof(kid, abort_p);
and that code hasn't changed since it was imported into Mozilla's Mercurial repository in July 2008.
Wayne Mery contact me - we will try do some remote magic.
Remote magic won't do it. We believe that you can crash it, but *I* need to crash it on my machine. We need the reproducible case on a the machine of a developer so we can watch it with the debugger connected.
Jorg - i think Wayne is trying to debug problem on our PC.

He asked me to install visual studio and TB from "tinderbox-builds/comm-central-win32-debug" , if this not help i'll try to zip mailbox .
Oh, and he mentioned that the debug build doesn't crash
Ran into it 3 times in a row just half an hour ago. After which I discovered I had ran out of space on /. So whatever Thunderbird/Gecko is trying to store outside profile folder (which I have on another partition that had free space available) is not checking for out of space error somewhere. Possibly related to the folder created in /tmp on startup?

There was no need to click/scroll anything. First time it happened in the background. The other two times shortly after restarting Thunderbird while it was still checking for new mail and minding other automatic startup tasks.
(In reply to Merike Sell (:merike) from comment #50)
> Ran into it 3 times in a row just half an hour ago. After which I discovered
> I had ran out of space on /.

Merike, do you have crash IDs from those failures?
Flags: needinfo?(merikes.lists)
Can this be triggered at will? Or it comes and goes?

All I can think of is

    MimeObject *kid = cont->children[cont->nchildren-1];
    NS_ASSERTION(kid, "not expecting null kid");
-    if (kid)
+    if (kid && kid->clazz)
    {
      int status = kid->clazz->parse_eof(kid, abort_p);
      if (status < 0) return status;
    }
  }
(In reply to Jorg K (GMT+2) from comment #53)
> Can this be triggered at will? Or it comes and goes?

On low disk space, yes. Just tried with beta (56.0b3) and the mbox attached here triggered https://crash-stats.mozilla.com/report/index/567cfd2e-59a2-4cbe-8a3d-579ab0171021 with 100K available space on /. I had previously saved this mbox under Local Folders and just selecting different messages was enough to trigger it.
I think the key here is that the system needs to run low on disk space for the crash to happen. MIME uses all sorts of temporary files, and surely when it fails to create one of those, some faulty error handling will lead to a crash.

Merike reports that the problem is reproducible with low disk space and so does the reporter in comment #40.
I can reliably reproduce the crash now.

STR:
- Import the mailbox A from the attached ZIP file.
- Compile TB with the patch. The patch simulates "no disk space".
- Double-click a message "32 Market Place", regardless which.
  They all have attachments.
- Double-click a JPG attachment.

Crash at mimemult.cpp:697

Debugger says: kid->clazz was 0xE5E5E5E5.

Kent: Keen to fix it? I promise a quick review ;-)
Flags: needinfo?(rkent)
(In reply to Jorg K (GMT+2) from comment #56)
> Created attachment 8920823 [details] [diff] [review]
> mime-no-space.patch - Patch to simulate "no disk space left"
...
> Debugger says: kid->clazz was 0xE5E5E5E5.

does this correlate nicely to comment 11?  (which should read bug 781049 comment 9)
Bug 781049 comment #9 is indeed very interesting reading (quote):
===
if we never call MimeMultipartRelated_parse_eof(), we would double-free the "head" object of the multipart/related in MimeMultipartRelated_finalize() because there's a pointer to that part in both the children array and in ->headobj.
===

What I see is very similar:
The child/kid object in question gets free'ed here:
mimemrel.cpp:228: mime_free(relobj->headobj);
Code was introduced here:
https://hg.mozilla.org/comm-central/rev/f0bdc29e9c1c#l1.14

And a little later, it's accessed as a child/kid of a container here at the crash site: mimemult.cpp:697.

So the bug is not a "double free", but an "access after free" since a pointer to the free'ed |relobj->headobj| is held as a child of the container.

Now that we understand what's happening, the interesting question is how to fix it.
As it's impossible to get a review by Kent, I'm asking my "all purpose" reviewer here. The crash will only happen under some weird circumstances, for example, if the system is running out of temp space (which the other patch simulates).

In this case things go wrong and to avoid that they go worse, I'm killing the second reference to the child here which will be stale after the free.

With both patches applied I now get an error that the attachment is empty which is not a surprise since it coudn't get extracted out to disk.

Merike, do you compile TB? Then you can try the patch on your system.
Flags: needinfo?(merikes.lists)
Attachment #8920828 - Flags: review?(acelists)
Assignee: nobody → jorgk
Summary: crash in MimeMultipart_parse_eof with multipart/related. kid->clazz is being doubly released? → crash in MimeMultipart_parse_eof with multipart/related. kid->clazz accessed after free (double reference to child)
User Story: (updated)
I've decided to improve the comment by including all the information from bug 781049 comment #9. I also decided to harden the algorithm and not only find the last child but any child.

I've thought about not adding the header as child in the error condition, but concluded that this would be to much of a regression-prone change.
Attachment #8920828 - Attachment is obsolete: true
Attachment #8920828 - Flags: review?(acelists)
Attachment #8920855 - Flags: review?(acelists)
Sorry about the noise, fixed white-space issues and a logic error.
Attachment #8920855 - Attachment is obsolete: true
Attachment #8920855 - Flags: review?(acelists)
Attachment #8920856 - Flags: review?(acelists)
Talking about logic errors, *now* I fixed it.
Attachment #8920856 - Attachment is obsolete: true
Attachment #8920856 - Flags: review?(acelists)
Flags: needinfo?(rkent)
Attachment #8920857 - Flags: review?(acelists)
(In reply to Jorg K (GMT+2) from comment #59)
> With both patches applied I now get an error that the attachment is empty
> which is not a surprise since it coudn't get extracted out to disk.
In error console? I don't get anything even hinting that issues are caused by lack of disk space. Just empty messages when trying to load them. No crashes with the patch though. Frankly, apart from Thunderbird maybe crashing it's rather hard to spot out of disk space on Linux because it continues working almost as usual.

Without the patch I was unsuccessful crashing with the same signature on nightly. Got this one instead: https://crash-stats.mozilla.com/report/index/3a43dcae-d40b-419a-bbfd-af4ab0171023
Flags: needinfo?(merikes.lists)
Hints for the reviewer:

For multipart messages there is an object representing the message and it has children that are the message parts.

The object also points to a "header" part, I'm not 100% sure what that represents, see:
https://dxr.mozilla.org/comm-central/source/mailnews/mime/src/mimemrel.h#48
  MimeObject* headobj;    /* The actual text/html head object. */
which may be the part that represents the body that is rendered. The children are related objects, like embedded/attached images.

If you look at MimeMultipartRelated_parse_eof() you can see

  // replace the existing head object with the new object
  for (int iChild = 0; iChild < cont->nchildren; iChild++) {
    if (cont->children[iChild] == relobj->headobj) {
      // cleanup of the headobj is performed explicitly in our finalizer now
      //  that it does not get cleaned up as a child.
      cont->children[iChild] = body;
      body->parent = obj;
      body->options = obj->options;
    }
  }

So wherever we find a child that equals the header, we overwrite the child with something else.

This *must* be done when free'ing the object in MimeMultipartRelated_finalize() as the detailed comment explains.

The link is that under some error conditions MimeMultipartRelated_parse_eof() doesn't run.

In conclusion, the added code does absolutely no harm and fixes the crash as you as you can see with the test patch.
(In reply to Merike Sell (:merike) from comment #63)
> ...
> Without the patch I was unsuccessful crashing with the same signature on
> nightly. Got this one instead: bp-3a43dcae-d40b-419a-bbfd-af4ab0171023

bp-59ce416b-f065-46b7-9260-29c870171023 is libxul.so@0xe7321c | libxul.so@0xe6e33c | libxul.so@0xe72cc3 | libxul.so@0xe640c1 | libxul.so@0xe57c80 | libxul.so@0xe640c1 | libxul.so@0xe6e00b | libxul.so@0xe81d60 | libxul.so@0x1c13e9f | libxul.so@0xca36d1 | libxul.so@0xe2558e | libxul.so@0x100e4bb   with no thunderbird symbols
Comment on attachment 8920857 [details] [diff] [review]
1016524-fix-ancient-MIME-crash.patch (v2c)

Review of attachment 8920857 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this does prevent the crash.
I assume it would be quite hard to replace the attachment with a dummy message saying what happened?
Attachment #8920857 - Flags: review?(acelists) → review+
Use after free makes this a security sensitive issue.

> I assume it would be quite hard to replace the attachment with a dummy message saying what happened?
Indeed.
Group: mail-core-security
I have several users I can ask to test an eventual nightly or try build. That's a rare thing that we should use to our advantage. So IMO this patch should not be uplifted to beta until we use them to determine whether there are any ill effects in their environment.
Added a break since the algorithm wouldn't have found two subsequent children pointing to the header anyway.
Attachment #8920857 - Attachment is obsolete: true
Attachment #8922507 - Flags: review+
https://hg.mozilla.org/comm-central/rev/427b418a19b39c29da079e92079811991d7b5d13
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 58.0
Comment on attachment 8922507 [details] [diff] [review]
1016524-fix-ancient-MIME-crash.patch (v2d)

Top crash, "use after free" vulnerability: Should be uplifted.
Attachment #8922507 - Flags: approval-comm-esr52?
Attachment #8922507 - Flags: approval-comm-beta+
Group: mail-core-security → core-security-release
This bug produces about 1000 crashes a day, mostly on TB 52.4.
https://crash-stats.mozilla.com/signature/?signature=MimeMultipart_parse_eof&date=%3E%3D2017-10-26T13%3A55%3A31.000Z&date=%3C2017-10-27T13%3A55%3A31.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#reports

I think it would be pretty irresponsible to get feedback on what be believe is the solution.

All the code does is remove a child from a child list. Failure to remove that child will lead to a crash 100%. From a software engineering point of view the patch is pretty low risk, so the risk of "ill effects" is far lower than the risk of exposing users to crashes and vulnerabilities.
Sorry:
I think it would be pretty irresponsible NOT to get feedback on what be believe is the solution on a larger scale in beta right now.
Well, then the beta is going to wait, because I'm not going to waste the contacts I have cultivated while doing QA
I'm not sure what your contacts will do for you here. This bug is on file for more than three years and only Merike was able to reproduce it semi-reliably under certain conditions. This is not like bug 1368786 where we had a user that could reliably tell whether a binary crashed or not.
(In reply to Jorg K (GMT+2) from comment #75)
> I'm not sure what your contacts will do for you here. This bug is on file
> for more than three years and only Merike was able to reproduce it
> semi-reliably under certain conditions. This is not like bug 1368786 where
> we had a user that could reliably tell whether a binary crashed or not.

You seem to have so little confidence in my work ethic, experience and recommendations that I'm not sure it's worth giving more detail. I will state one. In this bug we have a datapoint of one, under controlled conditions. Amongst the users I have contact, many crash a dozen or more times per week*, even up to 80 per week**.  So while we have never had a testcase message by which you or I were able to reproduce the problem, we have users who crash on a daily basis and can certainly measure the impact of a patch (and I was on two user systems). More importantly, one might presume, we have variability of environment and messages which will give us more assurance of effectiveness and whether there are side effects which the current test can't possibly predict.  Yes, uplifting quickly to beta will give us more data - but not at zero risk.

* https://crash-stats.mozilla.com/search/?signature=%3DMimeMultipart_parse_eof&product=Thunderbird&date=%3E%3D2017-10-13T10%3A32%3A52.000Z&date=%3C2017-10-27T10%3A32%3A52.000Z&_sort=user_comments&_sort=-date&_facets=signature&_facets=install_time&_columns=date&_columns=version&_columns=build_id&_columns=user_comments#facet-install_time

** users with these high crash rates suggest there is a tolerance users will endure without going ballistic, at least for this crash signature. The user comments in the crashes, and the users with whom I have direct contact, seem consistent with this theory. This suggests there is less need to rush the QA process.
I think you seem to have so little confidence in my work ethic, experience and recommendations :-(

I wouldn't want to push a "dangerous" fix that potentially changes system behaviour out the door. The fix here is well understood and pretty harmless and addresses the exact "use after free" that has been seen by myself and Kent and others.

Anyway, if you have users that crash a lot, please point them to
https://archive.mozilla.org/pub/thunderbird/tinderbox-builds/comm-central-linux/1509055821/ - Linux32
https://archive.mozilla.org/pub/thunderbird/tinderbox-builds/comm-central-linux64/1509055821/ - Linux64
https://archive.mozilla.org/pub/thunderbird/tinderbox-builds/comm-central-macosx64/1509055821/ - Mac
https://archive.mozilla.org/pub/thunderbird/tinderbox-builds/comm-central-win32/1509055821/ - Win32
https://archive.mozilla.org/pub/thunderbird/tinderbox-builds/comm-central-win64/1509055821/ - Win64
since sadly Daily didn't run today. I'm looking into getting it going.
I also think we should not push this to beta if we can avoid it and Wayne has some testers that can try it out before beta.
We can't be sure the patch is low risk. I myself haven't understood it much, it does some strange things in the mime structure. It does fix the crash on the sample messages (I have tested that), but we don't know of other sideeffects. It is not a clear-cut code of "if MimeMultipartRelated_parse_eof wasn't called, do this...", we just check the condition indirectly. And who knows if that check may also catch some other situations? Yes, as jcranmer nor kent looked at this yet, Jorg is currently the most knowledgeable to assess if this is low risk or not, as he made the fix. But if we can get a second assessment from the selected nightly testers, I think we should do it first.
Good job twisting my words to your purposes.  You will agree, I hope, that it is rare that I offer an opinion. Also, we have different perspectives and different approaches (including a different perspective of risk) that should be complimentary, not antagonistic.
Maybe we should focus on getting some results, links in comment #77. If I can get bug 1412352 and bug 1412290 fixed today, I'll spin a new Daily. So far we're busted with no ETA since we depend in releng.
ardy who had this crash from https://support.mozilla.org/en-US/questions/1182156 also writes that he had seen "Unable to write the email to the mailbox. Make sure the file system allows you write privileges, and you have enough disc space to copy the mailbox."
Well, we know that running out of disk space will create the error condition that leads to the crash. That's what Merike discovered for us and that was key to reproducing the crash and finding a fix, which curiously had already been known for a long time ( bug 781049 comment #9). As I said many times: The fix does no harm and will fix some crashes, maybe all, maybe it will shift the error elsewhere. The only way to find out is to expose this to a wider audience.
Whiteboard: [regression:TB30.0a2] → [regression:TB30.0a2][TB 57 beta => TB 52.5 ESR]
Attachment #8922507 - Flags: approval-comm-esr52? → approval-comm-esr52+
57 beta crash-stats looks good. And the several users I have feedback from also report success with no ill effects
Verified as per comment #84. This will be uplifted to TB 52.5 right now.
Status: RESOLVED → VERIFIED
No crashes after 52.4.0.  v.fixed
See Also: → 1240772
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: