Crash in [@ MimeMultipartAlternative_flush_children] opening message
Categories
(MailNews Core :: MIME, defect, P1)
Tracking
(thunderbird_esr78 unaffected, thunderbird91+ fixed)
Tracking | Status | |
---|---|---|
thunderbird_esr78 | --- | unaffected |
thunderbird91 | + | fixed |
People
(Reporter: wsmwk, Assigned: benc)
References
Details
(4 keywords)
Crash Data
Attachments
(8 files, 1 obsolete file)
3.25 MB,
message/rfc822
|
Details | |
17.78 KB,
application/octet-stream
|
Details | |
3.59 MB,
message/rfc822
|
Details | |
5.59 MB,
application/zip
|
Details | |
81.46 KB,
message/rfc822
|
Details | |
45.31 KB,
image/png
|
Details | |
927.98 KB,
image/png
|
Details | |
48 bytes,
text/x-phabricator-request
|
wsmwk
:
approval-comm-beta+
|
Details | Review |
+++ This bug was initially created as a clone of Bug #1663664 +++
Still a top 10 crash for 83 beta. Example...
bp-fd91392e-b37a-4d26-ab75-565800201125
0 xul.dll MimeMultipartAlternative_flush_children(MimeObject*, bool, priority_t) comm/mailnews/mime/src/mimemalt.cpp:233
1 xul.dll MimeMultipartAlternative_parse_eof(MimeObject*, bool) comm/mailnews/mime/src/mimemalt.cpp:252
2 xul.dll MimeMultipartRelated_parse_eof(MimeObject*, bool) comm/mailnews/mime/src/mimemrel.cpp:1067
3 xul.dll MimeMultipart_close_child(MimeObject*) comm/mailnews/mime/src/mimemult.cpp:505
4 xul.dll MimeMultipart_parse_line(char const*, int, MimeObject*) comm/mailnews/mime/src/mimemult.cpp:140
5 xul.dll mime_LineBuffer(char const*, int, char**, int*, unsigned int*, bool, int ()(char, unsigned int, void*), void*) comm/mailnews/mime/src/mimebuf.cpp:205
6 xul.dll MimeObject_parse_buffer(char const*, int, MimeObject*) comm/mailnews/mime/src/mimeobj.cpp:223
7 xul.dll MimeMessage_parse_line(char const*, int, MimeObject*) comm/mailnews/mime/src/mimemsg.cpp:179
8 xul.dll mime_LineBuffer(char const*, int, char**, int*, unsigned int*, bool, int ()(char, unsigned int, void*), void*) comm/mailnews/mime/src/mimebuf.cpp:205
9 xul.dll MimeObject_parse_buffer(char const*, int, MimeObject*) comm/mailnews/mime/src/mimeobj.cpp:223
10 xul.dll mime_display_stream_write(_nsMIMESession*, char const*, int) comm/mailnews/mime/src/mimemoz2.cpp:850
11 xul.dll nsStreamConverter::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long long, unsigned int) comm/mailnews/mime/src/nsStreamConverter.cpp:824
12 xul.dll mozilla::dom::MediaDocumentStreamListener::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long long, unsigned int) dom/html/MediaDocument.cpp:89
13 xul.dll mozilla::net::ParentChannelWrapper::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long long, unsigned int) netwerk/ipc/ParentChannelWrapper.h:27
14 xul.dll mozilla::net::DocumentLoadListener::ResumeSuspendedChannel(nsIStreamListener*) netwerk/ipc/DocumentLoadListener.cpp:1179
15 xul.dll mozilla::net::DocumentLoadListener::FinishReplacementChannelSetup(nsresult) netwerk/ipc/DocumentLoadListener.cpp:1089
16 xul.dll mozilla::net::DocumentLoadListener::RedirectToRealChannelFinished(nsresult) netwerk/ipc/DocumentLoadListener.cpp:1030
17 xul.dll mozilla::net::DocumentLoadListener::TriggerRedirectToRealChannel::<unnamed-tag>::operator()(nsresult const&) netwerk/ipc/DocumentLoadListener.cpp:1970
18 xul.dll mozilla::MozPromise<nsresult, mozilla::ipc::ResponseRejectReason, 1>::ThenValue<lambda at /builds/worker/checkouts/gecko/netwerk/ipc/DocumentLoadListener.cpp:1961:11',
lambda at /builds/worker/checkouts/gecko/netwerk/ipc/DocumentLoadListener.cpp:1972:11'>::DoResolveOrRejectInternal(mozilla::MozPromise<nsresult, mozilla::ipc::ResponseRejectReason, 1>::ResolveOrRejectValue&) xpcom/threads/MozPromise.h:769
19 xul.dll mozilla::MozPromise<CopyableTArray<bool>, bool, 0>::ThenValueBase::ResolveOrRejectRunnable::Run() xpcom/threads/MozPromise.h:410
20 xul.dll mozilla::RunnableTask::Run() xpcom/threads/TaskController.cpp:450
21 xul.dll mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) xpcom/threads/TaskController.cpp:720
22 xul.dll mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) xpcom/threads/TaskController.cpp:579
Comment 1•4 years ago
|
||
Crash at https://searchfox.org/comm-central/rev/709ab8c3e9eecdfe11d848cf717236948b4d6800/mailnews/mime/src/mimemalt.cpp#233. I'd assume pending_parts is wrong, but why...
Reporter | ||
Comment 2•4 years ago
|
||
Also the fix of bug 1663664 had either little or no effect on the crash rate, according to the 6 month graph https://crash-stats.mozilla.org/signature/?signature=MimeMultipartAlternative_flush_children&date=%3E%3D2020-06-05T09%3A01%3A00.000Z&date=%3C2020-12-05T09%3A01%3A00.000Z#summary
Reporter | ||
Comment 3•4 years ago
|
||
this signature arose with version 81, same as bug 1675784
Assignee | ||
Comment 4•4 years ago
|
||
I've had a bit of a blind look down through the call stack and can't see anything obvious... but mainly that's a factor of me not being at all familiar with the mime code. I'd need to sit down and really dig into it I think.
But. One thing I did notice: in most of the crash reports the access violations are on memory locations which look very suspiciously like ASCII characters. So I'd guess either a buffer overrun somewhere, or using uninitialised memory (which would probably fit in with pending_parts
being wrong, as per Comment 1).
to reporduce: Specific email causes crash - just clicking to open kills TB, every time
Comment 6•4 years ago
|
||
Can you attach a test case as .eml?
Comment 7•4 years ago
|
||
Test case EML file.
Comment 8•4 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #6)
Can you attach a test case as .eml?
I'm having the same issue as Antony states above. Attaching a copy of the email EML file that's causing the issue.
Reporter | ||
Comment 9•4 years ago
|
||
(In reply to willdanneriv from comment #7)
Created attachment 9229544 [details]
FW ERN Log Phase 1 Training.emlTest case EML file.
unfortunately doesn't crash for me
Reporter | ||
Comment 10•4 years ago
|
||
OTOH, I opened and forwarded it to myself as attachment, and I consistently crash when clicking on the message in Inbox.
bp-24113461-f891-4454-ac18-66ca20210701
Reporter | ||
Comment 11•4 years ago
|
||
Comment 12•4 years ago
|
||
Ben, please have a look.
Comment 13•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #10)
OTOH, I opened and forwarded it to myself as attachment, and I consistently crash when clicking on the message in Inbox.
bp-24113461-f891-4454-ac18-66ca20210701
I did the same test, and it happened to me too
I downloaded the eml
Opened in TB 90.b3
Forwarded to my self
When I try to open the email from my Inbox, and crash bp-c32bf094-d5e6-45e3-b02a-0f3f00210701
Comment 14•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #10)
OTOH, I opened and forwarded it to myself as attachment, and I consistently crash when clicking on the message in Inbox.
bp-24113461-f891-4454-ac18-66ca20210701
This is the behavior I'm seeing as well. I'm on TB 90.0b3 (64-bit).
The few emails it has happened on were forwarded emails with that as an attachment.
Comment 15•4 years ago
|
||
file as ref in my latest comment date 8/7
Comment 16•4 years ago
|
||
Further testing shows that a reply email, from this one correspondent, on the back of origin from me, works, no issue.
Sending an eml file as well of one that crashed... if i set the body message as Plain Text, I can open the msg, thogh it is is ful of Base64 chrs
QkVHSU46VkNBTEVOREFSDQpNRVRIT0Q6UkVRVUVTVA0KUFJPRElEOk1pY3Jvc29mdCBFeGNoYW5n
as the first line
Also, seems that the original email from this source works, since then fwd or resends with additional attachments causes crash.
Regressed to 90.b1 same issue
Reporter | ||
Comment 17•4 years ago
|
||
bizzare coincidence, I just crashed bp-0a947e96-d2b0-4204-acd7-11ca50210709 90.0b3 on my virtual PC.
It wasn't the testcase, but I had just received a message whose composition is similar to the testcase - be I can't reproduce the crash.
However, the testcase does reproduce on Windows and Mac.
Based on crash-stats, this is a top crash for beta, and I believe this will emerge as a top crash when we release version 91. AFICS it's not an issue on version 78
Comment 18•4 years ago
|
||
Comment 19•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #17)
bizzare coincidence, I just crashed bp-0a947e96-d2b0-4204-acd7-11ca50210709 90.0b3 on my virtual PC.
It wasn't the testcase, but I had just received a message whose composition is similar to the testcase - be I can't reproduce the crash.
However, the testcase does reproduce on Windows and Mac.Based on crash-stats, this is a top crash for beta, and I believe this will emerge as a top crash when we release version 91. AFICS it's not an issue on version 78
Wayne
received a message whose composition is similar to the testcase
Are you referring to my eml I attached - if yes my concern is tht this is as plain text and hence may not help reporduce (if not my eml - then no prob)
I will see if I can attach the original email that crashes (well I hope it is the original version) might not be any diff
Comment 20•4 years ago
|
||
had to zip as the original is just a bit too big, its full of attachments HTML
Use this dump the previous file
Comment 21•4 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #9)
(In reply to willdanneriv from comment #7)
Created attachment 9229544 [details]
FW ERN Log Phase 1 Training.emlTest case EML file.
unfortunately doesn't crash for me
Nor me!!!
Comment 22•4 years ago
|
||
Is there a max size for incoming attachments?
I notice that the email (in fact a resending of the same email with attachments being added with each subsequent send), grows in size
One is at 18 mg size of attachments which hen equates to 25 mg email size
Oddly, if I fwd same email to another address of mine, it opens fine - no crash (will ALL attachments!) so I guess that rather answers my question above :-/
However, the original text part of the msg is missing
Same for the original email that crashes TB - fwd to my outlook, opens but no msg in body: attachments are there though
Comment 23•4 years ago
|
||
see my comment and look at screenshot referred to
Comment 24•4 years ago
|
||
Comment 25•4 years ago
|
||
Ok, had added a new attachment
When opened you will see Sophie's reply to my reply to her originl email. That original 30/06 at 16:36 crashes TB
Now what is weird is that I was able to open and reply 30/06 17:25 and she replied next morning. My reply does NOT appear in the top window in the thread - screenshot also provided.
The original email in this thread now crashes TB. I need to check when I updated TB - I have a feeling I also jumped from 90.0b1 to b3, or 90.b2 build 1 to 90.b3 build2, so may have skipped an interdiate build/beta ver
Is there a log that records the update paths and timestamps?
Comment 26•4 years ago
|
||
Comment 27•4 years ago
|
||
Comment 28•4 years ago
|
||
update to 91.0b1 - same email(s) crash TB
Assignee | ||
Comment 29•4 years ago
|
||
I can replicate this using the email in comment #20 (I've got it in an IMAP folder, and it'll crash whenever I go to view it).
Just before the crash I see some NS_ASSERTIONs triggering (which aren't fatal by default, but really should be :- ):
[Parent 374400, Main Thread] ###!!! ASSERTION: object shouldn't be closed: '!obj->closed_p', file /home/ben/tb/mozilla/comm/mailnews/mime/src/mimeobj.cpp:220
[Parent 374400, Main Thread] ###!!! ASSERTION: object shouldn't be closed: '!obj->closed_p', file /home/ben/tb/mozilla/comm/mailnews/mime/src/mimeobj.cpp:220
[Parent 374400, Main Thread] ###!!! ASSERTION: obj already parsed: '!obj->parsed_p', file /home/ben/tb/mozilla/comm/mailnews/mime/src/mimeobj.cpp:239
I think these are related to the crash, but not 100% certain yet. Digging into it now.
Reporter | ||
Updated•4 years ago
|
Reporter | ||
Comment 30•4 years ago
|
||
#14 crash for beta 91
Assignee | ||
Comment 31•4 years ago
|
||
OK, I think I understand what's happening.
It requires a large enough message (>800KB or so, I think) with a "text/calendar" part in it.
Short description:
The calendar part causes async badness which cause nested calls to nsStreamConverter::OnDataAvailable().
Long description:
The mime parsing (in C++) is driven by an async nsInputStreamPump which feeds data into the parser by calling nsStreamConverter::OnDataAvailable(). Under normal operation the chunk of data will be processed and then OnDataAvailable() will exit, ready to be called when the next chunk is ready to process.
However, with "text/calendar" parts (and likely other types), it calls nsISimpleMimeConverter.convertToHTML() to compose a nice blob of HTML to embed into the message display. For "text/calendar", the converter is implemented in javascript, in CalMimeConverter.jsm. That convertToHTML() implementation decodes the event details and uses them to fill out a template. It uses an XMLHttpRequest to fetch that template synchronously.
This is where it all goes pear-shaped.
The XMLHttpRequest blocks, and like all good javascript plumbing, allows the thread to go on processing other pending things while we wait for the request to complete. One of these other things is the input stream we were reading from in the first place, which promptly calls OnDataAvailable() again, even though we're still in the middle of processing the first OnDataAvailable() call.
Hilarity ensues.
Not quite sure yet what the proper fix is. A couple of ideas I'll investigate tomorrow:
- CalMimeConverter.jsm could preload the template file to avoid blocking and yielding to other events. But seems a bit crappy to impose such non-yielding conditions upon JS code, as there's no obvious indication which things yield behind the scenes and which don't (no await/async to make it obvious).
- it might be possible to suspend the inputpump at the beginning of the OnDataAvailable() handling, and resume it at exit, to prevent reentry. Seems pretty icky though.
I'm currently leaning toward 1 as the lesser evil.
Geoff: are there any conventions/expectations/precedents about behind-the-scenes yielding in javascript? Is it utterly unreasonable to say some JS functions are banned from doing non-async XMLHttpRequests and the like?
Assignee | ||
Comment 32•4 years ago
|
||
Oh, looks like the only other built-in nsISimpleMimeConverter implementation is the one in addressbook, for displaying "text/vcard" (VCardUtils.jsm). But that one uses a template hard-coded into the code itself, so I don't think it'll suffer from this issue.
Comment 33•4 years ago
|
||
I'd lean towards option 1 as well. It'd be easy to see whether it works given you can reproduce the bug, and the template itself is small enough to hard-wire in. It'd be nice to have some protection from the input side of things though.
Geoff: are there any conventions/expectations/precedents about behind-the-scenes yielding in javascript?
The blocking XMLHttpRequest really allows other things to continue while it does I/O? That's somewhat surprising. I think the answer is no, and most people would assume a sync request would be okay in these circumstances anyway.
Is it utterly unreasonable to say some JS functions are banned from doing non-async XMLHttpRequests and the like?
Seems okay to me, although all you can really do to enforce it is make a big scary warning comment in the appropriate places.
Assignee | ||
Comment 34•4 years ago
|
||
(In reply to Geoff Lankow (:darktrojan) from comment #33)
The blocking XMLHttpRequest really allows other things to continue while it does I/O? That's somewhat surprising. I think the answer is no, and most people would assume a sync request would be okay in these circumstances anyway.
It's down in the deep depths of the runtime system - threads are event targets which can have a queue of events waiting to run when they get the chance. And I think a sync XMLHttpRequest yields to some other pending event.
I'm not worried about the Javascript runtime - I'm pretty confident that it's all sensible. I'm not totally sure what happens there - I'd guess that you would expect to see other JS running while the XMLHttpRequest blocked... eg a previous promise waiting on some other IO, or a timer callback? Or maybe those would have to wait until the request completed? Whatever the case is, I'm confident that it's been considered carefully and that it all Just Works (tm). JS runtime has had a lot of attention.
It's more the interaction with the C++ side of things. Calling into JS in this case, the XMLHttpRequest means we yield to some other queued event, even though we're still in the middle of processing one. From the C++ point of view, we're totally not expecting some other event to be run. Its like accidentally turning a simple function call into a coroutine.
I wonder what other JS operations can do the same thing? Hmm. Maybe this is why the [scriptable] XPIDL markers are so rigorous?
Assignee | ||
Comment 35•4 years ago
|
||
This removes the use of a blocking (synchronous) XMLHttpRequest, which was
causing recursive calls to nsStreamConverter::OnDataAvailable() and general
mayhem. The stream feeding the mime parsing takes advantage of the blocking
to send us more data even though we're still in the middle of processing the
previous chunk of data!
Assignee | ||
Comment 36•4 years ago
|
||
Looking into the implementation of sync XMLHttpRequest, it calls SpinEventLoopUntil() while blocking, which means other thread events will be serviced. Hence the inputstream being able to deliver the next chunk of data before we're ready for it...
There are a bunch of other places which also call SpinEventLoopUntil(), but I don't see any which are toooooo immediately alarming.
(looks like flushing out prefs to disk can do it, for example, but seems unlikely to be an issue for us).
I think we just have to chalk this up to: "there are some obscure things you can do which will screw this up, so don't do those things!" :-)
In the case of XMLHttpRequest, the sync operation is explicitly discouraged anyway. Good advice.
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 37•4 years ago
|
||
Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/5518f9fe5364
Inline invitation-template.xhtml to prevent C++ re-entrancy in CalMimeConverter. r=darktrojan
Comment 38•4 years ago
|
||
Assignee | ||
Comment 39•4 years ago
|
||
Doh! Sorry about that... Worked fine on my local tests, but stupidly forgot to run a try build.
Comment 40•4 years ago
|
||
Comment on attachment 9232263 [details]
Bug 1679299 - Inline invitation-template.xhtml to prevent C++ re-entrancy in CalMimeConverter. r=darktrojan
[Approval Request Comment]
Top crash fix. Will also need to uplift bug 1722257.
Reporter | ||
Comment 41•4 years ago
|
||
Comment on attachment 9232263 [details]
Bug 1679299 - Inline invitation-template.xhtml to prevent C++ re-entrancy in CalMimeConverter. r=darktrojan
[Triage Comment]
Approved for beta
We will want to be on the lookout for regressions
Comment 42•4 years ago
|
||
bugherder uplift |
Reporter | ||
Comment 43•3 years ago
|
||
Antony, is the crash gone for you on beta?
willdanneriv, is the crash gone for you?
Comment 44•3 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #43)
Antony, is the crash gone for you on beta?
willdanneriv, is the crash gone for you?
Yes - looks fixed - I checked with the same email that I reported that crashed TB, this now displays properly
Cool Bananas
Reporter | ||
Comment 45•3 years ago
|
||
Thanks for the update.
Also gone according to crash-stats
Description
•