Closed Bug 1191841 Opened 9 years ago Closed 8 years ago

Calling SetForceCharacterSet() not using a Gecko-canonical name leads to crash (was: Thunderbird: Crash forwarding or sending a message, or editing draft ...)

Categories

(Core :: DOM: Navigation, defect)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
thunderbird_esr38 --- wontfix
thunderbird_esr45 51+ fixed
firefox51 --- fixed

People

(Reporter: plemmon, Assigned: hsivonen)

References

Details

(Keywords: crash, topcrash-thunderbird)

Crash Data

Attachments

(3 files, 2 obsolete files)

Attached file tbbug1.txt
User Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Firefox/38.0 Build ID: 20150525141253 Steps to reproduce: Intermittent crashes and only with you tube or other player clips in them-It happens when I open the message to read-I am Not using NT but XP Actual results: When I go to open a draft file that I have saved but not sent yet with an enclosed you tube video clip in it, some days and sometimes it crashes, other days and times it won't crash. If it crashes it will do it to all my other video messages as well, but if it doesn't then it won't for all others as well. All other messages are fine that don't have video clips in them. Expected results: The message should have opened without crashing as it does sometimes.
OS: Unspecified → Windows XP
Hardware: Unspecified → x86
Keywords: crash
nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, unsigned int*) bp-8d50bf0c-7ae3-4508-b69c-af0672150806 0 xul.dll nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, unsigned int*) parser/html/nsHtml5StreamParser.cpp 1 xul.dll nsHtml5StreamParser::SetupDecodingAndWriteSniffingBufferAndCurrentSegment(unsigned char const*, unsigned int, unsigned int*) parser/html/nsHtml5StreamParser.cpp 2 xul.dll nsHtml5StreamParser::SniffStreamBytes(unsigned char const*, unsigned int, unsigned int*) parser/html/nsHtml5StreamParser.cpp 3 xul.dll nsHtml5StreamParser::DoDataAvailable(unsigned char const*, unsigned int) parser/html/nsHtml5StreamParser.cpp 4 xul.dll nsStringInputStream::ReadSegments(nsresult (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) xpcom/io/nsStringStream.cpp 5 xul.dll nsHtml5StreamParser::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned __int64, unsigned int) parser/html/nsHtml5StreamParser.cpp
Severity: normal → critical
Crash Signature: [@ nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, unsigned int*)]
Component: Untriaged → Message Compose Window
Summary: bp-8d50bf0c-7ae3-4508-b69c-af0672150806 → Crash editting draft that contains video clip
Crash Signature: [@ nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, unsigned int*)] → [@ nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, unsigned int*)] [@ nsHtml5StreamParser::WriteStreamBytes]
This is a Thunderbird top crasher, so we should confirm. The crash is pretty obvious, uninitialized mUnicodeConverter. In mailnews code we would routinely check such things, they tend to be less cautious in core mozilla code. We could add a null check to our branch and stop the crash if we wanted, or perhaps core would also accept such a check. As to the underlying cause, without really understanding this code, the missing pointer is initialized in OnStartRequest. We've recently identified and fixed a missing OnStartRequest in compose that was related to the attachment hang, though that has not landed in esr45. Perhaps this crash will go away as well?
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to Kent James (:rkent) from comment #2) > This is a Thunderbird top crasher, so we should confirm. more specifically, #10 crash for 45.1.1. This would seem to be a regression, because there is a strong uptick in 45.0 betas. (TB crashes exist prior to 45.0, but at a low rate.) Unable to determine when the uptick started because there are no crashes reported at any noticable rate in trunk or aurora. I'd say Firefox got a fix, or backed out a fix, in the January time frame that Thunderbird did not, based this graph https://crash-stats.mozilla.com/signature/?date=%3E2016-01-01&signature=nsHtml5StreamParser%3A%3AWriteStreamBytes#graphs Crash rate drops after FF44.0b4, but I was unable to find a bug report or checkin that seems likely https://hg.mozilla.org/mozilla-central/log/tip/parser/html/nsHtml5StreamParser.cpp https://hg.mozilla.org/releases/mozilla-beta/log/tip/parser/html/nsHtml5StreamParser.cpp For completeness, may be worth mentioning we picked up bug 653342 in TB45.
(In reply to Kent James (:rkent) from comment #2) > This is a Thunderbird top crasher, so we should confirm. > > The crash is pretty obvious, uninitialized mUnicodeConverter. In mailnews > code we would routinely check such things, they tend to be less cautious in > core mozilla code. We could add a null check to our branch and stop the > crash if we wanted, or perhaps core would also accept such a check. > > As to the underlying cause, without really understanding this code, the > missing pointer is initialized in OnStartRequest. We've recently identified > and fixed a missing OnStartRequest in compose that was related to the > attachment hang, though that has not landed in esr45. Perhaps this crash > will go away as well? Hopefully someone can take the forensics of comment 3 further
OS: Windows XP → All
Summary: Crash editting draft that contains video clip → Crash forwarding or sending a message, or editting draft that contains video clip
Whiteboard: [regression:TB45?]
So this bug is caused by attaching and receiving (and trying to view a smallish you tube clip)? Maybe I can try to test it in the next few weeks although downloading a largish youtube video clip does not sound nice. I wonder if someone can point to a small youtube video that I can download.
It appears to be happening if it goes over too many e-mails in the draft folder. If I move that you tube e-mail into a different folder that doesn't have as many e-mails in it, I can then open it up without it crashing.
I noticed that you can download a youtube video clip in mp4 webm gpp or some such encoding format. (For example, https://www.youtube.com/watch?v=vDiETeBC9W8 ) Given that I see sHtml5StreamParser in comment 1 in the backtrace of the involved DLLs, I suspect the problem happens with webm data (?). Can someone confirm if the problem happened with .webm data format video clips only or it happened with any of the youtube clips? This problem seems so tricky to re-produce if it can be reproduced in C-C TB at all. (May have been fixed already in C-C. Oh well.) TIA
Summary: Crash forwarding or sending a message, or editting draft that contains video clip → Crash forwarding or sending a message, or editting draft that contains video clip attachment
(In reply to Paul Lemmon from comment #6) > It appears to be happening if it goes over too many e-mails in the draft > folder. If I move that you tube e-mail into a different folder that doesn't > have as many e-mails in it, I can then open it up without it crashing. ring any bells? see also comment 2
Flags: needinfo?(acelists)
Paul, does it also crash for you if you use a nightly build https://archive.mozilla.org/pub/thunderbird/nightly/latest-comm-central/thunderbird-50.0a1.en-US.win32.installer.exe ? (backup your THunderbird profile before testing this)
Flags: needinfo?(plemmon)
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #8) > ring any bells? Yes, we should disable videos also in releases :) Anyway, how many emails is "many"? And how big are those videos? Are they playing while viewing the messages?
Flags: needinfo?(acelists)
(In reply to Kent James (:rkent) from comment #2) > We could add a null check to our branch and stop the > crash if we wanted, or perhaps core would also accept such a check. Well, why don't we speak to "core". We're on good terms with Henri and Masatoshi. Gentlemen, apparently we see a crash here: https://hg.mozilla.org/releases/mozilla-esr38/annotate/b32d199c886c/parser/html/nsHtml5StreamParser.cpp#l818 Would you accept a null-check there like you have here a few lines above? https://hg.mozilla.org/releases/mozilla-esr38/annotate/b32d199c886c/parser/html/nsHtml5StreamParser.cpp#l794
Flags: needinfo?(hsivonen)
Flags: needinfo?(VYV03354)
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #1) > nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, unsigned int, > unsigned int*) > bp-8d50bf0c-7ae3-4508-b69c-af0672150806 > 0 xul.dll nsHtml5StreamParser::WriteStreamBytes(unsigned char const*, > unsigned int, unsigned int*) parser/html/nsHtml5StreamParser.cpp > 1 xul.dll > nsHtml5StreamParser:: > SetupDecodingAndWriteSniffingBufferAndCurrentSegment(unsigned char const*, > unsigned int, unsigned int*) parser/html/nsHtml5StreamParser.cpp SetupDecodingAndWriteSniffingBufferAndCurrentSegment() does this: NS_ASSERTION(IsParserThread(), "Wrong thread!"); nsresult rv = NS_OK; mUnicodeDecoder = EncodingUtils::DecoderForEncoding(mCharset); if (mSniffingBuffer) { uint32_t writeCount; rv = WriteStreamBytes(mSniffingBuffer.get(), mSniffingLength, &writeCount); NS_ENSURE_SUCCESS(rv, rv); mSniffingBuffer = nullptr; } mMetaScanner = nullptr; if (aFromSegment) { rv = WriteStreamBytes(aFromSegment, aCount, aWriteCount); } return rv; As you can see, there's an assignment to mUnicodeDecoder that always happens before calling WriteStreamBytes(). How might DecoderForEncoding() return null? Its documentation says: /** * Instantiates a decoder for an encoding. The input must be a * Gecko-canonical encoding name * @param aEncoding a Gecko-canonical encoding name * @return a decoder */ Let's look at the implementation: nsAutoCString contractId(NS_UNICODEDECODER_CONTRACTID_BASE); contractId.Append(aEncoding); nsCOMPtr<nsIUnicodeDecoder> decoder = do_CreateInstance(contractId.get()); MOZ_ASSERT(decoder, "Tried to create decoder for unknown encoding."); return decoder.forget(); So if used contrary to the documentation, the method fatally asserts. But the assertion is left out in release builds, so the method returns null if used contrary to documentation. (Or, alternatively, if used before uconv has been initialized or after it has been shut down. Bug 1261841 will remove that caveat by deCOMtaminating encoding converters.) Since this crash isn't seen in Firefox, chances are that mailnews manages to make mCharset contain a string that's not a Gecko-canonical encoding name. It looks like the documentation of SetDocumentCharset fails to say that it these days takes a Gecko-canonical encoding name instead of a label. Sorry. The only caller in m-c passes a Gecko-canonical name. Instead of adding a null-check in the parser, I suggest you audit the code path that ends up calling SetDocumentCharset() on the HTML parser to see where a label as opposed to a Gecko-canonical name is passed and fix that. Using the C++ type system to prevent this problem is bug 919924.
Flags: needinfo?(hsivonen)
Flags: needinfo?(VYV03354)
Henri, thanks for the analysis and please excuse the layman questions that follow. (In reply to Henri Sivonen (:hsivonen) from comment #12) > Since this crash isn't seen in Firefox, chances are that mailnews manages to > make mCharset contain a string that's not a Gecko-canonical encoding name. > > It looks like the documentation of SetDocumentCharset fails to say that it > these days takes a Gecko-canonical encoding name instead of a label. Sorry. I'm a little confused by SetDocumentCharset(). Looking at DXR, I can see many: https://dxr.mozilla.org/comm-central/search?q=%3A%3ASetDocumentCharset&redirect=false Are we talking about nsHtml5Parser::SetDocumentCharset() or another one? > The only caller in m-c passes a Gecko-canonical name. Where is that only caller? Without knowing which SetDocumentCharset() we're talking about I can't locate it. Next questions: What are Gecko-canonical names? Is there a list? Can I test that I am using a valid parameter. What's the difference between a name and a label? To me, it's "UTF-8", "ISO-8859-1", "ISO-2022-JP", "GBK", etc.). > Instead of adding a null-check in the parser, I suggest you audit the code > path that ends up calling SetDocumentCharset() on the HTML parser to see > where a label as opposed to a Gecko-canonical name is passed and fix that. Sure, once we establish which function to look for or breakpoint on, see question above. (Without looking at it, I assume when rendering an e-mail, we pass on any charset information contained in the message. Does FF validate the charset information it finds on a webpage?) Now one thing I don't understand is this: We have a module that crashes at some point if at some earlier point some bad data was passed in. Sure, we can track down where the bad data is passed in, but shouldn't a crash be avoided under all circumstances?
Flags: needinfo?(hsivonen)
OK, I did some digging myself. Starting at mUnicodeDecoder = EncodingUtils::DecoderForEncoding(mCharset); here https://dxr.mozilla.org/mozilla-central/rev/97a52326b06a07930216ebefa5af333271578904/parser/html/nsHtml5StreamParser.cpp#301 we're talking about mCharset from here: https://dxr.mozilla.org/mozilla-central/rev/97a52326b06a07930216ebefa5af333271578904/parser/html/nsHtml5StreamParser.h#428 and SetDocumentCharset() (inline!) from here: https://dxr.mozilla.org/mozilla-central/rev/97a52326b06a07930216ebefa5af333271578904/parser/html/nsHtml5StreamParser.h#159 So nsHtml5StreamParser::SetDocumentCharset(). Breaking on this, I get this stack: nsHtml5StreamParser::SetDocumentCharset(const nsACString_internal & aCharset, int aSource) Line 160 C++ nsHtml5Parser::SetDocumentCharset(const nsACString_internal & aCharset, int aCharsetSource) Line 105 C++ nsHTMLDocument::StartDocumentLoad(const char * aCommand, nsIChannel * aChannel, nsILoadGroup * aLoadGroup, nsISupports * aContainer, nsIStreamListener * * aDocListener, bool aReset, nsIContentSink * aSink) Line 765 C++ nsContentDLF::CreateDocument(const char * aCommand, nsIChannel * aChannel, nsILoadGroup * aLoadGroup, nsIDocShell * aContainer, const nsID & aDocumentCID, nsIStreamListener * * aDocListener, nsIContentViewer * * aContentViewer) Line 375 C++ nsContentDLF::CreateInstance(const char * aCommand, nsIChannel * aChannel, nsILoadGroup * aLoadGroup, const nsACString_internal & aContentType, nsIDocShell * aContainer, nsISupports * aExtraInfo, nsIStreamListener * * aDocListener, nsIContentViewer * * aDocViewer) Line 183 C++ mozilla::mailnews::MailNewsDLF::CreateInstance(const char * aCommand, nsIChannel * aChannel, nsILoadGroup * aLoadGroup, const nsACString_internal & aContentType, nsIDocShell * aContainer, nsISupports * aExtraInfo, nsIStreamListener * * aDocListener, nsIContentViewer * * aDocViewer) Line 69 C++ nsDocShell::NewContentViewerObj(const nsACString_internal & aContentType, nsIRequest * aRequest, nsILoadGroup * aLoadGroup, nsIStreamListener * * aContentHandler, nsIContentViewer * * aViewer) Line 9225 C++ nsDocShell::CreateContentViewer(const nsACString_internal & aContentType, nsIRequest * aRequest, nsIStreamListener * * aContentHandler) Line 9026 C++ nsDSURIContentListener::DoContent(const nsACString_internal & aContentType, bool aIsContentPreferred, nsIRequest * aRequest, nsIStreamListener * * aContentHandler, bool * aAbortProcess) Line 129 C++ nsDocumentOpenInfo::TryContentListener(nsIURIContentListener * aListener, nsIChannel * aChannel) Line 720 C++ nsDocumentOpenInfo::DispatchContent(nsIRequest * request, nsISupports * aCtxt) Line 398 C++ nsDocumentOpenInfo::OnStartRequest(nsIRequest * request, nsISupports * aCtxt) Line 259 C++ nsMsgProtocol::OnStartRequest(nsIRequest * request, nsISupports * ctxt) Line 314 C++ nsMailboxProtocol::OnStartRequest(nsIRequest * request, nsISupports * ctxt) Line 204 C++ So for all I can see, no charset is passed into M-C from anywhere in Mailnews, the SetDocumentCharset() is called from within the HTML5 parser. The only call in the stack comes from here: MailNewsDLF::CreateInstance(): https://dxr.mozilla.org/comm-central/rev/d1960937860dbe784e21b43ce136253fc70e77e2/mailnews/base/src/MailNewsDLF.cpp#69 I noted that nsHtml5StreamParser::SetDocumentCharset() doesn't do any checking on the value passed it, it blindly assigns it to mCharset. So what's next?
(In reply to Jorg K (GMT+2, PTO during summer) from comment #13) > Henri, thanks for the analysis and please excuse the layman questions that > follow. > > (In reply to Henri Sivonen (:hsivonen) from comment #12) > > Since this crash isn't seen in Firefox, chances are that mailnews manages to > > make mCharset contain a string that's not a Gecko-canonical encoding name. > > > > It looks like the documentation of SetDocumentCharset fails to say that it > > these days takes a Gecko-canonical encoding name instead of a label. Sorry. > I'm a little confused by SetDocumentCharset(). Looking at DXR, I can see > many: > https://dxr.mozilla.org/comm-central/ > search?q=%3A%3ASetDocumentCharset&redirect=false > Are we talking about nsHtml5Parser::SetDocumentCharset() or another one? We are talking about nsHtml5Parser::SetDocumentCharset() which may appears as nsIParser::SetDocumentCharset() to the caller. > > The only caller in m-c passes a Gecko-canonical name. > Where is that only caller? nsHTMLDocument::StartDocumentLoad(). Well, at least in the form it runs in m-c. There is a mailnews-only bit there, which is likely the problem here. > What are Gecko-canonical names? The kind of name that you can append to NS_UNICODEDECODER_CONTRACTID_BASE so that the result is a valid contract id. For encodings that are in the Encoding Standard, this is the same as the concept _name_ in the Encoding Standard, except for GBK, whose Encoding Standard name is GBK but whose Gecko-canonical name is, for the time being, gbk (bug 1227006). > Is there a list? https://dxr.mozilla.org/mozilla-central/source/intl/uconv/nsUConvModule.cpp#125 Plus the ones that c-c adds. > Can I test > that I am using a valid parameter. For now: nsAutoCString contractId(NS_UNICODEDECODER_CONTRACTID_BASE); contractId.Append(aEncoding); nsCOMPtr<nsIUnicodeDecoder> decoder = do_CreateInstance(contractId.get()); If decoder is not null, aEncoding was a Gecko-canonical name. Bug 1261841 will change this. > What's the difference between a name and > a label? To me, it's "UTF-8", "ISO-8859-1", "ISO-2022-JP", "GBK", etc.). Each encoding has exactly one name which is case-sensitive and doesn't allow spaces. A label is an ASCII-case-insensitive potentially whitespace-padded string used in a protocol to designate an encoding. In a better-designed code base, the places where we use names to designate encodings would use either enum values or memory addresses to singleton objects. However, labels have to be strings, because they are text extracted from protocol text. "UTF-8" is a Gecko-canonical (and Encoding Standard) name. "utf-8" or " UTF-8 " are not. "UTF-8", "utf-8", "UtF-8", " UTF-8 " and "unicode-1-1-utf-8" are all labels for UTF-8. The name of an encoding is also one of its labels, except in the case of the replacement encoding whose name is not one of its labels. > (Without looking at it, I assume when rendering an e-mail, we pass on any > charset information contained in the message. Does FF validate the charset > information it finds on a webpage?) Yes. The thing that's in MIME or HTTP protocol text is a label. To go from a label to a name in the email context, use nsICharsetConverterManager::GetCharsetAlias(). (In the Web context, the method to use is mozilla::dom::EncodingUtils::FindEncodingForLabel().) (Bug 1261841 will make Gecko no longer accept email-only names, though, so you'll need to make sure UTF-7 is handled by other means than by passing the name to m-c code. I'm pretty sure it already is, though.) > Sure, we > can track down where the bad data is passed in, but shouldn't a crash be > avoided under all circumstances? No, because the API where the bad data comes in is an internal API (but c-c calls internal APIs), but we could make the crash a release assertion to highlight that bad data has been passed in.
Flags: needinfo?(hsivonen)
Henri, thank you for your continued support and patient explanations. If I understand it correctly, your argument is that C-C calls nsHtml5Parser::SetDocumentCharset() incorrectly. Well, there is no such call: https://dxr.mozilla.org/comm-central/search?q=SetDocumentCharset&redirect=false C-C has a function nsMsgCompose::SetDocumentCharset() which gets called within C-C, and nsMessenger::SetDocumentCharset() which is called from JS via messenger.setDocumentCharset(...); I don't see any calls to nsHtml5Parser::SetDocumentCharset(). Looking at the stack trace in comment #14 we see: nsHtml5StreamParser::SetDocumentCharset() Line 160 C++ nsHtml5Parser::SetDocumentCharset() Line 105 C++ nsHTMLDocument::StartDocumentLoad() Line 765 C++ nsContentDLF::CreateDocument() Line 375 C++ nsContentDLF::CreateInstance() Line 183 C++ mozilla::mailnews::MailNewsDLF::CreateInstance() Line 69 C++ <=== C-C calls M-C As per your comment #15, nsHTMLDocument::StartDocumentLoad() calls nsHtml5Parser::SetDocumentCharset(). You said in comment #12: === I suggest you audit the code path that ends up calling SetDocumentCharset() on the HTML parser to see where a label as opposed to a Gecko-canonical name is passed and fix that. === I did and I conclude that C-C cannot be the culprit here. What am I missing? In comment #15 you pointed to the bits labelled "mailnews" in M-C: In nsHTMLDocument::StartDocumentLoad(): https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/dom/html/nsHTMLDocument.cpp#708 In nsHTMLDocument::TryUserForcedCharset(): https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/dom/html/nsHTMLDocument.cpp#323 Do you know what they are about? Looking at blame: https://hg.mozilla.org/mozilla-central/annotate/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/dom/html/nsHTMLDocument.cpp#l706 doesn't give me any clues. How do we proceed?
Flags: needinfo?(hsivonen)
(In reply to Jorg K (GMT+2, PTO during summer) from comment #16) > https://dxr.mozilla.org/mozilla-central/rev/ > f97a056ae6235de7855fd8aaa04fb1c8d183bd06/dom/html/nsHTMLDocument.cpp#708 That one looks OK. c-c setters all use "UTF-8": https://dxr.mozilla.org/comm-central/search?q=%2Bcallers%3A%22nsMessenger%3A%3ASetDisplayCharset%28const+nsACString_internal+%26%29%22 > In nsHTMLDocument::TryUserForcedCharset(): > https://dxr.mozilla.org/mozilla-central/rev/ > f97a056ae6235de7855fd8aaa04fb1c8d183bd06/dom/html/nsHTMLDocument.cpp#323 The only caller in c-c passes "UTF-8": https://dxr.mozilla.org/comm-central/source/mailnews/base/src/nsMsgPrintEngine.cpp#518 > How do we proceed? I'd double-check that the above callers are the only c-c setters of "hint" or "force" charset and not just quirks of DXR hiding stuff. I don't have a good suggestion for the step after that at this time. I will need to think about it.
Flags: needinfo?(hsivonen)
I'm not sure why you were looking at SetDisplayCharset() now. It calls SetHintCharacterSet(). Also not sure why you were looking at SetForceCharacterSet() (in nsMsgPrintEngine.cpp). For the latter please note that is it called here when the compose window starts, as the bug summary says (forwarding or editing draft). https://dxr.mozilla.org/comm-central/rev/bde9f7838d8c5003a27b961104c32b00c9109da5/editor/ui/composer/content/editingOverlay.js#193 https://dxr.mozilla.org/comm-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/mozilla/docshell/base/nsIContentViewer.idl#213 They are poorly documented, should they receive canonical names or labels? Could that be the cause?
Flags: needinfo?(hsivonen)
If I see this correctly, C-C passes in any old rubbish and that gets picked up here without any checking: https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/layout/base/nsDocumentViewer.cpp#3169 and then used: https://dxr.mozilla.org/mozilla-central/rev/f97a056ae6235de7855fd8aaa04fb1c8d183bd06/dom/html/nsHTMLDocument.cpp#324 Is that it? SetForceCharacterSet() expects a "good" canonical name but gets anything and doesn't check it? I suggest we beef up SetForceCharacterSet() to get the canonical name from what is passed in, and if that fails, return an error or throw or make it known otherwise that something is wrong.
OK, I found it. I artificially pass in "WiNdOwS-1252" here: https://dxr.mozilla.org/comm-central/rev/bde9f7838d8c5003a27b961104c32b00c9109da5/mailnews/compose/src/nsMsgCompose.cpp#1618 Result in debug mode: MOZ_ASSERT(decoder, "Tried to create decoder for unknown encoding."); In release mode it would have crashed later. So do I fix this caller to turn a label into a canonic name like you indicated using nsICharsetConverterManager::GetCharsetAlias() or will you fix the callee?
Attached patch Propose fix (v1). (obsolete) — Splinter Review
I'm not sure you're a Mailnews reviewer, but I guess you know more about this stuff then most of the Mailnews peers ;-)
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Flags: needinfo?(plemmon)
Flags: needinfo?(hsivonen)
Attachment #8783337 - Flags: review?(hsivonen)
Attached patch Proposed fix (v1b). (obsolete) — Splinter Review
Added comment and changed reviewer to Kent.
Attachment #8783337 - Attachment is obsolete: true
Attachment #8783337 - Flags: review?(hsivonen)
Attachment #8783355 - Flags: review?(rkent)
Attachment #8783355 - Flags: feedback?(hsivonen)
Comment on attachment 8783355 [details] [diff] [review] Proposed fix (v1b). The m-c methods should really reject invalid charsets in the entrypoint functions (like SetForceCharacterSet). Or is it intentional that *Force* means let the caller set anything even if invalid with unexpected results?
Well, looks like SetForceCharacterSet() only accepts Gecko-canonical names. This is undocumented and if you don't pass one of these, you get a crash later on. Not ideal IMHO, but we should really pass the right string, since for example "utf-8" or "Windows-1252" is simply not understood and won't work.
Yes, we should of course fix our passed value. But a proper interface should reject invalid values instead of crashing on them.
BTW, no regression here. Maybe M-C behaviour changed at some point, but the C-C part was always wrong.
We (m-c) can't prevent add-ons from passing an invalid value until we drop support for XUL add-ons. We should either 1. add a parameter check to make sure non-canonical values are not sneaking in, or 2. add an assertion, make nsIContentViewer::forceCharset [noscript], and declare that this is an internal method and it is caller's responsibility to validate the parameter. But this is out of scope of this bug.
(In reply to Jorg K (GMT+2, PTO during summer) from comment #20) > Hmm, this > https://dxr.mozilla.org/comm-central/rev/ > bde9f7838d8c5003a27b961104c32b00c9109da5/editor/ui/composer/content/ > editingOverlay.js#193 > doesn't get called in TB (SM only), but this one does: > > https://dxr.mozilla.org/comm-central/rev/ > bde9f7838d8c5003a27b961104c32b00c9109da5/mailnews/compose/src/nsMsgCompose. > cpp#1618 > > I'm not sure under what condition this is run. My failure to locate these yesterday makes me distrustful of DXR search (or my ability to use DXR search). :-( (In reply to Masatoshi Kimura [:emk] from comment #28) > We (m-c) can't prevent add-ons from passing an invalid value until we drop > support for XUL add-ons. Good point. Indeed, since this stuff is scriptable and XPCOM add-ons aren't gone yet, we should validate the input in m-c. The effect this has on c-c is: 1) The methods can now return NS_ERROR_INVALID_ARG if the argument is none of a) the empty string, b) a label, or c) a Gecko-canonical name. 2) Since the above is checked in a Web-centric way, UTF-7 is treated as NS_ERROR_INVALID_ARG. (In reply to Jorg K (GMT+2, PTO during summer) from comment #21) > So do I fix this caller to turn a label into a canonic name like you > indicated using > nsICharsetConverterManager::GetCharsetAlias() > or will you fix the callee? I'm fixing the callee. But note that now passing "UTF-7" no longer works. (Not sure if it really worked before.)
Attachment #8783581 - Flags: review?(VYV03354)
Comment on attachment 8783355 [details] [diff] [review] Proposed fix (v1b). I suggest working backwards to wherever msgCharSet is extracted from MIME and mapping whatever is extracted from MIME to a canonical name as early as possible.
Attachment #8783355 - Flags: feedback?(hsivonen) → feedback-
(In reply to Henri Sivonen (:hsivonen) from comment #30) > I suggest working backwards to wherever msgCharSet is extracted from MIME > and mapping whatever is extracted from MIME to a canonical name as early as > possible. So that would be about here: https://dxr.mozilla.org/comm-central/rev/02115be839cfcd0ba3b44f7fe480974496f0a86d/mailnews/compose/src/nsMsgCompose.cpp#1606 a few lines before. There are a few more places that need inspection then: https://dxr.mozilla.org/comm-central/search?q=m_compFields-%3EGetCharacterSet&redirect=false
Comment on attachment 8783355 [details] [diff] [review] Proposed fix (v1b). I'll raise a new bug for the TB work. With a patch for M-C attached, I'll move this to Core::Layout.
Attachment #8783355 - Attachment is obsolete: true
Attachment #8783355 - Flags: review?(rkent)
Component: Message Compose Window → Layout
Product: Thunderbird → Core
Version: 38 Branch → Trunk
Assignee: jorgk → hsivonen
Summary: Crash forwarding or sending a message, or editting draft that contains video clip attachment → Calling SetForceCharacterSet() not using a Geck-canonical name leads to crash (was: Thunderbird: Crash forwarding or sending a message, or editing draft ...)
Whiteboard: [regression:TB45?]
Raised bug 1297118 for the TB work.
Attachment #8783581 - Flags: review?(VYV03354) → review+
Component: Layout → Document Navigation
Summary: Calling SetForceCharacterSet() not using a Geck-canonical name leads to crash (was: Thunderbird: Crash forwarding or sending a message, or editing draft ...) → Calling SetForceCharacterSet() not using a Gecko-canonical name leads to crash (was: Thunderbird: Crash forwarding or sending a message, or editing draft ...)
https://hg.mozilla.org/integration/mozilla-inbound/rev/e07157e29deedd1b2bda60c62f307503c558844f Bug 1191841 - Perform label resolution in scriptable charset setters of nsIContentViewer. r=emk.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Depends on: 1297761
Thunderbird 51 beta will be the first beta with this, so let's wait until TB 45.6.0 to consider uplifting this to our version branch.
(In reply to Kent James (:rkent) from comment #36) > Thunderbird 51 beta will be the first beta with this, so let's wait until TB > 45.6.0 to consider uplifting this to our version branch. For the record, we didn't get TB 51 beta out yet so this didn't get into 45.6.0. But hopefully 45.7.0!
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: