Closed Bug 791003 Opened 7 years ago Closed 7 years ago

<meta charset="utf-7"> triggers NS_NOTREACHED in nsHtml5StreamParser


(Core :: HTML: Parser, defect)

Not set





(Reporter: zwol, Assigned: zwol)




(1 file)

I am dusting off bug 663291.  In the process I noticed that one of the tests for bug 672453 was failing hard (this doesn't show up on the builders because those tests are disabled due to bug 739354), and after a little digging, that looks to be a genuine code bug.

nsHtml5StreamParser::PreferredForInternalEncodingDecl has an explicit blacklist of encodings that are not "rough supersets" of ASCII; the failing test is looking for the console error that happens when that blacklist triggers.  However, the blacklist applies _after_ mapping aliases to preferred names, and uconv _also_ has a blacklist of encodings (in that are known to permit XSS smuggling.  When we hit _that_ blacklist, nsCharsetAlias::GetPreferred returns an error code even though nsCharsetAlias::Equals didn't.  PreferredForInternalEncodingDecl assumes that that's a can't-happen situation.  The obvious fix is to treat that as another case where we should fire the "not a rough superset" console error.

It's possible that the explicit blacklist in PreferredForInternalEncodingDecl should go away at this point.  The entries in it that are not covered by the "incorrect UTF-16 declaration" case at the very top of the function, nor by the XSS-smuggling blacklist in, are


Patch to follow.
Blocks: 739354
Attached patch proposed fixSplinter Review
Attachment #660869 - Flags: review?(hsivonen)
Blocks: 663291
Comment on attachment 660869 [details] [diff] [review]
proposed fix

r+ given the described nsCharsetAlias behavior. (Ideally, nsCharsetAlias would behave as if it had no idea about the encoding for which decoders are unavailable. It's bad that we have encodings that sort of exist but sort of don't.)

Attachment #660869 - Flags: review?(hsivonen) → review+

I could file a follow-up bug for figuring out if we still need the hardcoded blacklist anymore.  I also think there should be a follow-up for the bad grammar in the error message ("declare a character encoding the does not encode" -- "the" should be "that").

I'd rather you filed a follow-up for the behavior of nsCharsetAlias if you think that's appropriate, because I don't understand nsCharsetAlias very well.
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in before you can comment on or make changes to this bug.