Closed Bug 504831 Opened 15 years ago Closed 7 years ago

UTF-16 decoder shouldn't guess endianness when there's no BOM

Categories

(Core :: Internationalization, defect)

Type: defect
Priority: Not set
Severity: normal

RESOLVED FIXED
mozilla56

People

(Reporter: dbaron, Assigned: hsivonen)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fixed by encoding_rs])

According to http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-July/021102.html , changing the UTF-16 decoder so that it doesn't guess endianness when no BOM is present, and instead defaults to little endian in that case, would improve compatibility with other browsers. The code in question is in the last function in a file incorrectly named nsUCS2BEToUnicode.cpp .
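For illustration only, here is a minimal Python sketch of the behavior proposed above: honor a BOM if one is present, otherwise decode as little endian without inspecting the content. This is not the Gecko code; the function name decode_utf16_no_guess is hypothetical, and replacing malformed sequences rather than raising is an assumption.

```python
import codecs


def decode_utf16_no_guess(data: bytes) -> str:
    """Hypothetical helper: decode UTF-16 without guessing endianness.

    A leading BOM selects the byte order and is stripped. With no BOM,
    the input is treated as little endian (the behavior proposed in
    this bug); no heuristics are applied to the content.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return data[2:].decode("utf-16-le", errors="replace")
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return data[2:].decode("utf-16-be", errors="replace")
    # No BOM: default to little endian instead of sniffing.
    return data.decode("utf-16-le", errors="replace")
```

With this sketch, big-endian input that lacks a BOM decodes to wrong characters, which is the trade-off raised in the next comment.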
Why would we sacrifice conformance to the Unicode Standard, and a SHOULD in RFC 2781, for the sake of compatibility with other browsers? Especially since AFAICT in any case where there is a practical difference the other browsers will be displaying garbage and we will be displaying the intended content.
Maybe bring that up on the whatwg list?
I do not want to use utf-16be/utf-16le as labels for utf-16 with BOM. It's OK to be "misinterpreted for compatibility" as unicode/unicodeFFFE (if they are registered).
Blocks: encoding
I was just bitten by this bug. Returned data was garbled and cut short, but worked in other browsers. Inspecting the HTTP request with tools like Fiddler also showed it encoded correctly. It wasn't until I viewed it in JS that I saw the garbled result. Is there a good way to debug this? I'll know going forward what is happening, but to anyone else who encounters it for the first time, good luck.
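One rough way to check a payload for this kind of problem, sketched in Python and run outside the browser on the raw response bytes: report whether a BOM is present and show the body decoded under both byte orders, so it is clear which one the sender intended. The helper name describe_utf16_payload is hypothetical, and this only inspects bytes; it does not reproduce any browser's decoder.

```python
def describe_utf16_payload(data: bytes) -> dict:
    """Hypothetical debugging aid for UTF-16 response bodies.

    Reports which BOM (if any) starts the payload and decodes the
    remaining bytes as both little endian and big endian, so the two
    readings can be compared side by side.
    """
    if data[:2] == b"\xff\xfe":
        bom = "LE"
    elif data[:2] == b"\xfe\xff":
        bom = "BE"
    else:
        bom = None
    body = data[2:] if bom else data
    return {
        "bom": bom,
        "as_le": body.decode("utf-16-le", errors="replace"),
        "as_be": body.decode("utf-16-be", errors="replace"),
    }
```

If "bom" is None and only one of the two decodings is readable, the sender is emitting UTF-16 without a BOM, and a decoder that picks the other byte order will show garbage.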
It seems this is still present: http://mxr.mozilla.org/mozilla-central/source/intl/uconv/ucvlatin/nsUTF16ToUnicode.cpp#319 . I don't think we plan on standardizing this, so we should probably just remove it.
Depends on: encoding_rs
Fixed by bug 1261841.
Assignee: smontagu → hsivonen
Whiteboard: [fixed by encoding_rs]
Target Milestone: --- → mozilla56
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED