Open Bug 1338797 Opened 7 years ago Updated 2 years ago

Default fallback encoding of windows-1252 is surprising

Categories

(Core :: DOM: Core & HTML, defect, P3)

People

(Reporter: jdm, Unassigned)

Attachments

(1 file)

https://xkcd.com/1705/ displays the comic's title differently in Firefox and Chrome. Chrome chooses UTF-8 encoding for the page, while Firefox chooses windows-1252 due to https://dxr.mozilla.org/mozilla-central/rev/855e6b2f6199189f37cea093cbdd1735e297e8aa/dom/encoding/FallbackEncoding.cpp#100 . This seems like a surprising default to me.
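For readers unfamiliar with the symptom, here is a minimal Python sketch of what the two encoding choices do to the same bytes. The sample string is an assumption picked for illustration, and Python's `cp1252` codec is only an approximation of the browser's windows-1252 decoder (it rejects a few bytes that browsers map to control characters).

```python
# Sample text containing one non-ASCII character (chosen for illustration).
text = "Pokémon Go"

# The bytes a server would send if the page is stored as UTF-8.
raw = text.encode("utf-8")

# Decoding with the encoding the page was written in round-trips cleanly.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as windows-1252 turns the two-byte sequence for
# "é" (0xC3 0xA9) into two separate characters.
as_cp1252 = raw.decode("cp1252")

print(as_utf8)
print(as_cp1252)
```

The second line printed shows the kind of mojibake a user sees when a UTF-8 page without a declared encoding is decoded with the windows-1252 fallback.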
Attached file Minimal testcase
Running this from `python -m SimpleHTTPServer` reproduces the same issue that the XKCD comic shows.
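A rough Python 3 sketch of an equivalent setup, for anyone without the attachment at hand: a local server that returns UTF-8 bytes with neither a `charset` parameter in `Content-Type` nor an encoding declaration in the markup. The page content, `NoCharsetHandler`, and `serve` are hypothetical names made up for this sketch, not the contents of the attached testcase.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in page: non-ASCII text encoded as UTF-8, with no
# encoding declaration in the markup and no byte order mark.
PAGE = "<!DOCTYPE html><title>Pokémon</title><p>Pokémon</p>".encode("utf-8")


class NoCharsetHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # Deliberately no "; charset=..." parameter, so the browser has to
        # pick an encoding on its own.
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)


def serve(port=8000):
    # Blocks until interrupted; listens on localhost only.
    HTTPServer(("127.0.0.1", port), NoCharsetHandler).serve_forever()
```

Calling `serve()` and loading `http://127.0.0.1:8000/` in each browser should show whether the title is decoded as UTF-8 or as windows-1252.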
What's surprising about it other than it not matching Chrome's latest behavior in this case?

Firefox is doing what's consistent with the traditional behavior of the Web Platform. Chrome recently made a change to do more content-based guessing without discussing it at the WHATWG ahead of time (https://github.com/whatwg/encoding/issues/68#issuecomment-272993181). Looks like it's already causing interop damage. :-(
Priority: -- → P3
I think this bug will be fixed by https://bugzilla.mozilla.org/show_bug.cgi?id=1497037, as we should assume UTF-8 in the absence of extra information.

This is even more annoying on plain text log files where attempts to detect encoding are likely to fail.
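To make the "assume UTF-8 in the absence of extra information" idea concrete, here is a small Python sketch of one possible policy: honor a declared encoding if there is one, otherwise try strict UTF-8 and only then fall back to windows-1252. This is an illustration under those assumptions, not Firefox's or Chrome's actual algorithm, and `decode_with_fallback` is a hypothetical helper.

```python
def decode_with_fallback(raw, declared=None):
    """Decode bytes using a declared encoding, else UTF-8, else windows-1252."""
    if declared is not None:
        # Extra information is available (e.g. an HTTP header), so use it.
        return raw.decode(declared, errors="replace")
    try:
        # No extra information: assume UTF-8, but only if the bytes are valid.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid as UTF-8, so treat the content as legacy windows-1252.
        return raw.decode("cp1252", errors="replace")


# Valid UTF-8 is decoded as UTF-8.
print(decode_with_fallback("café".encode("utf-8")))
# A lone 0xE9 byte is not valid UTF-8, so the legacy fallback applies.
print(decode_with_fallback(b"caf\xe9"))
```

Strict UTF-8 validation rarely accepts legacy-encoded text that contains non-ASCII bytes, which is why trying UTF-8 first is a comparatively safe guess even for plain text files.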
Component: DOM → DOM: Core & HTML
Severity: normal → S3