Closed Bug 741776 Opened 8 years ago Closed 3 years ago

Use UTF-8 and don't honor the charset parameter on application/json resources loaded as documents

Categories

(Core :: Document Navigation, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: hsivonen, Assigned: hsivonen)

References

Details

Attachments

(1 file, 1 obsolete file)

XHR ignores the HTTP charset for JSON responses.  The encoding is always UTF-8.  It would probably be logical not to honor the charset parameter when a JSON resource is loaded as a document into docshell. If there is a charset parameter specifying an encoding other than UTF-8, there should probably be a message to the error console.
Duplicate of this bug: 1166558
Summary: Don't honor the charset parameter on application/json resources loaded as documents → Use UTF-8 and don't honor the charset parameter on application/json resources loaded as documents
People continue to be confused by this (see http://stackoverflow.com/questions/13096259/can-charset-parameter-be-used-with-application-json-content-type-in-http-1-1) and thus include non-sense charset parameters.

Can this *please* be fixed?
Do note that at the moment, both Chrome and Firefox interpret the naked application/json mime type (without the charset parameter) *wrong*. The pick the *wrong* encoding, showing mangled characters for any chars outside of ASCII.

SEE

json displayed with wrong encoding (i.e. not unicode)
https://bugs.chromium.org/p/chromium/issues/detail?id=438464&

and

Bug 1166558 - Charset should default to utf-8 for application/json
https://bugzilla.mozilla.org/show_bug.cgi?id=1166558
(which is marked as a dupe of this bug, which is wrong)

Ironically, *this* bug (of not ignoring charset) is also present in both these browsers...

The result? Just check stack overflow. The whole world is adding `; charset=utf-8` to the mime type to work around both bugs mentioned above. If you fix this bug without first fixing the other, you will create a situation in which it will be impossible for devs to enforce the use of the *correct* encoding.

Hence, this bug should actually be blocked on Bug 1166558.
Filed a spec bug for making the spec deal with this in an explicit way as opposed to the current hand-wavy way:
https://github.com/whatwg/html/issues/1403

Set the flag as on-stack boolean here:
https://mxr.mozilla.org/mozilla-central/source/dom/html/nsHTMLDocument.cpp#547

Check the boolean here and move the other checks to an "else" branch:
https://mxr.mozilla.org/mozilla-central/source/dom/html/nsHTMLDocument.cpp#673

Communicate the flag to the parser via the charset source. Introduce a source stronger than BOM in
https://mxr.mozilla.org/mozilla-central/source/parser/nsCharsetSource.h#24

Make the parser remove the UTF-8 BOM but not honor the UTF-16 BOMs with that charset source.

Make the encoding menu disablement code check >= instead of == BOM at:
https://mxr.mozilla.org/mozilla-central/source/dom/html/nsHTMLDocument.cpp#3633
Duplicate of this bug: 1166558
Attached patch Completely untested patch (obsolete) — Splinter Review
Here's a completely untested (not even compiled) patch for interested parties to try. I'll have time to test it the week after next at the earliest.
On desktop, the dev tools not intercept JSON at least when loaded into a top-level browsing context, but this bug still applies in other cases, including mobile and WebVTT.
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Attachment #8761539 - Attachment is obsolete: true
Attachment #8814617 - Flags: review?(ehsan)
The reason why the test case is .vtt instead of .json is in comment 8: .json now has the dev tools taking over on desktop.
Comment on attachment 8814617 [details]
Bug 741776 - Treat JSON, WebVTT and AppCache manifests as UTF-8 when loaded as plain text.

https://reviewboard.mozilla.org/r/95794/#review96010

::: dom/html/reftests/741776-1-ref.html:1
(Diff revision 1)
> +<meta charset=utf-8><plaintext>ää

Any reason you're using <plaintext> instead of <pre>?

::: uriloader/exthandler/nsExternalHelperAppService.cpp:525
(Diff revision 1)
> -  { "application/xhtml+xml", "xhtml" },
> -  { "application/xhtml+xml", "xht" },
> +  { APPLICATION_XHTML_XML, "xhtml" },
> +  { APPLICATION_XHTML_XML, "xht" },
>    { TEXT_PLAIN, "txt" },
> +  { APPLICATION_JSON, "json" },
> +  { TEXT_VTT, "vtt" },
> +  { TEXT_CACHE_MANIFEST, "appcache" },

Hmm, why do you need to do this?  This hunk is making it impossible for users to override which application we use to open these three extensions.  It's totally unreasonable to expect that Firefox is a good application to associate with .vtt and .appcache.  The former has probably better editor applications and the latter isn't really a well-known extension even.  Arguably even doing this for .json is very questionable (what if a user doesn't like our default devtools JSON viewer and prefers .json files to be opened with their favorite external application?

If you have a good argument for this hunk, I think at the very least it needs to be forked off into its own bug.

::: uriloader/exthandler/nsExternalHelperAppService.cpp:599
(Diff revision 1)
>    { IMAGE_SVG_XML, "svg", "Scalable Vector Graphics" },
>    { MESSAGE_RFC822, "eml", "RFC-822 data" },
>    { TEXT_PLAIN, "txt,text", "Text File" },
> +  { APPLICATION_JSON, "json", "JavaScript Object Notation" },
> +  { TEXT_VTT, "vtt", "Web Video Text Tracks" },
> +  { TEXT_CACHE_MANIFEST, "appcache", "Application Cache Manifest" },

Note that this hunk effectively makes these extensions known to Firefox, but it allows the user to override them if they choose to do so.

I think doing this here for json and vtt is fine.  However the .appcache extension sounds kind of made-up to me.  :-)
Attachment #8814617 - Flags: review?(ehsan) → review-
Sorry, MozReview is too eager to submit my reviews before I'm finished.  :-(  The following was supposed to go with the review comments:

Thanks for the patch, everything looks great except for the hunks in nsExternalHelperAppService.cpp.  Minusing for that.
Comment on attachment 8814617 [details]
Bug 741776 - Treat JSON, WebVTT and AppCache manifests as UTF-8 when loaded as plain text.

https://reviewboard.mozilla.org/r/95794/#review96010

> Any reason you're using <plaintext> instead of <pre>?

<plaintext> seemed like the most obvious way to get text/plain-like effects in .html, but you are right that <pre> would be a better match DOM-wise.

> Hmm, why do you need to do this?  This hunk is making it impossible for users to override which application we use to open these three extensions.  It's totally unreasonable to expect that Firefox is a good application to associate with .vtt and .appcache.  The former has probably better editor applications and the latter isn't really a well-known extension even.  Arguably even doing this for .json is very questionable (what if a user doesn't like our default devtools JSON viewer and prefers .json files to be opened with their favorite external application?
> 
> If you have a good argument for this hunk, I think at the very least it needs to be forked off into its own bug.

It looks like it works without this.

> Note that this hunk effectively makes these extensions known to Firefox, but it allows the user to override them if they choose to do so.
> 
> I think doing this here for json and vtt is fine.  However the .appcache extension sounds kind of made-up to me.  :-)

.appcache is in the spec and in various pieces of documentation:
https://html.spec.whatwg.org/#introduction-10
https://en.wikipedia.org/wiki/Cache_manifest_in_HTML5
https://developer.mozilla.org/en-US/docs/Web/HTML/Using_the_application_cache
https://www.html5rocks.com/en/tutorials/appcache/beginner/
Comment on attachment 8814617 [details]
Bug 741776 - Treat JSON, WebVTT and AppCache manifests as UTF-8 when loaded as plain text.

https://reviewboard.mozilla.org/r/95794/#review96842
Attachment #8814617 - Flags: review?(ehsan) → review+
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9f1de4aeae3a
Treat JSON, WebVTT and AppCache manifests as UTF-8 when loaded as plain text. r=Ehsan
https://hg.mozilla.org/mozilla-central/rev/9f1de4aeae3a
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
See Also: → 1266023
Duplicate of this bug: 1266023
You need to log in before you can comment on or make changes to this bug.