nsIUnicharStreamLoaderObserver::onDetermineCharset() should be able to read the first 1024 bytes

RESOLVED FIXED in mozilla22

Status

()

RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: lduros, Assigned: bzbarsky)

Tracking

unspecified
mozilla22
x86_64
Linux
Points:
---
Bug Flags:
in-testsuite ?

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

6 years ago
The specs have changed: http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#charset

A meta charset may appear within the 1024 first bytes as opposed to the 512 bytes that are currently being read. This causes an issue with extensions (LibreJS in particular) using the callback when the meta charset appears after 512 bytes, as in the following page:
http://www.tapochek.net/viewforum.php?f=183
Assignee: nobody → bzbarsky
Component: General → General
Product: Firefox → Core
Component: General → Networking
Whiteboard: [need review]
Created attachment 722392 [details] [diff] [review]
Sniff the first 1024 bytes, not the first 512, for charsets, since <meta> can be anywhere in the first 1024 bytes.
Attachment #722392 - Flags: review?(annevk)

Comment 2

6 years ago
I'd prefer that hsivonen review this. Note that what the original comment referred to is still 512 per http://mimesniff.spec.whatwg.org/

Updated

6 years ago
Attachment #722392 - Flags: review?(annevk) → review?(hsivonen)
Comment on attachment 722392 [details] [diff] [review]
Sniff the first 1024 bytes, not the first 512, for charsets, since <meta> can be anywhere in the first 1024 bytes.

I agree that 1024 bytes should be used for <meta> sniffing. But why is this class used for <meta> sniffing? It seems bad to make the CSS loader wait for more bytes before taking action in order to cater to an extension that isn't using the HTML parser for HTML. What's the extension doing? Why is the extension not using the HTML support in XHR for whatever it is that it is doing?
Or does the CSS loader wait for all bytes to arrive before doing anything anyway?
The original comment is just wrong, since there's no mimesniffing involved here.

> But why is this class used for <meta> sniffing?

Extensions do it.

> It seems bad to make the CSS loader wait for more bytes before taking action

How so?  The only "action" being taken here is providing the encoding to decode as.  Really, the CSS loader would be fine with having _all_ the data before doing that.

> in order to cater to an extension that isn't using the HTML parser for HTML.

If the extension is just trying to show the text of the response, using an HTML parser seems ... wrong.  But it does need to be able to convert the bytes to Unicode.

> Why is the extension not using the HTML support in XHR

That I can't tell you.

> Or does the CSS loader wait for all bytes to arrive before doing anything anyway?

It does, as do all consumers of the code being patched here.  The whole point of this class is to convert an HTTP request into a unichar stream, wait for it to be done, then hand the stream off to someone to read.
Comment on attachment 722392 [details] [diff] [review]
Sniff the first 1024 bytes, not the first 512, for charsets, since <meta> can be anywhere in the first 1024 bytes.

(In reply to Boris Zbarsky (:bz) from comment #5)
> > Or does the CSS loader wait for all bytes to arrive before doing anything anyway?
> 
> It does, as do all consumers of the code being patched here.  The whole
> point of this class is to convert an HTTP request into a unichar stream,
> wait for it to be done, then hand the stream off to someone to read.

Seems strange to have a sniffing limit at all then, but OK. Note that the HTML parser honors even laters metas (in the docshell case--not in the XHR case) by reloading.
Attachment #722392 - Flags: review?(hsivonen) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c271568b51d
Flags: in-testsuite?
Whiteboard: [need review]
Target Milestone: --- → mozilla22
https://hg.mozilla.org/mozilla-central/rev/4c271568b51d
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.