The default bug view has changed. See this FAQ.

Make BOM take precedence over HTTP for CSS

RESOLVED FIXED in mozilla19

Status

()

Core
CSS Parsing and Computation
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: hsivonen, Assigned: hsivonen)

Tracking

(Depends on: 1 bug, Blocks: 1 bug, {dev-doc-complete})

unspecified
mozilla19
dev-doc-complete
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 6 obsolete attachments)

(Assignee)

Description

5 years ago
Determining the character encoding when CSS loading should be consistent with bug 716579 and bug 687859 for the sake of the overall consistency of the platform. That is, a BOM should take precedence over HTTP-level charset in accordance with http://encoding.spec.whatwg.org/#decode .
So BOM should imply UTF-16, then?  Right now we actually look for @charset after a BOM and allow @charset to set the encoding if it's present....

In any case, this needs to be done in the working group, since it involves changing http://www.w3.org/TR/CSS21/syndata.html#charset
(Assignee)

Comment 2

5 years ago
(In reply to Boris Zbarsky (:bz) from comment #1)
> So BOM should imply UTF-16, then?

Or UTF-8 in the case of the UTF-8 BOM.
(Assignee)

Comment 3

5 years ago
Chrome and Opera already do what this bug asks for. We are not currently matching IE, either. (IE strips the BOM from CSS and then proceeds to decode according to the encoding specified on the HTTP level, which is really bizarre.) I did not test Safari.
(Assignee)

Comment 4

5 years ago
(In reply to Boris Zbarsky (:bz) from comment #1)
> In any case, this needs to be done in the working group, since it involves
> changing http://www.w3.org/TR/CSS21/syndata.html#charset

http://lists.w3.org/Archives/Public/www-style/2012Oct/0516.html
(Assignee)

Updated

5 years ago
Assignee: nobody → hsivonen
(Assignee)

Comment 5

5 years ago
I don’t see mCharsetHint on SheetLoadData ever getting set to a non-empty value. What am I missing?
(Assignee)

Comment 6

5 years ago
Created attachment 674651 [details] [diff] [review]
WIP
(Assignee)

Comment 7

5 years ago
The patch needs bug 801402 to land first.
Depends on: 801402
> I don’t see mCharsetHint on SheetLoadData ever getting set to a non-empty value.

It might not be.  It's sort of part of the public LoadSheet() API, but consumers may always be passing an empty string.

We could just nuke it.
(Assignee)

Comment 9

4 years ago
(In reply to Boris Zbarsky (:bz) from comment #8)
> > I don’t see mCharsetHint on SheetLoadData ever getting set to a non-empty value.
> 
> It might not be.

Turns out it’s used for preloads.

- -

The tests for this bug are at https://test.csswg.org/source/contributors/opera/submitted/css3-syntax/charset/

That is, the tests are testharness.js tests but they are part of CSS WG test suite. We don’t have any precedent for importing tests with this combination. Where should I put them? Does the tooling at http://mxr.mozilla.org/mozilla-central/source/dom/imptests/ support importing a test suite under layout/ ?
(Assignee)

Comment 10

4 years ago
Created attachment 679679 [details] [diff] [review]
Adjust old tests
(Assignee)

Comment 11

4 years ago
Created attachment 679680 [details] [diff] [review]
Implement CSS3 Syntax charset handling
Attachment #674651 - Attachment is obsolete: true
Created attachment 680108 [details] [diff] [review]
Import CSSWG tests

This puts them under dom/imptests/css. I guess it would be a bit nicer to put them somewhere under layout; file a bug on me if you'd like me to extend the script to allow that.
(Assignee)

Comment 13

4 years ago
Created attachment 680627 [details] [diff] [review]
Patch on top of Ms2ger’s patch

This patch is needed on top of the one Ms2ger uploaded. However, this doesn’t mark dom/imptests/css/contributors/opera/submitted/css3-syntax/charset/test_page-utf16-css-bomless-utf16be.html as a known failure, because I couldn’t get parseFailures.py to do anything.
(Assignee)

Comment 14

4 years ago
And it’s a known failure, because it ends up testing whether our UTF-16 decoders try to sniff endianness in the BOMless case. (Our decoder sniffs. The Encoding Standard says not to, at the moment.)

Outside the scope of this bug.
(Assignee)

Comment 15

4 years ago
Created attachment 680628 [details] [diff] [review]
Adjust old tests by flipping red and green
Attachment #679679 - Attachment is obsolete: true
Attachment #680628 - Flags: review?(bzbarsky)
(Assignee)

Comment 16

4 years ago
Created attachment 680630 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with updated comment
Attachment #679680 - Attachment is obsolete: true
Attachment #680630 - Flags: review?(bzbarsky)
Keywords: dev-doc-needed
Comment on attachment 680627 [details] [diff] [review]
Patch on top of Ms2ger’s patch

Review of attachment 680627 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/imptests/Makefile.in
@@ +22,5 @@
>  
>  include $(srcdir)/editing.mk
>  include $(srcdir)/html.mk
>  include $(srcdir)/webapps.mk
> +include $(srcdir)/css.mk

Gah, sorry :/ I fixed the instructions in the readme to mention this:

https://hg.mozilla.org/mozilla-central/rev/96e7580fb756
Comment on attachment 680628 [details] [diff] [review]
Adjust old tests by flipping red and green

This is landing in the same changeset as the code changes, right?

r=me
Attachment #680628 - Flags: review?(bzbarsky) → review+
Comment on attachment 680630 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with updated comment

r=me if you leave the logging about where the charset info is coming from instead of removing it.
Attachment #680630 - Flags: review?(bzbarsky) → review+
Comment on attachment 680630 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with updated comment

Actually, two more comments:

>+++ b/layout/style/Loader.cpp
>+  for (; i < sizeof(kCharsetSym) - 1; ++i) {

Why not just strncmp?

>+        aCharset.Trim(" \t\n\f\r");

Is this specced?  Is it needed?  Seems pretty weird... especially since having an unescaped linebreak inside a quoted-string is not OK in CSS.
(Assignee)

Comment 21

4 years ago
(In reply to Boris Zbarsky (:bz) from comment #20)
> >+++ b/layout/style/Loader.cpp
> >+  for (; i < sizeof(kCharsetSym) - 1; ++i) {
> 
> Why not just strncmp?

Because the old code did not have strncmp and I don’t tend to think in terms of the C standard library.

> >+        aCharset.Trim(" \t\n\f\r");
> 
> Is this specced?

Yes. See http://dev.w3.org/csswg/css3-syntax/#the-input-byte-stream

Step 3 says, in part: “then get an encoding for the sequence of XX bytes, decoded per windows-1252, and let temp be the return value.”

The sequence of XX bytes can contain anything except the double quote.

“Get an encoding” is defined in http://encoding.spec.whatwg.org/#concept-encoding-get which says “Remove any leading and trailing ASCII whitespace from label.” And ASCII whitespace is defined to be " ", "\t", "\n", "\f" and "\r".

I suppose it would be slightly more proper to call EncodingUtils::FindEncodingForLabel() instead of inlining “get an encoding” like this.

> Is it needed? 

It’s definitely needed in other parts of the platform that call “get an encoding”. Dunno if it is needed here. It’s a side effect of reusing the “get an encoding” concept.
(Assignee)

Comment 22

4 years ago
Created attachment 681011 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with review comments addressed
Attachment #680628 - Attachment is obsolete: true
Attachment #680630 - Attachment is obsolete: true
Comment on attachment 681011 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with review comments addressed

> +  for (uint32_t i = sizeof(kCharsetSym); i < aDataLength; ++i) {

Should that not be sizeof(kCharsetSym) - 1?
(Assignee)

Comment 24

4 years ago
Created attachment 681867 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with review comments addressed, without off-by-one error

(In reply to Boris Zbarsky (:bz) from comment #23)
> Comment on attachment 681011 [details] [diff] [review]
> Implement CSS3 Syntax charset handling, with review comments addressed
> 
> > +  for (uint32_t i = sizeof(kCharsetSym); i < aDataLength; ++i) {
> 
> Should that not be sizeof(kCharsetSym) - 1?

Oops. Yes.

This patch uses strncmp and does not attempt to inline parts of "get an encoding".
Attachment #681011 - Attachment is obsolete: true
Attachment #681867 - Flags: review?(bzbarsky)
Comment on attachment 681867 [details] [diff] [review]
Implement CSS3 Syntax charset handling, with review comments addressed, without off-by-one error

r=me
Attachment #681867 - Flags: review?(bzbarsky) → review+
(Assignee)

Comment 26

4 years ago
(In reply to Boris Zbarsky (:bz) from comment #25)
> r=me

Thanks.

Landed the code now in order to get all the BOM precedence stuff in the same train:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b301f9b2e956

Leaving open pending a decision and help from Ms2ger on how to deal with the test import. (Need to sort out the failure metadata and the HTTP headers.)
Whiteboard: [leave open]
https://hg.mozilla.org/mozilla-central/rev/b301f9b2e956
Depends on: 816849
Having landed and shipped code changes that change behavior without resolving the bug and setting a target milestone indicating when it got is _really_ confusing for regression-hunting.

Please resolve this bug appropriately and file a followup on whatever work remains.
Blocks: 859706
(Assignee)

Comment 29

4 years ago
(In reply to Boris Zbarsky (:bz) from comment #28)
> Having landed and shipped code changes that change behavior without
> resolving the bug and setting a target milestone indicating when it got is
> _really_ confusing for regression-hunting.

Yeah. This remained open longer than I expected it to remain open.

> Please resolve this bug appropriately and file a followup on whatever work
> remains.

Filed bug 880201.
Whiteboard: [leave open]
Target Milestone: --- → mozilla19
(Assignee)

Updated

4 years ago
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Updated https://developer.mozilla.org/en-US/docs/Web/CSS/@charset
to reflect that change.

Teoli, can you check if we have other places we document that topic ?
Flags: needinfo?(jypenator)
Keywords: dev-doc-needed → dev-doc-complete
I have not found other places. It looks ok.
Flags: needinfo?(jypenator)
You need to log in before you can comment on or make changes to this bug.