Closed Bug 135480 Opened 22 years ago Closed 22 years ago

CSS charset delivered by HTTP is not respected

Categories

(Core :: CSS Parsing and Computation, defect)

x86
Linux
defect
Not set
major

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: choess, Assigned: dbaron)

Details

(Keywords: regression)

There was lots of discussion in bug 128896 about not blowing up completely when
we can't identify a text/css document's charset and it has non-ASCII characters.
But if I send out the testcase from that bug with Content-Type: text/html;
charset=iso-8859-1, the style is not applied (i.e., charset is not recognized).
 Using <link rel="stylesheet" charset="iso-8859-1"> does enable recognition of
the charset.
1) Shouldn't you send as text/css;charset=...
2) Is "charset=" an optional parameter for the text/css MIME type registration?
1) Probably. My comment was from memory; I just popped in an AddCharset
directive in Apache to do it, so it's being sent correctly.
2) Yes, the parameter is optional, per RFC 2318.
Ok, the code exists to do this at:
http://lxr.mozilla.org/seamonkey/source/content/html/style/src/nsCSSLoader.cpp#650

The space between media-type and parameter is correct per RFC 2616 and Apache
includes it.  Something has b0rked, it seems.
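
For reference, a minimal sketch of pulling the charset parameter out of a Content-Type value, tolerating the optional space after the semicolon. This is not the nsCSSLoader.cpp code; the function name charset_from_content_type is hypothetical and the parsing is deliberately simplified (it does not handle a ";" inside a quoted value).

```python
def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value, or None.

    Hypothetical helper for illustration only; simplified parsing.
    """
    # The media type comes first; parameters follow, separated by ';'.
    _media_type, *params = header_value.split(";")
    for param in params:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # Drop optional whitespace and surrounding double quotes.
            return value.strip().strip('"').lower() or None
    return None


if __name__ == "__main__":
    print(charset_from_content_type("text/css; charset=iso-8859-1"))
```

Both "text/css; charset=iso-8859-1" and "text/css;charset=iso-8859-1" yield the same result here, and a bare "text/css" yields None.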
I assume the opening comment is wrong in its use of text/html for the
type and means text/css?

I can't test this till I have a real net connection (I have a dumb
terminal right now).  But detection of charset based on the HTTP header
used to work.
Ah, I misparsed dbaron's earlier comment.  Yes, testing on a local server with
"Content-Type: text/css; charset=iso-8859-1" fails.  Adding regression keyword.
 (Also, if we don't detect any of these, we should fall back on ASCII rather
than the charset of the HTML document, but we can deal with that later.)
Keywords: regression
Dang.  Works just fine on a newly pulled build.  And I hadn't been fooling 
around with the CSS loader before this.  Strange.  Anyway, WFM; I'll file a bug 
on the ASCII issue once I figure out why my supposed fix for it blows up.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME
Why should we not fall back on the HTML document charset?

If we don't, we should fall back on ISO-8859-1, not ASCII.
I'm referring to the case of external stylesheets, whose charset is likely to 
have nothing to do with the charset of the document.  (What happens, 
incidentally, if the document uses one of the non-ASCII-compatible Asian 
encodings and there's no charset info. for the stylesheet?)  We should be falling 
back on ASCII because that's the default encoding for text/* MIME-types per RFC 
2046; while RFC 2318 suggests that since CSS syntax is ASCII, ASCII, ISO-8859-1, 
and UTF-8 are acceptable choices for a charset parameter, it does not specify a 
different default in the absence of the charset parameter.
Ach.  I didn't realize HTTP redefined default charset for text/*.  Forget it.
> (What happens, incidentally, if the document uses one of the non-ASCII-
> compatible Asian encodings and there's no charset info for the stylesheet?

This case is apparently why we fall back on the document charset, at the Intl 
team's insistence.  Apparently, most websites that use such an encoding for 
their HTML _also_ use it for the CSS (makes sense, given that both are usually 
created in the same text editor) and do not provide any charset information in 
the CSS.

The fact of the matter is, by the time we're falling back on the document 
charset, all the normal charset discovery methods have failed and we only have 
two options:

1)  Fall back on ISO-8859-1 per the HTTP RFC.
2)  Fall back on the document charset on the assumption that the document and
    the css are probably in the same charset

We opted to go with #2 so that most Asian sites work with Mozilla...
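
A rough sketch of the fallback chain described above, not the actual loader code. The function name pick_stylesheet_charset and its parameters are hypothetical. The thread only names the HTTP header and the link element's charset attribute as discovery methods, so their relative order here is an assumption, as is using ISO-8859-1 when even the document charset is unknown.

```python
def pick_stylesheet_charset(http_charset=None, link_charset=None,
                            document_charset=None):
    """Pick a charset for an external stylesheet (illustrative sketch).

    Assumed order: HTTP header, then the link element's charset attribute,
    then the charset of the referring document (option 2 in the comment above).
    """
    for candidate in (http_charset, link_charset, document_charset):
        if candidate:
            return candidate.lower()
    # Assumption: with nothing else to go on, use the HTTP default (option 1).
    return "iso-8859-1"


if __name__ == "__main__":
    # No charset info for the sheet; the document is Shift_JIS.
    print(pick_stylesheet_charset(document_charset="Shift_JIS"))
```

With no header and no attribute, a sheet linked from a Shift_JIS document is read as Shift_JIS, which is the case the Intl team cared about.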

As a note, the "CSS syntax is ASCII" thing is bogus.... all it takes is one 
class name in a non-ASCII encoding in the sheet.
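
To illustrate the point, a small sketch with a made-up sheet containing one non-ASCII class name. The helper decodes_cleanly and the sample rule are hypothetical and exist only for this example.

```python
def decodes_cleanly(data, charset):
    """Return True if the bytes decode under the given charset without error."""
    try:
        data.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# A made-up sheet whose only non-ASCII content is a class name.
SHEET_BYTES = ".caf\u00e9 { color: red }".encode("iso-8859-1")

if __name__ == "__main__":
    for charset in ("ascii", "iso-8859-1"):
        print(charset, decodes_cleanly(SHEET_BYTES, charset))
```

The same bytes are unreadable as ASCII but fine as ISO-8859-1, so an ASCII default would reject an otherwise ordinary sheet.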