CSS charset delivered by HTTP is not respected

RESOLVED WORKSFORME

Status

Core
CSS Parsing and Computation
--
major
RESOLVED WORKSFORME
16 years ago
16 years ago

People

(Reporter: Christopher Hoess (gone), Assigned: dbaron)

Tracking

({regression})

Trunk
x86
Linux
regression

Details

(Reporter)

Description

16 years ago
There was lots of discussion in bug 128896 about not blowing up completely when
we can't identify a text/css document's charset and it has non-ASCII characters,
etc. But if I send out the testcase from that bug with Content-Type: text/html;
charset=iso-8859-1, the style is not applied (i.e., the charset is not recognized).
Using <link rel="stylesheet" charset="iso-8859-1"> does enable recognition of
the charset.
(Assignee)

Comment 1

16 years ago
1) Shouldn't you send as text/css;charset=...
2) Is "charset=" an optional parameter for the text/css MIME type registration?
(Reporter)

Comment 2

16 years ago
1) Probably. My comment was from memory; I just popped in an AddCharset
directive in Apache to do it, so it's being sent correctly.
2) Yes, the parameter is optional, per RFC 2318.
(Reporter)

Comment 3

16 years ago
Ok, the code exists to do this at:
http://lxr.mozilla.org/seamonkey/source/content/html/style/src/nsCSSLoader.cpp#650

The space between media-type and parameter is correct per RFC 2616 and Apache
includes it.  Something has b0rked, it seems.
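
For illustration only, here is a minimal Python sketch of pulling the optional charset parameter out of a Content-Type value, with or without the space after the semicolon. It is not the nsCSSLoader.cpp code; the helper name charset_from_content_type is hypothetical, and it leans on the standard library's email.message.Message to do the parameter parsing.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter, or None if it is absent.

    Hypothetical helper for illustration; not Mozilla's implementation.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset()


# With the space after the semicolon, as Apache sends it.
with_space = charset_from_content_type("text/css; charset=iso-8859-1")
# Without the space; the parameter should be found either way.
without_space = charset_from_content_type("text/css;charset=ISO-8859-1")
# The parameter is optional, so a bare media type carries no charset.
missing = charset_from_content_type("text/css")
```

A caller would treat a None result as "no charset from HTTP" and move on to whatever other hints are available.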
I assume the opening comment is wrong in its use of text/html for the
type and means text/css?

I can't test this till I have a real net connection (I have a dumb
terminal right now).  But detection of charset based on the HTTP header
used to work.
(Reporter)

Comment 5

16 years ago
Ah, I misparsed dbaron's earlier comment.  Yes, testing on a local server with
"Content-Type: text/css; charset=iso-8859-1" fails.  Adding regression keyword.
(Also, if we don't detect any of these, we should fall back on ASCII rather
than the charset of the HTML document, but we can deal with that later.)
Keywords: regression
(Reporter)

Comment 6

16 years ago
Dang.  Works just fine on a newly pulled build.  And I hadn't been fooling 
around with the CSS loader before this.  Strange.  Anyway, WFM; I'll file a bug 
on the ASCII issue once I figure out why my supposed fix for it blows up.
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → WORKSFORME
Why should we not fall back on the HTML document charset?

If we don't, we should fall back on ISO-8859-1, not ASCII.
(Reporter)

Comment 8

16 years ago
I'm referring to the case of external stylesheets, whose charset is likely to 
have nothing to do with the charset of the document.  (What happens, 
incidentally, if the document uses one of the non-ASCII-compatible Asian 
encodings and there's no charset info. for the stylesheet?)  We should be falling 
back on ASCII because that's the default encoding for text/* MIME-types per RFC 
2046; while RFC 2318 suggests that since CSS syntax is ASCII, ASCII, ISO-8859-1, 
and UTF-8 are acceptable choices for a charset parameter, it does not specify a 
different default in the absence of the charset parameter.
(Reporter)

Comment 9

16 years ago
Ach.  I didn't realize HTTP redefined default charset for text/*.  Forget it.
> (What happens, incidentally, if the document uses one of the non-ASCII-
> compatible Asian encodings and there's no charset info for the stylesheet?

This case is apparently why we fall back on the document charset, at the Intl 
team's insistence.  Apparently, most websites that use such an encoding for 
their HTML _also_ use it for the CSS (makes sense, given that both are usually 
created in the same text editor) and do not provide any charset information in 
the CSS.

The fact of the matter is, by the time we're falling back on the document 
charset, all the normal charset discovery methods have failed and we only have 
two options:

1)  Fall back on ISO-8859-1 per the HTTP RFC.
2)  Fall back on the document charset on the assumption that the document and
    the CSS are probably in the same charset.

We opted to go with #2 so that most Asian sites work with Mozilla...

As a note, the "CSS syntax is ASCII" thing is bogus... all it takes is one
class name in a non-ASCII encoding in the sheet.
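
A rough Python sketch of the fallback described above, not the actual loader logic. The function names are hypothetical. The relative order of the HTTP header and the <link charset> attribute is an assumption, as is the final ISO-8859-1 default when not even a document charset is known. Other discovery methods, such as an @charset rule in the sheet itself, are left out.

```python
def pick_stylesheet_charset(http_charset=None, link_charset=None,
                            document_charset=None):
    """Return the first available charset hint (hypothetical helper)."""
    # Assumed order: HTTP header, then <link charset>, then the document
    # charset as the last resort (option 2 above).
    for candidate in (http_charset, link_charset, document_charset):
        if candidate:
            return candidate.lower()
    # Assumption: with no hint at all, use ISO-8859-1 (option 1 above).
    return "iso-8859-1"


def can_decode(data, charset):
    """Report whether the bytes decode cleanly in the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# A sheet whose only non-ASCII content is a single class name.
original = ".見出し { color: red }"
sheet_bytes = original.encode("shift_jis")

# No charset information for the sheet, but the document is Shift_JIS.
chosen = pick_stylesheet_charset(document_charset="Shift_JIS")
decoded = sheet_bytes.decode(chosen)

# ASCII cannot decode the sheet at all, and ISO-8859-1 decodes it to a
# different class name, so the selector would no longer match.
ascii_ok = can_decode(sheet_bytes, "ascii")
latin1_text = sheet_bytes.decode("iso-8859-1")
```

In this example only the document-charset fallback recovers the class name the author wrote, which is the case the Intl team was worried about.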