135480 - CSS charset delivered by HTTP is not respected

Reporter

Description

22 years ago

There has been lots of discussion in bug 128896 so that we don't blow up
completely when we can't identify a text/css document's charset and it has
non-ASCII characters, etc. But if I send out the testcase from that bug with
Content-Type: text/html; charset=iso-8859-1, the style is not applied (i.e.,
the charset is not recognized). Using
<link rel="stylesheet" charset="iso-8859-1"> does enable recognition of the
charset.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Assignee

Comment 1

22 years ago

1) Shouldn't you send it as text/css;charset=...?
2) Is "charset=" an optional parameter for the text/css MIME type registration?

Christopher Hoess (gone)

Reporter

Comment 2

22 years ago

1) Probably. My comment was from memory; I just popped in an AddCharset
directive in Apache to do it, so it's being sent correctly.
2) Yes, the parameter is optional, per RFC 2318.

Christopher Hoess (gone)

Reporter

Comment 3

22 years ago

Ok, the code exists to do this at:
http://lxr.mozilla.org/seamonkey/source/content/html/style/src/nsCSSLoader.cpp#650

The space between media-type and parameter is correct per RFC 2616 and Apache
includes it.  Something has b0rked, it seems.
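For reference, here is a minimal sketch of pulling the charset parameter out of a Content-Type value, with or without the space after the semicolon. It is not the nsCSSLoader.cpp logic; the helper name `charset_from_content_type` is hypothetical, and it relies only on Python's standard `email.message.Message` parameter parsing.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lowercased charset parameter, or None if it is absent.

    Hypothetical helper for illustration only; it is not Mozilla's parser.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    charset = msg.get_param("charset")
    # get_param can return a tuple for RFC 2231 encoded values; ignore those here.
    return charset.lower() if isinstance(charset, str) else None


if __name__ == "__main__":
    for value in ("text/css; charset=iso-8859-1",
                  "text/css;charset=ISO-8859-1",
                  "text/css"):
        print(value, "->", charset_from_content_type(value))
```

Both spellings of the header should yield the same charset, and a bare text/css yields nothing, which is the case where the fallback rules discussed later in this bug come into play.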

Boris Zbarsky [:bzbarsky]

Comment 4

22 years ago

I assume the opening comment is wrong in its use of text/html for the
type and means text/css?

I can't test this till I have a real net connection (I have a dumb
terminal right now).  But detection of charset based on the HTTP header
used to work.

Christopher Hoess (gone)

Reporter

Comment 5

22 years ago

Ah, I misparsed dbaron's earlier comment.  Yes, testing on a local server with
"Content-Type: text/css; charset=iso-8859-1" fails.  Adding regression keyword.
(Also, if we don't detect any of these, we should fall back on ASCII rather
than the charset of the HTML document, but we can deal with that later.)

Keywords: regression

Christopher Hoess (gone)

Reporter

Comment 6

22 years ago

Dang.  Works just fine on a newly pulled build.  And I hadn't been fooling 
around with the CSS loader before this.  Strange.  Anyway, WFM; I'll file a bug 
on the ASCII issue once I figure out why my supposed fix for it blows up.

Status: NEW → RESOLVED

Closed: 22 years ago

Resolution: --- → WORKSFORME

Boris Zbarsky [:bzbarsky]

Comment 7

22 years ago

Why should we not fall back on the HTML document charset?

If we don't, we should fall back on ISO-8859-1, not ASCII.

Christopher Hoess (gone)

Reporter

Comment 8

22 years ago

I'm referring to the case of external stylesheets, whose charset is likely to 
have nothing to do with the charset of the document.  (What happens, 
incidentally, if the document uses one of the non-ASCII-compatible Asian 
encodings and there's no charset info. for the stylesheet?)  We should be falling 
back on ASCII because that's the default encoding for text/* MIME-types per RFC 
2046; while RFC 2318 suggests that since CSS syntax is ASCII, ASCII, ISO-8859-1, 
and UTF-8 are acceptable choices for a charset parameter, it does not specify a 
different default in the absence of the charset parameter.

Christopher Hoess (gone)

Reporter

Comment 9

22 years ago

Ach.  I didn't realize HTTP redefined default charset for text/*.  Forget it.

Boris Zbarsky [:bzbarsky]

Comment 10

22 years ago

> (What happens, incidentally, if the document uses one of the non-ASCII-
> compatible Asian encodings and there's no charset info. for the stylesheet?)

This case is apparently why we fall back on the document charset, at the Intl 
team's insistence.  Apparently, most websites that use such an encoding for 
their HTML _also_ use it for the CSS (makes sense, given that both are usually 
created in the same text editor) and do not provide any charset information in 
the CSS.

The fact of the matter is, by the time we're falling back on the document 
charset, all the normal charset discovery methods have failed and we only have 
two options:

1)  Fall back on ISO-8859-1 per the HTTP RFC.
2)  Fall back on the document charset on the assumption that the document and
    the CSS are probably in the same charset

We opted to go with #2 so that most Asian sites work with Mozilla...
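The fallback chain described above can be sketched roughly as follows. This is an illustration, not the loader's code: the function name and parameters are hypothetical, and the relative order of the HTTP header and the <link charset> attribute is an assumption, since this bug only says that both are consulted before the fallback.

```python
def pick_stylesheet_charset(http_charset=None, link_charset=None,
                            document_charset=None):
    """Pick a charset for an external stylesheet (illustrative sketch).

    Assumed order: HTTP Content-Type charset, then the <link charset="...">
    attribute, then the referring document's charset (option 2 above),
    and only then ISO-8859-1 (option 1 above).
    """
    for candidate in (http_charset, link_charset, document_charset):
        if candidate:
            return candidate.lower()
    return "iso-8859-1"


if __name__ == "__main__":
    # No charset info for the sheet: inherit the document's encoding.
    print(pick_stylesheet_charset(document_charset="Shift_JIS"))
    # Nothing known at all: last-resort default.
    print(pick_stylesheet_charset())
```

With this ordering, a Shift_JIS page that links an unlabelled stylesheet gets the sheet decoded as Shift_JIS, which is the behaviour option 2 is meant to preserve.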

As a note, the "CSS syntax is ASCII" thing is bogus.... all it takes is one 
class name in a non-ASCII encoding in the sheet.
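A small sketch of that last point, using a made-up one-rule stylesheet whose class name contains a single non-ASCII character. The sample rule and the helper `decodes_cleanly` are hypothetical and only there to show that the bytes stop being valid ASCII (and, here, valid UTF-8) as soon as such a name appears.

```python
def decodes_cleanly(raw, charset):
    """Return True if the bytes decode without error in the given charset."""
    try:
        raw.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical stylesheet: the class name ends in U+00E9 (e with acute accent).
SHEET_TEXT = ".caf\u00e9 { color: red }"
SHEET_BYTES = SHEET_TEXT.encode("iso-8859-1")

if __name__ == "__main__":
    for charset in ("ascii", "utf-8", "iso-8859-1"):
        print(charset, decodes_cleanly(SHEET_BYTES, charset))
```

Only the charset the sheet was actually written in round-trips the class name, so guessing wrong means the selector no longer matches the markup.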

Categories

(Core :: CSS Parsing and Computation, defect)

People

(Reporter: choess, Assigned: dbaron)

Details

(Keywords: regression)
