Closed
Bug 136529
Opened 23 years ago
Closed 23 years ago
CSS does not get loaded through LINK if HTML version is 4.0 Strict or 4.01 Transitional/Strict/Frameset
Categories
(Core :: CSS Parsing and Computation, defect)
VERIFIED
INVALID
People
(Reporter: kripe, Assigned: dbaron)
Details
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020313
BuildID: 2002031312
http://whhh.wellington.net.nz/lime-401.html - HTML 4.01 Strict document that
should get a lime background from the CSS /lime.css.
http://whhh.wellington.net.nz/lime-40.html - Same document with HTML 4.0
Transitional header - works in Mozilla 0.9.9.
Both used to work in Mozilla 0.9.2.1.
Reproducible: Always
Steps to Reproduce:
1. Check out the URLs above.
Actual Results: You will see a document with white background, even though the
stylesheet defines a lime background.
Expected Results: Lime background.
Assignee
Comment 1•23 years ago
The CSS file is being served with a MIME type of text/plain, so we will ignore
it if we are in standards mode. See http://mozilla.org/docs/web-developer/quirks/ .
Status: UNCONFIRMED → RESOLVED
Closed: 23 years ago
Resolution: --- → INVALID
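The behaviour described in comment 1 can be modelled roughly as below. This is an illustrative Python sketch, not Gecko's code: the function names `media_type` and `should_apply_stylesheet` are hypothetical, and treating quirks mode as "accept whatever the server sent" is an assumption that simplifies the linked quirks documentation.

```python
def media_type(content_type_header):
    """Return the bare, lower-cased media type from a Content-Type value."""
    return content_type_header.split(";", 1)[0].strip().lower()


def should_apply_stylesheet(content_type_header, standards_mode):
    """Model of comment 1: standards mode insists on text/css.

    Assumption: quirks mode is modelled as accepting any served type.
    """
    if not standards_mode:
        return True
    return media_type(content_type_header) == "text/css"


# lime-401.html (HTML 4.01 Strict): standards mode, stylesheet ignored.
print(should_apply_stylesheet("text/plain", standards_mode=True))   # False
# lime-40.html (HTML 4.0 Transitional): presumably quirks mode, stylesheet used.
print(should_apply_stylesheet("text/plain", standards_mode=False))  # True
```

Under this model the two test pages differ only in rendering mode, which would explain why one shows the lime background and the other does not.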
Comment 2•23 years ago
VERIFIED that lime.css is incorrectly sent as text/plain
Status: RESOLVED → VERIFIED
Comment 3•23 years ago
Specifying the type="text/css" attribute in the <link rel="stylesheet" .../>
element should solve the problem in any case, even if the web server reports
the wrong MIME type, or does not report any MIME type (for example if the link
references a local file or an FTP resource).
Comment 4•23 years ago
Actually, it should only affect it if the server sends no MIME type. If the
server sends a MIME type then per the HTML and HTTP specs, the server's MIME
type overrides the MIME type given on the <link> element.
Comment 5•23 years ago
"per the HTML and HTTP specs"...
Where did you read that?
Certainly not in the HTTP spec (which is FULLY independent of HTML and thus
does not mandate how an application uses the documents it downloads).
Certainly not in the HTML spec (which only says SHOULD or MAY, mandates
nothing regarding the transport layer, and is likewise FULLY independent of
HTTP or any other transport protocol).
So we must read the specs regarding the <link> and <style> elements as they
are, without reference to any protocol. The HTML spec only says that a browser
should decide whether or not to download the referenced document by examining
the type attribute in advance, to see if it supports this content type. If it
sees that it supports the "text/css" content type, then it must maintain that
decision when it downloads the stylesheet, and it clearly must handle that
document according to that decision, regardless of what the protocol layer may
later say.
This also means that the following:
<img src="myimage.bin" type="image/jpeg" width="32" height="32" xml:lang="en"
title="My company logo"/>
will download and handle the referenced image as a JPEG image regardless of the
fact that the server returns the "myimage.bin" document with another media type
(such as here "application/binary"), and regardless of the fact that the server
may return another language indicator, another title for the image, and other
dimensions for its best representation.
This is the only way to keep the transport layer and the HTML spec independent
of each other.
Comment 6•23 years ago
# type
# This attribute gives an advisory hint as to the content type of the content
# available at the link target address. It allows user agents to opt to use a
# fallback mechanism rather than fetch the content if they are advised that
# they will get content in a content type they do not support.
#
# Authors who use this attribute take responsibility to manage the risk that
# it may become inconsistent with the content available at the link target
# address.
-- http://www.w3.org/TR/html4/struct/links.html#adef-type-A
So if the <link> says type="text/css" then we fetch it, because we support
text/css. However, if the server says "This is a text/html file" then we say
"oh, well, we don't know how to handle text/html stylesheets" so we ignore it.
Comment 7•23 years ago
And this is what you call "managing the risk":
Yes, we don't know how to handle text/html stylesheets, but does that matter
if we already know that we can handle the anticipated text/css type?
Why not use the CSS parser on the content to decide whether it is actually
valid CSS?
HTML is likely to contain tags, and will typically contain at least one "<"
character, which is invalid anywhere in a CSS selector.
If the HTML does not contain any "<" character then it is transitional HTML, and
it will not contain stylesheets (those require at least a <style> tag, which is
an invalid CSS selector).
If the HTML appears to be a valid CSS file, it will contain no tags and cannot
be considered strictly conforming HTML (the <html> tag is missing).
If the referenced file is sent as text/x-unknown, then we know at least that it
is of the text family, so we can run the CSS parser on it to validate it.
I remember the conformance rules defined in the CSS specs:
1) if a CSS selector is invalid, the whole style sheet must be ignored
2) if a CSS selector is valid, the braces match, and their content follows
the basic syntax (a semicolon-separated list of entries, each a keyword
followed by a colon and a value of one or more words), then the CSS rule can be
accepted, provided that the style sheet validates as a whole. Individual
definitions inside the braces may simply be dropped if not recognized (for
backward compatibility of CSS specs), but the rule can still be accepted
3) empty CSS rules are valid (a valid CSS selector followed by braces whose
content is empty, or all of whose name/value pairs have been dropped under the
previous rule).
So I see absolutely no problem in trying to validate the CSS file that has been
retrieved.
The only case where content served as "text/html" should not be parsed is when
the <link rel="stylesheet"> or anchor element references the stylesheet without
specifying the MIME content type. In that case, the content type from the server
can be checked and honored before trying to use the CSS parser.
If neither the server nor the link gives any indication of the content type, I
really think that we must not parse the content (because it may be in a still
unknown but valid content type for stylesheets defined in future specs, and
parsing it with a CSS1, CSS2 or CSS3 parser may cause problems).
In that situation (for example, FTP URI references to stylesheets), the
designer will need to specify the type attribute, and this will actually be used
by the browser for the stylesheet validation.
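For comparison with the replies that follow, the policy proposed in comment 7 could be sketched like this. The function name `proposed_should_parse_as_css` is hypothetical, and this models the commenter's proposal only, not what Mozilla implemented; both arguments are assumed to be lower-cased media types or `None` when absent.

```python
def proposed_should_parse_as_css(link_type, server_type):
    """Comment 7's proposed precedence (an illustration, not Mozilla's behaviour)."""
    if link_type is not None:
        # The author-declared type wins; the CSS parser then validates the content.
        return link_type == "text/css"
    if server_type is not None:
        # No type attribute on the link: honor the server's Content-Type.
        return server_type == "text/css"
    # Neither side declares a type: do not parse at all.
    return False


print(proposed_should_parse_as_css("text/css", "text/plain"))  # True
print(proposed_should_parse_as_css(None, "text/html"))         # False
print(proposed_should_parse_as_css(None, None))                # False
```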
Assignee
Comment 8•23 years ago
> I remember the conforming rules defined in the CSS specs:
> 1) if a CSS selector is invalid, the whole style sheet must be ignored
No, just the rule.
We already have a bug somewhere on parsing a 404 error page (for a stylesheet
load) that had a STYLE element within it and getting legitimate style rules out
of it. It's not true that one can't get style rules out of HTML.
Assignee
Comment 9•23 years ago
This debate has been happening in three different bugs (this one, bug 113399,
and here). It certainly doesn't help to convince me when you comment in
multiple bugs in the hope that the owner of one of them will decide to change
the behavior, but I think this bug is the most appropriate place for the
discussion of the three.
Assignee
Comment 10•23 years ago
s/and here/and bug 53112/
Comment 11•23 years ago
Yes, just the rule! But how do you delimit the rule if the syntax that defines
it (i.e. a valid selector and a braced body) can't be found?
Where do you resume for the next rule?
In my opinion you may ignore just the rule only when the basic delimiting
characters are OK. That is not the case in the presence of a "<" in a selector,
and resuming at the character that follows the invalid character is not what the
standard says: a "<" character is not a rule, so you can't ignore it.
This is what the W3C CSS validator does: it does not validate ANY rule that may
appear further in the text if the parse error occurs in selectors.
There are several levels of errors in CSS:
- fatal parse errors (illegal characters in selectors)
- non-matching braces, which are also fatal parse errors.
Only then can you divide the file into independent rules that can be ignored
selectively. Each has a selector part and a definition part.
- correct syntax but illegal usage of an operator: ignore the rule
- invalid definitions between matching braces, such as absence of a name before
the colon or multiple names before it: ignore the rule
- unknown attribute name: ignore the rule
- invalid attribute value: ignore the rule.
Ignoring a rule is only possible when you have a context from which to recover
to the next rule. If not, the CSS file is fully invalid, all previously parsed
rules that were accidentally accepted in that file must be ignored, and parsing
must abort with a fatal parse error.
Assignee
Comment 12•23 years ago
In the following page:
<html>
<title>my page</title>
<style type="text/css">
body { display: block; }
p { color: red; }
</style>
</head>
<body>...</body></html>
There are two CSS rules followed by some garbage at the end. The first one has
an invalid selector, "<html>" ... "body", and is ignored. The { indicates the
end of the selector and the } the end of the rule. The "p { color: red; }" is a
valid rule, and then there's some garbage at the end that could be an invalid
selector for the next rule.
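The recovery described in comment 12 can be illustrated with a deliberately tiny Python sketch. It is not a CSS parser: `extract_rules` and the `SIMPLE_SELECTOR` pattern are hypothetical, the selector check accepts only a small subset of the CSS2 selector grammar, and strings, comments, at-rules and nested blocks are not handled. It only shows how scanning for "{" and "}" lets a bad rule be dropped while a later good rule survives.

```python
import re

# Hypothetical, simplified selector check: names plus a few combinators.
# The real CSS2 selector grammar is much richer than this.
SIMPLE_SELECTOR = re.compile(r"[A-Za-z0-9_\-.#*:>+,\s]+")


def extract_rules(text):
    """Return (selector, declarations) pairs, skipping rules with bad selectors."""
    rules = []
    pos = 0
    while True:
        open_brace = text.find("{", pos)
        if open_brace == -1:
            break  # trailing garbage with no block: nothing left to parse
        close_brace = text.find("}", open_brace + 1)
        if close_brace == -1:
            break  # unterminated block; this sketch simply stops here
        selector = text[pos:open_brace].strip()
        body = text[open_brace + 1:close_brace].strip()
        if selector and SIMPLE_SELECTOR.fullmatch(selector):
            rules.append((selector, body))
        pos = close_brace + 1  # resume after the "}" whether or not the rule was kept
    return rules


PAGE = """<html>
<title>my page</title>
<style type="text/css">
body { display: block; }
p { color: red; }
</style>
</head>
<body>...</body></html>"""

print(extract_rules(PAGE))  # [('p', 'color: red;')]
```

The first "rule" is discarded because its selector runs from "<html>" to "body"; the second is kept, which is the point being made about HTML yielding legitimate style rules.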
Comment 13•23 years ago
responding to bug 53112 comment 10:
> Isn't it an exceptional circumstance when the standard suggests that there's a
> risk of ambiguity?
There is no ambiguity. The specs fully define the behaviour for every
combination of Content-Type and type="".
> I see absolutely no problem in trying to use the CSS parser on what the server
> reports as "text/html" or "text/plain" or "text/unknown"
See the case dbaron is arguing above.
> or "text/xml"
XSLT stylesheets, MIME type text/xml, have semantics different to CSS
stylesheets, MIME type text/css. Trying to interpret text/xml stylesheets using
a CSS parser is incorrect.
We even support text/xml stylesheets; and in the unlikely event of finding a
text/xml stylesheet at the end of a link with type="text/css", we should handle
it using the XSL engine, not the CSS one. (If we don't, file a bug.)
> or "text/javascript"
Javascript Stylesheets, MIME type text/javascript, have semantics different to
CSS stylesheets, MIME type text/css. Trying to interpret text/javascript
stylesheets using a CSS parser is incorrect.
> just avoiding to use it on content types that the server says not being of
> the "text/*" family (because this is a basic requirement for a CSS parser
> which first of all is a text parser).
image/svg+xml and text/xml are virtually indistinguishable, so why should
text/xml be considered more like text/css than image/svg+xml?
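The argument in comment 13 amounts to choosing the engine from the served type rather than from the type attribute. A minimal sketch, assuming hypothetical engine labels (`"css"`, `"xslt"`) and a hypothetical `engine_for` helper; types with no entry here are simply not processed, and nothing falls back to the CSS parser.

```python
# Hypothetical engine labels, for illustration only.
ENGINE_FOR_TYPE = {
    "text/css": "css",
    "text/xml": "xslt",
}


def engine_for(served_type):
    """Pick an engine from the served media type; None means 'ignore it'."""
    return ENGINE_FOR_TYPE.get(served_type)


print(engine_for("text/css"))         # css
print(engine_for("text/xml"))         # xslt
print(engine_for("text/javascript"))  # None
```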
Comment 14•23 years ago
You say "the specs say"... Where did you see that? I'm very curious to know
exactly where.
I have carefully searched both the HTTP and HTML specs and did not find any such
statement (simply because the two specs are completely independent of each
other, and because HTML places no requirement on the underlying
transport protocol used to retrieve external entities)...
Nor did I see that you could recover from parse errors by searching for "}" in
case of bad characters in a CSS selector (the spec only gives the special case
of at-keyword blocks, which requires a specific pseudo-selector).
So the specs only leave it to the designer of the HTML page to bear the risk
in the case where the two types don't match, and the spec does not seem to
recommend any behavior. In that case you should still trust the designer and
honor the type="" attribute, even in strict mode (but in that mode you will run
the CSS parser in strict mode too, so you won't recover from parse errors except
in the only cases which are explicitly given by the CSS spec).
In quirks mode, you can still run the CSS parser with a more relaxed error
recovery, but yes, you should also trust the HTML page designer (who assumes
the risk of the page becoming broken in case of unwanted external entity
corruption, or because the server for the external entity incorrectly delivers
a file of the wrong type despite the Accept header sent by the HTTP client).
What is your rule, then, if HTTP does not provide any MIME type in a
Content-Type header (this case often occurs when the retrieved file is served or
generated through a server-side script or program), or with FTP (which does not
carry any MIME type information), or from an application-bound namespace (such
as Microsoft HTML Help files)?
Comment 15•23 years ago
> You say "the specs say" ... Where did you see that ? I'm very curious to know
> which point is accurate.
HTTP 1.1:
# 7.2.1 Type
#
# When an entity-body is included with a message, the data type of that
# body is determined via the header fields Content-Type and Content-
# Encoding. [...]
#
# Content-Type specifies the media type of the underlying data. [...]
#
# If and only if the media type is not given by a Content-Type field, the
# recipient MAY attempt to guess the media type via inspection of its
# content and/or the name extension(s) of the URL used to identify the
# resource. If the media type remains unknown, the recipient SHOULD
# treat it as type "application/octet-stream".
#
# [...]
#
# 14.18 Content-Type
#
# The Content-Type entity-header field indicates the media type of the
# entity-body sent to the recipient
-- http://www.w3.org/Protocols/rfc2068/rfc2068
HTML 4.01:
# type = content-type [CI]
# This attribute gives an advisory hint as to the content type of the content
# available at the link target address. It allows user agents to opt to use a
# fallback mechanism rather than fetch the content if they are advised that
# they will get content in a content type they do not support.
# Authors who use this attribute take responsibility to manage the risk that
# it may become inconsistent with the content available at the link target
# address.
-- http://www.w3.org/TR/html401/struct/links.html#adef-type-A
Put together, these specs tell us, firstly, that the type attribute is _only_
there to give a hint to the UA about whether the UA should bother to try to
fetch the resource, and does not in any way imply anything about the content
type of the target document, and secondly, that the Content-Type header returned
should be honoured.
So when we look at the <link> element we establish whether or not we are
interested in the resource, and if we are (type="text/css" or type="text/xml")
then we fetch it. When we get the resource back, we look at its actual MIME type
(either from Content-Type, if present, or using content sniffing, if not) and if
we establish that it is text/css, we pass it to our CSS parser.
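The two steps described above might be sketched as follows. The names `SUPPORTED_TYPES`, `should_fetch`, `actual_type` and the `sniff` callable are hypothetical; fetching when the type attribute is absent is an assumption not stated in the comment; and the fallback to "application/octet-stream" follows the quoted HTTP text.

```python
SUPPORTED_TYPES = {"text/css", "text/xml"}


def should_fetch(link_type):
    """Step 1: the type attribute is only an advisory hint for the fetch decision.

    Assumption: a link with no type attribute is fetched.
    """
    return link_type is None or link_type.strip().lower() in SUPPORTED_TYPES


def actual_type(content_type_header, sniff):
    """Step 2: Content-Type wins; content sniffing is used only when it is absent."""
    if content_type_header:
        return content_type_header.split(";", 1)[0].strip().lower()
    guessed = sniff()  # hypothetical sniffer returning a media type or None
    return guessed or "application/octet-stream"


# The case in this bug: the link says text/css, the server says text/plain.
if should_fetch("text/css"):
    served = actual_type("text/plain", sniff=lambda: "text/css")
    print(served)                # text/plain
    print(served == "text/css")  # False, so the CSS parser is never invoked
```

Note that the sniffer is never consulted when the server supplies a Content-Type, even a wrong one.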
> Nor did I see that you could recover from parse errors by searching for "}" in
> case of bad characters in CSS selector [...]
Section 4.2 of CSS2 is not completely clear on this, but the intention of the
CSS working group (of which David and I are members) is as he described in
comment 12. I have sent an e-mail to the working group mailing list so that we
may add an erratum to the spec to make this clearer.
Comment 16•23 years ago
> Nor did I see that you could recover from parse errors by searching for "}"
As Ian was kind enough to point out in another bug, this is CSS2, Section 4.1.7
(http://www.w3.org/TR/REC-CSS2/syndata.html#q8), paragraph 3. It seems pretty
clear to me...