Closed Bug 113399 Opened 24 years ago Closed 23 years ago

If Content-Type doesn't match <link> type attribute, dump warning to web devel console

Categories

(Core :: CSS Parsing and Computation, enhancement, P4)

RESOLVED FIXED

People

(Reporter: zwol, Assigned: bzbarsky)

References

Details

(Keywords: helpwanted)

Suppose I have this HTML header:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
  <html>
  <head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <link rel="stylesheet" type="text/css" href="style.css">

To the best of my knowledge this is correct strict HTML 4.01. However, Mozilla ignores the style sheet. If I use //DTD HTML 4.01 Transitional// instead, the style sheet is honored.
Forgot to mention that an inline style sheet, e.g. with <style type="text/css">, works fine. So this might be a problem with <link> in strict mode.
What is the content-type of the .css file? It must be text/css, although we only enforce this in strict mode, for backwards compatibility.
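For anyone reading along, the rule just described can be sketched roughly as follows. This is only an illustration, not Mozilla's loader code: the names CompatMode, EssentialType and ShouldApplySheet are made up for the example, and it assumes the only accepted stylesheet type is text/css.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical names throughout; this is not Mozilla's actual loader code.
enum class CompatMode { Quirks, Strict };

// Reduce a Content-Type value such as "Text/CSS; charset=iso-8859-1"
// to its bare, lower-cased media type ("text/css").
std::string EssentialType(const std::string& contentType) {
    // Drop any parameters after the first ';'.
    std::string type = contentType.substr(0, contentType.find(';'));
    // Trim leading and trailing whitespace.
    auto notSpace = [](unsigned char c) { return !std::isspace(c); };
    type.erase(type.begin(), std::find_if(type.begin(), type.end(), notSpace));
    type.erase(std::find_if(type.rbegin(), type.rend(), notSpace).base(), type.end());
    // Media types are case-insensitive.
    std::transform(type.begin(), type.end(), type.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return type;
}

// Quirks mode accepts whatever the server sent; strict mode insists on text/css.
bool ShouldApplySheet(CompatMode mode, const std::string& contentType) {
    if (mode == CompatMode::Quirks) return true;
    return EssentialType(contentType) == "text/css";
}
```

With this sketch, a sheet served as text/plain is applied in quirks mode and dropped in strict mode, which matches the behavior reported above.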
According to HEAD, it's text/plain. I take it I should take this up with the server administrator? An error message would have been nice, I spent a good two hours chasing my tail.
Yep. This is a server config issue... David, Ian, what do you think? Should we dump a message to the "JS" console saying the load failed because of wrong content type?
status to new while this is being talked about..
Status: UNCONFIRMED → NEW
Ever confirmed: true
I'd like to add a few more datapoints. I have made test pages that demonstrate that the stylesheet is ignored in 4.01 transitional but not 4.0 transitional.

http://www.weaverling.org/test-40.html
http://www.weaverling.org/test-401.html
http://www.tmadelaware.org/test-40.html
http://www.tmadelaware.org/test-401.html

The last one fails to load the stylesheet. The last two return the .css file as an application/octet-stream type. The first two return it as text/css. The only difference in the last two is the doctype declaration.

Works with application/octet-stream:
DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"

Doesn't work with application/octet-stream:
DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"

(tested with rh7 rpm linux build 2001121419)
Logging to the JS error console seems fine with me, although the JS folks might want only JS errors there. I'm not sure. Anyway, it could be done with code like the following (stolen from nsCSSScanner.cpp):

  // Log it to the JavaScript console
  nsCOMPtr<nsIConsoleService> consoleService
    (do_GetService(NS_CONSOLESERVICE_CONTRACTID));
  nsCOMPtr<nsIScriptError> errorObject
    (do_CreateInstance(NS_SCRIPTERROR_CONTRACTID));
  if (consoleService && errorObject) {
    nsresult rv;
    PRUnichar *error = ToNewUnicode(mError);
    rv = errorObject->Init(error,
                           NS_ConvertASCIItoUCS2(mFileName).get(),
                           NS_LITERAL_STRING("").get(),
                           mErrorLineNumber,
                           mErrorColNumber,
                           0,
                           "CSS Parser");
    nsMemory::Free(error);
    if (NS_SUCCEEDED(rv))
      consoleService->LogMessage(errorObject);
  }

Giving this bug to bzbarsky while we think about it.
Assignee: dbaron → bzbarsky
Summary: <link rel="stylesheet"> ignored in strict mode → non-text/css stylesheet ignored in strict mode without giving error
Target Milestone: --- → mozilla1.0
Just wanted to say that logging stuff like this to the JS console sounds weird. If I'm not seeing style according to CSS, the last place I check is the JS console. If you decide to do this, at least print a message to the status bar or something to point the developer down the right path. Something like IE's yellow exclamation mark to click to get there would be great. Perhaps if the JS console is renamed as "Log" and moved to the View menu...
*** Bug 122576 has been marked as a duplicate of this bug. ***
*** Bug 122947 has been marked as a duplicate of this bug. ***
I don't have time to fix the issues with the JS console in the near future (read: till summer). Pushing off to 1.1 and adding helpwanted keyword. If people want this for 1.0 someone needs to step up and get this done sometime in the next week or so; past that point this is certainly not 1.0 material.
Keywords: helpwanted
Priority: -- → P4
Target Milestone: mozilla1.0 → mozilla1.1
Can somebody please explain why it is considered good behavior that Mozilla ignores the type="text/css" in the link tag in the testcase? This seems confusing if not wrong. If I wanted it to use the mime-type from the server, it'd seem I'd just use <link rel="stylesheet" href="style.css">, in which case the server better be configured appropriately. But since the type is clearly specified in the original testcase, I'd think Mozilla should use it. What am I missing here? Why specify the type to have it ignored?
You're missing the purpose of the "type" attribute. From http://www.w3.org/TR/html401/struct/links.html#adef-type-A: type = content-type [CI] This attribute gives an advisory hint as to the content type of the content available at the link target address. It allows user agents to opt to use a fallback mechanism rather than fetch the content if they are advised that they will get content in a content type they do not support. Authors who use this attribute take responsibility to manage the risk that it may become inconsistent with the content available at the link target address. For the current list of registered content types, please consult [MIMETYPES]. In other words, the type attribute exists to allow the useragent to dispense with the whole process of fetching data from the server if it knows it can't handle the type the data is supposed to be in. At the same time, the type in the HTML is purely advisory and does not override the server type... the pages this bug affects fall into the "Authors who use this attribute take responsibility to manage the risk that it may become inconsistent with the content available at the link target address" clause, essentially.
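A rough sketch of that reading of the spec, for illustration only. The names HintedType and ShouldFetchStylesheet are hypothetical, and the sketch assumes text/css is the only style language the user agent supports.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: drop any ";param" suffix, remove all whitespace,
// and lower-case what is left of the type attribute.
std::string HintedType(std::string hint) {
    hint = hint.substr(0, hint.find(';'));
    hint.erase(std::remove_if(hint.begin(), hint.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               hint.end());
    std::transform(hint.begin(), hint.end(), hint.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return hint;
}

// Decide whether it is worth fetching the target of <link rel="stylesheet">.
// The hint can only save a download; it never overrides the server's type.
bool ShouldFetchStylesheet(const std::string& typeAttr) {
    const std::string hint = HintedType(typeAttr);
    if (hint.empty()) return true;  // no hint: fetch and see what the server says
    return hint == "text/css";      // assumption: the only supported style language
}
```

Note that a true result here only means "go ahead and fetch". Whether the fetched data is then used as a stylesheet still depends on the type the server reports.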
If we dump this to the JS console (hopefully by then we will have renamed this to the "Developer Console" or some such) then it should only be a warning if the type attribute doesn't match the HTTP Content-Type. We could maybe also warn if the stylesheet has a MIME type that we _know_ isn't a stylesheet MIME type. But we shouldn't, e.g., complain if a stylesheet is ignored because it has the text/sgml MIME type, or if it has the application/x-foo-bar MIME type. Both of those could be quite legitimate.
Thanks, Boris. That was very helpful and makes a lot more sense. I still question the strictness with which Mozilla ignores stylesheets. For example, sending text/plain instead of text/css seems like something Mozilla could handle as acceptable, despite the mismatch. It seems like that's a pretty common misconfiguration. It doesn't seem like this would be violating the standard, since the type attribute is just advisory and Mozilla handles text/plain in non-strict mode. Perhaps this would be better handled as a warning than as an error. In any case, I understand the bug and agree that there needs to be some sort of warning somewhere. It feels a little strange to put it in the JavaScript console, but it's better than nowhere.
I think the only acceptable solution here would be to implement a separate HTML console, like the JS one. Giving me an alert box would certainly not be acceptable.
I think that the "advisory hint" expression does not apply to the case where the content does not explicitly indicate its content type. This occurs if the link references a local file (whatever its extension) or an FTP resource (which obviously does not report the content type, so the client must assume one before downloading it in either binary or text mode). So, the type="text/css" attribute is recommended in <link rel="stylesheet"> elements, and in fact, it should bypass the MIME type returned by the content server, whatever its value, when the stylesheet is loaded. An "advisory hint" does not mean that it should not be honored when it is recognized.

The "inconsistency" problem can exist in the specification, however this spec does not recommend how to solve it. As HTML and CSS are specified independently of the transport protocol (there's no link with HTTP itself), we should solve this inconsistency by ignoring the value returned by the transport layer (i.e. by ignoring the "content-type:" header that MAY be returned by the HTTP server). Don't forget that HTTP does NOT mandate the "Content-Type:" header, whose role is just to set a default behavior or management of the content, when there's no other contextual source of information on how to manage the retrieved content.

As regards the use of content through a <link> reference, this content should then be honored using the content type specified in the authored HTML document. This behavior is consistent with other parts of the HTML and CSS specification, where the user-agent behavior is determined by the SOURCE of the link, NOT by its TARGET... It is also consistent with the XML external link type, whose behavior is determined by the link object itself, and NOT by any of the referenced entities. In any case, the <link> element attribute has priority over other sources of information.
The same thing applies to other similar content settings, such as:
- the character encoding (can be set as an extended "; charset=" attribute value of the content type),
- the content disposition (HTML defines the "target=" attribute to avoid rendering as an inline content), and
- the content language (the HTML standard defines the "hreflang=" attribute for that purpose).
I also forgot another similar content setting whose behavior is consistent with this interpretation:
- the title of the content (specified by the "title=" attribute in a <link>, which also bypasses any title specified by the referenced document when the link is rendered in the source document)
I agree that according to the standard, mozilla has correct behavior. However, what harm is there in allowing the <link> to override what the server returns? One large benefit would be to allow people who have no control over their web server to have their documents render correctly even though the web server is misconfigured.
I thought I read in some standards document that the referencing page's content-type hint could be used to override the server-provided one if the user agent couldn't deal with the provided type. So if we get a content-type for a stylesheet that we can't understand as a stylesheet language, and there's a hint for one that we can handle, we could try again with that type. Of course, now I can't find the document in question. Alas.
Whether there is harm or not is not the point. If you disagree with the spec, take it up with the spec authors -- our job here is to implement the spec. That's all.
Found it: http://www.w3.org/TR/REC-html40/struct/links.html#adef-type-A Looks like we'd be well within spec to use the content-type hint if we can't support the format that comes back from the server. Nyah, nyah!
You aren't reading it right. It says: # It allows user agents to opt to use a fallback mechanism rather than fetch the # content if they are advised that they will get content in a content type they do # not support. ...which just means that if the "type" attribute is something we don't support, we don't have to even bother downloading the target document.
Yeah, I'm not reading it right. Drat.
Ok. Changing the summary to match what was discussed above. The RFE here is that if we try to fetch content linked from a <link>, <meta>, @import, <style src="">, HTTP header, <img>, <object> or other content embedding mechanism and we find that the MIME type reported by the server (through the Content-Type header if the protocol used is HTTP) does not match the advisory MIME type given by the author (or implied by the mechanism used to link to the content) then we should dump a warning to the web development console (currently called the JS console).
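To make the RFE concrete, here is a minimal sketch of the check for the <link type="..."> case. Everything in it is an assumption made for illustration: the names NormalizeType and MismatchWarning, the wording of the message, and the choice to stay silent when either side has no type to compare. Hooking the result up to the console is left out.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: strip parameters and whitespace, then lower-case,
// so "Text/CSS; charset=iso-8859-1" compares equal to "text/css".
std::string NormalizeType(std::string value) {
    value = value.substr(0, value.find(';'));
    value.erase(std::remove_if(value.begin(), value.end(),
                               [](unsigned char c) { return std::isspace(c) != 0; }),
                value.end());
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value;
}

// Returns the warning text, or an empty string when nothing should be logged.
// Assumption: with no type attribute or no server type there is nothing to
// compare, so no warning is produced.
std::string MismatchWarning(const std::string& url,
                            const std::string& typeAttr,
                            const std::string& serverType) {
    const std::string hinted = NormalizeType(typeAttr);
    const std::string served = NormalizeType(serverType);
    if (hinted.empty() || served.empty() || hinted == served) return "";
    return "Warning: " + url + " was linked with type \"" + hinted +
           "\" but served as \"" + served + "\"";
}
```

For the original report (type="text/css" in the page, text/plain from the server) this yields a message, while a sheet served as "text/css; charset=iso-8859-1" stays quiet.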
Severity: normal → enhancement
Summary: non-text/css stylesheet ignored in strict mode without giving error → If Content-Type doesn't match <link> type attribute, dump warning to web devel console
According to comment #14 and the referenced spec, it should be read as in this case:

  <link rel="stylesheet" type="text/javascript" href="style.js">

In this case, a browser knows that it can't handle the "text/javascript" content type, and in that case it does not need to download the "style.js" document. But if it recognizes the "text/javascript" document type, and honors it, it will download the referenced document and will enforce using it as a Javascript source, regardless of what the server says about it.

For CSS documents, this means that we should honor the "text/css" content type specified by the type attribute, because we know how to handle this type, and so we will download and manage the document according to that decision.

In the absence of the "type" attribute, the only decision that can be made is whether we support <link rel="stylesheet" href="..."> or <style src="...">. If we do, then we have no other choice than downloading the referenced stylesheet and managing it according to the media type that MAY be returned by the server (the decision of whether we support this content type is just delayed until the actual answer from the server in this default case).

Please honor the type="text/css" attribute and if so ignore whatever the server reports (you may add a warning on the browser console if there's a later apparent mismatch, but I think this does not violate ANY requirement from the specs).
> and will enforce using it as a Javascript source, regardless of what the
> server says about it.

Doing that would be a violation of the HTTP spec. Nowhere does the HTML spec advocate such a violation. The simplest example of the problem this could cause is something like:

  <link type="text/css" href="http://www.bar.com/style.css">

where the registration on the bar.com domain has expired. In such cases the returned page is pretty likely to be a page advertising the registrar involved. This will be returned as text/html. You're saying we should parse it as CSS?
Yes, I mean that the registration case is nonsense: this will also affect the document using it if it's in the same domain, because it won't be downloaded either. If a document references a CSS file from another, separately administered domain, then it uses two documents with different administration and ownership, and this case resembles the case of broken links between domains (with all the security and copyright issues associated with interdomain references). Such documents are not self contained. Despite this, we should still parse the CSS from the external domain, and detect syntax errors, if any, so as to ignore bogus or nonexistent styles... When a page designer takes the risk to reference an entity from another domain, then it takes a risk of his document being broken if that domain expires. This does not violate the standard.

And NO, there is no violation of the HTTP spec. HTTP is just a transport protocol to retrieve an isolated document, and HTTP does not mandate any behavior for the document once it has been retrieved by a client. The HTTP spec does NOT define any MANDATORY or even RECOMMENDED rule as regards HTML or CSS. It just offers a way for the server to deliver a content type for the retrieved external entity, which just constitutes a hint that MAY be used by applications, or ignored as wanted. In my opinion an application such as an HTML browser has the right to define its own behavior with respect to the actual content of the retrieved document (because once it is downloaded there is absolutely no place where HTTP dictates anything about the document). The content type of the HTTP transaction is just a hint in the single case where a document is retrieved out of any context (such as when pasting a URL in the address bar of the browser).
In my opinion, a browser can use the type attribute to restrict the type of acceptable document it wants, if it's possible: it can use the "Accept: text/css" header to retrieve an explicit content type, and the HTTP server in that case should honor this requirement or return an HTTP error. The Accept header value sent by a client is typically "*" (or it's the default value if it's omitted or if the transport protocol does not support sending such a restriction, such as with FTP).

Enforcing the content type after the content has been retrieved is bad behavior, because the browser did not specify that it absolutely wanted only a single content type and not another. In fact it has already downloaded a file, and then it should parse it according to the design of the HTML page. Anyway, the browser will still need to parse the document to validate its syntax or usability, and choose to ignore any invalid definition which does not have valid specifier names possibly combined with operators, followed by an opening brace, semi-colon separated valid CSS style entries, and a closing brace (as specified in the CSS spec). Also the entire CSS file should be disregarded if definition braces do not match.

If the HTTP server returns an error status or if the CSS file does not validate the basic CSS syntax, then there will be no CSS file, and only in that case the browser can keep its default stack of CSS stylesheets. If an HTML document appears to be accepted accidentally as a CSS file, it will certainly garble the document presentation, but this would be a very unlikely event, as most pseudo-CSS definitions will be ignored as invalid.
> this will also affect the document using it if it's in the same domain

Yes, but you're assuming it's in the same domain. I see no reason to assume that.

> Such documents are not self contained.

No document that uses a <link> tag to import stylesheets is "self-contained". Where the <link> points to is utterly irrelevant.

> we should still parse the CSS from the external domain, and detect syntax
> errors

Consider an HTML page with an embedded <style> tag. What do you think happens when it's parsed as CSS?

> then it takes a risk of his document being broken

This is true for _any_ external link. This is what the "Authors who use this attribute take responsibility to manage the risk that it may become inconsistent with the content available at the link target address" language at http://www.w3.org/TR/html401/struct/links.html#adef-type-A means.

The basic question here is "if the 'type' attribute and the server's type do not match, whom should we assume to be right?" The language above unambiguously states, imo, that the server is assumed to be right and the linking document is assumed to be wrong in cases when such a mismatch occurs.

Mozilla does send an Accept header that includes text/css at a much higher q value than *. Yes, we should send an Accept header that only includes text/css when making stylesheet requests. That's a separate bug from this one, and does not solve the issue of HTTP/1.0 (or even HTTP/0.9) servers.

> or if the CSS file does not validate the basic CSS syntax

There is no such thing. Consider

  <html><head><style> a { color: red } span { color: blue } </style></head></html>

Does that "validate basic CSS syntax"?

> but this would be a very unlikely event

Not if there is a <style> tag in that document (which is becoming _very_ common).
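As an aside on the Accept header point, a small sketch of how such a header value could be assembled. The names AcceptEntry and BuildAcceptHeader are hypothetical, and the q values in the usage note are made up; the thread does not state what Mozilla actually sends.

```cpp
#include <string>
#include <vector>

// Hypothetical structure for one Accept entry.
struct AcceptEntry {
    std::string type;
    int qTenths;  // quality in tenths: 10 means q=1.0, 1 means q=0.1
};

// Join the entries into an Accept header value. A quality of 1.0 is the
// default and is left implicit. Assumption: entries whose quality is zero
// or out of range are simply left out of the header.
std::string BuildAcceptHeader(const std::vector<AcceptEntry>& entries) {
    std::string header;
    for (const AcceptEntry& e : entries) {
        if (e.qTenths <= 0 || e.qTenths > 10) continue;
        if (!header.empty()) header += ",";
        header += e.type;
        if (e.qTenths < 10) header += ";q=0." + std::to_string(e.qTenths);
    }
    return header;
}
```

Passing text/css at full quality and */* at 0.1 expresses "prefer CSS, accept anything", while passing text/css alone is the stricter request suggested above. Either way the server's reply still has to be checked, since older servers may ignore the header.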
An HTML page is very likely to contain closing tags (</xxxx>), which cannot be parsed as a valid CSS selector because the "/" character is in an illegal context for a CSS selector. Note also that "<" is not a valid operator in CSS selectors (or is there a new allowed operator defined with "<", using XPath?), though the ">" character is a valid operator. Such a page does not validate the basic CSS syntax, and in that case the whole document is invalid and should be dropped entirely (all its possible rules will be ignored).

Other invalid sequences in CSS selectors are:

  <!DOCTYPE>
  <!-- comment start comment end -->

(note that <!-- and --> are valid in CSS and considered like empty rules; they can occur only at the places where a new selector could occur, and are forbidden elsewhere, as the comment delimiters in CSS are /* and */)

Even the following is an invalid CSS file:

  <html><style><!-- p { font-color:blue } --></style></html>

because of the leading tags, or simply:

  Page not found

because the selector has no corresponding definition, or just:

  Page not found <style> p { font-color:blue }

because the selector syntax is invalid (this transitional and non-conforming HTML code would seem quite stupid), or:

  <p style="font-color blue">Page not found

(a valid transitional but non-conforming HTML code, which is also invalid CSS)

I don't see any valid and conforming HTML files that can be considered also as valid CSS. I also wonder if there exists a transitional HTML file that may be considered valid CSS...
> I don't see any valid and conforming HTML files that can be considered also as
> valid CSS.

Bug 112644 gives an example of exactly such a thing happening (except in quirks mode, where we do parse CSS files that aren't sent with MIME type text/css). It is certainly possible, given the error handling rules of the CSS spec.
Let's not have this debate in three different bugs (bug 53112, this bug, and bug 136529), OK? I think the appropriate place is bug 136529.
Relevant info about CSS conformance: Points 1 and 2 in section 3.2 below are not the problem we discuss here. But point 3 is of importance, as is section 3.3 that follows it, and section 4.2, which discusses the rules for handling parsing errors.

Yes, there are several levels of error recovery, but only for a limited set of features:
- Unknown properties and illegal values: ignore just the definition (not the whole rule).
- Invalid At-keywords: ignore the At-keyword and the matching semicolon or braced block that follows it.
For everything else, the parse error is fatal. So HTML content (even transitional HTML) will always fail with a CSS parse error, with the effect that the whole stylesheet is ignored... Demonstrated!

3.2 Conformance

This section defines conformance with the CSS2 specification only. There may be other levels of CSS in the future that may require a user agent to implement a different set of features in order to conform.

In general, the following points must be observed by a user agent claiming conformance to this specification:

1. It must support one or more of the CSS2 media types.
2. For each source document, it must attempt to retrieve all associated style sheets that are appropriate for the supported media types. If it cannot retrieve all associated style sheets (for instance, because of network errors), it must display the document using those it can retrieve.
3. It must parse the style sheets according to this specification. In particular, it must recognize all at-rules, blocks, declarations, and selectors (see the grammar of CSS2). If a user agent encounters a property that applies for a supported media type, the user agent must parse the value according to the property definition. This means that the user agent must accept all valid values and must ignore declarations with invalid values. User agents must ignore rules that apply to unsupported media types.
4. For each element in a document tree, it must assign a value for every applicable property according to the property's definition and the rules of cascading and inheritance.
5. If the source document comes with alternate style sheets (such as with the "alternate" keyword in HTML 4.0 [HTML40]), the UA must allow the user to select one from among these style sheets and apply the selected one.

Not every user agent must observe every point, however:

* A user agent that inputs style sheets must respect points 1 - 3.
* An authoring tool is only required to output valid style sheets
* A user agent that renders a document with associated style sheets must respect points 1 - 5 and render the document according to the media-specific requirements set forth in this specification. Values may be approximated when required by the user agent.

3.3 Error conditions

In general, this document does not specify error handling behavior for user agents (e.g., how they behave when they cannot find a resource designated by a URI). However, user agents must observe the rules for handling parsing errors. Since user agents may vary in how they handle error conditions, authors and users must not rely on specific error recovery behavior.
> For everything else, the parse error is fatal.

No. Things are parsed according to the grammar in 4.1.1: "All levels of CSS -- level 1, level 2, and any future levels -- use the same core syntax. This allows UAs to parse (though not completely understand) style sheets written in levels of CSS that didn't exist at the time the UAs were created." ...
Please don't paste entire specs in bugs, just link to the relevant section and quote the bits that specifically back up your point. Also, please explicitly mark what is a quote and what is your commentary.
Well, I first gave my comments, and then pasted the relevant sections of the specs because they were only relevant in their complete form. I had to do that because somebody said that we should only ignore the current rule in case of error, which is inexact per the specification, which clearly states several levels of recovery, some of which are mandatory. The spec clearly states that the CSS parser MUST parse the CSS grammar. And then it states that some elements can be ignored and under which conditions. The spec defines what a rule is, and fixes a limit on recovery of blocks or of definitions within blocks. It clearly does not say what happens when the error occurs outside the block level, simply because an error at this level breaks the boundary between rules. Point 3 in section 3.2 is the most important one, because of the MUST word used at the top of this section and because it is the only point that is relevant to syntax parsing (the other points address semantic errors which still validate the CSS syntax).
Please see comments in bug 136529 from now on. This bug is now _only_ about the warning message.
1.1alpha is frozen. Unsetting milestone and will retriage in a few days when I can make a realistic assessment of the situation.
Target Milestone: mozilla1.1alpha → ---
Depends on: 154942
This is fixed by bug 154942
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED