The front page of www.mozilla.org is missing the correct doctype (HTML 4.01 Transitional), preventing it from validating; in addition, an ampersand in the link to the Galeon download at sourceforge.net should be transformed into the entity &amp;. I know that the whole page is supposed to be overhauled Real Soon Now with a Zope-based framework ("It should be in a state to be hacked on very soon (in the next week, I hope)." -Gervase Markham, July 2, 2001, n.p.m.documentation), but I'm filing this bug in case its arrival is delayed, as the matter has attracted comment.
The & problem seems to be gone. I attached a patch that adds an appropriate doctype to the front page. The change doesn't affect display but it would make the front page look better to validators.
Endico, could you, please, take a look at the patch and check it in, if it is OK?
This is one of those things that is really just a niggle, but it is so easy to fix. Would it be an idea to cc: email@example.com, as that address shows up most often in the CVS log? It would be nice to get this passing the w3 validator.
*** Bug 101552 has been marked as a duplicate of this bug. ***
CCing brendan, as he expressed an interest. Dawn - any reason we can't check this in? Gerv
Dawn told me that we can't check this in because she thinks it will affect how browsers display the pages, and that we would be labelling some broken HTML (on the pages that haven't been fixed yet) with a DOCTYPE that they didn't match. Will checking this in affect browser display in any way? Gerv
> Dawn told me that we can't check this in because she thinks it will affect how
> browsers display the pages,

There are three browsers known to pay attention to the doctype: Mozilla, Mac IE 5 and Windows IE 6. The doctype suggested in the patch puts all three into their respective standards modes. Switching the mode in Mac IE 5 would not change the layout of http://www.mozilla.org/index.html in any way. In Mozilla the margin/padding around the text changes slightly. Nothing drastic. (IMO, the slight change in Mozilla shouldn't block this change. It would be *very* bad from the evang point of view.) I'm unable to test with Windows IE 6. I'd appreciate it if someone else took a look. However, I think it is very safe to expect Windows IE 6 to have no problem with it.

> and that we would be labelling some broken HTML (on
> the pages that haven't been fixed yet) with a DOCTYPE that they didn't match.

The intent is not to add the doctype to the wrapper. This particular change is only about the front page template.
I wrote "It would be *very* bad from the evang point of view." I meant: It would be *very* bad from the evang point of view if mozilla.org itself refused to use the standards mode of Mozilla.
Dawn: does that address your concerns? Gerv
what happens the next time newsbot gives us an & in a url?
Dawn: ping? Gerv
> what happens the next time newsbot gives us an & in a url?

So the broken URL came from newsbot? I suggest fixing the newsbot URL output then.

> I'm unable to test with Windows IE 6.

I got access to a Windows machine with IE 6 on it. There's one issue with IE 6. It is easy to fix. I'll attach a new patch.
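Fixing the URL output at the source, as suggested above, amounts to escaping each URL before it is written into an href attribute. The following is only a minimal sketch of that idea: newsbot's actual implementation is not shown in this bug, so the function name `link_html` and the example URL are hypothetical, and Python is used purely for illustration.

```python
from html import escape


def link_html(url: str, text: str) -> str:
    """Build an anchor element with the URL and link text escaped for HTML.

    Hypothetical helper, not part of newsbot.
    """
    # escape() rewrites & as &amp; (and also <, > and, with quote=True, quotes),
    # so a raw ampersand in a query string no longer breaks validation.
    return '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(text))


# Made-up URL, for illustration only.
example = link_html("http://example.org/download?group=1&file=2", "Download")
print(example)
```

Escaping at output time means the stored URL stays untouched and only the generated markup changes.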
Created attachment 61450 [details] [diff] [review] Patch v2 (IE6-friendly)
Attachment #42388 - Attachment is obsolete: true
Dawn or Myk - could this please be checked in? Gerv
Apropos of this, isn't it about time to update the copyright notice?
*** Bug 132749 has been marked as a duplicate of this bug. ***
OK, so the hold-up to making mozilla.org valid HTML is...?
accepting QA for mozilla developer docs. some of these bugs have been around for a _long_ time. Reporters, would you please review the bugs, see if the issues have been resolved, and close bugs appropriately. I will do a full review of all bugs not touched in one week (8th April). Thanks. </spam>
QA Contact: endico → imajes
The patch misses this part:

@@ -115,7 +116,7 @@
 <a href="http://lxr.mozilla.org/mozilla1.0/source/">
 LXR for the Mozilla 1.0 branch</a>
 </li>
-<ul>
+</ul>
 <!-- End of Body of 1.0 Countdown -->
 </TD>

is there a reason why this bug is being ignored? it would be trivial to fix. also, it makes mozilla look better if the front page is a correct HTML page. (shouldn't this be in product mozilla.org component firstname.lastname@example.org ?)
yeah, I think so
Component: Mozilla Developer → email@example.com
Product: Documentation → mozilla.org
Version: unspecified → other
*** Bug 138807 has been marked as a duplicate of this bug. ***
changing summary for easier searching
Summary: Invalid HTML on front page → Invalid HTML on front page [page does not validate, mozilla.org has no doctype]
*** Bug 144413 has been marked as a duplicate of this bug. ***
How about marking this bug mozilla1.0? I think this should be fixed before 1.0 comes out.
*** Bug 148184 has been marked as a duplicate of this bug. ***
Do you need help with this huge change? hohoho Please, commit this.. this is a shame. And how come there have been no comments by the bug owner so far? Is "Dawn Endico" the only person with write access to Mozilla web pages?
i checked this in and also added the align=left attributes to the 'towards 1.0' but the w3 validator is down and the others i found don't do file uploads so i didn't check it.
I've used an offline SGML validator. This is what I got:

nsgmls:index.html:119:3:E: document type does not allow element "UL" here; assuming missing "LI" start-tag
nsgmls:index.html:122:15:E: "UL" not finished but containing element ended
nsgmls:index.html:122:15:E: end tag for "UL" omitted, but its declaration does not permit this
nsgmls:index.html:119:0: start tag was here
nsgmls:index.html:122:15:E: end tag for "UL" omitted, but its declaration does not permit this
nsgmls:index.html:106:0: start tag was here
This is probably all due to the same issue, namely a <ul> being where a </ul> should be:

<ul> [...] LXR for the Mozilla 1.0 branch</a> </li> <ul>

The last <ul> should be </ul>.
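A mismatch like this can be caught without a full SGML validator by counting list start and end tags. The sketch below uses Python's standard html.parser module; the class name `ListBalanceChecker` and the helper `unclosed_ul_count` are made up for this illustration, and the check is far weaker than what nsgmls does (it ignores the DTD entirely).

```python
from html.parser import HTMLParser


class ListBalanceChecker(HTMLParser):
    """Track how many <ul> elements are open at any point (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # <ul> elements currently open
        self.stray_closes = 0   # </ul> tags seen with no list open

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <UL> and <ul> are treated alike.
        if tag == "ul":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "ul":
            if self.depth == 0:
                self.stray_closes += 1
            else:
                self.depth -= 1


def unclosed_ul_count(markup: str) -> int:
    """Return the number of <ul> elements still open at the end of the markup."""
    checker = ListBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.depth


# Reduced versions of the broken and fixed markup discussed above.
broken = "<ul><li>LXR for the Mozilla 1.0 branch</li><ul>"
fixed = "<ul><li>LXR for the Mozilla 1.0 branch</li></ul>"
print(unclosed_ul_count(broken), unclosed_ul_count(fixed))
```

With the second <ul> in place of </ul>, two lists are left open, which matches the pair of "end tag for UL omitted" errors in the nsgmls output.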
just checked in a fix for that.
Validator is back up again. Page validates. Thanks a million, Dawn; I greatly appreciate this.
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
Thank you for checking this in. The page is still missing the charset parameter in the Content-Type header, though. Is there another bug about that?
eh... why? RFC 2616 (HTTP) specifies this: The "charset" parameter is used with some media types to define the character set (section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value. See section 3.4.1 for compatibility problems. Therefore, for Latin1, no charset needs to be specified, since that's the default anyway.
biesi: the problem is, that the validator gives a rather big warning if you don't specify a charset and you won't see this nice "valid HTML"-Button until the page has a charset given.
that's a problem with the validator then
No, the document cited is old. The W3C changed the requirement to always specify the encoding. From http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2 :

The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as a default character encoding when the "charset" parameter is absent from the "Content-Type" header field. In practice, this recommendation has proved useless because some servers don't allow a "charset" parameter to be sent, and others may not be configured to send the parameter. Therefore, user agents must not assume any default value for the "charset" parameter.
Stupid W3C... Well, adding <meta http-equiv="content-type" content="text/html; charset=iso-8859-1"> in the <head> should fix it, or?
biesi: Yes, it would.
OK, now the front page validates, but many other pages do not. See: http://www.htmlhelp.com/cgi-bin/validate.cgi?url=http%3A%2F%2Fwww.mozilla.org&warnings=yes&spider=yes&hidevalid=yes What should be done: - reopen this bug ? - open a global bug for all pages? - open a bug for each page? Regards,
Open a new bug, definitely - this one is about front page, as the summary says.
Status: RESOLVED → VERIFIED
I am sorry, but the http://www.mozilla.org/index.html page still does not validate. Using http://validator.w3.org I get the following message:

I was not able to extract a character encoding labeling from any of the valid sources for such information. Without encoding information it is impossible to validate the document. The sources I tried are:

* The HTTP Content-Type field.
* The XML Declaration.
* The HTML "META" element.

And I even tried to autodetect it using the algorithm defined in Appendix F of the XML 1.0 Recommendation. Since none of these sources yielded any usable information, I will not be able to validate this document. Sorry. Please make sure you specify the character encoding in use.
I am sorry, but I cannot reopen the bug. Can the submitter or owner please reopen it?
http://webtools.mozilla.org/web-sniffer/view.cgi?url=http%3A%2F%2Fwww.mozilla.org%2Findex.html Front page does not provide a charset.
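Whether a Content-Type header value carries a charset parameter can be checked with Python's standard email.message module. This is only a sketch: the helper name `charset_from_content_type` is made up, and the header values below are examples rather than captured output from www.mozilla.org.

```python
from email.message import Message


def charset_from_content_type(header_value: str):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the parameter lowercased, or None when missing.
    return msg.get_content_charset()


# Example header values (assumed, not taken from the live server).
print(charset_from_content_type("text/html"))
print(charset_from_content_type("text/html; charset=ISO-8859-1"))
```

A bare text/html value yields None, which is the situation the validator complains about.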
Whiteboard: No charset provider - not fixed?
Section 2, paragraph 3 of the Mozilla.org style guide says that META tags should not be used. So either this part of the style guide should be changed, or the charset encoding should be sent in the HTTP header. The style guide can be found at http://www.mozilla.org/README-style.html .
Re: Comment #46 From Oliver Klee 2002-12-16 04:34 -------

> Section 2, paragraph 3 of the Mozilla.org style guide says that META tags
> should not be used.

Which I can agree to (HTTP headers are definitely preferable).

> or the charset encoding should be send in the HTTP header.

Exactly. As the websniffer URI I've given says, the encoding is *not* sent. Is this a new bug?
Thanks for the reference to the style guide. The style guide says: << Composer likes to put in noise like this: <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> While a nice touch in theory, back here in the real world, that tag makes 3.0-vintage Navigators load the document twice and generally lose their minds. Don't go there. If you use Composer, take this junk out before publishing it. >> The reason given for not using META (Netscape 3.0) seems very out of date. I would fix the style guide. By the way, I would add to the style guide a first rule saying "Always validate your HTML code before publishing it."
There are other significant issues with the use of <meta> tags, such as they're not actually a very good way to set the charset. HTTP headers are much preferable. This probably depends on some server-upgrade bug or another...
I do not know what the best way to give the character set is. What I can see is that it is currently not given. If there are really good reasons not to use META tags, they should be documented in the style guide. The reason given in the current document seems very poor to me. Some facts:

1) META tags are used on some pages to specify the character set as ISO Latin 1:
http://www.mozilla.org/status/
http://www.mozilla.org/hacking/
http://www.mozilla.org/status/2002-11-20.html

2) META tags are used to specify a character set other than Latin 1, e.g.
http://www.mozilla.org/releases/mozilla1.3a/

3) META tags are used to specify other things, e.g. http://www.mozilla.org/hacking/
<meta name="GENERATOR" content="Mozilla/4.73 (Macintosh; I; PPC) [Netscape]">
See <URL:http://ppewww.ph.gla.ac.uk/~flavell/charset/ns-burp.html>, for instance. This is really something that should be solved by having the server send a charset parameter with the Content-Type; that's considerably less hackish.
A real HTTP header is certainly preferable in the HTTP context. The meta thing could still be useful for those who browse the pages from a local filesystem after saving them to disk. So, what methods does the Netscape Enterprise 3.6 server provide for setting the charset parameter on a per-file, per-directory or per-server basis? Who can change the configuration *and* can be persuaded into doing so in the foreseeable future?
Page still does not have a character set..... REOPENED
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
What options does the server provide for fixing this and who is able and willing to alter the settings?
After some digging at w3c.org, I found the following at http://www.w3.org/International/O-charset.html which says:

---
It is very important that the character encoding of any XML or (X)HTML document is clearly labeled. This can be done in the following ways:

* Use the 'charset' parameter in the Content-Type header of HTTP. Example:
  Content-Type: text/html; charset=EUC-JP
* For XML, use the encoding pseudo-attribute in the xml declaration at the start of a document or the text declaration at the start of an entity. Example:
  <?xml version="1.0" encoding="iso-8859-1" ?>
* For HTML, use the <meta> tag inside <head>. Example:
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
  For XHTML, you need a slash at the end:
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
---

I found that adding
<?xml version="1.0" encoding="iso-8859-1" ?>
as the very first line of the HTML file (I'm using HTML 4.01) fixed the problem with the W3C Validator.
> I found that adding <?xml version="1.0" encoding="iso-8859-1" ?>
> as the very first line of the HTML file (I'm using HTML 4.01) fixed the
> problem with the W3C Validator.

That's a problem with the validator. (Really, it's a problem with the XHTML 1.0 spec, Appendix C, but let's not go there.) The advice given is only appropriate for XML (including XHTML) documents, not HTML 4.01. (And philosophically, charset should only really be handled at the HTTP level, but let's not go there, either.)
Putting Charset into the HTTP header (i.e. via the Web Server) restricts the server to one charset, which means you can't have multi-lingual pages. Mind you, I think I saw, in passing, that some web servers allow multiple charsets. Putting a XML statement in a HTML file is very wrong and, to be honest, I'm extremely surprised that it worked as my Doctype is HTML 4.01. The W3C Validator should've complained very loudly about that. Anyway, it appears that the Charset problem is outside the realm of Mozilla as it involves the W3C and web servers and can be fixed within XHTML which I understand supersedes HTML. Should this bug be marked fixed as before?
This should not be marked fixed. Also, the character encoding does not need to be the same server-wide even if the information is put in the HTTP headers (unless the server has limitations, that is). The ideal fix would be adjusting the server configuration so that the right charset parameter is sent on the HTTP level. If no one is going to fix the server configuration, then the <meta> tag would be plan B. The XML declaration doesn't belong in HTML at all (though technically it would be an unrecognized processing instruction). (And moving to XHTML is a whole other can of worms, so let's not go there.)
The Netscape Enterprise server documentation suggests it can do conneg on charset, so it should be quite possible.
Summary: Invalid HTML on front page [page does not validate, mozilla.org has no doctype] → Front page needs charset parameter
Whiteboard: No charset provider - not fixed?
From WaSP's Ask W3C column for December 2002:

Specifying Character Encoding

This month kicks off our new “WaSP Asks the W3C” Question and Answer project. In this project, frequently asked questions posed to WaSP by Web authors and designers regarding standards are submitted by WaSP members to the W3C’s Quality Assurance Group for information. The answers are published and archived both here and on the W3C Web Standards Education list, where follow-up discussion also takes place. Signup details can be found at the end of this article.

WaSP asks

There are several ways of specifying the character encoding for a particular document. Which of the following methods (or combination thereof) does the W3C recommend, and why?

* Have the server administrator set the proper encoding via the HTTP headers returned by the Web server
* Have the author add the encoding with a meta element
* XHTML authors can add the character encoding using the XML declaration

The W3C responds

These three ways of providing the character encoding of a document are not equivalent. When trying to figure out the character encoding of a resource, user agents will try, in this order:

* The HTTP Content-Type header sent by the server
* The XML declaration (only for XHTML documents)
* The HTML/XHTML meta element
* Other ways. There are algorithms to guess the character encoding, for example

Since the HTTP Content-Type header has precedence, and is also the easiest information to retrieve (user-agents do not have to parse the resource to get it), it is almost always the preferred way to provide the character encoding for an (X)HTML document. However, in at least two cases, this is simply not possible:

* The document author does not have any way to configure the server to send the proper HTTP Content-Type header
* The document is not served via HTTP.

In these cases, an HTML document should provide the character encoding via a meta element, and an XML document can provide it via the XML declaration. If the XML document uses one of the default encodings (UTF-8 or UTF-16) no declaration is needed to manage the character encoding.

To sum it up

* Wrong. The webmaster sets a default character encoding to be sent by the server but does not let the author override it or the info is not provided anywhere whatsoever
* Good. The character encoding is not set at the server level but properly declared through the HTML meta element (and/or the XML declaration for XHTML documents)
* Best. The character encoding is properly set at the server level, either with a default that authors can override or on a per-document basis, and is also available at the document level (both in the XML declaration if applicable and the meta element) for standalone use

Examples

Example of an XHTML 1.0 document written in French with an ISO-8859-1 encoding:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
<head>
<title>Exemple de document XHTML 1.0</title>
</head>
<body>
<h1>Portrait Intérieur</h1>
<h2>Rainer-Maria Rilke</h2>
<p>Ce ne sont pas des souvenirs<br />
qui, en moi, t'entretiennent ;<br />
tu n'es pas non plus mienne<br />
par la force d'un beau désir.</p>
</body>
</html>

Example of an HTML 4.01 document written in French with a UTF-8 encoding:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="fr">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<title>Exemple de document HTML 4.01</title>
</head>
<body>
<h1>Portrait Intérieur</h1>
<h2>Rainer-Maria Rilke</h2>
<p>Ce ne sont pas des souvenirs<br>
qui, en moi, t'entretiennent ;<br>
tu n'es pas non plus mienne<br>
par la force d'un beau désir.</p>
</body>
</html>

On the popular Apache Web server, the HTTP Content-Type header for a resource can be set up in the .htaccess file, as follows:

<Files example.html>
ForceType text/html;charset=ISO-8859-1
</Files>

This would force the file example.html to be served as ISO-8859-1 even if the server had a different global configuration.

WaSP comments

WaSP and W3C member Tim Bray commented on this answer and said: “If you know that the document you’re sending is going to get read by an XML processor, the server should get the charset right. If the server makes any mistake the rules say that the processor is supposed to do the wrong thing! On the other hand, if the document is going to any kind of HTML reader, the server can usefully try to help and do what is suggested here. So it turns out that it matters whether you serve it as html or xhtml+xml.”

How to serve HTML and XHTML will be discussed in the next issue of WaSP Asks the W3C.

References

* About Charset Parameters
* About Character Encodings
* HTML 4.0 specification on character encodings
* XHTML 1.0 specification on character encodings
* XML 1.0 specification on character encodings

Discussion

For clarification and discussion on this topic, please address your comments and questions to the W3C Web Standards Education list. To subscribe to the list, send an email to firstname.lastname@example.org with “Subject: subscribe”. You can read archived posts at http://lists.w3.org/Archives/Public/public-evangelist/.
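The lookup order described in the W3C answer (HTTP header first, then the XML declaration for XHTML only, then the meta element) can be sketched as a small function. This is an illustration under stated assumptions, not how any particular browser or the validator implements it: the function name `detect_encoding`, the `is_xhtml` flag and both regular expressions are made up here, the patterns are deliberately loose, and the final "guess" step is left out.

```python
import re
from email.message import Message

# Simplified patterns (assumptions); real parsers are considerably stricter.
XML_DECL = re.compile(r'\s*<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')
META = re.compile(r'<meta[^>]+charset=["\']?([A-Za-z0-9._-]+)', re.IGNORECASE)


def detect_encoding(content_type, body, is_xhtml=False):
    """Return (encoding, source) following the order quoted above."""
    # 1. The HTTP Content-Type header takes precedence.
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()
        if charset:
            return charset, "http"
    # 2. The XML declaration is only consulted for XHTML documents.
    if is_xhtml:
        found = XML_DECL.match(body)
        if found:
            return found.group(1).lower(), "xml"
    # 3. The HTML/XHTML meta element.
    found = META.search(body)
    if found:
        return found.group(1).lower(), "meta"
    # 4. Nothing declared; guessing algorithms are out of scope for this sketch.
    return None, "none"


html_doc = '<html><head><meta http-equiv="content-type" content="text/html; charset=UTF-8"></head></html>'
xhtml_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><html></html>'
print(detect_encoding("text/html", html_doc))
print(detect_encoding(None, xhtml_doc, is_xhtml=True))
```

Because the header is checked first, a server-level charset overrides whatever the document itself declares, which is why the "Wrong" case in the summary above is a server default that authors cannot override.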
For the charset on the whole web site there is bug 154570. Comment 57 is wrong. You can set the charset in http headers on a file by file basis (at least on reasonable web servers like Apache). pi
removing dependency on an invalid bug
No longer blocks: 92997
see attachment 127836 [details] for an xhtml transitional version of the mozilla.org home page, which uses an xml prolog to declare its content-type. i realize that it's better to send charset information on the server-side, but that's bug 154570, and i'd say that it's better to have the page validate now and later than just later.
Um, serving XHTML as text/html causes more problems than it solves. (Hixie can talk your ear off about this, if necessary). Maybe the current meltdown will wind up with us moving to Apache, which would have been the sensible choice all along?
as we can't use <meta> or an XML prolog, i checked out some docs on netscape enterprise:
http://kuhub.cc.ku.edu/www/html/721final/6563/6563pro_001.html#writingnsconfigfiles

it looks like we can do this (note: i can't test this b/c i don't have an ns server):

<Files index.html>
AddType exp=index.html type=text/html;charset=iso-8859-1
</Files>

however, i'm not sure that child directories won't inherit this, so perhaps we could add something like:

<Files ?*/index.html>
AddType exp=index.html type=text/html
</Files>
Just for reference, the documentation you linked to is for Netscape FastTrack Server 7.2. We're running Netscape Enterprise Server 3.0.
email@example.com maintains the front page; CCing him. I don't know which bug the referenced attachment 127836 [details] (XHTML version of the new front page) is attached to, though... Gerv
Attachment #127836 [details] is attached to Bug #154570 ("www.mozilla.org doesn't send charset information")
This charset discussion has been running for more than 15 months now. Can't we just insert the meta tag for now until we can reconfigure the server and/or port the page to XHTML 1.1?
Whiteboard: start reading at comment 33
The server move is happening this coming week. Fiddling with server-provided headers will be much easier once we're running on Apache.
Okay, according to http://www.delorie.com/web/headers.cgi?url=http%3A%2F%2Fwww.mozilla.org , the server now runs Apache/1.3.27 (Unix) (Red-Hat/Linux). It sends no charset currently.
It *just* got moved this afternoon. Apache can do it, but not by default. :) We still have to configure it. It will happen sometime in the next week or so (we're all volunteers here).
Dave, thanks for the work. Let me just give a reminder about bug 154570 (which would fix this bug, but does not really block it). AddDefaultCharset On in the config would do the job. It can be overridden on a per-file or per-type basis. pi
Request this be closed as the beta site is live ( / validates as HTML 4.01 Strict)? I would close this but I don't have permissions (and don't fulfill req's to get permissions) yet.
fixed by redesign.
Status: REOPENED → RESOLVED
Last Resolved: 16 years ago → 15 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
Component: www.mozilla.org → General
Product: Websites → www.mozilla.org