Closed Bug 373495 Opened 18 years ago Closed 18 years ago

Make webpages on planet.mozilla.org validate as HTML 4.01

Categories

(Websites :: planet.mozilla.org, defect)

Type: defect
Priority: Not set
Severity: minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bugzilla, Assigned: wolf)

Details

Attachments

(1 file, 1 obsolete file)

Webpages on planet.mozilla.org use a transitional DTD and they *_never_* output valid markup code. There are always 100+ validation markup errors. Always.
Because of bug 151557 and because of the "mozilla.org Documentation Style Guide" (http://www.mozilla.org/contribute/writing/guidelines#validation), Planet Mozilla should (ideally) always output valid markup code and, preferably, with a strict DTD.
"(...) we're making one of the most standards-compliant browsers around. It would be bad show to have an incompliant website."
I am voting for this bug because I am convinced that, with the CMS used, it should not be that difficult to output valid markup code defined with an HTML 4.01 Strict DTD.
Is part of that invalid code coming from aggregated blog posts?
Hello Chris,
No, I wouldn't say that. Typically, the Planet Mozilla webpages are created like this:

<div class="entry">
  <p>
    <!-- starting from here is the blog entry -->
    <p>Hello world </p>
    <p> ... </p>
    <p> ... </p>
    <!-- end of blog entry -->
  </p>
  <p class="date">
    <a href="[blog entry url]">some date, like: March 09, 2007 10:24 PM</a>
  </p>
</div>

So the generated error is the extra </p>, that is, the doubled </p></p> in the markup code. Out of 158 validation markup errors today, exactly 75 are: {end tag for element "P" which is not open.} The solution is to remove the wrapping start <p> and the closing, now extra, </p>.
Still in today's code, precisely at line 1385, is the following (line breaks inserted to avoid horizontal scrolling):

</li></li></li></li></li></li></li></li></li></li></li></li></li></li></li>
</li></li></li></li></li></li></li></li></li></li></li></li></li></li></li>
</li></li></li></li></li></li></li></li></li></li></li></li></li></li></li>
</li></li></li></li></ul>

So, again, this is related to the CMS template in use.
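The stray end tag described above can be reproduced with a small sketch. It assumes the HTML 4 rule that a P element cannot contain another P, so a new <p> implicitly closes the one still open; other block-level elements that also close a paragraph are ignored here for brevity. The function name and the sample strings are illustrative and are not taken from Planet's code or from the validator.

```javascript
// Count </p> end tags that have no open paragraph to close, under the
// simplifying assumption that only <p> start tags implicitly end an open <p>.
function countStrayParagraphEndTags(html) {
  const tagPattern = /<(\/?)p\b[^>]*>/gi; // matches <p ...> and </p>, not <pre> or <param>
  let paragraphOpen = false;
  let stray = 0;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const isEndTag = match[1] === "/";
    if (!isEndTag) {
      paragraphOpen = true; // a new <p> implicitly closes any open one
    } else if (paragraphOpen) {
      paragraphOpen = false; // this </p> closes the open paragraph
    } else {
      stray += 1; // nothing is open: "end tag for element P which is not open"
    }
  }
  return stray;
}

// Entry wrapped in an extra <p>...</p>, as in the generated page.
const wrappedEntry =
  '<div class="entry"><p><p>Hello world </p><p> ... </p></p>' +
  '<p class="date"><a href="#">date</a></p></div>';

// Same entry without the wrapper.
const unwrappedEntry =
  '<div class="entry"><p>Hello world </p><p> ... </p>' +
  '<p class="date"><a href="#">date</a></p></div>';

console.log(countStrayParagraphEndTags(wrappedEntry)); // 1
console.log(countStrayParagraphEndTags(unwrappedEntry)); // 0
```

Each wrapped entry contributes one stray </p>, which would be consistent with the error count tracking the number of aggregated entries on the page.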
Severity: normal → minor
The <p></p> comes from the template, though the template itself has it right; the imported entries usually begin with a <p> themselves, which makes the resulting HTML wrong. The template has it as:

<div class="entry">
  <p>
    <TMPL_VAR content>
  </p>
  <p class="date">
    <a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_IF creator>by <TMPL_VAR creator> at </TMPL_IF><TMPL_VAR date></a>
  </p>
</div>

The </li>'s at line 1385 are not from the template. They're created in response to the Calendar blog's use of <li> with no closing tag (in this post: http://weblogs.mozillazine.org/calendar/2007/03/sunbirdlightning_status_update_5.html), most likely to prevent one blog's post from cross-contaminating another one.
Wolf, how about changing the template to become something like this:

<div class="entry">
  <TMPL_VAR content>
  <p class="date">
    <a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_IF creator>by <TMPL_VAR creator> at </TMPL_IF><TMPL_VAR date></a>
  </p>
</div>

along with the CSS code:

.entry {padding-top: 1em;}

> The line 1385 </li>'s are not from the template. They're created in response to
> the Calendar blog's use of <li> with no closing tag.
Well, whoever is behind that Calendar blog should be notified that it generates 49 consecutive </li>.
A few <img> missing the alt attribute in a multi-party blog can be expected. Well over 100 validation markup errors every single day, with many reappearing consistently, while using a transitional DTD, is a different story.
Wolf, today, there are 126 validation markup errors at planet.mozilla.org: exactly 67 out of those 126 are the same error already discussed here, "end tag for element "P" which is not open." I have proposed a simple change that is a) efficient, b) standards-compliant and c) reduces the DOM tree depth a bit (removing "n" <p> every day at planet.mozilla.org; n = 67 today). The whole correction I have proposed represents exactly 35 bytes and it fixes between 67 and 100 (or more) validation markup errors every day.
----------
Still today, 48 validation markup errors are due to the same error already discussed here:
Error Line 2619 column NNN: "end tag for element "LI" which is not open.
...li></li></li></li></li></li></li></li></li></li></li></li></li></li></li></li "
and it's all because there are 49 consecutive </li> at line 2619. I'll try to reach the Calendar blog people about this. This should be easy to fix.
Those 67 + 48 validation markup errors can be fixed for good, I'm sure, and very easily.
Best regards,
Gérard
Attached patch Patch v1 (obsolete)
Changes per Comment #5.
Remove the extra <p></p> the template adds around the blog posts (which currently results in <p><p>$blogpost_content</p></p>).
Add padding to .entry { padding-top: 1em; }
Requesting review from 2 planet peers (not that I believe it requires it, just being complete; whoever gets to it first, and I believe planet changes currently require 2 levels of peer ok).
fwiw, I don't have svn access, nor am I familiar with svn, so this patch is a little rough.
Assignee: asa → bugtrap
Status: NEW → ASSIGNED
Attachment #258479 - Flags: review?(preed)
Attachment #258479 - Flags: review?(asa)
Comment on attachment 258479 (Patch v1)
Changes look fine to me. I don't have svn set up, so this will wait on me to get that or on preed to have a free minute. Thanks.
Attachment #258479 - Flags: review?(asa) → review+
I've corrected the missing </li> issue on our weblog (Calendar).
Thank you Simon! This is appreciated! :)
A few other errors which can be easily fixed:

1-
<script type="text/javascript">
function calcDiff(thenMS) {
  now = new Date();
  (...)
  else
    document.write(days, " days, ", hours%24, " hours, ", minutes%60, " minutes<br>", then.toLocaleString());
  document.write("</dd>");
}
</script>

document.write("</dd>");
should be written like this:
document.write("<\/dd>");

2-
<p>Maintained by <a href="https://bugzilla.mozilla.org/enter_bug.cgi?product=Websites&component=planet.mozilla.org">
should be written like this:
<p>Maintained by <a href="https://bugzilla.mozilla.org/enter_bug.cgi?product=Websites&amp;component=planet.mozilla.org">
3- <div id="footer"> is missing a closing </div>

4-
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
can be changed to
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
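Points 1 and 2 above are both escaping fixes, and a minimal sketch of each is given here. The helper names are hypothetical; this is one way to produce the corrected strings, not the change that was made to the Planet template.

```javascript
// Point 1: inside a <script> element, write "</" as "<\/" so the markup parser
// does not see an end-tag open delimiter. In a JavaScript string literal,
// "\/" is simply "/", so the script still writes the same text.
function escapeEndTagsInScript(source) {
  return source.replace(/<\//g, "<\\/");
}

// Point 2: inside an attribute value, a bare "&" must be written "&amp;".
// Ampersands that already start an entity or character reference are left alone.
function escapeAmpersandsInAttribute(value) {
  return value.replace(
    /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#x[0-9a-fA-F]+);)/g,
    "&amp;"
  );
}

console.log(escapeEndTagsInScript('document.write("</dd>");'));
// document.write("<\/dd>");

console.log("<\/dd>" === "</dd>"); // true: the escaped literal is the same string

console.log(
  escapeAmpersandsInAttribute(
    "https://bugzilla.mozilla.org/enter_bug.cgi?product=Websites&component=planet.mozilla.org"
  )
);
// https://bugzilla.mozilla.org/enter_bug.cgi?product=Websites&amp;component=planet.mozilla.org
```

Leaving existing references untouched means the second helper can be run on a value that is already partly escaped without producing "&amp;amp;".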
1. Why does the validator try to validate the contents of <script>? Escaping document.write to make the validator happy seems not that useful to me.
2. ok. - probably should be added to the patch in this bug, doh.
3. ok. - same as above.
4. I'm not sure requiring all the blog authors on planet to comply with a strict doctype is a good idea. The style guide does not require using strict. (Though Planet isn't a mozilla.org Document as such, either, so I'm not sure how much that document really applies.) I think given that we're not in control of the very dynamic syndicated markup that passes through planet, we're in a better position to validate as transitional than strict. Currently there are 93 errors vs. 101 for strict. I count about 7 errors (transitional) coming from blogs, but double that with strict (mostly deprecated attributes: target, align, etc.). Room for error, in that I'm still assuming any </p> is from the template.
Attachment #258479 - Flags: review?(preed)
> 1. Why does the validator try to validate the contents of <script> Escaping
> document.write to make the validator happy seems not that useful to me.

Common HTML Validation Problems
Writing HTML in a SCRIPT element
http://www.htmlhelp.com/tools/validator/problems.html.en#script

HTML 4, Section 18.2.4 Dynamic modification of documents
http://www.w3.org/TR/html4/interact/scripts.html#h-18.2.4

HTML 4, Appendix B.3.2 Specifying non-HTML data, Element Content
http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.2.1
"When script or style data is the content of an element (SCRIPT and STYLE), the data begins immediately after the element start tag and ends at the first ETAGO ("</") delimiter followed by a name start character ([a-zA-Z]); note that this may not be the element's end tag. Authors should therefore escape "</" within the content."

> 4. I'm not sure requiring all the blog authors on planet comply to a strict
> doctype on planet is a good idea.

Requiring? Maybe no. Desiring, hoping, expecting, inviting, asking: yes. Passing validation with a strict DTD is/should be a goal for everyone and is after all in the best interests of blog authors.
"All new pages should validate as HTML 4.01 Strict using the W3C Validator. This ensures (...)"
http://www.mozilla.org/contribute/writing/guidelines#validation

> The style guide does not require using
> strict. (Though, Planet isn't a mozilla.org Document as such, either, so i'm
> not sure how much that document really applies.) I think given that we're not
> in control of the very dynamic syndicated markup that passes through planet,
> we're in a better position to validate as transitional than strict. Currently
> there's, 93 errors vs. 101 for strict. I count about 7 errors (transitional)
> coming from blogs, but double that with strict. (mostly deprecated attributes,
> target, align, etc.)

Correct.

> Room for error, in that i'm assuming any </p> is from the
> template still.

Fair enough.
If the number of errors is under 20, we (including me!) should all be happy. Cheers!
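The Appendix B.3.2 rule quoted in the comment above can be turned into a short check. This sketch only restates the quoted sentence (script data ends at the first "</" followed by a letter); the function name is hypothetical and a real validator does considerably more.

```javascript
// Return the index at which SCRIPT content would end under the quoted rule:
// the first "</" immediately followed by a name start character [a-zA-Z].
// Returns -1 when the content contains no such delimiter.
function findScriptDataEnd(content) {
  const etago = /<\/[a-zA-Z]/;
  const match = etago.exec(content);
  return match === null ? -1 : match.index;
}

const unescaped = 'document.write("</dd>");';
const escaped = 'document.write("<\\/dd>");'; // source text: document.write("<\/dd>");

// The unescaped content "ends" inside the string literal, so the validator
// sees a </dd> end tag where no DD element is open.
console.log(findScriptDataEnd(unescaped) === unescaped.indexOf("</dd")); // true

// The escaped content contains no ETAGO followed by a letter.
console.log(findScriptDataEnd(escaped)); // -1
```

This is why the validator appears to "validate the contents of <script>": it is not parsing the JavaScript, it is applying the content-termination rule to the raw text.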
Attached patch Patch v2
Addresses the issues from patch v1 (<p></p> and CSS change), which already got review+ (thanks asa), and adds to it:
Escape the </dd> in the <script> tag (Comment #10 Point 1)
Escape the & in the bugzilla link (Comment #10 Point 2)
Add the closing </div> (Comment #11 Point 3)
I didn't change the doctype. I think we should encourage authors to use HTML 4 Strict markup in their posts, but for now I don't think changing the doctype is a good idea, since it puts the page in a worse situation wrt validation.
Requesting review again.
Attachment #258479 - Attachment is obsolete: true
Attachment #258571 - Flags: review?(asa)
Comment on attachment 258571 (Patch v2)
Nice. Thanks guys.
Attachment #258571 - Flags: review?(asa) → review+
Whiteboard: [checkin-needed]
Checked in patch 2. Looks like remaining errors are in feeds themselves.
Wasn't really clear. I'll leave this bug open in case of any more work. If it's decided to be done, just resolve as fixed.
I'd like to keep this bug open until June 1st 2007. FYI, the number of validation markup errors has been reduced quite a lot. Today, there were 15 validation markup errors.
(In reply to comment #18)
> I'd like to keep this bug opened until June 1st 2007.
>
That's fine. Out of curiosity, what is the significance of June 1?

> FYI, the number of validation markup errors have been reduced quite a lot.
> Today, there is/was 15 validation markup errors.
>
Reduction is always good.
It's June 1. :-)
afaict, there's nothing more to be done here. A quick check shows 3 transitional errors and 5 strict errors, all in user content. (For transitional: 2 missing alt="" and a bad align (center). For strict: 2 align complaints (that they exist), 1 target (same), and the 2 missing alt="".)
Bug 372060 upgraded Planet to Planet Venus, which should allow for better sanitizing of invalid markup. (See Comment #20 in that bug.)
--> Fixed.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Summary: Make webpages on planet.mozilla.org validate as HTML 4.01 Strict → Make webpages on planet.mozilla.org validate as HTML 4.01
Whiteboard: [checkin-needed]
Depends on: 372060
The June 1st date was just to give a time span and some time to evaluate from whom/where the remaining validation markup errors were coming on a *_regular_* basis. Having the bug still open would have helped me contact the person behind the feed.
On certain days, there are still up to over 75 validation markup errors; on others, there are 20 or fewer. Some people's feeds generate errors which consistently lead to long lists of </p></p></p></p></p>... or </li></li></li></li></li>... I've noticed this with John Lilly's post http://john.jubjubs.net/2007/05/18/growth-around-the-world/ and one of Robert O'Callahan's blog posts.
Let's say a blog post is like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
(...)
<ul>
  <li>apple
  <li>orange
  <li>cherry
</ul>

then in Planet Mozilla, it will create _something_ like this:

<ul>
  <li>apple
  <li>orange
  <li>cherry</li></li></li>
</ul>

Validation markup errors on planet.mozilla.org are not *always* just a few missing alt="", a few align attributes or a few deprecated attributes. Validity is not (and should not be) a religion but a thing mozilla bloggers should have in mind and try to achieve.

> planet venus, which should allow for better
> sanitizing of invalid markup
We'll see :)
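One plausible way to end up with a run of </li> like the one above is a sanitizer that balances tags as if the input were XML, where a start tag stays open until its own end tag appears. The sketch below is a guess at that mechanism for illustration only; it is not Planet's or Planet Venus's actual code, it handles only <ul> and <li>, and the function name is hypothetical.

```javascript
// Balance <ul>/<li> tags with XML-style nesting: every start tag is pushed on
// a stack, and an end tag first closes everything opened after its match.
function balanceListTagsAsXml(fragment) {
  const tagPattern = /<(\/?)(ul|li)\b[^>]*>/gi;
  const open = [];
  let out = "";
  let last = 0;
  let match;
  while ((match = tagPattern.exec(fragment)) !== null) {
    out += fragment.slice(last, match.index); // copy text between tags
    last = tagPattern.lastIndex;
    const name = match[2].toLowerCase();
    if (match[1] === "") {
      open.push(name);
      out += match[0];
      continue;
    }
    const index = open.lastIndexOf(name);
    if (index === -1) continue; // stray end tag: drop it
    while (open.length > index + 1) out += "</" + open.pop() + ">";
    open.pop();
    out += match[0];
  }
  out += fragment.slice(last);
  while (open.length > 0) out += "</" + open.pop() + ">"; // close leftovers
  return out;
}

console.log(balanceListTagsAsXml("<ul><li>apple<li>orange<li>cherry</ul>"));
// <ul><li>apple<li>orange<li>cherry</li></li></li></ul>
```

Under HTML 4 rules each <li> implicitly closes the previous one, so only one item is open when </ul> arrives and the added end tags are reported as "not open". A post with 49 unclosed items would produce 49 consecutive </li> in the same way.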