Closed Bug 672453 Opened 8 years ago Closed 8 years ago

Whine to console when an HTML file falls off the 'good' encoding detection path

Categories

(Core :: HTML: Parser, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla14

People

(Reporter: hsivonen, Assigned: hsivonen)

References

Details

(Whiteboard: [parity-ie])

Attachments

(2 files, 3 obsolete files)

To make it easier for Web authors to discover that their pages are broken, the HTML parser should warn to console when the following condition is met
 * There's no HTTP-level charset
AND
 * There's no BOM
AND
 * The meta prescan failed.

Furthermore, the parser should emit an error to the console when an encoding change restart occurs.

According to http://blogs.msdn.com/b/ieinternals/archive/2011/07/18/optimal-html-head-ordering-to-avoid-parser-restarts-redownloads-and-improve-performance.aspx , IE tells authors about restarts.
Attached patch Fix without tests (obsolete) — Splinter Review
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Summary: Whine to console when an HTML file falls of the 'good' encoding detection path → Whine to console when an HTML file falls off the 'good' encoding detection path
The changes to tests that look unrelated are meant to silence these console messages for other tests that listen to the console and expect to find their own messages.
Attachment #580019 - Attachment is obsolete: true
Attachment #582280 - Attachment is obsolete: true
Attachment #582851 - Flags: review?(bugs)
Comment on attachment 582851 [details] [diff] [review]
Whine to console about encoding-related authoring errors, with test changes

Do we really want to do this?
I would assume this will cause *lots* of warnings, even so much that no
one will care.

At least add a pref to enable/disable this and then we could
evaluate whether this approach is feasible.
Attachment #582851 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #4)
> Do we really want to do this?
> I would assume this will cause *lots* of warnings, even so much that no
> one will care.

The warnings triggered for sites authored by someone else will be annoying, sure, but they will still be fewer than the warnings the CSS parser spits out.

I think the warnings are useful for the person who has authored the pages that trigger the warnings. If you consider the case where the person reading the warnings is diagnosing pages (s)he wrote, this patch will add console messages only when the author has made a mistake (and the mistake is super-easy to fix). Contrast this with the CSS parser which outputs tons of warnings about stuff that the author did deliberately.

> At least add a pref to enable/disable this and then we could
> evaluate whether this approach is feasible.

How about I instead disable the "EncNoDeclaration" message for framed docs--i.e. different-origin frames?

AFAICT, ad iframes are the most noisy case and with those the console message is pretty useless anyway, since the author of the page that contains the ad iframes probably isn't empowered to adjust the ad server's output anyway.

The messages other than "EncNoDeclaration" will be rare enough that I think it would be an overreaction to limit them in any way (beyond the patch already limiting the messages to at most one message per document).
smaug, ping. Would this be r+-worthy if I disabled EncNoDeclaration for framed docs?

As mentioned above, these warnings/errors would be much fewer than the flood of warnings about -webkit-foo CSS properties we currently have.
This patch implements the tweak proposed above. This patch is silent in normal browsing use at least for me.
Attachment #582851 - Attachment is obsolete: true
Attachment #608239 - Flags: review?(bugs)
Comment on attachment 608239 [details] [diff] [review]
Whine to console about encoding-related authoring errors, except about different-origin frames lacking a declaration altogether

Could you use <meta charset="utf-8">, not <meta charset=utf-8>
consistently

"When viewed viewed" -> When viewed

mAlreadyComplainedAboutCharset is initialized to false somewhere?

Use C++ casting, not C (char*)

We can try this, but I'm not sure whether this will stick.
Attachment #608239 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #8)
> Could you use <meta charset="utf-8">, not <meta charset=utf-8>
> consistently

Done.

> "When viewed viewed" -> When viewed

Done.

> mAlreadyComplainedAboutCharset is initialized to false somewhere?

Zeroing operator new.

> Use C++ casting, not C (char*)

Done.

> We can try this, but I'm not sure whether this will stick.

Thanks. Landed.

https://hg.mozilla.org/integration/mozilla-inbound/rev/2342271b2be1
https://hg.mozilla.org/mozilla-central/rev/2342271b2be1
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla14
Comment on attachment 609156 [details] [diff] [review]
Fix intermittent orange

Oops. Sorry. Thanks.
Attachment #609156 - Flags: review?(hsivonen) → review+
(In reply to Henri Sivonen (:hsivonen) from comment #6)
> As mentioned above, these warnings/errors would be much fewer than the flood
> of warnings about -webkit-foo CSS properties we currently have.
-webkit-foo properties (and any vendor-prefixed properties other than -moz-foo) are not warned to the console, BTW.
(In reply to Masatoshi Kimura [:emk] from comment #13)
> (In reply to Henri Sivonen (:hsivonen) from comment #6)
> > As mentioned above, these warnings/errors would be much fewer than the flood
> > of warnings about -webkit-foo CSS properties we currently have.
> -webkit-foo properties (and any vendor-prefixed properties other than
> -moz-foo) are not warned to the console, BTW.

When did that change?
(In reply to Henri Sivonen (:hsivonen) from comment #14)
> (In reply to Masatoshi Kimura [:emk] from comment #13)
> > (In reply to Henri Sivonen (:hsivonen) from comment #6)
> > > As mentioned above, these warnings/errors would be much fewer than the flood
> > > of warnings about -webkit-foo CSS properties we currently have.
> > -webkit-foo properties (and any vendor-prefixed properties other than
> > -moz-foo) are not warned to the console, BTW.
> 
> When did that change?

It's been that way forever, IIRC. We only warn for unprefixed and moz-prefixed stuff we don't recognize.
Keywords: checkin-needed
Whiteboard: [parity-ie] → [parity-ie][c-n:https://bugzilla.mozilla.org/attachment.cgi?id=609156]
The test has been disabled because of frequent failures (bug 739354):
https://hg.mozilla.org/integration/mozilla-inbound/rev/6b3d2762c0fb
Depends on: 739354
Keywords: checkin-needed
Whiteboard: [parity-ie][c-n:https://bugzilla.mozilla.org/attachment.cgi?id=609156] → [parity-ie]
Blocks: 768138
You need to log in before you can comment on or make changes to this bug.