Last Comment Bug 672453 - Whine to console when an HTML file falls off the 'good' encoding detection path
: Whine to console when an HTML file falls off the 'good' encoding detection path
Status: RESOLVED FIXED
[parity-ie]
:
Product: Core
Classification: Components
Component: HTML: Parser (show other bugs)
: Trunk
: All All
: -- enhancement (vote)
: mozilla14
Assigned To: Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01)
:
Mentors:
Depends on: 739354 842511 842512
Blocks: 768138
  Show dependency treegraph
 
Reported: 2011-07-18 23:40 PDT by Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01)
Modified: 2013-02-19 04:14 PST (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Fix without tests (32.60 KB, patch)
2011-12-08 05:40 PST, Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01)
no flags Details | Diff | Splinter Review
Whine to console about encoding-related authoring errors (63.82 KB, patch)
2011-12-16 08:16 PST, Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01)
no flags Details | Diff | Splinter Review
Whine to console about encoding-related authoring errors, with test changes (69.54 KB, patch)
2011-12-19 09:16 PST, Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01)
bugs: review-
Details | Diff | Splinter Review
Whine to console about encoding-related authoring errors, except about different-origin frames lacking a declaration altogether (123.02 KB, patch)
2012-03-21 23:29 PDT, Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01)
bugs: review+
Details | Diff | Splinter Review
Fix intermittent orange (1.52 KB, patch)
2012-03-25 13:54 PDT, :Ms2ger (⌚ UTC+1/+2)
hsivonen: review+
Details | Diff | Splinter Review

Description Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2011-07-18 23:40:54 PDT
To make it easier for Web authors to discover that their pages are broken, the HTML parser should warn to console when the following condition is met
 * There's no HTTP-level charset
AND
 * There's no BOM
AND
 * The meta prescan failed.

Furthermore, the parser should emit an error to the console when an encoding change restart occurs.

According to http://blogs.msdn.com/b/ieinternals/archive/2011/07/18/optimal-html-head-ordering-to-avoid-parser-restarts-redownloads-and-improve-performance.aspx , IE tells authors about restarts.
Comment 1 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2011-12-08 05:40:31 PST
Created attachment 580019 [details] [diff] [review]
Fix without tests
Comment 2 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2011-12-16 08:16:15 PST
Created attachment 582280 [details] [diff] [review]
Whine to console about encoding-related authoring errors

The changes to tests that look unrelated are meant to silence these console messages for other tests that listen to the console and expect to find their own messages.
Comment 3 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2011-12-19 09:16:29 PST
Created attachment 582851 [details] [diff] [review]
Whine to console about encoding-related authoring errors, with test changes
Comment 4 Olli Pettay [:smaug] 2011-12-29 15:50:38 PST
Comment on attachment 582851 [details] [diff] [review]
Whine to console about encoding-related authoring errors, with test changes

Do we really want to do this?
I would assume this will cause *lots* of warnings, even so much that no
one will care.

At least add a pref to enable/disable this and then we could
evaluate whether this approach is feasible.
Comment 5 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2012-01-02 05:59:45 PST
(In reply to Olli Pettay [:smaug] from comment #4)
> Do we really want to do this?
> I would assume this will cause *lots* of warnings, even so much that no
> one will care.

The warnings triggered for sites authored by someone else will be annoying, sure, but they will still be fewer than the warnings the CSS parser spits out.

I think the warnings are useful for the person who has authored the pages that trigger the warnings. If you consider the case where the person reading the warnings is diagnosing pages (s)he wrote, this patch will add console messages only when the author has made a mistake (and the mistake is super-easy to fix). Contrast this with the CSS parser which outputs tons of warnings about stuff that the author did deliberately.

> At least add a pref to enable/disable this and then we could
> evaluate whether this approach is feasible.

How about I instead disable the "EncNoDeclaration" message for framed docs--i.e. different-origin frames?

AFAICT, ad iframes are the most noisy case and with those the console message is pretty useless anyway, since the author of the page that contains the ad iframes probably isn't empowered to adjust the ad server's output anyway.

The messages other than "EncNoDeclaration" will be rare enough that I think it would be an overreaction to limit them in any way (beyond the patch already limiting the messages to at most one message per document).
Comment 6 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2012-03-15 05:22:19 PDT
smaug, ping. Would this be r+-worthy if I disabled EncNoDeclaration for framed docs?

As mentioned above, these warnings/errors would be much fewer than the flood of warnings about -webkit-foo CSS properties we currently have.
Comment 7 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2012-03-21 23:29:30 PDT
Created attachment 608239 [details] [diff] [review]
Whine to console about encoding-related authoring errors, except about different-origin frames lacking a declaration altogether

This patch implements the tweak proposed above. This patch is silent in normal browsing use at least for me.
Comment 8 Olli Pettay [:smaug] 2012-03-23 03:39:13 PDT
Comment on attachment 608239 [details] [diff] [review]
Whine to console about encoding-related authoring errors, except about different-origin frames lacking a declaration altogether

Could you use <meta charset="utf-8">, not <meta charset=utf-8>
consistently

"When viewed viewed" -> When viewed

mAlreadyComplainedAboutCharset is initialized to false somewhere?

Use C++ casting, not C (char*)

We can try this, but I'm not sure whether this will stick.
Comment 9 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2012-03-24 04:47:18 PDT
(In reply to Olli Pettay [:smaug] from comment #8)
> Could you use <meta charset="utf-8">, not <meta charset=utf-8>
> consistently

Done.

> "When viewed viewed" -> When viewed

Done.

> mAlreadyComplainedAboutCharset is initialized to false somewhere?

Zeroing operator new.

> Use C++ casting, not C (char*)

Done.

> We can try this, but I'm not sure whether this will stick.

Thanks. Landed.

https://hg.mozilla.org/integration/mozilla-inbound/rev/2342271b2be1
Comment 10 Ed Morley [:emorley] 2012-03-25 06:12:49 PDT
https://hg.mozilla.org/mozilla-central/rev/2342271b2be1
Comment 11 :Ms2ger (⌚ UTC+1/+2) 2012-03-25 13:54:20 PDT
Created attachment 609156 [details] [diff] [review]
Fix intermittent orange

Intermittent oranges like <https://tbpl.mozilla.org/php/getParsedLog.php?id=10359487&tree=Firefox#error0> are no fun.
Comment 12 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2012-03-25 23:30:36 PDT
Comment on attachment 609156 [details] [diff] [review]
Fix intermittent orange

Oops. Sorry. Thanks.
Comment 13 Masatoshi Kimura [:emk] 2012-03-25 23:46:22 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #6)
> As mentioned above, these warnings/errors would be much fewer than the flood
> of warnings about -webkit-foo CSS properties we currently have.
-webkit-foo properties (and any vendor-prefixed properties other than -moz-foo) are not warned to the console, BTW.
Comment 14 Henri Sivonen (:hsivonen) (Not doing reviews or reading bugmail until 2016-08-01) 2012-03-25 23:56:53 PDT
(In reply to Masatoshi Kimura [:emk] from comment #13)
> (In reply to Henri Sivonen (:hsivonen) from comment #6)
> > As mentioned above, these warnings/errors would be much fewer than the flood
> > of warnings about -webkit-foo CSS properties we currently have.
> -webkit-foo properties (and any vendor-prefixed properties other than
> -moz-foo) are not warned to the console, BTW.

When did that change?
Comment 15 :Ms2ger (⌚ UTC+1/+2) 2012-03-26 01:10:36 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #14)
> (In reply to Masatoshi Kimura [:emk] from comment #13)
> > (In reply to Henri Sivonen (:hsivonen) from comment #6)
> > > As mentioned above, these warnings/errors would be much fewer than the flood
> > > of warnings about -webkit-foo CSS properties we currently have.
> > -webkit-foo properties (and any vendor-prefixed properties other than
> > -moz-foo) are not warned to the console, BTW.
> 
> When did that change?

It's been that way forever, IIRC. We only warn for unprefixed and moz-prefixed stuff we don't recognize.
Comment 16 :Ms2ger (⌚ UTC+1/+2) 2012-03-26 09:37:18 PDT
Thanks, checked in

https://hg.mozilla.org/integration/mozilla-inbound/rev/63900b73be92
Comment 17 Matt Brubeck (:mbrubeck) 2012-03-26 10:46:52 PDT
Sorry, the follow-up 63900b73be92 was backed out on inbound because of mochitest-plain-5 failures:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7da46c0c739c
https://tbpl.mozilla.org/php/getParsedLog.php?id=10375447&tree=Mozilla-Inbound
Comment 18 Matt Brubeck (:mbrubeck) 2012-03-26 16:23:53 PDT
The test has been disabled because of frequent failures (bug 739354):
https://hg.mozilla.org/integration/mozilla-inbound/rev/6b3d2762c0fb

Note You need to log in before you can comment on or make changes to this bug.