Closed Bug 570309 Opened 10 years ago Closed 10 years ago

[HTML5] CNNMoney unclickable links ("-- >" terminates comment in HTML5)

Categories

(Core :: DOM: HTML Parser, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla2.0b7
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: RyanVM, Assigned: hsivonen)

References

(Depends on 1 open bug, )

Details

(Keywords: html5, regression, testcase)

Attachments

(7 files)

Ever since the HTML5 parser was enabled on trunk, CNNMoney has a strip down the left size where the text is visible, but the links aren't clickable. Furthermore, if I enabled Adblock Plus, there appears to be some sort of overlay in the affected region that's causing the problem.
Notice that rendering seems OK. However, the links on the left size are not clickable.
Notice that with Adblock Plus enabled, it exposes the unclickable region as some sort of overlay.
blocking2.0: --- → ?
Seems related with comment parsing.

<div class="adAreaLC"><!--<b>Sponsored Links:</b>
  <div id="adItem">
  < !--include virtual="/fn_adspaces/personal_finance/credit_debt_mgmt/rate_box/quigo.260x70.ad"-- >
  < !--include virtual="/fn_adspaces/personal_finance/credit_debt_mgmt/rate_box/text1.260x25.ad"-- >
  < !--include virtual="/fn_adspaces/personal_finance/credit_debt_mgmt/rate_box/text2.260x25.ad"-- >
  </div>-->
</div>

Removing < !-- ... -- > fixes it. Will investigate more tomorrow.
Looks like the comment parsing issue coincides with an other bug.
Attached file testcase 3
This is a reduced version of attachment 449496 [details].

Looks like bug 427088.
There are two problems here:

1. The HTML5 parser diplays parts of the code of comment 3 as text. For example:

   <!--
   < !--x-- >
   < !--y-- >
   -->

   is displayed as   < !--y-- > -->


2. This additional content moves floats in an unexpected manner and
   triggers the bug described in testcase 3 (attachment id=449551).
Depends on: 427088
Summary: [HTML5] CNNMoney unclickable links → [HTML5] CNNMoney unclickable links ("-- >" should not terminate HTML comments)
Keywords: regression, testcase
OS: Windows Vista → All
Hardware: x86 → All
zcorpan, Do you recall the rationale for making -- > terminate a comment? Without reparsing, would we be worse if -- > didn't terminate comments?
From research, closing a comment for --\s*> fixes 23 pages out of 129 that have an unclosed comment (and no pages would be worse, according to what I said on irc). (Closing --!> fixes 20.)

http://lists.w3.org/Archives/Public/public-html/2009Apr/0270.html
http://lists.w3.org/Archives/Public/public-html/2009Jun/0191.html
http://krijnhoetmer.nl/irc-logs/whatwg/20090605#l-180
Just for clarification:  money.cnn.com  is in quirksmode.

Old Gecko parser behavior (which is somewhat unexpected for me):

"-- >" does terminate in standards mode
"-- >" does /not/ terminate in quirksmode
Actually no, it's not unexpected. It's that funky SGML comment terminating thing. 

Is it really a good idea to terminate at "-- >" since IE don't do this?
Furthermore, it seems insane in a non-SGML world.
What about XML compatibility?
IMHO HTML5 should change for a better future :-)
(In reply to comment #9)
> http://lists.w3.org/Archives/Public/public-html/2009Apr/0270.html

  and 23 are fixed by making --\s*> 
  terminate comments (matches Gecko, IE, Opera quirks).

Btw, that sentence is wrong (does not match IE8 and Opera pre HTML5 versions).
(In reply to comment #9)
> From research, closing a comment for --\s*> fixes 23 pages out of 129 that have
> an unclosed comment (and no pages would be worse, according to what I said on
> irc). (Closing --!> fixes 20.)
> 
> http://lists.w3.org/Archives/Public/public-html/2009Apr/0270.html
> http://lists.w3.org/Archives/Public/public-html/2009Jun/0191.html
> http://krijnhoetmer.nl/irc-logs/whatwg/20090605#l-180

Thanks. I think we should treat this as an evangelism bug.

There are three options here:
 1) Breaking some sites.
 2) Breaking other sites.
 3) Reparsing (potential XSS problem, also would involve more code complexity).

(In reply to comment #12)
> (In reply to comment #9)
> > http://lists.w3.org/Archives/Public/public-html/2009Apr/0270.html
> 
>   and 23 are fixed by making --\s*> 
>   terminate comments (matches Gecko, IE, Opera quirks).
> 
> Btw, that sentence is wrong (does not match IE8 and Opera pre HTML5 versions).

Yeah, IE8 and Opera 10.5x don't terminate on --\s*> when not reparsing:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/499
Assignee: nobody → english-us
Component: HTML: Parser → English US
Product: Core → Tech Evangelism
QA Contact: parser → english-us
Version: Trunk → unspecified
Summary: [HTML5] CNNMoney unclickable links ("-- >" should not terminate HTML comments) → [HTML5] money.cnn.com CNNMoney unclickable links ("-- >" terminates comment in HTML5)
I think it was a loose statement taking into account reparsing (Opera doesn't reparse in standards mode).

I don't think it was noted which URLs break without --\s*> closing comments, unfortunately, but it should be possible to find them again by inspecting http://philip.html5.org/data/pages-with-unclosed-comments.txt
We should probably keep track of the html5-parser-related evang bugs somehow so we know what the scope of the fallout is...
Keywords: html5
(In reply to comment #15)
> We should probably keep track of the html5-parser-related evang bugs somehow so
> we know what the scope of the fallout is...

These can now be tracked by searching for evang bugs that have "[HTML5]" in the summary.
I really think HTML5 is wrong here. This behavior is just a leftover of Gecko's SGML-style comment parsing, which is widely considered as a wrong decision. [http://ln.hixie.ch/?start=1137799947]

<!-- xxx -- > yyy -->

Testing the above code with various browsers:

"-- >" terminates the comment?

                         strict mode     quirksmode
Opera 9.6, 10            no              no
Opera 10.6               yes             yes
Fx 3.6                   yes             no
WebKit 532, Chrome5      no              no
IE 6,7,8                 no              no
Konqueror 3.5, 4.4       no              no
Dillo 2.2                no              no
The problem is with pages that have

<!-- xxx -- > yyy EOF

Such pages show the yyy in today's browsers thanks to reparsing. The yyy can be all content on the page.
Pages that are broken if -- > doens't close a comment:

From http://philip.html5.org/data/pages-with-unclosed-comments.txt I see:

http://www.robertasstable.com/ - It's only a splash page, though.


From http://philip.html5.org/data/comments-not-closed-but-with-a-gt-after-them.txt I see:

http://ff.netease.com/search.php - only affects the tracker script at the end.
same with other pages from netease.com.
There were also a few that had a comment at the end that ended with -- >, but those would work either way.


Conclusion:
The breakage is less than concluded back in 2009. Either the affected pages have gone away or changed, or James' analysis was incorrect (or both). We might be able to get away with not closing comments for -- > after all. (There were still many pages that depend on --!> closing comments, though.)
Depends on: 570647
(In reply to comment #6)
> Created an attachment (id=449551) [details]
> testcase 3
> This is a reduced version of attachment 449496 [details].

Filed bug 570647 for this.
(In reply to comment #19)
> The breakage is less than concluded back in 2009. Either the affected pages
> have gone away or changed, or James' analysis was incorrect (or both). We might
> be able to get away with not closing comments for -- > after all. (There were
> still many pages that depend on --!> closing comments, though.)

Thank you!

Let's treat this as a potential spec bug and see what happens if we ship and alpha/beta with --\s*> not closing comments.
Assignee: english-us → nobody
Component: English US → HTML: Parser
Product: Tech Evangelism → Core
QA Contact: english-us → parser
Summary: [HTML5] money.cnn.com CNNMoney unclickable links ("-- >" terminates comment in HTML5) → [HTML5] CNNMoney unclickable links ("-- >" terminates comment in HTML5)
Need to see the tryserver results for this, still. The patch makes the minimal changes that make --\s+> not terminate comments. The patch leaved dead code around to ease backing out in case fixing this CNN Money case breaks more stuff elsewhere.
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
WebKit has also seen issues with CNN Money with our HTML5 parser implementation: https://bugs.webkit.org/show_bug.cgi?id=41371
Comment on attachment 452252 [details] [diff] [review]
Experimental patch for testing if things get overall better or worse

Oops. I forgot to request review on this one.
Attachment #452252 - Flags: review?(jonas)
Comment on attachment 452252 [details] [diff] [review]
Experimental patch for testing if things get overall better or worse

I don't really have an opinion on what we should do. But if you think we should test with this patch, then r=me on landing it.
Attachment #452252 - Flags: review?(jonas) → review+
Comment on attachment 452252 [details] [diff] [review]
Experimental patch for testing if things get overall better or worse

Requesting approval in order to get beta testing on this patch. We need data to know if we want to ship the final release with or without this fix.
Attachment #452252 - Flags: approval2.0?
Yeah, let's get this checked in. Please do not mark the bug FIXED until we've decided and made the appropriate spec changes.

Also, shouldn't this have automated tests, no matter which way we ultimately go?
blocking2.0: ? → betaN+
Flags: in-testsuite?
Attachment #452252 - Flags: approval2.0?
(In reply to comment #28)
> Yeah, let's get this checked in. Please do not mark the bug FIXED until we've
> decided and made the appropriate spec changes.

Thanks. Landed as http://hg.mozilla.org/mozilla-central/rev/609d101acf8e . Not marking FIXED.

If we want to keep this behavior, I want to clean up the code anyway.

> Also, shouldn't this have automated tests, no matter which way we ultimately
> go?

OK. I'll write one. (I was thinking of this as a testless trial balloon.)
Duplicate of this bug: 582515
Whiteboard: [patch landed for testing; requires spec change]
Once you file the spec bug, WebKit will be happy to change to match.   We've also had the same issue with CNN money (https://bugs.webkit.org/show_bug.cgi?id=41371).
I'm looking into matching this patch in WebKit.
The spec has changed and this is now in test suite. What remains is cleaning up the code that I left in to make it easy to revert this change.
Flags: in-testsuite? → in-testsuite+
Whiteboard: [patch landed for testing; requires spec change] → [only dead code removal left]
Comment on attachment 485337 [details] [diff] [review]
Remove the remaining dead code

rs=me
Attachment #485337 - Flags: review?(jonas) → review+
http://hg.mozilla.org/mozilla-central/rev/bbec3799beb9
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [only dead code removal left]
Target Milestone: --- → mozilla2.0b8
Target Milestone: mozilla2.0b8 → mozilla2.0b7
You need to log in before you can comment on or make changes to this bug.