Last Comment Bug 503632 - [HTML5][Patch] Script containing <!-- in a string never ends up closed
: [HTML5][Patch] Script containing <!-- in a string never ends up closed
Product: Core
Classification: Components
Component: HTML: Parser (show other bugs)
: Trunk
: x86 Mac OS X
: P1 major with 2 votes (vote)
: ---
Assigned To: Henri Sivonen (:hsivonen)
: 502984 504941 505897 539736 (view as bug list)
Depends on:
Blocks: xss html5-parsing 504941
  Show dependency treegraph
Reported: 2009-07-10 20:04 PDT by Boris Zbarsky [:bz]
Modified: 2011-01-30 13:09 PST (History)
8 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

Testcase (144 bytes, text/html)
2009-07-10 20:04 PDT, Boris Zbarsky [:bz]
no flags Details

Description Boris Zbarsky [:bz] 2009-07-10 20:04:41 PDT
Created attachment 388021 [details]

1) Enable HTML5 parser.
2) Load attached testcase

EXPECTED RESULTS: The "There should be text here" text shows up.


ADDITIONAL INFORMATION: DOM Inspector shows that all content after the <script> is consumed as script text.  The testcase is reduced from the site in the url bar.  Our old parser shows the text, as do Safari and Opera.

I assume this will need a spec change?
Comment 1 -fullmetaljacket- 2009-07-12 18:58:03 PDT
dup of bug 502984?
Comment 2 Boris Zbarsky [:bz] 2009-07-12 19:39:12 PDT
Not at all; this particular script doesn't have anything like "-->" or "-- >" at the end.

Note that this bug causes this very bug page to misrender, since bugzilla puts the bug summary in a string in a script... which is ending up not terminated as a result.
Comment 3 Henri Sivonen (:hsivonen) 2009-07-13 00:09:12 PDT
How is this dealt with by shipped in-browser HTML parsers? I'd like to avoid introducing back-and-forth parsing of comments and CDATA escapes. I suppose I could try digging deeper into the escape hole and make ' or " until ' or " respectively escape the start of an escape.
Comment 4 Boris Zbarsky [:bz] 2009-07-13 00:33:43 PDT
The way the shipping Gecko parser deals, I think, is

It doesn't look happy, does it?

I don't know what others do.
Comment 5 Boris Zbarsky [:bz] 2009-07-13 00:34:09 PDT
Note that in that code, aIgnoreComments is set to theTag != eHTMLTag_script.
Comment 6 Henri Sivonen (:hsivonen) 2009-07-15 01:16:33 PDT
(In reply to comment #4)
> It doesn't look happy, does it?

It's not happy. :-(

If this pattern is something that need to be supported, I don't currently have better ideas than tracking things that look like string literals so that they don't affect the <!-- ... --> escape sections. That still leaves regexp literals, which seem harder to track without a more comprehensive EcmaScript parser integrated into the HTML parser. However, in that case, stray quotes would open up a whole new can of worms.

Considering that this bugzilla page breaks in Safari and Opera, I'm semi-hopeful that supporting this pattern isn't critical for Web compat.
Comment 7 Henri Sivonen (:hsivonen) 2009-07-15 01:21:50 PDT
Another possibility is letting <!-- have the escaping effect only if the there have been nothing but whitespace and optionally // or /* before <!-- after the previous line break or semicolon. This seems safer than paying attention to quotes.
Comment 8 Boris Zbarsky [:bz] 2009-07-15 08:33:39 PDT
The original site I ran into this on (see url field) works fine in Safari and Opera...
Comment 9 Jesse Ruderman 2009-07-17 18:09:26 PDT
See also bug 504941.  I don't think the position of <!-- within the script should matter.
Comment 10 shirish 2009-07-17 23:04:02 PDT
Why is there no voting or ability of CC List to this bug ?
Comment 11 -fullmetaljacket- 2009-07-19 22:43:21 PDT
(In reply to comment #10)
> Why is there no voting or ability of CC List to this bug ?

see note at comment #2
Comment 12 Jonathan Griffin (:jgriffin) 2009-07-29 15:19:40 PDT
*** Bug 505897 has been marked as a duplicate of this bug. ***
Comment 13 Henri Sivonen (:hsivonen) 2009-08-10 04:56:33 PDT
Note to people who are searching for dupes before filing bugs:
If you see this in the wild, please note the URL of the page here.
Comment 14 Jesse Ruderman 2009-08-10 10:32:06 PDT
FWIW, Bugzilla is getting fixed in bug 503980.
Comment 15 Henri Sivonen (:hsivonen) 2009-08-12 05:03:52 PDT
I wrote up a relatively radical proposal for a fix:
Comment 16 Henri Sivonen (:hsivonen) 2009-12-03 01:46:33 PST
*** Bug 504941 has been marked as a duplicate of this bug. ***
Comment 17 Henri Sivonen (:hsivonen) 2009-12-03 01:50:24 PST
*** Bug 502984 has been marked as a duplicate of this bug. ***
Comment 18 Henri Sivonen (:hsivonen) 2009-12-11 15:40:30 PST
Comment 19 Boris Zbarsky [:bz] 2010-01-14 10:55:07 PST
*** Bug 539736 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.