Created attachment 388021
STEPS TO REPRODUCE:
1) Enable HTML5 parser.
2) Load attached testcase
EXPECTED RESULTS: The "There should be text here" text shows up.
ACTUAL RESULTS: Blank page.
ADDITIONAL INFORMATION: DOM Inspector shows that all content after the <script> is consumed as script text. The testcase is reduced from the site in the url bar. Our old parser shows the text, as do Safari and Opera.
I assume this will need a spec change?
dup of bug 502984?
Not at all; this particular script doesn't have anything like "-->" or "-- >" at the end.
Note that this bug causes this very bug page to misrender, since bugzilla puts the bug summary in a string in a script... which is ending up not terminated as a result.
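For illustration only, here is a rough model of that failure mode. It assumes a simplified escape rule in which "<!--" inside script text suppresses the end tag until a matching "-->" is seen; the function name `script_end` and the sample markup are hypothetical and are not taken from the attached testcase or from any parser source.

```python
def script_end(html, start):
    """Return the index of the '</script' that ends the script text
    beginning at `start`, or -1 if no end tag is honored.

    Simplified model (an assumption, not real tokenizer code):
    '<!--' turns an escape flag on, '-->' turns it off, and
    '</script' only counts while the flag is off.
    """
    lower = html.lower()
    escaped = False
    i = start
    while i < len(lower):
        if not escaped and lower.startswith("<!--", i):
            escaped = True
            i += 4
        elif escaped and lower.startswith("-->", i):
            escaped = False
            i += 3
        elif not escaped and lower.startswith("</script", i):
            return i
        else:
            i += 1
    return -1


# A string literal containing "<!--" with no later "-->" swallows the end tag.
broken = '<script>var s = "<!-- oops";</script><p>There should be text here</p>'
# The classic hiding pattern closes the escape before the end tag.
classic = "<script><!--\nvar x = 1;\n--></script><p>text</p>"

print(script_end(broken, len("<script>")))
print(script_end(classic, len("<script>")))
```

Under this model the first call finds no usable end tag, so everything after the script would be treated as script text, which matches the blank page described in the report.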
How is this dealt with by shipped in-browser HTML parsers? I'd like to avoid introducing back-and-forth parsing of comments and CDATA escapes. I suppose I could try digging deeper into the escape hole and have a ' or " suppress the start of an escape until the matching ' or ", respectively.
The way the shipping Gecko parser deals with it, I think, is http://mxr.mozilla.org/mozilla-central/source/parser/htmlparser/src/nsHTMLTokens.cpp#608
It doesn't look happy, does it?
I don't know what others do.
Note that in that code, aIgnoreComments is set to theTag != eHTMLTag_script.
(In reply to comment #4)
> It doesn't look happy, does it?
It's not happy. :-(
If this pattern is something that needs to be supported, I don't currently have better ideas than tracking things that look like string literals so that they don't affect the <!-- ... --> escape sections. That still leaves regexp literals, which seem harder to track without a more comprehensive ECMAScript parser integrated into the HTML parser. However, in that case, stray quotes would open up a whole new can of worms.
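A minimal sketch of the string-literal tracking idea, assuming only ' and " literals with backslash escapes are tracked and that a literal ends at a line break. The function name `comment_opens_outside_strings` is hypothetical, and regexp literals and script comments are deliberately not handled.

```python
def comment_opens_outside_strings(script_text):
    """Return the positions of '<!--' that fall outside anything
    that looks like a ' or " string literal."""
    positions = []
    quote = None  # the quote character of the open literal, if any
    i = 0
    while i < len(script_text):
        ch = script_text[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the backslash and the escaped character
                continue
            if ch == quote or ch == "\n":
                quote = None  # literal closed, or abandoned at line end
        elif ch in "'\"":
            quote = ch
        elif script_text.startswith("<!--", i):
            positions.append(i)
            i += 4
            continue
        i += 1
    return positions


in_string = 'var s = "<!--"; <!-- hide'
stray_quote = "var re = /'/; <!-- hide"

print(comment_opens_outside_strings(in_string))
print(comment_opens_outside_strings(stray_quote))
```

The second input shows the weakness: the quote inside the regexp literal is taken for the start of a string, so the later <!-- on the same line is missed.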
Considering that this bugzilla page breaks in Safari and Opera, I'm semi-hopeful that supporting this pattern isn't critical for Web compat.
Another possibility is letting <!-- have the escaping effect only if there has been nothing but whitespace and optionally // or /* before <!-- after the previous line break or semicolon. This seems safer than paying attention to quotes.
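A rough sketch of that positional check, assuming the prefix to inspect runs from the previous line break or semicolon up to the <!--. The helper name `starts_escape` and the regular expression are hypothetical, not taken from any parser.

```python
import re

# Allowed prefix: optional whitespace, optionally "//" or "/*", optional whitespace.
_ALLOWED_PREFIX = re.compile(r"\s*(?://|/\*)?\s*")


def starts_escape(script_text, pos):
    """Return True if the '<!--' at `pos` would start an escape
    under the positional heuristic."""
    if not script_text.startswith("<!--", pos):
        return False
    # The prefix begins just after the last line break or semicolon before pos.
    boundary = max(
        script_text.rfind("\n", 0, pos),
        script_text.rfind("\r", 0, pos),
        script_text.rfind(";", 0, pos),
    )
    prefix = script_text[boundary + 1:pos]
    return _ALLOWED_PREFIX.fullmatch(prefix) is not None


leading = "<!--\nvar x = 1;\n"
after_comment = "x = 1; // <!-- hide"
in_string = 'var s = "<!-- not an escape";'

print(starts_escape(leading, 0))
print(starts_escape(after_comment, after_comment.index("<!--")))
print(starts_escape(in_string, in_string.index("<!--")))
```

With this check, a <!-- that appears mid-statement, such as inside a string assigned to a variable, would not start an escape, and no quote tracking is needed.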
The original site I ran into this on (see url field) works fine in Safari and Opera...
See also bug 504941. I don't think the position of <!-- within the script should matter.
Why is there no voting or ability of CC List to this bug ?
(In reply to comment #10)
> Why is there no voting or ability of CC List to this bug ?
see note at comment #2
*** Bug 505897 has been marked as a duplicate of this bug. ***
Note to people who are searching for dupes before filing bugs:
If you see this in the wild, please note the URL of the page here.
FWIW, Bugzilla is getting fixed in bug 503980.
I wrote up a relatively radical proposal for a fix:
*** Bug 504941 has been marked as a duplicate of this bug. ***
*** Bug 502984 has been marked as a duplicate of this bug. ***
*** Bug 539736 has been marked as a duplicate of this bug. ***