Closed Bug 503632 Opened 16 years ago Closed 16 years ago

[HTML5][Patch] Script containing <!-- in a string never ends up closed

Categories

(Core :: DOM: HTML Parser, defect, P1)

x86
macOS

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: hsivonen)

Attachments

(1 file)

Attached file Testcase
STEPS TO REPRODUCE:
1) Enable the HTML5 parser.
2) Load the attached testcase.
EXPECTED RESULTS: The "There should be text here" text shows up.
ACTUAL RESULTS: Blank page.
ADDITIONAL INFORMATION: DOM Inspector shows that all content after the <script> is consumed as script text. The testcase is reduced from the site in the url bar. Our old parser shows the text, as do Safari and Opera. I assume this will need a spec change?
dup of bug 502984?
Not at all; this particular script doesn't have anything like "-->" or "-- >" at the end. Note that this bug causes this very bug page to misrender, since bugzilla puts the bug summary in a string in a script... which is ending up not terminated as a result.
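To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is a simplified model, not the actual tokenizer: it assumes that once <!-- is seen inside a script, a </script end tag is ignored until a matching --> appears. The function name find_script_end and the sample strings are hypothetical.

```python
def find_script_end(text: str) -> int:
    """Return the index of the '</script' that closes the script, or -1.

    Simplified model (an assumption, not the real tokenizer): '<!--' turns
    on an escape flag, '-->' turns it off, and '</script' only counts as an
    end tag while the flag is off.
    """
    lower = text.lower()
    escaped = False
    i = 0
    while i < len(lower):
        if not escaped and lower.startswith("<!--", i):
            escaped = True
            i += 4
        elif escaped and lower.startswith("-->", i):
            escaped = False
            i += 3
        elif not escaped and lower.startswith("</script", i):
            return i
        else:
            i += 1
    return -1


# Hypothetical script bodies, each followed by the rest of the page.
broken = 'var s = "<!--";</script><p>There should be text here</p>'
hidden = '<!--\nvar s = 1;\n//--></script><p>text</p>'

print(find_script_end(broken))  # no end tag is found in this model
print(find_script_end(hidden))  # end tag found after the closing '-->'
```

In this model the first sample never finds its end tag, so everything after the script is swallowed as script text, which matches the blank page in the report.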
How is this dealt with by shipped in-browser HTML parsers? I'd like to avoid introducing back-and-forth parsing of comments and CDATA escapes. I suppose I could try digging deeper into the escape hole and make a ' or " suppress the start of an escape until the matching ' or " respectively.
The way the shipping Gecko parser deals with this, I think, is at http://mxr.mozilla.org/mozilla-central/source/parser/htmlparser/src/nsHTMLTokens.cpp#608 It doesn't look happy, does it? I don't know what others do.
Note that in that code, aIgnoreComments is set to theTag != eHTMLTag_script.
(In reply to comment #4)
> It doesn't look happy, does it?
It's not happy. :-( If this pattern is something that needs to be supported, I don't currently have better ideas than tracking things that look like string literals so that they don't affect the <!-- ... --> escape sections. That still leaves regexp literals, which seem harder to track without a more comprehensive ECMAScript parser integrated into the HTML parser. However, in that case, stray quotes would open up a whole new can of worms. Considering that this bugzilla page breaks in Safari and Opera, I'm semi-hopeful that supporting this pattern isn't critical for Web compat.
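For illustration, the string-literal tracking idea could look roughly like the Python sketch below. It is only a sketch under stated assumptions: the helper name comment_starts_outside_strings is hypothetical, only ' and " literals with backslash escapes are tracked, an unescaped line break is assumed to end a literal, and regexp literals and script comments are deliberately not handled, which is exactly the gap noted above.

```python
def comment_starts_outside_strings(script: str) -> list[int]:
    """Return the indices of '<!--' that are not inside a quoted literal.

    Assumptions: only ' and " literals are tracked; a backslash skips the
    next character; an unescaped newline ends an unterminated literal.
    Regexp literals and JS comments are not recognized.
    """
    positions = []
    quote = None  # the quote character of the literal we are inside, if any
    i = 0
    while i < len(script):
        ch = script[i]
        if quote is not None:
            if ch == "\\":
                i += 2  # skip the escaped character (including a newline)
                continue
            if ch == quote or ch == "\n":
                quote = None
        elif ch in "'\"":
            quote = ch
        elif script.startswith("<!--", i):
            positions.append(i)
            i += 4
            continue
        i += 1
    return positions


# Hypothetical inputs: the first '<!--' is inside a string, the second is not.
print(comment_starts_outside_strings('var s = "<!--";'))
print(comment_starts_outside_strings('<!--\nvar s = "a";\n//-->'))
```

A stray quote in the script, for example inside a regexp literal, would flip the tracking state for the rest of the line, which is the can of worms mentioned in the comment.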
Another possibility is letting <!-- have the escaping effect only if there has been nothing but whitespace and optionally // or /* before <!-- after the previous line break or semicolon. This seems safer than paying attention to quotes.
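The position-based rule proposed here can be sketched in Python as follows. This is an illustration of the heuristic as worded in the comment, not code from any parser; the function name starts_escape and the sample inputs are hypothetical.

```python
import re

# Allowed text between the previous line break or semicolon and '<!--':
# whitespace, optionally one '//' or '/*', then more whitespace.
_ALLOWED_PREFIX = re.compile(r"\s*(//|/\*)?\s*")


def starts_escape(script: str, pos: int) -> bool:
    """Decide whether the '<!--' at index pos should start an escape.

    Sketch of the proposed rule: look back to the previous line break or
    semicolon and accept only whitespace plus an optional '//' or '/*'.
    """
    if not script.startswith("<!--", pos):
        raise ValueError("pos must point at '<!--'")
    boundary = max(
        script.rfind("\n", 0, pos),
        script.rfind("\r", 0, pos),
        script.rfind(";", 0, pos),
    )
    prefix = script[boundary + 1:pos]
    return _ALLOWED_PREFIX.fullmatch(prefix) is not None


# Hypothetical inputs.
classic = "<!--\nalert(1);\n//-->"
in_string = 'var s = "<!--";'
print(starts_escape(classic, 0))                        # True
print(starts_escape(in_string, in_string.index("<!--")))  # False
```

Under this rule the <!-- inside the string literal is preceded by non-whitespace text on the same statement, so it would not trigger the escape, with no need to track quotes.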
The original site I ran into this on (see url field) works fine in Safari and Opera...
See also bug 504941. I don't think the position of <!-- within the script should matter.
Blocks: 504941
Why is there no voting or ability of CC List to this bug ?
(In reply to comment #10)
> Why is there no voting or ability of CC List to this bug ?
see note at comment #2
Note to people who are searching for dupes before filing bugs: If you see this in the wild, please note the URL of the page here.
FWIW, Bugzilla is getting fixed in bug 503980.
I wrote up a relatively radical proposal for a fix: http://wiki.whatwg.org/wiki/CDATA_Escapes
Blocks: 301375
Summary: [HTML5]Script containing <!-- in a string never ends up closed → [HTML5][Patch] Script containing <!-- in a string never ends up closed
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED