Closed
Bug 97886
Opened 24 years ago
Closed 20 years ago
JavaScript rendered as HTML after document.write
Categories
(Core :: DOM: HTML Parser, defect, P3)
RESOLVED
FIXED
Future
People
(Reporter: munyer, Assigned: mrbkap)
References
Details
(Keywords: dom0)
Attachments
(4 files)
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC; en-US; rv:0.9.3+) Gecko/20010831
BuildID: 2001083110
For a project which I'm not allowed to disclose, I needed a way
to make HTML pages that could detect when a script has failed to
execute (for any reason at all, including complete removal by a
security-conscious proxy) and insert other content where that
script was intended to be. The only syntax I could find that
would meet those requirements was the HTML code shown below.
This idiom is rather subtle (one might even say sneaky) but it
works, and it complies with all the standards. In fact, if I
understand the history of HTML correctly, this code should work
in EVERY correctly functioning browser, all the way back to the
very first one (Tim Berners-Lee's original NeXT Step browser).
All it requires of pre-JS1.2 browsers is the ability to ignore
SGML comments, and unrecognized tags and attributes -- and, if
it recognizes <SCRIPT> tags, to respect the LANGUAGE attribute.
This code worked in MSIE 4, and (with a little tweaking) in NN 4
as well. MSIE 5 and 6 didn't cause any trouble. Unfortunately,
now I'm getting reports that these pages don't work in NN 6.1.
I've been able to replicate the problem in today's Mozilla build.
I've included three example pages below. The first two work
correctly; the third fails. The only difference between these
pages is in the way they load the script. Page 1 uses an inline
script; page 2 uses a "<SCRIPT SRC=>" tag to load an external
script; and page 3 uses an inline script to document.write a
"<SCRIPT SRC=>" element which then loads the external script.
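For readers without the attachments, the three loading styles can be sketched roughly as below. This is an illustrative reconstruction, not the attached pages: the helper name `loaderMarkup`, the `LANGUAGE="JavaScript1.2"` attribute value, and the file name used in the usage note are all assumptions.

```javascript
// Hypothetical helper: returns markup in the style each example page is
// described as using to load its script. Not taken from the attachments.
function loaderMarkup(variant, src, code) {
  const open = '<SCRIPT LANGUAGE="JavaScript1.2"';
  switch (variant) {
    case "inline": // page 1: the script body sits directly in the page
      return `${open}>${code}</SCRIPT>`;
    case "external": // page 2: a SCRIPT SRC= tag loads the external script
      return `${open} SRC="${src}"></SCRIPT>`;
    case "written": {
      // page 3: an inline script writes the SCRIPT SRC= element.
      // "<\/SCRIPT>" keeps the written string from closing the inline script early.
      const inner = `${open} SRC="${src}"><\\/SCRIPT>`;
      return `${open}>document.write('${inner}');</SCRIPT>`;
    }
    default:
      throw new Error(`unknown variant: ${variant}`);
  }
}
```

For example, `loaderMarkup("written", "external.js", "")` yields an inline script whose only job is to write a second SCRIPT element, which is the case that fails.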
Reproducible: Always
Steps to Reproduce:
1. Disable JavaScript.
2. View pages 1, 2 and 3 (attached below).
3. Re-enable JavaScript.
4. View the same three pages with JavaScript.
5. If you feel like it, repeat the above with MSIE 4/5/6.
Actual Results:
Pages 1 and 2 work correctly. Page 3 works without JavaScript.
But when JavaScript is enabled, page 3 interprets the /* and the
"fallback content" as HTML, even though these data are inside a
SCRIPT element and therefore should be interpreted as JavaScript.
Expected Results: Page 3 should work the same as pages 1 and 2.
If at first you don't understand how this idiom works (I wouldn't
blame you) -- no problem; do the following and you'll understand.
1. Use the W3C HTML validator to make a SGML parse tree of page
1 below. This tree shows how the page would be parsed by any
agent that recognizes the scripts but does not execute them
(a proxy, a robot, Lynx, NN 2, NN 6 with JS disabled, etc.).
2. Imagine this parse tree with both SCRIPT elements removed.
That's how the output of a script-stripping proxy would look.
3. Load page 1 into a text editor. Manually simulate the effect
of the first script, by replacing the entire SCRIPT element
(from <SCRIPT> to </SCRIPT>, including everything in between)
with the characters the script would feed to document.write.
Use the W3C validator again, and study the new parse tree.
This shows how a browser should proceed to parse the document,
after that first script has been executed successfully.
4. Go back to the original version, remove the <SCRIPT> and
</SCRIPT> tags, but do not remove the content between them.
Run the validator again. This shows how an extremely old
browser, like Mosaic or NN 1.0, would parse this page.
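Steps 2 to 4 above can also be approximated with a few string transformations instead of a text editor. The sketch below is a rough illustration only: the `page` markup is a hypothetical stand-in (the real attachments are not reproduced here), the function names are made up, and a regular expression is no substitute for the SGML parse tree the validator produces.

```javascript
// Hypothetical stand-in for page 1. The real attachment is not reproduced in
// this report, so this markup is only a guess at the general shape of the idiom.
const page = [
  "<P>before</P>",
  '<SCRIPT LANGUAGE="JavaScript1.2"><!--',
  "document.write('<' + 'SCRIPT LANGUAGE=\"JavaScript1.2\">/*');",
  "//--></SCRIPT>",
  "<P>fallback content</P>",
  '<SCRIPT LANGUAGE="JavaScript1.2"><!-- */',
  "document.write('<P>scripted content</P>');",
  "//--></SCRIPT>",
  "<P>after</P>",
].join("\n");

// Matches one SCRIPT element, from its start tag to the nearest end tag.
const SCRIPT_ELEMENT = /<SCRIPT\b[^>]*>[\s\S]*?<\/SCRIPT>/i;

// Step 2: a script-stripping proxy removes every SCRIPT element.
function stripScriptElements(html) {
  return html.replace(new RegExp(SCRIPT_ELEMENT.source, "gi"), "");
}

// Step 3: replace the first SCRIPT element with the text it would write.
function simulateFirstWrite(html, written) {
  return html.replace(SCRIPT_ELEMENT, () => written);
}

// Step 4: a very old browser ignores the tags but keeps what is between them.
function stripScriptTagsOnly(html) {
  return html.replace(/<\/?SCRIPT\b[^>]*>/gi, "");
}
```

With this stand-in, `simulateFirstWrite(page, '<SCRIPT LANGUAGE="JavaScript1.2">/*')` leaves the fallback paragraph inside the newly written SCRIPT element, behind an open `/*` comment, which is the state a browser is expected to reach once the first script has run.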
Reporter
Comment 1•24 years ago
Reporter
Comment 2•24 years ago
Reporter
Comment 3•24 years ago
Reporter
Comment 4•24 years ago
Comment 5•24 years ago
Confirming with today's CVS build on Linux.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Mac System 9.x → All
Hardware: Macintosh → All
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla0.9.7
Out of time. Mass move to 0.9.9
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Comment 10•23 years ago
Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US; rv:0.9.8+) Gecko/20020301
I have another site whose behavior matches the summary of this bug.
Go to http://www.myrealbox.com and select the "Switch to secure connection" link
under the login box at the top left of the page. The next page attempts to put
up a Username/Password box, but barfs on some JavaScript and renders it into the
page instead. The JavaScript looks OK to me, so I'm not sure what's up.
[was 1.1alpha]
Target Milestone: mozilla1.1alpha → Future
Comment 12•21 years ago
I found a similar problem, but my examples do not use document.write(). The
pages often render JavaScript as HTML, as the original poster described.
URL: http://www.bulliondirect.com/catalog/selectProducts.do?category=9
validates as HTML 4.01 Transitional compliant via
http://validator.w3.org/detailed.html.
The page renders JavaScript incorrectly about 70% of the time in the following browsers:
Mozilla 1.6 Gecko/20040113, Mozilla 1.5 Gecko/20030916, Firebird 0.7
Gecko/20031007, Netscape 7.1 Gecko/20030624.
If you keep hitting Reload, you may get a properly rendered page. Page source
is identical regardless of correct rendering. Saving a local copy of the bad
rendering then opening the local copy always renders correctly. Strange how by
reloading the page, the browser can randomly decide to render the same source
differently. Some sort of timing issue?
Looks like this bug's been around for nearly 2.5 years now.
Assignee
Comment 13•20 years ago
I'm not really sure who to CC on this...
This smells like a race condition in the parser. If I step through this with a
debugger and give it some time (maybe to load the external script?) this works
fine. However, if I let it run right through, I get a bogus text token in the
parser (null start and end points) and this breaks. I'm wondering if we process
the inline script, calling CNavDTD::BuildModel on the result, which then tries
to queue the request (which should block the parser), but when we return from
processing the inline script, we *don't* block the parser (return NS_OK). This
would seem to make sense to me, as if I then waited for the script to load
(i.e., manually blocking the parser in the debugger) this would work...
Does this sound at all plausible? Does anybody know this code in depth? If this
is indeed the problem, it would seem to be another indication (to me at least)
that the parser blocking model is seriously broken (also see bug 220542). Maybe
we need to consider making a stack of some sort of elements blocking the parser
so that out-of-order continue calls don't mess us up?
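To make the "stack of blockers" idea a bit more concrete, here is a toy model in JavaScript. It is only a sketch of the suggestion above, not Gecko code: the class name `BlockerStack`, its methods, and the string IDs are all invented for illustration.

```javascript
// Toy model of the "stack of blockers" idea; none of these names exist in Gecko.
class BlockerStack {
  constructor() {
    this.stack = []; // entries: { id, done }
  }

  // A script (or other element) starts blocking the parser.
  block(id) {
    this.stack.push({ id, done: false });
  }

  // A blocker reports that it has finished, possibly out of order.
  // Returns true only when no blocker is left and the parser may resume.
  unblock(id) {
    const entry = this.stack.find((e) => e.id === id);
    if (!entry) throw new Error(`unknown blocker: ${id}`);
    entry.done = true;
    // Pop finished blockers from the top; an unfinished one stops the unwinding.
    while (this.stack.length > 0 && this.stack[this.stack.length - 1].done) {
      this.stack.pop();
    }
    return this.stack.length === 0;
  }
}
```

In this model an early "continue" from the inline script does not resume parsing while the external script pushed after it is still pending; parsing resumes only once the most recent blocker and everything beneath it have finished.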
Updated•20 years ago
Assignee: harishd → parser
Status: ASSIGNED → NEW
QA Contact: moied → mrbkap
Comment 14•20 years ago
Blake, there's a lot of complexity in the blocking code, yes... I'd try putting
printfs in the content sink to see whether we're blocking the parser there or
not, then go on from there.
Assignee
Comment 15•20 years ago
The root cause of this bug is that the parser uses the wrong parser context to
parse the result of the external script. When the outer (inline) script writes
the inner one, the parser pushes a new context onto its stack with key id = 0x01
(more on that later). This context is only popped off once the tokenizer reaches
EOF on its scanner (i.e., it's been exhausted). The idea is that a nested write
call (document.write("document.write('...')");) would generate a different
parser key, so a new parser context would be established, and the parser would
continue on its merry way.
However, what happens here is that when the external script loads, the tokenizer
of the inline script hasn't finished yet (since we block parsing to load
scripts, which stops tokenization at the script), so we have a parser context
already pushed on.
Now, we take a break from the action and examine NS_GENERATE_PARSER_KEY(). This
macro in nsHTMLDocument.cpp is supposed to generate increasing keys for parser
contexts as writes nest. It relies on mWriteLevel to do so.
Back to the action! In this case, however, the previous write() has finished,
and mWriteLevel is 1 (0 before the call). So the generated parser key is the
same for the external script as for the inline script. The parser hasn't popped
off the inline script's context (which is now non-incremental) and so it
continues to use this context for the external script. Now when the
(non-incremental) scanner is used to parse |<script ...>/*| the tokenizer thinks
that it needs to fake an end tag (giving |<script ...></script>|). So the /*
[message] is parsed as text because the parser uses the wrong context.
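The key collision described above can be reproduced in a toy model. The sketch below is not the Gecko parser and does not show the real NS_GENERATE_PARSER_KEY() macro: `ToyParser`, its options, and the rule "key equals the current write level" are simplifying assumptions that only mirror the description in this comment.

```javascript
// Toy model of parser contexts keyed by write nesting level (assumed rule).
class ToyParser {
  constructor() {
    this.writeLevel = 0;
    this.contextStack = []; // entries: { key, text }
  }

  // blocked: the tokenizer stops before EOF (e.g. waiting on an external script),
  //          so the context is left on the stack.
  // nested:  callback run while this write is still in progress.
  write(text, { blocked = false, nested = null } = {}) {
    this.writeLevel += 1;
    const key = this.writeLevel; // stand-in for the generated parser key
    const top = this.contextStack[this.contextStack.length - 1];
    const reused = top !== undefined && top.key === key;
    if (reused) {
      top.text += text; // same key: the stale context swallows the new text
    } else {
      this.contextStack.push({ key, text });
    }
    if (nested) nested();
    if (!blocked && !reused) this.contextStack.pop(); // tokenizer reached EOF
    this.writeLevel -= 1;
    return { key, reused };
  }
}

// Inline script writes the external SCRIPT element; parsing then blocks.
const parser = new ToyParser();
const inline = parser.write('<script src="external.js"></script>', { blocked: true });
// Later the external script runs and writes too, at the same write level.
const external = parser.write("<script>/*");
console.log(inline.key === external.key, external.reused);
```

A genuinely nested write (one `write` called from inside another) gets a higher level and therefore its own context in this model, which is the case the key scheme was designed for.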
This problem is unfortunately very complex. I think the solution is going to be
to generate better parser keys based on something like the content ID of the
current script element AND mWriteLevel (is this possible?). Critiques welcome.
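As a rough illustration of that proposal (and nothing more), a key built from both pieces of information would keep the two scripts apart even at the same write level. The function names and the string IDs below are hypothetical; this is not how any real parser key is encoded.

```javascript
// Key derived from the write level alone, as in the collision described above.
function levelOnlyKey(writeLevel) {
  return String(writeLevel);
}

// Hypothetical composite key: script element ID plus write nesting level.
function makeParserKey(scriptId, writeLevel) {
  return `${scriptId}:${writeLevel}`;
}

// Both scripts call document.write at write level 1.
const sameLevel = levelOnlyKey(1) === levelOnlyKey(1);
const sameComposite =
  makeParserKey("inline-script", 1) === makeParserKey("external-script", 1);
console.log(sameLevel, sameComposite);
```

With the level-only key the two writes are indistinguishable, while the composite key differs as long as the two script elements have different IDs.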
Assignee: parser → mrbkap
Assignee
Comment 16•20 years ago
I'm marking this fixed, since I just checked in bug 280713, which, while it
doesn't solve the root of the problems here, is thorough enough to really cover
this up. If you think this should remain open to fix the problem of using the
wrong parser context, please reopen.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Assignee
Comment 17•20 years ago
It's worth noting here that this was really fixed for good in bug 271184.
Depends on: 271184