Closed Bug 32618 Opened 24 years ago Closed 24 years ago

</script> tag in javascript string seen as end of script

Categories

(Core :: JavaScript Engine, defect, P3)

x86
Linux
defect

VERIFIED INVALID

People

(Reporter: sxpert, Assigned: rogerl)

Attachments

(3 files)

if you have code like

document.write('</script>');

the tag is seen as the end of the current script, even though it is in the middle of the string.
You then see garbage in the page, "') //-->" kind of thing; look
under the "today's space facts" button.
Attached file test case
4x browsers (both IE and Netscape) deal with </script> this way.  The HTML
parser does not understand JavaScript syntax, and interprets any occurrence of
</script> in a document as a closing script tag.  You munge the string a bit to
work around the issue.  For example, document.write ("</" + "script>"); would
work as expected.
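
For illustration, a minimal sketch of the kind of munging suggested above. The variable names are made up for this example. The point is only that the literal character sequence never appears in the script source, while the string value at run time is unchanged.

```javascript
// Three ways to build the closing-tag string without the literal
// sequence "<" "/" "script" ">" appearing contiguously in the source.
const viaConcat = "</" + "script>";   // split the tag across two literals
const viaEscape = "<\/script>";       // "\/" is just "/" inside a JS string
const viaUnicode = "\u003C/script>";  // "\u003C" is the "<" character

// All three hold the same nine characters at run time.
const allEqual = viaConcat === viaEscape && viaEscape === viaUnicode;
console.log(allEqual, viaConcat.length);

// In a page, any of them could then be passed to document.write(...).
```

The HTML parser only ever sees the source text, so any spelling that breaks up the sequence in the source avoids the early end of the script.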
Status: UNCONFIRMED → RESOLVED
Closed: 24 years ago
Resolution: --- → INVALID
This is just wrong. Netscape 4.7x for Linux (tested with
values of 0, 1 and 2 for x) does not seem to show this
behaviour.
Attached file reduced testcase
I'm not sure what you're seeing, but this testcase performs the same in NS4.x,
IE, and Mozilla.  If you're seeing something else, please provide a better
description.
Correct behaviour. Vfy

use
document.write('</SCR','IPT>');
Status: RESOLVED → VERIFIED
*** Bug 221898 has been marked as a duplicate of this bug. ***
*** Bug 225534 has been marked as a duplicate of this bug. ***
*** Bug 238139 has been marked as a duplicate of this bug. ***
*** Bug 248959 has been marked as a duplicate of this bug. ***
*** Bug 256153 has been marked as a duplicate of this bug. ***
*** Bug 285147 has been marked as a duplicate of this bug. ***
*** Bug 288146 has been marked as a duplicate of this bug. ***
(In reply to comment #7)
> Correct behaviour. Vfy
> 
> use
> document.write('</SCR','IPT>');
> 

Sorry, I'm not buying it.
I understand we have a work-around.
However, the only explanation of this behavior being 'as intended' in this bug
is comment #2 from R. Ginda:
> 4x browsers (both IE and Netscape) deal with </script> this way.  The HTML
> parser does not understand JavaScript syntax, and interprets any occurrence of
> </script> in a document as a closing script tag.  You munge the string a bit
> to work around the issue.  For example, document.write ("</" + "script>");
> would work as expected.

Which, to me, does not indicate this 'works as intended'. It simply explains why
this is happening...

...actually, it explained why this *was* happening with the browsers on the
market back then.

The parser *could*, by now, understand the Javascript syntax, or at least enough
of it to skip over string literals and resolve this issue (which, to me, is
still a bug).

If there is some other reason why this is 'as intended', please explain, I'm
really interested.

Thanks,
F.O.R.
The HTML parser is not supposed to understand any syntax other than (X)HTML. Use
an external JS file and the problem is solved. This bug is already marked as INVALID, so
please don't add any more comments to it unless it is really necessary.
I just ran into this in Firefox v16.0.2 and was wondering why this was never fixed.  The problem actually is that the </script> appears to override the double- or single-quoted string that contains it.  This means (to me, and I'm probably wrong) that there are two layers at work here: one that looks for the </script> tag to end the current JavaScript block, and another that handles the single/double quoting of strings.  The first layer is overriding the second and kicking itself out of the JavaScript.  This would imply that there needs to be a flag saying that a single- or double-quoted string is open, so the first layer can check it and not kick out of the JavaScript program.  But that is just a guess based upon the given situation.

Anyway, I ran into this, found this bug already filed, noted that it was first reported back in Netscape v4.x, saw the workaround, and confirmed that the workaround still works. But should this even still be a problem twelve versions later?
Yeah, I can't believe it. 13 years and 16 times reported and still unfixed!!
Hi,

This behaviour seems to follow the HTML spec.

As for the workaround, I like to use the javascript-commented <![CDATA[ ]]> embedding, because it also enables the use of HTML special chars within the script (&, <, >, etc.).
To Matthieu Rivaud:

This is not an HTML spec issue - it is a JAVASCRIPT spec issue, and therein lies the difference. As I said before - it is the HTML layer that is looking for the "</script>" tag, but the JAVASCRIPT layer should be saying TO THE HTML LAYER: "Hey!  I'm in a string and any/all HTML tags should not be dealt with".  For instance - you CAN put in "<div></div>" and the HTML layer doesn't care one whit.  You can put in "<table></table>" and it doesn't care.  In fact, you can put in ANY OTHER HTML TAG and it doesn't care.  Why?  Because it is only looking for the "</script>" tag so it knows where the end of the JavaScript is located.  Therefore, whatever code is doing that is doing it incorrectly.

The JAVASCRIPT spec says that it can put anything into a string and it will be a string.  This breaks that spec which causes not only the javascript to fail but the HTML to fail as well because the rest of the javascript code is then interpreted by the HTML layer.

So by not fixing this problem Mozilla is actually creating two problems.  One with the javascript layer and one with the HTML layer.

Now - I have not worked on the Firefox code (or its HTML and JavaScript code), but it seems to me that it simply needs a global flag (like javascriptStringFlag) to be checked on the HTML side, something like:

    I'm looking for a "</script>" tag && !javascriptStringFlag

and something like this following:

    if( (isDoubleQuote() || isSingleQuote()) && !javascriptStringFlag ){
        // Opening quote: note that a string is open and which quote opened it.
        javascriptStringFlag = true;
        if( isDoubleQuote() ) javascriptStringType = 2;
        if( isSingleQuote() ) javascriptStringType = 1;
    }
    else if( (isDoubleQuote() || isSingleQuote()) && javascriptStringFlag ){
        // Closing quote: only the same kind of quote ends the string.
        if( isDoubleQuote() && (javascriptStringType == 2) ){
            javascriptStringFlag = false;
        }
        else if( isSingleQuote() && (javascriptStringType == 1) ){
            javascriptStringFlag = false;
        }
    }

This would make it work correctly AND it would NOT break any pre-existing code because those codes were already using a workaround.  It really does need to be fixed.  I don't even think VB has this problem.
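
To make the proposal concrete, here is a rough, runnable sketch of the flag idea as a standalone function. It is not Firefox code and not how any browser behaves; `findScriptEndQuoteAware` is a hypothetical name, and the backslash handling is an assumption added so that escaped quotes do not end a string early.

```javascript
// Hypothetical sketch of a quote-aware search for the end of a script.
// Returns the index of the first closing script tag that is NOT inside
// a single- or double-quoted string, or -1 if there is none.
function findScriptEndQuoteAware(text) {
  let stringQuote = null; // null = not in a string, otherwise '"' or "'"
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (stringQuote !== null) {
      if (ch === "\\") {
        i++; // skip the escaped character, e.g. \" or \\
      } else if (ch === stringQuote) {
        stringQuote = null; // the matching quote closes the string
      }
    } else if (ch === '"' || ch === "'") {
      stringQuote = ch; // opening quote: remember which kind
    } else if (text.slice(i, i + 9).toLowerCase() === "</script>") {
      return i; // closing tag found outside any string
    }
  }
  return -1;
}

const closeTag = "</" + "script>";
// Script text with the tag once inside a string and once as the real end.
const body = "document.write('" + closeTag + "');" + closeTag;
console.log(findScriptEndQuoteAware(body));
```

Note what this sketch leaves out: comments, regular expression literals, template literals, and script languages other than JavaScript. An apostrophe in a comment such as `// don't` would flip the flag and hide the real closing tag, so the HTML side would need a good part of a JavaScript lexer to get this right.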
Sorry - the "and something like this following:" line got messed up somehow.  It should have been:

and something like this in the javascript layer where it is parsing incoming information:

again - sorry about that. :-)
Mmmm, I may be wrong (I'm not a Mozilla dev, and I'm not an HTML spec guru either), but I think the HTML spec prevails here (HTML being what embeds the script):

- In a first pass, the HTML parser detects the <script>...</script> tag (and this tag's contents must follow the specification as described in the script tag section). I always found the ABNF syntax a little bit cryptic (especially in this case), but I think this means "</script>" sequence of chars is prohibited in script contents, unless escaped.

- In a second pass, the script's parser kicks in (and it's not necessarily JavaScript), and parses the content of the tag, as understood by the HTML parser.
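
The two passes above can be sketched roughly as follows. This is a simplified model, not the actual tokenizer of any browser: `splitAtFirstCloseTag` is a hypothetical helper, and real HTML parsing handles more cases than this one regular expression.

```javascript
// Simplified model: the HTML side cuts the text at the first closing
// script tag (case-insensitive) without looking at JavaScript syntax.
function splitAtFirstCloseTag(afterOpenTag) {
  const match = /<\/script\s*>/i.exec(afterOpenTag);
  if (match === null) {
    // No closing tag: everything is treated as script text.
    return { scriptText: afterOpenTag, rest: "" };
  }
  return {
    scriptText: afterOpenTag.slice(0, match.index),          // pass 2 input
    rest: afterOpenTag.slice(match.index + match[0].length), // back to HTML
  };
}

const closeTag = "</" + "script>";
// What follows the opening script tag in a page like the reported one.
const page = "document.write('" + closeTag + "'); //-->" + closeTag + "<p>after</p>";

const { scriptText, rest } = splitAtFirstCloseTag(page);
console.log(JSON.stringify(scriptText)); // what the script engine receives
console.log(JSON.stringify(rest));       // what is parsed as HTML
```

In this model the script engine receives an unterminated string, and the remainder, beginning with `'); //-->`, is handed back to the HTML parser, which matches the garbage described in the original report.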

That being said, do not take my word for it; I was merely offering another workaround and point of view ...

Regards,
MR
MR: I understand.  :-)

Here is something to think about:  The HTML parser/engine/layer has to make a conscious decision to not muck around with the other tags (such as <div></div>).  Why can it not do the same for the "</script>" tag?

The only reason would be that the HTML parser/engine/layer was written that way.

If it was written to work that way - then it could have been written to not work that way too.

If you want to argue - it is the HTML parser/engine/layer business to catch this - then I say no it is not.  The HTML parser/engine/layer handed off processing of the incoming text to the Javascript parser/engine/layer and on all other HTML commands inside of the <script></script> tags the HTML parser/engine/layer does just that.  It leaves it up to the Javascript parser/engine/layer to handle any/all errors.  It only does not do this on the "</script>" tag.  That means there is an exception built in to the HTML parser which overrides the Javascript parser/engine/layer on only this one string.

Workarounds are nice to help you get around a problem UNTIL the problem is fixed.  Like a pothole in a street. Detour signs are put up to show you how to get around the pothole.  But eventually the pothole gets fixed and the workaround(detour) is removed.

I haven't looked at the code but I know it is well over 300MB in size.  So for someone who has never worked on the code before but knows what the problem is and how it should be fixed - it is still quite a task to go find the location and post a fix.  Someone who has worked on the code before could probably find what I am talking about and fix it fairly simply.  As I've said before - I believe this just needs one global flag and maybe one local variable to keep up with what type of quote was used on the string (double or single).  Who knows.  Such a flag may already exist and it just isn't being used.

But in any event - I understand what you mean.
In that case, I think you should try to make your point using the w3c mailing list (http://lists.w3.org/Archives/Public/public-html/).

- At worst, the members will be able to explain much better than I can what constraints make your proposal hard to specify/implement
- If your position is found sufficiently interesting, you may trigger an evolution in the HTML spec, which in turn would be swiftly implemented in Firefox. (with the added advantage that other browsers would be compelled to comply)

Regards,
MR
I actually downloaded the SDKs from Microsoft.  Unfortunately, neither the Windows 7 SDK nor the DirectX SDK will install on my system.  I use Windows XP.  According to Microsoft - both of these should install without a problem - but they do not.  I'm going to have to find a Windows XP SDK and a DirectX SDK for Windows XP so I can then download the source code and look at it.  But I will go check out the w3c location as well.  Thanks! :-)