Closed Bug 695577 Opened 13 years ago Closed 13 years ago

E4X syntax should not be accepted in ES5 strict mode

Categories

(Core :: JavaScript Engine, defect)

Other Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: erights, Assigned: brendan)

References

Details

(Keywords: addon-compat, dev-doc-complete)

Attachments

(2 files, 2 obsolete files)

"use strict" is currently our one real opt-in boundary for simplifying the language and reducing threats by dropping legacy complexity that is generally no longer needed. As Brendan said somewhere "E4X is crazyland", and FF's implementation of E4X deviates from the spec in ways that are not written down anywhere. Until we encountered this issues, it looked like SES could bring ocap security to ES5.1 without doing an accurate lex or parse. With this restriction, we could regain this economy.

Besides, no one wants to upgrade the E4X semantics to be compatible with ES5 or ES.next, so this seems a good time to impose this opt-in restriction.
See Also: → 695579
This will get E4X out of Harmony too. The only way for it to come back would be via a new edition of ECMA-357, which Adobe and Mozilla were going to work on. Until then, it's out. This thus signals start of deprecation for E4X in SpiderMonkey. Andrew, you ok with this?

/be
Assignee: general → brendan
Status: NEW → ASSIGNED
We do not currently use E4X and strict mode together. So as long as the existing E4X code works as it does now when strict is not set (either with "use strict" or via jsapi) then I don't think it will cause me any headache until we can remove it entirely from the codebase. I am planning on enforcing strict mode for all new projects and implicitly disabling E4X using the runtime option when strict to avoid creating any future incompatibility.
I hacked up my code to disable E4X when in strict mode (FF4 1.8.5) and I discovered that XML was still accepted in the token stream even though I set JSOPTION_STRICT in the embedding. It seems that the token stream would only be put into strict mode if "use strict" was found. I hacked up jsparse to set strict mode initially on the Parser/TreeContext if JSOPTION_STRICT is specified and now it properly throws if XML tokens are found. I see the frontend code has been all moved around now, so I am not sure yet if this is still an issue in m-i.
Attached patch proposed fix (obsolete) — Splinter Review
Attachment #568605 - Flags: review?(jwalden+bmo)
Attached patch diff -w version of last patch (obsolete) — Splinter Review
Attached patch my approachSplinter Review
I attached the approach I was taking.. I just wanted TokenStream::hasXML() to return false in strict mode and things could simply be based on that. You guys know this better than I do, though :)

The first parts about setting strict mode if the context flag is set takes care of the issue I described up above.

Also, on my end I made strict mode not call js_InitXMLClasses() so that script could not "new XML(...)". Are you planning on doing that?
(In reply to Andrew Paprocki from comment #6)
> Created attachment 568643 [details] [diff] [review] [diff] [details] [review]
> my approach
> 
> I attached the approach I was taking.. I just wanted TokenStream::hasXML()
> to return false in strict mode and things could simply be based on that. You
> guys know this better than I do, though :)

That is not enough. hasXML is just about whether <!... (CDATA and comments) are interpreted per ECMA-357, or only <!-- is taken as a comment to end of line (ye olde comment-hiding hack from 1995 to keep inline script content from showing in <script>-unaware browsers).

/be
(In reply to Brendan Eich [:brendan] from comment #7)
> That is not enough. hasXML is just about whether <!... (CDATA and comments)
> are interpreted per ECMA-357, or only <!-- is taken as a comment to end of

Ok, figured you'd know better. But my comments on the strict-mode still hold. It seems like a bug to me that TokenStream::isStrictMode() returns false if the embedding sets JSOPTION_STRICT and the code does *not* contain "use strict";
(In reply to Andrew Paprocki from comment #8)
> (In reply to Brendan Eich [:brendan] from comment #7)
> > That is not enough. hasXML is just about whether <!... (CDATA and comments)
> > are interpreted per ECMA-357, or only <!-- is taken as a comment to end of
> 
> Ok, figured you'd know better. But my comments on the strict-mode still
> hold. It seems like a bug to me that TokenStream::isStrictMode() returns
> false if the embedding sets JSOPTION_STRICT and the code does *not* contain
> "use strict";

JSOPTION_STRICT is not "use strict" -- two different things. We've aligned as much as possible but I don't believe we should make the strict-warning option ban E4X. It's just about warnings.

/be
(In reply to Brendan Eich [:brendan] from comment #9)
> JSOPTION_STRICT is not "use strict" -- two different things. We've aligned

Ok, I wasn't aware you wanted to keep those to have separate meanings. I'll have to experiment with automatically executing a standalone script '"use strict";' prior to executing the real script because I do not want app developers to have to manually put this in their script and we control strict-ness via external metadata anyway. If there is any performance impact with doing it this way I'll let you know. Perhaps there could be a runtime option to do exactly what "use strict" does?
Comment on attachment 568605 [details] [diff] [review]
proposed fix

Double-requesting in case it helps.

/be
Attachment #568605 - Flags: review?(luke)
Comment on attachment 568605 [details] [diff] [review]
proposed fix

Review of attachment 568605 [details] [diff] [review]:
-----------------------------------------------------------------

On second thought, I think I can review these changes; JS_HAS_XML_SUPPORT pretty much guides the way.

A general request: it would be reassuring (and useful for fuzzing) to add a JS_ASSERT(!tc->inStrictMode()) in all the JS_HAS_XML_SUPPORT blocks/cases in BytecodeGenerator and js::Interpret.

One high level question: I don't see anything about disabling initialization of the Namespace, QName, and XML classes.  I'm assuming that these are not disabled since they are not about strict-mode per se and the Caja guys can just delete these properties of the global?  r+ if this is correct.

::: js/src/frontend/Parser.cpp
@@ +6289,5 @@
>              pn2->pn_pos.begin = pn->pn_pos.begin;
>              pn2->pn_pos.end = tokenStream.currentToken().pos.end;
>  #if JS_HAS_XML_SUPPORT
>          } else if (tt == TOK_DBLDOT) {
> +            if (tc->inStrictMode()) {

There is another JS_HAS_XML_SUPPORT block above this one in memberExpr.  IIUC, primaryExpr should be handling the situation there, but could you put a JS_ASSERT(!tc->inStrictMode()) in the appropriate branch?

@@ +7489,5 @@
>          pn = xmlElementOrListRoot(JS_TRUE);
>          if (!pn)
>              return NULL;
>          break;
>  #endif /* JS_HAS_XML_SUPPORT */

What about the three cases in this JS_HAS_XML_SUPPORT block?  JS_ASSERT(!tc->inStrictMode()) ?
Attachment #568605 - Flags: review?(luke)
Attachment #568605 - Flags: review?(jwalden+bmo)
Attachment #568605 - Flags: review+
(In reply to Luke Wagner [:luke] from comment #12)
> A general request: it would be reassuring (and useful for fuzzing) to add a
> JS_ASSERT(!tc->inStrictMode()) in all the JS_HAS_XML_SUPPORT blocks/cases in
> BytecodeGenerator and js::Interpret.

I did stick those in a few places, but I grew weary. On IRC, evilpies and I agreed that the patch was bigger than one might like. I'll look into how much bigger these assertions would make it. The consolation is that they'd all be #if JS_HAS_XML_SUPPORT, so they will go away in a release or two.

> One high level question: I don't see anything about disabling initialization
> of the Namespace, QName, and XML classes.  I'm assuming that these are not
> disabled since they are not about strict-mode per se and the Caja guys can
> just delete these properties of the global?  r+ if this is correct.

That's right, Caja removes unwanted built-ins, especially if it cannot freeze their prototypes :-P.

The trouble with disabling is that strict mode is a property of source code. Whereas these built-ins are properties of global objects in the heap, accessible via scope chains from all kinds of code (non-strict, strict, "extended" [Harmony, ES6 in all likelihood]). We do not provide separate scopes and heaps to strict and non-strict code, or any kind of subjective views of the heap.

> ::: js/src/frontend/Parser.cpp
> @@ +6289,5 @@
> >              pn2->pn_pos.begin = pn->pn_pos.begin;
> >              pn2->pn_pos.end = tokenStream.currentToken().pos.end;
> >  #if JS_HAS_XML_SUPPORT
> >          } else if (tt == TOK_DBLDOT) {
> > +            if (tc->inStrictMode()) {
> 
> There is another JS_HAS_XML_SUPPORT block above this one in memberExpr. 
> IIUC, primaryExpr should be handling the situation there, but could you put
> a JS_ASSERT(!tc->inStrictMode()) in the appropriate branch?

Ok.

> @@ +7489,5 @@
> >          pn = xmlElementOrListRoot(JS_TRUE);
> >          if (!pn)
> >              return NULL;
> >          break;
> >  #endif /* JS_HAS_XML_SUPPORT */
> 
> What about the three cases in this JS_HAS_XML_SUPPORT block? 
> JS_ASSERT(!tc->inStrictMode()) ?

I did grow weary -- thanks, will assert.

/be
(In reply to Brendan Eich [:brendan] from comment #13)
> (In reply to Luke Wagner [:luke] from comment #12)
> > One high level question: I don't see anything about disabling initialization
> > of the Namespace, QName, and XML classes.  I'm assuming that these are not
> > disabled since they are not about strict-mode per se and the Caja guys can
> > just delete these properties of the global?  r+ if this is correct.
> 
> That's right, Caja removes unwanted built-ins, especially if it cannot
> freeze their prototypes :-P.


Actually, Caja and SES don't even remove these from the global object. They simply prevent eval-ed code from 
* naming these globals, and so "virtually" removing them,
* obtaining a reference to the global object itself,
* being non-strict,
* accessing the original eval or Function, which would enable evasion of these constraints.

SES does so without parsing. This is important, since accurately parsing, or even lexing, JS takes too much JS code. That's why E4X is such a hazard to us -- it enables the E4X prototype to be reached not by naming a global but by constructing a literal -- which we are not in a position to detect or prevent. We have therefore upgraded our repair script http://es-lab.googlecode.com/svn/trunk/src/ses/explicit.html to detect this hazard, so Caja can fall back to the ES5/3 translation strategy on platforms that it cannot otherwise secure.

Once this bug is fixed, FF will again pass this test with flying colors ;). We are *very* grateful for the quick attention this issue has received, THANKS!
Passes testsuite in shell. Testing more, this had enough heft that I'm asking for a quick re-review. Thanks,

/be
Attachment #568605 - Attachment is obsolete: true
Attachment #568606 - Attachment is obsolete: true
Attachment #568979 - Flags: review?(luke)
Comment on attachment 568979 [details] [diff] [review]
with review comments addressed

Mmmm, asserty!
Attachment #568979 - Flags: review?(luke) → review+
http://hg.mozilla.org/mozilla-central/rev/0cff4fe76772

/be
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Someone's probably going to be interested in this change, so we should document it a bit: MDN, addon-targeted announcements, etc. and some blogging for good measure.

I have been on a blogging tear recently, on moderately dense technical stuff that takes time to get right, so I'm running low on wordsmithing energy.  If someone else could blog about this, announce it in reasonable places, etc. that would be great.  But if no one else feels like doing so, I could make an effort to do it.
Can we please have a way to opt out of this while still retaining the sane features of strict mode?
I've noted this change on Firefox 10 for developers. If additional information needs to be posted, please let me know what it ought to be:

https://developer.mozilla.org/en/Firefox_10_for_developers#JavaScript
Do anyone have the source code for one to upload to their computer, and still use E4X? I'm currently seeing to getting people interested in combining E4X's XML constructor objects with SAX(sax being a add on as methods to the XML object) expanding it's functionality.


I wish to get people interested for this being a js object query tool for working with xml database. I wish to see it on both Rhino, and node(I prefer nodes to be honest)

I talk to the ECMA organization, and they had the audacity to say E4X is.... Dead! All cause of the attention in popularity of E4X. I was appalled to say the least. I hope to get other stir up about ideal of E4X's XML object, being look upon as a querying object.... Maybe even rename it ECMA-Script querying object(ESQO... ESXQ(ECMA-Script Xml Querying Object)

All feed back is welcome.
E4X is dead, and for very good reasons.

The idea behind it wasn't bad, but the way it was integrated into the language was. SpiderMonkey was the only JS engine that ever implemented it, and there were endless problems caused by that and severe complications of the engine's implementation required for this support.

I'd recommend looking into http://sweetjs.org/ if you want to use syntax extensions in JS today. For excellent examples of what can be done with that, check jlongster's blog at http://jlongster.com/

In particular, http://jlongster.com/Compiling-JSX-with-Sweet.js-using-Readtables is pertinent, as it deals with parsing XML literals in the language.
Don't say that! Hello Till, thank you for replying. Funny thing is,a member represent of Facebook reply back to the ECMA mailing list saying they use e4x, and they too would like to see it being updated. Here the mailing list email I believe: e-TC39@ecma-international.org if you'll like to speak your mind on the issues. And if you like to read all that had been spoken, it'll be my pleasure to forward it to you and any one else who also would like to read what had been said so far. I believe the xml object within it self is a great query tool, and really the e4x doesn't do it justice, it should be more so EXQO(ECMA-Script's XML Query Object). It could be an extension of the language host on ECMA site. A server side extension. E4X... EXQO's XML Object with added SAX as methods to the object itself. To give more query tools, would be a great extension with the update. If your interested, please represent most of the mozilla community who you feel would also like to see this happen, and if can, send them a email of the ECMA email or they can go to the site and contact them.
I was the one who responded on the mailing list. I'd like to clarify that we don't use E4X. We use macros SIMILAR to E4X. We don't want it added back into engines. I'd also recommend looking into http://sweetjs.org/ as an alternative.
Please take this discussion elsewhere. E4X has been permanently removed from Spidermonkey. If for some reason that does not satisfy you and you'd like to discuss it further, the best place is the dev.planning group.
(In reply to Sebastian Markbåge from comment #26)
> I was the one who responded on the mailing list. I'd like to clarify that we
> don't use E4X. We use macros SIMILAR to E4X. We don't want it added back
> into engines. I'd also recommend looking into http://sweetjs.org/ as an
> alternative.

I aim to be a strict ECMA-Script programmer. I do not wish to use such an alternative.

But think you for responding to the mailing list.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: