Closed Bug 335051 Opened 18 years ago Closed 14 years ago

E4X literals should be acceptable values for sharp variables

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: BijuMailList, Assigned: bugzilla.mozilla.org)

References

Details

(Keywords: testcase, Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 2 obsolete files)

E4X:sharp varariable cause prob for object with e4x

 b=[#1=(<a><b>1</b></a>),#1#];
 b={x:#1=(<a><b>1</b></a>),y:#1#};
 
 // works fine, but not

 b=[#1=<a><b>1</b></a>,#1#];
 b={x:#1=<a><b>1</b></a>,y:#1#};



step:-


1. navigate attachment "e4x_and_sharp_variables.html"
2. click Input Case "0" link
3. Click "test" button
4. repeat for links 1, 2, 3, 4 ....
Blocks: 368267
Blocks: e4x
Patch such that parsing a sharp-variable-definition-token followed by an xml-token does not yield a syntax error.
Attachment #441461 - Flags: review?(jwalden+bmo)
Any chance we could get some fuzzing love on this particularly with respect to E4X?  ;-)  It looks reasonable at first glance -- but given the interaction with decompilation, once bitten, twice shy.
Assignee: general → bugzilla.mozilla.org
OS: Windows XP → All
Hardware: x86 → All
(In reply to comment #3)
> Any chance we could get some fuzzing love on this particularly with respect to
> E4X?  ;-)  It looks reasonable at first glance -- but given the interaction
> with decompilation, once bitten, twice shy.

This seems to hold up fine with initial rounds of fuzzing on 32 and 64-bit debug builds with this patch on TM tip. :)
Comment on attachment 441461 [details] [diff] [review]
patch that should solve the bug, plus regression test

>diff -r 078c03c13b7f js/src/jsparse.cpp
>--- a/js/src/jsparse.cpp	Fri Apr 23 08:58:44 2010 +0200
>+++ b/js/src/jsparse.cpp	Mon Apr 26 12:39:46 2010 +0200
>@@ -8136,19 +8136,18 @@ Parser::primaryExpr(TokenKind tt, JSBool
>       case TOK_DEFSHARP:
>         pn = UnaryNode::create(tc);
>         if (!pn)
>             return NULL;
>         pn->pn_num = (jsint) tokenStream.currentToken().t_dval;
>         tt = tokenStream.getToken(TSF_OPERAND);
>         if (tt == TOK_USESHARP || tt == TOK_DEFSHARP ||
> #if JS_HAS_XML_SUPPORT
>             tt == TOK_STAR || tt == TOK_AT ||
>-            tt == TOK_XMLSTAGO /* XXXbe could be sharp? */ ||
> #endif
>             tt == TOK_STRING || tt == TOK_NUMBER || tt == TOK_PRIMARY) {
>             ReportCompileErrorNumber(context, &tokenStream, NULL, JSREPORT_ERROR,
>                                      JSMSG_BAD_SHARP_VAR_DEF);

The even more conservative fix would be to require tt to be TOK_LB or TOK_LC here. IOW, #1= may precede *only* an object or array initialiser (unparenthesized).

That would fix other bugs, IIRC.

/be
Comment on attachment 441461 [details] [diff] [review]
patch that should solve the bug, plus regression test

Brendan speaks sensibly.  Could you post an updated patch with the condition changed to |tt != TOK_LC && tt != TOK_LB|, and with the test changed to expect a syntax error in this case?  Simpler and narrower code is winning, moreso than the interaction of the obscure features of E4X and sharp variables.  I promise a quick r+ and commit.
Attachment #441461 - Flags: review?(jwalden+bmo) → review-
I'm a bit optimistic:

http://www.google.com/codesearch?hl=en&lr=&q=%231%3D[^[{]+lang%3Ajavascript&sbtn=Search

Thinking about the original sharp variables design, remember more (it was a long time ago), I think I was trying to allow for any non-primitive at runtime. This is what the code does (wherefore the JSMSG_BAD_SHARP_DEF runtime error).

Let's go with the proposed patch for now, and put off further work to another bug. It's not obvious we win by spending more time on sharps than the minimum to fix fuzz bugs.

/be
Comment on attachment 441461 [details] [diff] [review]
patch that should solve the bug, plus regression test

Waldo, could you please land this? Sorry for the over-optimistic miscue. Any other bugs to do with defsharp (I see one for #1#, usesharp, but no defsharp #1= at a glance in my bugmail) we'll have to get separately. Thanks,

/be
Attachment #441461 - Flags: review- → review+
Comment on attachment 441461 [details] [diff] [review]
patch that should solve the bug, plus regression test

Waldo pointed out that we don't have evidence showing the need to allow XMLSTAGO here. We then talked about how comment 0 has parens around the #n= right-hand side, and doesn't -- and how the decompiler strips unnecessary parens.

So this isn't the patch we want. Rather, we should always call primaryExpr after getting the next token, and not inspect the token type. Then, we should look at PN_TYPE(pn->pn_kid) to see what the primary expression (stripping parens) actually was.

There, we can use more or less the same tt == ... || ... chain of tests, where tt = PN_TYPE(pn->pn_kid). Watch out for node types that are normalized from next-token types.

And there, after we parse, it's ok to reject E4X on the right side of sharps. XML data should not have cycles or join points.

/be
Attachment #441461 - Flags: review+
Parsing ahead and then reporting an error based on what was parsed means the error reporter call:

ReportCompileErrorNumber(context, &tokenStream, NULL, JSREPORT_ERROR,

needs to pass pn->pn_kid not NULL for correct source coordinate blame.

/be
Rasmus, would you be willing to write the patch sketched in comment 9 and comment 10? Thanks,

/be
(In reply to comment #11)
> Rasmus, would you be willing to write the patch sketched in comment 9 and
> comment 10? 

yes, I'll look at it,
(it make take a while though, 
as I probably do not have the time
to work on it in the next weeks)
Attachment #441461 - Attachment is obsolete: true
Comment on attachment 443577 [details] [diff] [review]
Patch as sketched in comment 9+10

This looks great -- thanks, Rasmus! I'm on the road today, hoping Jeff can do r+ and land this for you. Thanks again,

/be
Attachment #443577 - Flags: review?(jwalden+bmo)
I am :(
I use a lot e4x every day, for converting web-service XML output (5MB) to readable tab delimited text to paste in Excel and send it to Business Analyst or to User. So I love e4x, and hate to see it get degraded and making an  inconsistency with other JS objects.

Example:-

a=new Number(1);b=new Boolean(1);c=new String(1);d=new Date('jan 1, 2010');
e=new RegExp('abc');f=new Function('return 1');
s=uneval({a:a, b:b, c:c, d:d, e:e, f:f, a1:a, b1:b, c1:c, d1:d, e1:e, f1:f});

// Gives ==>
// ({a:#1=(new Number(1)), b:#2=(new Boolean(true)), 
//   c:#3=(new String("1")), d:#4=(new Date(1262318400000)), 
//   e:#5=/abc/, f:#6=(function anonymous() {return 1;}), 
//   a1:#1#, b1:#2#, c1:#3#, d1:#4#, e1:#5#, f1:#6#})

// and then
eval(s) // gives ==> [object Object]


So at least why e4x in parenthesis should fail. Like in
   testSyntax("#1=(<a/>)", false);
   testSyntax("#1=(<![CDATA[foo]]>)", false);
I think Biju got a point. And there is currently a problem with uneval->eval of objects containing xml, as it generates codes with sharps a la:

js> a=<a/>; uneval({a:a,b:a});
({a:#1=<a/>, b:#1#})

I see two possible solutions here: 1) change this patch, such that xml is allowed on the right hand side of sharps, or 2) keep this as it is, and change uneval such that:
js> a=<a/>; uneval({a:a,b:a});
becomes: ({a:<a/>, b:<a/>})

I would like to implement one of them, which solution to choose?

To me solution 1) seems the best, as xml is a mutable and thus the data structure should be preserved. To elaborate a bit, imagine:

js> a=<a/>; data1={a:a,b:a}; data2 = eval(uneval({a:a,b:a})); 
js> data1.a.c=1; data2.a.c=1;
// the eval(uneval(...)) above does not work in current version

data1.b.c would now be 1. If solution 1) is choosen data2.b.c would be 1, whereas it would be undefined with solution 2).
Agreed. I also claim it's a waste of time trying to change the interaction of sharps and e4x incompatibly. The interaction is what it is, and we have much more important work to do elsewhere.

/be
I have the patch queued, will push next time I have stuff that needs pushing.

A few notes from looking at hg o -p output:

* Trailing whitespace is bad; I had to remove a bunch from the patch.
* gTestfile needed to be changed.
* Since sharp variables are an extension (and so's E4X), it seems the best
  location is e4x/extensions -- moved the test there.

Thanks again!
Summary: E4X:sharp varariable cause prob for object with e4x. → E4X literals should be acceptable values for sharp variables
Changes from previous patch:

Updated test cases to expect xml as sharp values to parse, and removed

#if JS_HAS_XML_SUPPORT
            PN_TYPE(pn->pn_kid) == TOK_XMLCOMMENT ||
            PN_TYPE(pn->pn_kid) == TOK_XMLCDATA ||
            PN_TYPE(pn->pn_kid) == TOK_XMLPTAGC ||
            PN_TYPE(pn->pn_kid) == TOK_XMLELEM ||
#endif

from jsparse.cpp, to allow xml as sharp values to parse.

(The changes from comment 18 is also incorporated)
Attachment #443577 - Attachment is obsolete: true
Attachment #444050 - Flags: review?(jwalden+bmo)
Attachment #443577 - Flags: review?(jwalden+bmo)
Comment on attachment 443577 [details] [diff] [review]
Patch as sketched in comment 9+10

Oops, I thought I'd marked this r+ when I made my last comment but actually hadn't...
Attachment #443577 - Attachment is obsolete: false
Attachment #443577 - Flags: review+
Attachment #444050 - Attachment is obsolete: true
Attachment #444050 - Flags: review?(jwalden+bmo)
As I said, I have this in my queue waiting for a few other patches to accompany it -- have one now, so probably will push first thing tomorrow morning (run out of time to push and watch tree today).
Status: NEW → ASSIGNED
Keywords: checkin-needed
I'm just trying to clear out saved bugmail by setting the checkin-needed keyword. Is that so wrong, I ask you? :-P

/be
http://hg.mozilla.org/tracemonkey/rev/62e0918a4ee8

Thanks for the patch!
Whiteboard: fixed-in-tracemonkey
Jeff, It should be attachment 444050 [details] [diff] [review]
Wrong patch was pushed. Attachment 44050 [details] was the one to push. Looks like the older one was in Waldo's queue. Thanks to Biju for pointing this out.

http://hg.mozilla.org/tracemonkey/rev/818656d6f1e4

/be
Attachment #444050 - Attachment is obsolete: false
Followup fix, to restore the use of reportErrorNumber (the js::Parser method) instead of the js::ReportCompileErrorNumber long-hand:

http://hg.mozilla.org/tracemonkey/rev/9d104c5218d7

/be
The wrong patch was not pushed.  I pushed the patch I reviewed, which I intended to review, and did not push the patch I intentionally did not review.
Then your comment 21 was misleadingly unclear. It said "I have this in my queue" but you didn't mean by "this" the most recent patch (in comment 19, which also responded to your review comment 18), and you ignored comment 15, comment 16, comment 17, and of course comment 19 with its new patch.

Intentionally ignoring more recent comment and patch work is bad module peer behavior. Responding forthrightly to more recent results or arguments, especially from the patch contributor, is basic Mozilla hacking ethics. I should not need to write this. I won't repeat it.

/be
By "this patch" in comment 21, I thought I was clear in referring to the same patch about which I'd said I had "the patch" in my queue ("As I said").  Looking back, you're right, it wasn't clear -- I should have been more precise in that comment (specifically as noted below).

Comment 18 should have been an r+ -- I intended to mark r+ at the time, and I don't remember now why I didn't.  I should have combined comment 20 and comment 21 into one to make most precisely clear the antecedent of "this patch".  Do note that at the time I made comment 20 to belatedly and officially note r+ I also canceled the review on attachment 444050 [details] [diff] [review] (well, seconds after, Bugzilla UI won't let you edit two patches at once).

As for comment 18, I thought what I wrote had clearly stated that the patch was acceptable, that I'd made a few minor tweaks to it (I had it "queued", "I had to remove" trailing whitespace, I noted it "needed" a change rather than "needs" a change, I "moved" the test), and that I would push at next opportunity -- and that no further change submissions were needed.  This was too much to assume would be clear to a new contributor without being completely explicit; I apologize for making unwarranted assumptions.

As for comment 17,

> Agreed. I also claim it's a waste of time trying to change the interaction
> of sharps and e4x incompatibly. The interaction is what it is, and we have
> much more important work to do elsewhere.

This seemed to argue to me for continuing to not permit E4X literals as sharp variables, precisely as the existing patch at the time had implemented.  "The interaction is what it is" seems to me to mean keep doing what we'd been doing.  Therefore I didn't think any change was needed to reconcile what I noted in comment 18 with your comment 17, or with the patch.  As for the preceding "Agreed", my interpretation had been that you said that comment 15 and comment 16 indicated what would be desirable -- *but* that that would apply in a Panglossian best-of-all-possible-worlds, and that we should not spend time changing from what we already did.  My understanding of your expressed opinion in comment 17 also continued to accord with your comment 14, lending this interpretation of comment 17 greater credence than if I'd known nothing about your prior views.

And as for comment 25,

> Looks like the older one was in Waldo's queue.

I don't expect someone not paying extraordinarily close attention to have noticed, but both patches were in my review queue (or do you mean push queue?  technical terminology is maddeningly overloaded at times) with r? at different times (although I might not have noticed this different-times bit because I keep open tabs for reviews for fairly long lengths of time, so the r+ may have overwritten what I still saw there as r?).

In total: I did screw up a number of things here (unclear antecedents, not marking r+ in a comment clearly intended to do that, making assumptions about how Rasmus would interpret an approving comment that also noted tweaks), and I apologize for doing so, but I don't think I'm guilty of bad-faith action.
We permit almost anything after #1= including E4X literals, if you parenthesize them. The first patch from Rasmus broke this. His second patch fixed this based on explicit words in comment 15 and comment 16.

/be
I really don't want to open up this can of worms, but my review was specifically of one patch and not of the other, so...

I don't think we should be making changes to the interaction of two obscure and bug-prone extensions to support such interaction better.  E4X is not JavaScript; it never will be.  It sees almost no use on the web.  The specification is bug-ridden and unmaintained.  *Our implementation* is effectively unmaintained.  We fix bugs in it pretty much only when driven to do so by fear of security vulnerabilities.  Sharp variables are an old extension that also sees just as little use on the web.  They have no specification at all.  They introduce extra complexity into array and object literal parsing; we must execute array literals in two different ways depending on whether sharp variables are or are not used.  No other browser will implement them.  We should be moving toward minimizing untoward interactions between the two, not toward improving them.  The first patch minimizes, but the second introduces further intertwinement of both features.  I think the second patch is a bad idea.
Points of fact:

1. E4X is a topic still on Ecma TC39's radar. Ecma can no more disown it than we can in SpiderMonkey or Rhino, or Adobe in Flash/Flex. We TC39'ers resolved to link it to ES3 for now, in order to avoid having to update it to use ES5's spec formalisms, and we resolved to revisit it in the post-ES5 timeframe.

2. E4X code is certainly maintained, or it would be removed already. Not only is it maintained (I know, I've maintained it; so have igor, mrbkap, jorendorff, and gal at least), it will be maintained in the future. The plan is to use Harmony Proxies to self-host it.

3. E4X is used by CouchDB and other SpiderMonkey embeddings as well as by Firefox users such as Rasmus and Biju.

4. SpiderMonkey is open source, so we are bound take patches that improve things. Not all patches, but not "no patches" either. Thoughtful patches are welcome. Requiring someone to parenthesize to work around an error case is in fact a bug in our code, reported here and patched by a contributor, Rasmus.

5. The fix makes us consistent in not rejecting input based on lack of parentheses, which is a better fix (for not breaking compatibility) than always rejecting (as the first patch did).

6. We took Rasmus's final patch. It's not going to be backed out. How do I know? I'm module owner. You're a peer (but you didn't act like a good peer here). This doesn't give you power to decide what's in and what is out.

I think you're barking up the wrong tree. But even if it's the right tree, and we should just remove E4X (which you haven't done anything to make happen), at some removal cost to ourselves, and hard-to-reckon but probably large costs to the people who use it, what good is just barking?

Instead, let's work to move the engine and/or language forward. Adding an error case doesn't do that.

/be
(In reply to comment #32)
> (which you haven't done anything to make happen)

That's because I know you wouldn't accept the patch, not because I'm not interested in doing the work or in foisting the problems onto other people.  But might it still be a worthwhile experiment on other grounds?  I would be more than willing to do this work if people were interested in knowing what it involved, or if they were interested in the effective performance improvements of such, or if they were interested in evaluating the total code footprint of E4X, and so on.  If people say yes, I have absolutely no problem working on this.

> at some removal cost to ourselves

Sure, nothing's cost-free, including keeping it working, or preventing it from falling into further disrepair.

> what good is just barking?

I "bark" because I see it as a necessary prerequisite to improving the situation.  I "bark" because it seems like the only action right now that won't just result in people digging in their feet -- as unfortunately seems to be happening here.

At risk of "conveniently" changing the argument, I would be curious to know what other SpiderMonkey hackers believe E4X carries its weight.
It's not all about "me". We can't rip out __proto__ right now either, and it has nothing to do with my wishes or decisions.

Barking after making a patch pocket veto attempt in this bug is just no good, period. Instead of defending barking, take on some actual problem-solving work. There are lots of perf bugs to fix; there's still a bunch of ES5 to implement.

Being forthright, proposing before disposing, and doing wanted high-priority work will always win in SpiderMonkey.

/be
(In reply to comment #34)
> It's not all about "me".

I never said it was.  I do think that, absent your objections, we would have more likely than not removed E4X by now.

> Instead of defending barking, take on some actual problem-solving work.
> There are lots of perf bugs to fix; there's still a bunch of ES5 to implement.

To the extent that this implies that I bark *rather than* doing these things, I object to it.  If no such implication was intended, I agree completely, and I continue to work on these things.
(In reply to comment #35)
> (In reply to comment #34)
> > It's not all about "me".
> 
> I never said it was.  I do think that, absent your objections, we would have
> more likely than not removed E4X by now.

Doubtful, but think whatever you like -- this whole E4X-removal topic is irrelevant, a dodge. You didn't have the initiative or moxie to raise it before now, and mishandling this bug is in no way excused by it, even if removing E4X were a good idea. So drop it here.

> To the extent that this implies that I bark *rather than* doing these things, I
> object to it.

Everything we do has opportunity cost, definitely including both E4X and your barking. Enough!

/be
http://hg.mozilla.org/mozilla-central/rev/62e0918a4ee8
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #443577 - Attachment is obsolete: true
Comment on attachment 444050 [details] [diff] [review]
Updated patch such that E4X literals are accepted as sharp values.

(In reply to comment #37)
> http://hg.mozilla.org/mozilla-central/rev/62e0918a4ee8

Robert this is a wrong patch

attachment 444050 [details] [diff] [review] is the right patch
Attachment #444050 - Flags: superreview?(brendan)
Attachment #444050 - Flags: review?(brendan)
see Brendan comment 25
Thanks...
Comment on attachment 444050 [details] [diff] [review]
Updated patch such that E4X literals are accepted as sharp values.

This landed already, with a followup patch to use reportErrorNumber.

/be
Attachment #444050 - Flags: superreview?(brendan)
Attachment #444050 - Flags: review?(brendan)
Attachment #444050 - Flags: review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: