Closed Bug 590755 Opened 14 years ago Closed 2 years ago

generate JS source from AST's created by Reflect.parse()

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: dherman, Unassigned)

References

Details

Attachments

(3 files, 3 obsolete files)

It would be great to be able to evaluate AST objects generated by Reflect.parse or synthesized by a JS program. But an operation that directly compiles or evaluates AST objects would be hard to make safe -- it would in effect need a full-on validator to make sure you weren't generating code to pwn the interpreter/compiler/JIT's/etc.

An easier approach would just be to have a toSourceCode() method on AST objects that can generate eval-able JS source code corresponding to the parse node. That way, we haven't added any more ability to generate nasty code than we already had with eval.

According to Jesse, one application of this would be to modify the fuzzer to take existing test cases and randomly modify them.

Dave
Another application would be to modify the reducer (Lithium) to remove meaningful chunks.  This would make the reducer asymptotically faster for JS and much more suitable for use on web scripts.
The method name is toSource, a SpiderMonkey extension.

uneval(v) = (typeof v == "object" && v) ? v.toSource() : ...;

(... does v.quote() for a (typeof v == "string"), and other primitive value to source conversions).

Thus eval(uneval(v)) gives you a value v' === v for primitives other than NaN, and an isomorphic object graph v' if v is an object.

/be
Or not -- Jesse pointed out that you want both toSource (to see the AST structure as JSON-ish objects and arrays) and toSourceCode (or whatever name is best) to render it as source.

/be
> Or not -- Jesse pointed out that you want both toSource (to see the AST
> structure as JSON-ish objects and arrays) and toSourceCode (or whatever name is
> best) to render it as source.

Right, that's why I didn't call it toSource. I didn't have an obviously good name in mind, though, so if you have any suggestions let me know.

Dave
"Reflect.serialize" would nicely balance "Reflect.parse".
> "Reflect.serialize" would nicely balance "Reflect.parse".

Heh. The class that implements Reflect.parse is called ASTSerializer. That could easily be changed, and it's maybe not a great name for that class.

I'm not sure I like it for this operation either, though... I don't really see how it balances "parse," and toSource seems more like a serialization than this operation. An obvious inverse of "parse" is "unparse". Maybe a little goofy, but we already have "uneval" and "unparsing" is actually used sometimes (e.g., http://www.brics.dk/RS/98/12/). Thoughts?

Dave
For consistency with JSON (JSON.parse/JSON.stringify) you could also use Reflect.stringify. Not saying it's the best choice, just wanted to throw it into the mix ;)
Yeah, I'd be cool with stringify, not because I like the name (bleah) but because that ship has sailed, and consistency is as good a reason as any. :)

Thanks,
Dave
dherman asked me to post this.
Attached file AST decompiler
I suppose I should also upload my WIP version of this file. I don't trust it to be 100% accurate either, but it at least doesn't error on all of mozilla/js/src/tests, except for when it exceeds the recursion limit.
Attached patch WIP 1 (obsolete) — Splinter Review
I don't actually have time to work on this, but I spent a few hours on it this weekend just for fun.

I merged in a bunch of Joshua's code.

I added a flag to js/src/tests/jstests.py so you can use it to test the stringify implementation. Basically what this does is

    for each test under js/src/tests:
        before = load(test)
        after = Reflect.stringify(Reflect.parse(test)
        compare_bytecode(before, after)

This reveals that .stringify is still rather incomplete and even found a few bugs in .reflect (and I expect there are a few more).
Attachment #505983 - Attachment is obsolete: true
> I don't actually have time to work on this, but I spent a few hours on it this
> weekend just for fun.

Thanks, Jason! I'm happy to own this bug, but I very much appreciate your help.

Dave
Attached patch WIP 2 (obsolete) — Splinter Review
Dave, yes, please take this bug. :) Here's my last snapshot. It has 2 known
bugs of its own:

  - `var c\u0061se = 0;` is a valid ECMAScript program. Reflect.stringify gives
    "var case = 0;" which is invalid because case is a keyword.

  - "InternalError: too much recursion" stringifying
    js1_5/Regress/regress-89443.js.

each of which causes it to flunk one test. But it also flunks 43 other tests,
for these reasons:

  - The tester uses dis(), so it's really way too strict. Some tests fail for
    dumb reasons. For example, dis() uses the decompiler when printing
    JSOP_LAMBDA instructions, and the two otherwise identical functions
    (function () [1,]) and (function () [1]) have different decompiler output.
    Thanks, source notes.

  - A few very minor engine bugs: bug 632019, bug 632020.

  - A handful of bugs in Reflect.parse, mostly trivial: see bug 632024,
    bug 632026, bug 632027, bug 632029, bug 632030, bug 632047, bug 632056.
Attachment #508428 - Attachment is obsolete: true
I ported this code to python (https://github.com/zbraniecki/pyjs/blob/master/pyjs/serializer.py)

and I believe there's a minor bug around NewExpression/CallExpression.

Basically:

NewExpression(CallExpression(Identifier('x'))) produces "new (x())" instead of "new x()" - I guess it's due to "xprec > cprec" in wrapExpr. Should it be "xprec >= cprec" ?
Is there any chance we can get an updated patch?  WIP 2 does not apply cleanly to Aurora 8.
(In reply to Zbigniew Braniecki [:gandalf] from comment #14)
> I ported this code to python
> (https://github.com/zbraniecki/pyjs/blob/master/pyjs/serializer.py)
> 
> and I believe there's a minor bug around NewExpression/CallExpression.
> 
> Basically:
> 
> NewExpression(CallExpression(Identifier('x'))) produces "new (x())" instead
> of "new x()" - I guess it's due to "xprec > cprec" in wrapExpr. Should it be
> "xprec >= cprec" ?

gandalf, "new (x())" is correct for this AST, since the callee of the NewExpression is itself a CallExpression.

In other words, a NewExpression is a kind of CallExpression; you don't (normally) compose it with CallExpression. So the AST for "new x()" is like this:
    {
        type: "NewExpression",
        callee: {type: "Identifier", name: "x"},
        arguments: []
    }

Reflect.stringify produces "new x" for this, rather than "new x()". That is correct because "new x" is simply shorthand for "new x()"; the two expressions do exactly the same thing.
Attached patch WIP 3Splinter Review
Attachment #510295 - Attachment is obsolete: true
Peanut gallery observation:  I tried applying the above WIP 3 patch against trunk js prompt and xpcshell.  It didn't define a Reflect.stringify method as I expected.
Attached file AST Decompiler 2
Fixed a bug on Joshua Cranmer's decompileAST that breaks when decompileAST-ing a chunk of code with an `instanceof`.
Attachment #597902 - Attachment description: Fixed bug that has something to do with precedence when stringifying → AST Decompiler 2
Given the comments on bug 776290, it seems like developers would really appreciate having a serializer readily available.
... For example, note that users are already maintaining the patch themselves and placing updates on this bug.
OS: Mac OS X → All
Hardware: x86 → All
Comment on attachment 558508 [details] [diff] [review]
WIP 3

>+        switch (n.type) {
>+        case "ArrayExpression":
>+        case "ArrayPattern":
>+            {
>+                let s = '[';
>+                let e = n.elements;
>+                let len = e.length;

This will throw for "var foo = []", n.elements will be null in this case (allowed both for ArrayExpression and ArrayPattern). If I change it into |let len = e ? e.length : 0| then everything seems to work fine - no other issues found.
Btw, I fixed a bunch more issues in Joshua Cranmer's AST Decompiler: https://hg.adblockplus.org/jshydra/log/tip/scripts/astDecompile.js. It's mostly operator precedence, other than that a missing semicolon and incorrect assumptions as far as variable declarations go.
For some reason I had myself convinced this was blocked on self-hosting capabilities in the JS engine. But since Reflect.parse is provided as a toolkit component, that's probably not necessary at all!

So essentially to create Reflect.stringify I think I can just modify:

    toolkit/components/reflect/reflect.jsm

I'll try that ASAP.

Dave
Assignee: general → nobody
I see no one's placed a new patch here for quite a while, but the jshydra project is being at least minimally maintained (most recent patch was three months ago).  Is there a simple way to adapt jshydra's decompiler to Reflect.stringify?

I wonder because I was tinkering with Reflect.parse tonight on a whim.  I found it pretty easy to replace a sub-object representing the multiplication binary operator with a function call to a multiplication function.  This might be interesting because it could give us a way to simulate bootstrapping primitives to Harmony value types (BigInteger, Decimal, int64, etc.)
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 13 votes.
:sdetar, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(sdetar)
Flags: needinfo?(sdetar)

Reflect.parse is in maintenance mode. This would probably be better as a library, like Esprima.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: