Closed Bug 779810 Opened 12 years ago Closed 11 years ago

Add support for Harmony modules to Parser

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 568953

People

(Reporter: ejpbruel, Assigned: ejpbruel)

References

(Blocks 1 open bug)

Details

(Whiteboard: [js:t])

Attachments

(3 files, 5 obsolete files)

Now that TokenStream supports the export/import keywords, the next step in implementing Harmony modules is to extend the parser.
Attached patch Recognize the module keyword (obsolete) — Splinter Review
This patch recognizes the following syntax in statement context:
module {}

No statement list is allowed between the curlies yet. We can't simply use stmtList() for this (yet) because it will search for a directive prologue (which will trigger an assertion), unless the statement list is not at body level. To fix this, we would have to create a StmtInfo for module statements and push it on the enclosing statement list before calling stmtList(). I'm not sure yet if that makes sense though.

As an aside, how does hoisting behave inside a module's body? Are var declarations lifted to the top of the body of the module or that of the enclosing function?
Attachment #648344 - Flags: review?(jorendorff)
(In reply to Eddy Bruel [:ejpbruel] from comment #1)

> As an aside, how does hoisting behave inside a module's body? Are var
> declarations lifted to the top of the body of the module or that of the
> enclosing function?

First, modules can't be nested inside functions.

Second, var declarations should be hoisted to the module top-level.
(In reply to Sam Tobin-Hochstadt [:samth] from comment #2)
> (In reply to Eddy Bruel [:ejpbruel] from comment #1)
> 
> > As an aside, how does hoisting behave inside a module's body? Are var
> > declarations lifted to the top of the body of the module or that of the
> > enclosing function?
> 
> First, modules can't be nested inside functions.
> 
> Second, var declarations should be hoisted to the module top-level.

Thanks for clearing that up. The above patch is still valid in that case, but it needs to be extended, so that module blocks are only allowed at top level. To make hoisting work, we probably have to create a TreeContext for each module.
Depends on: 779789
Comment on attachment 648344 [details] [diff] [review]
Recognize the module keyword

Review of attachment 648344 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with a handful of changes.

::: js/src/frontend/Parser.cpp
@@ +1751,5 @@
> +    if (tt != TOK_NAME) {
> +        if (tt != TOK_ERROR)
> +            reportError(NULL, JSMSG_NO_MODULE_NAME);
> +        return NULL;
> +    } 

Nit: Remove trailing space on the last line here.

More substantially, since the tokenizer already checked that the next token is a NAME on the same line, this if-block can't hit. You can kill it, and change the preceding line to:

    JS_ALWAYS_TRUE(tokenStream.matchToken(TOK_NAME));

Then remove JSMSG_NO_MODULE_NAME from js.msg.

@@ +1752,5 @@
> +        if (tt != TOK_ERROR)
> +            reportError(NULL, JSMSG_NO_MODULE_NAME);
> +        return NULL;
> +    } 
> +    ParseNode *pn = UnaryNode::create(PNK_MODULE, this);

Fine with me, but we don't really want a UnaryNode here. What do we want? Perhaps a PN_NAME node with {pn_atom: name, pn_expr: body}?

A new type and new arity-constant would be OK, but then look up all the places that switch on pn_arity.

@@ +4115,5 @@
> +#ifdef JS_HAS_MODULES
> +      case TOK_NAME:
> +          /*
> +           * Since 'module' is not a reserved keyword, treat it as an
> +           * identifier, except where it occurs in statement context 

Nit: Trailing whitespace again.

@@ +4120,5 @@
> +           * followed by another identifier on the same line.
> +           */
> +          if (tokenStream.currentToken().name() == context->runtime->atomState.moduleAtom &&
> +              tokenStream.peekTokenSameLine(TSF_OPERAND) == TOK_NAME)
> +                  return moduleStmt();

Style nit: The formatting here should be

>        if (tokenStream.currentToken().name() == context->runtime->atomState.moduleAtom &&
>            tokenStream.peekTokenSameLine(TSF_OPERAND) == TOK_NAME)
>        {
>            return moduleStmt();
>        }

(Multi-line conditions force curly braces, and they move the left brace to its own line.)

::: js/src/js.msg
@@ +353,5 @@
>  MSG_DEF(JSMSG_NONDEFAULT_FORMAL_AFTER_DEFAULT, 300, 0, JSEXN_SYNTAXERR, "parameter(s) with default followed by parameter without default")
>  MSG_DEF(JSMSG_YIELD_IN_DEFAULT,       301, 0, JSEXN_SYNTAXERR, "yield in default expression")
> +MSG_DEF(JSMSG_NO_MODULE_NAME,         302, 0, JSEXN_SYNTAXERR, "missing module name")
> +MSG_DEF(JSMSG_CURLY_BEFORE_MODULE,    303, 0, JSEXN_SYNTAXERR, "missing curly before module body")
> +MSG_DEF(JSMSG_CURLY_AFTER_MODULE,     304, 0, JSEXN_SYNTAXERR, "missing curly after module body")

Instead of the word "curly", use the symbol, like in the other similar messages:
>MSG_DEF(JSMSG_CURLY_BEFORE_BODY,       72, 0, JSEXN_SYNTAXERR, "missing { before function body")
>MSG_DEF(JSMSG_CURLY_AFTER_BODY,        73, 0, JSEXN_SYNTAXERR, "missing } after function body")
Attachment #648344 - Flags: review?(jorendorff) → review+
Whiteboard: [js:t]
Attached patch Recognise the module keyword (obsolete) — Splinter Review
Attachment #648344 - Attachment is obsolete: true
Attachment #660840 - Flags: review+
As the next step, I want to be able to parse statement lists within a module block. For this, we need to be able to distinguish between function contexts and module contexts (we should not process directive prologues for statement lists within a module context, for instance).

As a first step towards this, this patch replaces the isFunction field on SharedContext with a type field, which makes it possible to create additional context types.
Attachment #660949 - Flags: review?(jorendorff)
Here is a patch that adds support for basic module and export declarations to the parser, i.e.:

module a {
   module b {
       export var x;
       ...
   }
   export function f() { ... }
   ...
}

It's not ready to land yet. In any case I still need to add support for basic import declarations, and the bytecode emitter currently asserts on PNK_MODULE nodes. Even so, I would like to know if you agree with the approach so far.

The main thing to take away from this patch is that it introduces a third type of shared context, so that we now have GlobalSharedContext, ModuleSharedContext, and FunctionBox.

In several places in the parser we make the tacit assumption that if a context is not a function, it is the global context. With the introduction of this third context type, this is obviously no longer the case. Hopefully, I've caught all of these.
Attachment #660840 - Attachment is obsolete: true
Attachment #660949 - Attachment is obsolete: true
Attachment #660949 - Flags: review?(jorendorff)
Attachment #661952 - Flags: feedback?(jorendorff)
Comment on attachment 661952 [details] [diff] [review]
Basic module + export declarations

Review of attachment 661952 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/Parser.cpp
@@ +232,5 @@
> +    /*
> +     * Declarations that occur in the global context always refer to variables
> +     * on the global object. These can never be optimized to local variable
> +     * accesses, so escape early.
> +     */

C++ comments.

@@ +233,5 @@
> +     * Declarations that occur in the global context always refer to variables
> +     * on the global object. These can never be optimized to local variable
> +     * accesses, so escape early.
> +     */
> +    JS_ASSERT(newDecl->isFreeVar());

Did you mean to change when this assertion is run?

@@ +234,5 @@
> +     * on the global object. These can never be optimized to local variable
> +     * accesses, so escape early.
> +     */
> +    JS_ASSERT(newDecl->isFreeVar());
> +    if (sc->isGlobal()) 

Trailing whitespace here.

@@ +3701,5 @@
> +        return NULL;
> +
> +    switch (tokenStream.getToken(TSF_OPERAND)) {
> +      case TOK_FUNCTION:
> +          pn->pn_kid = functionExpr();

Bodies of case statements should be indented two spaces from the case line.

@@ +4066,5 @@
>  #endif /* JS_HAS_BLOCK_SCOPE */
>  
> +      case TOK_EXPORT:
> +        if (!pc->sc->isModule()) {
> +            reportError(NULL, JSMSG_SYNTAX_ERROR);

There should probably be a custom message for this case.

@@ +4157,5 @@
> +         * another identifier.
> +         */
> +        if (versionNumber() == JSVERSION_ECMA_5 &&
> +            tokenStream.currentToken().name() == context->runtime->atomState.moduleAtom &&
> +            tokenStream.peekTokenSameLine(TSF_OPERAND) == TOK_NAME)

Why would it have to be on the same line?

::: js/src/frontend/SharedContext.h
@@ +132,5 @@
> +enum SharedContextType {
> +    SHARED_CONTEXT_GLOBAL,
> +    SHARED_CONTEXT_MODULE,
> +    SHARED_CONTEXT_FUNCTION
> +};

This might be simpler as an anonymous enum in the SharedContext class. If not, you can use the fancy MOZ_BEGIN_ENUM_CLASS/MOZ_END_ENUM_CLASS from mfbt to get nice namespacing.

@@ +201,5 @@
>  
>      JSObject *scopeChain() const { return scopeChain_; }
>  };
>  
> +class ModuleSharedContext : public SharedContext {

Brace on the next line.

@@ +203,5 @@
>  };
>  
> +class ModuleSharedContext : public SharedContext {
> +  public:
> +    inline ModuleSharedContext(JSContext *cx);

explicit

::: js/src/jskeyword.tbl
@@ +62,5 @@
>  #else
>  JS_KEYWORD(yield,       TOK_STRICT_RESERVED, JSOP_NOP,  JSVERSION_1_7)
>  #endif
> +JS_KEYWORD(export,      TOK_EXPORT,     JSOP_NOP,       JSVERSION_DEFAULT)
> +JS_KEYWORD(import,      TOK_IMPORT,     JSOP_NOP,       JSVERSION_DEFAULT)

Align the JSOP_NOP.
Added basic support for import declarations. Have yet to address benjamins comments.

Benjamin, thanks for taking a look at this patch!
Attachment #661952 - Attachment is obsolete: true
Attachment #661952 - Flags: feedback?(jorendorff)
Attachment #661980 - Flags: feedback?(jorendorff)
Attached patch Proposed patchSplinter Review
Here's a patch that adds a Module object (this will be the 'view' on the scope object used by the module body, as discussed on irc).

This is a prerequisite for parser support, because we've decided modules will get their own context in the parser. Similar to functions, module contexts will also serve as module boxes. In general, the type of an object box is determined by the type of the object it wraps, so we need a special kind of object that is distinct from generic objects and functions.
Attachment #661980 - Attachment is obsolete: true
Attachment #661980 - Flags: feedback?(jorendorff)
Attachment #667528 - Flags: review?(jorendorff)
Comment on attachment 667528 [details] [diff] [review]
Proposed patch

Review of attachment 667528 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/Module.cpp
@@ +21,5 @@
> +{
> +    JS_ASSERT(obj->isNative());
> +
> +    Rooted<GlobalObject *> global(cx, &obj->asGlobal());
> +    return global->createBlankPrototype(cx, &ModuleClass);

You don't need to root it if you're just going to call createBlankPrototype.

@@ +27,5 @@
> +
> +js::Module *
> +js_NewModule(JSContext *cx)
> +{
> +    RootedObject obj(cx, NewBuiltinClassInstance(cx, &ModuleClass));

This rooting isn't necessary.

@@ +34,5 @@
> +    return &obj->asModule();
> +}
> +
> +inline js::Module &
> +JSObject::asModule()

Shouldn't this be in a header file?

::: js/src/builtin/Module.h
@@ +4,5 @@
> +#include "jsobj.h"
> +
> +namespace js {
> +
> +class Module : public JSObject {

Nit: brace on the next line.
This patch refactors the SharedContext hierarchy a bit so that we can add ModuleSharedContext to it. The boolean that tells whether the SharedContext is a GlobalSharedContext or a FunctionBox is gone. Instead, I'm making use of the fact that every SharedContext that is not a SharedGlobalContext is a specialisation of ObjectBox, and ObjectBox already has a mechanism to tell which specialisation we're dealing with.
Attachment #668587 - Flags: review?(jorendorff)
This patch adds a new type of SharedContext for modules.
Attachment #668784 - Flags: review?(jorendorff)
Comment on attachment 668587 [details] [diff] [review]
Patch to be reviewed

Review of attachment 668587 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/SharedContext.h
@@ +164,2 @@
>  
> +    virtual ObjectBox *toObjectBox() { return NULL; }

Either this should be called |asObjectBox| or |asGlobalSharedContext| and |asFunctionBox| need to be renamed.

@@ +164,4 @@
>  
> +    virtual ObjectBox *toObjectBox() { return NULL; }
> +    inline bool isGlobalSharedContext() { return !toObjectBox(); }
> +    inline bool isFunctionBox() { return toObjectBox() && toObjectBox()->isFunctionBox(); }

If the implementation is directly in the class, you can generally drop the |inline|. Or you just could just put the implementation in SharedContext-inl.h for consistency.
Comment on attachment 668784 [details] [diff] [review]
Patch to be reviewed

Review of attachment 668784 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/ParseNode.h
@@ +1468,2 @@
>      bool isFunctionBox() { return object->isFunction(); }
> +    ModuleBox *asModuleBox() { JS_ASSERT(isModuleBox()); return (ModuleBox *)(this); }

Fix C-cast.
Comment on attachment 667528 [details] [diff] [review]
Proposed patch

These patches are now under review in bug 568953.
Attachment #667528 - Flags: review?(jorendorff)
Attachment #668587 - Flags: review?(jorendorff)
Attachment #668784 - Flags: review?(jorendorff)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: