[BinAST] Port binjs to ast v3

RESOLVED FIXED in Firefox 61

Status

()

enhancement
P3
normal
RESOLVED FIXED
Last year
Last year

People

(Reporter: Yoric, Assigned: Yoric)

Tracking

(Depends on 1 bug, Blocks 2 bugs)

unspecified
mozilla61
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox61 fixed)

Details

Attachments

(7 attachments, 8 obsolete attachments)

59 bytes, text/x-review-board-request
jorendorff
: review+
froydnj
: review+
Details
59 bytes, text/x-review-board-request
froydnj
: review+
Details
59 bytes, text/x-review-board-request
arai
: review+
jorendorff
: review+
Details
59 bytes, text/x-review-board-request
arai
: review+
jorendorff
: review+
Details
59 bytes, text/x-review-board-request
arai
: review+
jorendorff
: review+
Details
59 bytes, text/x-review-board-request
arai
: review+
Details
59 bytes, text/x-review-board-request
arai
: review+
Details
We now have specifications for the Binary AST on: https://binast.github.io/ecmascript-binary-ast/

We need to port BinSource to this Binary AST.
Priority: -- → P3
I should add that the source code for Facebook was not included in the tests due to size issues.
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #6)
> I should add that the source code for Facebook was not included in the tests
> due to size issues.

That is, the files are too large for mozilla-central. The tests do pass locally.

Once we have implemented the size-efficient tokenizer, we should be able to upload source + size-efficient encoded files for Facebook.
Assignee: nobody → dteller
Comment on attachment 8951198 [details]
Bug 1437004 - BinAST test harness improvements;

https://reviewboard.mozilla.org/r/220458/#review226502

r=me with the few changes below.

::: js/src/jsapi-tests/testBinASTReader.cpp:13
(Diff revision 1)
>  
>  #if defined(XP_UNIX)
>  
>  #include <dirent.h>
>  #include <sys/stat.h>
> +#include <fcntl.h>

I think the style checker will complain about these headers not being in alphabetical order. To test this, in a standalone SpiderMonkey build dir, type `make check-style`.

::: js/src/jsapi-tests/testBinASTReader.cpp:107
(Diff revision 1)
> +                continue;
> +            }
> +            if (strcmp(d_name, "..") == 0) {
> +                continue;
> +            }
> +            UniqueChars subPath(static_cast<char*>(js_malloc(pathlen + /* separator */ 1 + namlen + /* NUL */1)));

Crash if allocation fails.

::: js/src/jsapi-tests/testBinASTReader.cpp:111
(Diff revision 1)
> +            }
> +            UniqueChars subPath(static_cast<char*>(js_malloc(pathlen + /* separator */ 1 + namlen + /* NUL */1)));
> +            strncpy(subPath.get(), path, pathlen); // Assuming that `path` ends with a directory separator.
> +            strncpy(subPath.get() + pathlen, d_name, namlen);
> +            subPath.get()[pathlen + namlen] = path[pathlen - 1]; // Directory separator.
> +            subPath.get()[pathlen + namlen + 1 ] = '\0';

Make a habit of avoiding direct calls to js_malloc and strncpy, and nontrivial array indices, for your poor reviewers' sake!

In this case, use `Vector<char>` instead. It has a reasonably nice `.append(begin, length)` method, whose safety is easier to check and more "local" (thus less likely to break when nearby code changes). It may cost a few nanoseconds, but that's OK here.
Attachment #8951198 - Flags: review?(jorendorff) → review+
Comment on attachment 8951199 [details]
Bug 1437004 - ParseNode::dump() now displays names for ObjectPropertyNames;

https://reviewboard.mozilla.org/r/220460/#review226534
Attachment #8951199 - Flags: review?(jorendorff) → review+
Comment on attachment 8951200 [details]
Bug 1437004 - Fixing null string behavior in BinAST tokenizer;

https://reviewboard.mozilla.org/r/220462/#review226538

::: js/src/frontend/BinTokenReaderTester.cpp:202
(Diff revision 1)
>      // 3. Check null string (no allocation)
>      if (byteLen == 2 && *current_ == 255 && *(current_ + 1) == 0) {
>          // Special case: null string.
>          out = Nothing();
>          current_ += byteLen;
> +        if (!readConst("</string>"))

I would have structured it like this:

    if (...) {
        // 3. Special case: null string (no allocation)
        out = Nothing();
    } else {
        // 4. Other strings (bytes are copied)
        ...
    }

    current_ += byteLen;
    if (!readConst("</string>"))
        return false;
    return true;

Your call, though. I like this because the bookends match: there's exactly one `readConst("</string>")` that matches the `readConst("<string>")` above, and they're at the same indentation level.
Attachment #8951200 - Flags: review?(jorendorff) → review+
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

I guess my knee-jerk reaction here is two-fold:

1. MUCH BETTER

2. Automatic r- because obviously we need to review and check in the code generator and a build step, not the generated code.

If the codegen is OCaml, well, we'll figure something out! :D

But consider how profoundly tedious and pointless it would be to review all the generated code line by line.
Attachment #8951201 - Flags: review-
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review226544

See comment 11.

::: js/src/frontend/BinToken.h:229
(Diff revision 1)
>  #define EMIT_ENUM(name, _) name,
>      FOR_EACH_BIN_KIND(EMIT_ENUM)
>  #undef EMIT_ENUM
>      BINKIND_LIMIT /* domain size */
>  };
> +        

Remember to delete trailing whitespace / spaces on blank lines in this file. (Also in your codegen script: there's some in the auto files too.)

::: js/src/frontend/BinToken.h:236
(Diff revision 1)
>  /**
>   * The different fields of Binary AST nodes, as per the specifications of
>   * Binary AST.
>   *
>   * Usage:
> - *
> +   *

Possible stray tab character here?

::: js/src/frontend/BinToken.h:314
(Diff revision 1)
>  #define EMIT_ENUM(name, _) name,
>      FOR_EACH_BIN_FIELD(EMIT_ENUM)
>  #undef EMIT_ENUM
>      BINFIELD_LIMIT /* domain size */
>  };
> -
> +        const char* describeBinKind(const BinKind& kind);

Style nit: this line shouldn't be indented (per our nasty style rule for namespaces).
Attachment #8951201 - Flags: review-
(In reply to Jason Orendorff [:jorendorff] from comment #11)
> Comment on attachment 8951201 [details]
> Bug 1437004 - Porting BinAST to AST v3;
> 
> I guess my knee-jerk reaction here is two-fold:
> 
> 1. MUCH BETTER
> 
> 2. Automatic r- because obviously we need to review and check in the code
> generator and a build step, not the generated code.
> 
> If the codegen is OCaml, well, we'll figure something out! :D

Well, it's in Rust (with a source spec in webidl and a configuration in yaml), it has a few dependencies and it builds with Rust Nightly. I could probably port it to Rust stable, though.

> But consider how profoundly tedious and pointless it would be to review all
> the generated code line by line.

Well, if you're up for reviewing the Rust generator, I'm up for packaging it nicely.
(I have no clue how to make it part of the Build system, though)
Comment on attachment 8951197 [details]
Bug 1437004 - Tests for BinAST v3 (data);

https://reviewboard.mozilla.org/r/220456/#review227028

which part am I supposed to review about this patch?
about comment #15
Flags: needinfo?(dteller)
Comment on attachment 8951198 [details]
Bug 1437004 - BinAST test harness improvements;

https://reviewboard.mozilla.org/r/220458/#review227030

r+ with the comments addressed

::: js/src/jsapi-tests/testBinASTReader.cpp:56
(Diff revision 1)
>  
>      if (!buf.appendAll(intermediate))
>          MOZ_CRASH("Couldn't read data");
>  }
>  
> -BEGIN_TEST(testBinASTReaderECMAScript2)
> +// Invariant: `path` must end with directory separator.

it would be better just adding assertion, instead of comment,
maybe after calculating `pathlen`.

::: js/src/jsapi-tests/testBinASTReader.cpp:93
(Diff revision 1)
>      for (bool found = (hFind != INVALID_HANDLE_VALUE);
>              found;
>              found = FindNextFile(hFind, &FindFileData))
>      {
>          const char* d_name = FindFileData.cFileName;
> +        const bool isDirectory = FindFileData.dwFileAttributes | FILE_ATTRIBUTE_DIRECTORY);

remove ")", also replace "|" to "&".

::: js/src/jsapi-tests/testBinASTReader.cpp:107
(Diff revision 1)
> +                continue;
> +            }
> +            if (strcmp(d_name, "..") == 0) {
> +                continue;
> +            }
> +            UniqueChars subPath(static_cast<char*>(js_malloc(pathlen + /* separator */ 1 + namlen + /* NUL */1)));

a space between "/* NUL */" and "1"

::: js/src/jsapi-tests/testBinASTReader.cpp:118
(Diff revision 1)
> +            continue;
> +        }
> +
> +        {
> +            JS::PrepareForFullGC(cx);
> +            cx->runtime()->gc.gc(GC_NORMAL, JS::gcreason::NO_REASON);

can you add a comment describes the reason why performing GC here?

also, putting GC between finding js and binjs seems somewhat strange.
I'd suggest moving it just before doing the actual test.
(if there's no reason to put this here)

::: js/src/jsapi-tests/testBinASTReader.cpp:143
(Diff revision 1)
>          js::Vector<char16_t> txtSource(cx);
>          readFull(cx, txtPath.get(), txtSource);
>  
>          // Parse text file.
> +        CompileOptions txtOptions(cx);
> +        txtOptions.setIntroductionType(txtPath.get());

`introductionType` is not a place to put file path.
please use `.setFileAndLine` instead.

::: js/src/jsapi-tests/testBinASTReader.cpp:148
(Diff revision 1)
> +        txtOptions.setIntroductionType(txtPath.get());
> +
>          UsedNameTracker txtUsedNames(cx);
>          if (!txtUsedNames.init())
>              MOZ_CRASH("Couldn't initialize used names");
> -        js::frontend::Parser<js::frontend::FullParseHandler, char16_t> parser(cx, cx->tempLifoAlloc(), options, txtSource.begin(), txtSource.length(),
> +        js::frontend::Parser<js::frontend::FullParseHandler, char16_t> txtParser(cx, allocScope.alloc(), txtOptions, txtSource.begin(), txtSource.length(),

can you somehow wrap these lines into 99 chars?
maybe just by putting newline after "("

::: js/src/jsapi-tests/testBinASTReader.cpp:234
(Diff revision 1)
>          if (strcmp(binPrinter.string(), txtPrinter.string()) != 0) {
> -            fprintf(stderr, "Got distinct ASTs when parsing %s:\n\tBINARY\n%s\n\n\tTEXT\n%s\n", txtPath.get(), binPrinter.string(), txtPrinter.string());
> +            fprintf(stderr, "Got distinct ASTs when parsing %s (%lu/%lu):\n\tBINARY\n%s\n\n\tTEXT\n%s\n",
> +                txtPath.get(),
> +                binPrinter.getOffset(), txtPrinter.getOffset(),
> +                binPrinter.string(), txtPrinter.string());
> +#if 0 // Not for release

if you put this here, can you add some comment what this is doing?
Attachment #8951198 - Flags: review?(arai.unmht) → review+
Comment on attachment 8951199 [details]
Bug 1437004 - ParseNode::dump() now displays names for ObjectPropertyNames;

https://reviewboard.mozilla.org/r/220460/#review227034
Attachment #8951199 - Flags: review?(arai.unmht) → review+
Comment on attachment 8951200 [details]
Bug 1437004 - Fixing null string behavior in BinAST tokenizer;

https://reviewboard.mozilla.org/r/220462/#review227032

I agree with jorendorff
Attachment #8951200 - Flags: review?(arai.unmht) → review+
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review227036

about the auto-generation, I think it's fine not to integrate the script into build system,
but just to put script and the source files (or the script to retrieve source files into temporary place and generate from it) in the tree,
and manually run them to update when necessary.
(Unicode things are using this way)

::: js/src/frontend/BinSource-auto.h:1
(Diff revision 1)
> +// This file is autogenerated. It is meant to be included from the declaration

it would be nice to describe how it's generated,
(like, source file, and generator script),
so that when some issue is spotted in this file, it's clear that which file to modify.

also, please put the same comment about auto-generation to BinSource-auto.cpp

::: js/src/frontend/BinSource-auto.h:72
(Diff revision 1)
> +    Const
> +};
> +
> +
> +
> +    // ----- Sums of interfaces (by lexicographical order)

it might be nice to align to either using indent or not, to comments.

::: js/src/frontend/BinSource-auto.cpp:7
(Diff revision 1)
> +* vim: set ts=8 sts=4 et sw=4 tw=99:
> +* This Source Code Form is subject to the terms of the Mozilla Public
> +* License, v. 2.0. If a copy of the MPL was not distributed with this
> +* file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +// FIXME: Verify that identifiers are valid.

what does "identifiers" here mean?

::: js/src/frontend/BinSource-auto.cpp:40
(Diff revision 1)
> +using Chars = BinTokenReaderTester::Chars;
> +using NameBag = GCHashSet<JSString*>;
> +using Names = GCVector<JSString*, 8>;
> +using UsedNamePtr = UsedNameTracker::UsedNameMap::Ptr;
> +    
> +// Evaluate an expression, checking that the result is not 0.

"... is not 0 or nullptr" or maybe just "... is truthy"

::: js/src/frontend/BinSource-auto.cpp:60
(Diff revision 1)
> +    } while (false)
> +
> +#define TRY_DECL(VAR, EXPR) \
> +    auto VAR = EXPR; \
> +    if (!VAR) \
> +    return cx_->alreadyReportedError();

one more indentation

::: js/src/frontend/BinSource.h:94
(Diff revision 1)
>  
> +    MOZ_MUST_USE mozilla::GenericErrorResult<JS::Error&> raiseMissingVariableInAssertedScope(JSAtom* name);
> +    MOZ_MUST_USE mozilla::GenericErrorResult<JS::Error&> raiseMissingDirectEvalInAssertedScope();
>      MOZ_MUST_USE mozilla::GenericErrorResult<JS::Error&> raiseInvalidKind(const char* superKind, const BinKind kind);
>      MOZ_MUST_USE mozilla::GenericErrorResult<JS::Error&> raiseInvalidField(const char* kind, const BinField field);
> +    MOZ_MUST_USE mozilla::GenericErrorResult<JS::Error&> raiseInvalidNumberOfFields(const BinKind kind, const uint32_t expected, const uint32_t got);

please wrap this into 99 chars

::: js/src/frontend/BinSource.h:117
(Diff revision 1)
> +    checkFields(const BinKind kind, const BinFields& actual, const BinField (&expected)[N]);
> +    JS::Result<Ok, JS::Error&>
> +    checkFields0(const BinKind kind, const BinFields& actual);
> +
> +    JS::Result<ParseNode*>
> +    buildFunction(const size_t start, const BinKind kind, ParseNode* name, ParseNode* params, ParseNode* body, FunctionBox* funbox);

please wrap this into 99 chars

::: js/src/frontend/BinSource.h:119
(Diff revision 1)
> +    checkFields0(const BinKind kind, const BinFields& actual);
> +
> +    JS::Result<ParseNode*>
> +    buildFunction(const size_t start, const BinKind kind, ParseNode* name, ParseNode* params, ParseNode* body, FunctionBox* funbox);
> +    JS::Result<FunctionBox*>
> +    buildFunctionBox(bool isGenerator, bool isAsync);

if it's possible, it would be nice to use GeneratorKind/FunctionAsyncKind,
instead of passing multiple boolean values.

::: js/src/frontend/BinSource.cpp:162
(Diff revision 1)
>  
>      return result; // Magic conversion to Ok.
>  }
>  
> -Result<ParseNode*>
> -BinASTParser::parseProgram()
> +JS::Result<FunctionBox*>
> +BinASTParser::buildFunctionBox(bool isGenerator, bool isAsync) {

please put "{" in the next its own line

::: js/src/frontend/BinSource.cpp:166
(Diff revision 1)
> -Result<ParseNode*>
> -BinASTParser::parseProgram()
> -{
> -    BinKind kind;
> -    BinFields fields(cx_);
> -    AutoTaggedTuple guard(*tokenizer_);
> +JS::Result<FunctionBox*>
> +BinASTParser::buildFunctionBox(bool isGenerator, bool isAsync) {
> +    // Allocate the function before walking down the tree.
> +    RootedFunction fun(cx_);
> +    TRY_VAR(fun, NewFunctionWithProto(cx_,
> +            /*native*/nullptr,

/* native = */ nullptr,
same for others.

::: js/src/frontend/BinSource.cpp:168
(Diff revision 1)
> -{
> -    BinKind kind;
> -    BinFields fields(cx_);
> -    AutoTaggedTuple guard(*tokenizer_);
> -
> -    TRY(tokenizer_->enterTaggedTuple(kind, fields, guard));
> +    // Allocate the function before walking down the tree.
> +    RootedFunction fun(cx_);
> +    TRY_VAR(fun, NewFunctionWithProto(cx_,
> +            /*native*/nullptr,
> +            /*nargs placeholder*/0,
> +            /*flags */ JSFunction::INTERPRETED_NORMAL,

this doesn't need comment, since the value is descriptive

::: js/src/frontend/BinSource.cpp:172
(Diff revision 1)
> -
> -    TRY(tokenizer_->enterTaggedTuple(kind, fields, guard));
> -    if (kind != BinKind::Program)
> -        return this->raiseInvalidKind("Program", kind);
> -
> -    ParseNode* result;
> +            /*nargs placeholder*/0,
> +            /*flags */ JSFunction::INTERPRETED_NORMAL,
> +            /*enclosing env*/nullptr,
> +            /*name placeholder*/ nullptr,
> +            /*proto*/ nullptr,
> +            /*alloc*/gc::AllocKind::FUNCTION,

here too

::: js/src/frontend/BinSource.cpp:179
(Diff revision 1)
> +    ));
> +    TRY_DECL(funbox, alloc_.new_<FunctionBox>(cx_,
> +        traceListHead_,
> +        fun,
> +        /* toStringStart = */0,
> +        /* directives = */Directives(parseContext_),

and here

::: js/src/frontend/BinSource.cpp:189
(Diff revision 1)
> +                : FunctionAsyncKind::SyncFunction
> +    ));
>  
> -    TRY(guard.done());
> -    return result;
> +    traceListHead_ = funbox;
> +    FunctionSyntaxKind syntax = PrimaryExpression; // FIXME - What if we're assigning?
> +        // FIXME: The only thing we need to know is whether this is a ClassConstructor/DerivedClassConstructor

wrong indentation

::: js/src/frontend/BinSource.cpp:195
(Diff revision 1)
> +    funbox->initWithEnclosingParseContext(parseContext_, syntax);
> +    return funbox;
>  }
>  
>  JS::Result<ParseNode*>
> -BinASTParser::parseBlockStatement()
> +BinASTParser::buildFunction(const size_t start, const BinKind kind, ParseNode* name, ParseNode* params, ParseNode* body, FunctionBox* funbox) {

please wrap this into 99 chars, and also put "{" in the next line

::: js/src/frontend/BinSource.cpp:205
(Diff revision 1)
> -    switch (kind) {
> -      default:
> -        return raiseInvalidKind("BlockStatement", kind);
> -      case BinKind::BlockStatement:
> -        MOZ_TRY_VAR(result, parseBlockStatementAux(kind, fields));
> -        break;
> +        atom = name->name();
> +
> +
> +    funbox->function()->setArgCount(uint16_t(params->pn_count));
> +    funbox->function()->initAtom(atom);
> +    

remove whitespaces

::: js/src/frontend/BinSource.cpp:210
(Diff revision 1)
> -        break;
> +    
> +    // ParseNode represents the body as concatenated after the params.
> +    params->appendWithoutOrderAssumption(body);
> +
> +    TRY_DECL(result, kind == BinKind::FunctionDeclaration
> +                       ? factory_.newFunctionStatement(pos)

align "?" with "k" of "kind" above

::: js/src/frontend/BinSource.cpp:217
(Diff revision 1)
> +
> +    factory_.setFunctionBox(result, funbox);
> +    factory_.setFunctionFormalParametersAndBody(result, params);
> +
> +    HandlePropertyName dotThis = cx_->names().dotThis;
> +    const bool declareThis = hasUsedName(dotThis) || funbox->bindingsAccessedDynamically() || funbox->isDerivedClassConstructor();

please wrap this into 99 chars

::: js/src/frontend/BinSource.cpp:268
(Diff revision 1)
> -{
> -    BinKind kind;
> -    BinFields fields(cx_);
> -    AutoTaggedTuple guard(*tokenizer_);
> -
> -    TRY(tokenizer_->enterTaggedTuple(kind, fields, guard));
> +    for (uint32_t i = 0; i < length; ++i) {
> +        name = nullptr;
> +
> +        MOZ_TRY(readString(&name));
> +        auto ptr = scope.lookupDeclaredNameForAdd(name);
> +        fprintf(stderr, "parseAndUpdateScopeNames adding variable declaration\n");

please put this and the next line into #ifdef DEBUG

::: js/src/frontend/BinSource.cpp:288
(Diff revision 1)
> -    ParseContext::Scope scope(cx_, parseContext_, usedNames_);
> -    TRY(scope.init(parseContext_));
> -
> -    ParseNode* body(nullptr);
> -    ParseNode* directives(nullptr); // FIXME: Largely ignored
> -
> +    ParseContext::Scope& scope =
> +        variableDeclarationKind == VariableDeclarationKind::Var
> +            ? parseContext_->varScope()
> +            : *parseContext_->innermostScope();
> +
> +    fprintf(stderr, "Looking up declared name in scope\n");

here too

::: js/src/frontend/BinSource.cpp:433
(Diff revision 1)
> -mozilla::GenericErrorResult<JS::Error&>
> -BinASTParser::raiseInvalidKind(const char* superKind, const BinKind kind)
>  {
>      Sprinter out(cx_);
>      TRY(out.init());
> -    TRY(out.printf("In %s, invalid kind %s", superKind, describeBinKind(kind)));
> +    TRY(out.printf("In %s, invalid number of fields: expected %u, got %u", describeBinKind(kind), expected, got));

please wrap this into 99 chars

::: js/src/frontend/BinToken.h:18
(Diff revision 1)
>   *
>   * The mapping between Kind and list of fields is determined entirely by
>   * the grammar of Binary AST. The mapping between (Kind, Name) and the structure
>   * of Value is also determined entirely by the grammar of Binary AST.
>   *
> - * As per the specifications of Binary AST, kinds may be added as the language
> +  * As per the specifications of Binary AST, kinds may be added as the language

revert the change to this line
Attachment #8951201 - Flags: review?(arai.unmht)
Duplicate of this bug: 1417854
Duplicate of this bug: 1417851
Duplicate of this bug: 1417858
Comment on attachment 8951197 [details]
Bug 1437004 - Tests for BinAST v3 (data);

https://reviewboard.mozilla.org/r/220456/#review227028

Ah, you're right, there doesn't seem to be much to review here. I'll be happy with a blanket rubberstamp for this patch :)
Flags: needinfo?(dteller)
Comment on attachment 8951198 [details]
Bug 1437004 - BinAST test harness improvements;

https://reviewboard.mozilla.org/r/220458/#review227030

> can you add a comment describes the reason why performing GC here?
> 
> also, putting GC between finding js and binjs seems somewhat strange.
> I'd suggest moving it just before doing the actual test.
> (if there's no reason to put this here)

> also, putting GC between finding js and binjs seems somewhat strange.

What do you mean between finding js and binjs?

This is where we know that we're going to deal with a js/binjs pair, not with a directory.
Comment on attachment 8951197 [details]
Bug 1437004 - Tests for BinAST v3 (data);

https://reviewboard.mozilla.org/r/220456/#review227840

::: js/src/jsapi-tests/binast/parser/unit/catch_implicit.js:4
(Diff revision 1)
> +try {
> +
> +} catch (e) {
> +    

trailing spaces here
Attachment #8951197 - Flags: review?(arai.unmht) → review+
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review227036

Indeed, I believe that adding the generation into the build system is going to take me several weeks of work *and* will make the reviewer's life harder, as the generator itself is going to change more often than the webidl/yaml source files.

Now, do you have suggestions on where to put these files? Keep in mind that this will be a full cargo project.
Attachment #8951198 - Attachment is obsolete: true
(In reply to Jason Orendorff [:jorendorff] from comment #11)
> Comment on attachment 8951201 [details]
> Bug 1437004 - Porting BinAST to AST v3;
> 
> I guess my knee-jerk reaction here is two-fold:
> 
> 1. MUCH BETTER
> 
> 2. Automatic r- because obviously we need to review and check in the code
> generator and a build step, not the generated code.
> 
> If the codegen is OCaml, well, we'll figure something out! :D
> 
> But consider how profoundly tedious and pointless it would be to review all
> the generated code line by line.

I can work on that (bug 1439645) but I'm not a very big fan of making this a blocker.

Yes, reviewing ~3000 lines of generator, ~900 lines of spec and ~1000 lines of yaml script is certainly more fun than reviewing ~8000 lines of generated code, but I suspect that the meta nature (even meta-meta, as I'm performing analysis and rewriting on the specs themselves before generating code) doesn't really help that much determine whether the generated code works. Also, I know that the generator itself is going to change soonish, once we switch from having a single tokenizer to having two, and I suspect that reading the changes from the generator won't be very helpful.

What do you think, jorendorff?
Flags: needinfo?(jorendorff)
Comment on attachment 8952773 [details]
Bug 1437004 - Updating tokenizer tests to ast v3;

https://reviewboard.mozilla.org/r/221998/#review228150
Attachment #8952773 - Flags: review?(arai.unmht) → review+
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review228146

r+ with the comments addressed.

::: commit-message-6dc0c:5
(Diff revision 2)
> +Bug 1437004 - Porting BinAST to AST v3;r?arai,jorendoff
> +
> +This patch is a nearly complete reimplementation of BinASTReader, with the following changes:
> +
> +- Files BinToken.h, BinSource-auto.h (new), BinSource-auto.cpp (new) are now autogenerated by an out-of-tree code generator from the webidl specifications of BinAST and a small configuration file.

if you're going to land without generator for now, please put the URL of the generator source here.
(or bug number)

::: js/src/frontend/BinSource-auto.h:1
(Diff revision 2)
> +// This file was autogenerated by generate_spidermonkey, please do not modify by hand.

please wrap this into 80 chars, in the generator

::: js/src/frontend/BinSource-auto.h:8
(Diff revision 2)
> +/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 4 -*-
> +* vim: set ts=8 sts=4 et sw=4 tw=99:
> +* This Source Code Form is subject to the terms of the Mozilla Public
> +* License, v. 2.0. If a copy of the MPL was not distributed with this
> +* file, You can obtain one at http://mozilla.org/MPL/2.0/. */

modelines and copyright should be placed before other comments above.

(the current order might be the result of the restriction comes from the generator tho...)

::: js/src/frontend/BinSource-auto.h:17
(Diff revision 2)
> +* file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +
> +// ----- Declaring string enums (by lexicographical order)
> +enum class BinaryOperator {
> +    Comma      /* "," */,

might be nice to either aligning the comment:
```
  Comma        /* "," */,
  LogicalOr    /* "||" */,
```
or use 2 lines for comment and name:
```
  // ","
  Comma,
  // "||"
  LogicalOr,
```

but up to you.

::: js/src/frontend/BinSource-auto.cpp:41
(Diff revision 2)
> +using BinFields = BinTokenReaderTester::BinFields;
> +using Chars = BinTokenReaderTester::Chars;
> +using NameBag = GCHashSet<JSString*>;
> +using Names = GCVector<JSString*, 8>;
> +using UsedNamePtr = UsedNameTracker::UsedNameMap::Ptr;
> +    

trailing whitespaces here, and in several places.

if you haven't yet, I'd suggest configuring your text editor to highlight them, so that it's easier to notice them.

::: js/src/frontend/BinSource-auto.cpp:92
(Diff revision 2)
> +    if (actual.length() != N)
> +        return raiseInvalidNumberOfFields(kind, N, actual.length());
> +
> +    for (size_t i = 0; i < N; ++i) {
> +        if (actual[i] != expected[i])
> +            return raiseInvalidField(describeBinKind(kind), actual[0]);

actual[i]

::: js/src/frontend/BinSource-auto.cpp:129
(Diff revision 2)
> +*/
> +JS::Result<ParseNode*>
> +BinASTParser::parseAssignmentTarget()
> +{
> +    BinKind kind;
> +    BinFields fields((cx_));

extra parens

::: js/src/frontend/BinSource-auto.cpp:2784
(Diff revision 2)
> +
> +    MOZ_TRY_DECL(right, parseExpression());
> +
> +    ParseNodeKind pnk;
> +    switch (operator_) {
> +        case BinaryOperator::Comma:

please remove 2 spaecs before case

::: js/src/frontend/BinSource-auto.cpp:5999
(Diff revision 2)
> +        if (operand->isKind(ParseNodeKind::Name))
> +            pnk = ParseNodeKind::TypeOfName;
> +        else
> +            pnk = ParseNodeKind::TypeOfExpr;
> +        break;
> +        }

indentation is wrong

::: js/src/frontend/BinSource-auto.cpp:6400
(Diff revision 2)
> +    Chars chars((cx_));
> +    MOZ_TRY(readString(chars));
> +
> +    if (chars == ",")
> +        return BinaryOperator::Comma;
> +    else if (chars == "||")

you can remove `else` after `return`

::: js/src/frontend/BinSource-auto.cpp:6600
(Diff revision 2)
> +    TRY(tokenizer_->enterList(length, guard));
> +    TRY_DECL(result, factory_.newList(ParseNodeKind::ParamsBody, tokenizer_->pos(start)));
> +
> +    for (uint32_t i = 0; i < length; ++i) {
> +        MOZ_TRY_DECL(item, parseSpreadElementOrExpression());
> +    factory_.addList(/* list = */ result, /* child = */ item);

indentation seems to be wrong

::: js/src/frontend/BinSource.h:123
(Diff revision 2)
> +    checkFields(const BinKind kind, const BinFields& actual, const BinField (&expected)[N]);
> +    JS::Result<Ok, JS::Error&>
> +    checkFields0(const BinKind kind, const BinFields& actual);
> +
> +    JS::Result<ParseNode*>
> +    buildFunction(const size_t start, const BinKind kind, ParseNode* name, ParseNode* params, ParseNode* body, FunctionBox* funbox);

please wrap this line into 99 chars,
maybe some other places

::: js/src/frontend/BinSource.cpp:170
(Diff revision 2)
> -    BinFields fields(cx_);
> -    AutoTaggedTuple guard(*tokenizer_);
> -
> -    TRY(tokenizer_->enterTaggedTuple(kind, fields, guard));
> -    if (kind != BinKind::Program)
> -        return this->raiseInvalidKind("Program", kind);
> +    RootedFunction fun(cx_);
> +    TRY_VAR(fun, NewFunctionWithProto(cx_,
> +            /* native = */ nullptr,
> +            /* nargs placeholder = */ 0,
> +            JSFunction::INTERPRETED_NORMAL,
> +            /* enclosing env = */nullptr,

s/enclosing env/enclosingEnv/

also, needs a space before "nullptr"

::: js/src/frontend/BinSource.cpp:179
(Diff revision 2)
> +            TenuredObject
> +    ));
> +    TRY_DECL(funbox, alloc_.new_<FunctionBox>(cx_,
> +        traceListHead_,
> +        fun,
> +        /* toStringStart = */0,

a space before 0

::: js/src/frontend/BinSource.cpp:181
(Diff revision 2)
> +    TRY_DECL(funbox, alloc_.new_<FunctionBox>(cx_,
> +        traceListHead_,
> +        fun,
> +        /* toStringStart = */0,
> +        Directives(parseContext_),
> +        /* extraWarning = */false,

here too

::: js/src/frontend/BinSource.cpp:195
(Diff revision 2)
> +    return funbox;
>  }
>  
>  JS::Result<ParseNode*>
> -BinASTParser::parseBlockStatement()
> +BinASTParser::buildFunction(const size_t start, const BinKind kind, ParseNode* name,
> +    ParseNode* params, ParseNode* body, FunctionBox* funbox)

please align "P" of "ParseNode" to "c" of "const" above.

::: js/src/frontend/BinSource.cpp:199
(Diff revision 2)
>  {
> -    BinKind kind;
> +    TokenPos pos = tokenizer_->pos(start);
> -    BinFields fields(cx_);
> -    AutoTaggedTuple guard(*tokenizer_);
>  
> -    TRY(tokenizer_->enterTaggedTuple(kind, fields, guard));
> +    RootedAtom atom((cx_));

remove extra parens

::: js/src/frontend/BinSource.cpp:218
(Diff revision 2)
> +    const bool declareThis = hasUsedName(dotThis)
> +        || funbox->bindingsAccessedDynamically()
> +        || funbox->isDerivedClassConstructor();

please put "||" at the end of line, and align "f" of "funbox" to "c" of "cx\_" above

::: js/src/frontend/BinSource.cpp:227
(Diff revision 2)
> +    if (declareThis) {
> +        ParseContext::Scope& funScope = parseContext_->functionScope();
> +        ParseContext::Scope::AddDeclaredNamePtr p = funScope.lookupDeclaredNameForAdd(dotThis);
> +        MOZ_ASSERT(!p);
> +        TRY(funScope.addDeclaredName(parseContext_, p, dotThis, DeclarationKind::Var,
> +                                      DeclaredNameInfo::npos));

please align "D" of "DeclaredNameInfo" to "p" of "parseContext\_" above

::: js/src/frontend/BinSource.cpp:233
(Diff revision 2)
> +        funbox->setHasThisBinding();
>      }
>  
> -    TRY(guard.done());
> +    TRY_DECL(bindings,
> +             NewFunctionScopeData(cx_, parseContext_->functionScope(),
> +                                  /* hasParameterExprs = */false, alloc_, parseContext_));

a space before "false"

::: js/src/frontend/BinSource.cpp:243
(Diff revision 2)
>  }
>  
>  JS::Result<Ok>
> -BinASTParser::parseAndUpdateScopeNames(ParseContext::Scope& scope, DeclarationKind kind)
> +BinASTParser::parseAndUpdateCapturedNames()
>  {
> +    // Ignored for the moment.

might be better describing a bit more what this is ignoring.

::: js/src/frontend/BinSource.cpp:285
(Diff revision 2)
> -JS::Result<ParseNode*>
> -BinASTParser::parseBlockStatementAux(const BinKind kind, const BinFields& fields)
> +JS::Result<Ok>
> +BinASTParser::checkBinding(JSAtom* name)
>  {
> -    ParseContext::Statement stmt(parseContext_, StatementKind::Block);
> -    ParseContext::Scope scope(cx_, parseContext_, usedNames_);
> -    TRY(scope.init(parseContext_));
> +    // Check that the variable appears in the corresponding scope.
> +    ParseContext::Scope& scope =
> +        variableDeclarationKind == VariableDeclarationKind::Var

where does `variableDeclarationKind` come from?

::: js/src/frontend/BinSource.yaml:131
(Diff revision 2)
> +            // This file is meant to be included from the declaration
> +            // of class `BinASTParser`. The include may be public or private.

these lines should be placed after modelines and license.

::: js/src/frontend/BinSource.yaml:198
(Diff revision 2)
> +             * - a list of fields, where each field is:
> +             *    - a Name (see `BinField`);
> +             *    - a Value, which may be either a node or a primitive value.
> +             *
> +             * The mapping between Kind and list of fields is determined entirely by
> +             * the grammar of Binary AST. The mapping between (Kind, Name) and the structure

please wrap this line into 80 chars.
(btw, is it necessary to indent those lines?  I think this makes it harder to notice long lines...

::: js/src/frontend/BinToken.h:240
(Diff revision 2)
>  #undef EMIT_ENUM
> -    BINKIND_LIMIT /* domain size */
>  };
>  
> -const char* describeBinKind(const BinKind& kind);
> +// The number of distinct values of BinKind.
> +const size_t BINKIND_LIMIT = 171;

this seems to be unused

::: js/src/frontend/BinToken.h:327
(Diff revision 2)
>  #undef EMIT_ENUM
> -    BINFIELD_LIMIT /* domain size */
>  };
>  
> +// The number of distinct values of BinField.
> +const size_t BINFIELD_LIMIT = 61;

this too
Attachment #8951201 - Flags: review?(arai.unmht) → review+
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #33)
> Yes, reviewing ~3000 lines of generator, ~900 lines of spec and ~1000 lines
> of yaml script is certainly more fun than reviewing ~8000 lines of generated
> code, but I suspect that the meta nature (even meta-meta, as I'm performing
> analysis and rewriting on the specs themselves before generating code)
> doesn't really help that much determine whether the generated code works.
> Also, I know that the generator itself is going to change soonish, once we
> switch from having a single tokenizer to having two, and I suspect that
> reading the changes from the generator won't be very helpful.
> 
> What do you think, jorendorff?

bz helped me sort through this; here's my proposal:

-   Add the generator and its input files somewhere under js/src

-   Get review from a "C++/Rust usage, tools, and style" peer,
    to make sure we did that right

-   Don't bother teaching the build system how to run Rust on host

-   Check in the generated files too (complete with DO NOT EDIT banner,
    already in place--though louder might be better)

-   Developers have to be able to update those generated files easily,
    so document that, maybe add a make target for it

-   Add a test that builds and runs the generator, and checks that the output
    is identical to the checked-in *-Auto.* files. This only has to run on
    platforms where it's easy to support; the output should be the same everywhere.
    If I touch the webidl and forget to update, I'll get backed out; that's OK.

I hope this will help stave off death by yak shaving, for a while anyway.
Flags: needinfo?(jorendorff) → needinfo?(dteller)
For context, my current yak shaving are:

I. Make it possible to build the Rust code from mozilla-central.
 I.1. Which requires vendoring the dependencies.
  I.1.a Which requires finding someone to review licenses.
   I.1.a.i Which requires finding someone to replace Gerv.
  I.1.b Which requires finding a way to reduce the size of dependencies, which exceeds the hardcoded limits / or to change the hardcoded limits.
   I.1.b.i Which requires... I've actually no idea how to reduce the size of dependencies, but I'm sure I'll find a way.
 I.2. Which also requires finding how the generator fits in the workspace config we have in place.

As you can see, I already have my plate full without also patching the build system. So, I appreciate the simplification :)
Flags: needinfo?(dteller)
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review228146

> trailing whitespaces here, and in several places.
> 
> if you haven't yet, I'd suggest configuring your text editor to highlight them, so that it's easier to notice them.

Sorry about that. I thought I had fixed these.

> remove extra parens

Odd, I was convinced that this was a Most Vexing Parse, but it apparently works.

> where does `variableDeclarationKind` come from?

It's a member of `BinASTParser`, declared in `BinSource.h`. I'd be happy to prefix it with a `m`, but it's apparently against SpiderMonkey naming conventions :)

More seriously, I have added a class `AutoVariableDeclarationKind` to restore it upon scope exit.

> please wrap this line into 80 chars.
> (btw, is it necessary to indent those lines?  I think this makes it harder to notice long lines...

As far as I can tell, that's required by the yaml syntax.

> this seems to be unused

I believe it will be necessary for the next version of the tokenizer. Plus it's autogenerated, so if you don't mind, I'd like to keep it.
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review228146

> It's a member of `BinASTParser`, declared in `BinSource.h`. I'd be happy to prefix it with a `m`, but it's apparently against SpiderMonkey naming conventions :)
> 
> More seriously, I have added a class `AutoVariableDeclarationKind` to restore it upon scope exit.

It's a field of `BinASTParser`.

> As far as I can tell, that's required by the yaml syntax.

This seems to be a requirement of the syntax of yaml.

> I believe it will be necessary for the next version of the tokenizer. Plus it's autogenerated, so if you don't mind, I'd like to keep it.

My initial experiments with implementing the true tokenizer need it, plus it's autogenerated. Unless you have an objection, I'd like to keep it.
Comment on attachment 8958390 [details]
Bug 1437004 - Vendoring dependencies of binjs_meta;

https://reviewboard.mozilla.org/r/227354/#review233224

This sort of patch is basically unreviewable in mozreview, unfortunately.

The below needs to be addressed.  I'm going to assume that all the packages are vendored appropriately.

::: python/mozbuild/mozbuild/vendor_rust.py:147
(Diff revision 1)
>          LICENSE_WHITELIST = [
>              'Apache-2.0',
>              'Apache-2.0 / MIT',
>              'Apache-2.0/MIT',
>              'Apache-2 / MIT',
> +            'BSD-2-Clause', # binjs_meta (only used at build time)

As the comment preceeding this says, my review is insufficient for this change.

However, I think it *could* be if we restructed the  check like so:

1. Check if the license is on `LICENSE_WHITELIST`.  If so, OK.
2. Have a separate `BUILD_TIME_LICENSE_WHITELIST`, mapping licenses to packages that we "know" will only be used at build time.
3. Add entries for the above for the BSD licenses, and `binjs_meta`/`bindgen` in the appropriate places.
4. Remove the BSD licenses from `LICENSE_WHITELIST`.

Bonus points if we actually checked that the packages in `BUILD_TIME_LICENSE_WHITELIST` were never used in `[dependencies]` for any packages, but always `[dev-dependencies]`.
Attachment #8958390 - Flags: review?(nfroyd)
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review233218

I don't know if I have enough domain expertise to review the actual parser; if you really want me to, I can try...

Some semi-nitpicky comments below.  Feel free to ignore the ones you don't like.

::: js/src/frontend/binsource/src/main.rs:248
(Diff revision 1)
> +    fn export_declare_kinds_and_fields_enums(&self, buffer: &mut String) {
> +        buffer.push_str(&self.rules.hpp_tokens_header.reindent(""));

Does Rust not have a `stringstream` or similar construct?

::: js/src/frontend/binsource/src/main.rs:305
(Diff revision 1)
> +        let mut string_enums_by_name : Vec<_> = self.syntax.string_enums_by_name()
> +            .iter()
> +            .collect();
> +        string_enums_by_name.sort_by(|a, b| str::cmp(a.0.to_str(), b.0.to_str()));

This pattern is all over the place; maybe use `itertools`'s, `sorted_by` to clean it up?  That would let you drop the `mut` in a number of different places, too.

::: js/src/frontend/binsource/src/main.rs:354
(Diff revision 1)
> +            .collect();
> +        interfaces_by_name.sort_by(|a, b| str::cmp(a.0.to_str(), b.0.to_str()));
> +
> +        let mut outer_parsers = Vec::with_capacity(interfaces_by_name.len());
> +        let mut inner_parsers = Vec::with_capacity(interfaces_by_name.len());
> +        

Nit: whitespace.

::: js/src/frontend/binsource/src/main.rs:573
(Diff revision 1)
> +            name.to_str(), contents.to_str());
> +
> +        let rules_for_this_node = self.rules.get(name);
> +
> +        let type_ok = self.get_type_ok(name, "ParseNode*");
> +        let default_value = 

Nit: whitespace.

::: js/src/frontend/binsource/src/main.rs:582
(Diff revision 1)
> +                "nullptr"
> +            }.to_string();
> +
> +        // At this stage, thanks to deanonymization, `contents`
> +        // is something like `OptionalFooBar`.
> +        let named_implementation = 

Nit: whitespace.

::: js/src/frontend/binsource/src/main.rs:993
(Diff revision 1)
> +    } else if let Some(as_str) = entry.as_str() {
> +        *rule = Some(as_str.to_string());
> +        Some(Some(()))
> +    } else {
> +        None
> +    }    

Nit: whitespace.

::: js/src/frontend/binsource/src/main.rs:1025
(Diff revision 1)
> +            Arg::with_name("OUT_WEBIDL_FILE")
> +                .long("out-webidl")
> +                .required(false)
> +                .takes_value(true)
> +                .help("Path to copy the webidl source"),
> +            Arg::with_name("OUT_YAML_FILE")
> +                .long("out-yaml")
> +                .required(false)
> +                .takes_value(true)
> +                .help("Path to copy the yaml source"),

Out of curiosity, why even have these, if the files are just going to be copied anyway?  Can't consumers just be directed to the originals instead?

::: js/src/frontend/binsource/src/main.rs:1236
(Diff revision 1)
> +    let dest_path = matches.value_of("OUT_HEADER_CLASS_FILE")
> +        .unwrap();
> +    println!("...exporting C++ class header code: {}", dest_path);
> +    let mut dest = File::create(format!("{}", dest_path))
> +        .expect("Could not create C++ header source output");
> +    dest.write_all(exporter.to_spidermonkey_class_hpp().as_bytes())
> +        .expect("Could not write C++ header source output");

Two things:

1. This gets repeated three times...maybe write a small function to factor out common bits?
2. Is `format!("{}", dest_path)` really the best way to get a string out of `dest_path`?
Attachment #8958389 - Flags: review?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #47)
> Comment on attachment 8958390 [details]
> Bug 1437004 - Vendoring dependencies of binjs_meta;
> 
> https://reviewboard.mozilla.org/r/227354/#review233224
> 
> This sort of patch is basically unreviewable in mozreview, unfortunately.
> 
> The below needs to be addressed.  I'm going to assume that all the packages
> are vendored appropriately.

I just patched vendor_rust.py and ran `./mach vendor rust`.


> ::: python/mozbuild/mozbuild/vendor_rust.py:147
> (Diff revision 1)
> >          LICENSE_WHITELIST = [
> >              'Apache-2.0',
> >              'Apache-2.0 / MIT',
> >              'Apache-2.0/MIT',
> >              'Apache-2 / MIT',
> > +            'BSD-2-Clause', # binjs_meta (only used at build time)
> 
> As the comment preceeding this says, my review is insufficient for this
> change.

Apparently, we already use BSD-2-Clause for jemalloc: https://bugzilla.mozilla.org/show_bug.cgi?id=1444607#c2, so it doesn't look like we need any specific review.

> However, I think it *could* be if we restructed the  check like so:
> 
> 1. Check if the license is on `LICENSE_WHITELIST`.  If so, OK.
> 2. Have a separate `BUILD_TIME_LICENSE_WHITELIST`, mapping licenses to
> packages that we "know" will only be used at build time.
> 3. Add entries for the above for the BSD licenses, and
> `binjs_meta`/`bindgen` in the appropriate places.
> 4. Remove the BSD licenses from `LICENSE_WHITELIST`.

Is this really necessary for this patch?

> Bonus points if we actually checked that the packages in
> `BUILD_TIME_LICENSE_WHITELIST` were never used in `[dependencies]` for any
> packages, but always `[dev-dependencies]`.

Mmmh... where?
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review233218

Ask jorendorff, I'm just getting sick of requesting reviews on this bug :)
(In reply to David Teller [:Yoric] (please use "needinfo") from comment #49)
> (In reply to Nathan Froyd [:froydnj] from comment #47)
> > ::: python/mozbuild/mozbuild/vendor_rust.py:147
> > (Diff revision 1)
> > >          LICENSE_WHITELIST = [
> > >              'Apache-2.0',
> > >              'Apache-2.0 / MIT',
> > >              'Apache-2.0/MIT',
> > >              'Apache-2 / MIT',
> > > +            'BSD-2-Clause', # binjs_meta (only used at build time)
> > 
> > As the comment preceeding this says, my review is insufficient for this
> > change.
> 
> Apparently, we already use BSD-2-Clause for jemalloc:
> https://bugzilla.mozilla.org/show_bug.cgi?id=1444607#c2, so it doesn't look
> like we need any specific review.

I can't see this bug.  I would like confirmation on this one; I am not a lawyer and all that.

> > However, I think it *could* be if we restructed the  check like so:
> > 
> > 1. Check if the license is on `LICENSE_WHITELIST`.  If so, OK.
> > 2. Have a separate `BUILD_TIME_LICENSE_WHITELIST`, mapping licenses to
> > packages that we "know" will only be used at build time.
> > 3. Add entries for the above for the BSD licenses, and
> > `binjs_meta`/`bindgen` in the appropriate places.
> > 4. Remove the BSD licenses from `LICENSE_WHITELIST`.
> 
> Is this really necessary for this patch?

It depends whether you want legal review on shipping new code with Firefox or not.  The above would guarantee (or at least help guarantee) that we're not shipping code with a license that hasn't been reviewed.

If we get legal signoff on BSD-2-Clause, then no, we don't have to do the above for this bug.

> > Bonus points if we actually checked that the packages in
> > `BUILD_TIME_LICENSE_WHITELIST` were never used in `[dependencies]` for any
> > packages, but always `[dev-dependencies]`.
> 
> Mmmh... where?

While we were vendoring things.
Comment on attachment 8958390 [details]
Bug 1437004 - Vendoring dependencies of binjs_meta;

https://reviewboard.mozilla.org/r/227354/#review233224

> As the comment preceeding this says, my review is insufficient for this change.
> 
> However, I think it *could* be if we restructed the  check like so:
> 
> 1. Check if the license is on `LICENSE_WHITELIST`.  If so, OK.
> 2. Have a separate `BUILD_TIME_LICENSE_WHITELIST`, mapping licenses to packages that we "know" will only be used at build time.
> 3. Add entries for the above for the BSD licenses, and `binjs_meta`/`bindgen` in the appropriate places.
> 4. Remove the BSD licenses from `LICENSE_WHITELIST`.
> 
> Bonus points if we actually checked that the packages in `BUILD_TIME_LICENSE_WHITELIST` were never used in `[dependencies]` for any packages, but always `[dev-dependencies]`.

As remarked by ted, we already ship jemalloc with the same license, so there doesn't seem to be any reason to request a new review for that license.

Do you still want me to do the above?
Comment on attachment 8958390 [details]
Bug 1437004 - Vendoring dependencies of binjs_meta;

https://reviewboard.mozilla.org/r/227354/#review233224

> As remarked by ted, we already ship jemalloc with the same license, so there doesn't seem to be any reason to request a new review for that license.
> 
> Do you still want me to do the above?

Either the above or ask a legal person to review the specific addition of it here.
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review233218

> Does Rust not have a `stringstream` or similar construct?

Not really. I could take a `Write` and write to it using the `write!` macro, if necessary.

**edit** So, I've tried the above, and this actually makes the code uglier. Unless there is a pressing reason to do things otherwise, I'd rather keep it this way.

> Out of curiosity, why even have these, if the files are just going to be copied anyway?  Can't consumers just be directed to the originals instead?

Right, survivance from an older version. Removed.
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review233218

> Not really. I could take a `Write` and write to it using the `write!` macro, if necessary.
> 
> **edit** So, I've tried the above, and this actually makes the code uglier. Unless there is a pressing reason to do things otherwise, I'd rather keep it this way.

I have zero problem with keeping it this way; I was just curious if there was a better/more idiomatic way.
Comment on attachment 8958390 [details]
Bug 1437004 - Vendoring dependencies of binjs_meta;

https://reviewboard.mozilla.org/r/227354/#review233224

> Either the above or ask a legal person to review the specific addition of it here.

> Bonus points if we actually checked that the packages in BUILD_TIME_LICENSE_WHITELIST were never used in [dependencies] for any packages, but always [dev-dependencies].

Filed that part as Bug 1445956.
Attachment #8958390 - Attachment is obsolete: true
So, apparently, there is no way to remove some of the multi-megabyte source files that are part of lalrpop-*, which need to be vendored in for the sake of the parser. Nathan, how should we handle this?

Source: https://github.com/lalrpop/lalrpop/issues/333#issuecomment-373947237
Flags: needinfo?(nfroyd)
I don't really grok the comment or the prior discussion thread, possibly because I don't understand the crate dependencies or what gets generated where.  Are we essentially vendoring the lalrpop-generated parser for lalrpop files themselves, and therefore we can't avoid vendoring gigantic files?  It looks like there are four different files being vendored, and it's not obvious to me that we need all of them since e.g. mbrubeck pointed out https://github.com/lalrpop/lalrpop/pull/295 which might have removed files (?) and https://github.com/lalrpop/lalrpop/issues/333#issuecomment-373951348 and following suggests that we could avoid those files entirely?
Flags: needinfo?(nfroyd)
The parser depends on webidl-rs, which depends on lalrpop-*. Currently, lalrpop-* contains 4 huge files. Once a version with https://github.com/lalrpop/lalrpop/pull/295/files is released, the two 1.9Mb files will disappear. For the other 2, however, https://github.com/lalrpop/lalrpop/issues/333#issuecomment-373951348 agrees that it would be nice to remove them and/or make them smaller. However, at this stage, there doesn't seem to be a clear way to do it.

Hope that clarifies a bit the issue.
Flags: needinfo?(nfroyd)
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review235618

Here are my review comments on all the easy, peripheral stuff. r=me on all of that, and it can land ahead of the binsource crate itself. This review does not include js/src/frontend/binsource/src/main.rs, which I'm still churning through.

I'm assuming someone else (froydnj?) will review "Rust-in-Gecko" issues, so take my Rust-in-Gecko comments below as questions, not actual objections.

::: Cargo.lock:69
(Diff revision 3)
> + "libc 0.2.39 (registry+https://github.com/rust-lang/crates.io-index)",
> + "winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
> +]
> +
> +[[package]]
> +name = "atty"

Rust-in-Gecko: do we care about having multiple versions of atty and itertools in here?

Looks like we already do have multiple versions of a few things, so I have no objection myself.

::: js/src/frontend/binsource/Cargo.lock:3
(Diff revision 3)
> +[[package]]
> +name = "binsource"
> +version = "0.1.0"

Rust-in-Gecko: I don't know if it's the right thing to check in this file.

::: js/src/frontend/binsource/Cargo.toml:10
(Diff revision 3)
> +
> +[dependencies]
> +binjs_meta = "0.3.4"
> +clap = "^2"
> +env_logger = "0.4.3"
> +yaml-rust = "^0.4"

Nit: `^` is the default, so you can omit it (or consistently use it for all of them).

::: js/src/frontend/binsource/src/main.rs:1223
(Diff revision 3)
> +        &exporter.to_spidermonkey_token_hpp());
> +    write_to("C++ token implementation code", "OUT_IMPL_FILE",
> +        &exporter.to_spidermonkey_cpp());
> +
> +    println!("...done");
> +}

Nit: The diff says this file is missing a newline character at the end.

::: python/mozbuild/mozbuild/vendor_rust.py:166
(Diff revision 3)
> +        # Licenses for code used at build time (e.g. code generators). Please see the above
> +        # comments before adding anything to this list.
> +        BUILDTIME_LICENSE_WHITELIST = [
> +            'BSD-2-Clause', # binsource
> +            'BSD-3-Clause', # bindgen
> +        ]

I didn't review the changes in this file.
(In reply to Jason Orendorff [:jorendorff] from comment #67)
> ::: Cargo.lock:69
> (Diff revision 3)
> > + "libc 0.2.39 (registry+https://github.com/rust-lang/crates.io-index)",
> > + "winapi 0.2.8 (registry+https://github.com/rust-lang/crates.io-index)",
> > +]
> > +
> > +[[package]]
> > +name = "atty"
> 
> Rust-in-Gecko: do we care about having multiple versions of atty and
> itertools in here?

We would prefer to not have multiple versions, but as you can see, we're not very good at holding to that ideal, and people have pushed back on a more Servo-like model that would reject changes creating multiple versions. :(

> ::: js/src/frontend/binsource/Cargo.lock:3
> (Diff revision 3)
> > +[[package]]
> > +name = "binsource"
> > +version = "0.1.0"
> 
> Rust-in-Gecko: I don't know if it's the right thing to check in this file.

This file should not be necessary.  It probably won't really *hurt* anything, I think, but we shouldn't need it.

(In reply to David Teller [:Yoric] (please use "needinfo") from comment #66)
> The parser depends on webidl-rs, which depends on lalrpop-*. Currently,
> lalrpop-* contains 4 huge files. Once a version with
> https://github.com/lalrpop/lalrpop/pull/295/files is released, the two 1.9Mb
> files will disappear. For the other 2, however,
> https://github.com/lalrpop/lalrpop/issues/333#issuecomment-373951348 agrees
> that it would be nice to remove them and/or make them smaller. However, at
> this stage, there doesn't seem to be a clear way to do it.
> 
> Hope that clarifies a bit the issue.

Wait, the parser for a binary version of JS depends on *webidl-rs*?!  What in the world for?

*looks at the spec*  Oh, the *spec* is written in WebIDL?  I have...no words.  I guess that makes a certain amount of sense?

OK, so it seems like at a minimum we would like a version of lalrpop (or webidl-rs generated with?) with pull #295 included.  I am not super happy about bloating the source tree with the other files.  It does seem like reducing them or finding some other parser library that wasn't huge would be a huge time sink.  As long as those files aren't actually getting shipped as part of Firefox, I think we can deal with bloat on the host side.  Am I correct in assuming they aren't shipped with Firefox?
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #68)
> Wait, the parser for a binary version of JS depends on *webidl-rs*?!  What
> in the world for?
> 
> *looks at the spec*  Oh, the *spec* is written in WebIDL?  I have...no
> words.  I guess that makes a certain amount of sense?

Well, it is the standard for writing specs on the web.

> OK, so it seems like at a minimum we would like a version of lalrpop (or
> webidl-rs generated with?) with pull #295 included.  I am not super happy
> about bloating the source tree with the other files.  It does seem like
> reducing them or finding some other parser library that wasn't huge would be
> a huge time sink.  As long as those files aren't actually getting shipped as
> part of Firefox, I think we can deal with bloat on the host side.  Am I
> correct in assuming they aren't shipped with Firefox?

Correct.
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review236128

::: python/mozbuild/mozbuild/vendor_rust.py:219
(Diff revision 3)
>                  if license_matches:
>                      license = license_matches[0].group(1)
>                      self.log(logging.DEBUG, 'package_license', {},
>                               'has license {}'.format(license))
>  
> -                    if license not in LICENSE_WHITELIST:
> +                    if license not in RUNTIME_LICENSE_WHITELIST and license not in BUILDTIME_LICENSE_WHITELIST:

This check basically implements a single whitelist, no?  Which sort of defeats the point of having two whitelists.  If you're going to have two whitelists, then the build time white list needs to be indexed by packages that we only use at build time.
Attachment #8958389 - Flags: review?(nfroyd)
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review236128

> This check basically implements a single whitelist, no?  Which sort of defeats the point of having two whitelists.  If you're going to have two whitelists, then the build time white list needs to be indexed by packages that we only use at build time.

Is there a way to find out which packages we only use at build-time?
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review237228

r=me on the vendor_rust.py changes and vendoring things generally; I'll leave the Rust-y bits and whatnot to jorendorff.
Attachment #8958389 - Flags: review?(nfroyd) → review+
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review237700

r=me.

Just imagine how long it would have taken me to review all 6600 lines of BinSource-auto.cpp...

I have a ton of nits which should be easy to address. And not much else! I kept involuntarily asking myself "should we be generating this particular bit of code (as opposed to using fancy templates)?" but I don't think it's worth worring about, certainly not yet; I also kept asking myself "gee, is this the right design for this parser", but, same deal.

This review errs on the side of telling you things about Rust that you might already know. Apologies in advance!

::: Cargo.lock:2
(Diff revision 4)
>  [[package]]
> +name = "Inflector"

I had trouble with this Cargo.lock file (some packages added as new dependencies in the file are not added as top-level [[package]]s), but it was easy to regenerate it:

    $ cd `hg root`
    $ hg revert -r .^ Cargo.lock
    $ ./mach vendor rust

::: js/src/frontend/binsource/Cargo.toml:10
(Diff revision 4)
> +
> +[dependencies]
> +binjs_meta = "0.3.4"
> +clap = "^2"
> +env_logger = "0.4.3"
> +yaml-rust = "^0.4"

I had to add `itertools`, `webidl`, and `yaml_rust` to get this working.

::: js/src/frontend/binsource/src/main.rs:12
(Diff revision 4)
> +extern crate yaml_rust;
> +
> +use binjs_meta::export::{ TypeDeanonymizer, TypeName };
> +use binjs_meta::import::Importer;
> +use binjs_meta::spec::*;
> +use binjs_meta::util::*;

Please either explicitly `use binjs_meta::util::{ToStr, Reindentable};`
or add a comment noting that .reindent() and .newline() come from this
import, since it's a bit mysterious otherwise. Or, move them from
binjs_meta to this file, if binsource is the only consumer.

::: js/src/frontend/binsource/src/main.rs:18
(Diff revision 4)
> +
> +use std::collections::{ HashMap, HashSet };
> +use std::fs::*;
> +use std::io::*;
> +
> +use clap::*;

This isn't great, because both std::io and clap have a `Result` type.
Happily, we're not using much from these modules, so it's not a big deal
to list the names:

    use std::fs::File;
    use std::io::prelude::*;

    use clap::{App, Arg};

(By convention, Rust modules named `prelude` are designed to be imported using `*`.)

::: js/src/frontend/binsource/src/main.rs:23
(Diff revision 4)
> +use clap::*;
> +
> +use itertools::Itertools;
> +
> +#[derive(Clone, Default)]
> +pub struct FieldParsingRules {

Nothing needs to be `pub` in a main.rs, I don't think.

::: js/src/frontend/binsource/src/main.rs:23
(Diff revision 4)
> +use clap::*;
> +
> +use itertools::Itertools;
> +
> +#[derive(Clone, Default)]
> +pub struct FieldParsingRules {

Maybe rename these types:
    FieldParsingRule -> FieldRules
    NodeParsingRules -> TypeRules
    GenerationRules -> Rules

::: js/src/frontend/binsource/src/main.rs:23
(Diff revision 4)
> +use clap::*;
> +
> +use itertools::Itertools;
> +
> +#[derive(Clone, Default)]
> +pub struct FieldParsingRules {

These structs seem like the right place to document what the yaml file
is allowed to do. Please give each of the three types, and each of their
fields, a one-line comment.

::: js/src/frontend/binsource/src/main.rs:41
(Diff revision 4)
> +    pub inherits: Option<NodeName>,
> +
> +    /// Override the result type for the method.
> +    pub type_ok: Option<String>,
> +
> +    pub start: Option<String>,

Either the "start" field or the "init" key should change so they match.

::: js/src/frontend/binsource/src/main.rs:56
(Diff revision 4)
> +    fn get(&self, name: &NodeName) -> NodeParsingRules {
> +        let mut rules = self.per_node.get(name)
> +            .cloned()
> +            .unwrap_or_default();
> +        let inherits = rules.inherits.clone();
> +        if let Some(ref parent) = inherits {

Cloning is unnecessary, right?

    if let Some(ref parent) = rules.inherits {
        ...
    }

::: js/src/frontend/binsource/src/main.rs:65
(Diff revision 4)
> +                start,
> +                append,
> +                by_field,
> +                build_result,
> +            } = self.get(parent);
> +            if rules.inherits.is_none() {

This can't be the case--the enclosing if-statement checks that rules.inherits is present.

::: js/src/frontend/binsource/src/main.rs:89
(Diff revision 4)
> +        }
> +        rules
> +    }
> +}
> +#[derive(Default)]
> +pub struct GenerationRules {

Style nit: Definitely consider putting the struct before the impl. I think it helps human readers to keep the three structs that show the whole structure of the rules adjacent.

::: js/src/frontend/binsource/src/main.rs:101
(Diff revision 4)
> +    hpp_tokens_field_doc: Option<String>,
> +    per_node: HashMap<NodeName, NodeParsingRules>,
> +}
> +
> +struct ToWebidl;
> +impl ToWebidl {

Could this code be moved into binjs_meta? The three methods here seem
like they'd cohere nicely with that crate, as methods of TypeSpec, Type,
and Interface.

::: js/src/frontend/binsource/src/main.rs:105
(Diff revision 4)
> +struct ToWebidl;
> +impl ToWebidl {
> +    pub fn spec(spec: &TypeSpec, prefix: &str, indent: &str) -> String {
> +        match *spec {
> +            TypeSpec::Array { ref contents, supports_empty: false } =>
> +                format!("[NonEmpty] FrozenArray<{}>", Self::type_(&*contents, prefix, indent)),

This will work fine without `&*`.

Same thing again two lines below.

::: js/src/frontend/binsource/src/main.rs:158
(Diff revision 4)
> +    }
> +}
> +pub struct CPPExporter {
> +    syntax: Spec,
> +    rules: GenerationRules,
> +    list_parsers_to_generate: Vec<(NodeName, (/* supports_empty */ bool, NodeName))>,

Note that Rust supports triples: `(NodeName, bool, NodeName)`.

::: js/src/frontend/binsource/src/main.rs:159
(Diff revision 4)
> +}
> +pub struct CPPExporter {
> +    syntax: Spec,
> +    rules: GenerationRules,
> +    list_parsers_to_generate: Vec<(NodeName, (/* supports_empty */ bool, NodeName))>,
> +    option_parsers_to_generate: Vec<(NodeName, NodeName)>,

Suggestion: offhand, it looked like the code would read better if each
of these got its own little struct:

    struct ListParser {
        name: NodeName,
        supports_empty: bool,
        element_type: NodeName,
    }

::: js/src/frontend/binsource/src/main.rs:164
(Diff revision 4)
> +    option_parsers_to_generate: Vec<(NodeName, NodeName)>,
> +}
> +
> +impl CPPExporter {
> +    pub fn new(deanonymizer: TypeDeanonymizer, options: SpecOptions) -> Self {
> +        let syntax = deanonymizer.into_spec(options);

Style nit: `fn new(syntax: Spec)` seems better here. The caller can
finish deanonymization before calling this.

`TypeDeanonymizer` seems like it could be replaced by a single
`fn (Spec) -> Spec` function.

::: js/src/frontend/binsource/src/main.rs:170
(Diff revision 4)
> +
> +        let mut list_parsers_to_generate = vec![];
> +        let mut option_parsers_to_generate = vec![];
> +        for (parser_node_name, typedef) in syntax.typedefs_by_name() {
> +            if typedef.is_optional() {
> +                let content_name = TypeName::type_spec(typedef.spec()); // FIXME: Wait, do we have an implementation of type names in two places?

I didn't check.

::: js/src/frontend/binsource/src/main.rs:198
(Diff revision 4)
> +            list_parsers_to_generate,
> +            option_parsers_to_generate,
> +        }
> +    }
> +
> +    fn set_export_rules(&mut self, rules: GenerationRules) {

Style nit: Pass the rules to the `new` method and remove this.

::: js/src/frontend/binsource/src/main.rs:204
(Diff revision 4)
> +        self.rules = rules;
> +    }
> +}
> +
> +// ----- Generating the header
> +impl CPPExporter {

Nit: The preceding block is also `impl CPPExporter { ... }`, so I think
the two blocks should be merged.

(I get that there is a thematic shift here, but the one-line comment expresses that directly.)

Same thing in one other place below.

::: js/src/frontend/binsource/src/main.rs:211
(Diff revision 4)
> +        let rules_for_this_interface = self.rules.get(name);
> +        // If the override is provided, use it.
> +        if let Some(ref type_ok) = rules_for_this_interface.type_ok {
> +            return type_ok.to_string()
> +        }
> +        default.to_string()

Rust note: Another way to write this method would be,

    fn get_type_ok<'a, 'name>(&'a self, name: &'name NodeName, default: &'a str) -> &'a str {
        self.rules.get(name).type_ok
            .as_ref()
            .map(|s| s as &str)
            .unwrap_or(default)
    }

You don't have to change anything. I only mention it because it's not
obvious that Rust can do this without copying the string, but it can.

::: js/src/frontend/binsource/src/main.rs:217
(Diff revision 4)
> +    }
> +
> +    fn get_method_signature(&self, name: &NodeName, default_type_ok: &str, prefix: &str, args: &str) -> String {
> +        let type_ok = self.get_type_ok(name, default_type_ok);
> +        let kind = name.to_class_cases();
> +        format!("    JS::Result<{type_ok}> parse{prefix}{kind}({args});",

It turns out we never want the method signature without a newline, so
you might as well add it here.

::: js/src/frontend/binsource/src/main.rs:223
(Diff revision 4)
> +            prefix = prefix,
> +            type_ok = type_ok,
> +            kind = kind,
> +            args = args,
> +        )
> +    }

(no action needed, just thoughts on code that generates C++)

When I worked on Codegen.py, my rule was that each method would either
return a "slab" of C++ (multiple lines, always with a trailing newline)
or a "snippet" (a single line, no newline). Nice because values of
either type can be concatenated (slabs concatenate vertically;
snippets horizontally). Also because newlines almost never have to be
added later, they can be part of format strings that build lines.
(When you have to add them later, it's never clear from looking at the
code why you're doing that or if it's really necessary.)

The first 200+ lines of Codegen.py are a bunch of helper functions to
make the C++ string-munging as nice as possible. See in particular
<https://searchfox.org/mozilla-central/source/dom/bindings/Codegen.py#206-242>.

Of course, Rust isn't Python. In the long run, it might be nice to get
the Rust program using a type other than `String` for snippets of C++
code. "A Prettier Printer" by Philip Wadler describes a lovely ADT for
representing code...

Anyway, all of this is for a later day, if ever.

::: js/src/frontend/binsource/src/main.rs:253
(Diff revision 4)
> +            .sorted();
> +        buffer.push_str(&format!("\n#define FOR_EACH_BIN_KIND(F) \\\n{nodes}",
> +            nodes = node_names.iter()
> +                .map(|name| format!("    F({name}, {name})",
> +                    name = name))
> +                .format(" \\\n")));

Style nit: As you mentioned on IRC, the `.format()` could be fused into
the `.map(|name| format!(...))`. Your call.

(Rust iterators with string items have the two methods `.join(", ")` and
`.concat()`. I mention this because I just don't know if you know about
them :D -- they're easy to miss in the docs.)

::: js/src/frontend/binsource/src/main.rs:282
(Diff revision 4)
> +                    enum_name = name.to_cpp_enum_case()))
> +                .format(" \\\n")));
> +        buffer.push_str("
> +
> +
> +enum class BinField {

Maybe put the extra newline at the end of the preceding format string:
"...\n{nodes}\n", instead of at the beginning of this (i.e. put it closer to the code that requires it to be there for correctness).

::: js/src/frontend/binsource/src/main.rs:299
(Diff revision 4)
> +    /// Declare string enums
> +    fn export_declare_string_enums_classes(&self, buffer: &mut String) {
> +        buffer.push_str("\n\n// ----- Declaring string enums (by lexicographical order)\n");
> +        let string_enums_by_name = self.syntax.string_enums_by_name()
> +            .iter()
> +            .sorted_by(|a, b| str::cmp(a.0.to_str(), b.0.to_str()));

Good idea, sorting all these. Thanks.

::: js/src/frontend/binsource/src/main.rs:359
(Diff revision 4)
> +        }
> +
> +        for parser in inner_parsers.drain(..) {
> +            buffer.push_str(&parser);
> +            buffer.push_str("\n");
> +        }

Style nit: The 20 lines of code ending here is wordy, looks like because of awkwardness around
Rust ownership. But I think you can just say:

    for (name, _) in interfaces_by_name {
        buffer.push_str(self.get_method_signature(...).reindent());
        buffer.push_str(self.get_method_signature(...).reindent());
    }

if `get_method_signature()` is changed to include a newline.

::: js/src/frontend/binsource/src/main.rs:451
(Diff revision 4)
> +        // Generate outer method
> +        buffer.push_str(&format!("{bnf}
> +{first_line}
> +{{
> +    BinKind kind;
> +    BinFields fields((cx_));

The extra parens here are unnecessary, I think. (Even without them, this
won't look like a declaration to C++, because cx_ isn't a type.)

(Same comment in 4 or 5 other places in this file.)

::: js/src/frontend/binsource/src/main.rs:469
(Diff revision 4)
> +                first_line = self.get_method_definition_start(name, "ParseNode*", "", "")
> +        ));
> +
> +        // Generate inner method
> +        let mut buffer_cases = String::new();
> +        for node in nodes {

Hmm. This generates kind of a lot of code. I wonder if it is really
faster than having a single parseAny() method with a switch statement,
capable of parsing everything; with a separate check to make sure the
node is allowed in context.

...But there's plenty of time to worry about that later.

::: js/src/frontend/binsource/src/main.rs:472
(Diff revision 4)
> +        // Generate inner method
> +        let mut buffer_cases = String::new();
> +        for node in nodes {
> +            buffer_cases.push_str(&format!("
> +      case BinKind::{kind}:
> +        MOZ_TRY_VAR(result, parseInterface{kind}(start, kind, fields));

Since these can be mutually recursive, I think we need CheckRecursionLimit() calls somewhere.

::: js/src/frontend/binsource/src/main.rs:475
(Diff revision 4)
> +            buffer_cases.push_str(&format!("
> +      case BinKind::{kind}:
> +        MOZ_TRY_VAR(result, parseInterface{kind}(start, kind, fields));
> +        break;",
> +                kind = node.to_class_cases()));
> +            }

Style nit: This close brace should line up with the `for` keyword.

::: js/src/frontend/binsource/src/main.rs:491
(Diff revision 4)
> +",
> +        kind = kind,
> +        cases = buffer_cases,
> +        first_line = self.get_method_definition_start(name, "ParseNode*", "Sum", "const size_t start, const BinKind kind, const BinFields& fields"),
> +        type_ok = self.get_type_ok(name, "ParseNode*")
> +    ));

Style nit: The five lines ending here are indented incorrectly, I think.

::: js/src/frontend/binsource/src/main.rs:557
(Diff revision 4)
> +
> +    fn generate_implement_option(&self, buffer: &mut String, name: &NodeName, contents: &NodeName) {
> +        debug!(target: "generate_spidermonkey", "Implementing optional value {} backed by {}",
> +            name.to_str(), contents.to_str());
> +
> +        let rules_for_this_node = self.rules.get(name);

It looks like most `rules_for_this_node` are ignored for optional
interfaces. It would be frustrating to add configuration for something
to the .yaml file and have it mysteriously ignored, so maybe we should
report an error if something is present but unused
(or doesn't make any sense)?

Similarly, in generate_implement_interface(), if "replace" is present
for a field, all other settings for that field are ignored.

No action required, your call.

::: js/src/frontend/binsource/src/main.rs:560
(Diff revision 4)
> +            name.to_str(), contents.to_str());
> +
> +        let rules_for_this_node = self.rules.get(name);
> +
> +        let type_ok = self.get_type_ok(name, "ParseNode*");
> +        let default_value = 

Please delete trailing whitespace in this file.

::: js/src/frontend/binsource/src/main.rs:579
(Diff revision 4)
> +                        .unwrap()
> +                } else {
> +                    panic!()
> +                }
> +            } else {
> +                panic!()

Pass a message to `panic!()`, even if it's something dumb:

    panic!("deanonymization didn't create the expected typedef for optional {}", name);

::: js/src/frontend/binsource/src/main.rs:741
(Diff revision 4)
> +                        (None,
> +                        Some(format!("MOZ_TRY_DECL({var_name}, readBool());", var_name = var_name)))
> +                    }
> +                }
> +                Some(IsNullable { content: Primitive::Void, .. }) => {
> +                    (Some(format!("// Skipping void field {}", field.name().to_str())),

Does this really happen? I think it should be an error.

::: js/src/frontend/binsource/src/main.rs:824
(Diff revision 4)
> +{decl_var}
> +{parse_var}
> +",
> +                    decl_var = decl_var.reindent("    "),
> +                    parse_var = parse_var.reindent("    "))
> +            };

The last two "else" blocks (the 20 lines ending here) are really the same case. Consider merging them.

One way to do this is with:

    let rules_for_this_field = rules_for_this_interface.by_field.get(field.name()).cloned().unwrap_or_default();

::: js/src/frontend/binsource/src/main.rs:830
(Diff revision 4)
> +            fields_implem.push_str(&rendered);
> +        }
> +
> +        let (start, build_result) =
> +            (rules_for_this_interface.start.reindent("    "),
> +             rules_for_this_interface.build_result.reindent("    "));

Style nit: Two separate declarations, maybe.

::: js/src/frontend/binsource/src/main.rs:886
(Diff revision 4)
> +
> +        let sums_of_interfaces = self.syntax.resolved_sums_of_interfaces_by_name()
> +            .iter()
> +            .sorted_by(|a, b| a.0.cmp(&b.0));
> +
> +        for &(ref name, ref nodes) in sums_of_interfaces.iter() {

Nit: for (name, nodes) in sums_of_interfaces {

::: js/src/frontend/binsource/src/main.rs:922
(Diff revision 4)
> +                                string = string,
> +                                kind = kind,
> +                                variant = string.to_cpp_enum_case()
> +                            )
> +                        })
> +                        .format("\n")

Maybe fuse .format() into .map() again.

::: js/src/frontend/binsource/src/main.rs:930
(Diff revision 4)
> +                let rendered_doc = format!("/*\nenum {kind} {{\n{cases}\n}};\n*/\n",
> +                    kind = kind,
> +                    cases = enum_.strings()
> +                            .iter()
> +                            .map(|s| format!("    \"{}\"", s))
> +                            .format(",\n")

And again here.

::: js/src/frontend/binsource/src/main.rs:969
(Diff revision 4)
> +
> +        buffer
> +    }
> +}
> +
> +fn update_rule(rule: &mut Option<String>, entry: &yaml_rust::Yaml) -> Option<Option<()>> {

The return type `Option<Option<()>>` is weird. Consider `Result<(), String>`.
Or this could return nothing and just panic on error:

    fn update_rule<F: FnOnce() -> String>(rule: &mut Option<String>, entry: &yaml_rust::Yaml, on_fail: F) {
        ...
        } else {
            panic!("{}", on_fail());
        }
    }

::: js/src/frontend/binsource/src/main.rs:982
(Diff revision 4)
> +    }    
> +}
> +
> +
> +fn main() {
> +    env_logger::init();

This returns a `Result` that must be used or rustc emits a warning.

To silence the warning, the idioms are:

    let _ = env_logger::init();  // ignore errors

    env_logger::init().unwrap();  // panic on error

::: js/src/frontend/binsource/src/main.rs:986
(Diff revision 4)
> +fn main() {
> +    env_logger::init();
> +
> +    let matches = App::new("BinJS import from webidl")
> +        .author("David Teller, <dteller@mozilla.com>")
> +        .about("Import a webidl defining the syntax of JavaScript.")

Consider updating the about message.

::: js/src/frontend/binsource/src/main.rs:1018
(Diff revision 4)
> +                .help("Path to copy the webidl source"),
> +            Arg::with_name("OUT_YAML_FILE")
> +                .long("out-yaml")
> +                .required(false)
> +                .takes_value(true)
> +                .help("Path to copy the yaml source"),

--out-webidl and --out-yaml can be deleted now, right?

::: js/src/frontend/binsource/src/main.rs:1026
(Diff revision 4)
> +
> +    let source_path = matches.value_of("INPUT.webidl")
> +        .expect("Expected INPUT.webidl");
> +
> +    let mut file = File::open(source_path)
> +        .expect("Could not open source");

Style nit: Consider factoring most of this code into separate function
that returns a Result, so as to be able to use `?` in places like this,
rather than `.expect()` which panics on error.

The main function, then, could look like this:

    fn main() {
        if let Err(err) = fallible_main() {
            eprintln!("error: {}", err.description());
        }

        println!("...done");
    }

It's ok to ignore this. I suggest it because panic error messages
are rather noisy.

::: js/src/frontend/binsource/src/main.rs:1047
(Diff revision 4)
> +    let syntax = builder.into_spec(SpecOptions {
> +        root: &fake_root,
> +        null: &null,
> +    });
> +
> +    let syntax_options = SpecOptions {

syntax_options is defined far before the place where it's used. Maybe
move it down?

Or maybe it would be better not to have these options? It seems
binjs_meta doesn't have any particular use for them; they're just passed
through.

::: js/src/frontend/binsource/src/main.rs:1054
(Diff revision 4)
> +        null: &null,
> +    };
> +
> +    let deanonymizer = TypeDeanonymizer::new(&syntax);
> +
> +    let mut generation_rules = GenerationRules::default();

It's fairly common, in this sort of situation in Rust, to build up the
fields in local variables, then finish by returning a new struct:

    Ok(GenerationRules {
        cpp_header,
        cpp_footer,
        // ...
    })

This way, there's no mutable variable, and the fields of `GenerationRules`
that aren't optional don't need to be `Option<String>`s. They can just be
Strings.

Since the strings are moved into the new `GenerationRules` (not
copied), this is efficient.

::: js/src/frontend/binsource/src/main.rs:1055
(Diff revision 4)
> +    };
> +
> +    let deanonymizer = TypeDeanonymizer::new(&syntax);
> +
> +    let mut generation_rules = GenerationRules::default();
> +    if let Some(rules_source_path) = matches.value_of("INPUT.yaml") {

The INPUT.yaml argument is required, so this should use `.unwrap()` and
avoid a big block.

::: js/src/frontend/binsource/src/main.rs:1140
(Diff revision 4)
> +                        for (field_key, field_entry) in fields {
> +                            let field_key = field_key.as_str()
> +                                .unwrap_or_else(|| panic!("In rule {}, field entries must be field names",
> +                                    node_key))
> +                                .to_string();
> +                            let field_name = syntax.get_field_name(&field_key)

I found it a little confusing that the old `syntax` Spec and the
deanonymizer existed at the same time. I would factor out the
webidl-loading into its own function:

    // Load the spec from the BinSource.webidl file.
    // This also deanonymizes the spec, giving each anonymous type a name.
    fn load_spec(webidl_path: &Path) -> Result<Spec, Box<Error>> {    
        let mut file = File::open(webidl_path)?;
        let mut source = String::new();
        file.read_to_string(&mut source)?;

        println!("...parsing webidl");
        let parser = webidl::Parser::new();
        let ast = parser.parse_string(&source)?;

        println!("...verifying grammar");
        let mut builder = Importer::import(&ast);
        let fake_root = builder.node_name("");
        let null = builder.node_name("_Null");
        builder.add_interface(&null)
            .unwrap();  // Assert that _Null was not already present.

        let original_spec = builder.into_spec(SpecOptions {
            root: &fake_root,
            null: &null,
        });

        let deanonymizer = TypeDeanonymizer::new(&original_spec);
        let named_spec = deanonymizer.into_spec(SpecOptions {
            root: &fake_root,
            null: &null,
        });

        Ok(named_spec)
    }

Then it can be called like this:

    let syntax = load_spec(webidl_path)
        .map_err(|e| format!("error loading spec from {:?}: {}", &paths.in_webidl, e.description()))?;

...or with .expect() instead of .map_err() if desired.

In general, main() was not easy to read and I think factoring tasks out would help!
Attachment #8958389 - Flags: review?(jorendorff) → review+
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review238938

::: third_party/rust/Inflector/.cargo-checksum.json:1
(Diff revision 4)
> +{"files":{"CHANGELOG.md":"e9a48682ad06315c67acdf9d9519ec60e483f487cfabcafebf30c18ecd553c19","CONTRIBUTING.md":"99a4db96609e44c91d9be8385e34cd213dc091e302d58d155056539045d17a8b","Cargo.toml":"1ba1b81c80a1b15f9bac208e924262b69962bf5d05452332fb32b301360f6693","LICENSE.md":"03d1bd5bfbee8d44651e7f57faf9e5b4eda4233f0d4eda36a716f2f0533d230b","PULL_REQUEST_TEMPLATE.md":"88b1581d41bbbfda51ac309ad7c3a735f65798f5b5148f1e6201a246719913c7","README.md":"d2e81724a70e14b60a2e8b680e9024838385ff51e302bb3f0879c526a6ae97f1","benchmark":"87cfbb14d1f7a3a38a9b02579199d04d87b15b078d3cf2094fd750aeb95c820a","src/cases/camelcase/mod.rs":"8e65fca78ea88acb32c0f214cafde39b849aef253253c3681e316f2559b26977","src/cases/case/mod.rs":"f3e1795f402cdfdecf3b255c3f7e42a9a6a8a0044802501574044e4a35d6081d","src/cases/classcase/mod.rs":"5b6b74530a2a693bf1ac89342f1b25f58f39336b1ee3242547c3d6ef468a878f","src/cases/kebabcase/mod.rs":"b317ebd42f22daab4b23bb4b83ce85f053d7088680d3a32eecbd13bd5331587a","src/cases/mod.rs":"e272853bcc1c5f6eb02594038febb9dcebb6eca8eac744d6e503db5082e585c6","src/cases/pascalcase/mod.rs":"a44feed6d8877fd8a31160076befe826960aa001d859587aef2dddc1aedc397b","src/cases/screamingsnakecase/mod.rs":"21582eb1ec2170d379bf3536c6ffb39b8bdc096efe2d493674458ee27b86e985","src/cases/sentencecase/mod.rs":"eb21d7d5bf0b23e1325d429dfdc149081d233a8b950c1fdfe04b4bebcc2c0ddb","src/cases/snakecase/mod.rs":"369739e37e700c028022f308aa78504873c10a5e88768f05249c1c8481b30c9d","src/cases/tablecase/mod.rs":"a6a50a397059d775a517d5dce6ba612b107919e209a9eb56871a5c1d42314664","src/cases/titlecase/mod.rs":"3f0dac5e5b434da9234d6c389f67bb2d3c8f138dc521fa29dbe3791f8eaf5341","src/cases/traincase/mod.rs":"4e2493d6594d3c505de293c69390b3f672c0fd4d35603ae1a1aae48166bc18c2","src/lib.rs":"6c5cf60f5c2f8778a3ad7638f37064527b8a86f164117d867b8b6532e2cc655e","src/numbers/deordinalize/mod.rs":"a6e0c00ab9c50f997b215762670ef1e7ac4cbdaa1380113fd625b8d59940e61c","src/numbers/mod.rs":"fed4e090f8b64a34ae64ddcb68d899cfa4dd8e8422a060be01a70dbdb71b85e0","src/numbers/ordinalize/mod.rs":"ce0d88977efaa50792e7311c0e0a73a3115928f9f7be77f914824c3d80eab66c","src/string/constants/mod.rs":"38de3d5060a5d224d28d184eab8af02203c65d74c1d380720c3260ea205f3e05","src/string/deconstantize/mod.rs":"c79f2170dc41bd6abb89a6e74fbdd87bf011f62cfe1f34d8886fda0724ade6fa","src/string/demodulize/mod.rs":"bbcb5314473e4ca02feee4903e31a332caaa912ed2cbca0f49c2fe411a826215","src/string/mod.rs":"570f7ea4dd646f2d633ddd67079db922cc2cadf916719fa19c2f59b4d522ee89","src/string/pluralize/mod.rs":"5f07fab8b5f4e7af546f1e907426724714b9b27af1ecb59a91e57dccd0833a6e","src/string/singularize/mod.rs":"9c2d833cbcdc1489013642de22578d51f558a31e8d2fea4536a27f8fa1114169","src/suffix/foreignkey/mod.rs":"e7ad9a9a0a21fcb53becb36306a15eedf67958e2da18ae928ae592177e70e7a3","src/suffix/mod.rs":"f6f99ce6fc8794d5411d91533b67be5d4a2bc5994317d32f405b2fa3c5ec660d","tests/lib.rs":"e1cfcea8a146291396ff72b0a2e84c2b9ddaa0103717442c4921c165a2ab470d","travis-after-success.sh":"ae0dfa332c9af427c728c42761a9813c9c6cb1748cc528a5bd38774a04850375"},"package":"1b33cd9b653730fc539c53c7b3c672d2f47108fa20c6df571fa5817178f5a14c"}

**Please** land the new third_party files in a separate commit from the rest.
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review237700

> I had to add `itertools`, `webidl`, and `yaml_rust` to get this working.

Mmmmh... that's right. How can it work without these dependencies?

> Please either explicitly `use binjs_meta::util::{ToStr, Reindentable};`
> or add a comment noting that .reindent() and .newline() come from this
> import, since it's a bit mysterious otherwise. Or, move them from
> binjs_meta to this file, if binsource is the only consumer.

Note: the Rust-target parser generator also uses these.

> Maybe rename these types:
>     FieldParsingRule -> FieldRules
>     NodeParsingRules -> TypeRules
>     GenerationRules -> Rules

What would you think of `FieldRules`, `NodeRules`, `FileRules`?

> Cloning is unnecessary, right?
> 
>     if let Some(ref parent) = rules.inherits {
>         ...
>     }

I don't follow. Where am I cloning?
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review238936

r=me with a few questions.

::: commit-message-6dc0c:5
(Diff revision 5)
> +Bug 1437004 - Porting BinAST to AST v3;r?arai,jorendorff
> +
> +This patch is a nearly complete reimplementation of BinASTReader, with the following changes:
> +
> +- Files BinToken.h, BinSource-auto.h (new), BinSource-auto.cpp (new) are now autogenerated by an out-of-tree code generator from the webidl specifications of BinAST and a small configuration file.

Not out-of-tree anymore.

::: js/src/frontend/BinSource.h:241
(Diff revision 5)
>      // The current ParseContext, holding directives, etc.
>      ParseContext* parseContext_;
>      UsedNameTracker& usedNames_;
>      Maybe<Tokenizer> tokenizer_;
>      FullParseHandler factory_;
> +    VariableDeclarationKind variableDeclarationKind;

Is this necessary? It looks like it is only used in checkBinding, which could take it as an argument.

::: js/src/frontend/BinSource.h:241
(Diff revision 5)
>      // The current ParseContext, holding directives, etc.
>      ParseContext* parseContext_;
>      UsedNameTracker& usedNames_;
>      Maybe<Tokenizer> tokenizer_;
>      FullParseHandler factory_;
> +    VariableDeclarationKind variableDeclarationKind;

If we need it, give the field name a trailing underscore, like the other fields here.

::: js/src/frontend/BinSource.webidl:871
(Diff revision 5)
> +};
> +
> +interface VariableDeclarator : Node {
> +  attribute Binding binding;
> +  attribute Expression? init;
> +};

Please make sure there's a newline at the end of this file.

::: js/src/frontend/BinSource.yaml:20
(Diff revision 5)
> +        // as follows:
> +        // - cd $(topsrcdir)/js/src/frontend/binsource
> +        // - cargo run -- $(topsrcdir)/js/src/frontend/BinSource.webidl $(topsrcdir)/js/src/frontend/BinSource.yaml
> +        //     --out-class $(topsrcdir)/js/src/frontend/{class}
> +        //     --out-impl $(topsrcdir)/js/src/frontend/{impl_}
> +        //     --out-token $(topsrcdir)/js/src/frontend/{token}

Please move this information into binsource/README.md and point the reader there.
Attachment #8951201 - Flags: review?(jorendorff) → review+
Comment on attachment 8958389 [details]
Bug 1437004 - Introducing BinSource parser generator;

https://reviewboard.mozilla.org/r/227352/#review237700

> Could this code be moved into binjs_meta? The three methods here seem
> like they'd cohere nicely with that crate, as methods of TypeSpec, Type,
> and Interface.

I'd rather not hardcode webidl into `Type`, `TypeSpec`, `Interface`, but I can make it part of `binjs_meta::export`.

> I didn't check.

Fixed in the meantime :)

> Rust note: Another way to write this method would be,
> 
>     fn get_type_ok<'a, 'name>(&'a self, name: &'name NodeName, default: &'a str) -> &'a str {
>         self.rules.get(name).type_ok
>             .as_ref()
>             .map(|s| s as &str)
>             .unwrap_or(default)
>     }
> 
> You don't have to change anything. I only mention it because it's not
> obvious that Rust can do this without copying the string, but it can.

I think that my version is somewhat more readable :)

> (no action needed, just thoughts on code that generates C++)
> 
> When I worked on Codegen.py, my rule was that each method would either
> return a "slab" of C++ (multiple lines, always with a trailing newline)
> or a "snippet" (a single line, no newline). Nice because values of
> either type can be concatenated (slabs concatenate vertically;
> snippets horizontally). Also because newlines almost never have to be
> added later, they can be part of format strings that build lines.
> (When you have to add them later, it's never clear from looking at the
> code why you're doing that or if it's really necessary.)
> 
> The first 200+ lines of Codegen.py are a bunch of helper functions to
> make the C++ string-munging as nice as possible. See in particular
> <https://searchfox.org/mozilla-central/source/dom/bindings/Codegen.py#206-242>.
> 
> Of course, Rust isn't Python. In the long run, it might be nice to get
> the Rust program using a type other than `String` for snippets of C++
> code. "A Prettier Printer" by Philip Wadler describes a lovely ADT for
> representing code...
> 
> Anyway, all of this is for a later day, if ever.

Yeah, I was thinking of something along these lines. Sounds like a nice crate to have :)

> Style nit: As you mentioned on IRC, the `.format()` could be fused into
> the `.map(|name| format!(...))`. Your call.
> 
> (Rust iterators with string items have the two methods `.join(", ")` and
> `.concat()`. I mention this because I just don't know if you know about
> them :D -- they're easy to miss in the docs.)

> Style nit: As you mentioned on IRC, the .format() could be fused into
the .map(|name| format!(...)). Your call.

Can it? The entire point of `.format()` is to convert an iterator into a string, which `map` won't do. Or am I missing something?

> (Rust iterators with string items have the two methods .join(", ") and
.concat(). I mention this because I just don't know if you know about
them :D -- they're easy to miss in the docs.)

Are you sure? I can't find either in the documentation.

> Style nit: The 20 lines of code ending here is wordy, looks like because of awkwardness around
> Rust ownership. But I think you can just say:
> 
>     for (name, _) in interfaces_by_name {
>         buffer.push_str(self.get_method_signature(...).reindent());
>         buffer.push_str(self.get_method_signature(...).reindent());
>     }
> 
> if `get_method_signature()` is changed to include a newline.

Actually, it's mostly to enforce an order among methods, to make the header more readable.

> The extra parens here are unnecessary, I think. (Even without them, this
> won't look like a declaration to C++, because cx_ isn't a type.)
> 
> (Same comment in 4 or 5 other places in this file.)

Yeah, I've been bitten a few too many times by Most Vexing Parse, I guess :)

> Hmm. This generates kind of a lot of code. I wonder if it is really
> faster than having a single parseAny() method with a switch statement,
> capable of parsing everything; with a separate check to make sure the
> node is allowed in context.
> 
> ...But there's plenty of time to worry about that later.

That's bug 1447984.

> Since these can be mutually recursive, I think we need CheckRecursionLimit() calls somewhere.

Ok, added to `parseInterface{kind}(start, kind, field)`.

> This returns a `Result` that must be used or rustc emits a warning.
> 
> To silence the warning, the idioms are:
> 
>     let _ = env_logger::init();  // ignore errors
> 
>     env_logger::init().unwrap();  // panic on error

Er, no, it doesn't return anything: https://github.com/sebasmagri/env_logger/blob/master/src/lib.rs#L807-L809

> It's fairly common, in this sort of situation in Rust, to build up the
> fields in local variables, then finish by returning a new struct:
> 
>     Ok(GenerationRules {
>         cpp_header,
>         cpp_footer,
>         // ...
>     })
> 
> This way, there's no mutable variable, and the fields of `GenerationRules`
> that aren't optional don't need to be `Option<String>`s. They can just be
> Strings.
> 
> Since the strings are moved into the new `GenerationRules` (not
> copied), this is efficient.

Actually, I moved this to a `GenerationRules::new()`.

> The INPUT.yaml argument is required, so this should use `.unwrap()` and
> avoid a big block.

Fair enough. It used to be optional :)

> I found it a little confusing that the old `syntax` Spec and the
> deanonymizer existed at the same time. I would factor out the
> webidl-loading into its own function:
> 
>     // Load the spec from the BinSource.webidl file.
>     // This also deanonymizes the spec, giving each anonymous type a name.
>     fn load_spec(webidl_path: &Path) -> Result<Spec, Box<Error>> {    
>         let mut file = File::open(webidl_path)?;
>         let mut source = String::new();
>         file.read_to_string(&mut source)?;
> 
>         println!("...parsing webidl");
>         let parser = webidl::Parser::new();
>         let ast = parser.parse_string(&source)?;
> 
>         println!("...verifying grammar");
>         let mut builder = Importer::import(&ast);
>         let fake_root = builder.node_name("");
>         let null = builder.node_name("_Null");
>         builder.add_interface(&null)
>             .unwrap();  // Assert that _Null was not already present.
> 
>         let original_spec = builder.into_spec(SpecOptions {
>             root: &fake_root,
>             null: &null,
>         });
> 
>         let deanonymizer = TypeDeanonymizer::new(&original_spec);
>         let named_spec = deanonymizer.into_spec(SpecOptions {
>             root: &fake_root,
>             null: &null,
>         });
> 
>         Ok(named_spec)
>     }
> 
> Then it can be called like this:
> 
>     let syntax = load_spec(webidl_path)
>         .map_err(|e| format!("error loading spec from {:?}: {}", &paths.in_webidl, e.description()))?;
> 
> ...or with .expect() instead of .map_err() if desired.
> 
> In general, main() was not easy to read and I think factoring tasks out would help!

I just moved just about everything except I/O of the `main()`, this should make things more readable.
Attachment #8951197 - Attachment is obsolete: true
Attachment #8951199 - Attachment is obsolete: true
Attachment #8951200 - Attachment is obsolete: true
Attachment #8951201 - Attachment is obsolete: true
Attachment #8952773 - Attachment is obsolete: true
Comment on attachment 8951201 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/220464/#review238936

> Is this necessary? It looks like it is only used in checkBinding, which could take it as an argument.

The main problem here is `VariableDeclarator`, which does not have access to the parent's `VariableDeclarationKind`. I seem to remember that the problem also arises with destructuring declarations, which I haven't implemented yet.
Comment on attachment 8964887 [details]
Bug 1437004 - Tests for BinAST v3 (data);

https://reviewboard.mozilla.org/r/233622/#review239216
Attachment #8964887 - Flags: review?(arai.unmht) → review+
Comment on attachment 8964888 [details]
Bug 1437004 - ParseNode::dump() now displays names for ObjectPropertyNames;

https://reviewboard.mozilla.org/r/233624/#review239218
Attachment #8964888 - Flags: review?(arai.unmht) → review+
Comment on attachment 8964889 [details]
Bug 1437004 - Fixing null string behavior in BinAST tokenizer;

https://reviewboard.mozilla.org/r/233626/#review239222
Attachment #8964889 - Flags: review?(arai.unmht) → review+
Comment on attachment 8964890 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/233628/#review239224
Attachment #8964890 - Flags: review?(arai.unmht) → review+
Comment on attachment 8964891 [details]
Bug 1437004 - Updating tokenizer tests to ast v3;

https://reviewboard.mozilla.org/r/233630/#review239226
Attachment #8964891 - Flags: review?(arai.unmht) → review+
Comment on attachment 8964890 [details]
Bug 1437004 - Porting BinAST to AST v3;

https://reviewboard.mozilla.org/r/233628/#review239260
Attachment #8964890 - Flags: review?(jorendorff) → review+
Comment on attachment 8964888 [details]
Bug 1437004 - ParseNode::dump() now displays names for ObjectPropertyNames;

https://reviewboard.mozilla.org/r/233624/#review239262
Attachment #8964888 - Flags: review?(jorendorff) → review+
Comment on attachment 8964889 [details]
Bug 1437004 - Fixing null string behavior in BinAST tokenizer;

https://reviewboard.mozilla.org/r/233626/#review239264
Attachment #8964889 - Flags: review?(jorendorff) → review+
Summary: Port binjs to ast v3 → [BinAST] Port binjs to ast v3
Attachment #8964887 - Attachment is obsolete: true
Comment on attachment 8964875 [details]
Bug 1437004 - Vendored Rust dependencies;

https://reviewboard.mozilla.org/r/233602/#review239338
Attachment #8964875 - Flags: review?(nfroyd) → review+
Comment on attachment 8965243 [details]
Bug 1437004 - Tests for BinAST v3 (data);

https://reviewboard.mozilla.org/r/233956/#review239610
Attachment #8965243 - Flags: review?(arai.unmht) → review+
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/integration/autoland: pushing to ssh://hg.mozilla.org/integration/autoland
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 7 changesets with 961 changes to 998 files
remote: 
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset 42c676e994f2 without DOM peer review
remote: 
remote: 
remote: 
remote: ************************** ERROR ****************************
remote: 
remote: Changes to WebIDL files in this repo require review from a DOM peer in the form of r=...
remote: This is to ensure that we behave responsibly with exposing new Web APIs. We appreciate your understanding..
remote: 
remote: *************************************************************
remote: 
remote: 
remote: transaction abort!
remote: rollback completed
remote: pretxnchangegroup.d_webidl hook failed
abort: push failed on remote
Flags: needinfo?(dteller)
Oh, great, accidentally caught by a linter. Mmmmmh.... I might rename `BinSource.webidl` to something else. Maybe `BinSource.js-webidl`.
Flags: needinfo?(dteller)
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/integration/autoland: pushing to ssh://hg.mozilla.org/integration/autoland
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 7 changesets with 962 changes to 999 files
remote: 
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset 9930d5af6560 without DOM peer review
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset f76e14fee4d2 without DOM peer review
remote: 
remote: 
remote: 
remote: ************************** ERROR ****************************
remote: 
remote: Changes to WebIDL files in this repo require review from a DOM peer in the form of r=...
remote: This is to ensure that we behave responsibly with exposing new Web APIs. We appreciate your understanding..
remote: 
remote: *************************************************************
remote: 
remote: 
remote: transaction abort!
remote: rollback completed
remote: pretxnchangegroup.d_webidl hook failed
abort: push failed on remote
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/integration/autoland: pushing to ssh://hg.mozilla.org/integration/autoland
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 7 changesets with 962 changes to 999 files
remote: 
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset b5980bb2b30c without DOM peer review
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset df6f93cf65cc without DOM peer review
remote: 
remote: 
remote: 
remote: ************************** ERROR ****************************
remote: 
remote: Changes to WebIDL files in this repo require review from a DOM peer in the form of r=...
remote: This is to ensure that we behave responsibly with exposing new Web APIs. We appreciate your understanding..
remote: 
remote: *************************************************************
remote: 
remote: 
remote: transaction abort!
remote: rollback completed
remote: pretxnchangegroup.d_webidl hook failed
abort: push failed on remote
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/integration/autoland: pushing to ssh://hg.mozilla.org/integration/autoland
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 7 changesets with 962 changes to 999 files
remote: 
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset b5980bb2b30c without DOM peer review
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset df6f93cf65cc without DOM peer review
remote: 
remote: 
remote: 
remote: ************************** ERROR ****************************
remote: 
remote: Changes to WebIDL files in this repo require review from a DOM peer in the form of r=...
remote: This is to ensure that we behave responsibly with exposing new Web APIs. We appreciate your understanding..
remote: 
remote: *************************************************************
remote: 
remote: 
remote: transaction abort!
remote: rollback completed
remote: pretxnchangegroup.d_webidl hook failed
abort: push failed on remote
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/integration/autoland: pushing to ssh://hg.mozilla.org/integration/autoland
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 7 changesets with 962 changes to 999 files
remote: 
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset b5980bb2b30c without DOM peer review
remote: WebIDL file js/src/frontend/BinSource.webidl altered in changeset df6f93cf65cc without DOM peer review
remote: 
remote: 
remote: 
remote: ************************** ERROR ****************************
remote: 
remote: Changes to WebIDL files in this repo require review from a DOM peer in the form of r=...
remote: This is to ensure that we behave responsibly with exposing new Web APIs. We appreciate your understanding..
remote: 
remote: *************************************************************
remote: 
remote: 
remote: transaction abort!
remote: rollback completed
remote: pretxnchangegroup.d_webidl hook failed
abort: push failed on remote
Pushed by dteller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/47406f36e6ab
Tests for BinAST v3 (data);r=arai
https://hg.mozilla.org/integration/autoland/rev/a97cccaa866a
Vendored Rust dependencies;r=froydnj
https://hg.mozilla.org/integration/autoland/rev/3f27c2b65ef4
ParseNode::dump() now displays names for ObjectPropertyNames;r=arai,jorendorff
https://hg.mozilla.org/integration/autoland/rev/8e3a57a21b26
Fixing null string behavior in BinAST tokenizer;r=arai,jorendorff
https://hg.mozilla.org/integration/autoland/rev/26162086b394
Porting BinAST to AST v3;r=arai,jorendorff
https://hg.mozilla.org/integration/autoland/rev/d220f806c197
Updating tokenizer tests to ast v3;r=arai
https://hg.mozilla.org/integration/autoland/rev/646bdf14bf8e
Introducing BinSource parser generator;r=froydnj,jorendorff
Pushed by philringnalda@gmail.com:
https://hg.mozilla.org/mozilla-central/rev/856a2a04097f
followup, disable BinAST tests on Windows where they create an infinite path, a=bustage
One little bitty problem, worked around by comment 156 - this actually caused total bustage of the Windows SpiderMonkey shell build test runs, https://treeherder.mozilla.org/logviewer.html#?job_id=172266759&repo=autoland&lineNumber=58410
Status: RESOLVED → REOPENED
Flags: needinfo?(dteller)
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: Last yearLast year
Flags: needinfo?(dteller)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.