Closed Bug 1297706 Opened 3 years ago Closed 3 years ago

Make the syntax parser handle most chrome JS

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla51

People

(Reporter: jandem, Assigned: jandem)

References

Details

(Whiteboard: [MemShrink:P1])

Attachments

(6 files, 5 obsolete files)

We parse chrome JS (JSMs) eagerly right now. This uses a lot of memory (scripts, bytecode, atoms, objects, shapes, etc).

I'd like to experiment with lazy parsing, but it's hard to test this because our syntax parser bails on stuff that's used extensively in chrome code: lexical declarations, arrow functions, destructuring, etc. The frontend rewrite in bug 1263355 should help with the lexical declaration aborts.

Lazy parsing (and relazification) for chrome JS will hopefully be a nice win.
Any thoughts on how much this would save?
Flags: needinfo?(jdemooij)
Whiteboard: [MemShrink] → [MemShrink:P2]
My guess is it could be quite a lot. I'll be optimistic and bump the MemShrink priority.
Whiteboard: [MemShrink:P2] → [MemShrink:P1]
(In reply to Jan de Mooij [:jandem] from comment #0)
> 
> I'd like to experiment with lazy parsing, but it's hard to test this because
> our syntax parser bails on stuff that's used extensively in chrome code:
> lexical declarations, arrow functions, destructuring, etc. The frontend
> rewrite in bug 1263355 should help with the lexical declaration aborts.

Is it worth analyzing some JSMs to see how often these bail-worth constructs occur? It might even be possible to rewrite some of them to avoid those constructs and see how much of an improvement you get, in order to give you an idea of the likely benefits.
I think Babel might be able to do this for you. It compiles ES6 to ES5.
Theorizing in advance of data here seems kinda pointless.  Once the new frontend lands and we fix let/const to syntax-parse (they abort still, but they're probably not far off from being syntax-parsable), then we can start adding logging to figure out what triggers aborts.  That's going to be a lot better data than guessing or analyzing anything, and it's a lot less work, too.  :-)
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #5)
> then we can start
> adding logging to figure out what triggers aborts.  That's going to be a lot
> better data than guessing or analyzing anything

I must have expressed my suggestion poorly, because logging abort causes is exactly what I meant by "analyzing".
Maybe, or maybe I read a little too quickly.  Or possibly some of both.  :-)  Point is, it doesn't seem worth doing anything *yet* because there's so much noise to start.

Also, this would probably be much easier if I got back to bug 1242087, which would also coincidentally make this debuggable for non-JS people (with the caveat that that mechanism probably won't report more than once per script, in the interest of not spamming people's consoles).  But again, not something to do til new frontend (and something I've been delaying precisely for it, just like everyone else doing frontend work these days, I think).
I don't think I understand the objection. If we run some chrome JSMs through Babel, they're much more likely to be parsable by the syntax parser. Then we can run Firefox and do a quick test to see how much memory that saves. It might take an hour or two and it would tell us if this bug is worth investing in.
Last week I tried that with one or two JSMs (just converting them manually). I'll try this again and post numbers.
Flags: needinfo?(jdemooij)
Attached patch Syntax parse arrow functions. (obsolete) — Splinter Review
Duplicate of this bug: 1298200
Whoa, those patches look much simpler than I expected. shu++
Here are the language features that currently abort syntax parsing:

- lexical decls
- destructuring
- arrows
- classes
- with
- ES modules
- legacy generators (ugh)
- generator comprehensions (ugh)
- asm.js

Out of those, I've posted patches that turns on syntax parse lexicals, arrows, classes, and with, because those are easily handled now.

Out of the remaining ones, we really need to kill legacy generators and generator comprehensions and all chrome code that use it. asm.js doesn't matter, and ES modules can be made to work but there's no real rush.

That leaves destructuring, which is both widely used by chrome code and hard to handle. The problem is that at the time of parsing a destructuring expression we may not know if it is a destructuring expression. Depending on what comes afterwards, the expression is either an object or array literal, or an actual destructuring expression. The full parser thus builds up the object/array literal AST, and, if we find out that's actually destructuring, we walk over it again, tracking its declared names and so forth as necessary. The syntax parser can't do this because the whole point of syntax parsing is to not allocate the AST. Jan, do you have any ideas here?
Here's a try run to see how badly the 4 patches I posted blow up existing code: https://treeherder.mozilla.org/#/jobs?repo=try&revision=81d641409e5d
(In reply to Nicholas Nethercote [:njn] from comment #15)
> Whoa, those patches look much simpler than I expected. shu++

Simple after a 2.5MB patch. :)
(In reply to Shu-yu Guo [:shu] from comment #16)
> That leaves destructuring, which is both widely used by chrome code and hard
> to handle. The problem is that at the time of parsing a destructuring
> expression we may not know if it is a destructuring expression. Depending on
> what comes afterwards, the expression is either an object or array literal,
> or an actual destructuring expression. The full parser thus builds up the
> object/array literal AST, and, if we find out that's actually destructuring,
> we walk over it again, tracking its declared names and so forth as
> necessary. The syntax parser can't do this because the whole point of syntax
> parsing is to not allocate the AST.

Do you recommend against using destructing expressions in hot loops because of that?
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> Do you recommend against using destructing expressions in hot loops because
> of that?

Don't prematurely optimize!

Whether a loop is "hot" or not is a question after the code's been parsed.  This is all totally a question of what the code *looks like*, at the time of parsing.  Totally different phase, entirely unrelated to hotness.  The trickiness of parsing destructuring has no relevance to the performance at runtime of hot code that uses destructuring.
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #20)
> (In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> > Do you recommend against using destructing expressions in hot loops because
> > of that?
> 
> Don't prematurely optimize!
> 
> Whether a loop is "hot" or not is a question after the code's been parsed. 
> This is all totally a question of what the code *looks like*, at the time of
> parsing.  Totally different phase, entirely unrelated to hotness.  The
> trickiness of parsing destructuring has no relevance to the performance at
> runtime of hot code that uses destructuring.

It's not about premature opt. :stas just landed a bunch of changes that remove desctructive expressions from hot loops in l20n because we measured that it impacts talos tests perf results (bug 1296618).

My question is if it's this bug, or something else - sounds like something else.
Thanks, Shu! That was fast.

Just to avoid confusion: with these patches we still don't parse JSMs lazily, we also need some small Gecko changes to make us keep the source (or else the frontend will disable lazy parsing). These patches will make it way easier to experiment with that though.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #21)
> It's not about premature opt. :stas just landed a bunch of changes that
> remove desctructive expressions from hot loops in l20n because we measured
> that it impacts talos tests perf results (bug 1296618).
> 
> My question is if it's this bug, or something else - sounds like something
> else.

We love to see bug reports for performance issues (ideally with a standalone test that reproduces the problem). It might be something silly inside, and fixing it may be less work for us than it is to write workarounds for you (and it will help a lot more code).
(In reply to Jan de Mooij [:jandem] from comment #23)
> It might be something silly inside,

Er, "somewhere in the engine".
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #21)
> (In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #20)
> > (In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> > > Do you recommend against using destructing expressions in hot loops because
> > > of that?
> > 
> > Don't prematurely optimize!
> > 
> > Whether a loop is "hot" or not is a question after the code's been parsed. 
> > This is all totally a question of what the code *looks like*, at the time of
> > parsing.  Totally different phase, entirely unrelated to hotness.  The
> > trickiness of parsing destructuring has no relevance to the performance at
> > runtime of hot code that uses destructuring.
> 
> It's not about premature opt. :stas just landed a bunch of changes that
> remove desctructive expressions from hot loops in l20n because we measured
> that it impacts talos tests perf results (bug 1296618).
> 
> My question is if it's this bug, or something else - sounds like something
> else.

That destructuring is slow is no surprise. The code we generate is, like, super slow, go through SM equivalent of meta-object protocol calls and stuff.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #21)
> (In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #20)
> > (In reply to Zibi Braniecki [:gandalf][:zibi] from comment #19)
> > > Do you recommend against using destructing expressions in hot loops because
> > > of that?
> 
> It's not about premature opt.
> 
> My question is if it's this bug, or something else - sounds like something
> else.

Yes, it's something else.  Syntax parsing -- this bug -- doesn't have anything to do with loop performance.
(In reply to Shu-yu Guo [:shu] from comment #25)
> > My question is if it's this bug, or something else - sounds like something
> > else.
> 
> That destructuring is slow is no surprise. The code we generate is, like,
> super slow, go through SM equivalent of meta-object protocol calls and stuff.

Is there a bug for that?
There was another abortIfSyntaxParser for arrow functions, this patch removes it and also fixes standaloneLazyFunction to handle arrows correctly.
Shu, do you have time to finish these patches? If you don't, I can also take a stab at finishing/landing them, it would definitely be nice to have this fixed.
Flags: needinfo?(shu)
Oh by all means please take this bug! I am certainly busy with other stuff.
Flags: needinfo?(shu)
Attachment #8785093 - Flags: review+
Comment on attachment 8785095 [details] [diff] [review]
Syntax parse with statements.

Review of attachment 8785095 [details] [diff] [review]:
-----------------------------------------------------------------

These two patches are straight-forward and I'll land them first.
Attachment #8785095 - Flags: review+
Keywords: leave-open
Duplicate of this bug: 892583
Depends on: 1300046
Attachment #8785096 - Flags: review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/78ff3244294b
Syntax parse class declarations. r=jandem
Attached patch Syntax parse arrow functions (obsolete) — Splinter Review
This also fixes a couple of issues found by jit-tests/jstests.
Assignee: nobody → jdemooij
Attachment #8785094 - Attachment is obsolete: true
Attachment #8785319 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8788094 - Flags: review?(shu)
Attached patch Syntax parse arrow functions (obsolete) — Splinter Review
Attachment #8788094 - Attachment is obsolete: true
Attachment #8788094 - Flags: review?(shu)
Attachment #8788095 - Flags: review?(shu)
The Interpreter.cpp changes are not necessary I think, so I reverted them.

Sorry for the bug spam.
Attachment #8788095 - Attachment is obsolete: true
Attachment #8788095 - Flags: review?(shu)
Attachment #8788102 - Flags: review?(shu)
The patch to syntax parse arrow functions causes a devtools test failure on Try. It took me a while to track it down, but there's a bug in FunctionBox::setStart:

     void setStart(const TokenStream& tokenStream) {
         bufStart = tokenStream.currentToken().pos.begin;
         startLine = tokenStream.getLineno();
         startColumn = tokenStream.getColumn();
     }

The problem is that startLine/startColumn don't always correspond to bufStart, because getColumn simply looks at the current TokenBuf offset (so it can point to the end of a TOK_NAME token, instead of to the start of it). This wasn't a problem until now, because we only use the column number when we do lazy parsing.

The patch fixes this by calling tokenStream.srcCoords.lineNumAndColumnIndex. It also removes the now-unused TokenStream::getLineno and TokenStream::getColumn.

njn, asking you for review because you added TokenStream::SourceCoords etc.
Attachment #8788269 - Flags: review?(n.nethercote)
Comment on attachment 8788269 [details] [diff] [review]
Fix FunctionBox::setStart

Review of attachment 8788269 [details] [diff] [review]:
-----------------------------------------------------------------

I haven't thought about this code for a long time :)  This seems fine. Thank you for adding the test, that increase my confidence.
Attachment #8788269 - Flags: review?(n.nethercote) → review+
And there's more... The patch also uncovers a bug with delazifying functions for the debugger. Some intermittents like bug 1300257 and bug 1300180 become perma failures when we syntax parse arrows.

The problem is that CreateLazyScriptsForCompartment delazifies in arbitrary order, so the parser cannot assume all inner functions are lazy.
(In reply to Jan de Mooij [:jandem] from comment #42)
> The problem is that CreateLazyScriptsForCompartment delazifies in arbitrary
> order, so the parser cannot assume all inner functions are lazy.

Actually the problem is that CreateLazyScriptsForCompartment relies on lazyScript->sourceObject() being non-null iff its outer script is non-lazy. XDR decoding of lazy inner functions violates this invariant. Pre-existing bug, I'm testing a fix.
Attached patch Fix XDR (obsolete) — Splinter Review
Try confirms this fixes the devtools orange.

Things are looking pretty great now, except intermittent bug 1219497 seems to be perma orange on Win7 opt now. That's the ProfileEntry;:pc/BackgroundHangMonitor nonsense :/
Attachment #8788458 - Flags: review?(shu)
Attached patch Fix XDRSplinter Review
Attachment #8788458 - Attachment is obsolete: true
Attachment #8788458 - Flags: review?(shu)
Attachment #8788464 - Flags: review?(shu)
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/25fbd73ef427
Fix FunctionBox::setStart to get the correct line/column. r=njn
Comment on attachment 8788102 [details] [diff] [review]
Syntax parse arrow functions

Review of attachment 8788102 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for taking this!

::: js/src/frontend/Parser.cpp
@@ +470,5 @@
>      length = fun->nargs() - fun->hasRest();
>      if (fun->lazyScript()->isDerivedClassConstructor())
>          setDerivedClassConstructor();
> +    if (fun->lazyScript()->needsHomeObject())
> +        setNeedsHomeObject();

Good catch.

@@ +3183,5 @@
>          syntaxKind = Getter;
>      else if (fun->isSetter())
>          syntaxKind = Setter;
> +    else if (fun->isArrow())
> +        syntaxKind = Arrow;

If you are so inclined, it'd be nice to get a FunctionKindToFunctionSyntaxKind function that translates between the two, and just use fun->kind().
Attachment #8788102 - Flags: review?(shu) → review+
Comment on attachment 8788464 [details] [diff] [review]
Fix XDR

Review of attachment 8788464 [details] [diff] [review]:
-----------------------------------------------------------------

While you're in the area of XDRLazyScript, could you change the comment "// Code free variables." to "// Code closed-over bindings."? Thanks.

::: js/src/jsscript.cpp
@@ +954,5 @@
>          for (size_t i = 0; i < numInnerFunctions; i++) {
>              if (mode == XDR_ENCODE)
>                  func = innerFunctions[i];
>  
> +            if (!XDRInterpretedFunction(xdr, nullptr, nullptr, &func))

Wait, how did this ever work before if the invariant was that functions inner to lazy scripts must have null source objects?
Attachment #8788464 - Flags: review?(shu) → review+
(In reply to Shu-yu Guo [:shu] from comment #48)
> Wait, how did this ever work before if the invariant was that functions
> inner to lazy scripts must have null source objects?

It didn't work. I think what saved us was that most files we XDR used features we couldn't syntax parse. That's changing now so we get these issues. Intermittents bug 1300257, bug 1300180, etc showed up this week and are the same issue.
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa9899b5b63a
Fix XDR decoding to use a nullptr sourceObject for nested lazy scripts. r=shu
"Windows 7 VM opt M4" is almost perma-orange with these patches (intermittent bug 1298639), so we should fix that first.
Depends on: 1298639
Depends on: 1302645
Blocks: 1303703
I filed bug 1303703 for the destructuring case and I will file another bug to use lazy parsing for JSMs.
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Blocks: 1303754
Depends on: 1296640
No longer depends on: 1296640
Depends on: 1312491
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #27)
> (In reply to Shu-yu Guo [:shu] from comment #25)
> > > My question is if it's this bug, or something else - sounds like something
> > > else.
> > 
> > That destructuring is slow is no surprise. The code we generate is, like,
> > super slow, go through SM equivalent of meta-object protocol calls and stuff.
> 
> Is there a bug for that?

Sorry for bugspam but did you find/file a bug and have a bug # for me?
Flags: needinfo?(gandalf)
You need to log in before you can comment on or make changes to this bug.