Closed Bug 577359 Opened 10 years ago Closed 9 years ago

Don't generate INITPROP/INITELEM for constant initializer elements

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- -
status2.0 --- wanted

People

(Reporter: bhackett, Assigned: bhackett)

References

Details

(Whiteboard: [cib-memory][fixed-in-tracemonkey])

Attachments

(1 file, 9 obsolete files)

Significant overhead is incurred when processing large static initializers.  For the JSON in sunspider's string-tagcloud benchmark, compilation overhead is about 5ms with JM, on top of the unknown overhead incurred by the parser in emitting the INIT* opcodes and by the interpreter in processing them.  When these initializers are all (or mostly) constants, the parser could instead construct the object directly.
This patch is a prototype testing the perf improvement of removing INITPROP/INITELEM opcodes when they are constants.  This cuts 7ms off string-tagcloud for JM (seems 5ms is compilation overhead, 2ms is overhead generating the ops themselves).  Cloning is not being done yet, but in any case won't be necessary provided we can identify code which only runs once --- initializers in global or eval scripts (need to account for eval script caching, can global scripts be run multiple times?).
Assignee: general → bhackett1024
(In reply to comment #1)

If the  TCF_COMPILE_N_GO parser flag is set, global scripts are guaranteed to run exactly once.
super-crude patch attached which avoids constructing parse nodes when processing constants in initializers.  This is 3ms faster than the previous patch --- 10ms saved total, 3ms slower than JSC.  The remaining overhead could be spread over several things: object/array construction overhead, atomization of string constants, miscellaneous frontend overhead.
Attached patch de-cruded parser patch (obsolete) — Splinter Review
Fix up parser patch so string-tagcloud works.  10ms speedup (72ms -> 62ms) by not constructing INITPROP/INITELEM or parse nodes for the JSON.  Much of the rest of the slowdown over JSC looks like regexp and string stuff.  12ms just for page faults.
Attachment #456446 - Attachment is obsolete: true
Attachment #458025 - Attachment is obsolete: true
Nice. This is actually one of our bigger remaining slowdowns right now.
Attached patch working parser patch (obsolete) — Splinter Review
This patch passes jstests and trace-tests.  12ms for me on string-tagcloud (69ms -> 57ms).  Who should review this?

This faster parsing is only used for initializers which can only run once --- not in functions, loops, or non-COMPILE_N_GO scripts.  In the future, it'd be nice to just clone objects in these cases instead; that should be relatively easy to add once this patch is in, but should I think wait until bug 558451 is done.

There are still a couple issues, both marked with TODO in the patch.  First, there are lots of places where the parser code handling destructuring crawls over initializer objects; I think the patch's changes to initializer parse nodes handles these right, but not sure.  Second, are scripts which will be XDR'ed always compiled without COMPILE_N_GO?  Hard to tell from the code.
Attachment #458531 - Attachment is obsolete: true
Comment on attachment 461900 [details] [diff] [review]
working parser patch

Will review by tomorrow mid-afternoon.

/be
Attachment #461900 - Flags: review?(brendan)
Sorry, behind. Nits ahead in case you can fix them:

* Use canonical names (obj, pnp not ppn, vp not pvalue, etc.).
* Localize JSObject *obj in js_EmitTree, etc.
* Use prevailing major comment style, vertical bar of stars -- not

/* blah blah */
/* blah. */

or

/* blah blah
 * blah */

or variations.

* Avoid kludging JSParseNode harder (sorry, old C code, not yet abstracted): add initCount method instead of biasing pn_count by 1 for head object.

* Name parser methods after productions: initExpr, arrayInitExpr, objectInitExpr -- or even ECMA-262 initialiser, arrayInitialiser, objectInitialiser. Or the Yankee spelling ;-).

* Bending JOF_OBJECT by making length 4 is a bit ugly but ok if length is independent of format in the sense that extra byte(s) are ok. Disassembler won't show the flag byte, though, so suggest another format or a different immediate encoding.

/be
Will fix these.  You might want to hold off reviewing until the next patch --- found some of my assumptions about destructuring were wrong, and need to rework handling to be more straight-forwardly correct.  Should be done later this morning.
Attached patch revised init patch (obsolete) — Splinter Review
Add expandInitializer for getting the full parse tree for an initializer, including constants, for cases like destructuring and generators where this simplifies things.  Fix some related bugs, issues from comment 8.  Passes trace-tests and jstests.
Attachment #461900 - Attachment is obsolete: true
Attachment #462491 - Flags: review?(brendan)
Attachment #461900 - Flags: review?(brendan)
Comment on attachment 462491 [details] [diff] [review]
revised init patch

Needs rebasing, also more nits:

* Init/init vs. initializer spelled out -- pick one and stick to it everywhere.

* else after return in various places, JSParseNode::initializerObjectIsEmpty among them (it could return obj->isArray() ? ... : ...).

* Recanonicalize for loops such as

+        pn2 = pn->initializerList();
+        for (; pn2; atomIndex++, pn2 = pn2->pn_next) {

to have no empty head parts.

* Some livery for the flag byte, say jsopcode.h {G,S}ET_NEWINIT_FLAG(pc), would be slightly clearer than the current open-coding.

* Hacking pn_type to 0 or 1 is narsty, and I'm sure AST encoder hackers including dherman would agree. You could just make a new pn_u variant with carefully aligned members and define pn_objbox and pn_clone as shorthands for its JSObjectBox * and bool members -- that would be much nicer.

/be
(In reply to comment #11)
> * Some livery for the flag byte, say jsopcode.h {G,S}ET_NEWINIT_FLAG(pc), would
> be slightly clearer than the current open-coding.

It's a clone flag, so {G,S}ET_NEWINIT_CLONE_FLAG if the name is not overlong in relation to jsopcode.h peer macros. Or something shorter (GET_CLONE_FLAG).

/be
Attached patch rebased patch fixing comments (obsolete) — Splinter Review
Patch fixing issues in comment 11.  Passes tests (well, same as clean JM) and trace-tests.

Are the results of Decompile accessible to scripts?  This patch is lazy and decompiles, e.g. an uncloned [1,2,3,4,5] to [...], and I don't know if this needs to be fixed immediately or merely soon.
Attachment #462491 - Attachment is obsolete: true
Attachment #462491 - Flags: review?(brendan)
Yep!

javascript:alert(function(){return [1,2,3,4,5];});
Attachment #464249 - Flags: review?(brendan)
Should be easy to decompile.
Yeah, just laziness on my part.  Will post a revised patch shortly.
Attached patch patch fixing decompilation (obsolete) — Splinter Review
This fixes decompilation of initializers to do the right thing in the presence of constant elements on the initializer object.  Tests still pass.
Attachment #464249 - Attachment is obsolete: true
Attachment #464284 - Flags: review?(brendan)
Attachment #464249 - Flags: review?(brendan)
Attached patch patch rebased to TM (obsolete) — Splinter Review
Rebase to current TM, passes jstests and trace-tests.
Attachment #464284 - Attachment is obsolete: true
Attachment #467961 - Flags: review?(brendan)
Attachment #464284 - Flags: review?(brendan)
Sorry, will review today (may be this evening).

/be
Comment on attachment 467961 [details] [diff] [review]
patch rebased to TM

This patch is pointless, given bug 578216 (which is a much better solution).
Attachment #467961 - Flags: review?(brendan)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
I don't agree; large literals are common outside of JSON, though perhaps not in benchmarks.
Blocks: 599557
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Per bug 599557 they're not even uncommon in benchmarks now.  ;)
blocking2.0: --- → ?
Attached patch rebased patch (obsolete) — Splinter Review
Attachment #467961 - Attachment is obsolete: true
Attachment #478928 - Flags: review?(brendan)
blocking2.0: ? → -
status2.0: --- → wanted
No longer blocks: 605426
Comment on attachment 478928 [details] [diff] [review]
rebased patch

General naming nit:

InitNeedsClone -> InitializerNeedsCloning
TCF_UNCLONED_INIT -> TCF_UNCLONED_INITIALIZER

Still a mix of "Init" or "init" vs. "Initializer" or "intializer". One way to go is the above renames, but I'd be just as fine going the shorter way, provided you tack on "Object" after "Init" in a few places that would need it.

In jsparse.cpp, perhaps inititializer works, but then no need for Object in initializerObjectIsEmpty. Local convention for the parser favoring more verbose and grammar-oriented names might consider the nonterminal names from ECMA-262, even.

>+    /*
>+     * pn_type indicates whether the object needs to be cloned. Tack this onto the
>+     * object operation.
>+     */
>+    ptrdiff_t offset = EmitCheck(cx, cg, JSOp(0), 1);
>+    if (offset < 0)
>+        return false;
>+    *CG_NEXT(cg)++ = (jsbytecode) first->pn_cloned;

Comment seems stale here.

>+InitNeedsClone(JSTreeContext *tc)
>+{
>+    /* In a script which might be executed multiple times. */
>+    if (!(tc->flags & TCF_COMPILE_N_GO))
>+        return true;
>+
>+    /* In a function which might be called multiple times. */
>+    if (tc->flags & TCF_IN_FUNCTION)
>+        return true;

Nit: s/which/that/ for defining clauses.

Non-nit: should test ((tc->flags & (TCF_IN_FUNCTION | TCF_FUN_MODULE_PATTERN)) == TCF_IN_FUNCTION), or something like that, here. Check for problems with module pattern in loop.

> js_EmitTree(JSContext *cx, JSCodeGenerator *cg, JSParseNode *pn)
> {
>     JSBool ok, useful, wantval;
>     JSStmtInfo *stmt, stmtInfo;
>     ptrdiff_t top, off, tmp, beq, jmp;
>     JSParseNode *pn2, *pn3;
>+    JSObject *obj;

Is this obj now unused?

>+        /*
>+         * Check if the object needs to be cloned, and if so make parse nodes
>+         * for all its elements now.  This needs to happen during compilation
>+         * as during parsing we don't know whether the code will run multiple times.
>+         * TODO: should have a way of deep cloning initializer objects
>+         * and scopes, so this can be handled better.

Use FIXME instead of TODO and file that bug so you can cite it. Also cut the "scopes" junk.

>+js_XDRInitObject(JSXDRState *xdr, JSObject **objp, JSBool isArray)
>+{
>+    /*
>+     * Objects in XDR scripts should always need cloning, and will thus be empty.
>+     * TODO: make sure this is the case.
>+     */
>+    if (xdr->mode == JSXDR_DECODE) {
>+        JSObject *obj;
>+        if (isArray)
>+            obj = js_NewArrayObject(xdr->cx, 0, NULL);
>+        else
>+            obj = NewBuiltinClassInstance(xdr->cx, &js_ObjectClass);

Nit: use JSObject *obj = isArray ? ... : ...; instead of if-else, bonus is initialized decl.

>diff --git a/js/src/jsopcode.cpp b/js/src/jsopcode.cpp
>--- a/js/src/jsopcode.cpp
>+++ b/js/src/jsopcode.cpp
>@@ -404,16 +404,18 @@ js_Disassemble1(JSContext *cx, JSScript 
>             else
>                 obj = script->getRegExp(index);
>             v = OBJECT_TO_JSVAL(obj);
>         }
>         bytes = ToDisassemblySource(cx, v);
>         if (!bytes)
>             return 0;
>         fprintf(fp, " %s", bytes);
>+        if (len == 4)
>+            fprintf(fp, " %d", pc[3]);

Comment this if (len == 4) statement, blank line before the comment as usual.

>+            for (size_t ind = 0; ind < object->getArrayLength(); ind++) {

Canonical name is i not ind. Type of i should be unsigned or uintN.

>+                const Value &v = object->getDenseArrayElement(ind);

Indeed grep shows

jsobj.h:    inline const js::Value &getDenseArrayElement(uintN idx);

>+            if (closeBrackets)
>+                return SprintCString(&ss->sprinter, "]");
>+            return todo;
>+        } else {

Else after return non-sequitur.

>+                const Shape *shape = object->lastProperty();
>+                while (shape->slot != ind)
>+                    shape = shape->previous();

What causes slot gaps?

>+            /*
>+             * Check that the left side of the 'in' is valid.
>+             * TODO: no need to expand initializers here since this is in
>+             * a loop, but that is likely to change.
>+             */

FIXME: bug NNNNNNN not TODO -- or it won't happen :-P.

>+        for (size_t ind = 0; ind < length; ind++) {
>+            JSParseNode *pnval = initializerValueNode(begin, obj->getDenseArrayElement(ind));

ind -> i, size_t -> uintN again.

>+            JSParseNode *pncolon =
>+                JSParseNode::newBinaryOrAppend(TOK_COLON, JSOP_INITPROP, pnid, pnval, tc);

pncolon null check and failure propagation here.

>+          default:
>+            pn = assignExpr();
>+            if (!pn)
>+                return false;
>+            goto fallthrough;

Brute-force downward goto, but frowned upon still these days. Use a flag variable?

>+        }
>+
>+        /* Check to see if we parsed the entire initializer element. */
>+        tt = tokenStream.peekToken();
>+        switch (tt) {
>+          case TOK_COMMA:
>+          case TOK_RC:
>+          case TOK_RB: {
>+            if (!pn) {
>+                /*
>+                 * Parsed this entry, add it to the object directly.
>+                 * TODO: if value refers to an object, it is not pinned here. OK?
>+                 */
>+                jsid id;
>+                jsint intid;
>+                if (strid)
>+                    id = ATOM_TO_JSID(strid);
>+                else if (JSDOUBLE_IS_INT32(numid, &intid) && INT_FITS_IN_JSID(intid))
>+                    id = INT_TO_JSID(intid);
>+                else if (!js_InternNonIntElementId(context, object, DoubleValue(numid), &id))
>+                    return false;
>+
>+                /* Add to the object only if there is not an identically named property. */
>+                JSObject *obj2;
>+                JSProperty *prop;
>+                if (!object->lookupProperty(context, id, &obj2, &prop))
>+                    return false;
>+                if (prop) {
>+                    pn = initializerValueNode(begin, value);
>+                    break;
>+                } else {

else after break but you don't need the break anyway, it's right below.

>+                    return object->defineProperty(context, id, value, NULL, NULL, JSPROP_ENUMERATE);
>+                }

Suggest inverting the if test: if (!prop) return ...; and then no overindentation on the pn = inintializerValueNode(begin, value).

>+                pn->pn_tail = (--pn->pn_count == 1)
>+                    ? &pn->pn_head->pn_next
>+                    : &pn->pn_head;

Fits on one line in 100 columns? If not, indent ? and : to underhang (.

>+                *pn->pn_tail = NULL;
>+
>+                pntop = comprehensionTail(pnexp, pn->pn_blockid,
>+                                          TOK_ARRAYPUSH, JSOP_ARRAYPUSH);

Fits in 100 columns? Look out for other such.

More in a bit.

/be
Depends on: 606477
What's the status of this?  I was just comparing Kraken's under -j vs. -j -m -p and the compilation of big literals seemed to be the main difference.  Sure that's outside the timing loop but still seems worth fixing.
Duplicate of this bug: 599557
This bug is blocked on bug 606477; once that lands, I'm going to rewrite this to be less simpler and less invasive (still make parse trees, just emit a new opcode for singleton literals containing only constants).  Even if this isn't perf critical, the giant literals suck up memory for real websites (e.g. bug 605752), and also the type inference hates them.
Attached patch patchSplinter Review
Attachment #478928 - Attachment is obsolete: true
Attachment #478928 - Flags: review?(brendan)
Comment on attachment 491870 [details] [diff] [review]
patch

This patch emits JSOP_OBJECT for initialisers consisting entirely of constants which only run once in a global or eval script.  This patch seems to work, modulo the following assumptions (neither of which would be difficult to fix):

- compileAndGo scripts cannot be XDR'ed
- compileAndGo global and eval scripts cannot be decompiled

Times to eval() a 350K initialiser, bypassing the JSON parser in JSC and SM.

V8: 50ms
JSC: 96ms
JM (old): 285ms
TM (old): 105ms
JM/TM (new): 79ms

To get to V8 levels of performance I think we would have to avoid constructing the parse nodes, as in earlier iterations of this patch.  That would be a good deal more complex and would not help the memory situation any more than this does (since parse nodes are destroyed immediately after parsing).
Attachment #491870 - Attachment description: WIP → patch
Attachment #491870 - Flags: review?(brendan)
(In reply to comment #29)
> 
> - compileAndGo scripts cannot be XDR'ed

Stupid question time:  what is a "compile and go" script?  And what are the alternative kinds?  Thanks :)
(In reply to comment #30)
> (In reply to comment #29)
> > 
> > - compileAndGo scripts cannot be XDR'ed
> 
> Stupid question time:  what is a "compile and go" script?  And what are the
> alternative kinds?  Thanks :)

http://infomonkey.cdleary.com/questions/27/what-is-compile_n_go
Review ping for brendan.
I got bogged ddown partway through (comment 24). Brian, have you incorporated the comments I did have, and rebased? I'll review the rest, whether interdiff works or not (I can diff the patches too).

/be
The current patch is a complete rewrite, but is only 1/5 as big as earlier ones and much less invasive.
blocking2.0: - → betaN+
status2.0: wanted → ---
Comment on attachment 491870 [details] [diff] [review]
patch

>+/* Whether the current position of cg is such that generated code will only run once. */
>+static inline bool
>+IsSingletonNode(JSCodeGenerator *cg)
>+{
>+    if (!cg->compileAndGo() || cg->inFunction())
>+        return false;
>+    for (JSStmtInfo *stmt = cg->topStmt; stmt; stmt = stmt->down) {
>+        if (STMT_IS_LOOP(stmt))
>+            return false;
>+    }
>+    cg->flags |= TCF_HAS_SINGLETONS;
>+    return true;
>+}

Make this a method on cg and call it inSingletonContext? There are no nodes in sight.

>+GetConstantValue(JSContext *cx, JSCodeGenerator *cg, JSParseNode *pn, Value *vp)
>+{

This one wants to be a method on JSParseNode. Instead of passing big fat cg just for cg->needStrictChecks(), pass that bool in.

>+      default:
>+        JS_NOT_REACHED("Unexpected node");
>+        return false;
>+    }
>+    return false;

Lose the final return false or the default case one, no need for both. Prevailing style wants to cover all cases with default:; if nothing better (GCC may warn if enum is not exhausted), but also tries to put a return at the bottom of the function where it won't be missed skimming. Your call.

>+EmitSingletonInitialiser(JSContext *cx, JSCodeGenerator *cg, JSParseNode *pn)

Truly these Emit* statics want to be cg methods but don't change this one. Separate bug.

>+        if (!cg->hasSharps() && !(pn->pn_xflags & PNX_NONCONST) && IsSingletonNode(cg)) {
>+            if (!EmitSingletonInitialiser(cx, cg, pn)) {
>+                js_ReportOutOfMemory(cx);

This ReportOOM is redundant, Emit* already did that -- we report as soon as possible and propagate failure.

>+        if (!cg->hasSharps() && !(pn->pn_xflags & PNX_NONCONST) && IsSingletonNode(cg)) {
>+            if (!EmitSingletonInitialiser(cx, cg, pn)) {
>+                js_ReportOutOfMemory(cx);
>+                return JS_FALSE;
>+            }

Ditto above comments, and can you common harder instead of repeating here?

> BEGIN_CASE(JSOP_OBJECT)
> {
>     JSObject *obj;
>     LOAD_OBJECT(0, obj);
>-    /* Only XML and RegExp objects are emitted. */
>+    /* Only XML, RegExp and singleton initialiser objects are emitted. */

Comment style wants a blank line before whole-line-or-greater comments, unless preceding line ends in {.

Is this comment really pulling its weight? I think it would be better if moved to jsopcode.tbl.

>+    /* Don't cache the script if it contains singleton objects. */
>+    /*if (!script->hasSingletons)*/ {

This shouldn't be commented out, per IRC convo.

>+IsConstantExpr(JSParseNode *pn)

This too wants to be a pn method.

>-                    pn->pn_xflags |= PNX_HOLEY;
>+                    pn->pn_xflags |= (PNX_HOLEY | PNX_NONCONST);

Don't overparenthesize.

>-                pn->pn_xflags |= PNX_DESTRUCT;
>+                pn->pn_xflags |= (PNX_DESTRUCT | PNX_NONCONST);

Ditto.

r=me with these fixed.

/be
Attachment #491870 - Flags: review?(brendan) → review+
Talked to bhackett, we once again said "there oughtta be an assertion" against XDR'ing CNG scripts. And the constant initialiser does need to be decompiled, for diagnostics blaming the source expression that contained a null or undefined:

[1,2,3].no.way

/be
Whiteboard: [cib-memory]
blocking2.0: betaN+ → -
status2.0: --- → wanted
http://hg.mozilla.org/tracemonkey/rev/d4f969511664

This fixes a couple bugs in the earlier patch.  There was a memory leak wrt the eval cache, where not inserting a script into the cache caused it to (sometimes or always) leak later.  This also fixes a pre-existing behavior where JSOP_OBJECT could be emitted for regexp literals, which will not pick up subsequent changes to the RegExp statics object.
Whiteboard: [cib-memory] → [cib-memory][fixed-in-tracemonkey]
http://hg.mozilla.org/mozilla-central/rev/d4f969511664
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Depends on: 655249
You need to log in before you can comment on or make changes to this bug.