Open Bug 1169501 Opened 9 years ago Updated 2 years ago

Slow performance in Flux-Capacitor benchmark

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

People

(Reporter: wesj, Unassigned)

References

Details

(Keywords: perf)

Attachments

(2 files)

I saw this post tonight:

https://medium.com/social-tables-tech/we-compared-13-top-flux-implementations-you-won-t-believe-who-came-out-on-top-1063db32fe73

with a benchmark inside that we did poorly (compared to competition) on:
http://jsperf.com/flux-capacitor

Profiling didn't point to anything especially interesting to me (pointed to some fn.call() methods that didn't do anything immediately strange AFAICT). Maybe I'm doing it wrong. Seemed good to have on file.
Did you profile on the JS level or C++?

In a C++ profiler, for the flummox case, I see us spend about 20% of our time in jitcode and then the rest is:

15% under CallOrConstructBoundFunction.  Mostly DoCallFallback, array_concat, and
    DoNewArray under there.  
14% under fun_apply.  About 3.5% intrinsic_OwnPropertyKeys (snapshotting and creating the
    array), 4% SetObjectElement (mostly SetNonexistentProperty adding properties),
    3% Invoke/RunScript overhead.
10% js::GetProperty, which ends up doing PrimitiveToObject and then a
    NativeGetProperty.  I looked into this part a bit more and we're doing .valueOf on a
    primitive string.  Not sure why we're not managing to jit this part into something
    better.
 9% SetPropertyIC::update.  This seems to mostly be adding properties.
 6% CallProperty.  This is also for .valueOf on a string, and also doing a bunch of
    PrimitiveToObject/NativeGetProperty.
 4% InvokeFunction
 4% NewDenseArray
 3% GetElementIC::update, mostly getting the value from an UnboxedPlainObject.
 3% SetObjectElement.
 3% NewObjectOperationWithTemplate
 2% obj_keys
 2% obj_create

and then some long-tail stuff.
Attached file profile.json
Just the built in one with Gecko platform data turned on. This is from release (38) as well. The line numbers in the profiler never seem to match anything in the source which confuses me.
The altjs case is similar.  Only about 25% spent in jitcode, then:

12% GetProperty (again on a primitive)
11% SetPropertyIC::update.
 8% CallProperty on a primitive.
 6% SetObjectElement
 5% GetElementIC::update
 
etc.

I tried inlining str_toString (which is what String.prototype.valueOf calls) in MCallOptimize, but the inlining fails because the getInlineReturnType is MIRType_Value, not MIRType_String for some reason.  And also because thisArg->type() is MIRType_Value (except when it's MIRType_Int32??).  But trying to optimize those valueOf calls on a string seems like the right place to start here...
Flags: needinfo?(jdemooij)
Attached file Standalone test
Here's a standalone testcase. I get the following results:

Nightly:

Flummox: 2662 ms.
Alt.js:  1828 ms.

Chrome 45:

Flummox: 1342 ms.
Alt.js:  1109 ms.

Safari:

Flummox: 1149 ms.
Alt.js:  1057 ms.
The <string>.valueOf is coming from the function below. This is the Immutable.is(a, b) function (part of Facebook's immutable library).

Unminified: https://github.com/facebook/immutable-js/blob/f15503f9a2d5d86226e8800aa49fcdd1b80b744e/src/is.js#L64

Baseline has a stub that should handle the <primitive>.valueOf GETPROPs without constructing a wrapper object.

Ion can nop these, but it requires good type information for |t| en |e|. Here we probably have bad type information, if |t| and |e| have polymorphic types. Inlining this function might help with that though; will dig a bit more.

    function X(t, e) {
        if (t === e || t !== t && e !== e) return !0;
        if (!t || !e) return !1;
        if ("function" == typeof t.valueOf && "function" == typeof e.valueOf) {
            if (t = t.valueOf(), e = e.valueOf(), t === e || t !== t && e !== e) return !0;
            if (!t || !e) return !1
        }
        return "function" == typeof t.equals && "function" == typeof e.equals && t.equals(e) ? !0 : !1
    }
Depends on: 1169611
(In reply to Jan de Mooij [:jandem] from comment #4)
> Flummox: 2662 ms.
> Alt.js:  1828 ms.
> 
> Chrome 45:
> 
> Flummox: 1342 ms.
> Alt.js:  1109 ms.

The patch in bug 1169611 improves this to 1900/1200 ms for us (numbers are pretty noisy though), so a nice win but this needs more work. I'll look into the remaining issues after that lands.
We also spend > 6% under jit::Bailout, bailing out of Ion. That's pretty bad, maybe a bailout loop. That's with the patch for bug 1169611; maybe it exposed this.
Depends on: 1170107
Depends on: 1170124
Depends on: 1171405
I get about 2000/1200 ms now. Pretty noisy, especially the first one, but I think in Chrome it still performs a bit better most of the time.

Clearing the needinfo but hopefully I can look into this more soon.
Flags: needinfo?(jdemooij)
Keywords: perf
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: