Open Bug 1540646 Opened 5 years ago Updated 2 years ago

Figure out why JetStream 2's async-fs test is slow

Categories

(Core :: JavaScript Engine, task, P2)

Tracking Status
firefox68 --- fix-optional

People

(Reporter: jandem, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Chrome is 4x faster, Safari 2x.

This test uses async generators a lot. In profiles I see us calling from JS => C++ AsyncGeneratorEnqueue => self-hosted AsyncGeneratorNext. There's also promise and generator overhead.
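The request queue behind that call path can be seen from plain JS. The following is a minimal sketch, not code from the benchmark: `numbers` and `demo` are made-up names. Every `next()` call on an async generator returns a promise, and requests issued back to back are queued and settled in order, so each step of a `for await` loop pays for an enqueue plus promise resolution.

```javascript
// Hypothetical example; `numbers` and `demo` are illustrative names only.
async function* numbers() {
    yield 1;
    yield 2;
}

async function demo() {
    const it = numbers();
    // Three requests issued synchronously. Each call returns a promise
    // immediately; the generator body has not produced a value yet.
    const pending = [it.next(), it.next(), it.next()];
    const allPromises = pending.every(p => p instanceof Promise);
    // The queued requests are settled in the order they were issued.
    const results = await Promise.all(pending);
    return { allPromises, results };
}

demo().then(({ allPromises, results }) => {
    console.log(allPromises, results.map(r => r.value), results.map(r => r.done));
});
```

A `for await (const x of numbers())` loop performs the same `next()` and `await` pair once per iteration, which is why the per-item cost is dominated by engine-internal queue and promise work rather than by the loop body.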

The score difference is much lower for me when running only "async-fs" in the shell when comparing against JSC (rev 243690, compiled with perl Tools/Scripts/build-jsc --jsc-only; Ubuntu 16.04, in a VirtualBox VM). There's a larger time difference compared to JSC, though.

Engine        Score    Time
SpiderMonkey  44-47    4 sec
JSC           48-51    3 sec
V8            130-132  1 sec

// File: /tmp/setup.js
// mozjs --no-async-stacks -f /tmp/setup.js cli.js
// v8 /tmp/setup.js cli.js
// jsc -f /tmp/setup.js cli.js

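// Define runString only where the shell does not already provide it.
// Each variant returns a fresh global object that has a loadString
// method for evaluating source text in that global.
var runString;
if (typeof runString !== "function") {
    if (typeof newGlobal === "function") {
        // SpiderMonkey shell: newGlobal() creates the global and its
        // evaluate() runs source text in it.
        runString = function() {
            var g = newGlobal();
            g.loadString = g.evaluate;
            return g;
        };
    } else {
        // V8 shell (d8): use the Realm API instead.
        runString = function() {
            var r = Realm.createAllowCrossRealmAccess();
            var g = Realm.global(r);
            g.loadString = s => Realm.eval(r, s);
            return g;
        };
    }
}

// Fall back to the shell's read() where readFile is not built in.
var readFile;
if (typeof readFile !== "function") {
    readFile = read;
}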

var testList = ["async-fs"];

There's roughly the following distribution between the three different parts of the test:

Part                         Time   Percentage
setupDirectory               1.5 s  37.5%
forEachFileRecursively       1.5 s  37.5%
forEachDirectoryRecursively  1.0 s  25%

The forEachFileRecursively part breaks down as follows:

Sub-part                     Time   Percentage
forEachFileRecursively loop  1.0 s  66.6%
swapByteOrder                0.5 s  33.3%

So this part probably needs bug 1065894.

This may also be important for Fluent. We use async and generators in quite hot loops.

Note that at least Blink/V8 have quite different behavior for their job queue handling when running in the shell vs the browser. (Or did last I checked sometime last year, at least.) In the browser, the queue handling is inlined much more significantly, leading to much faster handling of promise reactions than what we manage. When doing some Promise optimizations in the shell I was very pleased with how we fared compared to V8—until I tested in the browser ...

It's possible that this doesn't matter too much for this test, but given that it's pretty Promise heavy, I'd be surprised.

Ah, interesting to know.

There's definitely quite a bit of async/Promise overhead in this test. For example, when I made the test non-async by changing async generators to normal generators, for-await-of loops to for-of loops, and async functions to normal functions, it completed in about two seconds in the shell. So it got twice as fast compared to the async version.
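To make that conversion concrete, here is a rough sketch of the same recursive walk written both ways. It is not the benchmark's code: `makeTree`, `walkAsync`, `walkSync`, `totalAsync`, `totalSync`, and `compare` are hypothetical names, and the tree is synthetic. Both variants visit the same nodes, so any timing difference comes from the async generator and promise machinery.

```javascript
// Hypothetical stand-in for a directory tree; not JetStream's data.
function makeTree(depth, fanout) {
    const children = [];
    if (depth > 0) {
        for (let i = 0; i < fanout; i++)
            children.push(makeTree(depth - 1, fanout));
    }
    return { size: 1, children };
}

// Async variant: async generator consumed with for-await-of.
async function* walkAsync(node) {
    yield node;
    for (const child of node.children)
        yield* walkAsync(child);
}

async function totalAsync(root) {
    let total = 0;
    for await (const node of walkAsync(root))
        total += node.size;
    return total;
}

// Non-async variant: plain generator consumed with for-of.
function* walkSync(node) {
    yield node;
    for (const child of node.children)
        yield* walkSync(child);
}

function totalSync(root) {
    let total = 0;
    for (const node of walkSync(root))
        total += node.size;
    return total;
}

// Runs both variants over the same tree and reports wall-clock times.
async function compare(depth, fanout) {
    const root = makeTree(depth, fanout);
    let start = Date.now();
    const asyncTotal = await totalAsync(root);
    const asyncMs = Date.now() - start;
    start = Date.now();
    const syncTotal = totalSync(root);
    const syncMs = Date.now() - start;
    return { asyncTotal, syncTotal, asyncMs, syncMs };
}

compare(6, 4).then(r => console.log(JSON.stringify(r)));
```

The absolute numbers from a micro-example like this will not match the benchmark, and `Date.now()` is coarse, so it is only useful for checking that the gap goes in the same direction in a given shell.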

Type: defect → task
Keywords: perf
Priority: -- → P2
Severity: normal → S3