Open Bug 1940709 Opened 29 days ago Updated 16 days ago

[WASM] Running query on https://sqlime.org/ is 30% slower in Nightly

Categories

(Core :: JavaScript: WebAssembly, task, P3)

People

(Reporter: mayankleoboy1, Assigned: jseward)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Attached file Sample sql code.txt
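
For illustration only (a hypothetical sketch; the actual query is in the attached file), a table-creation script of the kind sqlime.org runs might look like:

-- Hypothetical sketch, not the attached "Sample sql code.txt"
CREATE TABLE numbers (
    id    INTEGER PRIMARY KEY,
    value INTEGER NOT NULL
);
INSERT INTO numbers (value) VALUES (1), (2), (3);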

Go to https://sqlime.org/
Copy-paste the query into the box and click run. This will create a table.
Now delete the previous code and copy-paste-run this code:

WITH RECURSIVE calc_loop(counter, result) AS (
    SELECT 1, 1
    UNION ALL
    SELECT counter + 1, result + counter
    FROM calc_loop
    WHERE counter < 100000000
)
SELECT * FROM calc_loop where counter = 1;

Nightly: https://share.firefox.dev/4ai5TRV (40s without the profiler)
Chrome: https://share.firefox.dev/426t3ZB (29s without the profiler)

Mayank: which version of Chrome are you using here?
And where did you get it from?

Assignee: nobody → jseward
Flags: needinfo?(mayankleoboy1)

(In reply to Julian Seward [:jseward] from comment #1)

Mayank: which version of Chrome are you using here?
And where did you get it from?

From Chrome's About page:

Version 131.0.6778.265 (Official Build) (64-bit). It is the Chrome release channel.

I got it from Google's website, probably, a long time back.

Full disclaimer: I created the SQL code using ChatGPT. I only increased the loop iterations manually.

(ni? you back so that you don't miss my response)

Flags: needinfo?(mayankleoboy1) → needinfo?(jseward)
Attached file about:support

There may be multiple problems here. One is that tiering doesn't work
well for this test case, under either eager tiering (oldpipe) or lazy tiering. I
believe the symptoms are consistent with the "stuck in baseline code"
phenomenon which we have theorized to exist but (AFAIK) have not seen until now.

I ran tests on a slowish machine, so I reduced the workload from 100 million
(WHERE counter < 100000000) to 12,345,678. This increases the relative
compilation costs but I think the comparisons are still valid.
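
For reference, the reduced-workload query differs from the original only in the recursion bound (a sketch of the change just described, not a verbatim copy from a test log):

WITH RECURSIVE calc_loop(counter, result) AS (
    SELECT 1, 1
    UNION ALL
    SELECT counter + 1, result + counter
    FROM calc_loop
    WHERE counter < 12345678   -- reduced from 100000000
)
SELECT * FROM calc_loop where counter = 1;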

For baseline-only and Ion-only we have:

Baseline only
* MG::startCompleteTier (BL, 37 imports, 3063 functions)
* MG::finishModule      (BL, complete tier, 0.98 MB in 0.084s = 11.72 MB/s)
* 1 rows, took 15794 ms

Ion only
* MG::startCompleteTier (OPT, 37 imports, 3063 functions)
* MG::finishModule      (OPT, complete tier, 0.98 MB in 0.652s = 1.51 MB/s)
* 1 rows, took 9900 ms

Here, the relative compilation costs and relative runtimes are pretty much as
we expect. However, we expect tiering to produce run times close to those of
Ion-only, and that does not happen:

Eager tiering (oldpipe)
* 1 rows, took 13648 ms

Lazy tiering (including non-GC)
* 1 rows, took 12589 ms

It might be that something strange is happening with lazy tiering, since
that's complex and new. But eager tiering has been around for years and is
affected even more badly.

Hot-block profiling with lazy tiering selected shows that we are executing a
mixture of baseline and optimized code, very roughly 50%/50%, and that split
remains unchanged all the way out to the original workload (100 million)
endpoint. That might mean there's a super-huge function which didn't get
tiered up, but playing with the tiering thresholds makes me doubt that; it
also doesn't explain what happens with eager tiering.

Hence it might be that there's a big, expensive function at the root of the
call tree which is called only once, so its optimised version never runs.
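
A hypothetical follow-up experiment (a sketch only, not something measured above): split the workload across several statement executions, so the engine's Wasm entry point is entered more than once and later executions have a chance to run the optimised code, e.g.

WITH RECURSIVE calc_loop(counter, result) AS (
    SELECT 1, 1
    UNION ALL
    SELECT counter + 1, result + counter
    FROM calc_loop
    WHERE counter < 25000000   -- one quarter of the original bound
)
SELECT * FROM calc_loop where counter = 1;
-- run this statement four times and compare the baseline/optimised split
-- on the later executions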

Flags: needinfo?(jseward)
Severity: -- → N/A
Priority: -- → P3
See Also: → 1871158