[WASM] Running query on https://sqlime.org/ is 30% slower in Nightly
Categories: Core :: JavaScript: WebAssembly, task, P3
People: Reporter: mayankleoboy1; Assigned: jseward
References: Blocks 1 open bug
Attachments: 2 files
Go to https://sqlime.org/
Copy-paste the query into the box and click run. This will create a table.
Now delete the previous code and copy-paste-run this code:
WITH RECURSIVE calc_loop(counter, result) AS (
SELECT 1, 1
UNION ALL
SELECT counter + 1, result + counter
FROM calc_loop
WHERE counter < 100000000
)
SELECT * FROM calc_loop WHERE counter = 1;
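For reference, the query forces the engine through roughly 100 million recursive CTE iterations while ultimately returning only the seed row. A minimal Python sketch of the equivalent computation (illustrative only; the real work happens inside SQLite compiled to WASM):

```python
def calc_loop(limit):
    # Mirrors the recursive CTE: seed row (1, 1), then repeatedly
    # produce (counter + 1, result + counter) while counter < limit.
    counter, result = 1, 1
    rows = [(counter, result)]
    while counter < limit:
        counter, result = counter + 1, result + counter
        rows.append((counter, result))
    # The outer SELECT keeps only the row where counter = 1, so the
    # query returns a single row despite iterating the whole loop.
    return [row for row in rows if row[0] == 1]

print(calc_loop(10))  # [(1, 1)]
```

With limit = 100000000 this is essentially a pure compute loop, which is why it is so sensitive to which compilation tier the WASM code executes in.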
Nightly: https://share.firefox.dev/4ai5TRV (40s without the profiler)
Chrome: https://share.firefox.dev/426t3ZB (29s without the profiler)
Assignee
Comment 1 • 28 days ago
Mayank: which version of Chrome are you using here?
And where did you get it from?
Reporter
Comment 2 • 28 days ago

(In reply to Julian Seward [:jseward] from comment #1)
> Mayank: which version of Chrome are you using here?
> And where did you get it from?
From Chrome's About page:
Version 131.0.6778.265 (Official Build) (64-bit)
It is a Chrome release build. I got it from Google's website, probably, a long time back.
Full disclaimer: I created the SQL code using ChatGPT. I only increased the loop iterations manually.
(ni? you back so that you don't miss my response)
Reporter
Comment 3 • 28 days ago
Assignee
Comment 4 • 27 days ago
There may be multiple problems here. One thing is that tiering doesn't work
well for this test case, with neither eager tiering (oldpipe) nor lazy tiering. I
believe the symptoms are consistent with the "stuck in baseline code"
phenomenon which we have theorized to exist, but (AFAIK) not seen until now.

I ran tests on a slowish machine, so I reduced the workload from 100 million
(WHERE counter < 100000000) to 12,345,678. This increases the relative
compilation costs, but I think the comparisons are still valid.
For baseline-only and Ion-only we have:
Baseline only
* MG::startCompleteTier (BL, 37 imports, 3063 functions)
* MG::finishModule (BL, complete tier, 0.98 MB in 0.084s = 11.72 MB/s)
* 1 rows, took 15794 ms
Ion only
* MG::startCompleteTier (OPT, 37 imports, 3063 functions)
* MG::finishModule (OPT, complete tier, 0.98 MB in 0.652s = 1.51 MB/s)
* 1 rows, took 9900 ms
Here, the relative compilation costs and relative runtimes are pretty much as
we expect. However, we expect tiering to produce run times close to that of
Ion-only, and that does not happen:
Eager tiering (oldpipe)
* 1 rows, took 13648 ms
Lazy tiering (including non-GC)
* 1 rows, took 12589 ms
It might be that something strange is happening with lazy tiering, since
that's complex and new. But eager tiering has been around for years and is
affected even more badly.
Hot-block profiling with lazy tiering selected shows that we are executing a
mixture of baseline and optimized code, very roughly 50%/50%, and that split
remains unchanged all the way out to the original workload (100 million)
end-point. That might mean that there's a super-huge function which didn't get
tiered up, but playing with the tiering thresholds makes me doubt that; also it
doesn't explain what happened to eager tiering.
Hence it might be that there's a big function at the root of the call tree,
which has quite a lot of expense, and is called only once. So its optimised
version never runs.
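A toy model of that hypothesis, assuming the tier to execute is chosen only at function entry (an assumption for illustration, not a description of SpiderMonkey's actual mechanism):

```python
def run(calls_to_root, iters_per_call, tier_up_after_iters):
    """Record which tier each loop iteration executes in, when the
    tier is fixed at function entry and never switched mid-call."""
    tiers_executed = []
    optimized_ready = False
    total_iters = 0
    for _ in range(calls_to_root):
        # Tier choice is made once, on entry.
        tier = "ion" if optimized_ready else "baseline"
        for _ in range(iters_per_call):
            tiers_executed.append(tier)
            total_iters += 1
            if total_iters >= tier_up_after_iters:
                optimized_ready = True  # compile finishes mid-run
    return tiers_executed

# A root function entered once stays in baseline for all 10 iterations,
# even though the optimized version became ready after iteration 3.
print(run(calls_to_root=1, iters_per_call=10, tier_up_after_iters=3))

# The same work split across 10 separate calls picks up the optimized
# tier once it is ready.
print(run(calls_to_root=10, iters_per_call=1, tier_up_after_iters=3))
```

Under this model, a large function at the root of the call tree that is entered exactly once never executes its optimized version, regardless of how early tier-up completes, which would match the roughly 50%/50% baseline/optimized split seen in the profiles.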
Updated • 24 days ago