Closed Bug 1923595 Opened 1 year ago Closed 1 year ago

5.2% regression on Webassembly-embenchen-compile (WASM optimizing) around 3 Oct 2024 on Windows-mozilla-central

Categories

(Core :: JavaScript: WebAssembly, defect, P1)

defect

Tracking


RESOLVED FIXED
133 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox131 --- unaffected
firefox132 --- unaffected
firefox133 --- fixed

People

(Reporter: mayankleoboy1, Assigned: bas.schouten)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(1 file)

The graph says it all, I think.

There look to be some corresponding improvements in this range, but the dates seem somewhat off.

Linux and Mac for the same period.

Not sure if this is a big enough change, or if it is even worth over-rotating on these sub-benchmarks.

The Bugbug bot thinks this bug should belong to the 'Core::JavaScript: WebAssembly' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: General → JavaScript: WebAssembly
Product: Firefox → Core

Hm, do you see anything interesting in that regression range?

JS related bugs are: bug 1921963, bug 1904995, bug 1904957 (may impact compiling on multiple threads).

But nothing that looks interesting, which is what I don't understand. And because these tests only run on m-c, I don't know how to backfill.

The data point just before the regression contains bug 1919025 part 3 (removal of the old dominator stuff). But even that shouldn't affect things.

So I can't guess anything.

Yeah I asked because I didn't see anything obvious either :)

It could be bug 1919025 because that also switched to the new code, but a 5% regression from that is a bit unexpected and it means there was one "good" data point still too...

The numbers have improved, probably because of bug 1915863, but they are still not back to the previous baseline.

If I'm reading the graphs right, it looks like the compile regression was just on Windows? If so, that seems like a data point in favor of bug 1904957, which should only affect Windows. If that bug changes the configuration of our thread pool for compilation, that could definitely cause a regression here.

See Also: → 1904957

Set release status flags based on info from the regressing bug 1904957

Other regressions are:
2.2% on wasm-instantiate

Looking at the backout commit, either CreateHeterogeneousCpuInfo() is very expensive or TaskController::GetPoolThreadCount() is called a lot, neither of which seems particularly plausible to me. We could at least check whether the latter is true, though.
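(A minimal way to check the second hypothesis would be a temporary call counter around the pool-size query. The sketch below is purely illustrative and uses a stand-in free function rather than the real TaskController method; the names and the placeholder return value are not from the actual code.)

// Illustrative sketch only (not actual Gecko code): a cheap way to see
// whether a getter like TaskController::GetPoolThreadCount() is hot is an
// atomic counter that logs every Nth call.
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstdio>

static std::atomic<uint64_t> sCalls{0};

// Stand-in for the real method; the placeholder return value is arbitrary.
static size_t GetPoolThreadCountInstrumented() {
  uint64_t n = ++sCalls;
  if (n % 1000 == 0) {
    std::fprintf(stderr, "GetPoolThreadCount called %llu times\n",
                 static_cast<unsigned long long>(n));
  }
  return 8;
}

int main() {
  for (int i = 0; i < 5000; ++i) {
    (void)GetPoolThreadCountInstrumented();
  }
  return 0;
}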

(In reply to Julian Seward [:jseward] from comment #10)

Looking at the backout commit, either CreateHeterogeneousCpuInfo() is very expensive or TaskController::GetPoolThreadCount() is called a lot, neither of which seems particularly plausible to me. We could at least check whether the latter is true, though.

I think it's more likely that implementing CreateHeterogeneousCpuInfo() for Windows changed the number of background threads we have available to compile wasm code on.

(In reply to Ryan Hunt [:rhunt] from comment #11)

(In reply to Julian Seward [:jseward] from comment #10)

Looking at the backout commit, either CreateHeterogeneousCpuInfo() is very expensive or TaskController::GetPoolThreadCount() is called a lot, neither of which seems particularly plausible to me. We could at least check whether the latter is true, though.

I think it's more likely that implementing CreateHeterogeneousCpuInfo() for Windows changed the number of background threads we have available to compile wasm code on.

I agree. This is interesting. I wonder if we'll have to treat E-cores a little differently on Windows to get the best result; what do you think, Olli?
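(For context on what "treating E-cores differently on Windows" could involve: Windows reports the P-core/E-core split through the EfficiencyClass field of the CPU set information. The sketch below only illustrates that OS API and is not the Gecko GetHeterogeneousCpuInfo implementation; note that on a non-hybrid CPU every core reports class 0, so naively deriving a thread count from this data needs care.)

// Illustrative sketch only, assuming Windows 10+; this is not the Gecko
// GetHeterogeneousCpuInfo implementation. It shows how the OS reports the
// P-core/E-core split via EfficiencyClass in the CPU set information.
#include <windows.h>
#include <cstdio>
#include <vector>

int main() {
  ULONG length = 0;
  GetSystemCpuSetInformation(nullptr, 0, &length, GetCurrentProcess(), 0);
  std::vector<char> buffer(length);
  if (!GetSystemCpuSetInformation(
          reinterpret_cast<PSYSTEM_CPU_SET_INFORMATION>(buffer.data()),
          length, &length, GetCurrentProcess(), 0)) {
    return 1;
  }

  int higherClass = 0, classZero = 0;
  for (ULONG offset = 0; offset < length;) {
    auto* info =
        reinterpret_cast<PSYSTEM_CPU_SET_INFORMATION>(buffer.data() + offset);
    if (info->Type == CpuSetInformation) {
      // On hybrid CPUs, P-cores report a higher EfficiencyClass than E-cores
      // (which report 0). On a homogeneous CPU every core reports 0, so a
      // naive "performance cores only" thread count derived from this data
      // would collapse to zero there.
      if (info->CpuSet.EfficiencyClass > 0) {
        ++higherClass;
      } else {
        ++classZero;
      }
    }
    offset += info->Size;
  }
  std::printf("logical CPUs: %d with EfficiencyClass > 0, %d with class 0\n",
              higherClass, classZero);
  return 0;
}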

Flags: needinfo?(bas) → needinfo?(smaug)

Also, it looks like the wasm-misc-baseline compile subtest regressed by 8.6% too (from the backout run I posted). Neither of these is a production configuration, because each disables one of our compilers. I don't have results for our production config (just plain wasm-misc), but I'm triggering some tasks for it. I'd be surprised, though, if it didn't affect that one given that it affected both the baseline and Ion configs.

I'd back out the TaskController change for now.

Do the WASM tests run at all on Apple Silicon? Since the TaskController change was initially only for Apple Silicon Macs / sp3.

Flags: needinfo?(smaug)
Blocks: wasm-perf
Severity: -- → S3
Priority: -- → P1

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #14)

I'd back out the TaskController change for now.

Do the WASM tests run at all on Apple Silicon? Since the TaskController change was initially only for Apple Silicon Macs / sp3.

Fx133 is now in soft code freeze and will go to beta on 2024-10-28.
Is the plan here to back out for now or something else?

Flags: needinfo?(rhunt)
Flags: needinfo?(bas)

(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #14)

I'd back out the TaskController change for now.

Do the WASM tests run at all on Apple Silicon? Since the TaskController change was initially only for Apple Silicon Macs / sp3.

I don't think the wasm benchmarks run on Apple Silicon. Looks like the task [1] runs on Intel Mac Minis.

[1] test-macosx1015-64-shippable-qr/opt-browsertime-benchmark-wasm-firefox

I think backing this out makes sense if there's no immediate fix. Bas, can you back this out?

I was also able to confirm from retriggers that this affected our stock configuration, not just the baseline/ion only variants.
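(Per the attachment description below, the fix stops consulting GetHeterogeneousCpuInfo on Windows when sizing TaskController's thread pool. A hypothetical sketch of that general shape, with placeholder names other than the XP_WIN define, might look like this; it is not the actual patch.)

// Hypothetical sketch of the general shape of such a fix; everything here
// except the XP_WIN define is a placeholder, not the actual TaskController
// code.
#include <cstddef>

static size_t ChoosePoolThreadCount(size_t logicalCores,
                                    size_t performanceCores) {
#if defined(XP_WIN)
  // On Windows, ignore the heterogeneous CPU info for now and size the pool
  // from the plain logical-core count, restoring the pre-regression behavior.
  return logicalCores;
#else
  // Elsewhere (e.g. Apple Silicon), prefer the performance-core count when it
  // is known, which was the motivation for the original change.
  return performanceCores > 0 ? performanceCores : logicalCores;
#endif
}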

Assignee: nobody → bas
Flags: needinfo?(rhunt)
Attachment #9433466 - Attachment description: WIP: Bug 1923595, don't use GetHeterogeneousCpuInfo on Windows to control the number of TaskController's background threads → Bug 1923595, don't use GetHeterogeneousCpuInfo on Windows to control the number of TaskController's background threads
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ef5914d0bdd6
don't use GetHeterogeneousCpuInfo on Windows to control the number of TaskController's background threads r=bas
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 133 Branch
See Also: 1904957
Regressions: 1925510
Flags: needinfo?(bas)