4.91 - 4.9% browser_ml_suggest_feature_perf.js NER-pipeline-ready-latency / browser_ml_suggest_feature_perf.js NER-initialization-latency (Linux) regression on Fri January 17 2025
Categories
(Core :: Memory Allocator, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr128 | --- | unaffected |
| firefox134 | --- | unaffected |
| firefox135 | --- | unaffected |
| firefox136 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: jstutte)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Attachments
(1 file)
Perfherder has detected a mozperftest performance regression from push 0927646a79dd3149e2e7463d01adb43e6295cba1. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 3% | browser_ml_suggest_feature_perf.js INTENT-total-memory-usage | linux1804-64-shippable | 591.08 -> 607.00 | |
| 3% | browser_ml_suggest_feature_perf.js NER-total-memory-usage | linux1804-64-shippable | 591.08 -> 607.00 |
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 5% | browser_ml_suggest_feature_perf.js NER-pipeline-ready-latency | linux1804-64-shippable | 14,547.58 -> 13,833.33 | |
| 5% | browser_ml_suggest_feature_perf.js NER-initialization-latency | linux1804-64-shippable | 14,566.58 -> 13,852.25 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run all of these tests on try with ./mach try perf --alert 43409
The following documentation link provides more information about this command.
For more information on performance sheriffing please see our FAQ.
If you have any questions, please do not hesitate to reach out to bacasandrei@mozilla.com.
Updated•1 year ago
|
Comment 1•1 year ago
|
||
Set release status flags based on info from the regressing bug 1903758
| Assignee | ||
Comment 2•1 year ago
|
||
Looking at the graphs suggests that the higher memory usage alternated with the lower one for a while. The interesting thing is also that the number is always the same, if it came from internal differences in mozjemalloc I'd expect to see more variations. :tarek, can you help with the interpretation here?
Comment 3•1 year ago
|
||
interesting indeed. Thanks Jens.
Chidam did you change the model version in that timespan?
Hi Tarek,
The last version update for NER models were on Nov 29, 2025
https://remote-settings.mozilla.org/v1/admin/#/buckets/main-workspace/collections/ml-inference-options/records
No changes on Jan 17, 2025 for these models yet as experiments are running since Jan 14th.
Thanks!
| Assignee | ||
Comment 5•1 year ago
|
||
IIUC this measure is based on explicit memory which might be related, depending on the (non-)delay between running the test and taking the memory measurement.
yet as experiments are running since Jan 14th.
However it might look like the parallel measures with the higher value start from Jan 7 which is well before bug 1903758 landed. There might be a chance that there always was a race between triggering or not a purge before the measure was taken and that my patch made this race explicit?
My patch might be just working as expected, we could probably try to generate a profile to see when purge is happening in this test with and without the patch:
without patch and with low memory, though there is no guarantee that the race will happen or not, if it exists.
Note that the latency improvements might be real, though largely outperformed by a later change on Jan 23 and probably not worth mentioning anymore.
| Assignee | ||
Comment 6•1 year ago
|
||
So assuming that the test wants to alert us about introducing higher memory consumption through the ML models, I think we should use getReportsExtended with minimizeMemoryUsage == true in order to reduce the noise from the underlying layers, which would be better
with or without my patch.
I'll take a look at the profiles if they can confirm if that is what's happening here, but it sounds quite likely.
| Assignee | ||
Comment 7•1 year ago
|
||
Bug 1931851 introduced an ML specific memory reporting. The intent was to "Use the memory reporter to get a snapshot of the memory used by the inference process right after the inference is over."
Now memory reporting is a bit tricky. Our custom allocator has several ways to keep spare memory that gets occasionally freed when hitting certain thresholds. The reporting introduced by bug 1931851 just loooks at the lump sum of heap allocated memory, which includes all this overhead.
If we want to continue to look at this overall metric, we need to follow the same pattern as AWSY testing does, that is to explicitly minimize the aforesaid overhead before taking the snapshot. We already have the logic in place for this and must just make it accessible from JS.
Note that this may introduce quite some delay between the moment the inference ends and the moment the snapshot is actually taken. It is unclear if this still matches the original intent or if the engine slowly winds down, too, but it is probably the only thing we can do here to get a sufficiently stable measurement that makes regressions worth looking at them.
Updated•1 year ago
|
| Assignee | ||
Comment 8•1 year ago
|
||
I'll take a look at the profiles if they can confirm if that is what's happening here, but it sounds quite likely.
I was not able to find those, but the perf comparison linked on the patch shows what's going on, too.
Comment 10•1 year ago
|
||
| bugherder | ||
Comment 11•1 year ago
|
||
(In reply to Pulsebot from comment #9)
Pushed by jstutte@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6330e3ab3f8e
Minimize memory in getTotalMemoryUsage. r=tarek,xpcom-reviewers,mccr8
Perfherder has detected a mozperftest performance change from push 6330e3ab3f8e131b9b84283e603f45d1aa35d86e.
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 43% | browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-total-memory-usage | macosx1015-64-shippable-qr | 1,305.12 -> 747.08 | |
| 43% | browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_BIG-total-memory-usage | linux1804-64-shippable | 1,561.83 -> 897.00 | |
| 40% | browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_BIG-total-memory-usage | macosx1015-64-shippable-qr | 1,530.08 -> 913.58 | |
| 40% | browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-total-memory-usage | windows11-64-shippable-qr | 1,234.54 -> 744.00 | |
| 39% | browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-total-memory-usage | linux1804-64-shippable | 1,207.08 -> 739.29 | |
| ... | ... | ... | ... | ... |
| 7% | browser_ml_engine_multi_perf.js total-memory-usage | linux1804-64-shippable | 1,524.92 -> 1,411.00 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 43552
For more information on performance sheriffing please see our FAQ.
| Assignee | ||
Comment 12•1 year ago
|
||
To be clear, this patch did just establish a new baseline, reducing also the potential noise on results.
Comment 13•1 year ago
|
||
Description
•