Closed Bug 1944040 Opened 1 year ago Closed 1 year ago

4.91 - 4.9% browser_ml_suggest_feature_perf.js NER-pipeline-ready-latency / browser_ml_suggest_feature_perf.js NER-initialization-latency (Linux) regression on Fri January 17 2025

Tracking

()

Status:

RESOLVED FIXED

Milestone:

136 Branch

Tracking Flags:

Tracking

Status

firefox-esr128

---

unaffected

firefox134

---

unaffected

firefox135

---

unaffected

firefox136

---

fixed

People

(Reporter: intermittent-bug-filer, Assigned: jstutte)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(1 file)

Bug 1944040 - Minimize memory in getTotalMemoryUsage. r?tarek,#xpcom-reviewers 1 year ago Jens Stutte [:jstutte] 48 bytes, text/x-phabricator-request		Details \| Review

Treeherder Bug Filer

Reporter

Description

•

1 year ago

treeherder

Perfherder has detected a mozperftest performance regression from push 0927646a79dd3149e2e7463d01adb43e6295cba1. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio	Test	Platform	Options	Absolute values (old vs new)
3%	browser_ml_suggest_feature_perf.js INTENT-total-memory-usage	linux1804-64-shippable		591.08 -> 607.00
3%	browser_ml_suggest_feature_perf.js NER-total-memory-usage	linux1804-64-shippable		591.08 -> 607.00

Improvements:

Ratio	Test	Platform	Options	Absolute values (old vs new)
5%	browser_ml_suggest_feature_perf.js NER-pipeline-ready-latency	linux1804-64-shippable		14,547.58 -> 13,833.33
5%	browser_ml_suggest_feature_perf.js NER-initialization-latency	linux1804-64-shippable		14,566.58 -> 13,852.25

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 43409

The following documentation link provides more information about this command.

For more information on performance sheriffing please see our FAQ.

If you have any questions, please do not hesitate to reach out to bacasandrei@mozilla.com.

Flags: needinfo?(jstutte)

Acasandrei Beatrice (needinfo me)

Updated

•

1 year ago

status-firefox136: --- → affected

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

1 year ago

Set release status flags based on info from the regressing bug 1903758

status-firefox134: --- → unaffected

status-firefox135: --- → unaffected

status-firefox-esr128: --- → unaffected

Jens Stutte [:jstutte]

Assignee

Comment 2

•

1 year ago

Looking at the graphs suggests that the higher memory usage alternated with the lower one for a while. The interesting thing is also that the number is always the same, if it came from internal differences in mozjemalloc I'd expect to see more variations. :tarek, can you help with the interpretation here?

Flags: needinfo?(jstutte) → needinfo?(tziade)

Tarek Ziadé (:tarek)

Comment 3

•

1 year ago

interesting indeed. Thanks Jens.

Chidam did you change the model version in that timespan?

Flags: needinfo?(tziade) → needinfo?(cgopal)

cgopal

Comment 4

•

1 year ago

Hi Tarek,
The last version update for NER models were on Nov 29, 2025
https://remote-settings.mozilla.org/v1/admin/#/buckets/main-workspace/collections/ml-inference-options/records
No changes on Jan 17, 2025 for these models yet as experiments are running since Jan 14th.
Thanks!

Flags: needinfo?(cgopal)

Jens Stutte [:jstutte]

Assignee

Comment 5

•

1 year ago

IIUC this measure is based on explicit memory which might be related, depending on the (non-)delay between running the test and taking the memory measurement.

yet as experiments are running since Jan 14th.

However it might look like the parallel measures with the higher value start from Jan 7 which is well before bug 1903758 landed. There might be a chance that there always was a race between triggering or not a purge before the measure was taken and that my patch made this race explicit?

My patch might be just working as expected, we could probably try to generate a profile to see when purge is happening in this test with and without the patch:

without patch and with low memory, though there is no guarantee that the race will happen or not, if it exists.

with patch

Note that the latency improvements might be real, though largely outperformed by a later change on Jan 23 and probably not worth mentioning anymore.

Jens Stutte [:jstutte]

Assignee

Comment 6

•

1 year ago

So assuming that the test wants to alert us about introducing higher memory consumption through the ML models, I think we should use getReportsExtended with minimizeMemoryUsage == true in order to reduce the noise from the underlying layers, which would be better
with or without my patch.

I'll take a look at the profiles if they can confirm if that is what's happening here, but it sounds quite likely.

Jens Stutte [:jstutte]

Assignee

Updated

•

1 year ago

Comment 7

•

1 year ago

Attached file Bug 1944040 - Minimize memory in getTotalMemoryUsage. r?tarek,#xpcom-reviewers — Details

Bug 1931851 introduced an ML specific memory reporting. The intent was to "Use the memory reporter to get a snapshot of the memory used by the inference process right after the inference is over."

Now memory reporting is a bit tricky. Our custom allocator has several ways to keep spare memory that gets occasionally freed when hitting certain thresholds. The reporting introduced by bug 1931851 just loooks at the lump sum of heap allocated memory, which includes all this overhead.

If we want to continue to look at this overall metric, we need to follow the same pattern as AWSY testing does, that is to explicitly minimize the aforesaid overhead before taking the snapshot. We already have the logic in place for this and must just make it accessible from JS.

Note that this may introduce quite some delay between the moment the inference ends and the moment the snapshot is actually taken. It is unclear if this still matches the original intent or if the engine slowly winds down, too, but it is probably the only thing we can do here to get a sufficiently stable measurement that makes regressions worth looking at them.

Phabricator Automation

Updated

•

1 year ago

Assignee: nobody → jstutte

Status: NEW → ASSIGNED

Jens Stutte [:jstutte]

Assignee

Comment 8

•

1 year ago

I'll take a look at the profiles if they can confirm if that is what's happening here, but it sounds quite likely.

I was not able to find those, but the perf comparison linked on the patch shows what's going on, too.

Pulsebot

Comment 9

•

1 year ago

Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6330e3ab3f8e Minimize memory in getTotalMemoryUsage. r=tarek,xpcom-reviewers,mccr8

amarc

Comment 10

•

1 year ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/6330e3ab3f8e

Status: ASSIGNED → RESOLVED

Closed: 1 year ago

status-firefox136: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → 136 Branch

Alex Finder

Comment 11

•

1 year ago

(In reply to Pulsebot from comment #9)

Pushed by jstutte@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6330e3ab3f8e
Minimize memory in getTotalMemoryUsage. r=tarek,xpcom-reviewers,mccr8

Perfherder has detected a mozperftest performance change from push 6330e3ab3f8e131b9b84283e603f45d1aa35d86e.

Improvements:

Ratio	Test	Platform	Options	Absolute values (old vs new)
43%	browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-total-memory-usage	macosx1015-64-shippable-qr		1,305.12 -> 747.08
43%	browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_BIG-total-memory-usage	linux1804-64-shippable		1,561.83 -> 897.00
40%	browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_BIG-total-memory-usage	macosx1015-64-shippable-qr		1,530.08 -> 913.58
40%	browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-total-memory-usage	windows11-64-shippable-qr		1,234.54 -> 744.00
39%	browser_ml_summarizer_perf.js SUM-ONNX-COMMUNITY-QWEN2.5-0.5B-INSTRUCT_TINY-total-memory-usage	linux1804-64-shippable		1,207.08 -> 739.29
...	...	...	...	...
7%	browser_ml_engine_multi_perf.js total-memory-usage	linux1804-64-shippable		1,524.92 -> 1,411.00

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 43552

For more information on performance sheriffing please see our FAQ.

Jens Stutte [:jstutte]

Assignee

Comment 12

•

1 year ago

To be clear, this patch did just establish a new baseline, reducing also the potential noise on results.

Tarek Ziadé (:tarek)

Comment 13

•

1 year ago

related https://bugzilla.mozilla.org/show_bug.cgi?id=1944790

You need to log in before you can comment on or make changes to this bug.