Closed Bug 1950558 Opened 1 year ago Closed 1 year ago

3.89 - 2.86% browser_ml_autofill_perf.js AUTOFILL-total-memory-usage / browser_ml_summarizer_perf.js SUM-XENOVA-DISTILBART-CNN-12-6_BIG-peak-memory-usage + 4 more (Linux, Windows) regression on Wed February 19 2025

Tracking

()

Status:

RESOLVED WORKSFORME

Tracking Flags:

Tracking

Status

firefox-esr128

---

unaffected

firefox135

---

unaffected

firefox136

---

unaffected

firefox137

---

fix-optional

firefox138

---

affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Treeherder Bug Filer

Reporter

Description

•

1 year ago

treeherder

Perfherder has detected a mozperftest performance regression from push f9e3f8796616bf24fd9d1c7260bc126fe531d0e0. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio	Test	Platform	Absolute values (old vs new)
4%	browser_ml_autofill_perf.js AUTOFILL-total-memory-usage	linux1804-64-shippable	173.71 -> 180.46
3%	browser_ml_autofill_perf.js AUTOFILL-total-memory-usage	windows11-64-2009-hw-ref-shippable	221.08 -> 228.42
3%	browser_ml_autofill_perf.js AUTOFILL-total-memory-usage	windows11-64-24h2-hw-ref-shippable	221.21 -> 228.29
3%	browser_ml_autofill_perf.js AUTOFILL-total-memory-usage	windows11-64-shippable-qr	221.92 -> 229.00
3%	browser_ml_autofill_perf.js AUTOFILL-total-memory-usage	windows11-64-24h2-shippable	221.96 -> 228.88
3%	browser_ml_summarizer_perf.js SUM-XENOVA-DISTILBART-CNN-12-6_BIG-peak-memory-usage	linux1804-64-shippable	1,298.04 -> 1,335.12

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 44020

The following documentation link provides more information about this command.

For more information on performance sheriffing please see our FAQ.

If you have any questions, please do not hesitate to reach out to bacasandrei@mozilla.com.

Flags: needinfo?(jstutte)

Jens Stutte [:jstutte]

Comment 1

•

1 year ago

I struggle a bit with reading browser_ml_autofill_perf.js, but my assumption would be that the AUTOFILL-total-memory-usage is taken right after loading the model, such that it would be subject of seeing the situation before a lazy purge ever happens.

Could it be that bug 1944040 comment 6 applies also here or is this test already using the very same getTotalMemoryUsage that we tweaked in bug 1944040 ?

IIRC I cannot get a profile for those runs, nonetheless I triggered it FWIW, maybe I misremember.

Flags: needinfo?(jstutte) → needinfo?(tziade)

BugBot [:suhaib / :marco/ :calixte]

Comment 2

•

1 year ago

Set release status flags based on info from the regressing bug 1920451

status-firefox135: --- → unaffected

status-firefox136: --- → unaffected

status-firefox137: --- → affected

status-firefox-esr128: --- → unaffected

Paul Bone [:pbone]

Comment 3

•

1 year ago

It'll be interesting to retest lazy purge on this benchmark once Bug 1928254 and Bug 1947661 land

Jens Stutte [:jstutte]

Comment 4

•

1 year ago

(In reply to Jens Stutte [:jstutte] from comment #1)

IIRC I cannot get a profile for those runs, nonetheless I triggered it FWIW, maybe I misremember.

This time it worked.

In the meantime I noticed that bug 1944790 reverted the change we made in bug 1944040, so I am not surprised to see the same sensitivity of this measure to lazy purge again.

Jens Stutte [:jstutte]

Comment 5

•

1 year ago

(In reply to Paul Bone [:pbone] from comment #3)

It'll be interesting to retest lazy purge on this benchmark once Bug 1928254 and Bug 1947661 land

Yeah, though I think it is quite clear what happened here. So I can see two possible actions for this bug here:

go back to a way of memory reporting equivalent to what we did in bug 1944040.
ignore this regression and take the new baseline for good.

In general I think the way ML wants to do memory reporting needs some careful thinking about what those metrics want to actually measure.

Jens Stutte [:jstutte]

Updated

•

1 year ago

Updated

•

1 year ago

Comment 6

•

1 year ago

Set release status flags based on info from the regressing bug 1920451

status-firefox138: --- → affected

Emilio Cobos Álvarez [:emilio]

Updated

•

1 year ago

status-firefox137: affected → fix-optional

Jens Stutte [:jstutte]

Comment 7

•

1 year ago

IIUC bug 1944790 also introduced some interesting peak-memory-usage test variants. Those did not regress, AFAICT, which confirms that my patch works like expected in that taking memory measurements right after an intense processing ends will most likely see a higher value now for memory.lazypurge.reuse_grace_period ms, at least, than after a purge happened as soon as we went idle. So I'd opt for:

ignore this regression and take the new baseline for good.

with the remaining recommendation to keep in mind that the measured overhead can change lazily based on how long you wait before taking the measurement. The good thing is that the reuse grace period is probably long enough to ensure we do not see too much random noise in those measurements. I'd still recommend for those measures (not for the peak!) to ensure that at least a jemalloc_free_excess_dirty_pages() is called in that process before taking the measurement, but I see no easy way to achieve that from JS, currently.

Marking this WFM as in practice from the ML perspective nothing changed in memory consumption at all.

Status: NEW → RESOLVED

Closed: 1 year ago

Resolution: --- → WORKSFORME

Jens Stutte [:jstutte]

Updated

•

5 months ago

Flags: needinfo?(tziade)

You need to log in before you can comment on or make changes to this bug.

Bugzilla

3.89 - 2.86% browser_ml_autofill_perf.js AUTOFILL-total-memory-usage / browser_ml_summarizer_perf.js SUM-XENOVA-DISTILBART-CNN-12-6_BIG-peak-memory-usage + 4 more (Linux, Windows) regression on Wed February 19 2025

Categories

(Core :: Memory Allocator, defect)

Tracking

()

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Crash Data

Security

(public)

User Story

Description

Regressions:

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Updated

Comment 6

Updated

Comment 7

Updated