Closed Bug 2004948 Opened 1 month ago Closed 1 month ago

8.54 - 2.79% Base Content Heap Unclassified / Base Content Explicit + 1 more (Linux) regression on Wed December 3 2025

Tracking

()

Status:

VERIFIED FIXED

Milestone:

148 Branch

Tracking Flags:

Tracking

Status

firefox-esr115

---

unaffected

firefox-esr140

---

unaffected

firefox146

---

unaffected

firefox147

fixed

firefox148

fixed

People

(Reporter: intermittent-bug-filer, Assigned: sajidanwar)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(1 file)

Bug 2004948 - Calculate rcap/rch/rex/ric units only after first use in the document. r=emilio 1 month ago Sajid Anwar [:sajidanwar] 48 bytes, text/x-phabricator-request	RyanVM : approval-mozilla-beta+	Details \| Review

Treeherder Bug Filer

Reporter

Description

•

1 month ago

treeherder

Perfherder has detected a awsy performance regression from push 7db60b27689000fe5afe8dacda60e825540200ba. As author of one of the patches included in that push, we need your help to address this regression.

Please acknowledge, and begin investigating this alert within 3 business days, or the patch(es) may be backed out in accordance with our regression policy. Our guide to handling regression bugs has information about how you can proceed with this investigation.

If you have any questions or need any help with the investigation, please reach out to afinder@mozilla.com. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Regressions:

Ratio	Test	Platform	Options	Absolute values (old vs new)
9%	Base Content Heap Unclassified	linux1804-64-shippable-qr	fission	2,535,513.33 -> 2,752,044.00
4%	Base Content Resident Unique Memory	linux1804-64-shippable-qr	fission	7,576,746.67 -> 7,848,277.33
3%	Base Content Explicit	linux1804-64-shippable-qr	fission	7,633,794.00 -> 7,846,737.33

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask afinder@mozilla.com to do that for you.

You can run all of these tests on try with ./mach try perf --alert 47755

The following documentation link provides more information about this command.

Flags: needinfo?(sajidanwar94)

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

1 month ago

Set release status flags based on info from the regressing bug 1740584

status-firefox146: --- → unaffected

status-firefox147: --- → affected

status-firefox148: --- → affected

status-firefox-esr140: --- → unaffected

Sajid Anwar [:sajidanwar]

Assignee

Comment 2

•

1 month ago

Emilio, do you have any advice on this? Looking at the three metrics on PerfHerder, it seems pretty clear that the memory increases are triggered by my root font metrics patch and not random noise. In my patch, we're only retaining the new bool and f32's on the Device for the root font metrics, so that would be tiny. I'm guessing that there must be a cache on the font metrics that we're hitting, even on pages like about:blank that the perf test is using.

Do we need to implement something like what Jonathan suggested in the original bug, of doing the font metrics calculations only if there are root-font-relative units in use the document? I'm trying to think of how to implement that ... we would need the first setter of the used_root_font_metrics atomic to trigger a full restyle/recascade of the root (how?), but I wonder if that means that there would be incorrect values for the root font units for a frame or something like that.

In the meantime, do you think we need to back out the change, say that we'll fix it in the future, or accept it as an expected impact? (I'm not sure who makes that decision.)

Flags: needinfo?(sajidanwar94) → needinfo?(emilio)

Ryan VanderMeulen [:RyanVM]

Updated

•

1 month ago

status-firefox-esr115: --- → unaffected

tracking-firefox147: --- → +

tracking-firefox148: --- → +

Emilio Cobos Álvarez [:emilio]

Comment 3

•

1 month ago

Yeah I agree that it's unfortunate and it's probably caused by the unconditional font metrics lookup.

I wonder if a way we could go about this (and which would also maybe simplify rem and so on, too) would be to, instead of keeping the metrics individually, we could just keep a pointer to the Root's ComputedStyle object, so:

root_style: RwLock<Arc<ComputedValues>>,

Which would be initialized with the default_values, and updated on the root restyle. You still need to track changes, but only when metrics have been used.

So you can do something like:

device.set_root_style(new_values);
if device.used_root_font_metrics() {
    // Actually query for metric changes and restyle, though actually you might not need it?
}

Then the getters would actually query the font metrics with the root font size unconditionally, by reading self.root_style or so.

Rem units could also be changed if needed to read from the style directly, but they are hot, so maybe not. It's a bit more work but maybe worth it, and it should definitely fix it, wdyt? I don't think the regression is worth reverting the patch, specially given it's linux only.

Flags: needinfo?(emilio) → needinfo?(sajidanwar94)

Sajid Anwar [:sajidanwar]

Assignee

Comment 4

•

1 month ago

Hm, so in this approach, the device structure will always contain the root's current styles, and the root font metric getters would always query the font metrics off those current styles (which matches with how the non-root font metrics always perform the query_font_metrics calls). That would make it so that the non-root elements will always have the correct font metrics, so that's good.

We also would need to know when to recascade the entire tree when the root font/metrics have changed. You have // ... though actually you might not need it in your code snippet's comment, but I think you would need to query and compare font metrics in the root restyle, right? Otherwise that means unconditionally recascading on even non-font-related root style changes, just because there's a root font unit somewhere in the document. If we do still need to query and compare font metrics, then we'd still need to store individual root metrics, and then we may as well use those in our getters as well. So then I would think the new root_style pointer would be there to handle only the very first computation — which in effect would implement Jonathan's lazy computation suggestion.

Let me know if I'm missing something, otherwise I'll try to give this a go and see what refactors make sense for it. I'll leave rem out of it for now since it seems trivial with its use of the direct property value. But I wonder if rlh would benefit from the initial-lazy-computation, since I see it makes calls to Gecko_CalcLineHeight and I assume that could be non-trivial, and also we could eliminate the unnecessary line height calculation on the old styles by only comparing it to the stored value on the device.

Regarding the status of the regression itself, I see that the Performance Regressions Policy says that we can "promise to attempt to fix the bug" with "an estimated timeline". I'm definitely going to try and resolve the regression as soon as I can, hopefully a patch in the next few days.

Flags: needinfo?(sajidanwar94)

Emilio Cobos Álvarez [:emilio]

Comment 5

•

1 month ago

Yeah I think we want to query the metrics and restyle only if they have changed, if the root font metrics have been used.

BugBot [:suhaib / :marco/ :calixte]

Comment 6

•

1 month ago

The bug is marked as tracked for firefox147 (beta) and tracked for firefox148 (nightly). However, the bug still isn't assigned.

:fgriffith, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(fgriffith)

Emilio Cobos Álvarez [:emilio]

Comment 7

•

1 month ago

This is getting looked into. The regression is bad but also not the end of the world since it's memory that we'd end up using it very soon after startup.

Assignee: nobody → sajidanwar94

Flags: needinfo?(fgriffith)

David Shin[:dshin]

Updated

•

1 month ago

Severity: -- → S3

Sajid Anwar [:sajidanwar]

Assignee

Comment 8

•

1 month ago

Attached file Bug 2004948 - Calculate rcap/rch/rex/ric units only after first use in the document. r=emilio — Details

Emilio Cobos Álvarez [:emilio]

Updated

•

1 month ago

Blocks: 2005811

Phabricator Automation

Updated

•

1 month ago

Attachment #9532585 - Attachment description: Bug 2004948 - Calculate relative root font units only after first use. r=emilio → Bug 2004948 - Calculate rcap/rch/rex/ric units only after first use in the document. r=emilio

Sajid Anwar [:sajidanwar]

Assignee

Comment 9

•

1 month ago

Updated the patch to address race condition concerns. Good news is that the performance regression looks to have been addressed by the patch's first revision. Results from ./mach try perf --alert 47755:

Suite	Delta (Regression)	Delta (Fix)
Base Content Explicit	+2.73%	-2.72%
Base Content Heap Unclassified	+8.3%	-7.68%
Base Content Resident Unique Memory	+3.18%	-4.27%

The "Base Content Heap Unclassified" delta looks like we didn't fully fix the regression, but that's just because of which denominator is used for calculating the percentage. The actual heap sizes before the regression and after the fix are nearly identical.

Pulsebot

Comment 10

•

1 month ago

Pushed by ealvarez@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/c4fe65e572fe https://hg.mozilla.org/integration/autoland/rev/ae2e8506c6d6 Calculate rcap/rch/rex/ric units only after first use in the document. r=emilio,firefox-style-system-reviewers

Emilio Cobos Álvarez [:emilio]

Comment 11

•

1 month ago

Comment on attachment 9532585 [details]
Bug 2004948 - Calculate rcap/rch/rex/ric units only after first use in the document. r=emilio

Beta/Release Uplift Approval Request

User impact if declined/Reason for urgency: Fixes regression on new feature.
Is this code covered by automated tests?: Yes
Has the fix been verified in Nightly?: No
Needs manual test from QE?: No
If yes, steps to reproduce:
List of other uplifts needed: none
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): Isolated to the new feature.
String changes made/needed: none
Is Android affected?: No

Attachment #9532585 - Flags: approval-mozilla-beta?

Serban Stanca [:SerbanS]

Comment 12

•

1 month ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/ae2e8506c6d6

Status: NEW → RESOLVED

Closed: 1 month ago

status-firefox148: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → 148 Branch

Ryan VanderMeulen [:RyanVM]

Comment 13

•

1 month ago

Comment on attachment 9532585 [details]
Bug 2004948 - Calculate rcap/rch/rex/ric units only after first use in the document. r=emilio

Approved for 147.0b7.

Attachment #9532585 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Ryan VanderMeulen [:RyanVM]

Updated

•

1 month ago

status-firefox147: affected → fixed

Pulsebot

Comment 14

•

1 month ago

uplift

https://github.com/mozilla-firefox/firefox/commit/d36d2f9c9d90 https://hg.mozilla.org/releases/mozilla-beta/rev/e1c5c4a043a2

Mayank Bansal

Comment 15

•

1 month ago

The numbers are back to baseline : https://treeherder.mozilla.org/perfherder/alerts?id=48037&hideDwnToInv=0
(Caveat: these are not yet confirmed and backfilled by perf-sheriffs. But the graph is very clear)

Acasandrei Beatrice (needinfo me)

Comment 16

•

1 month ago

(In reply to Pulsebot from comment #10)

Pushed by ealvarez@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/c4fe65e572fe
https://hg.mozilla.org/integration/autoland/rev/ae2e8506c6d6
Calculate rcap/rch/rex/ric units only after first use in the document.
r=emilio,firefox-style-system-reviewers

Perfherder has detected a awsy performance change from push ae2e8506c6d638db51ea71ae43f861016dd50a59.

No action is required from the author; this comment is provided for informational purposes only.

Improvements:

Ratio	Test	Platform	Options	Absolute values (old vs new)
8%	Base Content Heap Unclassified	linux1804-64-shippable-qr	fission	2,753,328.00 -> 2,545,802.00
4%	Base Content Resident Unique Memory	linux1804-64-shippable-qr	fission	7,855,786.67 -> 7,578,112.00
3%	Base Content Explicit	linux1804-64-shippable-qr	fission	7,849,958.67 -> 7,627,521.33

Need Help or Information?

If you have any questions, please reach out to bacasandrei@mozilla.com. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

Emilio Cobos Álvarez [:emilio]

Updated

•

1 month ago

Status: RESOLVED → VERIFIED

Treeherder Bug Filer

Reporter

Updated

•

1 month ago

Regressions: 2007681

You need to log in before you can comment on or make changes to this bug.