Bug 1836039 (Closed) Opened 1 year ago, Closed 1 year ago

5.43 - 2.16% Base Content JS / Base Content Explicit + 3 more (Linux, OSX, Windows) regression on Mon May 29 2023

Categories

(Firefox :: Translations, defect, P2)

Tracking

RESOLVED FIXED
117 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox-esr115 --- disabled
firefox114 --- unaffected
firefox115 --- disabled
firefox116 --- disabled
firefox117 --- fixed

People

(Reporter: bacasandrei, Assigned: gregtatum)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected an AWSY performance regression from push 8e03224a019c674d3b61c734ae819e8557cef17f. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
5% Base Content JS windows11-64-2009-shippable-qr fission 1,578,667.00 -> 1,664,400.00
5% Base Content JS windows11-64-2009-shippable-qr fission 1,578,680.00 -> 1,664,234.00
5% Base Content JS linux1804-64-shippable-qr fission 1,577,678.67 -> 1,663,064.00
5% Base Content JS macosx1015-64-shippable-qr fission 1,581,432.00 -> 1,661,298.00
2% Base Content Explicit windows11-64-2009-shippable-qr fission 10,389,504.00 -> 10,613,930.67

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(gtatum)

== Change summary for alert #38520 (as of Wed, 31 May 2023 01:19:07 GMT) ==

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
8% tumblr FirstVisualChange linux1804-64-shippable-qr cold fission webrender 408.33 -> 440.00 Before/After
6% tumblr fcp windows10-64-shippable-qr cold fission webrender 261.83 -> 276.58
5% tumblr fcp linux1804-64-shippable-qr cold fission webrender 368.35 -> 387.12 Before/After

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=38520

For what it's worth, this pref is enabled for early beta only, not for release, so I don't think it should be backed out just yet, but this bug should block the desktop release. I'm investigating it.

Assignee: nobody → gtatum
Blocks: 1820261
Flags: needinfo?(gtatum)

It looks like the memory regression comes from code being loaded in the TranslationsChild actor. I was able to reduce it by about 30% by breaking the code out into a separate script and lazy loading it.
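
To sketch the kind of lazy loading I mean (the module URI and export name here are placeholders, not the actual file layout of the patch):

const lazy = {};

// Declaring the import this way does not load the module at startup; it is
// only parsed and evaluated the first time a lazy.* getter is accessed, so
// the bytes stay out of the Base Content JS baseline for pages that never
// touch translations.
ChromeUtils.defineESModuleGetters(lazy, {
  // Placeholder module name and path, for illustration only.
  LanguageDetection: "resource://gre/modules/translations/LanguageDetection.sys.mjs",
});

async function detectPageLanguage(text) {
  // The first access of lazy.LanguageDetection triggers the actual import.
  return lazy.LanguageDetection.detectLanguage(text);
}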

I'm collecting my results in this spreadsheet: https://docs.google.com/spreadsheets/d/1zeCdnSCngoUMG2mzNymardEb6jPNXtpoo30XnLvyaz0/edit#gid=0

I have some patches that reduce the memory increase by 45.77%, and I anticipate a few more places where I can find savings. I'm also working on some speed changes, which involve re-architecting some things in the actor.

The actor refactor I am working on decreases the number of samples recorded. In ~30 seconds of browsing I'm getting:

4 samples in the parent process
4 samples in the content process

https://share.firefox.dev/45MQ434
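
To give a concrete picture of the shape of that refactor, here is an illustrative sketch only; the message names, module path, and methods are placeholders rather than the actual TranslationsChild API. The idea is that the child actor does almost nothing at page load and only builds its expensive machinery when a translation is explicitly requested.

export class TranslationsChild extends JSWindowActorChild {
  #engine = null;

  handleEvent(event) {
    if (event.type === "DOMContentLoaded") {
      // Page load only sends a cheap notification to the parent; no engine,
      // models, or workers are created here, keeping page-load work minimal.
      this.sendAsyncMessage("Translations:MaybeOfferTranslation");
    }
  }

  async receiveMessage(message) {
    if (message.name === "Translations:TranslatePage") {
      if (!this.#engine) {
        // Expensive setup happens lazily, once, on the first explicit request.
        const { TranslationsEngine } = ChromeUtils.importESModule(
          // Hypothetical module path, for illustration only.
          "resource://gre/modules/translations/TranslationsEngine.sys.mjs"
        );
        this.#engine = new TranslationsEngine();
      }
      return this.#engine.translatePage(this.document);
    }
    return undefined;
  }
}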

There is still the issue with the language detection in the DOM Worker taking 842 samples, but that is being addressed in Bug 1836974.

Depends on: 1837078

Note that both regressions reported here are now hidden because of https://bugzilla.mozilla.org/show_bug.cgi?id=1836093#c6

So if you want to check your performance improvements on CI, make sure to re-enable translations in automation first. To do so, you can more or less back out https://hg.mozilla.org/mozilla-central/rev/897932a8a73d (though there is a small documentation update that I added there which would be worth keeping).
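
If you only need this for local or try runs, the equivalent is probably just flipping the relevant pref back on in the profile used for the job. The pref name below is an assumption on my part and may not be exactly what that changeset touched, so double-check against the backout:

// Assumed pref gating the feature; verify against the backed-out changeset.
user_pref("browser.translations.enable", true);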

I got this reduced by 95%, so that only 4,624 bytes are added rather than 85,408 bytes. I still need to clean up the code and get it ready for review.

Depends on: 1836974

Patches are up on Bug 1837078, which should fix the memory issues, but I'd like Bug 1836974 to be done as well before re-enabling the code in automation.

Priority: -- → P2

The severity field is not set for this bug.
:nordzilla, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(enordin)

After Bug 1837078, the results for AWSY are:

+15,720 bytes
+0.99%

That feels acceptable to me for a new feature.

The tumblr test no longer shows any measurable difference.

Marking this as S2 because the regression was large, even though it's probably fixed now.

Severity: -- → S2
Flags: needinfo?(enordin)

Greg, is the regression fixed? Thanks.

Flags: needinfo?(gtatum)

I've done lots of perf work here to address it, and we're going to monitor the effects of Bug 1841341.

Flags: needinfo?(gtatum)

This should have been fixed by recent work. Let's close it and file any potential new regression in a new bug.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED

Was the work done in 116 or in Bug 1841341? Just checking whether there is something to uplift if all looks good.

Flags: needinfo?(gtatum)

This work is in Nightly only, so there is no need to uplift.

Flags: needinfo?(gtatum)
Target Milestone: --- → 117 Branch