Closed Bug 1895464 Opened 5 months ago Closed 4 months ago

19.1 - 10.26% expedia LastVisualChange / expedia PerceptualSpeedIndex + 1 more (Windows) regression on Mon April 22 2024

Categories

(Core :: JavaScript: GC, defect)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox-esr115 --- unaffected
firefox125 --- unaffected
firefox126 --- unaffected
firefox127 --- wontfix
firefox128 --- wontfix

People

(Reporter: aesanu, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a browsertime performance regression from push a3ad3d387edddd34adc1955985756e48c59a734a. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
19% expedia LastVisualChange windows10-64-shippable-qr cold fission webrender 3,148.09 -> 3,749.50 Before/After
15% expedia LastVisualChange windows10-64-shippable-qr cold fission webrender 3,176.86 -> 3,655.87 Before/After
10% expedia PerceptualSpeedIndex windows10-64-shippable-qr cold fission webrender 1,361.18 -> 1,500.79 Before/After

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
6% cnn largestContentfulPaint macosx1015-64-shippable-qr fission warm webrender 329.58 -> 310.77 Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 30

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(dpalmeiro)

Set release status flags based on info from the regressing bug 1892242

Set release status flags based on info from the regressing bug 1892242

Comparing the before/after for the 19% regression, the before profile has a major GC that finishes just before the largest contentful paint, whereas the after profile has a major GC that starts just before the LCP and continues for quite a while afterwards. It looks like the difference is that the larger nursery sizes let us keep the tenuring rate down and delay the major GC until later. That part seems good.

For some reason, though, the major GC in the after profile takes significantly longer to complete: 14 slices vs 7, 17 minor GCs during the major GC vs 6). It's not clear to me if this has anything to do with the larger nursery, or if it's just that we're doing different work in the mutator at this point. It's not entirely clear from the profile exactly where the last visual change happens, but I assume that the long major GC is the contributing factor here. Maybe we get slowed down slightly by having to run barriers?

My best guess is that we delay a major GC until later (which is generally good), but then it happens to run at a bad time. If that's the case, then there's no reason to expect that this testcase generalizes to other webpages.

Jon, does my analysis make sense? Is there anything else that we should be checking here? If not, I think we can WONTFIX this.

Flags: needinfo?(jcoppeard)

(In reply to Iain Ireland [:iain] from comment #3)

For some reason, though, the major GC in the after profile takes significantly longer to complete

The major GC slices are being trigger by the engine's internal ALLOC_TRIGGER, not by the DOM. Since the nursery is much larger in the after profile, the slices are triggered less frequently (because the allocations are going through the nursery first). I don't know why the DOM is not triggering slices more often.

17 minor GCs during the major GC vs 6

These are almost all triggered by the mutator filling the nursery and are not related to the ongoing major GC. There are even more minor GCs in this section of execution in the before profile.

It's not entirely clear from the profile exactly where the last visual change happens, but I assume that the long major GC is the contributing factor here.

I don't see any evidence to suggest that GC has pushed the last visual change out by 600ms as reported in the first result. The total GC time is only a few % greater in the after profile and it's broken up into slices.

Maybe we get slowed down slightly by having to run barriers?

This will slow things down but I don't think it would be enough to cause a change like this.

Jon, does my analysis make sense? Is there anything else that we should be checking here? If not, I think we can WONTFIX this.

Yes. Agreed that this is WONTFIX. I would really like to be able to see the last visual change on the profile though so we can be clearer about the reported change.

Flags: needinfo?(jcoppeard)

(In reply to Andra Esanu (needinfo me) from comment #0)
This is probably WONTFIX, but I have some questions about the report.

  1. The first two regressions reported have different results but link to the same perfherder results page and profiles. Is that correct? What does this mean?

  2. Is it possible to see what LastVisualChange corresponds to on the profile? It's hard to investigate issues this without being able to see this.

Flags: needinfo?(aesanu)

(In reply to Jon Coppeard (:jonco) from comment #5)

(In reply to Andra Esanu (needinfo me) from comment #0)
This is probably WONTFIX, but I have some questions about the report.

  1. The first two regressions reported have different results but link to the same perfherder results page and profiles. Is that correct? What does this mean?

  2. Is it possible to see what LastVisualChange corresponds to on the profile? It's hard to investigate issues this without being able to see this.

Hello,
For the first question there are two different results because one alert was created by the sheriff after the investigation to mark the right culprit.
For the second question I talked to :sparky and he said we don't have markers for lastVisualChange in the profiles, but we should be able to use the visual progress tracks to find where the last visual change was - however those are broken in these profiles.

Flags: needinfo?(aesanu)
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → WONTFIX
Flags: needinfo?(dpalmeiro)
You need to log in before you can comment on or make changes to this bug.