Closed Bug 1939719 Opened 1 month ago Closed 27 days ago

6.37% sessionrestore_no_auto_restore (Linux) regression on Mon December 30 2024

Categories

(Firefox :: Session Restore, defect, P2)

defect
Points:
2

Tracking

()

RESOLVED FIXED
136 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox133 --- unaffected
firefox134 --- unaffected
firefox135 --- fixed
firefox136 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jswinarton)

References

(Blocks 1 open bug, Regression)

Details

(4 keywords, Whiteboard: [fidefe-tabgrps-sessionstore])

Attachments

(2 files)

Perfherder has detected a talos performance regression from push 886af1de2c44e2dff7487ab6b04fc0b5cfa48894. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
7% sessionrestore_no_auto_restore windows11-64-shippable-qr e10s fission stylo webrender 334.92 -> 359.33
7% sessionrestore_no_auto_restore windows11-64-shippable-qr e10s fission stylo webrender-sw 315.17 -> 337.17
6% sessionrestore_no_auto_restore linux1804-64-shippable-qr e10s fission stylo webrender 844.58 -> 898.42

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 43253

The following documentation link provides more information about this command.

For more information on performance sheriffing please see our FAQ.

If you have any questions, please do not hesitate to reach out to aglavic@mozilla.com.

Flags: needinfo?(jswinarton)

Set release status flags based on info from the regressing bug 1927767

Set release status flags based on info from the regressing bug 1927767

Assignee: nobody → jswinarton
Blocks: 1907100
Severity: -- → S3
Status: NEW → ASSIGNED
Points: --- → 2
Priority: -- → P2
Whiteboard: [fidefe-tabgrps-sessionstore]

Sorry for the delay on getting an analysis done here. I've just done some investigation and I admit I'm a bit stumped by this.

Re-running the alerted tests on try gave me the following results (for linux-1804-64-shippable-qr):

This is a much smaller differential than what was reported above.

In addition, the original patch modifies the session restore startup routine to add tab groups from the saved session store file, if they exist. However, the session store file used by talos has not been updated to include any tab group data yet. Therefore, this patch should have the following effects:

  • An extra empty array to each window's state is added for closed tab groups
  • There is also a check to ensure that closed tab data ends up in the correct data structure if the tab was closed as part of a group, however, the check always returns false since each tab in the talos data is ungrouped

Taken together, it seems surprising that this would cause such a large performance difference. In any case, if there is one, it is one we would need to accept, since correct restoring of tab groups is essential to the feature.

A caveat: this is the first time I have investigated a performance issue. I don't feel fully confident in my understanding of the charts produced by perfherder, so there might be mistakes in this analysis. If someone could double check this and let me know if it seems right, that would be very helpful!

Flags: needinfo?(jswinarton)

Update: :dao reviewed this and pointed out that I incorrectly specified the test parameters in my first run, so I was not getting the right results. I triggered this again and got the following compare: https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=850482d048c48f2da4445de6105c164e8e345386&newProject=try&newRevision=b665d7125943121fcedf2ee889fcd2b4c59c8ec5&framework=1&page=1

This reproduces the original issue, more or less — the linux build is reporting a delta of 3.96% and the window build has 6.05%.

I will spend some more time investigating and see if I can isolate the issue.

The root cause of this turned out to be that the talos
sessionstore.js file did not have groups, but the call to load groups
during the deferred restore routine was expecting them to exist,
throwing an error. This change fixes the issue.

PerfCompare run: https://perf.compare/compare-results?baseRev=6ccc0ebc69e907e55033ea729a5ca40d0fcb22fe&newRev=5d112ab0e86f98dbb7a9c6f8d40de01564ebec7b&baseRepo=try&newRepo=try&framework=1

Pushed by jswinarton@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4a5f681d4596 Resolve talos sessionstore perf issue r=dao,sessionstore-reviewers
Status: ASSIGNED → RESOLVED
Closed: 27 days ago
Resolution: --- → FIXED
Target Milestone: --- → 136 Branch

The root cause of this turned out to be that the talos
sessionstore.js file did not have groups, but the call to load groups
during the deferred restore routine was expecting them to exist,
throwing an error. This change fixes the issue.

PerfCompare run: https://perf.compare/compare-results?baseRev=6ccc0ebc69e907e55033ea729a5ca40d0fcb22fe&newRev=5d112ab0e86f98dbb7a9c6f8d40de01564ebec7b&baseRepo=try&newRepo=try&framework=1

Original Revision: https://phabricator.services.mozilla.com/D234347

Attachment #9459999 - Flags: approval-mozilla-beta?

beta Uplift Approval Request

  • User impact if declined: Errors restoring saved sessions from disk, if the session file was created before the introduction of tab gruoups.
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: Low to none
  • Explanation of risk level: Makes a check optional in the absence of group data. Doesn't introduce any new functionality.
  • String changes made/needed: None
  • Is Android affected?: no
Attachment #9459999 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

(In reply to Serban Stanca [:SerbanS] from comment #7)

https://hg.mozilla.org/mozilla-central/rev/4a5f681d4596

Perfherder has detected a talos performance change from push 4a5f681d459618d6d41624a6e894c1223a852570.

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
6% sessionrestore_no_auto_restore windows11-64-shippable-qr e10s fission stylo webrender 358.25 -> 337.25

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 43373

For more information on performance sheriffing please see our FAQ.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: