28 - 2.95% reddit-billgates-ama.members fcp / reddit-billgates-ama.billg-ama LastVisualChange + 22 more (Linux, OSX, Windows) regression on Wed November 13 2024
Categories
(Toolkit :: Startup and Profile System, defect)
Tracking
()
| Version | Tracking | Status |
|---|---|---|
| firefox-esr128 | --- | unaffected |
| firefox133 | --- | unaffected |
| firefox134 | --- | fix-optional |
| firefox135 | --- | affected |
People
(Reporter: intermittent-bug-filer, Unassigned)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Perfherder has detected a browsertime performance regression from push a3590cf454bc8d44e59090e2dde956723b76ca5d. As you are the author of one of the patches included in that push, we need your help to address this regression.
Regressions:
| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 28% | reddit-billgates-ama.billg-ama fcp | macosx1015-64-shippable-qr | cold fission webrender | 146.84 -> 187.96 | Before/After |
| 28% | reddit-billgates-ama.members fcp | macosx1015-64-shippable-qr | cold fission webrender | 146.84 -> 187.96 | Before/After |
| 18% | reddit-billgates-ama.billg-ama loadtime | macosx1015-64-shippable-qr | cold fission webrender | 1,012.55 -> 1,199.74 | Before/After |
| 18% | reddit-billgates-ama.members loadtime | macosx1015-64-shippable-qr | cold fission webrender | 1,012.55 -> 1,199.74 | Before/After |
| 18% | reddit-billgates-ama.billg-ama FirstVisualChange | macosx1015-64-shippable-qr | cold fission webrender | 223.10 -> 262.83 | Before/After |
| 16% | reddit-billgates-post-1.posts ContentfulSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 220.43 -> 254.94 | Before/After |
| 15% | reddit-billgates-post-2.top loadtime | macosx1015-64-shippable-qr | cold fission webrender | 1,009.78 -> 1,157.87 | Before/After |
| 15% | reddit-billgates-post-2.billg loadtime | macosx1015-64-shippable-qr | cold fission webrender | 1,009.78 -> 1,157.87 | Before/After |
| 15% | reddit-billgates-post-2.hot loadtime | macosx1015-64-shippable-qr | cold fission webrender | 1,009.78 -> 1,157.87 | Before/After |
| 14% | reddit-billgates-post-1.billg loadtime | macosx1015-64-shippable-qr | cold fission webrender | 1,011.40 -> 1,157.23 | Before/After |
| ... | ... | ... | ... | ... | ... |
| 8% | reddit-billgates-post-1.posts PerceptualSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 234.20 -> 253.33 | Before/After |
| 7% | reddit ContentfulSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,266.32 -> 1,359.22 | Before/After |
| 7% | reddit SpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,509.47 -> 1,609.51 | Before/After |
| 6% | reddit PerceptualSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,447.08 -> 1,537.03 | Before/After |
| 3% | reddit-billgates-ama.billg-ama LastVisualChange | macosx1015-64-shippable-qr | cold fission webrender | 11,168.62 -> 11,498.36 | Before/After |
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 5% | speedometer3 NewsSite-Nuxt/NavigateToPolitics/Sync | windows11-64-nightlyasrelease-qr | fission webrender | 14.54 -> 13.85 | Before/After |
| 5% | speedometer3 NewsSite-Nuxt/NavigateToUS/Sync | android-hw-a55-14-0-aarch64-shippable | fission webrender | 43.73 -> 41.75 | |
| 4% | speedometer3 TodoMVC-Lit-Complex-DOM/Adding100Items/total | windows11-64-shippable-qr | fission webrender | 11.70 -> 11.23 | Before/After |
| 4% | speedometer3 TodoMVC-WebComponents/DeletingAllItems/total | windows11-64-shippable-qr | fission webrender | 5.20 -> 5.01 | Before/After |
| 4% | speedometer3 TodoMVC-JavaScript-ES5/Adding100Items/Sync | android-hw-a55-14-0-aarch64-shippable | fission webrender | 76.51 -> 73.75 | |
| ... | ... | ... | ... | ... | ... |
| 2% | speedometer3 total | windows11-64-shippable-qr | fission webrender | 1,156.55 -> 1,133.08 | Before/After |
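The Ratio column in the tables above appears to be the relative change between the old and new absolute values, rounded to a whole percent. The following is a minimal sketch of that arithmetic, not Perfherder's actual code; the helper name `percent_change` is hypothetical, and the sample values are copied from the rows above.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive means the value grew)."""
    return (new - old) / old * 100.0


# (old, new) pairs taken from the alert tables above.
samples = {
    "reddit-billgates-ama.billg-ama fcp": (146.84, 187.96),
    "reddit ContentfulSpeedIndex": (1266.32, 1359.22),
    "speedometer3 total": (1156.55, 1133.08),
}

for name, (old, new) in samples.items():
    print(f"{name}: {percent_change(old, new):+.2f}%")
```

Under this reading, the "2.95%" in the bug title corresponds to the LastVisualChange row (11,168.62 -> 11,498.36).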
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run all of these tests on try with `./mach try perf --alert 42596`
The following documentation link provides more information about this command.
For more information on performance sheriffing please see our FAQ.
If you have any questions, please do not hesitate to reach out to afinder@mozilla.com.
Comment 1•1 year ago
Just to clarify, only the following alerts are valid regressions:
Regressions:
| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 7% | reddit ContentfulSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,266.32 -> 1,359.22 | Before/After |
| 7% | reddit SpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,509.47 -> 1,609.51 | Before/After |
| 6% | reddit PerceptualSpeedIndex | linux1804-64-shippable-qr | cold fission webrender | 1,447.08 -> 1,537.03 | Before/After |
The other regressions mentioned in comment 0 are all infra alerts (invalid) and can be discarded.
Sorry for the confusion.
Comment 2•1 year ago
Set release status flags based on info from the regressing bug 1924850
Comment 3•1 year ago
It has been over 7 days with no activity on this performance regression.
:jhirsch, since you are the author of the regressor, bug 1924850, which triggered this performance alert, could you please provide a progress update?
If this regression is something that fixes a bug, changes the baseline of the regression metrics, or otherwise will not be fixed, please consider closing it as WONTFIX. See this documentation for more information on how to handle regressions.
For additional information/help, please needinfo the performance sheriff who filed this alert (they can be found in comment #0), or reach out in #perftest, or #perfsheriffs on Element.
For more information, please visit BugBot documentation.
Comment 4•1 year ago
IIUC, the reddit graphs for the tests from comment 1 show the same period of worse numbers (roughly Nov 13 to Nov 25) as the other graphs from comment 0.
Are we sure those are real?
Comment 5•1 year ago
(In reply to Jens Stutte [:jstutte] from comment #4)
> IIUC the reddit graphs for the tests from comment 1 show the same period ~ from Nov 13 to Nov 25 with worse numbers as the other graphs from comment 0.
> Are we sure those are real?
Hi Jens! Thanks for reaching out!
As mentioned previously, only the regressions listed in comment 1 are valid (ContentfulSpeedIndex, SpeedIndex and PerceptualSpeedIndex on linux1804-64-shippable-qr for reddit). The remaining regressions from comment 0 that do not appear in comment 1 are marked as infra and should therefore be discarded.
Infra alerts are alerts whose graphs, upon retriggering or backfilling, align with the performance trend established after the culprit revision rather than with the previous trend. They are therefore invalid, and can be caused by various changes in the hardware infrastructure on which the tests run. The following graph is an example of a visible infra alert from the alerts linked in comment 0: for revisions e22456853973 and 3c174ea10f04, highlighted in the graph, the retriggers align with the performance trend established after a3590cf454bc8d44e59090e2dde956723b76ca5d. The infra alerts were included in comment 0 because the "File Bug" feature does not currently filter them out (this will be fixed eventually).
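The check described above can be sketched roughly as follows. This is only an illustration of the idea, not how Perfherder or the sheriffs actually classify alerts. The function name `looks_like_infra`, the use of medians, and all sample numbers are assumptions made for the example, and it assumes the retriggers were run on revisions from before the culprit push.

```python
from statistics import median


def looks_like_infra(before, after, retriggers):
    """Return True if retriggered runs sit closer to the post-culprit level.

    before:     original results from revisions before the culprit push
    after:      results from revisions after the culprit push
    retriggers: new runs of pre-culprit revisions (hypothetical input)
    """
    old_level = median(before)
    new_level = median(after)
    retrigger_level = median(retriggers)
    # If fresh runs of old revisions match the new trend, the code change
    # cannot explain the shift, which points at the test infrastructure.
    return abs(retrigger_level - new_level) < abs(retrigger_level - old_level)


# Made-up values, loosely in the range of the reddit ContentfulSpeedIndex graph.
before = [1260.0, 1270.0, 1266.0]
after = [1355.0, 1362.0, 1359.0]

print(looks_like_infra(before, after, [1350.0, 1361.0, 1358.0]))  # retriggers match new trend
print(looks_like_infra(before, after, [1262.0, 1268.0, 1271.0]))  # retriggers match old trend
```

A real analysis would also have to account for noise and for how many retriggers are available; comparing medians is just the simplest way to show the comparison.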
One thing I noticed in the graph, which was not visible when this performance regression bug was filed, is that starting with revision 61d11aa9346e (which also generated an improvement alert, later also marked as infra), the graphs reported in comment 1 reverted to their initial performance trends. This suggests they might also be infra, or that the performance regression was fixed in the meantime.
I started some retriggers before the original culprit revision and within the following range, and will return on Monday to check if they also turn out to be infra, or can still be considered valid regressions.
Comment 6•1 year ago
(In reply to Alex Finder from comment #5)
Following up from the previous comment, I added some re-triggers and backfills before revision a3590cf454bc8 to get a clearer graph. Will revisit the results tomorrow and check.
Comment 7•1 year ago
Returning with an analysis of the tests after the retriggers: the graphs now match the same infra pattern as the other tests reported in comment 0. I'll mark the bug as Invalid and unlink it from the alert summary. Sorry for the confusion!