50.72% google-slides ContentfulSpeedIndex (Linux, OSX, Windows) regression on Fri December 23 2022
Categories
(Core :: JavaScript Engine, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr102 | --- | unaffected |
firefox108 | --- | unaffected |
firefox109 | --- | unaffected |
firefox110 | + | fixed |
People
(Reporter: afinder, Assigned: arai)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Attachments
(3 files, 4 obsolete files)
Perfherder has detected a browsertime performance regression from push 7da2203b8c4f8a6c590ce3d85e1b72eca109716c. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
---|---|---|---|---|---|
116% | google-slides fcp | linux1804-64-shippable-qr | bytecode-cached fission warm webrender | 284.15 -> 613.25 | Before/After |
109% | google-slides FirstVisualChange | windows10-64-shippable-qr | bytecode-cached fission warm webrender | 316.50 -> 661.00 | Before/After |
104% | google-slides FirstVisualChange | linux1804-64-shippable-qr | bytecode-cached fission warm webrender | 316.67 -> 646.67 | Before/After |
100% | google-slides fcp | windows10-64-shippable-qr | bytecode-cached fission warm webrender | 309.71 -> 618.83 | Before/After |
53% | google-slides SpeedIndex | linux1804-64-shippable-qr | bytecode-cached fission warm webrender | 502.04 -> 770.33 | Before/After |
52% | google-slides PerceptualSpeedIndex | windows10-64-shippable-qr | bytecode-cached fission warm webrender | 526.58 -> 799.00 | Before/After |
51% | google-slides SpeedIndex | windows10-64-shippable-qr | bytecode-cached fission warm webrender | 530.33 -> 799.33 | Before/After |
47% | google-slides PerceptualSpeedIndex | linux1804-64-shippable-qr | bytecode-cached fission warm webrender | 526.50 -> 776.17 | Before/After |
34% | google-slides ContentfulSpeedIndex | linux1804-64-shippable-qr | bytecode-cached fission warm webrender | 596.08 -> 800.50 | Before/After |
29% | google-slides ContentfulSpeedIndex | windows10-64-shippable-qr | bytecode-cached fission warm webrender | 671.29 -> 864.67 | Before/After |
26% | google-slides ContentfulSpeedIndex | windows10-64-shippable-qr | bytecode-cached fission warm webrender | 696.54 -> 879.58 | Before/After |
4% | yahoo-mail loadtime | macosx1015-64-shippable-qr | bytecode-cached fission warm webrender | 239.92 -> 249.54 | |
4% | facebook fcp | macosx1015-64-shippable-qr | fission warm webrender | 247.29 -> 257.17 | |
3% | facebook loadtime | macosx1015-64-shippable-qr | fission warm webrender | 207.12 -> 212.75 |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=36550
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
For more information on performance sheriffing please see our FAQ.
Reporter | ||
Comment 1•1 year ago
|
||
Hi arai! Could you please also let us know which of the defects mentioned above (1804613, 1804837, 1804842, 1805149) caused the regression ? Thanks!
Assignee | ||
Comment 2•1 year ago
|
||
I'm not sure if I can solve the issue within 3 days.
if I couldn't, please backout the patches for the following bugs, which all has dependencies:
bug 1804842, bug 1804613, bug 1805149, bug 1804849, bug 1804837, bug 1805142, bug 1805645, bug 1806366
Comment 3•1 year ago
|
||
Thanks for pointing out this regression! With holidays, it may take us a about a week to determine a clear answer about what should be backed out. Fortunately, the next merge is Jan 17 so we should be able to resolve this with time to see improvements in results still.
Assignee | ||
Comment 4•1 year ago
|
||
the regression comes from bug 1805149 patches, that moves the bytecode de-duplication from the compilation step to the instantiation step,
which can move some task from off-thread to the main thread and can affect the performance:
"test-linux1804-64-shippable-qr/opt-browsertime-tp6-bytecode-firefox-google-slides" fcp warm score
- Bug 1804837 : 646 ms : https://treeherder.mozilla.org/jobs?repo=try&revision=9954702085ce6f53d1cd7138aab49302431e65ff
- Bug 1804849 : 647 ms : https://treeherder.mozilla.org/jobs?repo=try&revision=cf1a7f2cf84bfb02aefeeca8edb6eb62e0e61aa8
- Bug 1805149 : 646 ms : https://treeherder.mozilla.org/jobs?repo=try&revision=b302f7e5a261e77cb53405aac583745575fa3e25
- Bug 1804613 : 289 ms : https://treeherder.mozilla.org/jobs?repo=try&revision=416fe5a94b199f062edc055a86fb8b7a2c65a81b
- Bug 1804842 : 290 ms : https://treeherder.mozilla.org/jobs?repo=try&revision=3f4f9b0b5333e25a623a4aaca1f10760a1ebda9d
- before : 281 ms : https://treeherder.mozilla.org/jobs?repo=try&revision=c59f1413e41b182a157e37315409acf4dabb8a0d
https://hg.mozilla.org/mozilla-central/rev/343425669f57
https://hg.mozilla.org/mozilla-central/rev/ac0c7c9bbd8e
I'm still not sure why the score gets doubled.
I'll check what's happening here next week.
Comment 5•1 year ago
|
||
Set release status flags based on info from the regressing bug 1805149
Assignee | ||
Comment 6•1 year ago
|
||
I don't see that huge regression in the profile marker or the screenshots.
In the comment #0's first 2 links
- Before : FirstContentfulPaint: 704ms
- After : FirstContentfulPaint : 335ms
In the comment #4, Bug 1804613 vs Bug 1805149
- Bug 1804613 (before) : FirstContentfulPaint : 334ms
- Bug 1805149 (after) : FirstContentfulPaint - 351ms
am I missing something?
Assignee | ||
Comment 7•1 year ago
|
||
so far, what I observed is the following:
- the profile is for the different run than the runs used for the final score
- with profile, there's not much regression observed so far
- without profile, the patch makes the score bimodal, with:
- no regression (fcp ~ 250ms)
- much regression (fcp ~ 650ms)
- even with regression on fcp, FirstVisualChange, etc, the total load time isn't affected
- the visual of the first frame seems to be same between regressed and not-regressed case
- the regressed case doesn't update on some frame where not-regressed case does (see attachment)
I'll see if I can reproduce the regressed case in profile run.
Assignee | ||
Comment 8•1 year ago
|
||
both regressed and non-regressed cases observed in the following profiler run
https://treeherder.mozilla.org/jobs?repo=try&revision=b302f7e5a261e77cb53405aac583745575fa3e25&selectedTaskRun=eQEA59_OT3iYDdX3ARLtug.0
Updated•1 year ago
|
Comment 9•1 year ago
|
||
The bug is marked as tracked for firefox110 (nightly). We have limited time to fix this, the soft freeze is in 14 days. However, the bug still isn't assigned.
:sdetar, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
Assignee | ||
Comment 10•1 year ago
|
||
The bimodal-ness is also observable in the perfharder graph
https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&series=autoland,4405712,1,13&timerange=2592000
https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&series=autoland,4405887,1,13&timerange=2592000
Assignee | ||
Comment 11•1 year ago
|
||
I don't see any notable difference in the JS-related behavior, and the code that's moved from off-thread to main-thread (SharedImmutableScriptData::shareScriptData
called by JSScript::fullyInitFromStencil
) doesn't appear in the profile,
between Navigation::Start
and "fcp" range.
So this should be a scheduling issue.
Assignee | ||
Comment 12•1 year ago
|
||
if we really need to backout, the following is the minimal set of patches that reverts the change:
Assignee | ||
Comment 13•1 year ago
|
||
testing the backout patches (folded) https://treeherder.mozilla.org/jobs?repo=try&revision=045177b0eab519fe6217b88e9bae8e49531d2873
Comment 14•1 year ago
•
|
||
I filed Bug 1808824 to seek out more information since this issue seems to be preexisting. The 400 ms change is due to slight timing changes leading to us still being paint-suppressed during the vsync, and then the DOM code deciding to execute 400 ms of scripts back-to-back without allowing any vsync to process until they are done.
This seems to be a very specific timing issue that happens to depend on how the test case is setup and the hardware it runs on. Slight changes to configuration give the same very different results. I don't think we should back out any patches.
Comment 15•1 year ago
|
||
Tooru, are you planning a fix before the merge or should we target a uplift to beta? Thanks
Comment 16•1 year ago
|
||
The regression on yahoo-mail
appears to be real. The stack samples seem pretty unreliable, but looking at markers I can see a clear 10ms of additional stencil instantiate time with the changes. This screenshot is looking at the gaps between script-execution in a single nsHtml5ExecutorReflusher
. This 10ms is the matches the pageload regression.
https://s.yimg.com/nq/nr/js/mail_app_es6_vendor_DJd61cYxTk1IkcyOrb6RX_nl4UD9dabo4MtjaSYMW4k_v1.js
: 3.7 ms -> 7.3 ms
https://s.yimg.com/nq/nr/js/mail_app_es6_093W1xWOMPPEBgMia6YydArbgLeqNX3lfXmR7pgDliA_v1.js
: 19 ms -> 25 ms
Assignee | ||
Comment 17•1 year ago
|
||
(In reply to Pascal Chevrel:pascalc from comment #15)
Tooru, are you planning a fix before the merge or should we target a uplift to beta? Thanks
fix for google-slides is landed in bug 1808824, and those 20-100% regressions will go away.
I'm going to address the remaining 5% regression here
Assignee | ||
Comment 18•1 year ago
|
||
forgot to mention about the fix.
I'm planning a fix before the merge
Assignee | ||
Comment 19•1 year ago
•
|
||
tested a patch to apply compilation-local bytecode de-deduplication, and it slightly improves the yahoo-mail loadtime, but given:
- this is slightly risky, given it can regress other case, and it may need more tuning before landing
- fcp regresses even after bug 1808824, due to bimodal-ness and bad case increases with the change
so, I'll revert the bug 1805149 patch here and move the compilation-local de-duplication patch to a followup bug.
Assignee | ||
Comment 20•1 year ago
|
||
Comment 21•1 year ago
|
||
Pushed by arai_a@mac.com: https://hg.mozilla.org/integration/autoland/rev/f2457a464d03 Revert bug 1805149 and temporarily revive BytecodeEmitter.cx. r=tcampbell
Comment 22•1 year ago
|
||
bugherder |
Comment 23•1 year ago
|
||
== Change summary for alert #36740 (as of Mon, 16 Jan 2023 20:37:03 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
---|---|---|---|---|---|
3% | yahoo-mail loadtime | macosx1015-64-shippable-qr | bytecode-cached fission warm webrender | 249.50 -> 241.62 | Before/After |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=36740
Assignee | ||
Comment 24•1 year ago
|
||
Re-landing of bug 1805149 patches Part 1 and 2.
Assignee | ||
Comment 25•1 year ago
|
||
Now ScriptStencil to SharedImmutableScriptData is not 1:1, and ScriptStencil
holds SharedImmutableScriptDataIndex to point the SharedImmutableScriptData
inside SharedDataContainer.
SharedDataContainer is simplified to use either a single pointer or a vector.
Depends on D167507
Assignee | ||
Comment 26•1 year ago
|
||
Depends on D167508
Assignee | ||
Comment 27•1 year ago
|
||
Depends on D167509
Comment 28•1 year ago
|
||
Comment on attachment 9313491 [details]
Bug 1807228 - Part 1: Perform SharedImmutableScriptData::shareScriptData during instantiation. r?tcampbell!
Revision D167507 was moved to bug 1810160. Setting attachment 9313491 [details] to obsolete.
Comment 29•1 year ago
|
||
Comment on attachment 9313492 [details]
Bug 1807228 - Part 2: Perform local bytecode de-duplication during compilation. r?tcampbell!
Revision D167508 was moved to bug 1810160. Setting attachment 9313492 [details] to obsolete.
Comment 30•1 year ago
|
||
Comment on attachment 9313493 [details]
Bug 1807228 - Part 3: Perform global bytecode de-duplication at once. r?tcampbell!
Revision D167509 was moved to bug 1810160. Setting attachment 9313493 [details] to obsolete.
Comment 31•1 year ago
|
||
Comment on attachment 9313494 [details]
Bug 1807228 - Part 4: Remove AutoLockScriptData. r?tcampbell!
Revision D167510 was moved to bug 1810160. Setting attachment 9313494 [details] to obsolete.
Description
•