2.74 - 3.57% ts_paint_webext (windows10-64-shippable) regression on push df91fb44b2d4bd8cf55cb2a5360094feb2c6ca6b (Wed July 1 2020)
Categories
(Firefox :: Toolbars and Customization, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox-esr78 | --- | unaffected |
firefox78 | --- | unaffected |
firefox79 | --- | unaffected |
firefox80 | --- | fixed |
People
(Reporter: alexandrui, Assigned: emalysz)
References
(Blocks 1 open bug, Regression)
Details
(4 keywords)
Attachments
(1 file)
Perfherder has detected a talos performance regression from push df91fb44b2d4bd8cf55cb2a5360094feb2c6ca6b. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
4% ts_paint_webext windows10-64-shippable opt e10s stylo 299.79 -> 310.50
3% ts_paint_webext windows10-64-shippable opt e10s stylo 301.42 -> 309.67
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.
For more information on performance sheriffing please see our FAQ.
Reporter | ||
Updated•7 months ago
|
Comment 1•7 months ago
|
||
What caused the corresponding jump in the opposite direction a little while earlier on autoland? ( https://treeherder.mozilla.org/perf.html#/graphs?highlightAlerts=1&selected=1926121,1175301106&series=autoland,1926121,1,1&timerange=1209600&zoom=1593437987455,1593479085063,284.0470203986355,319.7247710351851 )
Comment 2•7 months ago
|
||
Also, what is this test, what does it do, and how does it differ from ts_paint ? There's no documentation on https://wiki.mozilla.org/TestEngineering/Performance/Talos/Tests .
Comment 3•7 months ago
|
||
I triggered profiles but they appear to be no help; the profile runs just as often show the post-patch result being faster than the pre-patch result, and it's not clear where to look for the time range that the test measures. Averaging the actual results of all the profiles before and after the change, the difference is (if I've not made any mistake; the numbers aren't published so I had to dig them out of the log and average them myself) between 376.8ms and 374.95ms, so just under 2ms - but again, no idea where in the profile (they all run about 1 full second), and the noise on the individual measurements is between 10 and 20ms.
Florian, do you have any guidance here, given that you looked at the usefulness of these profiles recently?
Comment 4•7 months ago
|
||
(In reply to :Gijs (he/him) from comment #2)
Also, what is this test, what does it do, and how does it differ from ts_paint ? There's no documentation on https://wiki.mozilla.org/TestEngineering/Performance/Talos/Tests .
From looking at https://searchfox.org/mozilla-central/rev/31d8600b73dc85b4cdbabf45ac3f1a9c11700d8e/testing/talos/talos/test.py#170-171 my guess is that it's the same as ts_paint but with a dummy webextension (testing/talos/talos/webextensions/dummy/dummy.xpi) installed to check the impact of add-ons on startup. That dummy add-on has the id talos@mozilla.org.
Comment 5•7 months ago
|
||
(In reply to :Gijs (he/him) from comment #3)
I triggered profiles but they appear to be no help; [...] it's not clear where to look for the time range that the test measures. [...], no idea where in the profile (they all run about 1 full second)
I can answer that part of the question, I don't have good answers for the rest.
The test measures at https://searchfox.org/mozilla-central/rev/e11d919cdc8a909354eb2c3e19904d9229c55d88/testing/talos/talos/startup_test/tspaint_test.html#29, so the value is the time between process creation, and this point: https://searchfox.org/mozilla-central/rev/e11d919cdc8a909354eb2c3e19904d9229c55d88/view/nsViewManager.cpp#344
A close enough approximation in the profiles is looking for the first screenshot in the screenshot track, or the first "ForwardTransaction" marker.
Set release status flags based on info from the regressing bug 1648639
Updated•7 months ago
|
Comment 7•7 months ago
|
||
baseline: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ad3b917903a91c024d2b5262cc675f10527b9dc
full patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=15564de3339d886c584e09f93f6aee183975caac
only change CUI.jsm: https://treeherder.mozilla.org/#/jobs?repo=try&revision=85a74d6b4193734ad0128cdcefd154176c08df8a
change CUI.jsm + panelUI.js: https://treeherder.mozilla.org/#/jobs?repo=try&revision=80c69b12bc9331e5a2d53f4b9f9442ff851c36a1
Reporter | ||
Comment 8•7 months ago
•
|
||
Seems like in the "full patch" push you forgot the patch and pushed the same code as in the "baseline". Anyways, seems like "change CUI.jsm + panelUI.js" introduced the regression.
Comment 9•7 months ago
|
||
(In reply to Alexandru Ionescu (needinfo me) [:alexandrui] from comment #8)
Seems like in the "full patch" push you forgot the patch and pushed the same code as in the "baseline".
I did not? The two patches have different parents, compare:
https://hg.mozilla.org/try/rev/4ad3b917903a91c024d2b5262cc675f10527b9dc
and
https://hg.mozilla.org/try/rev/15564de3339d886c584e09f93f6aee183975caac
the latter has parent https://hg.mozilla.org/try/rev/df91fb44b2d4bd8cf55cb2a5360094feb2c6ca6b, ie https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=df91fb44b2d4bd8cf55cb2a5360094feb2c6ca6b quoted in comment #0 as being the regressing patch.
So I don't understand why the trypush doesn't show a regression at all.
Anyways, seems like "change CUI.jsm + panelUI.js" introduced the regression.
This link compares a revision with itself, but also even https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=4ad3b917903a91c024d2b5262cc675f10527b9dc&newProject=try&newRevision=80c69b12bc9331e5a2d53f4b9f9442ff851c36a1&framework=1&pageTitle=CUI%2BPanelUI does not show a regression.
Any idea what's going wrong here? Even the baseline trypush has the "regressed" numbers from comment #0.
Comment 10•7 months ago
|
||
Oh, I know what the issue is - the trypushes all have qr jobs, and the regression is in the non-qr jobs. I'll add some retriggers...
Reporter | ||
Comment 11•7 months ago
•
|
||
(In reply to :Gijs (he/him) from comment #9)
I did not? The two patches have different parents, compare:
https://hg.mozilla.org/try/rev/4ad3b917903a91c024d2b5262cc675f10527b9dc
and
https://hg.mozilla.org/try/rev/15564de3339d886c584e09f93f6aee183975caac
Sorry, my bad!
Reporter | ||
Comment 12•7 months ago
|
||
(In reply to :Gijs (he/him) from comment #10)
Oh, I know what the issue is - the trypushes all have qr jobs, and the regression is in the non-qr jobs. I'll add some retriggers...
Looking forward for the results.
Comment 13•7 months ago
|
||
OK, so https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=4ad3b917903a91c024d2b5262cc675f10527b9dc&newProject=try&newRevision=85a74d6b4193734ad0128cdcefd154176c08df8a&framework=1&pageTitle=CUI shows no significant change but if you look closely at the base revision, you can see all the replicates for the base revision and ts_paint_webext on windows10-64-shippable (not the qr one) are just under 300ms - except one, which is 359ms, and which skews the average to be the same as the post-patch state. More retries would probably fix the perfherder UI to show this as a regression, but as these numbers match the ones in comment #0 let's not waste machine time on this.
So it appears my suspicion was correct and the diff in CustomizableUI.jsm was responsible for this, ie https://hg.mozilla.org/try/rev/d2efc64d0d0b2fd088b38bba76790b196c971a22 .
Next to work out is whether this is due to just loading PMV or due to the implementation of getViewNode
or something else. Push that just loads PMV: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c11fcbef8fea814f7e719cb430cff98a97082e8
Comment 14•7 months ago
|
||
(In reply to :Gijs (he/him) from comment #13)
Next to work out is whether this is due to just loading PMV or due to the implementation of
getViewNode
or something else. Push that just loads PMV: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c11fcbef8fea814f7e719cb430cff98a97082e8
These numbers are the regressed numbers - but this is an artifact build, so I don't fully trust them. I meant to push without artifact builds, https://treeherder.mozilla.org/#/jobs?repo=try&revision=a648653d9e3b7947230d74a6a9ad900b604db522 .
These numbers are also bad! So I guess the issue is loading PMV at all...
Comment 15•7 months ago
|
||
Passing this off to Emma - she figured out that we can probably just remove some of the code in CustomizableUI.jsm as PanelMultiView takes care of the same things if/when the view is eventually opened.
Assignee | ||
Comment 16•7 months ago
|
||
Going to get a patch up soon that adds the node to the PanelUI-subView
class, but commenting out https://searchfox.org/mozilla-central/source/browser/components/customizableui/CustomizableUI.jsm#1760-1778 seems promising: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2468aed51b7366fec3e8311d32de5dfb0fe1ec26&selectedTaskRun=RfBZ37NGTbOrYX-dsxOqlw.0
Assignee | ||
Comment 17•7 months ago
|
||
Comment 18•7 months ago
|
||
Pushed by rmaries@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e7bb5cb77c6c rely on panelUI to add "PanelUI-subView" attribute r=Gijs
Updated•7 months ago
|
Comment 19•7 months ago
|
||
bugherder |
Comment 20•7 months ago
|
||
So I'm not a perf sheriff but this doesn't look fixed to me, from looking at https://treeherder.mozilla.org/perf.html#/graphs?highlightAlerts=1&series=autoland,1926121,1,1&timerange=604800 . Potentially because other stuff landed since the regressing patch...
Comment 21•7 months ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a94d5e87f67171b53cb7f65de9d9ddd85f812bfb&selectedTaskRun=DNu9G4WCSou2_nViOX8LNA.0 shows that the patch worked at the time. So something else also tripped us into the "bad" case here between when this regressed and when we fixed it... now the question is... what?
Comment 22•7 months ago
|
||
Did 4 or 5 pushes to narrow down the week-long window a bit.
good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c9475703f658c2b8c620fd11b70f8d024b00f522&selectedTaskRun=dQSSfweKSy22Zb6awGf-sw.0 ( https://hg.mozilla.org/mozilla-central/rev/b48777a21aabc35311956a1a1395a5dbba8c1a77 )
bad: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9f0684f5a9c59b05e370a9fffcd568c0d607bf36 ( https://hg.mozilla.org/mozilla-central/rev/6087e976924f95018479c6f5881878c95b8bd8e2 )
Doing another load of trypushes to bisect that to a specific m-c push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=399dc39e734dd3bd00b4a3b3deb5ab1cb7071703 (push https://hg.mozilla.org/mozilla-central/rev/2da2c9f85102b92961435e5434f469fef0ebf08c )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=80d036f9f118f0a04cbe53ccd145e65fb29837a4 (push https://hg.mozilla.org/mozilla-central/rev/c050478f22e6b4930cdcc54f8caabf4da9d58abf )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c1f282718eeaba109ac4b0a16ec114c3e5aa5aa (push https://hg.mozilla.org/mozilla-central/rev/efa2336315eda0aaf78c2e94d6c9c98106ea136b )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb4d2a098d42c83815f5764b7378f61538d066d5 (push https://hg.mozilla.org/mozilla-central/rev/2ccff3de01bf2aa6c7fb1f17f9a47f367d2c05fb )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c6f0a720ba05c7803c7737ab82bb29c1548e5896 (push https://hg.mozilla.org/mozilla-central/rev/ec01c146f756b74d18e4892b4fd3aecba00da93e )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c00e145473fffe76ceddee95c28cdd9683b5fa5 (push https://hg.mozilla.org/mozilla-central/rev/75966ee1fe658b1767d7459256175c0662d14c25 )
Comment 23•7 months ago
•
|
||
(In reply to :Gijs (he/him) from comment #22)
Doing another load of trypushes to bisect that to a specific m-c push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=399dc39e734dd3bd00b4a3b3deb5ab1cb7071703 (push https://hg.mozilla.org/mozilla-central/rev/2da2c9f85102b92961435e5434f469fef0ebf08c )
good ^^
https://treeherder.mozilla.org/#/jobs?repo=try&revision=80d036f9f118f0a04cbe53ccd145e65fb29837a4 (push https://hg.mozilla.org/mozilla-central/rev/c050478f22e6b4930cdcc54f8caabf4da9d58abf )
bad ^^
All the WPR stuff is test-only, so it shouldn't be as bad as it looks...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a83e593e6bc004436322e370b9c099d3848ee24a ( https://hg.mozilla.org/integration/autoland/rev/d0ff8bcce2867b19ac0ffe8f4a2b3ff7eb91bef8 )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d47e51c0c91857d4876d305e68e19667de9ae1f2 ( https://hg.mozilla.org/integration/autoland/rev/2b90c377a85ad6879d29ce2bae191e570992b2b4 )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=12f0a6e9cb4c0064a1628f0f3250a6debbaa02ff ( https://hg.mozilla.org/integration/autoland/rev/83710b53eef098a0b2a2184ab02e8bcd76a00ab9 )
https://treeherder.mozilla.org/#/jobs?repo=try&revision=279e0f8ad6c2503f2fa10b0b9e694664472f20e1 ( https://hg.mozilla.org/integration/autoland/rev/297e0aa8233df4912e2f6bd8568f264a4a4813d4 )
Comment 24•7 months ago
|
||
I filed bug 1652273 for addressing the next regressor, though as I discussed there I'm not sure what the best approach is.
I think we're done as far as this bug is concerned, even though the numbers haven't recovered.
Description
•