14.69 - 30.82% raptor-tp6-docs-firefox / raptor-tp6-docs-firefox fcp (osx-10-10-shippable) regression on push f5f8345a5c873fc99459f53d2eb14247cbc5473a (Thu May 2 2019)
Categories
(Core :: Graphics: ImageLib, defect, P3)
Tracking
()
People
(Reporter: ccoroiu, Unassigned)
References
(Regression)
Details
(4 keywords)
Raptor has detected a Firefox performance regression from push:
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
31% raptor-tp6-docs-firefox fcp osx-10-10-shippable opt 1,373.96 -> 1,797.38
15% raptor-tp6-docs-firefox osx-10-10-shippable opt 1,387.57 -> 1,591.39
Improvements:
27% raptor-tp6-microsoft-firefox fcp linux64-shippable opt 481.75 -> 353.62
26% raptor-tp6-microsoft-firefox fcp linux64-shippable-qr opt 513.71 -> 381.62
26% raptor-tp6-microsoft-firefox fcp windows7-32-shippable opt 542.71 -> 404.25
24% raptor-tp6-microsoft-firefox fcp windows10-64-shippable opt 526.62 -> 402.88
23% raptor-tp6-microsoft-firefox fcp windows10-64-shippable-qr opt 546.04 -> 421.88
21% raptor-tp6-microsoft-firefox fcp osx-10-10-shippable opt 1,247.02 -> 982.75
20% raptor-tp6-yandex-firefox loadtime linux64-shippable-qr opt 287.08 -> 229.62
19% raptor-tp6-microsoft-firefox linux64-shippable-qr opt 1,024.67 -> 826.61
19% raptor-tp6-microsoft-firefox linux64-shippable opt 975.46 -> 792.91
19% raptor-tp6-yandex-firefox loadtime linux64-shippable opt 280.33 -> 228.33
18% raptor-tp6-yandex-firefox fcp windows10-64-shippable-qr opt 90.08 -> 74.25
17% raptor-tp6-imdb-firefox osx-10-10-shippable opt 326.45 -> 271.86
17% raptor-tp6-yandex-firefox windows7-32-shippable opt 122.80 -> 102.44
16% raptor-tp6-yandex-firefox linux64-shippable-qr opt 129.49 -> 108.66
14% raptor-tp6-yandex-firefox windows10-64-shippable opt 122.18 -> 104.63
13% raptor-tp6-yandex-firefox linux64-shippable opt 130.66 -> 113.22
13% raptor-tp6-yandex-firefox osx-10-10-shippable opt 252.07 -> 219.32
10% raptor-tp6-imdb-firefox fcp linux64-shippable-qr opt 167.67 -> 150.75
9% raptor-tp6-imdb-firefox fcp linux64-shippable opt 165.62 -> 151.08
5% raptor-tp6-sheets-firefox loadtime linux64-shippable-qr opt 753.46 -> 714.83
4% raptor-tp6-sheets-firefox loadtime linux64-shippable opt 729.25 -> 699.62
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=20773
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a Treeherder page showing the Raptor jobs in a pushlog format.
To learn more about the regressing test(s) or reproducing them, please see: https://wiki.mozilla.org/Performance_sheriffing/Raptor
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling
Reporter | ||
Updated•5 years ago
|
Comment 1•5 years ago
|
||
The only reason why I could imagine OS X could regress while other platforms don't is that the patch is making us do more work on that platform. This is because the raptor-tp6-docs-firefox
test just measures the load time of this page: https://docs.google.com/document/d/1US-07msg12slQtI_xchzYxcKlTs6Fp7WqIc6W5GK5M8/edit (see https://searchfox.org/mozilla-central/rev/b2015fdd464f598d645342614593d4ebda922d95/testing/raptor/raptor/tests/raptor-tp6-2.ini#21). If you load that page and watch the network connections it makes under devtools you'll see there are no third-party tracking resources found there.
mTopLevelBaseDomain
should be google.com
for the image subresources of the main doc and empty for the rest. I doubt the cost of copying the string objects would be the cause...
This OSX only caller of imgILoader::LoadImage()
looks interesting: https://searchfox.org/mozilla-central/rev/e7d9a8749303b39dadcc0e18ea0d60a570a68145/widget/cocoa/nsMenuItemIconX.mm#235. I doubt we have a ton of menu items with icons but maybe the cost of hashing is the root cause? I think I may need to do some try pushes.
Does anyone know how I can get a talos profile for these regressions? (I don't have access to a Mac machine myself.) Thanks!
Comment 2•5 years ago
|
||
(In reply to Ehsan Akhgari (:ehsan) from comment #1)
Does anyone know how I can get a talos profile for these regressions? (I don't have access to a Mac machine myself.) Thanks!
I suppose you mean Gecko profiles? If yes, I can share that.
Comment 3•5 years ago
|
||
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #2)
(In reply to Ehsan Akhgari (:ehsan) from comment #1)
Does anyone know how I can get a talos profile for these regressions? (I don't have access to a Mac machine myself.) Thanks!
I suppose you mean Gecko profiles? If yes, I can share that.
Yes, that would be great. Thank you!
Comment 4•5 years ago
|
||
Having some Gecko profiles from Linux and Windows raptor runs which did not show a regression when this landed would be extremely helpful for comparison too!
Comment 5•5 years ago
|
||
(In reply to Ehsan Akhgari (:ehsan) from comment #4)
Having some Gecko profiles from Linux and Windows raptor runs which did not show a regression when this landed would be extremely helpful for comparison too!
These are the Gecko profiles for the raptor-tp6-docs-firefox regressions from OSX:
Comment 6•5 years ago
|
||
Thanks, Ionuț.
Discoveries so far: this patch seems to win back the regression that I caused: https://hg.mozilla.org/try/rev/2dc297791558bb4b83de7b07dfae80a78a489cce. Reverting no other changes from the original patch has any performance impact. Specifically that means changes to cache key computation or cache key comparison were not at fault. So the only possible explanation for why the above change makes any difference is if this code has some positive impact on OSX: https://searchfox.org/mozilla-central/rev/99a2a5a955960b0e58ceade1db1f7652d9db4ba1/image/imgLoader.cpp#1461 which would only make sense if clearing the image cache in between page reloads more often helps make things faster on OSX...
I'm going to push a couple of patches to try to test these hypotheses.
Comment 7•5 years ago
|
||
Breaking the image cache intentionally (https://hg.mozilla.org/try/rev/1ec3ab54dd5633a5814ac474ca57d8f8792800b3) wins back the performance regression https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=fff2f680db9f1f98c5232153c2e0a8d6db52de96&newProject=try&newRevision=2b0bcf9c9b24931aa81691aa56f83df8c3d0299a&framework=10.
So I think this confirms that the reason behind this regression on OSX is that my patch in bug 1548349 made Firefox better at using its image cache, and as a result now it has worse performance on this test.
Andrew, what's your recommendation on how to deal with this regression?
Comment 8•5 years ago
|
||
I believe whatever is causing the regression was already present and unrelated to the patches in bug 1548349. Since we are hitting the cache more frequently, we will have changed the event ordering from the imgRequestProxy notifications (e.g. they come in sooner than they would have otherwise). This would be my best guess as to why the benchmark changed. However this would mean it is a bug already present and sensitive to timing, which makes it challenging to investigate. However that also means that real world performance is unlikely to be affected, so barring reports to the contrary from users, I would recommend we leave this for now, and keep an eye on any related future regressions.
Updated•5 years ago
|
Comment 9•5 years ago
|
||
Thanks!
Not sure if we should leave the bug open or not, but I'm not planning on doing more here.
Updated•5 years ago
|
Updated•4 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Description
•