Closed Bug 1472610 Opened 4 years ago Closed 4 years ago
100% sccache hit rate (linux64) regression on push d6120c2bb51e (Fri Jun 29 2018)
We have detected a build metrics regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=b6406d11016d5a5167ca7de3271a76f4590cd5a6

As author of one of the patches included in that push, we need your help to address this regression.

Improvements:

100% sccache hit rate linux64 lto opt taskcluster-c4.4xlarge 0.63 -> 0.00

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=14113

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Build_Metrics
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #0)
> Improvements:
>
> 100% sccache hit rate linux64 lto opt taskcluster-c4.4xlarge 0.63 -> 0.00

Perfherder records this as an improvement, but I think this is actually a regression.
:gps I noticed that since d6120c2bb51e (bug 1459004), sccache hit rate dropped to 0%. Can you confirm this?
https://treeherder.mozilla.org/perf.html#/graphs?timerange=1209600&series=mozilla-inbound,1693299,1 and https://treeherder.mozilla.org/perf.html#/graphs?timerange=1209600&series=autoland,1692792,1 show the sccache hit rate for these builds flatlining.

And the way it changed is really wonky. There were a few dips to 0% before it stayed there. It's almost as if there was a change in CI that prevented new builds from working with sccache.

But builds on Try (https://treeherder.mozilla.org/perf.html#/graphs?series=try,1689992,1,2&series=try,1691845,1,2&series=try,1681682,1,2&series=try,1697403,1,2) show us still getting cache hits.

This is most wonky and should definitely be investigated. Needinfo on Ted because this is sccache related.
Flags: needinfo?(gps) → needinfo?(ted)
Product: Testing → Firefox Build System
The mislabel of "improvement" on this metric is bug 1411304.
$ curl -sL https://queue.taskcluster.net/v1/task/ekwsndpCTfWeE_lgE1N1EA/runs/0/artifacts/public/build/sccache.log.gz | zgrep -c "Cache hit"
3688
$ curl -sL https://queue.taskcluster.net/v1/task/ekwsndpCTfWeE_lgE1N1EA/runs/0/artifacts/public/build/sccache.log.gz | zgrep -c "Cache miss"
10

Likely cause: linking takes too long, and the sccache server shuts itself down.
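For anyone repeating this check on another task, the two counts can be turned into a hit rate directly. The following is only a rough sketch: it assumes the log marks events with the literal strings "Cache hit" and "Cache miss", as the zgrep commands above do, and the function names are made up for illustration.

```python
import gzip
import io


def read_gzipped_log(data: bytes):
    """Decompress an in-memory sccache.log.gz payload into a list of lines."""
    with gzip.open(io.BytesIO(data), "rt", errors="replace") as fh:
        return fh.read().splitlines()


def count_cache_events(log_lines):
    """Count matching lines, mirroring `zgrep -c "Cache hit"` / `"Cache miss"`."""
    hits = sum(1 for line in log_lines if "Cache hit" in line)
    misses = sum(1 for line in log_lines if "Cache miss" in line)
    return hits, misses


def hit_rate(hits, misses):
    """Fraction of lookups that were hits; 0.0 when nothing was looked up."""
    total = hits + misses
    return hits / total if total else 0.0
```

With the counts above, hit_rate(3688, 10) comes out at roughly 0.997, so the cache itself was being hit during the build even though the reported metric was 0.00.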
I thought we were setting SCCACHE_IDLE_TIMEOUT so that it never shuts down other than manually... but it seems we're not.
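To illustrate the idea (this is not the attached patch, only a sketch): sccache reads SCCACHE_IDLE_TIMEOUT from the environment of the server process, and as far as I know a value of 0 disables the idle shutdown. The helper names below are hypothetical, and the sketch assumes an `sccache` binary on PATH.

```python
import os
import subprocess


def sccache_env(base_env=None):
    """Return a copy of the environment with the idle shutdown disabled.

    Assumption: SCCACHE_IDLE_TIMEOUT=0 means "never shut down when idle".
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SCCACHE_IDLE_TIMEOUT"] = "0"
    return env


def start_sccache_server(sccache="sccache"):
    """Start the sccache server with the adjusted environment."""
    subprocess.run([sccache, "--start-server"], env=sccache_env(), check=True)
```

The variable has to be set before the server starts; setting it only for later compiler invocations would not affect an already running server.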
Attachment #8989297 - Flags: review?(core-build-config-reviews) → review?(gps)
Comment on attachment 8989297 [details] Bug 1472610 - Disable sccache idle shutdown. https://reviewboard.mozilla.org/r/254364/#review261170
Attachment #8989297 - Flags: review?(gps) → review+
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/5a3c9505a61b Disable sccache idle shutdown. r=gps
I can confirm this got fixed.