sccache hit rate is 0
Categories
(Firefox Build System :: Toolchains, defect)
Tracking
(Not tracked)
People
(Reporter: jcristau, Unassigned)
References
(Regression)
Details
(Keywords: regression)
Since the bump to sccache 0.8.1 it looks like our hit rate on CI is down to 0, which slows down most builds significantly.
Reporter | ||
Comment 1•3 days ago
|
||
Example output from a try build:
[task 2024-06-25T18:13:09.656Z] 18:13:09 INFO - Compile requests 5575
[task 2024-06-25T18:13:09.656Z] 18:13:09 INFO - Compile requests executed 4937
[task 2024-06-25T18:13:09.656Z] 18:13:09 INFO - Cache hits 0
[task 2024-06-25T18:13:09.656Z] 18:13:09 INFO - Cache misses 4929
[task 2024-06-25T18:13:09.656Z] 18:13:09 INFO - Cache misses (C/C++) 4351
[task 2024-06-25T18:13:09.656Z] 18:13:09 INFO - Cache misses (Rust) 578
[task 2024-06-25T18:13:09.657Z] 18:13:09 INFO - Cache timeouts 0
[task 2024-06-25T18:13:09.657Z] 18:13:09 INFO - Cache read errors 0
[task 2024-06-25T18:13:09.657Z] 18:13:09 INFO - Forced recaches 0
[task 2024-06-25T18:13:09.657Z] 18:13:09 INFO - Cache write errors 4929
[task 2024-06-25T18:13:09.657Z] 18:13:09 INFO - Compilation failures 6
[task 2024-06-25T18:13:09.657Z] 18:13:09 INFO - Cache errors 2
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Cache errors (Rust) 2
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Non-cacheable compilations 0
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Non-cacheable calls 359
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Non-compilation calls 279
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Unsupported compiler calls 0
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Average cache write 0.000 s
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Average compiler 5.962 s
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Average cache read hit 0.000 s
[task 2024-06-25T18:13:09.658Z] 18:13:09 INFO - Failed distributed compilations 0
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - Non-cacheable reasons:
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - -P 164
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - crate-type 134
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - -fsyntax-only 45
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - unknown source language 10
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - - 3
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - missing input 2
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - -E 1
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - Cache location gcs, name: sccache-l1-us-central1, prefix: /
[task 2024-06-25T18:13:09.659Z] 18:13:09 INFO - Version (client) 0.8.1
Comment 2•3 days ago
|
||
Mike, do you think it can be "Disable preprocessor cache " is too aggressive ?
https://github.com/mozilla/sccache/releases
Reporter | ||
Comment 3•3 days ago
|
||
from sccache.log:
[2024-06-25T17:34:32Z DEBUG sccache::cache::gcs] gcs: token load succeeded for scope: https://www.googleapis.com/auth/devstorage.read_write
storage write check failed: PermissionDenied (permanent) at Writer::close => GcsErrorResponse { error: GcsError { code: 403, message: "sccache-l1-us-west1@sccache-3.iam.gserviceaccount.com does not have storage.objects.delete access to the Google Cloud Storage object.", errors: [GcsErrorDetail { domain: "global", location: "", location_type: "", message: "sccache-l1-us-west1@sccache-3.iam.gserviceaccount.com does not have storage.objects.delete access to the Google Cloud Storage object.", reason: "forbidden" }] } }
Context:
uri: https://storage.googleapis.com/upload/storage/v1/b/sccache-l1-us-west1/o?uploadType=media&name=.sccache_check
response: Parts { status: 403, version: HTTP/1.1, headers: {"content-type": "application/json; charset=UTF-8", "date": "Tue, 25 Jun 2024 17:34:32 GMT", "vary": "Origin", "vary": "X-Origin", "cache-control": "no-cache, no-store, max-age=0, must-revalidate", "expires": "Mon, 01 Jan 1990 00:00:00 GMT", "pragma": "no-cache", "x-guploader-uploadid": "ACJd0Nogt5ZWdfAo-4R9JZtTNLpB5l-GiTNahZjU2nUfJ2V-7CbLeUxd6lvbCNbWmAFKyWYTIO0Dz4qrsg", "content-length": "442", "server": "UploadServer"} }
service: gcs
path: .sccache_check
and then [2024-06-25T17:34:33Z DEBUG sccache::server] Error executing cache write: Cannot write to read-only storage
, over and over again.
https://github.com/mozilla/sccache/pull/2091 looks quite suspicious.
Comment 4•3 days ago
|
||
That's the one, this PR sets the storage to be read only. It might well be that we can't write to the root or whatever.
Comment 5•3 days ago
|
||
Fixed by https://hg.mozilla.org/mozilla-central/rev/272e255a9ef8f9b76b2d394edfb9c4307189e7d9, we need someone with access to the bucket permission to continue here.
Comment 6•3 days ago
|
||
Comment 7•3 days ago
|
||
Fun fact: https://github.com/mozilla/sccache/pull/2091 comes from me filing https://github.com/mozilla/sccache/issues/1854 while investigating bug 1823997, which, you'll notice, already showed these write error messages more than a year ago.
Comment 8•3 days ago
|
||
BTW, I thought we were getting perf alerts on high numbers of cache write errors... and now I can't even find the sccache graphs on perfherder.
Comment 9•3 days ago
|
||
Comment 10•3 days ago
|
||
and there WERE alerts
... both marked invalid.
https://treeherder.mozilla.org/perfherder/alerts?id=719
https://treeherder.mozilla.org/perfherder/alerts?id=834
Comment 11•1 day ago
|
||
(In reply to Mike Hommey [:glandium] from comment #10)
and there WERE alerts
... both marked invalid.
https://treeherder.mozilla.org/perfherder/alerts?id=719
https://treeherder.mozilla.org/perfherder/alerts?id=834
Hi Mike, thank you for raising the issue, I linked both alerts to this bug and marked them as "fixed". I think this was missed by the sheriffing team because we also have the "instrumented" build alerts, which we set to "invalid" by default (there is a plan to disable alerting for instrumented builds sometime in the future). Also, build_metrics alerts that are generated by noise in the graph, which then return to a hit rate of 0 get set as invalid, since they are not showing a sustained regression. It is possible that these alerts were classified as "noisy" by mistake.
Description
•