sccache not caching C++ compiles with clang-cl

RESOLVED FIXED in Firefox 65

Status

defect
RESOLVED FIXED
Last year
6 months ago

People

(Reporter: igoldan, Assigned: mshal)

Tracking

(Blocks 1 bug, {regression})

Trunk
mozilla65
All
Windows
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox64 wontfix, firefox65 fixed)

Details

Attachments

(3 attachments)

We have detected a build metrics regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=f41fb72f94b69fa2b6c1b2d186d84c7f52f2291c

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

20003%  sccache requests_not_cacheable windows2012-64 pgo taskcluster-c5.4xlarge     37.00 -> 7,438.00
20003%  sccache requests_not_cacheable windows2012-64 pgo taskcluster-c4.4xlarge     37.00 -> 7,438.00
19332%  sccache requests_not_cacheable windows2012-32 pgo taskcluster-c4.4xlarge     37.00 -> 7,190.00
9513%  sccache requests_not_cacheable windows2012-64-noopt debug taskcluster-c4.4xlarge39.00 -> 3,749.00
9513%  sccache requests_not_cacheable windows2012-64 debug taskcluster-c4.4xlarge   39.00 -> 3,749.00
9513%  sccache requests_not_cacheable windows2012-64-noopt debug taskcluster-c5.4xlarge39.00 -> 3,749.00
9487%  sccache requests_not_cacheable windows2012-64 opt taskcluster-c5.4xlarge     39.00 -> 3,739.00
9487%  sccache requests_not_cacheable windows2012-64 opt rusttests taskcluster-c5.4xlarge39.00 -> 3,739.00
9487%  sccache requests_not_cacheable windows2012-64 opt rusttests taskcluster-c4.4xlarge39.00 -> 3,739.00
9487%  sccache requests_not_cacheable windows2012-64 opt taskcluster-c4.4xlarge     39.00 -> 3,739.00
9387%  sccache requests_not_cacheable windows2012-64 asan debug taskcluster-c4.4xlarge39.00 -> 3,700.00
9369%  sccache requests_not_cacheable windows2012-64 asan opt taskcluster-c4.4xlarge39.00 -> 3,693.00
9369%  sccache requests_not_cacheable windows2012-64 asan opt taskcluster-c5.4xlarge39.00 -> 3,693.00
9195%  sccache requests_not_cacheable windows2012-32-noopt debug taskcluster-c5.4xlarge39.00 -> 3,625.00
9195%  sccache requests_not_cacheable windows2012-32 debug taskcluster-c4.4xlarge   39.00 -> 3,625.00
9195%  sccache requests_not_cacheable windows2012-32-noopt debug taskcluster-c4.4xlarge39.00 -> 3,625.00
9195%  sccache requests_not_cacheable windows2012-32 debug taskcluster-c5.4xlarge   39.00 -> 3,625.00
9169%  sccache requests_not_cacheable windows2012-32 opt rusttests taskcluster-c4.4xlarge39.00 -> 3,615.00
9169%  sccache requests_not_cacheable windows2012-32 opt rusttests taskcluster-c5.4xlarge39.00 -> 3,615.00
9169%  sccache requests_not_cacheable windows2012-32 opt taskcluster-c4.4xlarge     39.00 -> 3,615.00
223%  sccache requests_not_cacheable windows2012-64 debug static-analysis taskcluster-c5.4xlarge39.00 -> 126.00
223%  sccache requests_not_cacheable windows2012-32 debug static-analysis taskcluster-c5.4xlarge39.00 -> 126.00
223%  sccache requests_not_cacheable windows2012-64 debug static-analysis taskcluster-c4.4xlarge39.00 -> 126.00
223%  sccache requests_not_cacheable windows2012-32 opt static-analysis taskcluster-c4.4xlarge39.00 -> 126.00
223%  sccache requests_not_cacheable windows2012-64 opt static-analysis taskcluster-c4.4xlarge39.00 -> 126.00
223%  sccache requests_not_cacheable windows2012-32 debug static-analysis taskcluster-c4.4xlarge39.00 -> 126.00


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=14390

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Automated_Performance_Testing_and_Sheriffing/Build_Metrics
Flags: needinfo?(tvijiala)
The patch changes the sccache revision used, adding 18 more commits. My changes (both in the offending patch, and in the sccache pull request) should only have impact on code coverage builds, so I have no more information on this.

See https://github.com/mozilla/sccache/compare/9f4c18b4a6300e3a9250a09811cc985311c20217...1ab9a33e8d328941acc23c74c949b765f975f309 for the full set of changes in sccache.
Flags: needinfo?(tvijiala) → needinfo?(ted)
This is bad, we're effectively not caching any C++ compiles on Windows. My best guess would be this change, since our Windows builds switched to clang-cl recently: https://github.com/mozilla/sccache/commit/037a96ef5e8e9258a62cd712edc912b0b1327413
Flags: needinfo?(ted)
OK, it's not as bad as it seems, because we broke sccache caching when we switched to clang-cl, and this change just fixed sccache so it can detect clang-cl properly, but it won't cache any of our C++ compiles. That means this change didn't make anything worse, it just moved the numbers around.

I looked at one of the graphs from the alert and picked a job from before the regression:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=c0383805a8ba29a0fc8905f74730d69e4335b970&selectedJob=188613412&group_state=expanded

Looking at the log shows:
19:05:39     INFO -  ===SCCACHE STATS===
19:05:39     INFO -  mozmake.EXE[1]: Leaving directory 'z:/build/build/src/obj-firefox'
19:05:39     INFO -  mozmake.EXE[1]: Entering directory 'z:/build/build/src/obj-firefox'
19:05:39     INFO -  z:/build/build/src/sccache2/sccache.exe --show-stats
19:05:39     INFO -  Compile requests               3886
19:05:39     INFO -  Compile requests executed       260
19:05:39     INFO -  Cache hits                      236
19:05:39     INFO -  Cache misses                     24
19:05:39     INFO -  Cache timeouts                    0
19:05:39     INFO -  Cache read errors                 0
19:05:39     INFO -  Forced recaches                   0
19:05:39     INFO -  Cache write errors                0
19:05:39     INFO -  Compilation failures              0
19:05:39     INFO -  Cache errors                      0
19:05:39     INFO -  Non-cacheable compilations        0
19:05:39     INFO -  Non-cacheable calls              39
19:05:39     INFO -  Non-compilation calls             0
19:05:39     INFO -  Unsupported compiler calls     3587
19:05:39     INFO -  Average cache write           0.397 s
19:05:39     INFO -  Average cache read miss      14.377 s
19:05:39     INFO -  Average cache read hit        0.118 s
19:05:39     INFO -  Cache location             S3, bucket: Bucket(name=taskcluster-level-3-sccache-us-east-1, base_url=http://taskcluster-level-3-sccache-us-east-1.s3.amazonaws.com/)
19:05:39     INFO -  mozmake.EXE[1]: Leaving directory 'z:/build/build/src/obj-firefox'
19:05:39     INFO -  mozmake.EXE[1]: Entering directory 'z:/build/build/src/obj-firefox'
19:05:39     INFO -  ===================

Notably, the "Unsupported compiler calls" is most compiles (probably 100% of C/C++ compilation).

A job from after the regression (this exact push, actually):
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=f41fb72f94b69fa2b6c1b2d186d84c7f52f2291c&selectedJob=188628262&group_state=expanded

Has in its log:
20:28:08     INFO -  ===SCCACHE STATS===
20:28:08     INFO -  mozmake.EXE[1]: Leaving directory 'z:/build/build/src/obj-firefox'
20:28:08     INFO -  mozmake.EXE[1]: Entering directory 'z:/build/build/src/obj-firefox'
20:28:08     INFO -  z:/build/build/src/sccache2/sccache.exe --show-stats
20:28:08     INFO -  Compile requests               3886
20:28:08     INFO -  Compile requests executed       260
20:28:08     INFO -  Cache hits                      236
20:28:08     INFO -  Cache misses                     24
20:28:08     INFO -  Cache timeouts                    0
20:28:08     INFO -  Cache read errors                 0
20:28:08     INFO -  Forced recaches                   0
20:28:08     INFO -  Cache write errors                0
20:28:08     INFO -  Compilation failures              0
20:28:08     INFO -  Cache errors                      0
20:28:08     INFO -  Non-cacheable compilations        0
20:28:08     INFO -  Non-cacheable calls            3625
20:28:08     INFO -  Non-compilation calls             1
20:28:08     INFO -  Unsupported compiler calls        0
20:28:08     INFO -  Average cache write           0.245 s
20:28:08     INFO -  Average cache read miss      14.227 s
20:28:08     INFO -  Average cache read hit        0.075 s
20:28:08     INFO -  Cache location             S3, bucket: Bucket(name=taskcluster-level-3-sccache-us-west-1, base_url=http://taskcluster-level-3-sccache-us-west-1.s3.amazonaws.com/)
20:28:08     INFO -  mozmake.EXE[1]: Leaving directory 'z:/build/build/src/obj-firefox'
20:28:08     INFO -  mozmake.EXE[1]: Entering directory 'z:/build/build/src/obj-firefox'
20:28:08     INFO -  ===================

So all this change did was move those from "Unsupported compiler" to "Non-cacheable". We just don't track "Unsupported compiler" as a metric!

My best guess is that sccache is failing to parse our clang-cl compile commandlines, probably due to the use of `-Xclang` options. I'll see if I can get better logging out of a try push to confirm that theory.
I opened an sccache issue about erroring on unsupported compilers:
https://github.com/mozilla/sccache/issues/261
I did a try push with additional sccache logging to confirm this theory:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=20dd65f8bcf14149637fb9dd3ae2cfc05fe8b48d&selectedJob=188796766

The sccache log shows:
DEBUG:sccache::server: parse_arguments: CannotCache(multiple input files): ["-Xclang", "-std=gnu99", "-fms-compatibility-version=19.13.26128", "-Folz4.obj", "-c", "-DDEBUG=1", "-DIMPL_MFBT", "-DLZ4LIB_VISIBILITY=", "-Iz:/build/build/src/mfbt", "-Iz:/build/build/src/obj-firefox/mfbt", "-Iz:/build/build/src/mfbt/double-conversion", "-Iz:/build/build/src/obj-firefox/dist/include", "-Iz:/build/build/src/obj-firefox/dist/include/nspr", "-Iz:/build/build/src/obj-firefox/dist/include/nss", "-MD", "-FI", "z:/build/build/src/obj-firefox/mozilla-config.h", "-DMOZILLA_CLIENT", "-Qunused-arguments", "-Qunused-arguments", "-nologo", "-wd4091", "-D_HAS_EXCEPTIONS=0", "-W3", "-Gy", "-Zc:inline", "-Gw", "-wd4244", "-wd4267", "-Wno-unknown-pragmas", "-Wno-ignored-pragmas", "-Wno-deprecated-declarations", "-Wno-invalid-noreturn", "-we4553", "-Z7", "-O2", "-Oy-", "-Xclang", "-MP", "-Xclang", "-dependency-file", "-Xclang", ".deps/lz4.obj.pp", "-Xclang", "-MT", "-Xclang", "lz4.obj", "z:/build/build/src/mfbt/lz4.c"]

So yes, sccache is mis-parsing the commandline and thinks there are multiple input files. I opened an sccache issue about this:
https://github.com/mozilla/sccache/issues/262
Product: Testing → Firefox Build System
:ted any updates here?
Flags: needinfo?(ted)
Sorry for not replying sooner. I looked into this a little bit but haven't had time to write a patch. In any event, this actually regressed with bug 1443590, and since our build times are still lower than they were before that patch I don't think this is critical right now. (We don't need to back anything out over it, certainly.) We should absolutely fix it at some point, which should make our Windows builds even faster. If nothing else I will likely find time to fix it after I finish reviewing and landing all the distributed sccache work that's happening right now.
Blocks: 1443590
Flags: needinfo?(ted)
Summary: 223.08 - 20002.7% sccache requests_not_cacheable (windows2012-*) regression on push f41fb72f94b69fa2b6c1b2d186d84c7f52f2291c (Tue Jul 17 2018) → sccache not caching C++ compiles with clang-cl
I'd prefer we prioritize this. Windows builds take ~45 minutes compared to ~15 minutes for Linux builds. Anything we can do to bridge that gap would be impactful.
Blocks: fastci
That sccache issue wound up getting fixed as part of Aidan's distributed compilation work (https://github.com/mozilla/sccache/pull/271). That PR has merged and I tested locally that it fixes this issue, so we can update the copy of sccache in use in CI and fix this.
Assignee: nobody → ted
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #10)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=1bc837a62a1e425ee9ead602a43db6a5a7f120d5

This *almost* worked, but it turns out that our Windows static analysis builds invoke clang.exe directly:
https://searchfox.org/mozilla-central/source/config/static-checking-config.mk

...which confuses sccache's msvc / clang-cl support:

17:47:09     INFO -  z:/build/build/src/sccache2/sccache.exe z:/build/build/src/clang/bin/clang.exe --driver-mode=cl -Xclang -std=gnu99 -fms-compatibility-version=19.13.26128 -FoTestNANTestingExprC.obj -c  -DNDEBUG=1 -DTRIMMED=1 -Iz:/build/build/src/build/clang-plugin/tests -Iz:/build/build/src/obj-firefox/build/clang-plugin/tests -Iz:/build/build/src/obj-firefox/dist/include -Iz:/build/build/src/obj-firefox/dist/include/nspr -Iz:/build/build/src/obj-firefox/dist/include/nss -MD -FI z:/build/build/src/obj-firefox/mozilla-config.h -DMOZILLA_CLIENT -Qunused-arguments -Qunused-arguments -nologo -wd4091 -D_HAS_EXCEPTIONS=0 -Gy -Zc:inline -arch:SSE2 -Gw -wd4244 -wd4267 -we4553 -fsyntax-only -Xclang -verify -ferror-limit=0 -std=c11 -Wno-invalid-noreturn -Z7 -Xclang -load -Xclang z:/build/build/src/obj-firefox/build/clang-plugin/clang-plugin.dll -Xclang -add-plugin -Xclang moz-check -O2 -Oy-  -Xclang -MP -Xclang -dependency-file -Xclang .deps/TestNANTestingExprC.obj.pp -Xclang -MT -Xclang TestNANTestingExprC.obj    z:/build/build/src/build/clang-plugin/tests/TestNANTestingExprC.c
17:47:09     INFO -  error: failed to execute compile
17:47:09     INFO -  caused by: Compiler not supported

I didn't even know this was a thing, but I was able to fix it in sccache. I have a patch, I'll merge it assuming it doesn't break anything in sccache's CI:
https://github.com/mozilla/sccache/pull/282
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #9)
> That sccache issue wound up getting fixed as part of Aidan's distributed
> compilation work (https://github.com/mozilla/sccache/pull/271). That PR has
> merged and I tested locally that it fixes this issue, so we can update the
> copy of sccache in use in CI and fix this.

You also need https://github.com/mozilla/sccache/pull/276 for bug 1484897
OK. I might need to hold off on merging that one just yet, as Aidan has another open PR that makes some invasive changes to commandline handling.
This still had some issues. First, the sccache toolchain build for Linux failed. ring got updated and now one of its assembly files fails to compile:
https://treeherder.mozilla.org/logviewer.html#?job_id=196747614&repo=try

[task 2018-08-30T20:27:30.650Z] /builds/worker/.cargo/registry/src/github.com-1ecc6299db9ec823/ring-0.13.2/pregenerated/x86_64-mont-elf.S: Assembler messages:
[task 2018-08-30T20:27:30.650Z] /builds/worker/.cargo/registry/src/github.com-1ecc6299db9ec823/ring-0.13.2/pregenerated/x86_64-mont-elf.S:977: Error: no such instruction: `adcxq %rax,%r13'
<...>

I think this means the gcc in use is too old.

Second, with the changes made to support clang-cl, we now wind up using sccache to compile the clang-plugin tests on Windows static analysis builds, which naturally fails because they use -fsyntax-only:
https://treeherder.mozilla.org/logviewer.html#?job_id=196750613&repo=try&lineNumber=4721

20:46:37     INFO -  z:/build/build/src/sccache2/sccache.exe z:/build/build/src/clang/bin/clang.exe --driver-mode=cl -Xclang -std=gnu99 -fms-compatibility-version=19.13.26128 -FoTestNANTestingExprC.obj -c  -DDEBUG=1 -Iz:/build/build/src/build/clang-plugin/tests -Iz:/build/build/src/obj-firefox/build/clang-plugin/tests -Iz:/build/build/src/obj-firefox/dist/include -Iz:/build/build/src/obj-firefox/dist/include/nspr -Iz:/build/build/src/obj-firefox/dist/include/nss -MD -FI z:/build/build/src/obj-firefox/mozilla-config.h -DMOZILLA_CLIENT -Qunused-arguments -Qunused-arguments -nologo -wd4091 -D_HAS_EXCEPTIONS=0 -Gy -Zc:inline -arch:SSE2 -Gw -wd4244 -wd4267 -we4553 -fsyntax-only -Xclang -verify -ferror-limit=0 -std=c11 -Wno-invalid-noreturn -Z7 -Xclang -load -Xclang z:/build/build/src/obj-firefox/build/clang-plugin/clang-plugin.dll -Xclang -add-plugin -Xclang moz-check -O2 -Oy-  -Xclang -MP -Xclang -dependency-file -Xclang .deps/TestNANTestingExprC.obj.pp -Xclang -MT -Xclang TestNANTestingExprC.obj    z:/build/build/src/build/clang-plugin/tests/TestNANTestingExprC.c
20:46:37     INFO -  sccache: encountered fatal error
20:46:37     INFO -  sccache: error : failed to store `TestNANTestingExprC.obj` to cache
20:46:37     INFO -  sccache:  cause: failed to store `TestNANTestingExprC.obj` to cache
20:46:37     INFO -  sccache:  cause: failed to zip up compiler outputs
20:46:37     INFO -  sccache:  cause: The system cannot find the file specified. (os error 2)
20:46:37     INFO -  z:/build/build/src/config/rules.mk:779: recipe for target 'TestNANTestingExprC.obj' failed

We need to add the equivalent of https://github.com/mozilla/sccache/commit/ec10cdb2ddeb3dde9891bea1fa095e504a60a28a for clang-cl.
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #18)
> I think this means the gcc in use is too old.

That would be binutils, and we use the system one for this build. Add a dependency on linux64-binutils and add the binutils/bin path to PATH and this should be good.
(In reply to Mike Hommey [:glandium] from comment #19)
> That would be binutils, and we use the system one for this build. Add a
> dependency on linux64-binutils and add the binutils/bin path to PATH and
> this should be good.

Thanks! I included that in this last try push along with my fix from comment 20, so hopefully this push will work!
That looks better. There were a few builds that failed with timeouts, but those seem to be known issues and retriggering them made them succeed.
Comment on attachment 9004504 [details]
bug 1476604 - update the version of sccache in CI to fix an issue with clang-cl. r?build

Mike Hommey [:glandium] has approved the revision.
Attachment #9004504 - Flags: review+
Pushed by tmielczarek@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e41e158793a1
update the version of sccache in CI to fix an issue with clang-cl. r=glandium
Backed out changeset e41e158793a1 (Bug 1476604) for bustages in AccessibleHandler.dll

https://treeherder.mozilla.org/#/jobs?repo=autoland&fromchange=33c7b0ea5caaa654c72cce43124db1ebc052dead&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified&selectedJob=197394279

https://treeherder.mozilla.org/logviewer.html#?job_id=197394279&repo=autoland&lineNumber=73845

16:43:08     INFO -  mozmake.EXE[5]: *** [AccessibleHandler.dll] Error 1
16:43:08     INFO -  mozmake.EXE[5]: Leaving directory 'z:/build/build/src/obj-firefox/accessible/ipc/win/handler'
16:43:08     INFO -  z:/build/build/src/config/recurse.mk:74: recipe for target 'accessible/ipc/win/handler/target' failed
16:43:08     INFO -  mozmake.EXE[4]: *** [accessible/ipc/win/handler/target] Error 2
16:43:08     INFO -  mozmake.EXE[4]: *** Waiting for unfinished jobs....
Flags: needinfo?(ted)
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e40357fad1e5
Backed out changeset e41e158793a1 for bustages in AccessibleHandler.dll
Ugh. Windows PGO builds were green on try. :-(
Flags: needinfo?(ted)
Retriggering another Windows PGO build on that same try push caused it to fail:
https://treeherder.mozilla.org/logviewer.html#?job_id=197428921&repo=try&lineNumber=85338

I suspect this is "the msvc commandline parser doesn't handle -fprofile-instr-use", which is the issue that glandium's patch fixed for regular clang.
:ted, this is still a problem and the graphs show the regressed state- is this something that we are realistically going to work on, or should we close this bug?
Flags: needinfo?(ted)
ted, any thoughts on this?
The regression is from bug 1443590, but the actual build times from switching to clang-cl are faster than they were with MSVC. I still intend to fix this, but unfortunately it's at odds with getting the distributed sccache work landed since there has been a lot of churn in the sccache codebase for that.
Flags: needinfo?(ted)
Assignee: ted → mshal
Any updates?
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #33)
> Any updates?

I'm actively working on it, though it's taking me some time to get up to speed on the sccache codebase. I think I have a handle on the Linux static analysis issue, which is what prevents us from updating the sccache version to support clang-cl on Windows. Once I can land that I'll see if any new issues have popped up in the meantime.
(In reply to Michael Shal [:mshal] from comment #34)
> (In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment
> #33)
> > Any updates?
> 
> I'm actively working on it, though it's taking me some time to get up to
> speed on the sccache codebase. I think I have a handle on the Linux static
> analysis issue, which is what prevents us from updating the sccache version
> to support clang-cl on Windows. Once I can land that I'll see if any new
> issues have popped up in the meantime.

Can you link the bug you're working at against this one, so we have a more complete overview?
Flags: needinfo?(mshal)
The upstream sccache work was done in https://github.com/mozilla/sccache/pull/332

ted - can you reland this patch with SCCACHE_REVISION=a3dcb66243d2f211bf2961b3bf34ff59e814daa2 ?
Flags: needinfo?(mshal) → needinfo?(ted)
Flags: needinfo?(ted)
Pushed by tmielczarek@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d459920f97a5
update sccache. r=firefox-build-system-reviewers,gps
https://hg.mozilla.org/mozilla-central/rev/d459920f97a5
Status: NEW → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/47991b387f8d
Port bug 1476604 [update sccache]. rs=bustage-fix DONTBUILD
Backout by rgurzau@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/71ba7cfafa54
Backed out changeset d459920f97a5 for causing Bug 1507781 a=backout
Backed out changeset d459920f97a5 (bug 1476604) for causing Bug 1507781 a=backout

Backout link: https://hg.mozilla.org/mozilla-central/rev/71ba7cfafa54a4b458abd2b1962a0bee4011173b

https://treeherder.mozilla.org/logviewer.html#?job_id=212178705&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/W7TmbrlURuO59OvbSpj-EA/runs/0/artifacts/public/logs/live_backing.log

test error::tests::test_json_error_status ... ok
[task 2018-11-16T11:02:34.455Z] 11:02:34     INFO -  test common::tests::test_json_locator_strategy_tag_name ... ok
[task 2018-11-16T11:02:34.455Z] 11:02:34     INFO -  test error::tests::test_json_webdriver_error ... ok
[task 2018-11-16T11:02:34.455Z] 11:02:34     INFO -  test response::tests::test_json_cookie_response_with_optional ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test common::tests::test_json_webelement_invalid ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_cookie_response_without_optional ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_close_window_response ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_element_rect_response ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_generic_value_response ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_cookies_response ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_void_response ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_delete_session_response ... ok
[task 2018-11-16T11:02:34.456Z] 11:02:34     INFO -  test response::tests::test_json_window_rect_response ... ok
[task 2018-11-16T11:02:34.457Z] 11:02:34     INFO -  test response::tests::test_json_new_session_response ... ok
[task 2018-11-16T11:02:34.457Z] 11:02:34     INFO -  test response::tests::test_json_timeouts_response ... ok
[task 2018-11-16T11:02:34.457Z] 11:02:34     INFO -  test result: ok. 174 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
[task 2018-11-16T11:02:34.457Z] 11:02:34     INFO -  make[1]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/testing/geckodriver'
[task 2018-11-16T11:42:34.476Z] 11:42:34     INFO - Automation Error: mozprocess timed out after 2400 seconds running ['/usr/bin/python2.7', 'mach', '--log-no-times', 'build', '-v', 'pre-export', 'export', 'recurse_rusttests']
[task 2018-11-16T11:42:34.524Z] 11:42:34    ERROR - timed out after 2400 seconds of no output
[task 2018-11-16T11:42:34.524Z] 11:42:34    ERROR - Return code: -15
[task 2018-11-16T11:42:34.524Z] 11:42:34  WARNING - setting return code to 2
[task 2018-11-16T11:42:34.524Z] 11:42:34    FATAL - 'mach build -v pre-export export recurse_rusttests' did not run successfully. Please check log for errors.
[task 2018-11-16T11:42:34.524Z] 11:42:34    FATAL - Running post_fatal callback...
[task 2018-11-16T11:42:34.524Z] 11:42:34    FATAL - Exiting -1
[task 2018-11-16T11:42:34.524Z] 11:42:34     INFO - [mozharness: 2018-11-16 11:42:34.524835Z] Finished build step (failed)
[task 2018-11-16T11:42:34.524Z] 11:42:34     INFO - Running post-run listener: _shutdown_sccache
[task 2018-11-16T11:42:34.525Z] 11:42:34     INFO - Running command: ['/builds/worker/workspace/build/src/sccache2/sccache', '--stop-server'] in /builds/worker/workspace/build/src
[task 2018-11-16T11:42:34.525Z] 11:42:34     INFO - Copy/paste: /builds/worker/workspace/build/src/sccache2/sccache --stop-server
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Stopping sccache server...
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Compile requests               638
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Compile requests executed      452
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Cache hits                     414
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Cache misses                     2
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Cache timeouts                   0
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Cache read errors                0
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Forced recaches                  0
[task 2018-11-16T11:42:34.529Z] 11:42:34     INFO -  Cache write errors               0
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Compilation failures             9
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Cache errors                    26
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Non-cacheable compilations       0
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Non-cacheable calls             74
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Non-compilation calls          112
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Unsupported compiler calls       0
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Average cache write          0.445 s
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Average cache read miss      0.452 s
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Average cache read hit       0.183 s
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  Non-cacheable reasons:
[task 2018-11-16T11:42:34.530Z] 11:42:34     INFO -  -E                              28
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO -  unknown source language          2
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO -  crate-type                      40
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO -  -                                4
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO -  Cache location             S3, bucket: Bucket(name=taskcluster-level-3-sccache-us-west-2, base_url=http://taskcluster-level-3-sccache-us-west-2.s3.amazonaws.com/)
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO - Return code: 0
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO - Running post-run listener: _summarize
[task 2018-11-16T11:42:34.531Z] 11:42:34    ERROR - # TBPL FAILURE #
[task 2018-11-16T11:42:34.531Z] 11:42:34     INFO - [mozharness: 2018-11-16 11:42:34.531808Z] FxDesktopBuild summary:
[task 2018-11-16T11:42:34.531Z] 11:42:34    ERROR - # TBPL FAILURE #
[task 2018-11-16T11:42:34.538Z] cleanup
[task 2018-11-16T11:42:34.538Z] + cleanup
[task 2018-11-16T11:42:34.538Z] + local rv=255
[task 2018-11-16T11:42:34.538Z] + cleanup_xvfb
[task 2018-11-16T11:42:34.539Z] pidof Xvfb
[task 2018-11-16T11:42:34.539Z] ++ pidof Xvfb
[task 2018-11-16T11:42:34.540Z] + local xvfb_pid=52
[task 2018-11-16T11:42:34.540Z] + local vnc=false
[task 2018-11-16T11:42:34.540Z] + local interactive=false
[task 2018-11-16T11:42:34.540Z] + '[' -n 52 ']'
[task 2018-11-16T11:42:34.540Z] + [[ false == false ]]
[task 2018-11-16T11:42:34.540Z] + [[ false == false ]]
[task 2018-11-16T11:42:34.540Z] + kill 52
[task 2018-11-16T11:42:34.540Z] + screen -XS xvfb quit
[task 2018-11-16T11:42:34.664Z] No screen session found.
[task 2018-11-16T11:42:34.664Z] + true
[task 2018-11-16T11:42:34.664Z] + exit 255
[taskcluster 2018-11-16 11:42:35.012Z] === Task Finished ===
[taskcluster 2018-11-16 11:42:40.495Z] Unsuccessful task run with exit code: 255 completed in 2637.349 seconds
Flags: needinfo?(mshal)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla65 → ---
(In reply to Pulsebot from comment #40)
> Pushed by mozilla@jorgk.com:
> https://hg.mozilla.org/comm-central/rev/47991b387f8d
> Port bug 1476604 [update sccache]. rs=bustage-fix DONTBUILD
Looks like this small change doesn't need to be backed out. linux64-binutils in the list of jobs doesn't hurt.
I'm triggering a bunch of jobs on try with the sccache log level increased to see if I can reproduce the failure and get any more information. If that doesn't prove useful I'm going to try using some different sccache revisions to try to bisect the problem.
I'm also trying to bisect on try (hopefully 1-2 more revisions?), though if you hit it with the log level that should be interesting. I've found I have to retrigger 20-30 times to see any failures on the bad revisions, so it can be a bit tricky to reproduce. I'll post the rev if it works.
Reverting commit 5481e8736a6712a42d29382d8f858cee9e03dddc on top of sccache master seems to work:

rusttests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fd41edf6b07e1858b80f4e051b6310016d0f1d65
-p all: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fac0485eb778f38c8a4f380309cadb814b56f029

ted: Is that tokio-process version needed for anything? Or can we just back it out for now and re-land the sccache version bump?
Flags: needinfo?(mshal) → needinfo?(ted)
I opened an sccache issue to track this as well: https://github.com/mozilla/sccache/issues/335

(In reply to Michael Shal [:mshal] from comment #49)
> ted: Is that tokio-process version needed for anything? Or can we just back
> it out for now and re-land the sccache version bump?

If sccache builds and works with that update reverted then it should be fine. I think the contributor that submitted that PR was just trying to update dependencies as general project hygiene.
Flags: needinfo?(ted)
The contributor that submitted that PR commented that it seems likely that this is the same issue that was noticed elsewhere, and they have another PR that updates tokio-process to a newer version which they think will fix the issue:
https://github.com/mozilla/sccache/pull/304

I'm going to merge that PR once they rebase it over Aidan's latest changes and we can re-test on try with that revision to verify.
This change updates sccache to this revision:
https://github.com/mozilla/sccache/commit/8f295c09cfdd4cff4f4a0c6f0e057979eeb8842d

Primarily to pick up the changes for handling -Xclang: https://github.com/mozilla/sccache/pull/332

MozReview-Commit-ID: GHinVG4XNDg
Did you test this on try?
Pushed by tmielczarek@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/77b100d5fe36
update sccache; r=firefox-build-system-reviewers,nalexander,ted
https://hg.mozilla.org/mozilla-central/rev/77b100d5fe36
Status: REOPENED → RESOLVED
Closed: 8 months ago8 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 65
Looks like this has caused some build time increases, since push from comment 55.

== Change summary for alert #18022 (as of Tue, 04 Dec 2018 17:21:07 GMT) ==

Regressions:

  9%  build times windows2012-32 debug rusttests taskcluster-c5.4xlarge     455.33 -> 494.95
  8%  build times windows2012-64 debug rusttests taskcluster-c4.4xlarge     590.46 -> 639.52

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=18022
Flags: needinfo?(ted)
I noticed this as well but I don't think it's high priority to investigate. chmanchester pointed out that if you look at the graph with a slightly-longer timeframe this increase is insignificant compared to the build time wins from bug 1505072:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=autoland,1624916,1,2
https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=autoland,1807720,1,2
Flags: needinfo?(ted)
Keywords: regression
Target Milestone: Firefox 65 → mozilla65
You need to log in before you can comment on or make changes to this bug.