Switch automation build jobs to use sccache2

RESOLVED FIXED in Firefox 53

Status

Product: Firefox Build System
Component: General
Status: RESOLVED FIXED
Opened: 2 years ago
Updated: 4 months ago

People

(Reporter: ted, Assigned: ted)

Tracking

(Blocks: 1 bug)

Version: unspecified
Target Milestone: mozilla53

Firefox Tracking Flags

(firefox53 fixed)

Details

Attachments

(1 attachment)

I've got sccache2 far enough along that I'm going to start testing it on try. I plan to ensure that everything works OK as well as run comparative builds from the same base changeset to compare build times and cache hit rates.

If that all looks good we'll be able to land my patch to switch automation to use sccache2!
Before deploying, you should add support for SCCACHE_RECACHE. That's the most immediate thing that I found missing; I still need to go through most of the code (I've read it a little already).
I should mention why: I've had to use it a couple times in the past because of bugs in sccache causing bad things in the cache and breaking builds permanently on try. The last occurrence was related to MACOSX_DEPLOYMENT_TARGET.

Comment 14

2 years ago
Hi ted, what is the latest for this work?
I was stuck on those burning OS X jobs and I finally realized the problem--the mac builders are the only non-EC2 builders, so they don't have IAM credentials, and the code I'm using to locate AWS credentials (borrowed from rusoto) defaults to looking in `~/.aws/credentials`, but the credentials that sccache.py is using are located in `~/.boto`. I've got a fix locally, I'll push it to try shortly and that should fix those builds, which should mean everything is green.
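The lookup problem described above can be sketched roughly as follows. This is a minimal Python illustration, not the Rust code in sccache2 or rusoto: the function name `find_aws_credentials` and the candidate list are hypothetical, and it assumes the usual INI layouts, a `[default]` section in `~/.aws/credentials` and a `[Credentials]` section in `~/.boto`.

```python
import configparser
import os

# (path, section) pairs to try in order. The order and the section names
# are assumptions made for this illustration.
DEFAULT_CANDIDATES = [
    ("~/.aws/credentials", "default"),
    ("~/.boto", "Credentials"),
]


def find_aws_credentials(candidates=DEFAULT_CANDIDATES):
    """Return (access_key, secret_key) from the first usable file, else None."""
    for path, section in candidates:
        path = os.path.expanduser(path)
        if not os.path.isfile(path):
            # e.g. a builder that only has ~/.boto
            continue
        parser = configparser.ConfigParser()
        parser.read(path)
        if not parser.has_section(section):
            continue
        key = parser.get(section, "aws_access_key_id", fallback=None)
        secret = parser.get(section, "aws_secret_access_key", fallback=None)
        if key and secret:
            return key, secret
    return None
```

On a machine with no IAM role and no `~/.aws/credentials`, a lookup that stops after the first candidate finds nothing; falling through to `~/.boto` is the kind of fix described in the comment.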
So after a lot of fiddling (obviously) here's a try job with the original sccache using SCCACHE_RECACHE=1 to force compiling everything instead of reading from the cache:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c7b7676fb5c3b5883cc2c4566785376c0aafe98

and here's a try run with sccache2 also using SCCACHE_RECACHE=1:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0aca4756d9675c899c5b1a04a89220b49ead79fa

They're built atop the same base revision.
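For readers unfamiliar with the switch, the effect of `SCCACHE_RECACHE` as used in these runs (compile everything instead of reading from the cache) can be modeled with a toy in-memory cache. This is only a sketch: `cached_compile`, the `compile_fn` callback, and the dict-based cache are hypothetical, and writing the fresh result back over the old entry is an assumption based on the earlier comment about using it to recover from bad cache entries.

```python
import hashlib
import os


def cached_compile(source, compile_fn, cache, env=os.environ):
    """Compile `source`, consulting `cache` unless SCCACHE_RECACHE is set."""
    # Toy key: a hash of the source text only. A real compiler cache has
    # to hash far more (compiler, arguments, relevant environment).
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    recache = bool(env.get("SCCACHE_RECACHE"))
    if not recache and key in cache:
        return cache[key], "hit"
    result = compile_fn(source)
    # Assumption: the fresh result replaces whatever was stored before.
    cache[key] = result
    status = "recache" if recache else "miss"
    return result, status
```

With the variable set, every call takes the compile path, which is why the two try pushes above compare raw compile times rather than cache hit rates.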
Here's a comparison of two try pushes, similar to comment 33, except I removed the SCCACHE_RECACHE, so this is comparing builds that should be mostly cache hits:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=4f7d1609a34d485119d81f1b3e93bb4294fbabc1&newProject=try&newRevision=139e47cbd01b681ee39bdd1338de7d3b83c77871&framework=2&showOnlyImportant=0

I also rebased on top of gps' patches to split the build metrics out between buildbot and taskcluster, as well as by instance type, which helps.
FYI I summarized the build time comparisons I did on try (comment 39 and comment 40):
https://docs.google.com/document/d/1HWTMnStwFzUVNYAh6K7MwwOShspkDc_w8xCruyEqTWg/edit

I was mostly just trying to ensure that I didn't regress anything, but it looks like sccache2 will actually give us noticeable build time improvements on many platforms.
This still depends on landing bug 1295937, but that's close enough to landing that I thought I'd get this patch up for review. Most of the useful info should be in the commit message, but here are a few other things:

I'm currently building the binaries using this script on my local Windows/Linux/Mac machines: https://github.com/luser/sccache2/blob/master/scripts/build-release.sh . I'd like to move to something better but I haven't quite figured that out yet. Taskcluster doesn't have support for mac workers yet, so to build binaries in TC I'd have to cross-compile them from Linux (which is feasible).

I'm using another script to upload the resulting binaries to tooltool:
https://github.com/luser/sccache2/blob/master/scripts/upload-tooltool.sh

...and a third script to take those resulting tooltool manifests and merge them into the in-tree manifests:
https://github.com/luser/sccache2/blob/master/scripts/update-gecko-manifests.py
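The merge step can be pictured roughly like this. The sketch is not the linked `update-gecko-manifests.py`; `merge_manifest` is a hypothetical helper, and it assumes a tooltool manifest is a JSON list of objects that can be matched by their `filename` field.

```python
def merge_manifest(in_tree, uploaded):
    """Return a new list: in_tree entries, replaced or extended by filename."""
    new_by_name = {entry["filename"]: entry for entry in uploaded}
    merged = []
    for entry in in_tree:
        # Swap in the freshly uploaded entry when the filename matches,
        # otherwise keep the existing in-tree entry unchanged.
        merged.append(new_by_name.pop(entry["filename"], entry))
    # Anything left over was not in the in-tree manifest yet.
    merged.extend(new_by_name.values())
    return merged
```

Matching on `filename` keeps unrelated entries (compilers, other tools) in their original order, so the resulting diff against the in-tree manifest only touches the entry being updated.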

This whole process is kinda crappy. I would love it if we could fix bug 1313111 and just have this all work in the taskgraph.
Depends on: 1295937

Comment 47

2 years ago
mozreview-review
Comment on attachment 8811801 [details]
bug 1286934 - Switch to using sccache2.

https://reviewboard.mozilla.org/r/93758/#review94068

This looks pretty straightforward!

We should get the scripts for building sccache in the tree. Even if they aren't hooked up to the task graph, it is better than them sitting in some random repo elsewhere. I guess you can toss them in `build/build-sccache` or some such and we can refactor things later.
Attachment #8811801 - Flags: review?(gps) → review+
I'm fine with putting them wherever, but the *build* is pretty simple. It's the tooltool bits that are the PITA. :-/

Comment 49

2 years ago
mozreview-review
Comment on attachment 8811801 [details]
bug 1286934 - Switch to using sccache2.

https://reviewboard.mozilla.org/r/93758/#review93952

::: browser/config/tooltool-manifests/linux32/releng.manifest:36
(Diff revision 1)
>  },
>  {
> -"size": 167175,
> -"digest": "0b71a936edf5bd70cf274aaa5d7abc8f77fe8e7b5593a208f805cc9436fac646b9c4f0b43c2b10de63ff3da671497d35536077ecbc72dba7f8159a38b580f831",
>  "algorithm": "sha512",
> -"filename": "sccache.tar.bz2",
> +"visibility": "public",

in-tree manifests don't need visibility

::: build/sccache.mk:14
(Diff revision 1)
>  BASE_DIR = $(MOZ_OBJDIR)/$(firstword $(MOZ_BUILD_PROJECTS))
>  endif
>  
>  preflight_all:
>  	# Terminate any sccache server that might still be around
> -	-python2.7 $(TOPSRCDIR)/sccache/sccache.py > /dev/null 2>&1
> +	-$(TOPSRCDIR)/sccache2/sccache --stop-server > /dev/null 2>&1

Should need .exe on Windows, right?

::: build/sccache.mk:14
(Diff revision 1)
>  BASE_DIR = $(MOZ_OBJDIR)/$(firstword $(MOZ_BUILD_PROJECTS))
>  endif
>  
>  preflight_all:
>  	# Terminate any sccache server that might still be around
> -	-python2.7 $(TOPSRCDIR)/sccache/sccache.py > /dev/null 2>&1
> +	-$(TOPSRCDIR)/sccache2/sccache --stop-server > /dev/null 2>&1

Is the redirection to /dev/null still necessary?
(Assignee)

Comment 50

2 years ago
mozreview-review-reply
Comment on attachment 8811801 [details]
bug 1286934 - Switch to using sccache2.

https://reviewboard.mozilla.org/r/93758/#review93952

> in-tree manifests don't need visibility

Turns out if you don't specify --visibility=public tooltool.py will refuse to upload things, which is why this is here. I'm not motivated enough to try to fix that just for the sake of making the manifests look nicer.

> Should need .exe on Windows, right?

This works because the msys shell will find it with or without the exe. (The change in mozconfig.cache is because we were passing the path to `test -e`.)

> Is the redirection to /dev/null still necessary?

Probably not, but I stuck an "echo stats to the log" bit earlier in the build, so it'd be redundant. I haven't yet made sccache2 save its stats to disk, and with the 5 minute inactivity timeout the server would shut itself down in the lull between the compilation phase finishing and the time the build hits `postflight_all`, so the stats were getting lost.

Longer-term I might want to integrate special sccache handling into mach or something.
FYI, I moved the sccache2 repo to mozilla/sccache:
https://github.com/mozilla/sccache

Comment 56

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/683e59dc3094
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox53: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53

Updated

4 months ago
Product: Core → Firefox Build System