Closed Bug 1562686 Opened 6 months ago Closed 3 months ago

[sccache] Migrate sccache to new deployment

Categories

(Taskcluster :: Operations and Service Requests, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

(Keywords: leave-open)

Attachments

(2 files)

I don't know much about this. What I do know:

There are a bunch of S3 buckets named "...-sccache" that contain cache data for Firefox builds. Workers have access to these buckets (somehow). They're in the Taskcluster production AWS account right now, but could probably be moved elsewhere if necessary. These are separate from Taskcluster artifacts (and are a fairly Mozilla-specific thing).

Next steps:

  • gather data on how this works now
  • work with chmanchester, grenade, and cloudops to talk about how we want it to work in the new deployment
    • use the same AWS account?
    • does windows currently use an IAM role, and should that change to use auth.awsSTSCredentials?

Wander, is this something you could work on?

Flags: needinfo?(wcosta)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #1)

Next steps:

  • gather data on how this works now
  • work with chmanchester, grenade, and cloudops to talk about how we want
    it to work in the new deployment
    • use the same AWS account?
    • does windows currently use an IAM role, and should that change to use
      auth.awsSTSCredentials?

Wander, is this something you could work on?

Not right now, but I might have some time by the end of the quarter.

Flags: needinfo?(wcosta)

OK, sounds like we need to find someone else then, to get this set up by the early-August deadline.

Flags: needinfo?(coop)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #1)

  • gather data on how this works now

Let's concentrate on this part for now.

:pmoore, :wcosta - can you describe how this works in generic-worker/docker-worker right now? Do workers deployed by OCC behave differently, i.e. do we need to loop in :grenade immediately?

Flags: needinfo?(wcosta)
Flags: needinfo?(pmoore)
Flags: needinfo?(coop)

(In reply to Chris Cooper [:coop] pronoun: he from comment #4)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #1)

  • gather data on how this works now

Let's concentrate on this part for now.

:pmoore, :wcosta - can you describe how this works in
generic-worker/docker-worker right now? Do workers deployed by OCC behave
differently, i.e. do we need to loop in :grenade immediately?

At least from docker-worker POV, it relates nothing with the worker or its AMI. Everything is set up in the relevant docker image used to run the build task. I am not sure how sccache is started, but once it is, it must use tc-auth to get temporary credentials. I am also unaware of how sccache interacts with the build system.

Flags: needinfo?(wcosta)

I'm not too sure how these buckets are managed/configured - Rob, do you know?

Flags: needinfo?(pmoore) → needinfo?(rthijssen)

i believe the buckets were manually created and are not managed by automation (there is a lot of room for improvement here. eg: terraform or some other automated, auditable, source-controlled approach).

you can see the buckets by searching for sccache at https://s3.console.aws.amazon.com/s3/.

windows builders use an iam role to access the bucket.

going forward it would be useful if we lost the iam role on windows in favour of the same mechanism linux workers use to obtain temporary credentials from tc-auth, but that involves more than trivial effort. i don't believe security would be improved, since iam also uses temporary rather than permanent credentials, shippable builds don't use sccache, and bucket access is scm-level specific. however, there must be some value in matching the implementations, and i believe the finer-grained controls used by the tc-auth approach must have some value or they wouldn't have been implemented.

there was some discussion by security folk about disabling sccache until we have a better sccache use story, so it would be worth speaking to ajvb too.

Flags: needinfo?(rthijssen)

Removing during the transition would certainly make things easier. That said, all of the requirements to support the tc-auth-based approach are in place already, while per-worker IAM roles are not currently supported by worker-manager. So, switching Windows to use the tc-auth approach would make it easy to enable this functionality in the new deployment.

Flags: needinfo?(abahnken)

Comment 7:

so it would be worth speaking to ajvb too.

^^ the reason for the needinfo :)

It's my understanding that the credentials from tc-auth are handled in-tree, removing the necessity of considering them when deploying workers. So hopefully the change wouldn't be too troublesome, since the code is already there to handle it. Aside from the benefit of matching implementations, the security improvements include the ability to audit access to the buckets and control it at a more fine-grained level.

So, [s]witching Windows to use the tc-auth approach would make it easy to enable this functionality in the new deployment.

Rob, I can't find the communication now but my impression is things are headed in this direction anyway. Can we make this the plan of record for the September cutover to the new cluster?

Flags: needinfo?(rthijssen)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #9)

Comment 7:

so it would be worth speaking to ajvb too.

^^ the reason for the needinfo :)

It's my understanding that the credentials from tc-auth are handled in-tree, removing the necessity of considering them when deploying workers. So hopefully the change wouldn't be too troublesome, since the code is already there to handle it. Aside from the benefit of matching implementations, the security improvements include the ability to audit access to the buckets and control it at a more fine-grained level.

That makes sense to me.

(In reply to Rob Thijssen [:grenade (EET/UTC+0300)] from comment #7)

...
there was some discussion by security folk about disabling sccache until we have a better sccache use story, so it would be worth speaking to ajvb too.

Just to give context to this, we had a discussion right before Whistler '19 about L1 workers being able to potentially poison sccache that would then be used by L1 workers. The current idea was to remove the L1 sccache deployment and to have L1 workers read from the L3 sccache and not write anything. This hasn't been followed up on yet, but just wanted to add the context.

Flags: needinfo?(abahnken)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #10)

So, [s]witching Windows to use the tc-auth approach would make it easy to enable this functionality in the new deployment.

Rob, I can't find the communication now but my impression is things are headed in this direction anyway. Can we make this the plan of record for the September cutover to the new cluster?

i think so but someone will have to pick up the work. from a windows infra perspective, all we'd do is remove the iam roles that we currently apply to gecko-[1-3]-b-win2012.

the harder part is:

  • the build system or windows build configurations will need to be modified to obtain and use credentials from tc-auth since they won't be available at the worker instance level from iam.
  • does generic-worker already support requesting and using sccache credentials if the builds request it? i have no idea. i'm under the impression that the linux builds which do this are using docker worker but i may be out of the loop on that.
Flags: needinfo?(rthijssen)

does generic-worker already support requesting and using sccache credentials if the builds request it? i have no idea. i'm under the impression that the linux builds which do this are using docker worker but i may be out of the loop on that.

Yes, it's in-task, and uses the taskcluster-proxy, which generic-worker also supports.
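For reference, the in-task flow amounts to an HTTP GET against the taskcluster-proxy, which exchanges the task's scopes for temporary S3 credentials. A minimal Python sketch of that exchange, assuming the `iam-role-compat` endpoint format that appears in the task logs later in this bug; `fetch_credentials` is a hypothetical helper, not actual in-tree code:

```python
import json
import urllib.request

# taskcluster-proxy hostname, only resolvable from inside a running task.
PROXY = "http://taskcluster"

def sccache_credentials_url(bucket: str) -> str:
    """Build the iam-role-compat credentials URL seen in the task logs."""
    return (f"{PROXY}/auth/v1/aws/s3/read-write/{bucket}/"
            "?format=iam-role-compat")

def fetch_credentials(bucket: str) -> dict:
    # The proxy re-signs this request with the task's credentials, so the
    # task must carry the auth:aws-s3:read-write:<bucket>/* scope.
    with urllib.request.urlopen(sccache_credentials_url(bucket)) as resp:
        return json.load(resp)
```

sccache reads the same URL from AWS_IAM_CREDENTIALS_URL, so no worker-level IAM configuration is needed.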

Assignee: nobody → dustin

I'm hoping someone can make some relatively quick in-tree changes to support this.

Assignee: dustin → nobody

I'll at least see what those in-tree changes might look like. I do want to avoid our team owning this, however!

Assignee: nobody → dustin

https://github.com/mozilla/sccache/pull/492 to document the technique used for AWS + linux

OK, so here's how Linux works: build-linux.sh sets AWS_IAM_CREDENTIALS_URL to point at the taskcluster proxy, and sccache fetches temporary credentials from that URL.

So, I think all we need to do is set that env var in windows CI as well.

From what I can tell, on Windows things work like this:

  • task.payload.command has a run-task invocation that boils down to run-task [options] %GECKO_PATH%/testing/mozharness/scripts/fx_desktop_build.py [options], so run-task starts the mozharness script, which then does the build. From what I can tell, all builds use fx_desktop_build.py

Under the hood, build-linux.sh (mentioned in previous comment) runs a similar thing, using $MOZHARNESS_SCRIPT. For all things that use SCCACHE, that variable is set to fx_desktop_build.py, so that seems to be the common denominator for things that use SCCACHE.

dustin@lamport ~/p/m-c (bug1562686) $ jq 'values | .[] | select(.task.payload.env.MOZHARNESS_SCRIPT != null and .task.payload.env.MOZHARNESS_SCRIPT != "mozharness/scripts/fx_desktop_build.py") | .kind' tasks.json | sort -u
"generate-profile"
"l10n"
"nightly-l10n"
"openh264-plugin"
"release-generate-checksums"
"repackage"
"repackage-l10n"
"test"
"webrender"
dustin@lamport ~/p/m-c (bug1562686) $ jq 'values | .[] | select(.task.payload.env.USE_SCCACHE == "1") | .kind' tasks.json | sort -u
"build"
"build-fat-aar"
"instrumented-build"
"searchfox"
"static-analysis-autotest"
"valgrind"

those sets do not overlap.

So, I think the fix here is to remove the env var setting from build-linux.sh and add it in fx_desktop_build.py. My mozharness-fu is weak, but I can try to make that work. Chris, does that sound right?

Flags: needinfo?(cmanchester)

That sounds right. The other place to consider adding this would be mozconfig.cache.

Flags: needinfo?(cmanchester)

I created aws-provisioner workerType gecko-1-b-win2012-no-isntprof to test this. Including the typo!

Rob, I'm not able to get a worker to claim tasks if I remove the instance profile from its worker type configuration
https://tools.taskcluster.net/aws-provisioner/gecko-1-b-win2012-no-isntprof/resources
https://tools.taskcluster.net/groups/WK_MfwYhTdm8gwPkn8_vxg/tasks/WK_MfwYhTdm8gwPkn8_vxg/runs/0
any idea what would cause that?

Flags: needinfo?(rthijssen)

(I also can't find the logs for that worker in papertrail -- I can find something under system ip-10-144-31-14 but that's for a docker-worker instance that started and ended before this worker did)

most likely the same issue as described in bug 1572089, comment 10. simplest thing is just to use worker type gecko-1-b-win2012-beta, which exists for tests of this nature, unless there's something special about the worker configuration that needs to be kept separate. if so, i can create the gecko-1-b-win2012-no-isntprof worker type in occ for you. let me know.

Flags: needinfo?(rthijssen)

OK, I'll just use beta, then. Thanks!

removing

        "IamInstanceProfile": {
          "Arn": "arn:aws:iam::692406183521:instance-profile/taskcluster-level-1-sccache"
        },

from `instanceTypes[*].launchSpec` in https://tools.taskcluster.net/aws-provisioner/gecko-1-b-win2012-beta/edit

https://taskcluster-artifacts.net/fYcNZJlgRN-FEZ7vHV473A/0/public/build/sccache.log looks like it ran sccache

So, I'm reasonably confident that this works.

Wait, that was a successful run without the patch. And mysteriously, the IamInstanceProfile has returned in the beta worker type. I guess OCC does that sometimes? So, it proves nothing.

Well, it proves that I don't know how to test this.

OK, modified the workerType definition and got aws-provisioner to start an instance before things got reverted. I've confirmed that the instance on which
https://tools.taskcluster.net/groups/am0dja3sQtCL_6sCqybfmA/tasks/MvX1D5g0RgyReA_G4m8t2A/runs/0
is running, i-0de1cb4a494e24db8, has an empty "IAM Role" in the UI, whereas other gecko-1-b-win2012 instances have taskcluster-level-1-sccache in that spot. So, if that task passes, then I will be justifiably reasonably confident that this works.

https://taskcluster-artifacts.net/MvX1D5g0RgyReA_G4m8t2A/0/public/build/sccache.log doesn't look promising:

DEBUG 2019-08-13T20:56:45Z: sccache::compiler::compiler: [host_DiagnosticsMatcher.obj]: Cache write error: Error(Msg("failed to get AWS credentials"), State { next_error: Some(Error(Msg("Couldn\'t find AWS credentials in environment, credentials file, or IAM role."), State { next_error: None, backtrace: InternalBacktrace })), backtrace: InternalBacktrace })
DEBUG 2019-08-13T20:56:45Z: sccache::server: Error executing cache write: failed to get AWS credentials

Ah, earlier in the log:

DEBUG 2019-08-13T20:56:43Z: sccache::simples3::credential: Attempting to fetch credentials from http://taskcluster/auth/v1/aws/s3/read-write/taskcluster-level-1-sccache-/?format=iam-role-compat

so it is using the URL. However, the task does not have the relevant scope.

DEBUG 2019-08-14T13:47:20Z: sccache::simples3::credential: Attempting to fetch credentials from http://taskcluster/auth/v1/aws/s3/read-write/taskcluster-level-1-sccache-/?format=iam-role-compat
WARN 2019-08-14T13:47:20Z: sccache::simples3::credential: Failed to fetch IAM credentials: Couldn't find AccessKeyId in response.

It looks like that URL is incorrect.

Ugh, generic-worker doesn't set TASKCLUSTER_WORKER_GROUP. But there's already code in that file that determines the bucket name... one more try.
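The trailing dash in the failing URL above suggests the region suffix came out empty. A sketch of the bucket-name construction implied by the logs (the helper name is illustrative, and the assumption that the region suffix comes from TASKCLUSTER_WORKER_GROUP is inferred from this comment, not from the actual in-tree code):

```python
import os

def sccache_bucket(level: str, worker_group: str) -> str:
    """Bucket names look like taskcluster-level-3-sccache-us-east-1.

    When worker_group is empty (generic-worker doesn't set
    TASKCLUSTER_WORKER_GROUP), the name ends with a dangling '-',
    matching the bad URL seen in the sccache log above.
    """
    return f"taskcluster-level-{level}-sccache-{worker_group}"

# With the variable unset, the broken name from the log is reproduced:
bucket = sccache_bucket("1", os.environ.get("TASKCLUSTER_WORKER_GROUP", ""))
```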

https://treeherder.mozilla.org/#/jobs?repo=try&revision=015b2f9e607a5fec1c0a7d61d9a59881e8a3c7ea

DEBUG 2019-08-14T17:14:39Z: sccache::server: [target_lexicon]: Cache write finished in 0.142 s

woo!

I've changed the patch quite a bit, so I'll push a new rev.

In the main log,

[task 2019-08-13T20:51:20.400Z] 20:51:20     INFO -      export AWS_IAM_CREDENTIALS_URL=http://taskcluster/auth/v1/aws/s3/read-write/taskcluster-level-1-sccache-/?format=iam-role-compat
Blocks: 1573977

The latest run with the updated patch was green. I thought I had commented here, but apparently not! Anyway, this works on a windows worker without an instance profile.

Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/91eca815c9fc
use AWS_IAM_CREDENTIALS_URL for all S3 sccache invocations r=chmanchester

Weird...
sccache.log looks good:

DEBUG 2019-08-16T19:34:05Z: sccache::simples3::credential: Using AWS credentials from IAM
DEBUG 2019-08-16T19:34:05Z: sccache::simples3::s3: PUT http://taskcluster-level-3-sccache-us-east-1.s3.amazonaws.com/3/1/0/310a851cfae971d940681332d950f5972b726521b5edf4331b33b53ec2223464b33244ceed06e55867115632a853b162d6ced4200ec612d348749a24df15f74d

and in the main log..

[fetches 2019-08-16T19:31:22.420Z] Extracting /builds/worker/fetches/sccache.tar.xz to /builds/worker/fetches
...
[fetches 2019-08-16T19:31:23.480Z] /builds/worker/fetches/sccache.tar.xz extracted in 1.060s
[fetches 2019-08-16T19:31:23.480Z] Removing /builds/worker/fetches/sccache.tar.xz
...
[task 2019-08-16T19:31:57.798Z] 19:31:57     INFO -      export AWS_IAM_CREDENTIALS_URL=http://taskcluster/auth/v1/aws/s3/read-write/taskcluster-level-3-sccache-us-east-1/?format=iam-role-compat
...
[task 2019-08-16T19:31:57.798Z] 19:31:57     INFO -      MOZBUILD_MANAGE_SCCACHE_DAEMON=/builds/worker/fetches/sccache/sccache
...
[task 2019-08-16T19:32:00.511Z] 19:32:00     INFO -  checking for ccache... /builds/worker/fetches/sccache/sccache
...
[task 2019-08-16T19:33:15.016Z] 19:33:15     INFO -  env RUST_LOG=sccache=debug SCCACHE_ERROR_LOG=/builds/worker/artifacts/sccache.log /builds/worker/fetches/sccache/sccache --start-server
[task 2019-08-16T19:33:15.019Z] 19:33:15     INFO -  DEBUG 2019-08-16T19:33:15Z: sccache::config: Attempting to read config file at "/builds/worker/.config/sccache/config"
[task 2019-08-16T19:33:15.019Z] 19:33:15     INFO -  DEBUG 2019-08-16T19:33:15Z: sccache::config: Couldn't open config file: No such file or directory (os error 2)
[task 2019-08-16T19:33:15.019Z] 19:33:15     INFO -  Starting sccache server...
[task 2019-08-16T19:33:15.020Z] 19:33:15     INFO -  DEBUG 2019-08-16T19:33:15Z: sccache::config: Attempting to read config file at "/builds/worker/.config/sccache/config"
[task 2019-08-16T19:33:15.020Z] 19:33:15     INFO -  DEBUG 2019-08-16T19:33:15Z: sccache::config: Couldn't open config file: No such file or directory (os error 2)
...
[task 2019-08-16T19:54:16.041Z] 19:54:16     INFO - Calling ['/builds/worker/workspace/build/src/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin/python', 'mach', 'valgrind-test'] with output_timeout 2400
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  Error running mach:
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      ['valgrind-test']
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  The error occurred in code that was called by the mach command. This is either
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  a bug in the called code itself or in the way that mach is calling it.
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  You can invoke |./mach busted| to check if this issue is already on file. If it
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  isn't, please use |./mach busted file| to report it. If |./mach busted| is
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  misbehaving, you can also inspect the dependencies of bug 1543241.
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  If filing a bug, please include the full output of mach, including this error
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  message.
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  The details of the failure are as follows:
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -  MetaCharacterException
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -    File "/builds/worker/workspace/build/src/build/valgrind/mach_commands.py", line 107, in valgrind_test
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      env.update(self.extra_environment_variables)
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/util.py", line 980, in __get__
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      setattr(instance, name, self.func(instance))
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/base.py", line 397, in extra_environment_variables
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      exports = shellutil.split(line)[1:]
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/shellutil.py", line 177, in split
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      return _ClineSplitter(s).result
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/shellutil.py", line 65, in __init__
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      self._parse_unquoted()
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/shellutil.py", line 117, in _parse_unquoted
[task 2019-08-16T19:54:16.782Z] 19:54:16     INFO -      raise MetaCharacterException(match['special'])
[task 2019-08-16T19:54:17.075Z] 19:54:17    ERROR - Return code: 1
...
[task 2019-08-16T19:54:17.510Z] 19:54:17     INFO - Running post-run listener: _shutdown_sccache
[task 2019-08-16T19:54:17.510Z] 19:54:17     INFO - Running command: ['/builds/worker/workspace/build/src/sccache/sccache', '--stop-server'] in /builds/worker/workspace/build/src
[task 2019-08-16T19:54:17.510Z] 19:54:17     INFO - Copy/paste: /builds/worker/workspace/build/src/sccache/sccache --stop-server
[task 2019-08-16T19:54:17.514Z] 19:54:17    ERROR - caught OS error 2: No such file or directory while running ['/builds/worker/workspace/build/src/sccache/sccache', '--stop-server']

I suspect that last bit about sccache not being found isn't fatal (I'm guessing it's a leftover from when sccache was in-tree?), and the MetaCharacterException is what did us in. Indeed:

>>> from mozbuild.shellutil import split
>>> split('foo=1')
['foo=1']
>>> split('foo=1?fbo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dustin/p/m-c/python/mozbuild/mozbuild/shellutil.py", line 177, in split
    return _ClineSplitter(s).result
  File "/home/dustin/p/m-c/python/mozbuild/mozbuild/shellutil.py", line 65, in __init__
    self._parse_unquoted()
  File "/home/dustin/p/m-c/python/mozbuild/mozbuild/shellutil.py", line 117, in _parse_unquoted
    raise MetaCharacterException(match['special'])
mozbuild.shellutil.MetaCharacterException

so apparently this shell-parsing library can't handle a `?` in a string.

Flags: needinfo?(dustin)

I'll try a try push with

-        mk_add_options "export AWS_IAM_CREDENTIALS_URL=http://taskcluster/auth/v1/aws/s3/read-write/${bucket}/?format=iam-role-compat"
+        mk_add_options "export 'AWS_IAM_CREDENTIALS_URL=http://taskcluster/auth/v1/aws/s3/read-write/${bucket}/?format=iam-role-compat'"
[task 2019-08-19T16:12:09.516Z] 16:12:09    ERROR -  Traceback (most recent call last):
[task 2019-08-19T16:12:09.516Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/configure.py", line 133, in <module>
[task 2019-08-19T16:12:09.516Z] 16:12:09     INFO -      sys.exit(main(sys.argv))
[task 2019-08-19T16:12:09.516Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/configure.py", line 39, in main
[task 2019-08-19T16:12:09.516Z] 16:12:09     INFO -      sandbox.run(os.path.join(os.path.dirname(__file__), 'moz.configure'))
[task 2019-08-19T16:12:09.516Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/configure/__init__.py", line 497, in run
[task 2019-08-19T16:12:09.517Z] 16:12:09     INFO -      func(*args)
[task 2019-08-19T16:12:09.517Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/configure/__init__.py", line 541, in _value_for
[task 2019-08-19T16:12:09.517Z] 16:12:09     INFO -      return self._value_for_depends(obj)
[task 2019-08-19T16:12:09.517Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/util.py", line 961, in method_call
[task 2019-08-19T16:12:09.517Z] 16:12:09     INFO -      cache[args] = self.func(instance, *args)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/configure/__init__.py", line 550, in _value_for_depends
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      value = obj.result()
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/util.py", line 961, in method_call
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      cache[args] = self.func(instance, *args)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/configure/__init__.py", line 156, in result
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      return self._func(*resolved_args)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/configure/__init__.py", line 1125, in wrapped
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      return new_func(*args, **kwargs)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/build/moz.configure/rust.configure", line 60, in unwrap
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      (retcode, stdout, stderr) = get_cmd_output(prog, '+stable')
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/configure/__init__.py", line 1125, in wrapped
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      return new_func(*args, **kwargs)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/build/moz.configure/util.configure", line 46, in get_cmd_output
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      log.debug('Executing: `%s`', quote(*args))
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/shellutil.py", line 210, in quote
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      return ' '.join(_quote(s) for s in strings)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/shellutil.py", line 210, in <genexpr>
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      return ' '.join(_quote(s) for s in strings)
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -    File "/builds/worker/workspace/build/src/python/mozbuild/mozbuild/shellutil.py", line 198, in _quote
[task 2019-08-19T16:12:09.519Z] 16:12:09     INFO -      return t("'%s'") % s.replace(t("'"), t("'\\''"))
[task 2019-08-19T16:12:09.519Z] 16:12:09    ERROR -  TypeError: cannot create 'NoneType' instances

I can't reproduce this locally. Chris, how would I go about setting a variable in a mozconfig where the value contains a `?`?

Flags: needinfo?(cmanchester)

(change ni since Chris is away and :glandium is all over shellutil.py)

Flags: needinfo?(cmanchester) → needinfo?(mh+mozilla)

The one with mk_add_options "export 'AWS_IAM_CREDENTIALS_URL=...'" failed because the quoting broke the mozconfig shell script and CARGO ended up not being set.
The one with mk_add_options "export AWS_IAM_CREDENTIALS_URL=..." failed because MozbuildObject.extra_environment_variables uses a function meant to split shell commands to read what is, in fact, make syntax for setting an environment variable.

Even if the first mk_add_options variant didn't break CARGO, it would still be wrong: export FOO = 'foo' in a makefile literally exports "'foo'":

$ cat /tmp/test.mk
export FOO = 'foo'

foo:
	echo $$FOO
$ make -f /tmp/test.mk
echo $FOO
'foo'

So the real solution on the mozconfig side is the second one. And MozbuildObject.extra_environment_variables should be fixed to read .mozconfig.mk as a makefile, presumably with pymake.
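Before concluding (below) that the caller can simply be removed, the make-parsing direction could be sketched like this; a hypothetical helper handling only simple `export NAME = value` lines, not landed code:

```python
import re

# Matches make-style "export NAME = value" lines from .mozconfig.mk.
_EXPORT_RE = re.compile(r"^export\s+(\w+)\s*=\s*(.*)$")

def parse_mozconfig_mk(text: str) -> dict:
    """Read exported variables as make assignments, not shell commands.

    Unlike shellutil.split(), this never chokes on shell metacharacters
    like '?' in the value, because make treats the value as literal text.
    """
    env = {}
    for line in text.splitlines():
        m = _EXPORT_RE.match(line.strip())
        if m:
            env[m.group(1)] = m.group(2)
    return env
```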

OTOH, the only thing using MozbuildObject.extra_environment_variables is build/valgrind/mach_commands.py and that was explicitly added for automation, not developer builds, back when we were pulling Gtk from tooltool and needed to set plenty of environment variables for that. But that's not the case anymore. That is, the code that was added in bug 1187245 and that required me to add MozbuildObject.extra_environment_variables is gone as of bug 1426785.

So I'd just backout the remaining half of bug 1187245.

Flags: needinfo?(mh+mozilla)
Summary: Migrate sccache to new deployment → [sccache] Migrate sccache to new deployment
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fe7b9445e1d3
use AWS_IAM_CREDENTIALS_URL for all S3 sccache invocations r=chmanchester
https://hg.mozilla.org/integration/autoland/rev/0ce37eda652a
revert remaining unnecessary bit of bug 1187245; r=glandium
Flags: needinfo?(dustin)

I'll elaborate a little on why: sccache writes were failing, which led to decreasing sccache hits and increasing build times.

Ionut: FYI, this is the cause for the large number of build metrics alerts, ranging from sccache write error increases, to sccache hit decreases, to build time increases. Do note, though, that some of the alerts are also about bug 1575471.

Flags: needinfo?(igoldan)

(In reply to Razvan Maries from comment #54)

Backed out as per glandium's request.

Backout: https://hg.mozilla.org/integration/autoland/rev/760f1b14b2a45a091f061b367ac474cdc8c13594

This is weird because from alert 22628 I can see that a642029c8e7e (bug 1528697) and 4b5339bfcdaa (bug 1573501) seem to have caused the regressions.

(In reply to Alexandru Ionescu :alexandrui from comment #57)

(In reply to Razvan Maries from comment #54)

Backed out as per glandium's request.

Backout: https://hg.mozilla.org/integration/autoland/rev/760f1b14b2a45a091f061b367ac474cdc8c13594

This is weird because from alert 22628 I can see that a642029c8e7e (bug 1528697) and 4b5339bfcdaa (bug 1573501) seem to have caused the regressions.

The effect is not immediate because it's related to caching. You can see everything going back in order exactly on the backout (starting with the sccache write errors).

How can you tell writes are failing? I had observed writes succeeding in my pushes.

Flags: needinfo?(dustin) → needinfo?(mh+mozilla)

(In reply to Mike Hommey [:glandium] from comment #56)

Ionut: FYI, this is the cause for the large number of build metrics alerts, ranging from sccache write error increases, to sccache hit decreases, to build time increases. Do note, though, that some of the alerts are also about bug 1575471.

Thanks for the heads up! Alexandru, FYI.

Flags: needinfo?(igoldan) → needinfo?(alexandru.ionescu)

Ah, I didn't see that alert link before -- that led me to the quarry. Looking at
https://tools.taskcluster.net/groups/c1hpOdxxQbmHNasJh46zuA/tasks/SPf7ZDCCQTeFEIt-iKnbZg/runs/0/artifacts
I see in sccache.log

 WARN 2019-08-21T23:45:20Z: sccache::simples3::credential: Failed to fetch IAM credentials: Didn't get a parseable response body from instance role details
DEBUG 2019-08-21T23:45:20Z: sccache::compiler::compiler: [arm.o]: Cache write error: Error(Msg("failed to get AWS credentials"), State { next_error: Some(Error(Msg("Couldn\'t find AWS credentials in environment, credentials file, or IAM role."), State { next_error: None, backtrace: InternalBacktrace })), backtrace: InternalBacktrace })

and in the task log:

[task 2019-08-21T23:42:17.623Z] 23:42:17     INFO -      export SCCACHE_BUCKET=taskcluster-level-3-sccache-us-west-2
[task 2019-08-21T23:42:17.623Z] 23:42:17     INFO -      export 'AWS_IAM_CREDENTIALS_URL=http://taskcluster/auth/v1/aws/s3/read-write/taskcluster-level-3-sccache-us-west-2/?format=iam-role-compat

The task has scope (among others)

assume:project:taskcluster:gecko:level-3-sccache-buckets

which expands to

auth:aws-s3:read-write:taskcluster-level-3-sccache-us-west-2/*

that should be sufficient (and is the same role and scope it's been using since forever)

The task has taskcluster-proxy enabled. In that condition, I can successfully fetch credentials.

I don't see any logged calls to the auth service's awsS3Credentials method at that time, or in fact at any time from that worker's IP (I just see auth.expandScopes calls from docker-worker). I do see such an awsS3Credentials call for my test task.

Flags: needinfo?(mh+mozilla)

This bit of code

https://github.com/mozilla/sccache/blob/6e3295a22283d6143859c8838f583cb37c176e03/src/simples3/credential.rs#L362

        let body = body
            .map_err(|_e| "Didn't get a parseable response body from instance role details".into())
            .and_then(|body| {
                String::from_utf8(body).chain_err(|| "failed to read iam role response")
            });

does a great job of discarding a useful error message and replacing it with a misleading one. There's nothing parsing-related going on here. Rather, there's an HTTP error of some sort. What sort, we don't know. So it seems clear that sccache is -- for whatever reason -- failing to talk to the taskcluster proxy.

I do see

Aug 21 23:39:44 docker-worker.aws-provisioner.us-west-2c.ami-0beb39c669e36dc9b.m5-4xlarge.i-0c190a9614ea4bab4 docker-worker: {"type":"ensure image","source":"top","provisionerId":"aws-provisioner-v1","workerId":"i-0c190a9614ea4bab4","workerGroup":"us-west-2","workerType":"gecko-3-b-linux","workerNodeType":"m5.4xlarge","image":{"name":"taskcluster/taskcluster-proxy:5.1.0","type":"docker-image"}} 

in the worker logs, suggesting it is correctly loading the proxy. So most likely, sccache is hitting the AWS metadata endpoint and getting an error which it is helpfully obscuring (I'll make a patch...).

The unclosed ' in

[task 2019-08-21T23:42:17.623Z] 23:42:17     INFO -      export 'AWS_IAM_CREDENTIALS_URL=http://taskcluster/auth/v1/aws/s3/read-write/taskcluster-level-3-sccache-us-west-2/?format=iam-role-compat

is curious. In fact, I bet that's it. So, how did this work in try??
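The effect of an unbalanced quote is easy to reproduce (a hypothetical reproduction with a made-up variable name, not the actual task setup): the shell hits a syntax error and the export never happens at all.

```python
import subprocess

# Balanced quotes: the export works and the value is visible.
ok = subprocess.run(
    ["sh", "-c", "export 'FOO=http://taskcluster/example'; printenv FOO"],
    capture_output=True, text=True)
print(ok.stdout.strip())  # prints: http://taskcluster/example

# Unterminated quote: the shell rejects the whole command with a
# syntax error, so FOO is never set.
bad = subprocess.run(
    ["sh", "-c", "export 'FOO=http://taskcluster/example; printenv FOO"],
    capture_output=True, text=True)
print(bad.returncode != 0)  # prints: True
```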

OK, I see 0's for cache write errors in various builds I clicked on, and

DEBUG 2019-08-22T14:41:54Z: sccache::server: [bzip2_sys]: Cache write finished in 0.534 s

in one sccache.log for linux and

DEBUG 2019-08-22T14:51:24Z: sccache::simples3::s3: PUT http://taskcluster-level-1-sccache-us-west-2.s3.amazonaws.com/5/c/0/5c0fd5994ae431164601cd8560585404d31f1d80d876b033f6c99a4e5563f7b28992cd979d1398013fff128b787602bd0f1f71a74fbef9d2e39095f4ede6098e
 INFO 2019-08-22T14:51:24Z: sccache::simples3::s3: Read 21174 bytes from http://taskcluster-level-1-sccache-us-west-2.s3.amazonaws.com/7/9/1/791fcead20fcfc79ec8957a62411df71b44d1c2c7e1590e684a60979aab36f9bc5f25cc0b682d02f24f5e2b12249ee3896507bbb41307e8975169c8acb675c8e
DEBUG 2019-08-22T14:51:24Z: sccache::compiler::compiler: [unistr_cnv.obj]: Cache hit in 0.030 s
DEBUG 2019-08-22T14:51:24Z: sccache::compiler::compiler: [cstr]: Stored in cache successfully!

in one for Windows.
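As an aside, the PUT/GET URLs above show the cache-key layout in use: the first three hex characters of the digest become single-character key prefixes. This is inferred from the logged URLs, so treat it as an observation rather than a spec:

```python
def sccache_s3_key(digest: str) -> str:
    """Key layout as observed in the logged URLs: the first three hex
    characters of the digest are used as one-character prefixes."""
    return f"{digest[0]}/{digest[1]}/{digest[2]}/{digest}"

print(sccache_s3_key("5c0fd5994ae43116"))  # prints: 5/c/0/5c0fd5994ae43116
```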

Alexandru, do you see any other issues with this try push?

Maybe that's a better question for :glandium

Flags: needinfo?(mh+mozilla)

I don't know what I should be looking for, but the Windows 2012 opt build has sccache write errors.


Flags: needinfo?(mh+mozilla)

Hm, so it worked for most jobs, just not that one? Notably, that's in eu-central-1. I can see calls to the auth.awsS3Credentials endpoint:

2019-08-22 14:58:37.204
 Fields: {
   apiVersion: "v1"    
   clientId: "task-client/Y8WnX6jQSSKGv8votzJCKw/0/on/eu-central-1/i-01d53ffa6138d524a/until/1566487069.247"    
   duration: 58.736319    
   expires: "2019-08-22T15:17:49.247Z"    
   hasAuthed: true    
   method: "GET"    
   name: "awsS3Credentials"    
   public: false    
   query: {…}    
   resource: "/aws/s3/read-write/taskcluster-level-1-sccache-eu-central-1/"    
   satisfyingScopes: [1]    
   sourceIp: "3.120.111.87"    
   statusCode: 200    
   v: 1    
  }

which is exactly the time that sccache says so in the logs:

DEBUG 2019-08-22T14:58:37Z: sccache::simples3::credential: Using AWS credentials from IAM
DEBUG 2019-08-22T14:58:37Z: sccache::simples3::s3: PUT http://taskcluster-level-1-sccache-eu-central-1.s3.amazonaws.com/5/c/c/5cc399006fa7402cdb7a6920e6758ddc198607f5b4e3c18e7bf7c148946cc5ac82c2438a09f6deab47c461f4daa31c6773bd6a532f77011d2b5d6142c2d17585
DEBUG 2019-08-22T14:58:37Z: sccache::compiler::compiler: [host_AssertAssignmentChecker.obj]: Cache write error: Error(Msg("failed to put cache entry in s3"), State { next_error: Some(Error(BadHTTPStatus(400), State { next_error: None, backtrace: InternalBacktrace })), backtrace: InternalBacktrace })
DEBUG 2019-08-22T14:58:37Z: sccache::server: Error executing cache write: failed to put cache entry in s3

and I can confirm that the auth service has the policy necessary for the eu-central-1 bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1467642244000",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::taskcluster-level-1-sccache-eu-central-1/*",
                "arn:aws:s3:::taskcluster-level-1-sccache-us-east-1/*",
                "arn:aws:s3:::taskcluster-level-1-sccache-us-west-1/*",
                "arn:aws:s3:::taskcluster-level-1-sccache-us-west-2/*"
            ]
        },
        {
            "Sid": "Stmt1467642244001",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketLocation",
                "s3:GetBucketTagging",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::taskcluster-level-1-sccache-eu-central-1",
                "arn:aws:s3:::taskcluster-level-1-sccache-us-east-1",
                "arn:aws:s3:::taskcluster-level-1-sccache-us-west-1",
                "arn:aws:s3:::taskcluster-level-1-sccache-us-west-2"
            ]
        }
    ]
}
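As a sanity check, the Allow statements above can be matched mechanically. This is a minimal sketch (Allow statements only; no Deny, Condition, or the rest of IAM's real semantics, and simple prefix globbing rather than full ARN matching):

```python
import fnmatch

def policy_allows(policy: dict, action: str, resource: str) -> bool:
    """Return True if any Allow statement matches both the action and
    the resource. Minimal sketch: ignores Deny, Condition, NotAction."""
    for stmt in policy["Statement"]:
        if stmt.get("Effect") != "Allow":
            continue
        if any(fnmatch.fnmatchcase(action, a) for a in stmt["Action"]) and \
           any(fnmatch.fnmatchcase(resource, r) for r in stmt["Resource"]):
            return True
    return False

# Subset of the policy above:
policy = {
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:DeleteObject", "s3:GetObject", "s3:PutObject"],
        "Resource": ["arn:aws:s3:::taskcluster-level-1-sccache-eu-central-1/*"],
    }]
}
print(policy_allows(
    policy, "s3:PutObject",
    "arn:aws:s3:::taskcluster-level-1-sccache-eu-central-1/5/c/c/somekey",
))  # prints: True
```

A permissions failure would also typically surface as a 403 rather than a 400, which points away from the policy as the culprit here.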

so, why does S3 respond with a 400 status?

Hm, all of the sccache eu-central-1 buckets are empty. It looks like level 3 doesn't run in eu-central-1 at all, and based on the empty buckets I think that all try jobs running in eu-central-1 have been unable to write to caches for whatever reason since forever, and that's just gone unnoticed. It'd be nice to fix that, but not in this bug.

So, I've retriggered that job. If it comes back without any write errors, then I think this can be landed. OK by you, :glandium?

Flags: needinfo?(mh+mozilla)

Of five retriggers, four are in eu-central-1, and one is in us-west-1.

Indeed, the us-west-1 task has zero write errors. So, that's https://bugzilla.mozilla.org/show_bug.cgi?id=1576032. I think this is clear to land.

Fair enough.

Flags: needinfo?(mh+mozilla)

Hi,
Just for the record:

this push:
(In reply to Pulsebot from comment #52)

Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fe7b9445e1d3
use AWS_IAM_CREDENTIALS_URL for all S3 sccache invocations r=chmanchester
https://hg.mozilla.org/integration/autoland/rev/0ce37eda652a
revert remaining unnecessary bit of bug 1187245; r=glandium

caused alert summary 22611 (about 150 alerts) - regression
and this push (its backout):
(In reply to Razvan Maries from comment #54)

Backed out as per glandium's request.

Backout: https://hg.mozilla.org/integration/autoland/rev/760f1b14b2a45a091f061b367ac474cdc8c13594

caused alert summary 22698 (about 60 sccache alerts and about 120 build-time alerts) - improvement. More build-time improvements are probably still to come.
Note that Treeherder apparently wasn't able to detect the sccache regressions.

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #65)

Alexandru, do you see any other issues with this try push?

What do you mean by other issues? I see all green, which at first sight seems good. I'm not so familiar with the details of the build-times code, but if you tell me what I should look for, I will.

Flags: needinfo?(alexandru.ionescu)

Thanks Alexandru -- I think :glandium found what I needed. Also, note that bug 1576032 will likely lead to improvements in sccache performance on try.

Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f266a7b397c1
use AWS_IAM_CREDENTIALS_URL for all S3 sccache invocations r=chmanchester
https://hg.mozilla.org/integration/autoland/rev/a04fc912928e
revert remaining unnecessary bit of bug 1187245; r=glandium

0 write errors on
https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=263139872
and similar jobs on later pushes. I think we're OK here.

This appears to have stuck! I'll come back later this week to land the OCC change.

tomprince|pto> dustin: a=tomprince to land that everywhere

There's not a big rush, so Tom has agreed to do that when he's back (and it will be on beta by then anyway).

Flags: needinfo?(mozilla)

Tom, did this get uplifted?

Thank you!!

OK, I'll count this as done. The OCC change isn't landed yet, but it's not hurting anything and OCC will be unused soon enough.

Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
Flags: needinfo?(mozilla)