Closed
Bug 1285732
Opened 8 years ago
Closed 8 years ago
B2G Aries and Nexus 5 L cache broken
Categories
(Firefox OS Graveyard :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: gerard-majax, Assigned: garndt)
References
Details
https://treeherder.mozilla.org/#/jobs?repo=try&revision=67068b46a55e&selectedJob=23590287&filter-tier=1&filter-tier=2&filter-tier=3

Looking at the cache task, I see a timeout :/

https://public-artifacts.taskcluster.net/T2H3A6JuQEC0sdenO1H30A/1/public/logs/live_backing.log
Reporter
Comment 1•8 years ago
Do you know what is going on?
Flags: needinfo?(wcosta)
Flags: needinfo?(garndt)
Assignee
Comment 2•8 years ago
So far I'm not 100% sure what has changed to cause this to start failing, but it looks like we are receiving a 401 response when trying to create the artifacts to upload for these cache jobs via the taskcluster-proxy. All emulator and device builds seem to be affected. Our normal cache jobs seem to be OK.
Reporter
Comment 3•8 years ago
The retrigger also failed the same way. The last successful task I could find was https://tools.taskcluster.net/task-inspector/#ONF3M8IKTBuD7SgkeLPEIw/

I cannot tell if it is of any importance, but that task was run for nexus-5-l:

"create-repo-cache --force-clone --upload --proxy https://github.com/mozilla-b2g/B2G https://hg.mozilla.org/mozilla-central/raw-file/default/b2g/config/nexus-5-l/sources.xml"

while the failing task is made from aries: https://tools.taskcluster.net/task-inspector/#T2H3A6JuQEC0sdenO1H30A/

Another difference I could spot while checking the logs is that curl is not called the same way. On a successful upload, we get:

> [taskcluster-vcs] 0 run start : (cwd: /) curl --header 'Content-Type: application/x-tar' --header 'Content-Encoding: gzip' -X PUT --data-binary @'/root/.tc-vcs-repo/sources/git.mozilla.org/external/caf/platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/master.tar.gz' 'https://taskcluster-public-artifacts.s3-us-west-2.amazonaws.com/ONF3M8IKTBuD7SgkeLPEIw/0/public/git.mozilla.org/external/caf/platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/master.tar.gz?AWSAccessKeyId=AKIAJQESBGXODWDRTZUA&Content-Type=application%2Fx-tar&Expires=1467881677&Signature=QPEu8ANu6R84v%2FMEmwXj0rJDwdc%3D'
> [taskcluster-vcs] run end : curl --header 'Content-Type: application/x-tar' --header 'Content-Encoding: gzip' -X PUT --data-binary @'/root/.tc-vcs-repo/sources/git.mozilla.org/external/caf/platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/master.tar.gz' 'https://taskcluster-public-artifacts.s3-us-west-2.amazonaws.com/ONF3M8IKTBuD7SgkeLPEIw/0/public/git.mozilla.org/external/caf/platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/master.tar.gz?AWSAccessKeyId=AKIAJQESBGXODWDRTZUA&Content-Type=application%2Fx-tar&Expires=1467881677&Signature=QPEu8ANu6R84v%2FMEmwXj0rJDwdc%3D' (0) in 4610 ms

On a failed upload, we get:

> [taskcluster-vcs] 0 run start : (cwd: /) curl --header 'Content-Type: application/x-tar' --header 'Content-Encoding: gzip' -X PUT --data-binary @'/root/.tc-vcs-repo/sources/git.mozilla.org/external/caf/platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/master.tar.gz' '{{url}}'
> [taskcluster-vcs:warning] run end (with error) try (0/10) retrying in 8763.582732062787 ms : curl --header 'Content-Type: application/x-tar' --header 'Content-Encoding: gzip' -X PUT --data-binary @'/root/.tc-vcs-repo/sources/git.mozilla.org/external/caf/platform/prebuilts/gcc/linux-x86/arm/arm-linux-androideabi-4.9/master.tar.gz' '{{url}}'

Note the difference in the target URL: in the failure case we have a literal |{{url}}|. Was there a failed string substitution? At least, I see curl warnings:

> curl: (3) [globbing] nested brace in column 2
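The literal |{{url}}| in the failing command suggests a mustache-style template variable was never substituted before the command was executed. A minimal sketch (not tc-vcs's actual code; `renderTemplate` is a hypothetical helper) of a substitution guard that would have failed loudly at render time instead of handing the raw placeholder to curl:

```javascript
// Hypothetical guard against unsubstituted {{name}} placeholders.
// Throwing here surfaces the missing value immediately, rather than
// letting curl fail later with an opaque "[globbing] nested brace" warning.
function renderTemplate(template, vars) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!(name in vars)) {
      throw new Error(`Unsubstituted template variable: ${name}`);
    }
    return vars[name];
  });
}

// Substitution succeeds when the variable is supplied...
console.log(renderTemplate(
  "curl -X PUT --data-binary @file '{{url}}'",
  { url: 'https://example.com/artifact' }
));

// ...and fails fast when it is missing, instead of emitting '{{url}}'.
try {
  renderTemplate("curl -X PUT '{{url}}'", {});
} catch (err) {
  console.log(err.message);
}
```

In the actual failure, the 401 from the artifact-creation request (see the next comments) likely meant the signed upload URL was never obtained, leaving the template untouched.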
Assignee
Comment 4•8 years ago
After some debugging with tc-vcs, it appears this is the reason artifact creation is failing:

"details": {
  "status": "auth-failed",
  "message": "ext.certificate.expiry < now"
}

I'm not sure yet why this is happening, as nothing has changed in how credentials are updated within taskcluster-proxy, but somewhere something is going wrong. I also tried adding "--depth" to the `repo init` call, but that caused things to fail for other reasons.
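The "ext.certificate.expiry < now" message means the temporary credentials carry a signed certificate whose expiry timestamp is already in the past, so the auth service rejects the request. A minimal sketch of that check (illustrative only, not the actual taskcluster-auth code; the certificate's `expiry` is assumed to be milliseconds since the epoch):

```javascript
// Sketch of the server-side validity check that produces
// "ext.certificate.expiry < now": any request signed with a
// certificate past its expiry is rejected (hence the 401).
function certificateIsValid(certificate, now = Date.now()) {
  return certificate.expiry > now;
}

const expired = { expiry: Date.now() - 60 * 1000 }; // expired a minute ago
console.log(certificateIsValid(expired)); // false → request rejected

const fresh = { expiry: Date.now() + 60 * 60 * 1000 }; // valid for an hour
console.log(certificateIsValid(fresh)); // true
```

This points at a stale-credentials problem on the client side: the proxy kept signing requests with credentials that had expired.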
Assignee
Comment 5•8 years ago
I think I have pinpointed the issue: https://github.com/taskcluster/docker-worker/blob/master/lib/features/taskcluster_proxy.js#L86

Tasks that run with the proxy for longer than the initial claim will hit this. The taskcluster proxy is not being updated with the newest claim credentials; it keeps the credentials of the original claim (task.claim.credentials). It should instead be updated with the credentials passed in the emitted event. I've attempted to test this with a docker-worker deployment, but I have been hitting an issue creating an AMI all day long. I hope that issue goes away.
Updated•8 years ago
Flags: needinfo?(wcosta)
Assignee
Comment 6•8 years ago
OK, it looks like these nexus and aries cache tasks succeeded now with the new docker-worker fixes. The builds should still be verified, though.

https://tools.taskcluster.net/task-graph-inspector/#JdUQK2tRTHqy5ODaeICfUQ/caHK1O5zQtykU6tpaOcIlQ/
https://tools.taskcluster.net/task-graph-inspector/#JdUQK2tRTHqy5ODaeICfUQ/Rza6oXcmTDGfX4bCUZbKsw/1
Flags: needinfo?(garndt)
Assignee
Updated•8 years ago
Assignee: nobody → garndt
Comment 7•8 years ago
Thanks Greg for not giving up!
Assignee
Comment 8•8 years ago
np! Glad all is well. It appears there was a green build on m-c, so I'll consider this resolved.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED