Closed Bug 1164615 Opened 9 years ago Closed 7 years ago

Handle symbol uploads from taskcluster

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Unassigned)

For the Android and Desktop builds landed in bug 1125973, symbol uploads just do not occur.

We need to figure out a way to authenticate such uploads, and set it up.
One of the approaches we've discussed in RelEng is to offload this type of work to a separate task, i.e. leave the build task to produce unsigned binaries and tests, and delegate integration with other services to their own discrete tasks. So in this case you would have a 'symbol upload' task that depends on the build task, and knows how to grab symbols from S3 and publish them to the symbol store.

The aim here is to reduce the complexity and permissions required by the build tasks, and also be able to reuse things like the symbols uploading logic between different types of builds.
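
A rough sketch of that split, to make the shape concrete. The field names loosely follow Taskcluster task definitions, but the task IDs, the scope string, the artifact path and the SYMBOLS_ARTIFACT variable are all made up for illustration; this is not what the in-tree configuration looks like.

```python
# Illustrative only: every name below (scope, artifact path, env var) is hypothetical.
SYMBOLS_ARTIFACT = "public/build/target.crashreporter-symbols.zip"


def build_task(task_id):
    """Build task: produces artifacts and holds no service credentials."""
    return {
        "taskId": task_id,
        "dependencies": [],
        "scopes": [],  # nothing privileged is needed just to build
        "payload": {"artifacts": [SYMBOLS_ARTIFACT]},
    }


def symbol_upload_task(task_id, build_task_id):
    """Upload task: runs after the build and is the only one with upload rights."""
    return {
        "taskId": task_id,
        "dependencies": [build_task_id],
        "scopes": ["hypothetical:symbol-upload"],
        "payload": {
            # Tell the uploader which build artifact to fetch.
            "env": {"SYMBOLS_ARTIFACT": f"{build_task_id}/{SYMBOLS_ARTIFACT}"},
        },
    }


def tasks_with_credentials(tasks):
    """Return the IDs of tasks that carry any scopes at all."""
    return [t["taskId"] for t in tasks if t["scopes"]]
```

With this layout only the upload task shows up in tasks_with_credentials(), which is the point: the same upload task can be attached to any kind of build without the build itself gaining permissions.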
This is already working for builds in TC, so we should probably take advantage:
  https://dxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/upload_symbols.py
Ah, but that embeds the Socorro token in the docker image, and thus requires that the image be private.
I think separating uploads out to a separate task with credentials is a good first step, even if the docker image winds up private (although we should be able to build a public image, then a private image that's just that + credentials).

As a follow-up, we could improve our auth story. Socorro does have an API to hand out time-limited tokens on-demand, but unfortunately its auth is Persona-only right now. If we gave it the ability to do some other kind of auth then we could have a credential manager service somewhere that handed out time-limited tokens on-demand by asking Socorro for them, and life would be pretty good.
That sounds very much like what I want to do for RelengAPI.  There's always an issue with providing *some* private information, be that the secret itself or a secret that grants access to other secrets.  That leaves us trying to get the secret into the OS image somehow (and we've historically had a terrible time with that), and trying to make sure that secret doesn't leak in logs or images or anything like that.

My thinking is to create a relengapi proxy container which has a token with a lot of permissions attached.  When it gets a request from a worker container, it looks at that task's scopes, translates them to relengapi permissions, creates a temporary token with only those permissions, and then proxies the request using *that* token.  It can track that token's expiration and automatically renew if necessary, so we're not worried about jobs taking longer than the temporary token they're issued.
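
To make the token handling in that proxy idea concrete, here is a minimal sketch of the bookkeeping only. The scope prefix, the class name and the parameters are assumptions, and the random string stands in for a real call to RelengAPI made with the proxy's privileged token; no actual proxying is shown.

```python
import secrets
import time
from dataclasses import dataclass

# Hypothetical convention: scopes with this prefix name a RelengAPI permission.
PREFIX = "docker-worker:relengapi-proxy:"


@dataclass
class TempToken:
    value: str
    permissions: frozenset
    expires_at: float


class TokenBroker:
    """Issues and renews temporary tokens limited to a task's permissions."""

    def __init__(self, ttl=3600, renew_margin=300, clock=time.time):
        self.ttl = ttl                    # lifetime of each temporary token, seconds
        self.renew_margin = renew_margin  # renew when this close to expiry
        self.clock = clock
        self._cache = {}

    def permissions_for(self, scopes):
        """Translate task scopes to permissions; unrelated scopes are ignored."""
        return frozenset(s[len(PREFIX):] for s in scopes if s.startswith(PREFIX))

    def token_for(self, scopes):
        """Return a live token for these scopes, issuing a new one if needed."""
        perms = self.permissions_for(scopes)
        now = self.clock()
        token = self._cache.get(perms)
        if token is None or token.expires_at - now <= self.renew_margin:
            # Stand-in for asking RelengAPI to mint a token with only `perms`.
            token = TempToken(secrets.token_hex(16), perms, now + self.ttl)
            self._cache[perms] = token
        return token
```

The proxy would call token_for() on every request it forwards, so a job that outlives its first token simply gets a renewed one without the worker container noticing.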

The other option is for the decision task to fetch temporary tokens and embed them in the task descriptions using encrypted env vars.  Then the lack of perfect forward secrecy for those vars isn't a problem, because the tokens are temporary.
Either of those sounds like a good approach. The latter sounds kind of similar to what Amazon does with IAM instance profiles for EC2.
AWS's approach is very much like the first method -- embed credentials somewhere in the host system.  In Amazon's case, they're in the instance metadata at http://169.254.169.254 (so not on the host itself, per se).  But there's no magic proxying or injection of encrypted credentials into a task description or anything like that.
I think this will use a proxy, per bug 1168534 and bug 1170784.  I'd like to get bug 1170753 done first, then hopefully I'll have a better idea how to do this.
Assignee: dustin → nobody
Depends on: 1168534, 1170784
bug 1168979 got these into a separate task, which is nice. Doing the upload via a proxy so the container doesn't need an embedded secret would be even better.
No longer blocks: 1118394
See Also: → 1333230
So we could write a proxy, or we could just put the token into the Taskcluster secrets store. I guess the downside to that is that if we did things that way we'd be exposing the token to arbitrary user code, which isn't great.
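
For comparison, the secrets-store route would look roughly like this from inside the task. This is a sketch under assumptions: it supposes the task can reach the Taskcluster proxy at http://taskcluster, that the secret name is whatever we choose to create, and that the token sits under a "token" key in the secret body. None of that is settled in this bug.

```python
import json
from urllib.request import urlopen


def secret_url(name, base="http://taskcluster"):
    """URL of a named secret as seen through the in-task proxy (assumed host)."""
    return f"{base}/secrets/v1/secret/{name}"


def parse_secret(body, key="token"):
    """Pull one value out of the JSON document returned by the secrets service."""
    return json.loads(body)["secret"][key]


def fetch_token(name, key="token", opener=urlopen):
    """Fetch the upload token; only works in a task with the right scope."""
    with opener(secret_url(name)) as response:
        return parse_secret(response.read(), key)
```

Note that whatever code runs in the task after fetch_token() returns has the raw token in hand, which is exactly the exposure described above; the proxy approach never hands the token to the task.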
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: General Automation → General