Use taskcluster to send uploadFiles to S3

RESOLVED FIXED

Status

RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: mshal, Assigned: mshal)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 7 obsolete attachments)

(Assignee)

Description

4 years ago
This bug will track changes to mozharness for uploading build artifacts to s3.
(Assignee)

Comment 1

4 years ago
Created attachment 8537369 [details] [diff] [review]
upload-files

Note that the taskcluster==0.0.7 package doesn't exist yet - we're still making a few fixes there before we can release it and upload it to our pypi repository. However, I don't expect the mozharness portion to change much before then.

I'm not entirely sure if this is the right place in mozharness to upload the files, but it seems to work in a staging linux build. It reads the file list from the uploadFiles property which is added in bug 1109136.
Attachment #8537369 - Flags: review?(jlund)
(Assignee)

Updated

4 years ago
Depends on: 1108029
(Assignee)

Updated

4 years ago
Depends on: 1112303
Comment on attachment 8537369 [details] [diff] [review]
upload-files

Review of attachment 8537369 [details] [diff] [review]:
-----------------------------------------------------------------

awesome :)

disclaimer: I can't verify the tc config bits and queue steps as part of this review. you may want to get a tc knowledgeable person for those parts

_tcClaimTask, _tcCreateArtifact, _tcCreateTask, and _tcReportCompleted are great generic helper methods. maybe they should be part of a mozharness TaskCluster class (like Proxxy http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/proxxy.py#12)

r- to clear my queue, but everything is debatable if you feel strongly about something. :)

::: mozharness/mozilla/building/buildbase.py
@@ +1259,5 @@
>                  )
>  
>          self.generated_build_props = True
>  
> +    def _tcCreateTask(self, taskId):

nit: can you underscore_case instead of camel_case the method and arg names?

@@ +1262,5 @@
>  
> +    def _tcCreateTask(self, taskId):
> +        buildid = self.query_buildid()
> +        branch = self.buildbot_config['properties']['branch']
> +        platform = self.buildbot_config['properties']['platform']

I'd prefer to avoid referencing buildbot_config as much as possible. will self.branch and self.stage_platform[1] work for your needs?


[1] http://mxr.mozilla.org/build/search?string=stage_platform&find=mozharness%2Fconfigs%2Fbuilds&findi=&filter=%5E%5B%5E%5C0%5D*%24&hitlimit=&tree=build

@@ +1290,5 @@
> +            }
> +        })
> +        return task
> +
> +    def _tcClaimTask(self, task):

underscore case like above

@@ +1299,5 @@
> +                "workerGroup": "buildbot",
> +                "workerId": "buildbot",
> +            })
> +
> +    def _tcCreateArtifact(self, task, filename, putFile):

underscore case like above (also all the variable names within this method)

@@ +1335,5 @@
> +                "success": True,
> +            })
> +
> +    def upload_files(self):
> +        auth = os.path.join(os.getcwd(), self.config['balrog_credentials_file'])

now that we are using oauth.txt for more than balrog, let's change balrog_credentials_file to something more generic: 'credentials_file' wfm :)

@@ +1348,5 @@
> +        self.activate_virtualenv()
> +        site_packages_path = self.query_python_site_packages_path()
> +        if site_packages_path not in sys.path:
> +            sys.path.append(site_packages_path)
> +        from taskclusterclient import taskcluster

we talked a bit about this over irc. I think it be nice to wrap this as a method in VirtualenvMixin. e.g. create_and_activate_site_packages() or something

won't block on this, but I think we need to solve use cases like this at a higher level: maybe create a bootstrap.py at the base of mozharness, takes another script + script args as its own args, downloads the script if it doesn't exist (could be from in tree and allow for generic builder testing), creates a venv for a given script (if requirements are given), and then finally runs the script. documenting that idea down here so I can point to this from a few related bugs.

@@ +1351,5 @@
> +            sys.path.append(site_packages_path)
> +        from taskclusterclient import taskcluster
> +        taskcluster.config['credentials']['clientId'] = clientId
> +        taskcluster.config['credentials']['accessToken'] = accessToken
> +        self.taskcluster_queue = taskcluster.Queue()

nit: can you add this self attr to the top of the class with the other vars (assign None)

@@ +1359,5 @@
> +        self._tcClaimTask(task)
> +
> +        for uploadFile in self.query_buildbot_property('uploadFiles'):
> +            self._tcCreateArtifact(task, uploadFile, taskcluster.utils.putFile)
> +        self._tcReportCompleted(task)

not sure how taskcluster client handles exceptions but I'm wondering whether we should try/catch any of this for networking/errors?

::: scripts/fx_desktop_build.py
@@ +25,3 @@
>  
>  
> +class FxDesktopBuild(BuildScript, VirtualenvMixin, object):

can you put the VirtualenvMixin as part of BuildScript.

@@ +78,5 @@
>                  'stage_ssh_key': 'ffxbld_rsa',
> +                'virtualenv_modules': [
> +                    'requests==2.2.1',
> +                    'PyHawk-with-a-single-extra-commit==0.1.5',
> +                    'taskcluster==0.0.7',

you mentioned taskcluster 0.0.7 but I'm assuming PyHawk-with-a-single-extra-commit is available on our internal pypi server?
Attachment #8537369 - Flags: review?(jlund) → review-
Comment on attachment 8537369 [details] [diff] [review]
upload-files

Review of attachment 8537369 [details] [diff] [review]:
-----------------------------------------------------------------

forgot two things

::: mozharness/mozilla/building/buildbase.py
@@ +1260,5 @@
>  
>          self.generated_build_props = True
>  
> +    def _tcCreateTask(self, taskId):
> +        buildid = self.query_buildid()

not sure you need buildid anymore

@@ +1341,5 @@
> +        execfile(auth, credentials)
> +        clientId = credentials.get('taskcluster_clientId')
> +        accessToken = credentials.get('taskcluster_accessToken')
> +        if not clientId or not accessToken:
> +            self.warn('Skipping S3 file upload: No taskcluster credentials. ')

s/warn/warning/
(Assignee)

Comment 4

4 years ago
(In reply to Jordan Lund (:jlund) from comment #2)
> we talked a bit about this over irc. I think it be nice to wrap this as a
> method in VirtualenvMixin. e.g. create_and_activate_site_packages() or
> something

So I think what's supposed to happen is that we have a create-virtualenv action, and then just call activate_virtualenv() before I need to import things. In the b2g case we had to call create_virtualenv() directly since I needed to use the virtualenv inside a PreScriptRun function.

I tried to use a create-virtualenv action for fx_desktop_build, but ended up with:

16:09:22     INFO - Running main action method: create_virtualenv
16:09:22     INFO - Creating virtualenv /builds/slave/m-cen-lx-000000000000000000000/build/venv
16:09:22    FATAL - Uncaught exception: Traceback (most recent call last):
16:09:22    FATAL -   File "/builds/slave/m-cen-lx-000000000000000000000/scripts/mozharness/base/script.py", line 1277, in run
16:09:22    FATAL -     self.run_action(action)
16:09:22    FATAL -   File "/builds/slave/m-cen-lx-000000000000000000000/scripts/mozharness/base/script.py", line 1219, in run_action
16:09:22    FATAL -     self._possibly_run_method(method_name, error_if_missing=True)
16:09:22    FATAL -   File "/builds/slave/m-cen-lx-000000000000000000000/scripts/mozharness/base/script.py", line 1160, in _possibly_run_method
16:09:22    FATAL -     return getattr(self, method_name)()
16:09:22    FATAL -   File "/builds/slave/m-cen-lx-000000000000000000000/scripts/mozharness/base/python.py", line 335, in create_virtualenv
16:09:22    FATAL -     if not os.path.exists(virtualenv[0]) and not self.which(virtualenv[0]):
16:09:22    FATAL -   File "/builds/slave/m-cen-lx-000000000000000000000/scripts/mozharness/base/script.py", line 500, in which
16:09:22    FATAL -     env = self.query_env()
16:09:22    FATAL -   File "/builds/slave/m-cen-lx-000000000000000000000/scripts/mozharness/base/script.py", line 607, in query_env
16:09:22    FATAL -     env[key] = partial_env[key] % replace_dict
16:09:22    FATAL - KeyError: 'symbol_server_host'
16:09:22    FATAL - Running post_fatal callback...
16:09:22    ERROR - setting return code to 2 because fatal was called
16:09:22    FATAL - Exiting -1

I think the key is missing since it isn't added until 'build' when we call self.query_build_env() (I think - I don't fully understand this code).

For now I've left the code just calling create_virtualenv() followed by activate_virtualenv(). On the bright side, it seems that the hacky path modifying code isn't needed, at least in this case. I'm not sure if it's needed for influxdb as well - maybe that can just go away.

Hopefully this is at least a little better, though certainly not ideal.

> not sure how taskcluster client handles exceptions but I'm wondering whether
> we should try/catch any of this for networking/errors?

I'm not totally sure either - what would we do if we catch an exception? I mean, do we just want it to die anyway? Note that the taskcluster client will handle retries and such for the file uploads, at least.

> you mentioned taskcluster 0.0.7 but I'm assuming
> PyHawk-with-a-single-extra-commit is available on our internal pypi server?

Yeah, I uploaded the special PyHawk version to our pypi server. I'll upload tc-0.0.7 too once it's ready.

All other comments should be addressed, including those in #c3.
(Assignee)

Comment 5

4 years ago
Created attachment 8538201 [details] [diff] [review]
credentials_file

I separated this out for hopefully easier review - it just renames the balrog_credentials_file and influxdb_credentials_file -> credentials_file for use everywhere.
Attachment #8538201 - Flags: review?(jlund)
(Assignee)

Comment 6

4 years ago
Created attachment 8538205 [details] [diff] [review]
upload-files

This should have all other changes not specifically addressed in #c4. :lightsofapollo, can you also take a look at taskcluster.py to make sure it's using TC in a sane manner?
Attachment #8537369 - Attachment is obsolete: true
Attachment #8538205 - Flags: review?(jlund)
Attachment #8538205 - Flags: review?(jlal)
(Assignee)

Comment 7

4 years ago
Created attachment 8538207 [details] [diff] [review]
interdiff

Interdiff for reference. Though with the move to taskcluster.py, it may not help too much :)
Comment on attachment 8538205 [details] [diff] [review]
upload-files

Basics LGTM I think we could expand this to do more in the future if this works well.
Attachment #8538205 - Flags: review?(jlal) → review+
Comment on attachment 8538201 [details] [diff] [review]
credentials_file

Review of attachment 8538201 [details] [diff] [review]:
-----------------------------------------------------------------

we talked over irc and agreed on going the route of being more explicit than generic. iow - having a config item for every cred purpose, e.g. influx, balrog, and taskcluster.

I'm not too worried about the other mozharness scripts, but we should address this bugs script, fx_desktop_build.py
Attachment #8538201 - Flags: review?(jlund) → review-
Comment on attachment 8538205 [details] [diff] [review]
upload-files

Review of attachment 8538205 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/building/buildbase.py
@@ +522,5 @@
>          self.builduid = None
>          self.query_buildid()  # sets self.buildid
>          self.query_builduid()  # sets self.builduid
>          self.generated_build_props = False
> +        self.taskcluster_queue = None

IIUC, not needed anymore

@@ +1268,5 @@
> +        credentials = {}
> +        execfile(auth, credentials)
> +        clientId = credentials.get('taskcluster_clientId')
> +        accessToken = credentials.get('taskcluster_accessToken')
> +        if not clientId or not accessToken:

nit: the mozharness norm is to follow pep here and use underscore naming scheme for attrs. there are still a few places in this method that use camel case. sorry, I know that's a tiny nit :)

@@ +1269,5 @@
> +        execfile(auth, credentials)
> +        clientId = credentials.get('taskcluster_clientId')
> +        accessToken = credentials.get('taskcluster_accessToken')
> +        if not clientId or not accessToken:
> +            self.warning('Skipping S3 file upload: No taskcluster credentials. ')

nit: trailing whitespace

@@ +1271,5 @@
> +        accessToken = credentials.get('taskcluster_accessToken')
> +        if not clientId or not accessToken:
> +            self.warning('Skipping S3 file upload: No taskcluster credentials. ')
> +            return
> +        self.create_virtualenv()

maybe add a comment here as to why we are creating and activating a virtualenv

::: mozharness/mozilla/taskcluster.py
@@ +16,5 @@
> +        self.platform = stage_platform
> +        self.revision = revision
> +        self.log_obj = log_obj
> +
> +        from taskclusterclient import taskcluster

ugh, so this is not your patch but a design point of mozharness overall.

few thoughts

1) I don't like how Taskcluster can't be used unless we activate the taskclusterclient mid script and inject its path. I have no quick clean solution so let's let this stay but I suspect it's going to bite later down the line when *all* of mozharness needs Taskcluster.. maybe we can figure out a way to introduce dependencies to mozharness scripts together in the future!
2) speaking of which, I see we are no longer doing any sys.path magic now. is that not needed for some reason anymore?
3) can we add a comment as to why we are importing here, and maybe to how one can go about adding it to their mozharness script?
Attachment #8538205 - Flags: review?(jlund) → review+
(Assignee)

Comment 11

4 years ago
(In reply to Jordan Lund (:jlund) from comment #10)
> IIUC, not needed anymore

Good catch :)

> nit: the mozharness norm is to follow pep here and use underscore naming
> scheme for attrs. there are still a few places in this method that use camel
> case. sorry, I know that's a tiny nit :)

Ok, I think I got all these now. They're a bit tricky to find, since the taskcluster client uses camelcase, so I have to leave those as is. I was searching around for [a-z][A-Z] and I think I found them all, though.
 
> nit: trailing whitespace

Fixed.

> maybe add a comment here as to why we are creating and activating a
> virtualenv

Done.
 
> ugh, so this is not your patch but a design point of mozharness overall.
> 
> few thoughts
> 
> 1) I don't like how Taskcluster can't be used unless we activate the
> taskclusterclient mid script and inject its path. I have no quick clean
> solution so let's let this stay but I suspect it's going to bite later down
> the line when *all* of mozharness needs Taskcluster.. maybe we can figure
> out a way to introduce dependencies to mozharness scripts together in the
> future!

Yeah, I would definitely love a better solution here. I think we were chatting on IRC about doing some kind of pre-script global virtualenv setup so that it doesn't have to be created/activated as part of the actual scripts. This is all voodoo to me though, so I'd like to get a python pro to weigh in as well.

> 2) speaking of which, I see we are no longer doing any sys.path magic now.
> is that not needed for some reason anymore?

I don't think so - as I was playing around with it in #c4, I tried removing it, and things still seemed to work, so I took it out for this version of the patch. I'm not sure if it is still needed for the influxdb code as well -  we might just be able to remove it. I think we can tackle that if/when we go about fixing the rest of the virtualenv setup.

> 3) can we add a comment as to why we are importing here, and maybe to how
> one can go about adding it to their mozharness script?

Yep, done.
(Assignee)

Comment 12

4 years ago
Created attachment 8538988 [details] [diff] [review]
upload-files

With #c10 changes and incorporating a taskcluster_credentials_file. r+ carried forward.
Attachment #8538207 - Attachment is obsolete: true
Attachment #8538201 - Attachment is obsolete: true
Attachment #8538205 - Attachment is obsolete: true
Attachment #8538988 - Flags: review+
(Assignee)

Comment 13

4 years ago
Created attachment 8546178 [details] [diff] [review]
buildbot-try

Small followup that I'll fold into the original patch if it looks good. The TC credentials for try builds are separate from other builds, and have access to different scopes. The try builds can only build things using "buildbot-try", while other builds can use "buildbot". This patch uses the right worker settings based on self.branch
Attachment #8546178 - Flags: review?(jlund)
(Assignee)

Updated

4 years ago
Depends on: 1122219
Comment on attachment 8546178 [details] [diff] [review]
buildbot-try

Review of attachment 8546178 [details] [diff] [review]:
-----------------------------------------------------------------

again delay was due to PTO. Sorry :|

> The TC credentials for try builds are separate from other builds

does this mean that 'taskcluster_clientId' and 'taskcluster_token' will be different for try builds? If so, maybe you have already looked into having different scopes and know that passing oauth.txt to try slaves is safe (my concern in https://bugzilla.mozilla.org/show_bug.cgi?id=1119500#c2).
Attachment #8546178 - Flags: review?(jlund) → review+
(Assignee)

Comment 15

4 years ago
(In reply to Jordan Lund (:jlund) from comment #14)
> does this mean that 'taskcluster_clientId' and 'taskcluster_token' will be
> different for try builds? If so, maybe you have already looked into having
> different scopes and know that passing oauth.txt to try slaves is safe (my
> concern in https://bugzilla.mozilla.org/show_bug.cgi?id=1119500#c2).

Yep, try builds have different TC credentials and scopes, so there shouldn't be any overlap between try and non-try builds. The oauth.txt that gets generated on try builds is also separate (it's a different file in puppet) - catlee, bhearsum, and I looked through it and couldn't find any reason not to generate it.
(In reply to Michael Shal [:mshal] from comment #15)
> (In reply to Jordan Lund (:jlund) from comment #14)
> > does this mean that 'taskcluster_clientId' and 'taskcluster_token' will be
> > different for try builds? If so, maybe you have already looked into having
> > different scopes and know that passing oauth.txt to try slaves is safe (my
> > concern in https://bugzilla.mozilla.org/show_bug.cgi?id=1119500#c2).
> 
> Yep, try builds have different TC credentials and scopes, so there shouldn't
> be any overlap between try and non-try builds. The oauth.txt that gets
> generated on try builds is also separate (it's a different file in puppet) -
> catlee, bhearsum, and I looked through it and couldn't find any reason not
> to generate it.

ship it! ;)
(Assignee)

Comment 17

4 years ago
Created attachment 8558801 [details] [diff] [review]
virtualenv-windows

Windows builds failed when I pushed to try. First, it was because clobberer.py and purge_builds.py attempted to use python.exe from the virtualenv, but the virtualenv isn't setup yet. I'm not sure if this is the correct fix, but it seems that most configs specify 'clobber' before 'create-virtualenv', so I don't think we can expect the virtualenv to be setup when running these scripts. Other suggestions are welcome.

Second, Windows needs a virtualenv executable defined in config.exes. I'm not sure if sys.executable is the right choice here, but it seems to work. Other uses in mozharness seem to vary between sys.executable, PYTHON, or 'c:/mozilla-build/python27/python'.

try results: https://treeherder.mozilla.org/#/jobs?repo=try&revision=974eea8057f6

The builds that use mozharness (Linux *, OSX *, Windows XP *) have their artifacts uploaded to s3. The rest will need to be converted to mozharness in followup bugs.
Attachment #8558801 - Flags: review?(jlund)
Comment on attachment 8558801 [details] [diff] [review]
virtualenv-windows

Review of attachment 8558801 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm, I think the changes to purge.py should be okay. The following scripts use purge: MobileSingleLocale B2GHazardBuild B2GBuild EmulatorBuild SpidermonkeyBuild DesktopSingleLocale BuildScript BouncerSubmitter

Your try run covers most of these but the locale ones and bouncer is outside of your test scope. I don't think bouncer uses a virtualenv so that should be fine.

I should note that I started adding the bits for win64 mh builds in https://bugzil.la/1094364

Once that is resolved, we can add s3 abilities there too :)

::: configs/builds/releng_base_windows_32_builds.py
@@ +39,5 @@
>                  os.getcwd(), 'build', 'src', 'build', 'pymake', 'make.py'
>              )
> +        ],
> +        'virtualenv': [
> +            sys.executable,

like you said, there are many options here for choosing an interpreter. I would like a more uniform approach across mh but as you can tell from this file, I tend to lean with sys.executable :)

::: mozharness/mozilla/purge.py
@@ +51,5 @@
>  
>          cmd = []
>          if self._is_windows():
> +            # The virtualenv isn't setup yet, so just use python directly.
> +            cmd.append(self.query_exe('python'))

I think this is okay. You mentioned that most configs specify 'clobber' before 'create-virtualenv' which makes me wonder how they were not running into the same issue you had. Do you know why other scripts pass?

I don't think we have virtualenv's that are required for purge to work but if we will probably want to revisit this and other places in the future when we expect a virtenv to always be used.
Attachment #8558801 - Flags: review?(jlund) → review+
(Assignee)

Comment 19

4 years ago
(In reply to Jordan Lund (:jlund) from comment #18)
> I should note that I started adding the bits for win64 mh builds in
> https://bugzil.la/1094364
> 
> Once that is resolved, we can add s3 abilities there too :)

Awesome! I was going to try to tackle that next, but I'm glad to see it's already in progress :)

> ::: mozharness/mozilla/purge.py
> @@ +51,5 @@
> >  
> >          cmd = []
> >          if self._is_windows():
> > +            # The virtualenv isn't setup yet, so just use python directly.
> > +            cmd.append(self.query_exe('python'))
> 
> I think this is okay. You mentioned that most configs specify 'clobber'
> before 'create-virtualenv' which makes me wonder how they were not running
> into the same issue you had. Do you know why other scripts pass?
> 
> I don't think we have virtualenv's that are required for purge to work but
> if we will probably want to revisit this and other places in the future when
> we expect a virtenv to always be used.

The query_python_path() call doesn't exist unless the script inherits from VirtualenvMixin, so the previous code was just defaulting to self.query_exe(). Once the script inherits from VirtualenvMixin, then query_python_path() is a valid call, and it tries to use the virtualenv python if you have a virtualenv_path in the config. This block of code is only inside a self._is_windows() check, so I think that's why it's a new error here. I might be misunderstanding though, it can be hard to follow :(
(Assignee)

Comment 20

4 years ago
Created attachment 8560003 [details] [diff] [review]
upload-files

Combining various patch parts into a single patch. r+ carried forward.
Attachment #8538988 - Attachment is obsolete: true
Attachment #8546178 - Attachment is obsolete: true
Attachment #8558801 - Attachment is obsolete: true
Attachment #8560003 - Flags: review+
(Assignee)

Comment 22

4 years ago
I think we can close this one now. Other changes will be done in follow-up bugs.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.