Reimplement push to mirrors for s3

RESOLVED FIXED

Status

Release Engineering
Release Automation
RESOLVED FIXED
3 years ago
2 months ago

People

(Reporter: nthomas, Assigned: nthomas)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 4 obsolete attachments)

(Assignee)

Description

3 years ago
*_push_to_mirrors
Issue: we run rsync after ssh into stage.m.o. Firefox has index.html added in subdirs of releases/<version>/ to hide files until we actually release
Code: http://hg.mozilla.org/build/tools/file/default/scripts/release/stage-tasks.py#l101
Fix: use http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html to push, parallel worker model for perf. Deprecate index.html, move them aside and put back later ?

Not planning to beetmover this, when a fairly simple will do for now. Not going to keep index.html.
(Assignee)

Comment 1

3 years ago
Created attachment 8630976 [details] [diff] [review]
[mozharness] WIP script

This is a functioning script, which takes 3m 40s to push 39.0b7 in a test bucket (with the default parallelization of 20). There are a couple of open issues in worker() which I'd appreciate your thoughts on, and of course integration with buildbotcustom.
Attachment #8630976 - Flags: feedback?(bhearsum)
Comment on attachment 8630976 [details] [diff] [review]
[mozharness] WIP script

Review of attachment 8630976 [details] [diff] [review]:
-----------------------------------------------------------------

A couple of overhaul things here:
* Do you want to guard against overwriting files that are already in releases? We have that guard in the current script.
* I haven't given too much thought to how worthwhile some of these comments are to address since we're moving right into release promotion afterwards. I don't feel terribly strongly about anything except making sure ACLs are set correctly.

::: scripts/release/push-candidate-to-releases.py
@@ +64,5 @@
> +        )
> +
> +        # set the env var for boto to read our special config file
> +        # rather than anything else we have at ~/.boto
> +        os.environ["BOTO_CONFIG"] = os.path.abspath(self.config["credentials"])

Are you thinking that this file will be permanently deployed to the machines? That's probably fine, since it's equivalent to how we have the SSH keys deployed there permanently....

@@ +86,5 @@
> +                r"^.*json$",
> +                r"^.*/mar-tools/.*$",
> +                r"^.*gecko-unsigned-unaligned.apk$",
> +                r"^.*robocop.apk$",
> +                r"^.*contrib.*"

I wonder if this should use includes, and share with the new checksums script instead? AFAIK the set of files that should be pushed should be the same as the set of files that we put in *SUMS...

Maybe it's not worthwhile changing since release promotion is on the way though.

@@ +92,5 @@
> +
> +    def _get_candidates_prefix(self):
> +        return "candidates/{}-candidates/build{}".format(
> +            self.config["version"], self.config["build_number"]
> +        )

This may need to have /pub/mozilla.org/$product/ prepended to it. AFAIK we're going to continue to use the full old paths in the buckets. I'm still waiting to get access to the staging bucket though, so I'm not 100% sure.

@@ +106,5 @@
> +
> +    def push_to_releases(self):
> +        """This step grabs the list of files in the candidates dir,
> +        filters out any unwanted files from within them, and adds the remaining
> +        to self.files for copying."""

Comment needs updating: self.files doesn't exist.

@@ +119,5 @@
> +        def worker(item):
> +            self.info("Copying {}".format(item))
> +            source, destination = item
> +            bucket.copy_key(destination, self.config["bucket_name"], source)
> +            # preserve ACL ?

If the candidates and releases files are supposed to have the same ACLs it seems like it makes sense to enforce that here...it looks like copy_key will do that for you if you want. I wonder if we should be enforcing ACLs on a continual basis (perhaps similar to my suggestion in bug 1175085 about routinely checking them) instead of or in addition to this though.

The only potential example I can come up with where we might want ACLs to be different is if we don't want files in candidates to be publicly accessible for some reason. I suspect that's not a good idea since both automation and QE will need to download them for testing purposes. Maybe I've forgotten about something that we talked about around this though...

@@ +120,5 @@
> +            self.info("Copying {}".format(item))
> +            source, destination = item
> +            bucket.copy_key(destination, self.config["bucket_name"], source)
> +            # preserve ACL ?
> +            # add redo on provider.storage_response_error

Redos for any transient errors you know of would be great!
Attachment #8630976 - Flags: feedback?(bhearsum) → feedback+
(Assignee)

Comment 3

3 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #2)
> A couple of overhaul things here:
> * Do you want to guard against overwriting files that are already in
> releases? We have that guard in the current script.

That's a good idea, I'll add it.

> > +        os.environ["BOTO_CONFIG"] = os.path.abspath(self.config["credentials"])
> 
> Are you thinking that this file will be permanently deployed to the
> machines? That's probably fine, since it's equivalent to how we have the SSH
> keys deployed there permanently....

Yes, using puppet to distribute to linux & mac machines. Relatively short lived so not worth doing more than that IMO.
 
> I wonder if this should use includes, and share with the new checksums
> script instead? AFAIK the set of files that should be pushed should be the
> same as the set of files that we put in *SUMS...

Makes sense, just need to make sure directories can be handled properly.
 
> > +    def _get_candidates_prefix(self):
> > +        return "candidates/{}-candidates/build{}".format(
> > +            self.config["version"], self.config["build_number"]
> > +        )
> 
> This may need to have /pub/mozilla.org/$product/ prepended to it. AFAIK
> we're going to continue to use the full old paths in the buckets. I'm still
> waiting to get access to the staging bucket though, so I'm not 100% sure.

Any new information on this ? In the last meeting it sounded like the mozilla.org part was going away, but might still have /pub/firefox/ to deal with.
 
> @@ +119,5 @@
> > +        def worker(item):
> > +            self.info("Copying {}".format(item))
> > +            source, destination = item
> > +            bucket.copy_key(destination, self.config["bucket_name"], source)
> > +            # preserve ACL ?
> 
> If the candidates and releases files are supposed to have the same ACLs it
> seems like it makes sense to enforce that here...it looks like copy_key will
> do that for you if you want. I wonder if we should be enforcing ACLs on a
> continual basis (perhaps similar to my suggestion in bug 1175085 about
> routinely checking them) instead of or in addition to this though.

I'm guessing we'll handle ACL at the bucket or user level rather than object, and copy_key handles the latter (and adds two extra API calls). Still TBD though, I need to talk to oremj.

> The only potential example I can come up with where we might want ACLs to be
> different is if we don't want files in candidates to be publicly accessible
> for some reason. I suspect that's not a good idea since both automation and
> QE will need to download them for testing purposes. Maybe I've forgotten
> about something that we talked about around this though...

I can't think of anything here.

> Redos for any transient errors you know of would be great!

Will add this, hard to test!
(Assignee)

Comment 4

3 years ago
(In reply to Nick Thomas [:nthomas] from comment #3)
> (In reply to Ben Hearsum [:bhearsum] from comment #2)
> > I wonder if this should use includes, and share with the new checksums
> > script instead? AFAIK the set of files that should be pushed should be the
> > same as the set of files that we put in *SUMS...

The regexp became complicated so I reverted back to using excludes.
(Assignee)

Comment 5

3 years ago
Created attachment 8640415 [details] [diff] [review]
[mozharness]  push script

This has everything we discussed implemented, except paths starting like /pub/firefox instead of /pub/mozilla.org/firefox, due to this being dropped in the buckets.
Attachment #8630976 - Attachment is obsolete: true
Attachment #8640415 - Flags: review?
(Assignee)

Comment 6

3 years ago
Created attachment 8640419 [details] [diff] [review]
[mozharness] push script

Less the stray .hgtags from testing in staging.
Attachment #8640415 - Attachment is obsolete: true
Attachment #8640415 - Flags: review?
Attachment #8640419 - Flags: review?(bhearsum)
(Assignee)

Comment 7

3 years ago
Created attachment 8640423 [details] [diff] [review]
[buildbotcustom] release.py changes

There's two new config vars - here are my staging values, prod TBD:

+releaseConfig['S3Bucket']            = 'nthomas-firefox'
+releaseConfig['S3Credentials']       = '/builds/s3-push.credentials'

For product, I needed to pass stage_product instead of productName so that Android worked (fennec vs mobile); you probably need that too on bug 1174145.
Attachment #8640423 - Flags: review?(bhearsum)
Comment on attachment 8640419 [details] [diff] [review]
[mozharness] push script

Review of attachment 8640419 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine to me, hopefully you didn't spend too much time trying to share the whitelist =\.
Attachment #8640419 - Flags: review?(bhearsum) → review+
Attachment #8640423 - Flags: review?(bhearsum) → review+
(Assignee)

Updated

3 years ago
Depends on: 1190309
(Assignee)

Comment 9

3 years ago
TODO - thunderbird & xulrunner support (mainly buildbotcustom)
(Assignee)

Comment 10

3 years ago
Don't need xulrunner support, it'll be EOL by the time we switch to the new system after 41 ships.
(Assignee)

Comment 11

3 years ago
Blocked on working s3 access credentials, or diagnosing my mistake using them, see bug 1190309.
(Assignee)

Comment 12

3 years ago
Created attachment 8660608 [details] [diff] [review]
[buildbotcustom] release.py changes, v2

This adds relengapi_archiver_repo_path and relengapi_archiver_release_tag to attachment 8640423 [details] [diff] [review], carrying r+ from that. It works in staging for Firefox release, Fennec release, and Thunderbird on esr38.
Attachment #8640423 - Attachment is obsolete: true
Attachment #8660608 - Flags: review+
(Assignee)

Comment 13

3 years ago
Created attachment 8660619 [details] [diff] [review]
[gecko] push script

This moves the patch into gecko's mozharness, and drops the check for source location existence. This is due to S3 having fake directories, so we (eg) have a key
 /pub/firefox/candidates/41.0-candidates/build1/linux_info.txt
but all of the leading path of that could be completely synthetic (ie you can't test for it's existence). We're not regressing anything by dropping it, as this was a new check.
Attachment #8660619 - Flags: review?(bhearsum)
Attachment #8660619 - Flags: review?(bhearsum) → review+
(Assignee)

Updated

3 years ago
Depends on: 1210333
(Assignee)

Updated

3 years ago
Attachment #8640419 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Keywords: leave-open
Whiteboard: keep
(Assignee)

Updated

3 years ago
Whiteboard: keep
(Assignee)

Comment 17

3 years ago
Status - ready to go
To land at transition: patch for buildbotcustom
(Assignee)

Comment 19

3 years ago
For some reason SHA512SUMS and SHA512SUMS.sums were not copied by the mirror push, but it looks like checksums ran before properly before hand. Needs investigating.
(Assignee)

Comment 20

3 years ago
Turns out the the mirror push doesn't have checksums as an upstream for Fennec when automatically pushing, fixing that over on bug 1174145.
(Assignee)

Updated

2 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.