Pre-seed AMI images with hg bundles from hg.cdn.mozilla.net

RESOLVED FIXED

Status

Release Engineering
General Automation
RESOLVED FIXED
2 years ago
11 months ago

People

(Reporter: rail, Assigned: gps)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
1) it fails (https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-beta-noarch/1450124939/mozilla-beta-bundle-bm73-build1-build0.txt.gz):
added 7141 changesets with 53959 changes to 16912 files
updating to branch default
129284 files updated, 0 files merged, 0 files removed, 0 files unresolved
304034 changesets found
scp: /home/ftp/pub/firefox/bundles/mozilla-beta.hg.upload: No such file or directory

2) we use the bundleclone extension!
(Assignee)

Comment 1

2 years ago
For posterity, the new official home of the bundles is at https://hg.cdn.mozilla.net/. If you want a machine readable index, we can start producing one. File a bug against Developer Services :: hg.mozilla.org.

Updated

2 years ago
See Also: → bug 1229532
I pushed https://treeherder.mozilla.org/#/jobs?repo=try&revision=09bd975cdcf5 to remove the --bundle arg from our hgtool usage. When looking at the logs it's hard to see this having any effect, because machines already have a hg share for try. However I did see some newly started AWS instances pull in 11k changesets, presumably because they're prepopulated with stale repos (bug 1229532).
(Assignee)

Comment 3

2 years ago
I'm not sure of the implications of changing this, but from my perspective as a server operator of hg.mozilla.org, I'd rather have clients re-clone from S3/CDN and pull up to 1k changesets than pull 10k changesets to an old snapshot.

Also, I believe we had an uplift today. So the bundles for aurora, beta, release, etc might be out of date by a Gecko release for the next few hours yet.
Yes, I agree. Ideally we'll stop generating bundles on our side, and rely on the mercurial server, but I'm flagging up the preseeding of AWS instances to Rail. A machine readable manifest (in json or similar) on hg.cdn.mozilla.net would probably help him achieve that.
(Assignee)

Updated

2 years ago
Depends on: 1232733
I actually removed the builders in bug 1229532, so let's morph this to handle the pre-seeding. Rail, where are we doing that? I see support via hg_bundles but not anything using it, e.g. https://dxr.mozilla.org/build-central/search?q=hg_bundles&redirect=false&case=true
Flags: needinfo?(rail)
Summary: Stop generating hg bundles → Pre-seed AMI images with hg bundles from hg.cdn.mozilla.net
(Reporter)

Comment 6

a year ago
We stopped using bundles because they were outdated (broken builders) and caused a lot of pull traffic. We thought that using hg clone with bundleclone enabled would be cheaper.
Flags: needinfo?(rail)
(Assignee)

Comment 7

a year ago
https://hg.cdn.mozilla.net/bundles.json now exists. Pre-seed away.
Comment hidden (spam)
blocking-b2g: 2.2r? → ---
tracking-b2g: backlog → ---
(Assignee)

Comment 9

a year ago
Bug 1270317 changed how mozharness does hg repo management.

We now use the "auto pooled storage" feature of the share extension. The code is aggressive and requires the auto pooled storage feature to be enabled. It goes so far as to blow away existing clones not using pooled storage.

The auto pooled storage stores repos under <share_base>/<sha1>, where <sha1> is the 40 character SHA-1 of rev 0 of the repo. This means the existing seeding in AMIs is effectively worthless now, as the new code won't use the data.

We should update the seeding to create a single Firefox repo in <share_base>/8ba995b74e18334ab3707f27e9eb8f4e37ba3d29. Ideally this would be a generaldelta repo created from a unified Firefox repo (like https://hg.mozilla.org/experimental/firefox-unified). However, that repo is still experimental and we're not currently generating bundles for it. So perhaps we can live with seeding from mozilla-central instead.
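
To make the layout concrete, here is a minimal Python sketch of how the pooled path described above is formed. The helper name `pooled_store_path` is hypothetical, and the sketch only illustrates the `<share_base>/<sha1>` naming scheme; it is not the share extension's own code.

```python
import os
import re

# SHA-1 of revision 0 of the Firefox repo, as quoted in this comment.
FIREFOX_REV0_SHA1 = "8ba995b74e18334ab3707f27e9eb8f4e37ba3d29"

_SHA1_RE = re.compile(r"^[0-9a-f]{40}$")


def pooled_store_path(share_base, rev0_sha1):
    """Return <share_base>/<sha1> for a repo whose rev 0 hash is rev0_sha1.

    Hypothetical helper: it only mirrors the directory naming described
    above and does not touch Mercurial at all.
    """
    if not _SHA1_RE.match(rev0_sha1):
        raise ValueError("expected a 40 character hex SHA-1, got %r" % rev0_sha1)
    return os.path.join(share_base, rev0_sha1)
```

Because every Firefox repo shares the same rev 0, they would all resolve to the same directory under this scheme, which is why a single seeded store is enough.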
Depends on: 1270317
(Assignee)

Comment 10

a year ago
For my future reference:

https://hg.mozilla.org/build/puppet/file/tip/modules/runner/templates/tasks/populate_shared_repos.erb
https://github.com/mozilla/build-cloud-tools/blob/master/configs/Ec2UserdataUtils.psm1
https://hg.mozilla.org/build/puppet/file/tip/modules/runner/templates/tasks/update_shared_repos.erb
(In reply to Gregory Szorc [:gps] from comment #9)
> Bug 1270317 changed how mozharness does hg repo management.
> 
> We now use the "auto pooled storage" feature of the share extension. The
> code is aggressive and requires the auto pooled storage feature to be
> enabled. It goes so far as to blow away existing clones not using pooled
> storage.
> 
> The auto pooled storage stores repos under <share_base>/<sha1>, where <sha1>
> is the 40 character SHA-1 of rev 0 of the repo. This means the existing
> seeding in AMIs is effectively worthless now, as the new code won't use the
> data.
> 
> We should update the seeding to create a single Firefox repo in
> <share_base>/8ba995b74e18334ab3707f27e9eb8f4e37ba3d29. Ideally this would be
> a generaldelta repo created from a unified Firefox repo (like
> https://hg.mozilla.org/experimental/firefox-unified). However, that repo is
> still experimental and we're not currently generating bundles for it. So
> perhaps we can live with seeding from mozilla-central instead.

we'd likely want a seed pulled there for:
* m-c
* m-beta
* m-release
* m-esr[all-that-we-use]

The reason is simply that release and beta have lots of heads and changesets that are not in central.
Ideally we should use the same mechanism as in-tree to prepopulate the images, i.e. adapt:
02:44:34     INFO - Copy/paste: hg --config ui.merge=internal:merge --config extensions.robustcheckout=/builds/slave/try-and-api-15-000000000000000/scripts/external_tools/robustcheckout.py robustcheckout https://hg.mozilla.org/try /builds/slave/try-and-api-15-000000000000000/build/src --sharebase /builds/hg-shared --purge --upstream https://hg.mozilla.org/mozilla-central --revision 64f88603f59ac386ea7ff737c1168d1c0a6f6eb3

We'd have to get a copy of mozharness, maybe just from the archiver using default of mozilla-central. Alternatively, bug 1270951 just landed the robustcheckout extension in tools.
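
As a rough sketch of what adapting that invocation could look like, the Python helper below rebuilds the same argument list from parameters. The function name `robustcheckout_argv` is hypothetical; the flags are only the ones visible in the log line above, and which repo and revision a seeding job would pass is left open.

```python
def robustcheckout_argv(extension_path, repo_url, dest, sharebase, upstream, revision):
    """Build the argv for a robustcheckout call shaped like the logged one.

    Hypothetical helper: it only assembles arguments and does not run hg.
    """
    return [
        "hg",
        # Same --config options as in the automation log.
        "--config", "ui.merge=internal:merge",
        "--config", "extensions.robustcheckout=%s" % extension_path,
        "robustcheckout", repo_url, dest,
        # Pooled/shared store location.
        "--sharebase", sharebase,
        "--purge",
        "--upstream", upstream,
        "--revision", revision,
    ]
```

The resulting list can be handed to `subprocess.check_call` once a copy of `robustcheckout.py` is available on the machine.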
(Assignee)

Comment 13

a year ago
I'd prefer we seed from https://hg.mozilla.org/experimental/firefox-unified because that repo:

1) has all heads
2) is smaller
3) uses the generaldelta storage format

If the number of operations per day is small, you /could/ `hg clone -U --uncompressed https://hg.mozilla.org/experimental/firefox-unified` to get the seed for this repo. However, before we do that we should a) consider removing the "experimental" label and b) stand up bundle generation for this repo so clones are served from CDN/S3.

That being said, seeding from a clone of mozilla-central should be fine for the short term, although the first pull from aurora, beta, release, or esr will be a bit painful. I suppose the seeding mechanism could pull from all those repos so all the heads are present.
(Assignee)

Comment 14

a year ago
We're now generating bundles for the experimental/firefox-unified repo. However, I'm also standing up a "firefox" repo that will be a near exact copy of experimental/firefox-unified. That should be fully deployed in the next ~24h. At that time, we should seed the AMI with a stream clone bundle of that repo. That will be in the "stream (generaldelta)" column of https://hg.cdn.mozilla.net and the "packed1-gd" bundle listed at https://hg.cdn.mozilla.net/bundles.json.

You can apply the bundle and populate repo caches for optimal initial consumption by doing something like:

1. hg --config format.generaldelta=true init 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29
2. cd 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29
3. hg debugapplystreamclonebundle <file>
4. hg pull https://hg.mozilla.org/firefox
5. hg branches
6. hg tags
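
The six steps above can be sketched in Python roughly as follows. The names `seed_steps` and `run_seed` are hypothetical, and the location of the downloaded bundle file is an assumption left to the caller; the hg commands themselves are the ones listed above.

```python
import os
import subprocess

# SHA-1 of rev 0 of the Firefox repo; doubles as the pooled store directory name.
FIREFOX_REV0_SHA1 = "8ba995b74e18334ab3707f27e9eb8f4e37ba3d29"
FIREFOX_REPO_URL = "https://hg.mozilla.org/firefox"


def seed_steps(share_base, bundle_file):
    """Return (cwd, argv) pairs mirroring steps 1-6 above.

    Step 2 (the cd) is expressed by running the later commands with
    cwd set to the new repo directory.
    """
    repo_dir = os.path.join(share_base, FIREFOX_REV0_SHA1)
    # Later steps run from inside the repo, so make the bundle path absolute.
    bundle_file = os.path.abspath(bundle_file)
    return [
        # Step 1: create an empty generaldelta repo named after rev 0.
        (share_base, ["hg", "--config", "format.generaldelta=true",
                      "init", FIREFOX_REV0_SHA1]),
        # Step 3: apply the downloaded stream clone bundle.
        (repo_dir, ["hg", "debugapplystreamclonebundle", bundle_file]),
        # Step 4: pull whatever landed after the bundle was generated.
        (repo_dir, ["hg", "pull", FIREFOX_REPO_URL]),
        # Steps 5-6: populate the branch and tag caches.
        (repo_dir, ["hg", "branches"]),
        (repo_dir, ["hg", "tags"]),
    ]


def run_seed(share_base, bundle_file):
    """Run each step in order, stopping at the first failing command."""
    for cwd, argv in seed_steps(share_base, bundle_file):
        subprocess.check_call(argv, cwd=cwd)
```

Keeping the command list separate from the runner makes it easy to log or dry-run the steps before touching an image.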
(Assignee)

Comment 15

a year ago
Created attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

The https://hg.mozilla.org/firefox repo is a single Mercurial repository
with relevant heads from all important repos (mozilla-central, mozilla-inbound,
mozilla-aurora, mozilla-release, esrs, etc). The repository is encoded as
generaldelta, which means it is smaller than mozilla-central (even though it
contains 30,000+ more commits!)

Recent work in automation (namely bug 1270317) changed automation to
always use shared, pooled storage for Mercurial repos. This meant
that we only need a single store for Firefox repos.

When this change was made, we didn't change AMI seeding. This means that
a worker would clone the Firefox repo on first job that needed it. This
is obviously inefficient.

This commit changes the shared repo seeding so the pooled/shared repo
now populated in automation is seeded at AMI generation time. So on
first job run, most commits will be present and we'll only do an
incremental pull. This restores the behavior from before bug 1270317
landed. There are multiple benefits:

1) Shared repo population will complete quicker (because we're only
   populating 1 repo)
2) We'll use less disk space for local repos (because we will only
   populate 1)
3) Jobs will start faster since most commits from most Firefox repos
   will already be present in the pre-populated shared repo.

The previous version of this file had code to map the instance's current
availability zone to an S3 bucket. As of bug 1249197, hg.mozilla.org
advertised bundle URLs to the appropriate S3 endpoint based on the
requesting IP. This favors same-AZ serving and means there should be 0
cost for data downloads from S3. Since this mapping is now done server
side as part of clone bundles, we remove this feature.

The previous version of this file downloaded a tar file of the .hg
directory for various repos and uncompressed it. The new version just
does an `hg clone` preferring "streaming clones." "Streaming clones" are
effectively `tar | nc` and are extremely fast. IMO the tar file provides
little value so it has been removed from the equation.

A downside of not using a tar file is that seeding now talks to
hg.mozilla.org instead of only S3. This could potentially drive a lot of
load to hg.mozilla.org if multiple machines perform this seeding at the
same time. However, 99+% of clone load will be offloaded to S3 via the
clone bundles and hg.mozilla.org will only need to serve commits since
the bundle was created. This should not be more than a few hundred
commits and should not require much effort on behalf of the server. But
if this does overwhelm the server, we can restore tar files.

This commit assumes that all machines have Mercurial 3.7 as `hg` in
PATH. If an older version of Mercurial is present, the clone will take
several minutes longer than it should or it will fail due to the client
not having bundle2 support (the firefox repo requires bundle2).

A downside of this commit is that jobs not having the new
shared/pooled storage code deployed will need to perform a full clone
on first job because the old paths (e.g. /builds/hg-shared/mozilla-central)
are no longer present. This only impacts legacy commits/jobs and the
number of jobs should diminish over time. Furthermore, once hgtool
is updated to use shared/pooled storage, this won't be an issue (that
is tracked in bug 1270951).

Review commit: https://reviewboard.mozilla.org/r/58922/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/58922/
Attachment #8761881 - Flags: review?(catlee)
(Assignee)

Updated

a year ago
Assignee: nobody → gps
Status: NEW → ASSIGNED
(Assignee)

Comment 16

a year ago
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58922/diff/1-2/
Attachment #8761881 - Flags: review?(jlund)
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

https://reviewboard.mozilla.org/r/58922/#review56254

I don't think I would be a good candidate for this review as I am unfamiliar with this code. If you would like me to review it for knowledge sharing purposes, please re-request me and I will have a look.
Attachment #8761881 - Flags: review?(jlund)

Comment 18

11 months ago
https://reviewboard.mozilla.org/r/58922/#review57432

::: modules/runner/templates/tasks/populate_shared_repos.erb:32
(Diff revision 2)
>  
> -def is_try_slave(hostname):
> -    return hostname.startswith("try-")
> -
> -
> -def get_availability_zone():
> +def clone_firefox():
> +    """Clone the Firefox repo to the hg-shared directory."""
> +    dest_dir = os.path.join(SHARE_BASE_DIR, FIREFOX_SHA1)
> +    if os.path.exists(dest_dir):
> +        log.info('%s already exists; skipping' % dest_dir)

need to actually skip the operation here?

::: modules/runner/templates/tasks/populate_shared_repos.erb:89
(Diff revision 2)
>          log.warn("%s is not supported", hostname)
>          exit(0)
>  
> -    if is_try_slave(hostname):
> -        log.info("Try slave detected")
> -        dirs = get_prepopulated_dirs(is_try=True)
> +    # The Firefox repo is the only one large enough to warrant
> +    # seeding.
> +    exit(clone_firefox())

This breaks the behaviour below of exiting 0 even in the case of failure. This means that if we fail to clone this unified repo for any reason, then the machine will not be able to run any jobs.

::: modules/runner/templates/tasks/populate_shared_repos.erb:96
(Diff revision 2)
>  
>  if __name__ == "__main__":
>      try:
>          main()
>      except Exception:
>          log.exception("Failed to fetch tarballs, gracefully exiting...")

This exception message needs updating.
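
One way the three review points above could be addressed together is sketched below in Python. This is not the patch that landed: `do_clone` is a hypothetical callable standing in for the real clone step, the `/builds/hg-shared` value for `SHARE_BASE_DIR` is assumed from the automation logs quoted earlier in this bug, and the replacement log message is only a suggestion.

```python
import logging
import os

log = logging.getLogger("populate_shared_repos")

# Assumed values; the names follow the diff quoted in the review.
SHARE_BASE_DIR = "/builds/hg-shared"
FIREFOX_SHA1 = "8ba995b74e18334ab3707f27e9eb8f4e37ba3d29"


def clone_firefox(do_clone, share_base=SHARE_BASE_DIR):
    """Seed the pooled Firefox store unless it already exists.

    do_clone is a hypothetical callable that takes the destination
    directory and performs the actual clone.
    """
    dest_dir = os.path.join(share_base, FIREFOX_SHA1)
    if os.path.exists(dest_dir):
        log.info("%s already exists; skipping", dest_dir)
        return 0  # First point: return early so the clone is really skipped.
    do_clone(dest_dir)
    return 0


def main(do_clone, share_base=SHARE_BASE_DIR):
    """Return the exit code; a failed seeding is logged but never fatal."""
    try:
        return clone_firefox(do_clone, share_base)
    except Exception:
        # Third point: the message no longer mentions tarballs.
        log.exception("Failed to seed the Firefox repo, gracefully exiting...")
        # Second point: still exit 0 so the machine can go on to run jobs.
        return 0
```

A caller would finish with something like `exit(main(real_clone))`, where `real_clone` is whatever function wraps the actual `hg clone`.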

Comment 19

11 months ago
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

https://reviewboard.mozilla.org/r/58922/#review57692
Attachment #8761881 - Flags: review?(catlee)
(Assignee)

Comment 20

11 months ago
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58922/diff/2-3/
Attachment #8761881 - Flags: review?(catlee)

Comment 21

11 months ago
Worth considering if these kind of prepopulations
  https://github.com/mozilla/build-cloud-tools/blob/master/instance_data/us-east-1.instance_data_prod.json
  https://github.com/mozilla/build-cloud-tools/blob/master/instance_data/us-east-1.instance_data_try.json
  https://github.com/mozilla/build-cloud-tools/blob/master/configs/Ec2UserdataUtils.psm1#L831
are also obsolete with the changes here.
(Assignee)

Comment 22

11 months ago
(In reply to Nick Thomas [:nthomas] from comment #21)
> Worth considering if these kind of prepopulations
>   https://github.com/mozilla/build-cloud-tools/blob/master/instance_data/us-east-1.instance_data_prod.json
>   https://github.com/mozilla/build-cloud-tools/blob/master/instance_data/us-east-1.instance_data_try.json
>   https://github.com/mozilla/build-cloud-tools/blob/master/configs/Ec2UserdataUtils.psm1#L831
> are also obsolete with the changes here.

Looks like it.

What are these used for? Windows machines? Does a fix belong in this bug or elsewhere?

Comment 23

11 months ago
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

https://reviewboard.mozilla.org/r/58922/#review60762
Attachment #8761881 - Flags: review?(catlee) → review+
(Assignee)

Comment 24

11 months ago
https://hg.mozilla.org/build/puppet/rev/1d1ae9de0da2
Status: ASSIGNED → RESOLVED
Last Resolved: 11 months ago
Resolution: --- → FIXED
(Assignee)

Updated

11 months ago
Blocks: 1286335
(Assignee)

Updated

11 months ago
Blocks: 1286336
(Assignee)

Updated

11 months ago
Blocks: 1286430