Bug 1232442 - Pre-seed AMI images with hg bundles from hg.cdn.mozilla.net
Status: RESOLVED FIXED
Product: Release Engineering
Classification: Other
Component: General Automation
Assigned To: Gregory Szorc [:gps] (away until 2017-03-20)
: Chris AtLee [:catlee]
Depends on: 1232733 1270317
Blocks: 1286335 1286336 1286430
Reported: 2015-12-14 13:28 PST by Rail Aliiev [:rail] ⌚️ET
Modified: 2016-07-12 19:11 PDT (History)
Attachments
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo; (58 bytes, text/x-review-board-request)
2016-06-09 17:30 PDT, Gregory Szorc [:gps] (away until 2017-03-20)
catlee: review+

Description Rail Aliiev [:rail] ⌚️ET 2015-12-14 13:28:13 PST
1) it fails (https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-beta-noarch/1450124939/mozilla-beta-bundle-bm73-build1-build0.txt.gz):
added 7141 changesets with 53959 changes to 16912 files
updating to branch default
129284 files updated, 0 files merged, 0 files removed, 0 files unresolved
304034 changesets found
scp: /home/ftp/pub/firefox/bundles/mozilla-beta.hg.upload: No such file or directory

2) we use the bundleclone extension!
Comment 1 Gregory Szorc [:gps] (away until 2017-03-20) 2015-12-14 13:36:36 PST
For posterity, the new official home of the bundles is at https://hg.cdn.mozilla.net/. If you want a machine readable index, we can start producing one. File a bug against Developer Services :: hg.mozilla.org.
Comment 2 Nick Thomas [:nthomas] 2015-12-14 19:59:00 PST
I pushed https://treeherder.mozilla.org/#/jobs?repo=try&revision=09bd975cdcf5 to remove the --bundle arg from our hgtool usage. When looking at the logs it's hard to see this having any effect, because machines already have a hg share for try. However I did see some newly started AWS instances pull in 11k changesets, presumably because they're prepopulated with stale repos (bug 1229532).
Comment 3 Gregory Szorc [:gps] (away until 2017-03-20) 2015-12-14 20:07:32 PST
I'm not sure of the implications of changing this, but from my perspective as a server operator of hg.mozilla.org, I'd rather have clients re-clone from S3/CDN and pull up to 1k changesets than pull 10k changesets to an old snapshot.

Also, I believe we had an uplift today. So the bundles for aurora, beta, release, etc might be out of date by a Gecko release for the next few hours yet.
Comment 4 Nick Thomas [:nthomas] 2015-12-14 20:30:49 PST
Yes, I agree. Ideally we'll stop generating bundles on our side, and rely on the mercurial server, but I'm flagging up the preseeding of AWS instances to Rail. A machine readable manifest (in json or similar) on hg.cdn.mozilla.net would probably help him achieve that.
Comment 5 Nick Thomas [:nthomas] 2016-02-10 19:34:34 PST
I actually removed the builders in bug 1229532, so let's morph this to handle the pre-seeding. Rail, where are we doing that? I see support via hg_bundles but nothing using it, e.g. https://dxr.mozilla.org/build-central/search?q=hg_bundles&redirect=false&case=true
Comment 6 Rail Aliiev [:rail] ⌚️ET 2016-02-11 07:36:57 PST
We stopped using bundles because they were outdated (broken builders) and caused a lot of pull traffic. We thought that using hg clone with bundleclone enabled would be cheaper.
Comment 7 Gregory Szorc [:gps] (away until 2017-03-20) 2016-04-22 08:25:10 PDT
https://hg.cdn.mozilla.net/bundles.json now exists. Pre-seed away.
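
A seeding script could look up a bundle URL in that manifest roughly as follows. This is only a sketch: the layout of bundles.json is not shown anywhere in this bug, so the nested repo -> bundle type -> "url" shape and the helper name find_bundle_url are assumptions made for illustration.

```python
def find_bundle_url(manifest, repo, bundle_type):
    """Return the URL of one bundle, or None if it is not listed.

    ASSUMPTION: the parsed manifest looks like
    {repo: {bundle_type: {"url": ...}}}. The real bundles.json may be
    laid out differently, in which case the lookups below need adjusting.
    """
    entry = manifest.get(repo, {}).get(bundle_type)
    if not isinstance(entry, dict):
        # Unknown repo, unknown bundle type, or an unexpected value.
        return None
    return entry.get("url")
```

The manifest itself would be obtained by fetching the URL above and parsing it with json.load; returning None rather than raising lets the caller fall back to a plain clone when the expected entry is missing.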
Comment 9 Gregory Szorc [:gps] (away until 2017-03-20) 2016-05-06 11:31:08 PDT
Bug 1270317 changed how mozharness does hg repo management.

We now use the "auto pooled storage" feature of the share extension. The code is aggressive and requires the auto pooled storage feature to be enabled. It goes so far as to blow away existing clones not using pooled storage.

The auto pooled storage stores repos under <share_base>/<sha1>, where <sha1> is the 40 character SHA-1 of rev 0 of the repo. This means the existing seeding in AMIs is effectively worthless now, as the new code won't use the data.

We should update the seeding to create a single Firefox repo in <share_base>/8ba995b74e18334ab3707f27e9eb8f4e37ba3d29. Ideally this would be a generaldelta repo created from a unified Firefox repo (like https://hg.mozilla.org/experimental/firefox-unified). However, that repo is still experimental and we're not currently generating bundles for it. So perhaps we can live with seeding from mozilla-central instead.
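
The pooled storage layout described in this comment can be sketched as a small path helper. The function and constant names are hypothetical (they are not taken from mozharness or the share extension); only the <share_base>/<sha1> layout and the Firefox root changeset come from the comment.

```python
import posixpath
import re

# SHA-1 of rev 0 of the Firefox repo, as quoted in the comment above.
FIREFOX_ROOT_NODE = "8ba995b74e18334ab3707f27e9eb8f4e37ba3d29"


def pooled_store_path(share_base, root_node):
    """Return <share_base>/<sha1> for a repo whose rev 0 has hash root_node."""
    # The pool directory is named after the full 40 character hex hash,
    # so reject abbreviated or malformed hashes up front.
    if not re.fullmatch(r"[0-9a-f]{40}", root_node):
        raise ValueError("expected a 40 character hex SHA-1: %r" % root_node)
    return posixpath.join(share_base, root_node)
```

With the share base seen elsewhere in this bug, pooled_store_path("/builds/hg-shared", FIREFOX_ROOT_NODE) gives the directory that the seeding would need to populate. For another repo, the root hash can be read with `hg log -r 0 -T '{node}'`.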
Comment 11 Justin Wood (:Callek) [away until Feb 27] 2016-05-06 11:44:27 PDT
(In reply to Gregory Szorc [:gps] from comment #9)
> Bug 1270317 changed how mozharness does hg repo management.
> 
> We now use the "auto pooled storage" feature of the share extension. The
> code is aggressive and requires the auto pooled storage feature to be
> enabled. It goes so far as to blow away existing clones not using pooled
> storage.
> 
> The auto pooled storage stores repos under <share_base>/<sha1>, where <sha1>
> is the 40 character SHA-1 of rev 0 of the repo. This means the existing
> seeding in AMIs is effectively worthless now, as the new code won't use the
> data.
> 
> We should update the seeding to create a single Firefox repo in
> <share_base>/8ba995b74e18334ab3707f27e9eb8f4e37ba3d29. Ideally this would be
> a generaldelta repo created from a unified Firefox repo (like
> https://hg.mozilla.org/experimental/firefox-unified). However, that repo is
> still experimental and we're not currently generating bundles for it. So
> perhaps we can live with seeding from mozilla-central instead.

we'd likely want a seed pulled there for:
* m-c
* m-beta
* m-release
* m-esr[all-that-we-use]

This is because release and beta have lots of heads and changesets that are not in central.
Comment 12 Nick Thomas [:nthomas] 2016-05-24 22:34:37 PDT
Ideally we should use the same mechanism as in-tree to prepopulate the images, i.e. adapt
02:44:34     INFO - Copy/paste: hg --config ui.merge=internal:merge --config extensions.robustcheckout=/builds/slave/try-and-api-15-000000000000000/scripts/external_tools/robustcheckout.py robustcheckout https://hg.mozilla.org/try /builds/slave/try-and-api-15-000000000000000/build/src --sharebase /builds/hg-shared --purge --upstream https://hg.mozilla.org/mozilla-central --revision 64f88603f59ac386ea7ff737c1168d1c0a6f6eb3

We'd have to get a copy of mozharness, maybe just from the archiver using default of mozilla-central. Alternatively, bug 1270951 just landed the robustcheckout extension in tools.
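
Adapting that invocation for image seeding might start from a helper that builds the argument list. This is a sketch only: the function name and parameters are hypothetical, and it uses nothing beyond the options visible in the log line above.

```python
def robustcheckout_argv(extension_path, repo_url, dest, share_base,
                        upstream, revision):
    """Build the hg command line for a robustcheckout, as a list of args.

    Mirrors the options in the logged command; whether image seeding
    wants exactly these options is an open question, not a given.
    """
    return [
        "hg",
        "--config", "ui.merge=internal:merge",
        # Load the extension from an explicit path, as the log line does.
        "--config", "extensions.robustcheckout=%s" % extension_path,
        "robustcheckout", repo_url, dest,
        "--sharebase", share_base,
        "--purge",
        "--upstream", upstream,
        "--revision", revision,
    ]
```

Returning a list instead of a shell string means the result can be handed directly to subprocess.check_call without quoting concerns.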
Comment 13 Gregory Szorc [:gps] (away until 2017-03-20) 2016-05-25 00:12:09 PDT
I'd prefer we seed from https://hg.mozilla.org/experimental/firefox-unified because that repo:

1) has all heads
2) is smaller
3) uses the generaldelta storage format

If the number of operations per day is small, you /could/ `hg clone -U --uncompressed https://hg.mozilla.org/experimental/firefox-unified` to get the seed for this repo. However, before we do that we should a) consider removing the "experimental" label and b) stand up bundle generation for this repo so clones are served from CDN/S3.

That being said, seeding from a clone of mozilla-central should be fine for the short term. Although the first pull from aurora, beta, release, or esr will be a bit painful. I suppose the seeding mechanism could pull from all those repos so all the heads are present.
Comment 14 Gregory Szorc [:gps] (away until 2017-03-20) 2016-06-07 17:24:53 PDT
We're now generating bundles for the experimental/firefox-unified repo. However, I'm also standing up a "firefox" repo that will be a near exact copy of experimental/firefox-unified. That should be fully deployed in the next ~24h. At that time, we should seed the AMI with a stream clone bundle of that repo. That will be in the "stream (generaldelta)" column of https://hg.cdn.mozilla.net and the "packed1-gd" bundle listed at https://hg.cdn.mozilla.net/bundles.json.

You can apply the bundle and populate repo caches for optimal initial consumption by doing something like:

1. hg --config format.generaldelta=true init 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29
2. cd 8ba995b74e18334ab3707f27e9eb8f4e37ba3d29
3. hg debugapplystreamclonebundle <file>
4. hg pull https://hg.mozilla.org/firefox
5. hg branches
6. hg tags
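
The six steps above could be scripted along these lines. This is a sketch under assumptions: the helper names, the share base argument and the use of subprocess are illustrative, and the bundle is assumed to have been downloaded already.

```python
import posixpath
import subprocess

FIREFOX_ROOT_NODE = "8ba995b74e18334ab3707f27e9eb8f4e37ba3d29"


def seed_commands(share_base, bundle_file,
                  pull_url="https://hg.mozilla.org/firefox"):
    """Return (argv, cwd) pairs for the steps listed above.

    bundle_file should be an absolute path, because every command after
    `hg init` runs from inside the new repository.
    """
    repo = posixpath.join(share_base, FIREFOX_ROOT_NODE)
    return [
        # Step 1: create an empty generaldelta repo named after rev 0.
        (["hg", "--config", "format.generaldelta=true", "init",
          FIREFOX_ROOT_NODE], share_base),
        # Steps 2-3: enter the repo and apply the stream clone bundle.
        (["hg", "debugapplystreamclonebundle", bundle_file], repo),
        # Step 4: fetch whatever landed after the bundle was generated.
        (["hg", "pull", pull_url], repo),
        # Steps 5-6: run once so the branch and tag caches are populated.
        (["hg", "branches"], repo),
        (["hg", "tags"], repo),
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv, cwd in commands:
        subprocess.check_call(argv, cwd=cwd)
```

Keeping the command list separate from its execution makes the sequence easy to inspect or log before anything touches the disk.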
Comment 15 Gregory Szorc [:gps] (away until 2017-03-20) 2016-06-09 17:30:52 PDT
Created attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

The https://hg.mozilla.org/firefox repo is a single Mercurial repository
with relevant heads from all important repos (mozilla-central, mozilla-inbound,
mozilla-aurora, mozilla-release, esrs, etc). The repository is encoded as
generaldelta, which means it is smaller than mozilla-central (even though it
contains 30,000+ more commits!)

Recent work in automation (namely bug 1270317) changed automation to
always use shared, pooled storage for Mercurial repos. This meant
that we only need a single store for Firefox repos.

When this change was made, we didn't change AMI seeding. This means that
a worker would clone the Firefox repo on first job that needed it. This
is obviously inefficient.

This commit changes the shared repo seeding so the pooled/shared repo
now populated in automation is seeded at AMI generation time. So on
first job run, most commits will be present and we'll only do an
incremental pull. This restores the behavior from before bug 1270317
landed. There are multiple benefits:

1) Shared repo population will complete quicker (because we're only
   populating 1 repo)
2) We'll use less disk space for local repos (because we will only
   populate 1)
3) Jobs will start faster since most commits from most Firefox repos
   will already be present in the pre-populated shared repo.

The previous version of this file had code to map the instance's current
availability zone to an S3 bucket. As of bug 1249197, hg.mozilla.org
advertised bundle URLs to the appropriate S3 endpoint based on the
requesting IP. This favors same-AZ serving and means there should be 0
cost for data downloads from S3. Since this mapping is now done server
side as part of clone bundles, we remove this feature.

The previous version of this file downloaded a tar file of the .hg
directory for various repos and uncompressed it. The new version just
does an `hg clone` preferring "streaming clones." "Streaming clones" are
effectively `tar | nc` and are extremely fast. IMO the tar file provides
little value so it has been removed from the equation.

A downside of not using a tar file is that seeding now talks to
hg.mozilla.org instead of only S3. This could potentially drive a lot of
load to hg.mozilla.org if multiple machines perform this seeding at the
same time. However, 99+% of clone load will be offloaded to S3 via the
clone bundles and hg.mozilla.org will only need to serve commits since
the bundle was created. This should not be more than a few hundred
commits and should not require much effort on behalf of the server. But
if this does overwhelm the server, we can restore tar files.

This commit assumes that all machines have Mercurial 3.7 as `hg` in
PATH. If an older version of Mercurial is present, the clone will take
several minutes longer than it should or it will fail due to the client
not having bundle2 support (the firefox repo requires bundle2).

A downside of this commit is that jobs not having the new
shared/pooled storage code deployed will need to perform a full clone
on first job because the old paths (e.g. /builds/hg-shared/mozilla-central)
are no longer present. This only impacts legacy commits/jobs and the
number of jobs should diminish over time. Furthermore, once hgtool
is updated to use shared/pooled storage, this won't be an issue (that
is tracked in bug 1270951).

Review commit: https://reviewboard.mozilla.org/r/58922/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/58922/
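
The commit message assumes Mercurial 3.7 or newer is available as `hg`. A seeding script could guard against older clients with a check like the one below. This is a sketch, not part of the reviewed patch: the function names are hypothetical, and it assumes the first line of `hg --version` has the usual form "Mercurial Distributed SCM (version X.Y...)".

```python
import re


def parse_hg_version(version_output):
    """Extract (major, minor) from `hg --version` output, or None."""
    match = re.search(r"\(version (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def meets_minimum(version_output, minimum=(3, 7)):
    """True if the reported version is at least `minimum`."""
    version = parse_hg_version(version_output)
    # Unparseable output is treated as too old rather than as a pass.
    return version is not None and version >= minimum
```

The text to check would come from something like subprocess.check_output(["hg", "--version"]), decoded to a string.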
Comment 16 Gregory Szorc [:gps] (away until 2017-03-20) 2016-06-09 17:31:43 PDT
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58922/diff/1-2/
Comment 17 Jordan Lund (:jlund) 2016-06-14 07:06:49 PDT
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

https://reviewboard.mozilla.org/r/58922/#review56254

I don't think I would be a good candidate for this review as I am unfamiliar with this code. If you would like me to review it for knowledge sharing purposes, please re-request me and I will have a look.
Comment 18 Chris AtLee [:catlee] 2016-06-24 09:53:05 PDT
https://reviewboard.mozilla.org/r/58922/#review57432

::: modules/runner/templates/tasks/populate_shared_repos.erb:32
(Diff revision 2)
>  
> -def is_try_slave(hostname):
> -    return hostname.startswith("try-")
> -
> -
> -def get_availability_zone():
> +def clone_firefox():
> +    """Clone the Firefox repo to the hg-shared directory."""
> +    dest_dir = os.path.join(SHARE_BASE_DIR, FIREFOX_SHA1)
> +    if os.path.exists(dest_dir):
> +        log.info('%s already exists; skipping' % dest_dir)

Don't we need to actually skip the operation here?

::: modules/runner/templates/tasks/populate_shared_repos.erb:89
(Diff revision 2)
>          log.warn("%s is not supported", hostname)
>          exit(0)
>  
> -    if is_try_slave(hostname):
> -        log.info("Try slave detected")
> -        dirs = get_prepopulated_dirs(is_try=True)
> +    # The Firefox repo is the only one large enough to warrant
> +    # seeding.
> +    exit(clone_firefox())

This breaks the behaviour below of exiting 0 even in the case of failure. This means that if we fail to clone this unified repo for any reason, then the machine will not be able to run any jobs.

::: modules/runner/templates/tasks/populate_shared_repos.erb:96
(Diff revision 2)
>  
>  if __name__ == "__main__":
>      try:
>          main()
>      except Exception:
>          log.exception("Failed to fetch tarballs, gracefully exiting...")

This exception message needs updating.
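
The three review points can be illustrated together. This is a hedged sketch, not the revised patch (that is in the review board interdiffs): the do_clone callback and both function signatures are hypothetical, introduced so the control flow can be shown without running Mercurial.

```python
import logging
import os

log = logging.getLogger("populate_shared_repos")


def clone_firefox(dest_dir, do_clone):
    """Seed dest_dir unless it already exists; return True if a clone ran."""
    if os.path.exists(dest_dir):
        log.info("%s already exists; skipping", dest_dir)
        return False  # return early so the clone really is skipped
    do_clone(dest_dir)
    return True


def main(dest_dir, do_clone):
    """Attempt the seeding, but always report success to the caller."""
    try:
        clone_firefox(dest_dir, do_clone)
    except Exception:
        # A failed seed only costs a slower first job, so log the error
        # and carry on instead of leaving the machine unable to run jobs.
        log.exception("Failed to seed the Firefox repo, gracefully exiting...")
    return 0
```

The script entry point would then call exit(main(...)), which preserves the previous behaviour of exiting 0 even when seeding fails.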
Comment 19 Chris AtLee [:catlee] 2016-06-27 10:48:55 PDT
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

https://reviewboard.mozilla.org/r/58922/#review57692
Comment 20 Gregory Szorc [:gps] (away until 2017-03-20) 2016-07-06 15:07:45 PDT
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58922/diff/2-3/
Comment 22 Gregory Szorc [:gps] (away until 2017-03-20) 2016-07-06 16:06:59 PDT
(In reply to Nick Thomas [:nthomas] from comment #21)
> Worth considering if these kind of prepopulations
>
> https://github.com/mozilla/build-cloud-tools/blob/master/instance_data/us-east-1.instance_data_prod.json
>
> https://github.com/mozilla/build-cloud-tools/blob/master/instance_data/us-east-1.instance_data_try.json
>
> https://github.com/mozilla/build-cloud-tools/blob/master/configs/Ec2UserdataUtils.psm1#L831
> are also obsolete with the changes here.

Looks like it.

What are these used for? Windows machines? Does a fix belong in this bug or elsewhere?
Comment 23 Chris AtLee [:catlee] 2016-07-12 11:45:41 PDT
Comment on attachment 8761881 [details]
Bug 1232442 - Seed images with a stream clone of the unified Firefox repo;

https://reviewboard.mozilla.org/r/58922/#review60762
Comment 24 Gregory Szorc [:gps] (away until 2017-03-20) 2016-07-12 11:50:00 PDT
https://hg.mozilla.org/build/puppet/rev/1d1ae9de0da2
