Closed Bug 1097979 Opened 11 years ago Closed 9 years ago

Use a cross-repository shared Mercurial repository / store

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1270317

People

(Reporter: gps, Assigned: gps)

References

Details

Attachments

(1 file, 10 obsolete files)

38 bytes, text/x-review-board-request
Details
AFAICT Mercurial repositories in Firefox release automation go something like this: 1) Select a repo to build. Let's call it $repo 2) See if /builds/hg-shared/$repo exists 3a) If exists, do nothing 3b) If not exists, download a bundle for $repo and unbundle 4) hg -R /builds/hg-shared/$repo pull 5) hg share /builds/hg-shared/$repo /job/specific/path/to/$repo 6) hg -r /job/specific/path/to/$repo up -r <rev> In the end, we have a /builds/hg-shared directory with {mozilla-central,mozilla-inbound,try,etc} repositories. This current approach is very inefficient because all the Firefox repositories share a common root commit (8ba995b74e18) and maintaining separate stores means that we have to fetch and store the 99% of the shared commit data N>1 times. A better solution is to have a single shared store for all the Firefox commit data. Let's call it /builds/hg-shared/gecko. When an inbound job comes in, we pull inbound into the "gecko" repo. When a central job comes in, we pull central into the "gecko" repo. In this world, you store each commit exactly once. You pay the transfer and storage penalties once. This frees up bandwidth, lowers disk space utilization, and makes job start quicker (because of a lower chance you need to initialize a repository from scratch). It is wins all around. Looking at the code, we need to update build/tools:lib/util/hg.py:mercurial() to allow repositories to share a store. We can pass in an explicit argument to tell the code what name/directory to use for the shared store. Or, with a little Mercurial magic, we could perform a remote server lookup and choose the name automatically. For example, we could lookup rev 0 from the remote and use its hash as the shared store name. This is more magical and wouldn't require any extra logic in hgtool invocation to work. Anyway, if we implement this, it will likely result in drastic Mercurial server and network load reduction. I think it should be a P1.
I'm going to take a stab at this.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Attached file MozReview Request: bz://1097979/gps (obsolete) —
Attachment #8521806 - Flags: review?(catlee)
/r/509 - Bug 1097979 - Require share support in Mercurial /r/511 - Bug 1097979 - Introduce lookup Mercurial extension /r/513 - Bug 1097979 - Introduce lookup() API /r/515 - Bug 1097979 - Share a Mercurial store for each logical repository group Pull down these commits: hg pull review -r 7cc1fc8cf3a5950943e97a9b88a9d975b8ed05f1
https://reviewboard.mozilla.org/r/507/#review219 I haven't tested any of this. I'm not sure how to test any of this. Are there automated tests for any of this?
Attached file MozReview Request: bz://1097979/gps2 (obsolete) —
Attachment #8521810 - Flags: review?(mh+mozilla)
/r/523 - lookup: add an extension to look up keys on a remote Pull down this commit: hg pull review -r bbffe87384241972ff46018f39945dcf41557b1e
https://reviewboard.mozilla.org/r/517/#review221 I needed to write this minimal extension for deployment in automation. Since we have the testing infrastructure in version-control-tools, I think it makes sense for v-c-t to be the canonical home (with tests) and automation can maintain a shadow copy. Long term, I imagine automation will want to clone v-c-t. But I don't want to introduce that dependency right now.
Attachment #8521806 - Flags: review?(catlee27071982)
/r/525 - Bug 1097979 - Export HGRCPATH earlier in tests /r/527 - Bug 1097979 - Change directory earlier during tests /r/509 - Bug 1097979 - Require share support in Mercurial /r/511 - Bug 1097979 - Introduce lookup Mercurial extension /r/513 - Bug 1097979 - Introduce lookup() API /r/515 - Bug 1097979 - Share a Mercurial store for each logical repository group Pull down these commits: hg pull review -r bd9deb85bcb4ded7b2473dd1fdad8db9e3cf027d
https://reviewboard.mozilla.org/r/507/#review225 I discovered the tests! The last commit breaks the tests. I'll work on updating the tests. In the mean time, I added 2 commits to make the tests run more reliably.
Attachment #8521806 - Flags: review?(catlee27071982)
/r/525 - Bug 1097979 - Export HGRCPATH earlier in tests /r/527 - Bug 1097979 - Change directory earlier during tests /r/509 - Bug 1097979 - Require share support in Mercurial /r/511 - Bug 1097979 - Introduce lookup Mercurial extension /r/513 - Bug 1097979 - Introduce lookup() API /r/515 - Bug 1097979 - Share a Mercurial store for each logical repository group Pull down these commits: hg pull review -r bd9deb85bcb4ded7b2473dd1fdad8db9e3cf027d
/r/525 - Bug 1097979 - Export HGRCPATH earlier in tests /r/527 - Bug 1097979 - Change directory earlier during tests /r/571 - Bug 1097979 - Pass positional arguments from tox /r/573 - Bug 1097979 - Pin dates during Mercurial tests /r/575 - Bug 1097979 - Remove randomness from Mercurial operations /r/509 - Bug 1097979 - Require share support in Mercurial /r/511 - Bug 1097979 - Add lookup Mercurial extension /r/513 - Bug 1097979 - Introduce lookup() API /r/515 - Bug 1097979 - Share a Mercurial store for each logical repository group Pull down these commits: hg pull review -r 3940fb8bd92e22998e610661c16fd729c78bad14
https://reviewboard.mozilla.org/r/507/#review231 And the tests pass with the latest commit series! I had to make some changes to the Mercurial test environment so commit SHA-1 were stable. Once that was in place, making the tests pass was pretty easy.
https://reviewboard.mozilla.org/r/517/#review275 ::: hgext/lookup/__init__.py (Diff revision 1) > + result = remote.lookup(key) I'm seeing a pattern here. It's the second extension that is adding a command to do something on a remote. Wouldn't it be better to have only one such command with subcommands? Like hg remote <url> lookup foo or hg remote <url> heads
Attachment #8521810 - Flags: review?(mh+mozilla)
https://reviewboard.mozilla.org/r/507/#review309 ::: lib/python/mozilla_buildtools/test/test_util_hg.py (Diff revision 3) > - sharerepo = os.path.join(shareBase, self.repodir.lstrip("/")) > + sharerepo = os.path.join(shareBase, 'f99e43af0df3849a690155ce9aba15a7e36bdbf6') Could you mention what this magic revision comes from? From above it looks like this is r0? ::: lib/python/mozilla_buildtools/test/test_util_hg.py (Diff revision 3) > + self.assertEqual(root2, '7e8f621acee0444054fdc2f47e6555d338ba96c4') I think it would be clearer to say that root2 != root1. We don't actually care what r0 of repo2 is, only that it's different than the original repo. ::: lib/python/util/hg.py (Diff revision 3) > - why remove this check here?
https://reviewboard.mozilla.org/r/507/#review369 > why remove this check here? This is documented in the commit message of /r/509.
/r/525 - Bug 1097979 - Export HGRCPATH earlier in tests /r/527 - Bug 1097979 - Change directory earlier during tests /r/571 - Bug 1097979 - Pass positional arguments from tox /r/573 - Bug 1097979 - Pin dates during Mercurial tests /r/575 - Bug 1097979 - Remove randomness from Mercurial operations /r/509 - Bug 1097979 - Require share support in Mercurial /r/511 - Bug 1097979 - Add lookup Mercurial extension /r/513 - Bug 1097979 - Introduce lookup() API /r/515 - Bug 1097979 - Share a Mercurial store for each logical repository group Pull down these commits: hg pull review -r 241f9c34663b15fbdf466861c4375871725be358
https://reviewboard.mozilla.org/r/507/#review593 ::: lib/python/util/hg.py (Diff revision 4) > + raise I think we need to add some retry logic here, or wherever lookup() is called to handle sporatic network/server-side issues.
https://reviewboard.mozilla.org/r/517/#review4631 It turns out this extension isn't needed. `hg identify` works with URLs.
Attachment #8521810 - Attachment is obsolete: true
Comment on attachment 8521806 [details] MozReview Request: bz://1097979/gps /r/525 - Bug 1097979 - Export HGRCPATH earlier in tests /r/527 - Bug 1097979 - Change directory earlier during tests /r/571 - Bug 1097979 - Pass positional arguments from tox /r/573 - Bug 1097979 - Pin dates during Mercurial tests /r/575 - Bug 1097979 - Remove randomness from Mercurial operations /r/509 - Bug 1097979 - Require share support in Mercurial /r/513 - Bug 1097979 - Introduce lookup() API /r/515 - Bug 1097979 - Share a Mercurial store for each logical repository group Pull down these commits: hg pull review -r 756a2eef5713476f995322566b0581d3658d7bc0
Comment on attachment 8521806 [details] MozReview Request: bz://1097979/gps https://reviewboard.mozilla.org/r/507/#review4805 ::: lib/python/util/hg.py (Diff revisions 4 - 5) > If successful, we return the 40 character hex node for the key. If the this docstring needs a small update since we're no longer returning the full 40 character hex node.
Attachment #8521806 - Flags: review?(catlee)
https://reviewboard.mozilla.org/r/507/#review4807 ::: lib/python/util/hg.py (Diff revision 5) > - # we need to clobber both the shared checkout and the dest, > + sharedRepo = os.path.join(shareBase, root_node) do you think it's worthwhile adding a host-level cache here so that each hgtool invocation doesn't result in a call to lookup()?
https://reviewboard.mozilla.org/r/507/#review4809 > do you think it's worthwhile adding a host-level cache here so that each hgtool invocation doesn't result in a call to lookup()? Good catch. The answer depends on how frequently this code path will be invoked. Logs reveal 0.023s CPU time for a lookup on mozilla-central. That's pretty cheap. But anything multipled by the volume Firefox release automation is capable of sending is potentially problematic. Given events of the past week, I'm inclined to put the cache in as a safety net. Then again, that creates a new problem: cache invalidation. I don't expect that to be a concern for Firefox repos. But stranger things have happend. Let's get the numbers first.
Attachment #8521806 - Attachment is obsolete: true
I am upstreaming the work around automatically sharing repositories belonging to the same logic repository group. Once this is deployed, we'll be able to define share.pool=/path/to/base/directory in /etc/mercurial/hgrc and `hg clone` will automatically set up a share behind the scenes. Once this feature and bundleclone are deployed everywhere, I reckon a lot of the custom code in build/tools, mozharness, taskcluster, etc for cloning repositories from bundles and shares can be deleted since Mercurial core or a Mercurial extension will do it for us. https://selenic.com/pipermail/mercurial-devel/2015-July/071888.html
> I reckon a lot of the custom code in build/tools, mozharness, taskcluster, etc I'd expect everything to be in hgtool.py these days. Not a bad thing to obsolete it.
Depends on: 1181872
My upstream work has been accepted and will be part of Mercurial 3.5. That's due for release around August 1. We typically wait until the X.Y.1 releases to deploy. So this likely won't be deployed until September at the earliest. If we want, I could probably hack up hgtool to do the exact same thing as Mercurial 3.5. When 3.5+ is deployed everywhere, we can remove the code from hgtool. Let's see what things look like after bundleclone is deployed everywhere, then we can make a call on whether to do extra work so we can realize benefits before September.
This work was mostly done in bug 1270317.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
Attachment #8618614 - Attachment is obsolete: true
Attachment #8618616 - Attachment is obsolete: true
Attachment #8618613 - Attachment is obsolete: true
Attachment #8618618 - Attachment is obsolete: true
Attachment #8618619 - Attachment is obsolete: true
Attachment #8618615 - Attachment is obsolete: true
Attachment #8618611 - Attachment is obsolete: true
Attachment #8618612 - Attachment is obsolete: true
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: