Bug 1351859 (Closed) - Opened 7 years ago, Closed 5 years ago

Mass upgrade repos to generaldelta

Categories: Developer Services :: Mercurial: hg.mozilla.org
Type: enhancement
Priority: Not set
Severity: normal
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: gps; Assignee: Unassigned
Keywords: leave-open

Mercurial 3.7 uses the "generaldelta" storage mechanism by default. This storage format computes a revision's delta against one of its parents rather than against whatever revision happens to be stored immediately before it in the revlog. For repos with lots of merges, this can have a drastic impact on repo size. For example, non-generaldelta mozilla-central is ~2,183 MB on disk (not including inode overhead). With generaldelta, it is ~1,758 MB on disk.
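For reference, a quick way to tell whether a local clone already uses generaldelta and to measure its store, as a minimal shell sketch (the clone path is just an example):

# "generaldelta" appears in .hg/requires once the storage format is in use.
cd ~/src/mozilla-central        # example path to a local clone
grep generaldelta .hg/requires && echo "already generaldelta"

# Rough on-disk size of the store, for before/after comparisons.
du -sh .hg/store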

generaldelta is why the mozilla-unified repo (a superset of central, aurora, beta, release, etc) is smaller than mozilla-central.

Since generaldelta repos use more optimal storage, they generally result in faster repo operations, including clone, pull, and push, as well as more common local operations like committing.

We haven't yet mass converted repos to generaldelta on the server out of concern for performance. A drawback of generaldelta is that when the server sends data to legacy clients, it may have to re-encode that data to the non-generaldelta format. On large repos (like the Firefox repo), this can result in a ton of extra CPU.

Automated clients under Mozilla's control are responsible for most requests against hg.mozilla.org. Most are running Mercurial 3.9 or 4.1, which means they can consume generaldelta repos optimally without incurring any extra server-side load.

And, enough time has passed that the number of legacy clients not supporting generaldelta data exchange has decreased to a point where we can afford the server-side CPU hit.

Mercurial 4.1 introduced an `hg debugupgraderepo` command that can be used to upgrade a Mercurial repository in place to use the latest/greatest storage format. It can also re-encode data during the upgrade so optimal/minimal storage is used.
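As a rough sketch (hypothetical repo path), running the command without `--run` should only report what it would do, which makes for a cheap preview of an upgrade:

# Dry run: reports missing format requirements and available optimizations;
# nothing is modified until --run is passed.
hg -R /repo/hg/mozilla/some-repo debugupgraderepo

# Add --optimize flags and --run to actually perform the upgrade.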

That command works by creating a new, empty "store" directory and then copying revlogs into it one by one using an internal API. It basically replays all data through the new code paths so optimal storage is used. During the conversion, the repo is locked and can't be written to. At the end of the conversion, it does a mostly-atomic swap of the .hg/store directory. It also leaves a backup of the old .hg/store directory around in the .hg directory.

The process for upgrading repositories to generaldelta basically involves running `hg debugupgraderepo` on each repository. We should be running with `--optimize redeltamultibase` and either `--optimize redeltaall` or `--optimize redeltaparent`. This combination will minimize repo storage by forcing deltas to be recomputed, taking advantage of generaldelta.

The time it takes to run `hg debugupgraderepo` with delta recalculation varies with the size of the repo. For a mozilla-central-like repo, it likely takes 2-4 hours on the hgssh server and NFS. During that time, the repo is read-only. This is unfortunate. The command can be aborted at any time without data loss, so scheduling the upgrade of critical repos may not be necessary. e.g. we can ask the sheriffs when a quiet time will be and just do it. If there is a chemspill and we need to land something, we abort the upgrade and try again later.

Once a repo is upgraded on hgssh4, we'll need to re-clone the repo on each hgweb node so it serves the generaldelta version. I think there is an Ansible playbook for that which will even do the re-clone in a way that results in no client-visible downtime.

Another concern with mass upgrading the repos is inode bloat and backup overhead on the NFS server. Snapshot backups will encounter hundreds of thousands or millions of new files from the "store backup" directory. This could potentially bring the backup system to its knees. We may want to coordinate with the backup process operators when we do mass upgrading.
I have started manually converting a cherry-picked list of repos to generaldelta on hgssh. Last night, I converted build/mozharness, build/talos, build/tools, comm-central, hgcustom/version-control-tools, and releases/mozilla-esr31. I did this by running the command `sudo -u hg /var/hg/venv_hg_pre/bin/hg --config format.usegeneraldelta=true debugupgraderepo --optimize redeltamultibase --optimize redeltaall --run`. When that completed, I also had to run the `repo-permissions` script to restore the group owner because `sudo -g` isn't currently working on the server.

The upgrade itself seemed to go off without a hitch. Bundles for the new repos were generated without error.
I am currently upgrading releases/mozilla-beta on hgssh5. I've announced this in #sheriffs and they are aware they need to ping me if the repo needs to be urgently unlocked.

Also, I haven't yet propagated the generaldelta upgrade of any of these repos to the hgweb machines. At this point, I'm more interested in getting the canonical repo upgraded: we can replicate the upgrade later.
I aborted the mozilla-beta upgrade because it was going to take another 3-5 hours and I'm not going to be around for another 3 hours to babysit it.

I suspect that repo was taking forever to convert because of the unholy number of DAG heads it has. There's a lot more expensive delta work to do on that repo compared to more linear repos.
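For the curious, a rough way to count DAG heads (a hypothetical invocation against the server-side path):

hg -R /repo/hg/mozilla/releases/mozilla-beta log -r 'head()' -T '{node}\n' | wc -l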
releases/mozilla-esr38 upgraded. releases/mozilla-release should finish upgrading in ~10 minutes.
I'm upgrading releases/mozilla-esr45 and a bunch of releases/mozilla-b2g repos. At this point, I'm basically working my way through the repos we generate bundles for because repos with bundles tend to be higher traffic or more important and are therefore the repos we care most about upgrading.
gcox: could you please tell me about the backup situation for the hg NFS mount? Specifically:

1) Will creating potentially a few million short-lived files during the upgrade cause problems for the backup mechanism?

2) How many days of backups do we retain and what is the process for obtaining files from a backup? In the rare case we have to revert the upgrade, it is nice to have a backup to fall back on. Currently, we're keeping Mercurial's backup around. But if we can delete that immediately and restore from the NFS backup up to N days later, that might be acceptable.
(In reply to Gregory Szorc [:gps] from comment #7)
> gcox: could you please tell me about the backup situation for the hg NFS
> mount?

Back-reference: ancient-history bug 1311022, for some "how we got to where we are now" context.

> 1) Will creating potentially a few million short-lived files during the
> upgrade cause problems for the backup mechanism?

So, hg has ~26M inodes.  hg/mozilla/users has ~39M inodes.  users gets backed up weekly by bacula, main is nightly.
We were having timing issues when all the inodes were in one volume: 60-some million inodes was taking 20-some hours to back up.  The splitup got things back into a better space.

If you need "a few" million, it's probably fine to just roll with it, but DO clean up as soon as feasible.
If it's more than "a few", can we give you a separate volume that isn't on the daily backup location?  However, if they're all interspersed and can't be lumped into one subvolume (e.g. /repo/hg/tmp ?), probably best to just go for it.
If it's "a lot", let's look at it so we don't blow out the volume.

> 2) How many days of backups do we retain

On the filer:
users: 7 nightlies (at 0010 UTC), 6 'hourlies' (at 0800, 1200, 1600, 2000 UTC)
main:  4 nightlies (at 0010 UTC), 6 'hourlies' (at 0800, 1200, 1600, 2000 UTC)
(we skip 0400 UTC based on generally low changes)

Both are also backed up to bacula; I don't know their retention period there, but it's probably in the 'months' range.

> and what is the process for obtaining files from a backup?

Files are pretty easy.
Short version: (cp or rsync) /repo/hg/.snapshot/(some snapshot)/(yourpathandfile) /repo/hg/(yourpathandfile).  If it's a few files, just do it as a self-service.
If it's a LOT of copies, let us know so we can make sure the snapshot doesn't age off in the middle of your restore.
Gentle reminder here: make sure you cd out of the .snapshot directory before it ages off and leaves you with a stale mount, a la bug 1269855 comment 11.
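As a concrete example of the short version above (hypothetical snapshot name and file; use whatever an ls of the .snapshot directory actually shows):

ls /repo/hg/.snapshot/
cp -a /repo/hg/.snapshot/nightly.0/mozilla/some-repo/.hg/hgrc \
      /repo/hg/mozilla/some-repo/.hg/hgrc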

If you're to the point where things are so far out of reality that you need to go from bacula, you're at the point of "raise a flag", as there's no self-service capability on it and we'd need to work together on a return to service.

> In the rare case we have to revert the
> upgrade, it is nice to have a backup to fall back on. Currently, we're
> keeping Mercurial's backup around. But if we can delete that immediately and
> restore from the NFS backup up to N days later, that might be acceptable.

If you can do plain file restores, a cp from .snapshot is pretty good, but keep in mind there are no hooks to force a db-style 'hot backup mode' where on-disk consistency is a perfect restore point. My snapshots are "your filesystem looked like THIS, at THAT point in time"... which is usually good enough for most applications, but it's obviously not as tested as a backup from the actual vendor.

There's also a 'snap restore' option. This isn't self-service; you'd need us to do it. It's an instantaneous one-way trip into the past, taking the volume back to what it looked like at snapshot time. We love these for maintenance windows, when you halt a service (making sure the snapshot is a copy of an unchanging data set), try your upgrade, and have a good rollback point. Rolling back to a hot snapshot is possible, but it carries the risk of not-cold data (you didn't halt or stabilize the service for the snapshot), combined with unplanned data loss (people have been checking in commits between then and now). It's a pretty nuclear option, but it's there if we need it.
Thanks, Greg. That's very useful context and I think that provides enough to guide how we should eventually do the automated "mass" upgrade (which will likely run for days).

Before we kick off that high-volume upgrade, I'll try to remember to run the process by you so you can raise any red flags.
l10n-central/* were also upgraded today. I figured I'd do some smaller repos to get a feel for how quickly they run. (5-20s per repo.)
I upgraded the following repos today:

* integration/b2g-inbound
* integration/fx-team
* mozilla-central
* projects/ash
* projects/cedar
* projects/holly
* projects/jamun
* projects/larch
* projects/oak
* releases/mozilla-aurora
* releases/mozilla-beta

That leaves integration/mozilla-inbound as the single repo we generate bundles for still not using generaldelta. I'm not sure when I'll be able to convert inbound: lots of developers use it and a 6+ hour closure won't be appetizing. I may try to sneak it in on a quiet Sunday or wait for a TCW. Or I may do a one-off process that doesn't require the repo be read-only during the bulk of the upgrade. I'll figure out something.

The upgrades today took a while. I ran like 8 concurrently. I'm pretty sure it exhausted I/O or CPU because the conversions took upwards of 8 hours.

As part of investigating the slowness, I think I found a significant performance bug with the upgrade. Notably, writing each revlog revision performs a file open(), lstat(), seek(), write(), and close() instead of reusing an open file descriptor. Over NFS, I think that translates to a substantial slowdown. I'll see if I can fix that upstream for the 4.2 release.
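This isn't Mercurial's actual code, but a crude shell illustration of why a per-write open/close pattern hurts, especially on NFS (file names are examples; point them at an NFS mount to see a real difference):

# Loop 1 re-opens the output file for every single write.
time ( for i in $(seq 1 2000); do echo x >> reopen-per-write.txt; done )

# Loop 2 writes the same data through one file descriptor held open for the
# whole loop.
time ( for i in $(seq 1 2000); do echo x; done > single-descriptor.txt )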

I still have yet to reclone any repo on the HTTP servers. I attempted that today. But the Ansible playbook we have is out of date *and* the load balancer integration is busted for reasons I don't yet understand. I may do it manually tomorrow if I can't figure out what's busted, as I really want these high volume repos switched over to generaldelta.
(In reply to Gregory Szorc [:gps] from comment #11)
> I upgraded the following repos today:
> 
> * integration/b2g-inbound
> * integration/fx-team
> * mozilla-central
> * projects/ash
> * projects/cedar
> * projects/holly
> * projects/jamun
> * projects/larch
> * projects/oak
> * releases/mozilla-aurora
> * releases/mozilla-beta
> 
> That leaves integration/mozilla-inbound as the single repo we generate
> bundles for still not using generaldelta.

Can you also upgrade `projects/date`, please? (releng uses it as a tc-nightly testbed, and thus I have it locally too)
I recloned many converted repos to hgweb11 and hgweb12 today. The process was manual and looked similar to the steps below (a condensed shell sketch follows the list):

1. Remove host from load balancer
2. downtime host in nagios
3. rm -rf /repo/hg/mozilla/$repo
4. sudo -u hg /var/hg/venv_replication/bin/hg init /repo/hg/mozilla/$repo
5. (from hgssh) /var/hg/venv_pash/bin/hg -R /repo/hg/mozilla/$repo replicatehgrc
6. sudo systemctl stop vcsreplicator@*.service
7. sudo -u hg /var/hg/venv_replication/bin/hg --config ui.clonebundleprefers=VERSION=packed1 -R /repo/hg/mozilla/$repo pull ssh://hg.mozilla.org/$repo
8. sudo /var/hg/version-control-tools/scripts/repo-permissions /repo/hg/mozilla/$repo hg hg wwr
9. for i in 0 1 2 3 4 5 6 7; do sudo systemctl start vcsreplicator@$i.service; done
10. (from hgssh) /var/hg/venv_pash/bin/hg -R /repo/hg/mozilla/$repo replicatesync
11. <wait for vcsreplicator to process backlog>
12. undowntime host in nagios
13. put host back in load balancer
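A condensed, hypothetical shell wrapper for steps 3-10 (repo names are just examples; the hgssh-side replication steps are left as comments, and the load balancer/nagios/backlog steps still happen around it):

for repo in integration/fx-team releases/mozilla-aurora; do   # example repos
  rm -rf /repo/hg/mozilla/$repo
  sudo -u hg /var/hg/venv_replication/bin/hg init /repo/hg/mozilla/$repo
  # (from hgssh) /var/hg/venv_pash/bin/hg -R /repo/hg/mozilla/$repo replicatehgrc
  sudo systemctl stop 'vcsreplicator@*.service'
  sudo -u hg /var/hg/venv_replication/bin/hg \
    --config ui.clonebundleprefers=VERSION=packed1 \
    -R /repo/hg/mozilla/$repo pull ssh://hg.mozilla.org/$repo
  sudo /var/hg/version-control-tools/scripts/repo-permissions /repo/hg/mozilla/$repo hg hg wwr
  for i in 0 1 2 3 4 5 6 7; do sudo systemctl start vcsreplicator@$i.service; done
  # (from hgssh) /var/hg/venv_pash/bin/hg -R /repo/hg/mozilla/$repo replicatesync
done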

hgweb11 and 12 are in the load balancer and serving requests. I figure I'll let things be this way over night to flush out any unexpected problems. If things are good in the morning, I'll proceed with cloning these repos on hgweb13 and 14. Then I'll send out an announcement email and encourage people to re-clone or upgrade their repos so they get better performance.
Setting a needinfo on myself to get projects/date upgraded and have bundles generated so it behaves like the prod repos.
Flags: needinfo?(gps)
I performed a one-off upgrade of mozilla-inbound over the course of yesterday and today. I basically rsync'd the repo, did an upgrade offline, then swapped in the new store and requirements while the repo was read-only and renamed on the hgssh server. After spot checking it, I put it back in service then recloned the repo on the HTTP servers.
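As a sketch of that kind of offline upgrade-and-swap, with hypothetical paths (not the exact commands that were run):

# Copy the repo aside and upgrade the copy while the original stays in service.
rsync -a /repo/hg/mozilla/integration/mozilla-inbound/ /repo/hg/scratch/inbound-upgrade/
hg -R /repo/hg/scratch/inbound-upgrade --config format.usegeneraldelta=true \
   debugupgraderepo --optimize redeltamultibase --optimize redeltaall --run

# With the live repo read-only: pull any pushes that landed meanwhile into the
# upgraded copy, then swap in the new store and requirements and fix perms.
hg -R /repo/hg/scratch/inbound-upgrade pull /repo/hg/mozilla/integration/mozilla-inbound
mv /repo/hg/mozilla/integration/mozilla-inbound/.hg/store{,.pre-gd}
mv /repo/hg/scratch/inbound-upgrade/.hg/store /repo/hg/mozilla/integration/mozilla-inbound/.hg/
cp /repo/hg/scratch/inbound-upgrade/.hg/requires /repo/hg/mozilla/integration/mozilla-inbound/.hg/requires
sudo /var/hg/version-control-tools/scripts/repo-permissions /repo/hg/mozilla/integration/mozilla-inbound hg hg wwr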

As part of recloning the repo on the HTTP servers, I made my first real mistake of this conversion: I accidentally removed inbound on an HTTP server that was active in the load balancer. There were a few minutes where 50% of requests to mozilla-inbound failed. Oops.

At this point, all high-traffic repos (the set of repos we generate bundles for) have been converted to generaldelta on both the SSH and HTTP servers. The remaining work will be to convert the long tail of other repos. I'm not sure when we'll do this. Given the amount of work involved, it is definitely something we should automate. If nothing else, that should prevent the mistake I made earlier today that resulted in inbound being intermittently unavailable for a few minutes.
Blocks: 1354356
I upgraded projects/date today.
Flags: needinfo?(gps)
Assignee: gps → nobody
Status: ASSIGNED → NEW
Keywords: leave-open
I have resumed activity on this bug. I'm actively mass upgrading repos to generaldelta on hgssh4.
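For flavor, a crude way to run a bounded number of upgrades in parallel (hypothetical paths; this is not the actual script, which landed in version-control-tools below):

# Upgrade repos under projects/ at most four at a time. "hg" here stands in
# for whichever server-side hg/venv is appropriate.
printf '%s\n' /repo/hg/mozilla/projects/* | xargs -P 4 -I{} \
  sudo -u hg hg -R {} --config format.usegeneraldelta=true \
  debugupgraderepo --optimize redeltamultibase --optimize redeltaall --run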
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/ed9e3b65f21d
scripts: script to upgrade Mercurial repos
All non-user repositories have been upgraded to generaldelta.

We still have several GB of .hg/upgradebackup.* directories sitting around. I'll purge them after running `hg verify` on the upgraded repos.
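A hypothetical sketch of that verify-then-purge pass (repo discovery depth and the hg binary path would need adjusting for the real layout):

find /repo/hg/mozilla -maxdepth 4 -type d -name .hg | while read hgdir; do
  repo=$(dirname "$hgdir")
  sudo -u hg hg -R "$repo" verify || { echo "verify FAILED: $repo" >&2; continue; }
  rm -rf "$repo"/.hg/upgradebackup.*
done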

Next up are the user repos. 2191/2318 need to be upgraded.
I have stopped the mass repo upgrade in preparation for the winter break.

903/2318 user repos still need to be upgraded.

We'll finish the upgrade sometime in 2018.
I have resumed the upgrades. Upgrade processes running on hgssh4.
The upgrades finished sometime on Saturday!

The backups are being removed as I type this.

Next step is to mass re-clone the repos on to the hgweb machines so they are also serving generaldelta repos.
Blocks: 1513276
All repos should now be using generaldelta on all servers. I'm pretty sure we mass upgraded repos on hgweb long ago. If not, they were upgraded as a side-effect of the MDC1 migration.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED