Closed Bug 1457342 Opened 6 years ago Closed 6 years ago

prevent_subrepos hook likely adds significant overhead to large pushes

Categories

(Developer Services :: Mercurial: hg.mozilla.org, enhancement, P2)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: gps)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

On uplift day pushes earlier today, I noticed that we're spending ~750s in the hgext_mozhooks.pretxnchangegroup hook.

Auditing the hooks, the only thing I can find that could possibly be adding so much overhead is this line in prevent_subrepos.py:

    if '.hgsub' not in ctx and '.hgsubstate' not in ctx:

This requires resolving the manifest fulltext for each revision being pushed. Resolving the manifest fulltext on large repositories without using tree manifests can be very slow. >100ms per manifest.

While the last N manifests are supposed to be cached, resolving the fulltext of several manifests can likely add several seconds in overhead.

ctx.files() would be a much faster check. But the reason we don't check ctx.files() is because it can be spoofed. We ban subrepos because of security concerns. We want the hook/check to be robust. Hence why we're reading manifests.

But there's another way we can speed this up. Mercurial indexes tracked paths separately. So you can ask the file store for history of a specific path. We can bypass the per-changeset check by doing an up-front check to see if the repo has *any* data for .hgsub or .hgsubstate. If not, we can avoid reading manifest data.
p2 since this is likely slowing down uplift day pushes by several minutes.
Priority: -- → P2
Assignee: nobody → gps
Status: NEW → ASSIGNED
Previously, we would read the manifest for each context and look for
the presence of an .hgsub or .hgsubstate file. Reading manifests can
be expensive for large manifests (such as those found in Firefox
repos).

I suspect that reading manifests is making large pushes to Firefox
repos slower than they could be.

This commit implements an optimization to make subrepo checking
faster. Essentially, when the hook starts, it looks to see if there
are *any* revisions for a .hgsub or .hgsubstate file. If not, then
no revision contains this file and we can avoid a potentially
expensive manifest resolve and lookup.
Comment on attachment 8972421 [details]
hghooks: short circuit subrepos check if files missing (bug 1457342)

Byron Jones ‹:glob› has approved the revision.

https://phabricator.services.mozilla.com/D1073
Attachment #8972421 - Flags: review+
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/07f46c8333df
hghooks: short circuit subrepos check if files missing r=glob
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Blocks: 1460722
See Also: → 1415233
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: