Open
Bug 1408357
Opened 7 years ago
Updated 2 years ago
mach vendor rust should detect copies to avoid hg repo bloat
Categories
(Firefox Build System :: General, enhancement)
Tracking
(Not tracked)
NEW
People
(Reporter: kats, Unassigned)
Often, when a crate's dependency is updated, we end up with one crate getting moved and a new crate getting added. For example, the third_party/rust/bincode folder (currently holding 0.8.0) might get moved to bincode-0.8.0, with the bincode folder getting updated to 0.9.0. (This is the scenario that will happen with the next webrender update.)
The thing is, the files getting moved to bincode-0.8.0 don't get detected as moves by hg, so it ends up storing another copy of them internally. If we are smart about this, we can avoid hg storing the same thing twice. Right now mach vendor rust uses `hg addremove` to stage the changes, which detects renames but not copies. If `hg addremove` detected copies we would get this fixed for free, which would be great. That is an outstanding bug against hg: https://bz.mercurial-scm.org/show_bug.cgi?id=3432. :gps, anything you can do to help move that along?
Failing that, another approach would be to stop using `hg addremove` and instead write something smarter that takes advantage of what we know happens during vendoring at the crate level (this crate got copied over there, and the in-place one got updated).
For context, the reason I thought of this is because the bincode repo contains a useless logo.png which is larger than the 100k file size limit that `mach vendor rust` now imposes. I'll deal with that separately though.
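As a rough illustration of the crate-level approach described above, here is a minimal sketch. It assumes we can snapshot a crate folder before vendoring runs, then compare the newly created versioned folder against that snapshot. The helper names (`snapshot`, `plan_copies`, `hg_copy_commands`) are hypothetical and not part of mach. `hg copy --after` is the existing Mercurial command for recording a copy that has already happened on disk; the sketch only builds the command lines and does not run them.

```python
import hashlib
from pathlib import Path


def snapshot(crate_dir):
    """Map each file path (relative to crate_dir) to a SHA-256 digest of its bytes."""
    root = Path(crate_dir)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def plan_copies(before, old_dir, new_dir):
    """Return (source, destination) pairs for files in new_dir that are
    byte-identical to what old_dir held at the same relative path
    before vendoring (as recorded in the `before` snapshot)."""
    after = snapshot(new_dir)
    return [
        (f"{old_dir}/{rel}", f"{new_dir}/{rel}")
        for rel, digest in sorted(after.items())
        if before.get(rel) == digest
    ]


def hg_copy_commands(pairs):
    """Build (but do not run) one `hg copy --after` command line per pair."""
    return [["hg", "copy", "--after", src, dst] for src, dst in pairs]
```

In the bincode scenario, one would call `snapshot("third_party/rust/bincode")` before vendoring and `plan_copies(before, "third_party/rust/bincode", "third_party/rust/bincode-0.8.0")` afterwards. Files that the new crate version deleted from the in-place folder would need separate handling (they are renames rather than copies), which this sketch leaves out.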
Flags: needinfo?(gps)
Comment 1•7 years ago
There's a `--similarity` argument to addremove that takes a percentage to use to detect renames. We could add that and tweak it until we get reasonable results.
Reporter
Comment 2•7 years ago
--similarity only picks up on moves, not copies. That is, the original file would have to be removed in order for it to work, which is usually not the case for us. Tweaking the similarity down from the default of 100 might help, but I expect it will have minimal impact given the types of changes that vendoring produces in practice.
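To make the move-versus-copy distinction concrete, here is a toy model of similarity-based rename guessing. It is not Mercurial's implementation: `guess_renames` is a hypothetical name, and `difflib.SequenceMatcher` stands in for whatever scoring hg uses internally. The only behaviour it mirrors is the documented one, namely that added files are compared against removed files only.

```python
from difflib import SequenceMatcher


def guess_renames(removed, added, threshold=1.0):
    """Pair each added file with the most similar removed file.

    `removed` and `added` map paths to file text. A pair is recorded only
    if the similarity ratio reaches `threshold` (1.0 means identical).
    """
    pairs = []
    for new_path, new_text in sorted(added.items()):
        best = None
        for old_path, old_text in sorted(removed.items()):
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, old_path)
        if best is not None:
            pairs.append((best[1], new_path))
    return pairs
```

In the vendoring case the original file (for example under bincode/) is modified rather than removed, so it never appears in `removed` and no pair can be found for the new file under bincode-0.8.0/, whatever the threshold.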
Comment 3•7 years ago
Mercurial's default storage layer doesn't do content-level de-duping the way Git does. So copy detection would be mostly about preserving metadata as opposed to storage savings.
I suspect we could get some traction to fix this upstream, especially since people have recently touched the copy tracing code for the 4.4 release (https://www.mercurial-scm.org/repo/hg/rev/036d47d7cf39). I don't have a good grasp on this area of the code, otherwise I would consider taking a stab at it. I'll comment on the upstream issue...
Flags: needinfo?(gps)
Updated•7 years ago
Product: Core → Firefox Build System
Updated•2 years ago
Severity: normal → S3