Toy STR for git dropping file history when renaming/copying files when submitted to phabricator
Categories
(Toolkit :: UI Widgets, task)
Tracking
()
People
(Reporter: tgiles, Assigned: tgiles)
Details
Attachments
(12 obsolete files)
(This is 100% not a UI widgets bug, but putting it here so I can track it easier, feel free to move to a better spot if this is indeed an issue)
We've been running into issues where, when using git/git-cinnabar, we lose file history of files that are copied/renamed. If a file is simply copied, then the file history will stay intact, however most people (including myself) end up copying/renaming a file and putting the necessary changes in the new file. This causes Git to determine the renamed file is actually a newly added file and thus drop the file history. There should be some way around this, but I haven't discovered a workaround that consistently works.
I'm using this bug so I can attach WIP patches demonstrating the issue (since I don't know how to submit patches to phab without a bug associated with the patches/don't think it is possible).
Assignee | ||
Comment 1•1 year ago
|
||
Steps to reproduce this Git file history issue:
- While using git-cinnabar, rename a file in mozilla central
(I renamed resetpassword.xthml since it has history going back to
2007) - Add this renamed file to git which should create a "renamed" change
in git status - Use git log --follow <renamed_file>
- Observe that there is no file history
- Create commit with renamed file
- Use git log --follow <renamed_file>
- Observe that file history has been restored in this new file
While there are workarounds to this issue, such as this post from
Raymond Chen on Microsoft's dev blog,
https://devblogs.microsoft.com/oldnewthing/20190919-00/?p=102904, I've
been unable to preserve file history when submitting patches to
Phabricator.
Assignee | ||
Comment 2•1 year ago
|
||
If we submit multiple commits, the file history may be intact. However,
if we squash this commit into the previous one, then we will lose the
existing file history due to the renaming happening at the same time
as other changes.
Depends on D198050
Assignee | ||
Comment 3•1 year ago
|
||
This stack has a squashed commit with a large amount of the renamed file
changed and should force the new file added issue.
issue.
Steps to reproduce this Git file history issue:
- While using git-cinnabar, rename a file in mozilla central
(I renamed resetpassword.xthml since it has history going back to
2007) - Add this renamed file to git which should create a "renamed" change
in git status - Use git log --follow <renamed_file>
- Observe that there is no file history
- Create commit with renamed file
- Use git log --follow <renamed_file>
- Observe that file history has been restored in this new file
- Modify around 40% of the changed file
- Make a new commit and squash that commit back into the original
rename commit - Observe that the file history has been deleted since git performed
a create and delete action instead of a rename action
While there are workarounds to this issue, such as this post from
Raymond Chen on Microsoft's dev blog,
https://devblogs.microsoft.com/oldnewthing/20190919-00/?p=102904, I've
been unable to preserve file history when submitting patches to
Phabricator.
WIP: Bug 1873734 - A change that forces git to consider this a new file on squash
If we submit multiple commits, the file history may be intact. However,
if we squash this commit into the previous one, then we will lose the
existing file history due to the renaming happening at the same time
as other changes.
more changes to force new file added
Assignee | ||
Comment 4•1 year ago
|
||
This stack has a commit that would cause a new file to be added if the
change occurs at the same time as the file rename.
Steps to reproduce this Git file history issue:
- While using git-cinnabar, rename a file in mozilla central
(I renamed resetpassword.xthml since it has history going back to
2007) - Add this renamed file to git which should create a "renamed" change
in git status - Use git log --follow <renamed_file>
- Observe that there is no file history
- Create commit with renamed file
- Use git log --follow <renamed_file>
- Observe that file history has been restored in this new file
- Take the change from the second patch in this stack and apply it to
the first patch as a child commit. - Observe that the file history is preserved in this case
While there are workarounds to this issue, such as this post from
Raymond Chen on Microsoft's dev blog,
https://devblogs.microsoft.com/oldnewthing/20190919-00/?p=102904, I've
been unable to preserve file history when submitting patches to
Phabricator.
Assignee | ||
Comment 5•1 year ago
|
||
If we submit multiple commits, the file history may remain intact.
However, if we squash this commit into the previous one,then we will
lose the existing file history due to the renaming happening at the
same time as the other changes.
more file changes, no squashed commit though
This method keeps the file history intact, but creates an extraneous
commit of simply adding the files into the tree. Maybe that's a
trade-off that we can deal with.
Depends on D198054
Assignee | ||
Comment 6•1 year ago
|
||
copy over moz-card into test-git-file and squash to show loss of file
history.
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Assignee | ||
Comment 7•1 year ago
|
||
By performing the copy/rename in a separate commit from the intended
changes, git seems able to track the existing history in a file. This
would create extraneous commits as far as I can tell though.
Assignee | ||
Comment 8•1 year ago
|
||
If this changes are squashed/fixed up into the parent commit, then git,
and I assume phabricator, will both show this as a new file with no
file history.
Depends on D198066
Assignee | ||
Comment 9•1 year ago
|
||
Assignee | ||
Comment 10•1 year ago
|
||
This patch uses git mv and git merge --no-ff --no-commit on a temp
branch to get the rename to stick without losing file history.
No idea if phabricator will agree or not though.
Depends on D198069
Assignee | ||
Comment 11•1 year ago
|
||
This stack uses Raymond Chen's method for duplicating file history
https://devblogs.microsoft.com/oldnewthing/20190919-00/?p=102904
git checkout -b dup
git mv browser/components/shopping/content/shopping-card.mjs toolkit/content/widgets/moz-card/test-git-file-history-moz-card.mjs
git commit -m "duplicate shopping card to test-git-file-history"
git checkout HEAD~ browser/components/shopping/content/shopping-card.mjs
git commit -m "restore shopping-card.mjs"
git checkout -
git merge --no-ff dup
Assignee | ||
Comment 12•1 year ago
|
||
After duplicating file history, I copied moz-card.mjs into
test-git-file-history-moz-card.mjs. Here's hoping the shopping-card.mjs
doesn't get deleted and the test-git-file has the file history.
Depends on D198076
Comment 13•1 year ago
|
||
Hey glandium, do you know if this is just a gotcha for Git that we're going to have to live with as we migrate to it? https://stackoverflow.com/questions/47401843/git-copy-file-as-opposed-to-git-mv suggests that this behaviour is a natural consequence of how Git tracks changes... or is there a way we can do these copies in a Git-friendly way?
Comment 14•1 year ago
|
||
I'm not sure what this bug is talking about. Is it about file copies/moves not making their way to the mercurial repo? Or is it a question about the future exclusive use of git?
Assignee | ||
Comment 15•1 year ago
|
||
Hey glandium, we've been running into issues where copying a file and then editing it causes either git-cinnabar or phabricator (I think it's the diffing part of git-cinnabar, but obviously will defer to your expertise on that) to drop the file history of the copied file. It seems like this is a known issue (https://github.com/glandium/git-cinnabar/issues/10) of git-cinnabar. We were trying to find a workaround so that we aren't losing file history when duplicating files and submitting patches via git/git-cinnabar. It does seem like this issue will go away once we're exclusively using git, at least by my experiments.
I've been able to duplicate files and then keep the file history intact locally in git via two commits, one commit copying/renaming the source file and then one commit creating changes in the duplicated file. However submitting those changes to phabricator ends up with a "file created, file deleted" action from git/git-cinnabar which is not what we want. I suppose we just want to be sure that we're not losing file history in these cases and that this case is something that git does support (even if we have to do some workarounds to make it happen).
I hope that clears up the original intent of the bug, but please let me know if things are still unclear!
Comment 16•1 year ago
|
||
The git-cinnabar issue is about git-cinnabar pushing to mercurial. That's not involved when submitting to phabricator. What is involved is whatever tool you're using, presumably moz-phab. The relevant part of moz-phab is https://github.com/mozilla-conduit/review/blob/ee870fd6a2be3cc0fbaef5fed36543fd660bd884/mozphab/git.py#L847-L857 which sets both -M and -C, which means "find renames and copies", and also uses the default similarity threshold, which is 50%. So if the file is less than 50% identical, renames/copies are not found. When you look at a commit locally, you can obviously tweak those, but moz-phab doesn't allow that. Whether it should or not is a good question. Considering phabricator (is supposed to) detect cross-file moves, is it really a problem that file renames/copies are not found in some cases? (I mean, apart from the fact that that won't be recorded in mercurial, but it's soon going to stop mattering)
Assignee | ||
Comment 17•1 year ago
|
||
(In reply to Mike Hommey [:glandium] from comment #16)
The git-cinnabar issue is about git-cinnabar pushing to mercurial. That's not involved when submitting to phabricator. What is involved is whatever tool you're using, presumably moz-phab. The relevant part of moz-phab is https://github.com/mozilla-conduit/review/blob/ee870fd6a2be3cc0fbaef5fed36543fd660bd884/mozphab/git.py#L847-L857 which sets both -M and -C, which means "find renames and copies", and also uses the default similarity threshold, which is 50%. So if the file is less than 50% identical, renames/copies are not found.
I appreciate that link, I remember modifying that when I ran into this issue before but couldn't remember exactly where it was at.
When you look at a commit locally, you can obviously tweak those, but moz-phab doesn't allow that. Whether it should or not is a good question.
So that's one of the things, I'm not modifying the default similarity threshold locally as far as I can tell. When I run git log --follow the_modified_file
when using the two commit approach (one commit that adds, one commit that changes more than 50% of the file contents), git is determining that a rename happened followed by a modification. Unless my git log
has an increased similarity threshold, I don't understand why moz-phab or phabricator isn't reacting in the same way as git log
(since git only does diffs when it needs to, as I understand it).
Wait, so there's a chance that a file that moz-phab/phabricator (I'm using them together since I'm still not entirely sure where this disconnect is, apologies for the conflation) determines is a new file could still be recorded as a rename/copy in the source repo?
Considering phabricator (is supposed to) detect cross-file moves, is it really a problem that file renames/copies are not found in some cases? (I mean, apart from the fact that that won't be recorded in mercurial, but it's soon going to stop mattering)
I'm sure this is something that has already been discussed before, but I don't believe it's good to lose history of files in these cases, especially given the longevity of the Firefox code-base. I will admit that this could be a case of me making a big deal out of edge cases. It does appear that it will need to be a new muscle to learn from the folks who previously used Mercurial and are used to the whole hg cp
command that handles this case without issue.
I will absolutely defer in this matter. If this particular case isn't really an issue, I'm happy to get out of the way.
Assignee | ||
Comment 18•1 year ago
|
||
Hey :sheehan, :mconley let me know that you're maintaining moz-phab so I wanted to make sure you were aware of this bug. I'm still not 100% sure if this is a moz-phab bug or a Phabricator bug, but you probably have a much better picture of how these systems work than I do!
To get you up to speed, there are cases where we're losing file history when using git (git-cinnabar) to submit patches to Phabricator. These cases involve copying a file and then making changes to more than 50% of the file (which should trip the default similarity threshold in the git.py file). For example, in this squashed patch, test-git-file-history-moz-card
is considered a new file by both git and moz-phab, this is to be expected as far as I can tell (because more than 50% of the file changed).
However, if we split the copying of the file into one commit and modification in another commit, git locally can follow this duplicated file's history but moz-phab/phabricator believes this is a new file. You can see this in this two commit stack. Do you believe this is a bug in moz-phab or phabricator? I'm not sure how using moz-phab with Mercurial doesn't show this issue (when using the hg cp
command). I'd appreciate any insight into the matter (and happy to file an issue in the Conduit github if it is a moz-phab bug)
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Assignee | ||
Updated•1 year ago
|
Description
•