Closed Bug 1443375 Opened 3 years ago Closed 3 years ago
revisions created with arc diff (hg) can't be applied with arc patch (git)
Because they don't translate the commit ID
Working on writing something up about the issues and possible solutions here.
Assignee: nobody → smacleod
Alright, based on reading the slack discussion that prompted this bug, I'm going to summarize what I think the desire is here: Phabricator revisions can be applied locally to get the same working copy the author had when posting the revision. Or more specifically, Phabricator revisions can be applied locally to the parent commit they are based upon. Both the creator of the revision and the patcher can be using either hg or git and it should still work. An important aspect of this problem though is if we'd like to be able to apply revisions that are based on non-public or unrelated parents. This comes up quite a bit with gecko developers who will have several bugs / features on the same branch, or changes based on others which have not landed. # Stock Phabricator Solutions: ## Staging Area The robust Phabricator/Arcanist workflow to enable this revolves around using a "Staging Area" repository. Phabricator allows configuring a repository that "arc diff" will also push changes to when uploading the diff. This would mean the commits used to create a revision exist publicly and could be pulled for exact reproduction. This feature is currently git only and would not work with confidential changes. Without a Staging Area, you need to either make sure all commits leading up to those in a revision exist publicly with some other mechanism, or you need to store enough meta-data along with the revision that the patch can be applied locally. ## Storing the Parent Commit "arc diff" without staging will include the commit identifier for the parent commit. "arc patch" looks for this parent locally, and attempts to apply the diff on top of it. If this parent doesn't exist in your local repository it is still possible to apply the patch on top of another commit but this isn't guaranteed to work. In the hg -> git or git -> hg case the parent will never exist since the commit identifiers aren't translated (a feature arc does not have). For cases where the scm tool matches it may still break, since there is no requirement that the uploaded diff is based on a commit the developer has pushed or uploaded elsewhere. This uploaded meta-data is also completely optional, custom clients and web UI uploads aren't required to set any of it. It can even be set incorrectly to be misleading or malicious. # Different levels of functionality So it doesn't appear any of the official Phabricator workflows and tools work for us or fully solve our problem, as-is. If we're going to build something that solves this, or official Phabricator tools gain features to support Mozilla, I see a few different levels of functionality as options: 1. Full commit reproduction. The adopted workflow and tools would allow reproducing a revision's commit or commits identically in your local repository, including all history leading up to it (hg <--> git translation being deterministic and supported). 2. Revisions applied as patches (commits are still squashed together) but parent commit to be applied to is guaranteed to be accessible or applied first as a patch itself (enforced by tools/workflow). 3. Patch application for revisions based on already public changes works, regardless of the version control system on either end. # Solution: The details of the solution we choose are dependent on the level of functionality we'd like to hit. For example, requiring Arcanist be used and having it improved upstream most likely would never get to the level of 1 or even 2. I'd imagine 2 or 3 could probably be hit with a custom client or Arcanist extension quite easily. If we'd really like the functionality described in 1 we'd need both a custom client and custom server side infrastructure. I'm going to leave out any more implementation specific details as this is already quite a long reply.  https://secure.phabricator.com/book/phabricator/article/harbormaster/
Hi Stephen, Taking a step back from this. What is the functionality we now provide with only one revision control tool? I.e. say that you have two diffs, D1 and D2, where D2 is based on D1. If I do arc patch D2, does that work properly?
Reading more closely, it seems like right now the answer is "no" but that you have a somewhat straightforward partial fix for git (the staging area) but not hg. Is that correct? Also, how does it work with patch stacks? Assuming that this is correct, then it seems like the immediate task ought to bring non-matching VCSs up to parity with matching VCSs (presumably with commit id translation). That doesn't solve every use case, but it does solve many.
:ekr and I discussed this off bugzilla. The immediate concern is making patch application work when the author and reviewer are using distinct VCS. Mismatch is already a common problem and is sure to come up frequently after general release. Particularly there are 4 cases, based around what the change in review has as a parent commit, and how the author's VCS compares to the reviewers. a) Parent commit exists publicly, VCS match (currently works) b) Parent commit exists publicly, VCS differ c) Parent commit not public, VCS match d) Parent commit not public, VCS differ While it would be nice if these 4 cases all worked (more so that our tools and workflows we could guarantee c and d were impossible), we should focus on making the common cases (a + b) work. This is much more achievable in a short time, requiring less/simpler Mozilla specific code. I'll tie this back to comment #2's levels of functionality. (In reply to Steven MacLeod [:smacleod] from comment #2) > 1. Full commit reproduction. The adopted workflow and tools would allow > reproducing a revision's commit or commits identically in your local > repository, including all history leading up to it (hg <--> git > translation being deterministic and supported). > 2. Revisions applied as patches (commits are still squashed together) but > parent commit to be applied to is guaranteed to be accessible or applied > first as a patch itself (enforced by tools/workflow). > 3. Patch application for revisions based on already public changes works, > regardless of the version control system on either end. The support we're aiming for would be level 3. This means we don't need any sort of storage for intermediate commits that would usually not be public or the machinery for creating, uploading, or applying them. We should be able to get away with implementing some form of commit id translation. We have a few options for solving this. To get started though I think it's probably quite unlikely that support for our workflow could land upstream in Arcanist. Phabricator has support for hosting repositories itself, I'd imagine any upstream solution that allows both git and hg clients would probably be much easier if using hosted repositories is mandatory. We do not use the hosted repositories at mozilla and translating commits without a standard mapping is hard. Our best bet for having that standard mapping is requiring git-cinnabar be used for all git clients or at least standardize on a repo to clone from that would have the same mapping as a cinnabar client. If we can rely on a standard mapping it's just a question of where / when we perform the mapping and how the code applying the patch for the reviewer gets this mapping (or a pre-mapped commit identifier). Diving down the Arcanist source rabbit hole, it doesn't appear possible to write an extension which will override the base commit on "arc diff" or "arc patch" without re-implementing a ton of functionality or maintaining a ton of copy and pasted code. "arc patch" operates by looking for the parent commit identifier stored on Phabricator ("sourceControlBaseRevision") and then attempting to apply on top of it. Since we can't easily extend this logic without requiring a forked Arcanist, we're going to need one of: a custom patching client, an arc extension which adds a new patching command, or a custom wrapper around arc for patching. The vcs used by the commit author who uploaded the patch is also stored in Phabricator ("SourceControlSystem"), so if we could do mapping client side on patch application we'd be able to handle any vcs from the author and know when to map. To make this work for every case though you'd need every reviewer to also have git-cinnabar to do mappings. So instead I'd propose two options to choose from here: 1. Call a remote service to do the git <-> hg mapping in our custom patcher 2. Use a custom "arc diff" command for git-cinnabar users which will translate commit ids before upload so that all ids in Phabricator are hg based. The custom patcher on git machines will already have git-cinnabar installed and can use it to translate locally. tl;dr: To solve the immediate git <--> hg patch application issues we need to build a custom patcher for reviewers to use in addition to either a remote commit mapping service or a custom "arc diff" alternative
un-assigning as I'm not actually implementing any of this right now.
Assignee: smacleod → nobody
ekr, do you have any opinions on the options listed in comment 5? We're ready to work on this any time.
Discussed this off bugzilla with ekr, going to post a follow-up here soon.
I've filed an issue with Phacility to get their thoughts before we make a final decision on approach here: https://admin.phacility.com/PHI572
Assignee: nobody → smacleod
I created a solution based on the Phacility proposal. https://phabricator.services.mozilla.com/D1034
zalun's doing the actual implementation; reassigning to him.
Assignee: smacleod → pzalewa
Status: NEW → ASSIGNED
Solution proposed in https://admin.phacility.com/PHI572 enhanced with an ability to work when the diff's base commit is not public (checking for '00..' as a result of h2git or git2hg). Checked with `arc diff` and `arc patch` commands.
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.