Closed Bug 1443375 Opened 2 years ago Closed 2 years ago

revisions created with arc diff (hg) can't be applied with arc patch (git)

Categories

(Conduit :: General, enhancement)

enhancement
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ekr, Assigned: zalun)

References

Details

(Keywords: conduit-story, conduit-triaged)

Attachments

(1 obsolete file)

Because they don't translate the commit ID
Working on writing something up about the issues and possible solutions here.
Assignee: nobody → smacleod
Alright, based on reading the slack discussion that prompted this bug,
I'm going to summarize what I think the desire is here:

Phabricator revisions can be applied locally to get the same working copy
the author had when posting the revision. Or more specifically, Phabricator
revisions can be applied locally to the parent commit they are based upon.
Both the creator of the revision and the patcher can be using either hg or
git and it should still work.

An important aspect of this problem though is if we'd like to be able to
apply revisions that are based on non-public or unrelated parents. This
comes up quite a bit with gecko developers who will have several bugs /
features on the same branch, or changes based on others which have not
landed.

# Stock Phabricator Solutions:

## Staging Area

The robust Phabricator/Arcanist workflow to enable this revolves around
using a "Staging Area"[1] repository. Phabricator allows configuring a
repository that "arc diff" will also push changes to when uploading the diff.
This would mean the commits used to create a revision exist publicly and could
be pulled for exact reproduction. 

This feature is currently git only and would not work with confidential
changes.

Without a Staging Area, you need to either make sure all commits leading up
to those in a revision exist publicly with some other mechanism, or you need
to store enough meta-data along with the revision that the patch can be applied
locally.


## Storing the Parent Commit

"arc diff" without staging will include the commit identifier for the
parent commit. "arc patch" looks for this parent locally, and attempts to
apply the diff on top of it. If this parent doesn't exist in your local
repository it is still possible to apply the patch on top of another commit
but this isn't guaranteed to work.

In the hg -> git or git -> hg case the parent will never exist since the
commit identifiers aren't translated (a feature arc does not have). For cases
where the scm tool matches it may still break, since there is no requirement
that the uploaded diff is based on a commit the developer has pushed or
uploaded elsewhere.

This uploaded meta-data is also completely optional, custom clients and web UI
uploads aren't required to set any of it. It can even be set incorrectly to be
misleading or malicious.


# Different levels of functionality

So it doesn't appear any of the official Phabricator workflows and tools
work for us or fully solve our problem, as-is. If we're going to build
something that solves this, or official Phabricator tools gain features
to support Mozilla, I see a few different levels of functionality as options:

 1. Full commit reproduction. The adopted workflow and tools would allow
    reproducing a revision's commit or commits identically in your local
    repository, including all history leading up to it (hg <--> git
    translation being deterministic and supported).
 2. Revisions applied as patches (commits are still squashed together) but
    parent commit to be applied to is guaranteed to be accessible or applied
    first as a patch itself (enforced by tools/workflow).
 3. Patch application for revisions based on already public changes works,
    regardless of the version control system on either end.


# Solution:

The details of the solution we choose are dependent on the level of
functionality we'd like to hit. For example, requiring Arcanist be used and
having it improved upstream most likely would never get to the level of 1 or
even 2. I'd imagine 2 or 3 could probably be hit with a custom client or
Arcanist extension quite easily. If we'd really like the functionality
described in 1 we'd need both a custom client and custom server side
infrastructure.

I'm going to leave out any more implementation specific details as this is
already quite a long reply.

[1] https://secure.phabricator.com/book/phabricator/article/harbormaster/
Hi Stephen,

Taking a step back from this. 

What is the functionality we now provide with only one revision control tool?

I.e. say that you have two diffs, D1 and D2, where D2 is based on D1. If I do arc patch D2, does that work properly?
Reading more closely, it seems like right now the answer is "no" but that you have a somewhat straightforward partial fix for git (the staging area) but not hg. Is that correct? Also, how does it work with patch stacks?
 
Assuming that this is correct, then it seems like the immediate task ought to bring non-matching VCSs up to parity with matching VCSs (presumably with commit id translation). That doesn't solve every use case, but it does solve many.
:ekr and I discussed this off bugzilla.

The immediate concern is making patch application work when the author and
reviewer are using distinct VCS. Mismatch is already a common problem and is
sure to come up frequently after general release.

Particularly there are 4 cases, based around what the change in review has
as a parent commit, and how the author's VCS compares to the reviewers.
  a) Parent commit exists publicly, VCS match (currently works)
  b) Parent commit exists publicly, VCS differ
  c) Parent commit not public, VCS match
  d) Parent commit not public, VCS differ

While it would be nice if these 4 cases all worked (more so that our tools
and workflows we could guarantee c and d were impossible), we should focus on
making the common cases (a + b) work. This is much more achievable in a short
time, requiring less/simpler Mozilla specific code.

I'll tie this back to comment #2's levels of functionality.

(In reply to Steven MacLeod [:smacleod] from comment #2)
>  1. Full commit reproduction. The adopted workflow and tools would allow
>     reproducing a revision's commit or commits identically in your local
>     repository, including all history leading up to it (hg <--> git
>     translation being deterministic and supported).
>  2. Revisions applied as patches (commits are still squashed together) but
>     parent commit to be applied to is guaranteed to be accessible or applied
>     first as a patch itself (enforced by tools/workflow).
>  3. Patch application for revisions based on already public changes works,
>     regardless of the version control system on either end.


The support we're aiming for would be level 3. This means we don't need any sort
of storage for intermediate commits that would usually not be public or the
machinery for creating, uploading, or applying them. We should be able to get
away with implementing some form of commit id translation.

We have a few options for solving this. To get started though I think it's
probably quite unlikely that support for our workflow could land upstream in
Arcanist. Phabricator has support for hosting repositories itself, I'd imagine
any upstream solution that allows both git and hg clients would probably be much
easier if using hosted repositories is mandatory. We do not use the hosted
repositories at mozilla and translating commits without a standard mapping is
hard.

Our best bet for having that standard mapping is requiring git-cinnabar be used
for all git clients or at least standardize on a repo to clone from that would
have the same mapping as a cinnabar client. If we can rely on a standard mapping
it's just a question of where / when we perform the mapping and how the
code applying the patch for the reviewer gets this mapping (or a pre-mapped
commit identifier).

Diving down the Arcanist source rabbit hole, it doesn't appear possible to write
an extension which will override the base commit on "arc diff" or "arc patch"
without re-implementing a ton of functionality or maintaining a ton of copy and
pasted code. "arc patch" operates by looking for the parent commit identifier
stored on Phabricator ("sourceControlBaseRevision") and then attempting to apply
on top of it. Since we can't easily extend this logic without requiring a forked
Arcanist, we're going to need one of: a custom patching client, an arc extension
which adds a new patching command, or a custom wrapper around arc for patching.

The vcs used by the commit author who uploaded the patch is also stored in
Phabricator ("SourceControlSystem"), so if we could do mapping client side
on patch application we'd be able to handle any vcs from the author and know
when to map. To make this work for every case though you'd need every reviewer
to also have git-cinnabar to do mappings.

So instead I'd propose two options to choose from here:
  1. Call a remote service to do the git <-> hg mapping in our custom patcher
  2. Use a custom "arc diff" command for git-cinnabar users which will
     translate commit ids before upload so that all ids in Phabricator
     are hg based. The custom patcher on git machines will already have
     git-cinnabar installed and can use it to translate locally.


tl;dr: To solve the immediate git <--> hg patch application issues we need to
build a custom patcher for reviewers to use in addition to either a remote
commit mapping service or a custom "arc diff" alternative
un-assigning as I'm not actually implementing any of this right now.
Assignee: smacleod → nobody
ekr, do you have any opinions on the options listed in comment 5?  We're ready to work on this any time.
Flags: needinfo?(ekr)
Discussed this off bugzilla with ekr, going to post a follow-up here soon.
Flags: needinfo?(ekr)
I've filed an issue with Phacility to get their thoughts before we make a final decision on approach here: https://admin.phacility.com/PHI572
Assignee: nobody → smacleod
I created a solution based on the Phacility proposal.
https://phabricator.services.mozilla.com/D1034
zalun's doing the actual implementation; reassigning to him.
Assignee: smacleod → pzalewa
Status: NEW → ASSIGNED
Depends on: 1457847
Depends on: 1457848
Depends on: 1457849
Depends on: 1457856
Attached file WIP using git-cinnabar in Arcanist (obsolete) —
Solution proposed in https://admin.phacility.com/PHI572 enhanced with an ability
to work when the diff's base commit is not public (checking for '00..' as a
result of h2git or git2hg).
Checked with `arc diff` and `arc patch` commands.
Depends on: 1458191
Depends on: 1459193
Attachment #8972262 - Attachment is obsolete: true
Depends on: 1459199
Depends on: 1459922
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.