Open Bug 1288845 Opened 8 years ago Updated 7 years ago

Allow "remote backouts" on the autoland repo

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: gps, Assigned: gps)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

+++ This bug was initially created as a clone of Bug #1286022 +++
Depends on: 1288859
(THIS IS STILL A WORK IN PROGRESS. DO NOT R+.)

Currently, backouts are performed by creating a local commit that
reverts the changes from a "bad" changeset and then pushing the result.
The DAG range between the bad changeset and its backout is forever
"bad."

If you bisect in the middle of that range, your build could be busted.
This makes automated bisection tools less reliable and less useful.

If you are performing repository archeology, backout commits (and
their backed out commits) make your life harder by polluting history
and adding complexity.

Backout commits add little value and introduce chaos. We should
minimize their frequency.

This commit changes our approach to backouts on the autoland
repository by introducing a mechanism whereby users with write access
to the autoland repository can instruct the remote server to "drop"
changesets. Instead of creating a changeset that reverts an earlier
one, the "bad" changeset is discarded and all changesets that came
after it are effectively rebased on top of the "bad" changeset's
parent. Imagine a set of blocks stacked on top of one another. This
mechanism is like flicking one block out of the stack: every block
above it settles one level lower.
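
For illustration (this diagram is not part of the patch itself), with B
as the "bad" changeset:

  before:  A -- B -- C -- D
  after:   A -- C' -- D'      (B dropped; C and D rewritten as C' and D')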

We introduce a "remoteadmin" extension that is configured on the
hg.mozilla.org server. This extension provides the `hg purgebackout`
command to perform the backout mechanism described above. We add
plumbing to our SSH login "shell" to proxy "autoland-backout"
requests to this command, e.g.
`ssh hg.mozilla.org autoland-backout <rev>`.

Because resolving file merges is difficult, we don't let them occur:
if a subsequent changeset touches files that were impacted by a
"dropped" changeset, we also drop that changeset.

We do something similar for changesets in the same push as a dropped
changeset.

We produce obsolescence markers recording the relationship between
rewritten changesets. We also copy pushlog entries from rewritten
changesets. To downstream automation, this will appear as if new
pushes were performed. This ensures there are automation results
for all push heads, even if there are rewrites.
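
As a hedged sketch of the marker-recording step described above (the
`replacements` mapping and the function name are illustrative, not the
patch's exact code):

  from mercurial import obsolete

  def record_drop_markers(repo, replacements):
      # replacements maps an old node to its rewritten successor node,
      # or to None for changesets that were dropped outright.
      relations = []
      for oldnode, newnode in replacements.items():
          precursor = repo[oldnode]
          successors = (repo[newnode],) if newnode else ()
          relations.append((precursor, successors))
      # createmarkers() records one obsolescence marker per
      # (precursor, successors) pair inside its own transaction.
      obsolete.createmarkers(repo, relations)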

Review commit: https://reviewboard.mozilla.org/r/66630/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/66630/
Attachment #8773998 - Flags: review?(mh+mozilla)
Depends on: 1289514
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review68600

::: hgserver/pash/hg_helper.py:645
(Diff revision 1)
> +    elif ssh_command.startswith('autoland-backout '):
> +        if not enable_autoland_backout:
> +            print('autoland-backout command not available')
> +            sys.exit(1)
> +
> +        args = [HG, 'purgebackout'] + ssh_command.split()[1:]
> +        os.chdir('/repo/hg/mozilla/integration/autoland')
> +        os.execv(HG, args)

I'm not going to pretend I know this script: what is ensuring only sheriffs run this command?

::: hgserver/remoteadmin.py:42
(Diff revision 1)
> +def purgebackout(ui, repo, *revs, **opts):
> +    """Backout changesets by obsoleting and rebasing.

I'm not sure we should use the term backout to describe this feature. I prefer "drop" from the commit message.

::: hgserver/remoteadmin.py:61
(Diff revision 1)
> +    publishing = repo.ui.configbool('phases', 'publish', True)
> +    if publishing:
> +        raise error.Abort('cannot run command on publishing repos')
> +

meh, I think you can lift this check. phases.publish=True doesn't necessarily mean everything in the repo is public. And if it is, that will be caught by the phase checks on the revisions to be purged.

::: hgserver/remoteadmin.py:94
(Diff revision 1)
> +    warned = False
> +    for rev in backoutrevs:
> +        ctx = repo[rev]
> +        if ctx.phase() == phases.public:
> +            ui.warn('cannot backout public changeset: %s %s\n' %
> +                    (short(ctx.node()), ctx.description().splitlines()[0]))

always amazed that there is no ctx.summary() unifying all the code that does the same thing in different ways in mercurial.

::: hgserver/remoteadmin.py:107
(Diff revision 1)
> +        # While we're iterating, assemble the set of push IDs containing
> +        # changesets to back out.
> +        if pushlog:
> +            push = pushlog.pushfromchangeset(ctx)
> +            if push:
> +                backoutpushes.add(push[0])

It may or may not be clearer to do something like
  pushid, who, when, node = push
  backoutpushes.add(pushid)

::: hgserver/remoteadmin.py:119
(Diff revision 1)
> +    # Above, we collected the set of revisions that were explicitly requested
> +    # for backout and the pushlog pushes associated with them.
> +    #
> +    # Our strategy now is to start with the oldest changeset to be backed out
> +    # and walk towards the DAG head. For each changeset, we examine to see if
> +    # it is to be dropped. If it is good, we graft it on top of the last good

Somehow felt hard to parse. s/If it is good/If not/ would have been clearer to me.

::: hgserver/remoteadmin.py:127
(Diff revision 1)
> +    # Pushes work similarly. If we encounter a changeset that is part of a
> +    # push that contains a backout, we drop it. The thinking here is that
> +    # changesets from the same push that come after a bad changeset are
> +    # likely impacted by the bad changeset. While this may cause some false
> +    # positives, it is better to be safe than sorry.

Why not expect the sheriffs to make the distinction themselves, with a shorthand if necessary?
 Like:
  - `hg purgebackout $rev` when they want to pinpoint what is to be dropped
  - `hg purgebackout push($rev)` when they want the whole push.

::: hgserver/remoteadmin.py:143
(Diff revision 1)
> +    lastctx = repo[backoutrevs.min()].p1()
> +
> +    for ctx in repo.set('%d::', backoutrevs.min()):
> +        node = ctx.node()
> +        desc = ctx.description().splitlines()[0]
> +        changedfiles = set(ctx.manifest().diff(ctx.p1().manifest()).keys())

I know it /can/, hypothetically, be wrong, if someone has used an ancient mercurial, but why not simply use ctx.files()? The reason files are stored there is to avoid doing diffs like that in the first place. If the answer is "because merges", considering p1 and p2 might not be in the order you expect them to be, the diff is not right either. (and yes, ctx.files() is likely wrong for merges, too, but for merges, you have to check against the development line you're backing out from, which may or may not be p1)
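
For reference, a sketch of the two ways of computing "changed files"
being debated here, for a Mercurial changectx `ctx`:

  # What the changelog entry recorded at commit time:
  recorded = set(ctx.files())
  # Recomputed by diffing manifests against the first parent:
  recomputed = set(ctx.manifest().diff(ctx.p1().manifest()).keys())
  # For ordinary (non-merge) changesets these normally agree; for merges,
  # neither is meaningful until you decide which parent line you are
  # comparing against, which is the reviewer's point.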

::: hgserver/remoteadmin.py:146
(Diff revision 1)
> +        node = ctx.node()
> +        desc = ctx.description().splitlines()[0]
> +        changedfiles = set(ctx.manifest().diff(ctx.p1().manifest()).keys())
> +        impacted.append(node)
> +
> +        def adddropped(reason):

add_dropped, easier on the eyes :)

::: hgserver/remoteadmin.py:197
(Diff revision 1)
> +        if changedfiles & impactedfiles:
> +            if len(ctx.parents()) > 1:
> +                raise error.Abort('merge changeset %s changed files impacted by a '
> +                                  'backout; cannot proceed' % short(node),
> +                                  help='perform a traditional backout commit')
> +            adddropped('file merge necessary')

+maybe

::: hgserver/remoteadmin.py:202
(Diff revision 1)
> +            adddropped('file merge necessary')
> +            continue
> +
> +        ui.write('grafting %s: %s\n' % (short(node), desc))
> +
> +        newctx = _graft(repo, ctx, lastctx.node(), ctx.p2().node())

I think it would be more desirable to just create a list of commands for histedit, and let it do the job. Bonus: it will handle obsolete markers for you. Although now that I look at hgext/histedit.py, I'm horrified.

Coming back here after reading _graft and the mozhg.rewrite module, why are you not using replacechangesets?

Note: I haven't checked the whole thing does the right thing for the graft/rebase/call-it-what-you-want. I hope there's an extensive selection of tests, with all kinds of cases.
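
For context, the histedit approach suggested here would roughly amount
to feeding a non-interactive plan via `hg histedit --commands`, along
these lines (node hashes borrowed from the tests below; a sketch, not a
tested invocation):

  drop ebbf0c1f3aaa p1c1
  pick 6d7db80a435e p1c2
  pick 63a61534cf94 p2c1
  pick ea75f4c834c9 p2c2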

::: hgserver/remoteadmin.py:243
(Diff revision 1)
> +            if node not in remaining:
> +                continue
> +
> +            push = pushlog.pushfromchangeset(urepo[node])
> +            if not push:
> +                return

silent return?

::: hgserver/remoteadmin.py:255
(Diff revision 1)
> +            newnodes = [replacements[n] for n in pushbnodes if n in replacements]
> +            if not newnodes:
> +                continue
> +
> +            lastpushid += 1
> +            ui.write('copying pushlog entry %d to %d (from %s)\n' % (

'creating new pushlog entry #%d by %s from entry #%d'

::: hgserver/remoteadmin.py:261
(Diff revision 1)
> +    p = lambda: tr.writepending() and repo.root or ''
> +    repo.hook('pretxnchangegroup', throw=True, pending=p, **hookargs)

I see the same thing in pushrebase, but I don't know where this comes from. What is supposed to call that pending callback? I don't see anything related to that in mercurial.

::: hgserver/tests/test-autolandbackout.t:62
(Diff revision 1)
> +
> +A member of the autoland group can perform a remote backout
> +
> +  $ sheriffssh autoland-backout ebbf0c1f3aaa
> +  dropping (requested explicitly) ebbf0c1f3aaa: p1c1
> +  grafting 6d7db80a435e: p1c2

Seeing this: while it would be cumbersome, it would be nicer to print out the new changeset id here. (It would also help make sense of the test output, e.g. for the obsolescence markers.)

::: hgserver/tests/test-autolandbackout.t:66
(Diff revision 1)
> +  recording new pushlog entry for rewritten push 3
> +  recording new pushlog entry for rewritten push 4

That doesn't match the message I suggested in the rewording above.

::: hgserver/tests/test-autolandbackout.t:68
(Diff revision 1)
> +  grafting 63a61534cf94: p2c1
> +  grafting ea75f4c834c9: p2c2
> +  no username found, using 'sheriff@mozilla.com@*' instead (glob)
> +  recording new pushlog entry for rewritten push 3
> +  recording new pushlog entry for rewritten push 4
> +  (not updating pushlog since changesets come from purgebackout)

No idea where this message comes from.

::: hgserver/tests/test-autolandbackout.t:89
(Diff revision 1)
> +  63a61534cf94a09c152ec8a172007a93272afd45 82e161f27af3c9330af67b9013a2748fdfd07967 0 (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +  6d7db80a435e3decb099bade31d46ae6db25f66e f075dc78cce22e9d5980da78e2bc65ae0939c0ba 0 (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +  ea75f4c834c9c3aa24c5cc036297db796ea14d19 d23e2d2f13bed83398168f8f0492bd4629ff9705 0 (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +  ebbf0c1f3aaaffc9dd7d597f5114babc5c9e1272 0 {77538e1ce4bec5f7aac58a7ceca2da0e38e90a72} (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +
> +  $ hg pull ${HGWEB_0_URL}integration/autoland

For my own curiosity, what happens if $user had local stuff on top of p2c2?

::: hgserver/tests/test-purgebackout-basic.t:82
(Diff revision 1)
> +  $ hg -q commit -A -m commit1
> +  $ hg rm bar
> +  $ hg -q commit -A -m commit2
> +  $ hg -q up null
> +
> +  $ hg purgebackout fe212abfd273

out of the blue, I have no idea which of the commits fe212abfd273 refers to.

::: hgserver/tests/test-purgebackout-revision-validation.t:91
(Diff revision 1)
> +  $ hg purgebackout 84c442544a2a
> +  cannot backout merge: 84c442544a2a merge
> +  abort: cannot perform backouts on selected revisions
> +  [255]
> +
> +  $ cd ..

None of the tests are checking that the manifests and files are right. Even if they did, none of the tests have significant file data to assess whether the manifests and files are right.

There is also no test of rewriting a merge that would be on top of a dropped changeset.
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review68614
Attachment #8773998 - Flags: review?(mh+mozilla)
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review68600

> Why not expect the sheriffs to make the distinction themselves, with a shorthand if necessary?
>  Like:
>   - `hg purgebackout $rev` when they want to pinpoint what is to be dropped
>   - `hg purgebackout push($rev)` when they want the whole push.

The arguments are revsets. And the pushlog extension is running. So they could specify a revset defined by the pushlog extension. I've filed bug 1295780 to implement missing revsets to make this easy for sheriffs.
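
A minimal sketch of how a command resolves such revset arguments, using
Mercurial's standard helper (this mirrors, but is not necessarily
identical to, what the patch does):

  from mercurial import scmutil

  def resolve_drop_args(repo, revs):
      # revs is the tuple of command-line arguments; each entry may be a
      # plain changeset hash or any revset understood by the loaded
      # extensions (e.g. a pushlog-provided revset once bug 1295780 lands).
      return scmutil.revrange(repo, revs)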

> I think it would be more desirable to just create a list of commands for histedit, and let it do the job. Bonus: it will handle obsolete markers for you. Although now that I look at hgext/histedit.py, I'm horrified.
> 
> Coming back here after reading _graft and the mozhg.rewrite module, why are you not using replacechangesets?
> 
> Note: I haven't checked the whole thing does the right thing for the graft/rebase/call-it-what-you-want. I hope there's an extensive selection of tests, with all kinds of cases.

I thought about doing this initially. I can't remember why I decided against it. It may have been after I looked at the histedit code, which is pretty horrible :)

That being said, I think I discarded histedit before I implemented all the logic for "contaminating" changesets. Let me take another stab at having this use a histedit "plan" to drive the execution.

> I see the same thing in pushrebase, but I don't know where this comes from. What is supposed to call that pending callback? I don't see anything related to that in mercurial.

I cribbed it from pushrebase. I'll dig deeper.

> None of the tests are checking that the manifests and files are right. Even if they did, none of the tests have significant file data to assess whether the manifests and files are right.
> 
> There is also no test of rewriting a merge that would be on top of a dropped changeset.

Yeah. Test coverage in this patch is not great. I wanted to get your feedback on the approach (specifically with regards to not attempting file merges) before I spent the effort on writing them.
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review68600

> I thought about doing this initially. I can't remember why I decided against it. It may have been after I looked at the histedit code, which is pretty horrible :)
> 
> That being said, I think I discarded histedit before I implemented all the logic for "contaminating" changesets. Let me take another stab at having this use a histedit "plan" to drive the execution.

I remember now.

1) histedit likes to interact with the working directory
2) histedit's state tracking isn't great (makes pushlog updating a bit harder)

Since we'd only be updating parent info in changesets, I figured it was easier to just implement the changeset rewriting ourselves.

FWIW, there are tentative plans to add generic history rewriting APIs to Mercurial. https://www.mercurial-scm.org/wiki/PlanHistoryRewritePlan covers some of that. Unfortunately those aren't in place yet.
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review68600

> No idea where this message comes from.

It comes from the pushlog extension. I'll get a patch up that fixes this.
Depends on: 1296368
The precursor work to getting this deployed on hg.mozilla.org is effectively done. (A lot of work had to be done to teach the pushlog, replication, and event notification systems about obsolete/hidden changesets.)

We still have a bit of work between when this is deployed and when we can use it. Specifically:

* pulsebot needs to send notifications when a changeset is dropped
* build/test automation needs to react gracefully when doing VCS foo on a hidden changeset
* we should consider deploying a mechanism that stops pending/running automation against changesets that are dropped
(In reply to Gregory Szorc [:gps] from comment #11)
> * we should consider deploying a mechanism that stops pending/running
> automation against changesets that are dropped

If there is an exchange/topic I can listen to, I can cancel the jobs on Buildbot for you.
I think we can teach mozilla-taskcluster to listen to it as well (besides pulse messages from TH) and cancel the jobs.
Depends on: 1298910
What should we do for performance data in perfherder that was generated against a discarded push? Discard it as well?
Flags: needinfo?(gps)
(In reply to William Lachance (:wlach) from comment #13)
> What should we do for performance data in perfherder that was generated
> against a discarded push? Discard it as well?

It may depend. It may be beneficial to retain the data. But you probably don't want Perfherder alerts looking at dropped changesets: it will likely get confused because the changesets aren't in "linear" order.
Flags: needinfo?(gps)
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review73390

::: hgext/drop/__init__.py:90
(Diff revision 4)
> +    if not droprevs:
> +        raise error.Abort('could not find revisions to drop',
> +                          hint='arguments must be revisions or revsets that '
> +                               'resolve to known changesets')
> +
> +    droppushes = collections.defaultdict(set)

droppushes is kind of a misnomer, considering the stored values are revs to be dropped, grouped by pushes.

::: hgext/drop/__init__.py:164
(Diff revision 4)
> +        # There are edge cases where ctx.files() may not be accurate. Compute
> +        # the file changes from the manifest diff. This is slower but more
> +        # accurate.
> +        changedfiles = set(ctx.manifest().diff(ctx.p1().manifest()).keys())

The previous review said p1 was not necessarily the right thing to compare to. That is still true and not addressed.

::: hgext/drop/__init__.py:175
(Diff revision 4)
> +            if len(ctx.parents()) > 1:
> +                raise error.Abort('cannot drop merge changesets',
> +                                  help='perform a traditional backout commit')

You're accepting to rewrite merges, so why not allow to drop them as well?

::: hgext/drop/__init__.py:181
(Diff revision 4)
> +            push = pushlog.pushfromchangeset(ctx)
> +            if push:
> +                droppushes[push.pushid].add(node)

If push is None here, it means bad things. I'm not sure silently ignoring the rev is a good thing to do.

Same comment applies to the same construct at the beginning of the function.

::: hgext/drop/__init__.py:198
(Diff revision 4)
> +        # However, if the first or a middle changeset is bad, all bets are off
> +        # as to the state of the following changesets. In many scenarios, those
> +        # subsequent changesets rely on work done by the bad one. So, the only
> +        # safe thing to do is drop all changesets in a push coming after a bad
> +        # one.

I still don't agree with this. There are many cases where you can get away with just backing out a changeset in the middle of a push; it should be the sheriffs' prerogative to pick what needs to be dropped or not.

In the current world, at least. But I guess if autoland never grows a way to group-push several bugs (hint: I wish it did), then it's rarely going to be true in autoland.

::: hgext/drop/__init__.py:207
(Diff revision 4)
> +        # one.
> +        push = pushlog.pushfromchangeset(ctx)
> +        if push and push.pushid in droppushes:
> +            # And the changeset comes *after* a changeset in the push that
> +            # was dropped.
> +            if ctx.rev() in repo.revs('%ln::', droppushes[push.pushid]):

It would be better to put an upper bound here (push.nodes[-1])
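
A hedged sketch of that bounded check (assuming `push.nodes` holds hex
node strings ordered oldest to newest, as `push.nodes[-1]` implies; if
they are binary nodes, `%s` would become `%n`):

  # Only treat the changeset as contaminated if it descends from a
  # dropped changeset *and* is not beyond the last changeset of the push.
  bounded = repo.revs('(%ln::) and ::%s',
                      droppushes[push.pushid], push.nodes[-1])
  if ctx.rev() in bounded:
      ...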

::: hgext/drop/__init__.py:240
(Diff revision 4)
> +        newman = newctx.manifest()
> +        for path in sorted(set(ctx.files()) & set(newctx.files())):
> +            # TODO add test for file deleted
> +            if path in oldman and oldman[path] != newman[path]:
> +                raise error.Abort('file %s different after grafting' % path,
> +                                  hint='you found a bug; file it')

s/file/please \0/

::: hgext/drop/__init__.py:292
(Diff revision 4)
> +    # Fire hook indicating changesets are about to be committed.
> +    if added:
> +        p = lambda: tr.writepending() and repo.root or ''
> +        repo.hook('pretxnchangegroup', throw=True, pending=p, **hookargs)

Still no indication of what is supposed to call the pending callback. Cf. the previous review: I don't see anything related to that in mercurial, nor in v-c-t. There's only the same pattern in pushrebase, and that doesn't explain what is supposed to call the pending callback.

::: hgext/drop/__init__.py:323
(Diff revision 4)
> +        # ctx.files() doesn't return the full set of files on merge
> +        # commits. We diff the manifest with p1's manifest to do the
> +        # right thing.
> +        files = ctx.manifest().diff(ctx.p1().manifest()).keys()

I'm surprised I didn't comment on that in the previous round. That isn't what mercurial does for merges at all.

::: hgext/drop/tests/test-basic.t:17
(Diff revision 4)
> +Backing out a changeset independent of all other changesets works
> +
> +  $ echo file0 > file0
> +  $ hg -q commit -A -m commit0
> +  $ hg -q push
> +  recorded push in pushlog

Should probably file a separate bug for this: the pushlog extension shouldn't print this out with -q.

::: hgext/drop/tests/test-basic.t:79
(Diff revision 4)
> +  creating new pushlog entry #4 by hguser from entry #3
> +
> +  $ hg log -G
> +  o  changeset:   3:1d4ac05f1ac0
> +  |  tag:         tip
> +  |  parent:      0:77538e1ce4be

Funny that mercurial displays the parent in that case.

::: hgext/drop/tests/test-graft-merge.t:86
(Diff revision 4)
> +  |  tag:         tip
> +  |  user:        Test User <someone@example.com>
> +  |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  |  summary:     after merge
> +  |
> +  o    changeset:   6:c742f0b84388

You should check the files associated with the merge commit, which would show you that what _graft does is wrong. The simplest command-line way to do that is to check the output of hg debugdata -c.

(Also, you should probably check more contrived merges)

::: hgext/drop/tests/test-impacted-files.t:54
(Diff revision 4)
> +
> +Backing out a changeset touching file X will back out descendants touching that file
> +
> +  $ hg drop 137d0337a5d2
> +  dropping 137d0337a5d2 (requested explicitly): file0 1
> +  dropping 7d439224fbae (changes impacted files): file0 2

it might be nice to indicate which changeset impacted the same files

::: hgext/drop/tests/test-impacted-merge.t:40
(Diff revision 4)
> +  | |  parent:      1:1b1b9817e7c9
> +  | |  user:        Test User <someone@example.com>
> +  | |  date:        Thu Jan 01 00:00:00 1970 +0000
> +  | |  summary:     head 2 commit 1
> +  | |
> +  o |  changeset:   3:4ce85fa40723

You should also test a drop of this changeset, which doesn't impact files from other changesets, and check that the merge commit is still good after the drop.
Attachment #8773998 - Flags: review?(mh+mozilla)
Comment on attachment 8785082 [details]
hgserver: pash command for remote backouts (bug 1288845);

https://reviewboard.mozilla.org/r/74404/#review73406

::: hgserver/tests/test-autoland-drop.t:89
(Diff revision 1)
> +  63a61534cf94a09c152ec8a172007a93272afd45 beb391348e4b58d3290e6f117d21b6610628dde8 0 (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +  6d7db80a435e3decb099bade31d46ae6db25f66e 0 {ebbf0c1f3aaaffc9dd7d597f5114babc5c9e1272} (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +  ea75f4c834c9c3aa24c5cc036297db796ea14d19 60e23897a0e5aeca8a660ca5bf14a8377b37f873 0 (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +  ebbf0c1f3aaaffc9dd7d597f5114babc5c9e1272 0 {77538e1ce4bec5f7aac58a7ceca2da0e38e90a72} (*) {'user': 'sheriff@mozilla.com@*'} (glob)
> +
> +  $ hg pull ${HGWEB_0_URL}integration/autoland

Asking the curiosity question again: what happens if $user had local stuff on top of p2c2?
Attachment #8785082 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review73390

> You're accepting to rewrite merges, so why not allow to drop them as well?

In short, I don't think we can... easily. If you drop a merge, you also have to a) drop the changesets from one parent of that merge and b) temporarily have multiple heads in the repo. In the case where the merge came from e.g. inbound, the inbound changesets will be in the public phase, and you can't obsolete public-phase changesets. Your only options are to leave them around as an unmerged head or to strip them; I don't like either. There are potentially phase hacks we could apply as well, but those could lead to badness, such as the creation of an obsolescence marker for a public-phase changeset.

Refusing to drop merge changesets (at least when one parent has public changesets) feels appropriate.
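
The constraint being described, in sketch form (it mirrors the phase
check quoted earlier in the review; the function name is illustrative):

  from mercurial import error, phases
  from mercurial.node import short

  def ensure_droppable(repo, ctx):
      # Obsolescence markers should only be created for non-public (draft)
      # changesets; refuse early rather than leaving an unmerged public
      # head around or stripping public history.
      if ctx.phase() == phases.public:
          raise error.Abort('cannot drop public changeset: %s'
                            % short(ctx.node()))
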
I had a few questions regarding this.

1) on autoland when we go to backout a change, that will strip the original change and reland the pushes in between, correct?
2) won't patches landing on inbound/fx-team end up with original + backout?
3) what about the case when we land a followup patch to fix bustage? That is very common on inbound. Right now I would assume the number of 'backouts' is half the number of bustage occurrences we have.
4) what if we backout a change 3 days later (say a talos issue?), then we have history of bad revision + backout?
5) what if we find an issue with PGO 40 revisions later?  Will we backout and retest all 40 revisions?
6) when we generate performance alerts for data between a bad+backout, we will get them for a revision which doesn't exist?  Have we figured out how to fix our alerting algorithms and data sets to account for removal of revisions + all related data?
7) I want to know based on recent trends on our integration branches (autoland/inbound/fx-team) how many cases we will backout and how many revisions we will need to re-land.  In addition, knowing the jobs of the original revisions which did start/run, how many machine hours we will have to retest by landing the revisions again?  Given the fact that we cannot deliver try jobs with our current load, I would like to have an idea of what kind of load we would be introducing.
8) how will we work with downstream tools like AWFY, AWSY, Autophone, external-media-tests, etc.?
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review73390

> I still don't agree with this. There are many cases where you can get away with just backing out a changeset in the middle of a push; it should be the sheriffs' prerogative to pick what needs to be dropped or not.
> 
> In the current world, at least. But I guess if autoland never grows a way to group-push several bugs (hint: I wish it did), then it's rarely going to be true in autoland.

Yes, this needs more consideration.

Something else I just considered: the logic as implemented is too liberal because MozReview :/

MozReview doesn't handle partial series landings very well. So, e.g., if you have a series of 3 commits (A -> B -> C), land all 3, C gets backed out, and you then push C back to MozReview, you'll either get a new review request (because the original series was closed) or, if the series was reopened, pushing C' will mask the reviews on the landed commits A and B. Ideally, pushing C' would result in the review request for C being reopened while retaining data/linking for A and B.

I really dislike this behavior in MozReview/Autoland: our tools should encourage forward progress, not an all-or-nothing approach (for many of the same reasons I prefer microcommits over massive single commits: progress is progress, no matter how small).

The main reason MozReview doesn't handle partial series landings (and multi-bug series) is that they are difficult to implement. And, for better or worse, our landing and backout model up to this point has been all or nothing. The sheriffs default to backing out an entire push even if a single commit is bad. I know I've been upset by this policy before - having a 10-commit series backed out because the last commit broke something.

Until MozReview handles partial series landings better, I think we'll need to change this logic to drop the entire push, even if just the last commit is bad. Otherwise the alternative will create chaos in the review world. This disappoints me greatly :/
Comment on attachment 8773998 [details]
mozbackout: extension to discard changesets (bug 1288845);

https://reviewboard.mozilla.org/r/66630/#review73390

> Funny that mercurial displays the parent in that case.

I /think/ this is because p1 != rev-1. A sad consequence of exposing the integer revision numbers in the output. Every few months there is talk about hiding the integer revisions from the UI, but nothing ever comes of it.
(In reply to Joel Maher ( :jmaher) from comment #18)
> I had a few questions regarding this.
> 
> 1) on autoland when we go to backout a change, that will strip the original
> change and reland the pushes in between, correct?

That is essentially what is happening in the current patches, yes.

> 2) won't patches landing on inbound/fx-team end up with original + backout?

Correct. The plan is to eventually not have an inbound/fx-team so this goes away long term and we only have autoland's approach.

> 3) what about the case when we land a followup patch to fix bustage?  that
> is very common on inbound?  Right now I would assume the number of
> 'backouts' are half the number of bustage occurances we have.

We won't do that on autoland. The "bad" commit(s) will get dropped and the author will need to fold a fix into the original commit(s) and reland.

> 4) what if we backout a change 3 days later (say a talos issue?), then we
> have history of bad revision + backout?

One can label backouts in various ways. One way to bucket them would be whether or not they are needed to fix build or test bustage. You could also classify them by whether they introduce a "soft" or "hard" failure (where a "hard" failure is bustage and "soft" is something like a benchmark regression).

We ideally don't want any "bad" commits in final history. With the autoland repo, we have a window of a few hours in which we can scrub bad commits from history.

Automation tends to find the "hard" failures pretty quickly (they manifest as build or test failures). And, "hard" failures are more impactful because they prevent people from building/testing specific revisions. So, we'll probably do a good job of scrubbing "hard" failures from final history.

"soft" failures are harder, as the commits may have found their way to mozilla-central before we recognize the failure. But since there aren't severe automation bustages with "soft" failures, we can tolerate these in history. For "soft" failures, we'll perform a classic "backout" commit like we do today. Except unlike "hard" failures, the commits in between should still be "good enough" for more downstream consumers to operate on.

> 5) what if we find an issue with PGO 40 revisions later?  Will we backout
> and retest all 40 revisions?

Maybe, but only if the regressing changeset hasn't made it out of the autoland repo.

> 6) when we generate performance alerts for data between a bad+backout, we
> will get them for a revision which doesn't exist?  Have we figured out how
> to fix our alerting algorithms and data sets to account for removal of
> revisions + all related data?

Someone (I think wlach) pinged me about this. We'll want perfherder to ignore dropped changesets for some kinds of analysis.

> 7) I want to know based on recent trends on our integration branches
> (autoland/inbound/fx-team) how many cases we will backout and how many
> revisions we will need to re-land.  In addition, knowing the jobs of the
> original revisions which did start/run, how many machine hours we will have
> to retest by landing the revisions again?  Given the fact that we cannot
> deliver try jobs with our current load, I would like to have an idea of what
> kind of load we would be introducing.

That's a good question and I don't have specific numbers.

I'll tell you what I told Armen earlier this week: we're introducing a new feature to the autoland repo, not a new requirement that it be used. I consider it a near certainty that the first time we perform a "remote drop" we'll find unexpected fallout and cause a fire drill. I also anticipate that we're going to gradually transition to "remote drops," not change everything from day 0.

If we start triggering too much [redundant] automation from remote drops or if we cause other problems from remote drops, we'll stop performing remote drops until we have an answer for those problems. The presence of the remote drop feature does not mean traditional backouts can't be performed: it just means the feature is available.

> 8) how will we work with downstream tools like AWFY, AWSY, Autophone,
> external-media-tests, etc.?

Tools consuming the autoland repo need to know that changesets can "disappear" at any time. They can consume Pulse to find out exactly what changesets have been obsoleted and what they were rewritten to. Tools that don't do this may experience oddities. I would also say that I'd recommend that most tools don't look at autoland: they should instead wait for updates to mozilla-central. That may not be viable today because we don't push to mozilla-central frequently enough. But I'd really like to change that and have autoland "integrated" into mozilla-central every few hours.

FWIW, one of the main reasons we're establishing these new processes on the autoland repo and not e.g. mozilla-inbound is because we couldn't change semantics of the existing repos without causing too much chaos for downstream consumers, including developers.
thanks for the reply!
(In reply to Gregory Szorc [:gps] from comment #21)
> (In reply to Joel Maher ( :jmaher) from comment #18)
> > 3) what about the case when we land a followup patch to fix bustage? That
> > is very common on inbound. Right now I would assume the number of
> > 'backouts' is half the number of bustage occurrences we have.
> 
> We won't do that on autoland. The "bad" commit(s) will get dropped and the
> author will need to fold a fix into the original commit(s) and reland.

This seems like it will increase the load on our infrastructure much more than necessary. Is there a plan in place for batching up commits via autoland, as well as when we reland patches after a backout?

> > 7) I want to know based on recent trends on our integration branches
> > (autoland/inbound/fx-team) how many cases we will backout and how many
> > revisions we will need to re-land.  In addition, knowing the jobs of the
> > original revisions which did start/run, how many machine hours we will have
> > to retest by landing the revisions again?  Given the fact that we cannot
> > deliver try jobs with our current load, I would like to have an idea of what
> > kind of load we would be introducing.
> 
> That's a good question and I don't have specific numbers.
> 
> I'll tell you what I told Armen earlier this week: we're introducing a new
> feature to the autoland repo, not a new requirement that it be used. I
> consider it a near certainty that the first time we perform a "remote drop"
> we'll find unexpected fallout and cause a fire drill. I also anticipate that
> we're going to gradually transition to "remote drops," not change everything
> from day 0.
> 
> If we start triggering too much [redundant] automation from remote drops or
> if we cause other problems from remote drops, we'll stop performing remote
> drops until we have an answer for those problems. The presence of the remote
> drop feature does not mean traditional backouts can't be performed: it just
> means the feature is available.
> 

I am glad to hear this is being designed as a simple, tweakable toggle.

> > 8) how will we work with downstream tools like AWFY, AWSY, Autophone,
> > external-media-tests, etc.?
> 
> Tools consuming the autoland repo need to know that changesets can
> "disappear" at any time. They can consume Pulse to find out exactly what
> changesets have been obsoleted and what they were rewritten to. Tools that
> don't do this may experience oddities. I would also say that I'd recommend
> that most tools don't look at autoland: they should instead wait for updates
> to mozilla-central. That may not be viable today because we don't push to
> mozilla-central frequently enough. But I'd really like to change that and
> have autoland "integrated" into mozilla-central every few hours.

I would assert we need a better plan to communicate and work with downstream consumers.  Right now it is important to work with the autoland repository, but in the future when there are merges every couple of hours, I would agree with you that we could test on m-c.
Depends on: 1321371
QA Contact: hwine → klibby
glob and I decided to resurrect this "remote backouts" idea to help with Servo "vcs sync" efforts. However, we're going to move the goalposts to make this achievable. We're going to implement "remote backouts" as actual backout changesets as opposed to dropping/obsoleting changesets / rewriting history. It allows us to transition to a new, server-based mechanism for performing backouts without solving all the problems around history rewriting. Once the history rewriting problems are solved, we can simply change the implementation on the server to start dropping/obsoleting changesets instead of making a backout changeset.
Blocks: 1365318
No longer depends on: 1287648, 1298563, 1298910
glob: please treat the review request as feedback. Well, part 3 at least. We definitely don't want to deploy this (at least not available to the world) just yet.
Attachment #8773998 - Flags: review?(glob)
Attachment #8785082 - Flags: review?(glob)
Attachment #8868792 - Flags: review?(glob)