Open Bug 1973300 Opened 4 days ago Updated 2 days ago

make it possible to run release promotion through `./mach try`

Categories

(Release Engineering :: Release Automation, enhancement, P3)

Tracking

(Not tracked)

People

(Reporter: bhearsum, Unassigned)

Details

Testing release promotion changes on Try at the moment is generally not very easy. The main pain points are:

  • To test it at all, you need to be able to run the release-promotion action, which requires explicit scope grants
  • If you want to be able to run staging releases through Ship It (the only way to run them without expert-level knowledge of release promotion), you need additional scopes and the VPN
  • If you want to test quickly, you must bypass Ship It and have the expert-level knowledge needed to set previous_graph_ids correctly when running the release promotion action

In effect, this means that nobody outside of RelEng (and maybe one or two others) can be expected to be able to run it. We should be able to do better.

I'm envisioning either a new ./mach try subcommand, or (ideally) enhancements to ./mach try release that would:

  • Accept shipping-product and shipping-phase arguments (and maybe other things, if needed)
  • Use the above arguments to set up parameters such that the original decision task runs the release promotion tasks immediately
  • Optionally accept --previous-graph-ids to allow re-use of tasks from previous pushes
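
As a rough illustration of the argument handling described above, here is a minimal sketch using plain argparse. It is not wired into mach; the function names and the keys of the returned dictionary are assumptions made for illustration, and the real try parameters may be shaped differently.

```python
import argparse


def build_parser():
    """Hypothetical parser for the proposed command (not real mach code)."""
    parser = argparse.ArgumentParser(prog="mach try release")
    parser.add_argument("--shipping-product", required=True)
    parser.add_argument("--shipping-phase", required=True)
    # Comma-separated list of graph IDs whose tasks may be re-used.
    parser.add_argument("--previous-graph-ids", default="")
    return parser


def build_try_parameters(args):
    """Turn parsed arguments into parameter overrides.

    The key names below are assumptions, not confirmed parameter names.
    """
    previous = [g.strip() for g in args.previous_graph_ids.split(",") if g.strip()]
    return {
        "shipping_product": args.shipping_product,
        "shipping_phase": args.shipping_phase,
        "previous_graph_ids": previous,
    }
```

Keeping the translation in a separate function means it could be tested without pushing to Try.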

Bonus points (and perhaps follow-ups) for supporting the following:

  • Automatically finding and using appropriate and unused version+build number combinations to avoid issues, e.g. when creating balrog blobs
  • Automatically finding previous try pushes from the same user and re-using tasks from them (probably allowing for --rebuild-kinds to be used to force certain things to re-run)
  • Finding a way to wire this into ./mach try fuzzy, such that those tasks could "just work" there. (I don't even know if this is tractable...but it would be lovely.)
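
The first bonus item could look something like the sketch below. It assumes the set of already-used (version, build number) pairs has been collected elsewhere; how that set would be gathered is not specified here, and the function name is hypothetical.

```python
def next_unused_build_number(version, used):
    """Return the lowest build number not yet taken for `version`.

    `used` is a set of (version, build_number) tuples; gathering it
    from the staging services is outside the scope of this sketch.
    """
    build = 1
    while (version, build) in used:
        build += 1
    return build
```

For example, with builds 1 and 2 of a version already taken, the function would return 3 for that version and 1 for any version with no recorded builds.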

One reason I'm uncertain about whether/how to do this is there are shared resources that staging releases use/need, such as ftp.stage and releases/rules on balrog stage, that mean more people running those means more likelihood of spurious errors from contention/conflicts on those shared resources. (And bypassing shipit makes this even worse since shipit is what prevents reuse of build numbers, e.g.)

(In reply to Julien Cristau [:jcristau] from comment #1)

One reason I'm uncertain about whether/how to do this is there are shared resources that staging releases use/need, such as ftp.stage and releases/rules on balrog stage, that mean more people running those means more likelihood of spurious errors from contention/conflicts on those shared resources.

It's a good point, yeah. I think we'd probably need some sanity checking of ftp.stage & balrog stage as part of this command at minimum. (That's not perfect, obviously, but it would probably prevent most collisions?)
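
A sanity check of that kind might be structured roughly as follows. This is only a sketch: the probes are hypothetical callables standing in for whatever lookups against ftp.stage and balrog stage would actually be needed, and no real service API is assumed.

```python
def find_conflicts(version, build_number, probes):
    """Return the names of services where this version+build is already in use.

    `probes` maps a service name to a callable taking (version, build_number)
    and returning True when that combination already exists there. The
    callables are placeholders for real lookups.
    """
    return sorted(
        name for name, in_use in probes.items() if in_use(version, build_number)
    )
```

The command could refuse to push whenever the returned list is non-empty, which would not rule out races but should catch most collisions.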

In an ideal world, we'd have some way to use distinct instances of these services (and some way to re-use them for partials, update verify, etc.) - but that's a whole other can of worms...
