[meta] Improve merge day end to end times
Categories
(Release Engineering :: Release Automation, enhancement, P3)
Tracking
(Not tracked)
People
(Reporter: ahal, Unassigned)
References
(Depends on 2 open bugs)
Details
(Keywords: meta)
In the July 28th releaseduty postmortem there was a good brainstorm around how we can speed up merge automation.
This came up because many mobile release day activities are blocked on the mozilla-central bump, plus the 4-5 hours it takes for builds to finish and upload to Maven. So the faster we can get the Gecko side done, the better.
I'm going to list all our ideas here for now, but this is a meta bug and the ideas can spin off into their own bugs / Jira tickets as we pick them up.
Have bump-central task run as part of the central-to-beta action
Currently we need to wait for the merge to complete successfully before we can bump central. But there's no reason the bump-central task needs to be triggered manually. Instead the central-to-beta action could schedule a graph that looks like:
central-to-beta -> bump-central
And this way we cut out the time it takes the person on mergeduty to notice that the merge is finished and trigger the task.
Patch treescript to check if central tag already exists and merge from there rather than tip
Taking the previous step even further, we could patch treescript so that instead of merging from tip we tag central first and then merge from that tag. The central-to-beta action would submit a graph that looks like:
tag-central --> central-to-beta (using the tag as the base)
            `-> bump-central
The additional benefit here is that bump-central can run in parallel with the merge and we don't need to block on expensive hg operations.
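As a rough illustration of the difference between the two proposed graphs, here is a minimal sketch that groups tasks into batches that could run at the same time. It uses only Python's standard `graphlib` module, not the real taskgraph or treescript APIs. The `parallel_batches` helper and the dictionary layout are assumptions made for this example; the task labels are the ones used above.

```python
from graphlib import TopologicalSorter


def parallel_batches(deps):
    """Group tasks into batches whose members could run concurrently.

    `deps` maps each task label to the set of labels it depends on.
    This is a toy model, not how taskgraph actually schedules tasks.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# First proposal: bump-central waits for the merge to finish.
chained = {
    "central-to-beta": set(),
    "bump-central": {"central-to-beta"},
}

# Second proposal: both tasks depend only on the tag.
tagged = {
    "tag-central": set(),
    "central-to-beta": {"tag-central"},
    "bump-central": {"tag-central"},
}

print(parallel_batches(chained))
print(parallel_batches(tagged))
```

In the chained graph each batch holds a single task, so bump-central still waits on the merge. In the tagged graph, central-to-beta and bump-central land in the same batch once tag-central is done.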
Run "dry run" tasks before merge e-mail comes in
Scheduling the dry run tasks probably takes ~30 min. There's no reason these need to wait until after the merge day e-mail, so mergeduty should try and do them first thing in the morning or even the Friday before.
Eventually, this could become part of the regular merge simulation testing that the sheriffs do.
Allow sheriffs and/or relman the ability to trigger merge-automation actions
These days triggering the actions is not very complicated. By allowing sheriffs or relman permission to trigger them, we can guarantee the process can be started ASAP (since sheriffs have 100% coverage). This can help in the event both people on releaseduty are in North American timezones or have something come up and are afk.
Investigate bigger / different instance types to improve performance
Some experimentation with different instance types might yield better performance. There were some doubts cast about how effective this would be, but it could be worth a shot.
Investigate improving performance of the automation itself
There were no specific ideas here, but it's possible we can improve the performance of these tasks somehow.
Comment 1•3 years ago
> central-to-beta -> bump-central
>
> tag-central --> central-to-beta (using the tag as the base)
>             `-> bump-central
I think these can work, with the following caveats:
- we generate tasks in random order, based on whether all of a kind's kind dependencies have finished generating. I'm not sure if we can have tasks of the same kind depending on each other easily or cleanly. The simple solution may involve additional kinds.
- the latter model may be harder to test in a dry-run. We may be able to fake it.
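To make the "additional kinds" idea concrete, here is a toy sketch in which each step gets its own kind and the kinds are ordered by their dependencies. The kind names (`merge-tag`, `merge-central-to-beta`, `merge-bump-central`) and the `generation_order` helper are hypothetical, invented for this example. It uses Python's standard `graphlib` rather than taskgraph itself, so it only models the ordering constraint and says nothing about how the real kinds would be written.

```python
from graphlib import TopologicalSorter

# Hypothetical kinds, one per step, each listing the kinds it waits for.
kind_dependencies = {
    "merge-tag": [],
    "merge-central-to-beta": ["merge-tag"],
    "merge-bump-central": ["merge-tag"],
}


def generation_order(kind_deps):
    """Return one valid order in which the kinds could be generated."""
    return list(TopologicalSorter(kind_deps).static_order())


order = generation_order(kind_dependencies)
print(order)
```

With separate kinds, the tag step always comes first, while the relative order of the other two is left unconstrained. That matches the graph quoted above without needing tasks of one kind to depend on each other.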
> Run "dry run" tasks before merge e-mail comes in
Historically we were supposed to do this the week before merge day; in practice we found that we consistently forgot. I do agree that it's worth doing beforehand to save time :)
> Allow sheriffs and/or relman the ability to trigger merge-automation actions
+1
> Investigate improving performance of the automation itself
One solution, which may be coming at some point, is moving to a single repo, multiple branch model.
We would likely want this if we migrated gecko to git. Rather than pushing a bunch of commits to the beta repo, we'd create a new branch off of the current main/central branch, much like the Fenix model.
To do this properly, we'd need to change our entire security/automation model from looking at the repo to looking at the repo+branch, but once we do that, merge day may become pretty simple+fast.