Closed Bug 1178324 Opened 9 years ago Closed 9 years ago

release runner / ship it dev environment

Categories

(Release Engineering :: Release Automation: Other, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

Details

Attachments

(2 files)

During release promotion planning last week we made the decision to set up a permanent "staging" version of release runner/ship-it to enable end to end release promotion testing (as well as end to end testing of future ship it or release automation changes). Our idea is to dedicate a twig to it, probably Date.

We need to set up a version of release-runner that talks to ship-it-dev.allizom.org. It will still be deployed to production, and needs to be able to reconfig masters, check in version bumps/tags, etc. We should put some safeguards in place to make sure it only operates on Date.

One interesting thing may be that ship-it has tons of hardcodes around branch names that are unlikely to work with Date. It would be really great to remove those and get back to being purely data driven. The alternatives are to live with autopopulation and such being broken, or to add Date to the hardcodes.

We need to think about which versions of other systems we should talk with, too. Eg: we probably shouldn't publish to the "production" release candidates s3 bucket or production Balrog.
Blocks: 680514
I spent some time sketching this out more today. Here's a much more detailed plan:

Release runner needs the following changes:
* Add support for "allowed_repos" in the config file, to protect against the "staging" instance (which will be connected to ship it dev) updating actual release repos.
* Adjust existing release runner instance to be allowed to operate on mozilla-beta, mozilla-release, and mozilla-esr* (and their comm-* equivalents).
* Deploy new release runner instance configured as follows:
** Only allowed to operate on Date
** Talks to ship-it-dev.allizom.org
** Notifies some public place that isn't release-automation-notifications@mozilla.com (a new list, maybe)
** Sends e-mail with subjects that make it clear it is not production
** Creates non-production tags, eg: DEV_FIREFOX_40_0b1_RELEASE. (This isn't ideal, but we can always remove the tags later, and they should go away entirely when bug 1178303 is fixed.)
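
The "allowed_repos" safeguard described in the list above could look roughly like the following. This is only a sketch, not the actual release runner code: the function name, the list names, and the exact repo paths (other than "projects/date") are assumptions made for illustration, and glob-style patterns are just one way to express "mozilla-esr*".

```python
from fnmatch import fnmatchcase

# Hypothetical allow-lists. The "releases/..." paths are assumed here;
# the real config may name the repos differently.
PRODUCTION_ALLOWED_REPOS = [
    "releases/mozilla-beta",
    "releases/mozilla-release",
    "releases/mozilla-esr*",
    "releases/comm-beta",
    "releases/comm-release",
    "releases/comm-esr*",
]
DEV_ALLOWED_REPOS = ["projects/date"]


def is_allowed(branch, allowed_repos):
    """Return True if branch matches any glob pattern in allowed_repos."""
    return any(fnmatchcase(branch, pattern) for pattern in allowed_repos)


def check_release(branch, allowed_repos):
    """Refuse to process a release whose branch is not on the allow-list."""
    if not is_allowed(branch, allowed_repos):
        raise ValueError("%s is not in allowed_repos, refusing to run" % branch)
```

With something like this, the dev instance would reject a release submitted against mozilla-beta before it gets anywhere near tagging or reconfigs, and the production instance would reject anything on Date.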

Some misc. existing release automation may need some adjustments for the new tag names. Most (if not all) go through the getReleaseTag helper function, so this shouldn't be too hard.
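
To make the tag naming concrete, here is a sketch of how a prefix could be threaded through a single helper. Note that release_tag and its prefix parameter are hypothetical and are not the signature of the existing getReleaseTag; the only thing taken from this bug is the shape of the example tag, DEV_FIREFOX_40_0b1_RELEASE.

```python
def release_tag(product, version, prefix=""):
    """Build a release tag, optionally prefixed for non-production use.

    Hypothetical helper for illustration only.
    """
    base = "%s_%s_RELEASE" % (product.upper(), version.replace(".", "_"))
    if prefix:
        return "%s_%s" % (prefix, base)
    return base


print(release_tag("firefox", "40.0b1"))                # FIREFOX_40_0b1_RELEASE
print(release_tag("firefox", "40.0b1", prefix="DEV"))  # DEV_FIREFOX_40_0b1_RELEASE
```

If the prefix came from the release runner config, the production instance would keep generating the same tags as today.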

We'll need a new release config for Date. Since we plan to kill them anyways it's not worthwhile redesigning them, but we may as well avoid cargo culting any now unnecessary parts (eg, anything that's only used by tagging or en-US builds, maybe some other stuff like AVVendorsRecipients). Nick, I see that your initial release promotion misc.py code drops release configs entirely -- was that intentional, or were you just not at the point of needing them?

I tested out using "projects/date" as a branch on ship-it-dev, and it does indeed bust the autofill. I don't think fixing it is a blocker, but it would be nice to.

We should make sure that Date is using self signed certs, to avoid any confusion should someone stumble on builds that come out of it. This will cause Windows update verify to fail (bug 997732), but I think we can live with that.

(In reply to Ben Hearsum [:bhearsum] from comment #0)
> We need to think about which versions of other systems we should talk with,
> too. Eg: we probably shouldn't publish to the "production" release
> candidates s3 bucket or production Balrog.

It seems to make sense that we use the dev or stage versions of systems wherever possible. This means that we should publish bits to a different s3 bucket (which I assume we'll be able to specify in the release config), use aus4-dev for update bits, as well as bouncer dev or stage. The only wrinkle here is that since we're working on a Twig, we still need to publish to production automation repositories. Patcher & update verify configs are branch specific, so we can create new versions of those for Date to avoid stomping on production. The notes above talk about adjusting tagging.

And just to avoid any confusion: release promotion assumes we're off of FTP, so we don't need to think about that system at all.

We should make it well known that this new way of testing release automation will be doing reconfigs of production buildbot. A mail to the public RelEng list + sheriffs should be enough.
Flags: needinfo?(rail)
Flags: needinfo?(nthomas)
Depends on: 1179405
(In reply to Ben Hearsum [:bhearsum] from comment #1)
> ** Creates non-production tags, eg: DEV_FIREFOX_40_0b1_RELEASE. (This isn't
> ideal, but we can always remove the tags later, and they should go away
> entirely when bug 1178303 is fixed.)

Agree not ideal to pepper tags and commits in prod repos, but probably better than having forked copies of the automation repos, special masters, pinned slaves and figuring out how to launch 'em for twig jobs!

> We'll need a new release config for Date. Since we plan to kill them anyways
> it's not worthwhile redesigning them, but we may as well avoid cargo culting
> any now unnecessary parts (eg, anything that's only used by tagging or en-US
> builds, maybe some other stuff like AVVendorsRecipients). Nick, I see that
> your initial release promotion misc.py code drops release configs entirely
> -- was that intentional, or were you just not at the point of needing them?

IIRC my plan was to use branch config instead of a separate release config; so far just three params were needed:
        'enable_release_promotion': True,
        'partners_repo_path': 'build/partner-repacks',
        'partner_repack_platforms': ('linux', 'linux64', 'win32', 'macosx64'),
(see http://hg.mozilla.org/build/buildbot-configs/file/3fcb4a2a79cd/mozilla/project_branches.py#l144)

I didn't get into many downstream jobs that needed a revision, build # or similar though. For partner builds I looked up the en-US bits in the TC index, which I guess we'd do with the taskgraph and are calling reconfigless releases now. Anyway, a separate config based on the current release configs is fine by me.

> We should make sure that Date is using self signed certs, to avoid any
> confusion should someone stumble on builds that come out of it.

We're OK there, still using the default certs for CI builds.
Flags: needinfo?(nthomas)
Overall, the plan sounds reasonable. However, I'd like to suggest a slightly more radical plan if we decide that we definitely want this project to be reconfigless.

We can start with release-runner generating a task graph with 2 types of jobs:

1) jobs that don't require reconfigs, tags, etc, and can use runtime information (from the task definition) to execute. These jobs can be added to the tree right away.

2) If a job requires reconfig, tag, etc., we replace it with "echo booo" and work on fixing it.
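
A rough sketch of the two-way split above, assuming each job is described by a plain dict. The keys ("needs_reconfig", "needs_tag", "command") and the function name are made up for illustration and do not correspond to any real task graph schema.

```python
PLACEHOLDER_COMMAND = "echo booo"


def build_task_graph(jobs):
    """Keep reconfig-free jobs as they are; stub out the rest."""
    tasks = []
    for job in jobs:
        if job.get("needs_reconfig") or job.get("needs_tag"):
            # Type 2: not runnable without a reconfig or tag yet, so stub it.
            command = PLACEHOLDER_COMMAND
        else:
            # Type 1: can run from runtime information in the task definition.
            command = job["command"]
        tasks.append({"name": job["name"], "command": command})
    return tasks


jobs = [
    {"name": "partner-repacks", "command": "run-repacks"},
    {"name": "tagging", "command": "run-tagging", "needs_tag": True},
]
graph = build_task_graph(jobs)
```

The stubbed jobs keep their place in the graph, so each one can be swapped for the real thing as it gets fixed.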
Flags: needinfo?(rail)
We talked about this some more in a meeting yesterday. First off, we agreed to call this dev (not stage).

We also decided to defer the decision about whether or not we need reconfigs or release configs until later. As far as this bug goes, this means that we don't have to worry about all the complicated bits for now. Release config bumping, master reconfigs, and repo tagging can all be ignored for now. My plan here is to deploy release runner dev that uses code from my Github repo with all of these things commented out.

Given the above, I think the only open question is where to send e-mail notification to.
Summary: release runner / ship it "staging" → release runner / ship it dev environment
Super simple patch. Just renaming stage to dev in the puppet manifests, deploying to bm83, and adding Nick, you, and me to the notify list. Looks like that notify_to is only used for reconfig warnings (not the initial "Build of XXXXXX" mail), so it probably won't ever send us anything.

I still need to prep a tools branch with all of the Buildbot junk commented out before I land this.
Attachment #8631021 - Flags: review?(rail)
(In reply to Ben Hearsum [:bhearsum] from comment #5)
> I still need to prep a tools branch with all of the Buildbot junk commented
> out before I land this.

I put this together for that: https://github.com/mozilla/build-tools/compare/master...bhearsum:release-runner-dev?expand=1

It basically comments out all of the logic of release runner. We'll need to add task graph creation into it to actually make anything happen.

We also need to decide how to run release sanity checks without a release config, if that even still makes sense.
Attachment #8631021 - Flags: review?(rail) → review+
Attachment #8631021 - Flags: checked-in+
OK, I think the initial release runner dev env is set up now. After deploying, it ran through a bunch of releases that were in the dev system and marked them as completed.
(In reply to Ben Hearsum [:bhearsum] from comment #4)
> Given the above, I think the only open question is where to send e-mail
> notification to.

We use release-automation-notifications@mozilla.com for production, so let's use release-automation-notifications-dev@mozilla.com for dev. The bonus of using a google group is that individuals can opt-in or out of it on their own!
Attachment #8641773 - Flags: review?(rail)
Attachment #8641773 - Flags: review?(rail) → review+
Attachment #8641773 - Flags: checked-in+
The base of this is now set up! It's using my release runner branch, which is getting enhanced in other bugs (such as bug 1150162).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED