Closed Bug 763929 Opened 12 years ago Closed 11 years ago

tracking bug for initial implementation + deployment of release kickoff and release runner

Categories

(Release Engineering :: Release Automation: Other, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

Details

(Whiteboard: [shipit])

Attachments

(7 files, 3 obsolete files)

There are a bunch of things we need to do to start a release. They're very standardized these days and rely on a few different inputs. We should script it. Specifically, the tool needs to:
* Download a reviewed release config patch and double-land it in a local repository
* Download l10n-changesets from the dashboard
* Tag buildbot-configs, buildbotcustom, and tools with the release tags
* Set clobbers
* Set reserved_slaves
* Update and reconfig the relevant master
* Run release sanity in dry run mode
* Run release sanity to start the automation

All the parts that involve interacting with the master could probably be done by enhancing our existing Fabric script. The rest is very simple scripting.
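
As a rough illustration of the flow listed above, here is a minimal Python sketch that runs a list of kickoff steps in order and stops at the first failure. Every name in it (`run_kickoff`, the step labels, the placeholder callables) is hypothetical and only mirrors the bullets; none of it is taken from the actual tools.

```python
import logging

log = logging.getLogger("kickoff")


def run_kickoff(steps):
    """Run (name, callable) steps in order; stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, step in steps:
        log.info("starting step: %s", name)
        try:
            step()
        except Exception:
            # Later steps depend on earlier ones, so do not continue.
            log.exception("step failed: %s", name)
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Placeholder callables standing in for the real work (assumption).
    steps = [
        ("land release config patch", lambda: None),
        ("download l10n changesets", lambda: None),
        ("tag repositories", lambda: None),
        ("release sanity (dry run)", lambda: None),
        ("release sanity (start automation)", lambda: None),
    ]
    print(run_kickoff(steps))
```

Putting the dry run immediately before the real run means a failed dry run stops the sequence before the automation is started.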
Attached file tagging repos script
This is a tagging repos script I've always used (a script I should have improved so I would not feel embarrassed when attaching it).


BTW +100000000000000 to this bug
We should make this script e-mail metrics, too.
Hope we can get to this soon, it's not higher priority than some other things though.
Priority: -- → P3
Rail and I got a big start on this last week. There's a webapp over in https://github.com/bhearsum/release-kickoff, and a bunch of modifications to existing scripts in https://github.com/bhearsum/tools/compare/master...kickoff

There's still more to be done, details over in this etherpad: https://etherpad.mozilla.org/ReleaseKickOff
Assignee: nobody → bhearsum
Attachment #676704 - Flags: review?(rail)
Attachment #676704 - Flags: review?(rail) → review+
Attachment #676704 - Flags: checked-in+
still need templates for release/esr configs
Attached patch release kickoff webapp (obsolete) — Splinter Review
It occurs to me that having some review before we're almost ready to land would be smart. This patch is the release kickoff app. Not sure who should review it though.
Attachment #677553 - Flags: review?
Most of this was done by Rail but also includes the release kickoff api that I wrote. Catlee knows most of these tools pretty well, so I choose him!
Attachment #677557 - Flags: review?(catlee)
Attachment #677553 - Flags: review? → feedback?
Attachment #677557 - Flags: review?(catlee) → feedback?(catlee)
Depends on: 810389
Depends on: 810393
Depends on: 810394
Depends on: 810397
Depends on: 810400
Depends on: 810402
Depends on: 810411
No longer depends on: 810411
Depends on: 810418
Depends on: 810422
Depends on: 810472
Depends on: 811054
Depends on: 811839
Attachment #681636 - Flags: review?(rail)
Attachment #681636 - Flags: review?(rail) → review+
Attachment #681636 - Flags: checked-in+
Attachment #681971 - Flags: review?(rail)
Attachment #681971 - Flags: review?(rail) → review+
Attachment #681971 - Flags: checked-in+
In production.
Whiteboard: [kickoff]
Depends on: 810423
Comment on attachment 677553 [details] [diff] [review]
release kickoff webapp

We're going to do reviews in person next week.
Attachment #677553 - Flags: feedback?
Attachment #677557 - Flags: feedback?(catlee)
Attachment #677553 - Attachment is obsolete: true
Attachment #677557 - Attachment is obsolete: true
Depends on: 813117
No longer depends on: 811054
Chris, here's a diff of release kickoff (vendor library excluded) with your initial review comments addressed. For posterity, here's what you said in e-mail:
- Use @classmethod instead of @staticmethod in model
- SubmitRelease.post should have a comment that says REMOTE_USER is being checked for elsewhere
  same with Releases.post

I also added a bunch of comments and fixed some unused imports.
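
The e-mail doesn't say why @classmethod was preferred. One common reason is that a classmethod receives the class it was called on, so subclasses of a model get instances of themselves. A minimal sketch of the difference follows; `Release`, `FennecRelease`, and both constructor names are hypothetical stand-ins, not the real model.

```python
class Release(object):
    """Hypothetical stand-in for a release model; not the real class."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def from_name(cls, name):
        # cls is whichever class the method was called on, so subclasses
        # get instances of themselves without overriding this method.
        return cls(name)

    @staticmethod
    def from_name_static(name):
        # A staticmethod has no access to the calling class, so the class
        # has to be hard-coded here.
        return Release(name)


class FennecRelease(Release):
    """Hypothetical subclass used only to show the difference."""


if __name__ == "__main__":
    print(type(FennecRelease.from_name("example-build1")).__name__)
    print(type(FennecRelease.from_name_static("example-build1")).__name__)
```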
Attachment #696076 - Flags: review?(catlee)
Attached patch release runner + friends (obsolete) — Splinter Review
Addresses these comments that you made, except where noted below:
- Remove lib/python/vendor from site.addsitedir in manage_masters
- Move update() and symlink calls in release-sanity up inside previous else block
- Comment why need list(reversed(options.releaseConfigFiles))
- Move unlink() calls into a check against which configs dir we're using. Why do we need to unlink "localconfig.py"? For clean commits? Why not use hg commit file...
- remove duplicate sendchange_master = config.get('...') and fix error handling if it's not set
- release runner needs comments; e.g. what while loop is doing
- refactor code like
    for release in rr.new_releases:
        log.releaseName = ...
-- I refactored the main part of release runner's code, but the specific thing about log.releaseName isn't relevant, because that work is being postponed.

- refactor tagging calls
-- Not sure what this means
- look at having log handlers per release instead of one common log handler
-- This got postponed, too.
- look at buildbot sendchange --help to figure out if we should use --who or --username
- pass Popen object to run_cmd_poll callback
Attachment #696120 - Flags: review?(catlee)
Blocks: 825094
Depends on: 825238
Found a bug in the buildbotcustom/tools tagging section today (it tries to tag the repositories multiple times, resulting in failures during a build1 with more than one release).
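
The comment above doesn't describe the fix. One general way to avoid tagging the same repository with the same tag twice is to de-duplicate the work before doing any tagging. A minimal sketch, assuming each release is a dict with hypothetical "repos" and "tags" keys (this is not the real data structure in release runner):

```python
def repos_to_tag(releases):
    """Return unique (repo, tag) pairs, preserving first-seen order."""
    seen = set()
    pairs = []
    for release in releases:
        for repo in release["repos"]:
            for tag in release["tags"]:
                key = (repo, tag)
                if key not in seen:
                    # Only queue each repo/tag combination once, even if
                    # several releases ask for it.
                    seen.add(key)
                    pairs.append(key)
    return pairs


if __name__ == "__main__":
    # Two hypothetical releases that share repositories and one tag.
    releases = [
        {"repos": ["buildbotcustom", "tools"], "tags": ["SHARED_TAG", "TAG_A"]},
        {"repos": ["buildbotcustom", "tools"], "tags": ["SHARED_TAG", "TAG_B"]},
    ]
    for repo, tag in repos_to_tag(releases):
        print("%s %s" % (repo, tag))
```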
Attachment #696120 - Attachment is obsolete: true
Attachment #696120 - Flags: review?(catlee)
Attachment #696723 - Flags: review?(catlee)
Attachment #696076 - Flags: review?(catlee) → review+
Comment on attachment 696723 [details] [diff] [review]
release runner + friends

Review of attachment 696723 [details] [diff] [review]:
-----------------------------------------------------------------

::: buildbot-helpers/release_sanity.py
@@ +425,5 @@
> +
> +    # https://bugzilla.mozilla.org/show_bug.cgi?id=678103#c5
> +    # This goes through the list of config files in reverse order, which is a
> +    # hacky way of making sure that the config file that's listed first is the
> +    # one that's loaded in releaseConfig for the sendchange.

Why does this matter? The only thing that the sendchange uses from the releaseConfig is the tag name, but the tag for any product should be equivalent, right?
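
For reference, the effect the quoted comment describes can be sketched as follows: when each load overwrites the previous one, walking the list backwards leaves the first-listed file as the final value. `load_release_configs` and `read_config` are hypothetical names for illustration, not functions from release_sanity.py.

```python
def load_release_configs(config_files, read_config):
    """Load each file in reverse order and return the last one loaded.

    Because later loads overwrite earlier ones, iterating backwards
    leaves the first-listed file as the result.
    """
    release_config = None
    for path in reversed(config_files):
        release_config = read_config(path)
    return release_config


if __name__ == "__main__":
    # Fake reader that just records which file it was given.
    fake_read = lambda path: {"loaded_from": path}
    print(load_release_configs(["first.py", "second.py"], fake_read))
```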

::: buildfarm/release/release-runner.sh
@@ +1,4 @@
> +#!/bin/bash
> +
> +# Sleep 3 days in case of failure
> +SLEEP_TIME=259200

3 days? really?
Attachment #696723 - Flags: review?(catlee) → review+
(In reply to Chris AtLee [:catlee] from comment #17)
> Comment on attachment 696723 [details] [diff] [review]
> release runner + friends
> 
> Review of attachment 696723 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: buildbot-helpers/release_sanity.py
> @@ +425,5 @@
> > +
> > +    # https://bugzilla.mozilla.org/show_bug.cgi?id=678103#c5
> > +    # This goes through the list of config files in reverse order, which is a
> > +    # hacky way of making sure that the config file that's listed first is the
> > +    # one that's loaded in releaseConfig for the sendchange.
> 
> Why does this matter? The only thing that the sendchange uses from the
> releaseConfig is the tag name, but the tag for any product should be
> equivalent, right?
> 
> ::: buildfarm/release/release-runner.sh
> @@ +1,4 @@
> > +#!/bin/bash
> > +
> > +# Sleep 3 days in case of failure
> > +SLEEP_TIME=259200
> 
> 3 days? really?

302 rail on these
(In reply to Chris AtLee [:catlee] from comment #17)
> Comment on attachment 696723 [details] [diff] [review]
> release runner + friends
> 
> Review of attachment 696723 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: buildbot-helpers/release_sanity.py
> @@ +425,5 @@
> > +
> > +    # https://bugzilla.mozilla.org/show_bug.cgi?id=678103#c5
> > +    # This goes through the list of config files in reverse order, which is a
> > +    # hacky way of making sure that the config file that's listed first is the
> > +    # one that's loaded in releaseConfig for the sendchange.
> 
> Why does this matter? The only thing that the sendchange uses from the
> releaseConfig is the tag name, but the tag for any product should be
> equivalent, right?

I've just copied the explanation from the corresponding bug. However, it looks like that part can be simplified.
 
> ::: buildfarm/release/release-runner.sh
> @@ +1,4 @@
> > +#!/bin/bash
> > +
> > +# Sleep 3 days in case of failure
> > +SLEEP_TIME=259200
> 
> 3 days? really?

per https://bugzilla.mozilla.org/show_bug.cgi?id=810395#c0
  * sleep 3 days (in case of weekend)

It gives us enough time to fix the problem and restart the service. Otherwise the service would be started again by supervisord and probably fail again.
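
The idea can be sketched in Python as well (the real wrapper is the bash script quoted above). This is a minimal illustration under assumptions: `run_once` is a hypothetical helper, and the sleep function is injectable only so the behaviour can be exercised without waiting.

```python
import subprocess
import sys
import time

SLEEP_TIME = 259200  # 3 days, in seconds (3 * 24 * 60 * 60)


def run_once(cmd, sleep=time.sleep, sleep_time=SLEEP_TIME):
    """Run cmd; on a non-zero exit, sleep before returning the exit code."""
    returncode = subprocess.call(cmd)
    if returncode != 0:
        # Hold the process open so the supervisor does not restart it
        # straight into the same failure.
        sleep(sleep_time)
    return returncode


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: wrapper.py COMMAND [ARGS...]")
    sys.exit(run_once(sys.argv[1:]))
```

While the wrapper sleeps, the process is still alive from the supervisor's point of view, so it is not restarted until someone fixes the problem and restarts the service by hand.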
Now that we've farmed off all the remaining tasks to other bugs, updating the summary to reflect that this is a tracking bug.
Summary: create a script/tool that knows how to start a release → tracking bug for initial implementation + deployment of release kickoff and release runner
Comment on attachment 696076 [details] [diff] [review]
release kickoff webapp

This has been landed for a while.
Attachment #696076 - Flags: checked-in+
Attachment #696723 - Flags: checked-in+
We had some test failures in release runner upon landing:
======================================================================
ERROR: testSuccess3secsWith2secsPoll (mozilla_buildtools.test.test_util_commands.TestRunCmdiPeriodicPoll)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/test_util_commands.py", line 61, in testSuccess3secsWith2secsPoll
    warning_interval=2),
  File "/builds/buildbot/preproduction/slave/test-masters/tools/lib/python/util/commands.py", line 111, in run_cmd_periodic_poll
    elapsed))
TypeError: %d format: a number is required, not NoneType
-------------------- >> begin captured logging << --------------------
util.commands: INFO: command: START
util.commands: INFO: command: bash -c "sleep 3 && true"
util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: testMakeHGUrl (mozilla_buildtools.test.test_util_hg.TestHg)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/test_util_hg.py", line 416, in testMakeHGUrl
    self.assertEquals(file_url, expected_url)
AssertionError: 'https://hg.mozilla.org/build/tools/raw-file/FIREFOX_3_6_12_RELEASE/lib/python/util/hg.py' != 'http://hg.mozilla.org/build/tools/raw-file/FIREFOX_3_6_12_RELEASE/lib/python/util/hg.py'
-------------------- >> begin captured logging << --------------------
util.commands: INFO: command: START
util.commands: INFO: command: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/init_hgrepo.sh /tmp/tmpw7IHXC/repo
util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
util.commands: INFO: command: END (0.65s elapsed)

util.commands: INFO: command: START
util.commands: INFO: command: hg log -R /tmp/tmpw7IHXC/repo --template {node|short}

util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
util.commands: INFO: 8a9b4d8790e3
c877587e5685
c9d4cab6dd85
3c3f19b1c095

util.commands: INFO: command: END (0.09 elapsed)

--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: testMakeHGUrlNoFilename (mozilla_buildtools.test.test_util_hg.TestHg)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/test_util_hg.py", line 425, in testMakeHGUrlNoFilename
    self.assertEquals(file_url, expected_url)
AssertionError: 'https://hg.mozilla.org/build/tools/rev/default' != 'http://hg.mozilla.org/build/tools/rev/default'
-------------------- >> begin captured logging << --------------------
util.commands: INFO: command: START
util.commands: INFO: command: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/init_hgrepo.sh /tmp/tmp1caXNg/repo
util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
util.commands: INFO: command: END (0.94s elapsed)

util.commands: INFO: command: START
util.commands: INFO: command: hg log -R /tmp/tmp1caXNg/repo --template {node|short}

util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
util.commands: INFO: 61ecdbf8ea00
d0ffc2c2c3f4
200b6339415e
c57d88da3e99

util.commands: INFO: command: END (0.11 elapsed)

--------------------- >> end captured logging << ---------------------

======================================================================
FAIL: testMakeHGUrlNoRevisionNoFilename (mozilla_buildtools.test.test_util_hg.TestHg)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/test_util_hg.py", line 433, in testMakeHGUrlNoRevisionNoFilename
    self.assertEquals(repo_url, expected_url)
AssertionError: 'https://hg.mozilla.org/build/tools' != 'http://hg.mozilla.org/build/tools'
-------------------- >> begin captured logging << --------------------
util.commands: INFO: command: START
util.commands: INFO: command: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python/mozilla_buildtools/test/init_hgrepo.sh /tmp/tmpUpxA4h/repo
util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
util.commands: INFO: command: END (0.98s elapsed)

util.commands: INFO: command: START
util.commands: INFO: command: hg log -R /tmp/tmpUpxA4h/repo --template {node|short}

util.commands: INFO: command: cwd: /builds/buildbot/preproduction/slave/test-masters/tools/lib/python
util.commands: INFO: command: output:
util.commands: INFO: c2d24c97c173
ab42e8e02cf7
3384239335c1
a7a2846df2f1

util.commands: INFO: command: END (0.11 elapsed)

--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 312 tests in 177.151s

FAILED (errors=1, failures=3)

One of them seems to be repro'ing the problem I hit in bug 828023.
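
For context on the first error: the TypeError is what Python's %d formatting raises when it is handed None instead of a number. A minimal sketch of that failure mode and one way to guard against it; `describe_elapsed` is a hypothetical helper, not code from util/commands.py.

```python
import time


def describe_elapsed(start_time, end_time=None):
    """Format elapsed seconds, computing the end time if none was given."""
    if end_time is None:
        end_time = time.time()
    # elapsed is always a number here, so %d cannot be handed None.
    elapsed = end_time - start_time
    return "command: still running after %d seconds" % elapsed


if __name__ == "__main__":
    try:
        "%d seconds" % None
    except TypeError as e:
        print("unguarded formatting fails: %s" % e)
    print(describe_elapsed(100.0, 103.4))
```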
Attachment #700421 - Flags: review?(rail)
Attachment #700421 - Flags: review?(rail) → review+
Attachment #700421 - Flags: checked-in+
Blocks: 822752
Depends on: 822757
No longer blocks: 822752
Depends on: 822752
No longer depends on: 822752
No longer depends on: 822757
No longer blocks: 825094
Whiteboard: [kickoff] → [shipit]
The initial versions of the web app and release runner are both deployed to production now. Follow-up issues and improvements will be tracked in individual bugs marked with "[shipit]" in the whiteboard. Huge thanks to everyone who helped out with this!
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering