Closed Bug 440351 Opened 12 years ago Closed 12 years ago

Set up Thunderbird release automation for 2.0.0.x

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: bhearsum)

Attachments

(5 files, 4 obsolete files)

To start fully automating Tb 2.0.0.x releases we need to set up clones of the nightly boxes with buildbot, connect them to staging-1.8-master, and do some testing. Once that's complete we can clone again for production-1.9-master.
Assignee: sachinrthomas → nthomas
Priority: -- → P2
Depends on: 440356
This config hooks up the new staging-{patrocles,crazyhorse} slaves to staging-1.8-master, using the Thunderbird-specific config. Most of the changes are conversions to the Buildbot 0.7.7 format, plus the addition of the two new slaves. There are new linux_prestage_tb and win32_prestage_tb builders since I'd like to use staging-crazyhorse and staging-patrocles only for TinderConfig-Build-Repack (win32_build, linux_build), and the existing Firefox slaves for update verify and the staging server. It also doesn't lock on the new slaves since they don't do consecutive tasks.
Attachment #326886 - Flags: review?(bhearsum)
Comment on attachment 326886
Buildbot master config for staging

>Index: master.cfg
>===================================================================
>@@ -12,22 +12,24 @@ c = BuildmasterConfig = {}
> # tuple of bot-name and bot-password. These correspond to values given to the
> # buildslave's mktap invocation.

Hrm. The copy of this file on the slave has:
from buildbot.buildslave BuildSlave

which is needed. I guess I missed that when I did the upgrade. Can you add it here?

> c['slaves'] = [
>  BuildSlave("staging-1.8-master",""),

>Index: tb-master.cfg
>===================================================================
> ####### BUILDSLAVES
> 
> # the 'bots' list defines the set of allowable buildslaves. Each element is a
> # tuple of bot-name and bot-password. These correspond to values given to the
> # buildslave's mktap invocation.

Here too

>+c['slaves'] = [
>+ BuildSlave("staging-1.8-master",""),

> slave_prestage_scheduler = Scheduler(
>  name="slave_prestage", branch=None,
>  treeStableTimer=0,
>  builderNames=[
>   "linux_prestage",
>+  "linux_prestage_tb",
>   "win32_prestage",
>+  "win32_prestage_tb",
>   "macosx_prestage",
>  ],
> )

Hrm. If we do it like this, *all* of those builders will get triggered when you do a 'buildbot sendchange'. 

> from buildbot.status import html
> c['status'].append(
>- html.Waterfall(http_port=8810, css='./mozilla.css')
>-)
>-c['status'].append(
>- html.Waterfall(http_port=8811, css='./mozilla.css', categories = ['nightly'])
>-)
>-c['status'].append(
>- html.Waterfall(
>-  http_port=8812,
>-  css='./mozilla.css', 
>-  categories = ['release'],
>- )
>+ html.WebStatus(http_port=8810)

I found out a couple weeks ago that the default for allowForce changed to False. Should add allowForce=True so you can kick things along if necessary.


As discussed on IRC we should merge tb-master.cfg and master.cfg at some point (soon). That'd make it easier to do staging runs on both of them without stepping on toes, and would just plain be nicer.
Attachment #326886 - Flags: review?(bhearsum) → review+
(In reply to comment #2)

> Hrm. If we do it like this, *all* of those builders will get triggered when you
> do a 'buildbot sendchange'. 

Disregard this. I meant to remove it before submitting.
Checking in master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/automation/staging/master.cfg,v  <--  master.cfg
new revision: 1.42; previous revision: 1.41
done
Checking in tb-master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/automation/staging/tb-master.cfg,v  <--  tb-master.cfg
new revision: 1.3; previous revision: 1.2
done

and merged into the checkout on the master. Needed a followup checkin for
  -from buildbot.buildslave BuildSlave
  +from buildbot.buildslave import BuildSlave
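
A slip like the missing `import` keyword above is a plain syntax error, so it can be caught before the config is merged onto the master. The following is only a sketch of such a pre-checkin check, not something the bug describes; `config_parses` is a hypothetical helper name. It uses `ast.parse`, which checks syntax without importing Buildbot or running the config.

```python
import ast


def config_parses(source):
    """Return True if the text is syntactically valid Python.

    ast.parse only checks syntax: nothing is imported or executed,
    so this works on a machine without Buildbot installed.
    """
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# The line as first checked in, and the followup fix.
broken = "from buildbot.buildslave BuildSlave\n"
fixed = "from buildbot.buildslave import BuildSlave\n"

print(config_parses(broken))
print(config_parses(fixed))
```

This only catches syntax problems; a misspelled builder or slave name would still need a real staging run to surface.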
Attachment #326886 - Attachment is obsolete: true
Attachment #326890 - Flags: review+
Machine checks, to make sure that bm-xserve02 and bm-xserve03 are "sufficiently identical". On each machine (substituting 2 or 3 for x):

1, set > 0x-env
Get only trivial/expected differences (HOSTNAME, OLDPWD, PPID, PWD, SECURITYSESSIONID), so PATH, PYTHONPATH, PYTHONHOME identical.

2, sudo find -L /opt/local/bin /opt/local/sbin /bin /sbin /usr/bin /usr/sbin -type f | perl -nle 'system("openssl","dgst","-sha1",$_)' 2>&1 | tee ~cltbld/0x-sha1sum
When diffing 02-sha1sum to 03-sha1sum, get only
  +SHA1(/usr/bin/7z)= d4938386e618eb6d73e922074179613c14b6e608
so we have 7zip on 03 but not 02 (for l10n_verify). Otherwise all the files on the path are identical. Got permission denied for 8 system utils not used in the build.
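
For reference, the digest comparison in step 2 could also be scripted rather than diffed by hand. This is a rough sketch under assumptions, not what was run here: `sha1_manifest` and `diff_manifests` are hypothetical helper names, and the digest values in the usage example are placeholders, not real output from either machine.

```python
import hashlib
import os


def sha1_manifest(roots):
    """Map each file under the given directories to its SHA-1 hex digest.

    Roughly mirrors `find -L ... -type f | openssl dgst -sha1`.
    Unreadable files are recorded rather than aborting the walk.
    Assumes there are no symlink loops under the roots.
    """
    manifest = {}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "rb") as handle:
                        manifest[path] = hashlib.sha1(handle.read()).hexdigest()
                except OSError:
                    manifest[path] = "unreadable"
    return manifest


def diff_manifests(left, right):
    """Return (only_left, only_right, changed) as sorted path lists."""
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    changed = sorted(p for p in set(left) & set(right) if left[p] != right[p])
    return only_left, only_right, changed


# Usage with placeholder digests: one extra binary on the second host.
host_02 = {"/usr/bin/python": "aaa", "/usr/bin/perl": "bbb"}
host_03 = {"/usr/bin/python": "aaa", "/usr/bin/perl": "bbb", "/usr/bin/7z": "ccc"}
only_02, only_03, changed = diff_manifests(host_02, host_03)
print(only_02, only_03, changed)
```

Each host would build its own manifest locally (for example dumped to a file and copied over); only the comparison needs both.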
Updating the CVS mirror on staging-1.8-master as an easy way to remove THUNDERBIRD_2_0_0_13_... tags before a staging run using tb-moz18-staging-bootstrap.cfg.
After this we had to switch the CVS/Root for the en-US build to 
  cltbld@staging-1.8-master.build.mozilla.org:/builds/cvsmirror/cvsroot
and adjust the tb-moz18-staging-bootstrap.cfg to use /cygdrive/e/ for the two buildDir's.
We also needed to install perl's Config-General and switch the CVS/Root for the en-US tinderbox configs to 
  cltbld@staging-1.8-master.build.mozilla.org:/builds/cvsmirror/cvsroot
Fixed up the talkback binary blobs on staging-crazyhorse and staging-patrocles. As clones of the nightly boxes they wanted to upload to talkback-upload.m.o, which we don't want for staging. We now have two files
  staging-talkback-thunderbird-1.8-linux.tar.bz2
which uploads to staging-1.8-master, and the original
  production-talkback-thunderbird-1.8-linux.tar.bz2

Tinderbox expects to find talkback-thunderbird-1.8-linux.tar.bz2, so this is now a symlink to either file, currently the staging one. This will need to be changed when these slaves get swapped to prod.
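
The staging/production swap of that symlink could be scripted along these lines. This is a sketch only, assuming a POSIX filesystem; `point_symlink` is a hypothetical helper, and the example runs in a temporary directory with empty stand-in files rather than the real talkback blobs.

```python
import os
import tempfile

LINK_NAME = "talkback-thunderbird-1.8-linux.tar.bz2"
STAGING = "staging-talkback-thunderbird-1.8-linux.tar.bz2"
PRODUCTION = "production-talkback-thunderbird-1.8-linux.tar.bz2"


def point_symlink(link_path, target_name):
    """Repoint link_path at target_name, replacing any existing link.

    The new link is created under a temporary name and renamed over
    the old one, so the expected filename never goes missing.
    """
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(target_name, tmp_path)
    os.replace(tmp_path, link_path)


# Usage in a scratch directory with empty stand-in files.
with tempfile.TemporaryDirectory() as workdir:
    for name in (STAGING, PRODUCTION):
        open(os.path.join(workdir, name), "wb").close()
    link = os.path.join(workdir, LINK_NAME)

    point_symlink(link, STAGING)      # current state on the staging slaves
    first_target = os.readlink(link)

    point_symlink(link, PRODUCTION)   # what a swap to prod would do
    second_target = os.readlink(link)

print(first_target)
print(second_target)
```

The target is stored as a relative name, so the link keeps working if the containing directory is moved or cloned.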

Started another automation run (with appropriate uncommenting in the master.cfg and removal of tinderbox config files).
Committed in two steps as rev 1.6 and 1.7. This is needed because we last used tb-moz18-staging-bootstrap.cfg to build Tb2.0.0.x on the Firefox staging slaves (as a general does-automation-work test). Now that we're using clones of the Tb nightly/release slaves, the buildPlatform's in tb-moz18-bootstrap.cfg are correct, except that we want to use the same machine for en-US and l10n.
Attachment #327081 - Flags: review?(bhearsum)
Must've failed to remove the tinderbox config for linux l10n, because it failed to pull an updated tinder-config.pl for the changes in comment #10. Reached into the machine to fix that, updated the master config so that the buildFactory only does the repack steps, and forced a linux build.
Attachment #327081 - Flags: review?(bhearsum) → review+
Current status of 2.0.0.13 test run:

* Tag: OK (tagging fix already landed and in Ben's RELEASE_AUTOMATION_M10)
* Source: OK
* Build & Repack: OK
* l10n verify: OK (get expected changes in comments and line endings)
* Update generation: Running now

I put things I know need to change for a staging-->production switch at 
  http://wiki.mozilla.org/User:CF:Tb200xAutomation

Ben has addressed several already in bug 442243.
Another paranoid comparison: comparing the system files that Firefox links to.
Used otool to find out which libraries firefox-bin links to:
ca-245:MacOS bhearsum$ for i in firefox-bin *.dylib; do otool -L $i; done | awk '{print $1}' | grep -v executable_path | grep -v firefox-bin > libs.list

On both slaves (bm-xserve02 and bm-xserve05):
for i in `cat libs.list`; do openssl dgst -sha1 $i; done > sha1-libs.txt

I then diff'ed those two text files and they were identical. I *think* this means that we will not break binary extensions - no guarantees though ;).
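
The otool filtering step above can be approximated in Python as follows. This is a sketch, not the command that was run: `linked_system_libraries` is a hypothetical helper, the sample text is made up to illustrate the `otool -L` layout, and header lines are skipped by indentation instead of by grepping for firefox-bin.

```python
def linked_system_libraries(otool_output):
    """Extract library paths from `otool -L` text.

    Header lines ("name:") start in column 0 and are skipped;
    dependency lines are indented. Entries relative to
    @executable_path ship with the app, so they are dropped.
    """
    libraries = []
    for line in otool_output.splitlines():
        if not line[:1].isspace():
            continue
        fields = line.split()
        if not fields:
            continue
        path = fields[0]
        if path.startswith("@executable_path"):
            continue
        if path not in libraries:
            libraries.append(path)
    return libraries


# Made-up sample in the shape of `otool -L` output, for illustration only.
sample = (
    "firefox-bin:\n"
    "\t@executable_path/libxpcom.dylib (compatibility version 1.0.0, current version 1.0.0)\n"
    "\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.3.9)\n"
)

print(linked_system_libraries(sample))
```

The resulting paths are what would then be hashed on each slave and compared.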
* Update generation: Sort of OK. Cycle was green and it generated partials for 2.0.0.12-2.0.0.13 but the snippets use completes for both cases. This is fallout from building a 2.0.0.13 test release after the real 2.0.0.14 shipped.

* Update verify: All green, but obviously not testing the generated partials. They should be OK, since the host and code haven't changed.

* Stage: OK.

Next steps (as we discussed)

* clone to production machines (bug 442351)
* collapse snapshot on newly cloned patrocles, if it works ok then collapse on staging-patrocles
* the steps in 
    http://wiki.mozilla.org/User:CF:Tb200xAutomation
* do tb 2.0.0.15/6 release as appropriate
Assignee: nthomas → bhearsum
Depends on: 442351
Oh, and probably do a complete staging run for Tb2.0.0.15 to test updates properly.
In staging, we just switch between master.cfg and tb-master.cfg for testing Firefox and Thunderbird respectively. In production, this doesn't work since we use the Firefox Buildbot to drive the 1.8 builds. Therefore, we need to run the Thunderbird automation Buildbot in parallel. Main changes:
* Rename builders so they don't conflict with Firefox ones on MozillaRelease tinderbox
* Remove dep builders
* Change slave port, waterfall port
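
To illustrate why the renames and port changes matter when two masters run side by side, here is a small sketch of a conflict check. Everything in it is an assumption for illustration: `find_conflicts` is a hypothetical helper, the dictionaries are a simplified stand-in for a real master config, and the `tb_` builder names and all port numbers other than 8810 are invented.

```python
def find_conflicts(master_a, master_b):
    """List builder names and TCP ports that two master configs share."""
    conflicts = []
    shared_names = set(master_a["builders"]) & set(master_b["builders"])
    for name in sorted(shared_names):
        conflicts.append("builder name in use by both: %s" % name)
    ports_a = {master_a["slave_port"], master_a["web_port"]}
    ports_b = {master_b["slave_port"], master_b["web_port"]}
    for port in sorted(ports_a & ports_b):
        conflicts.append("port in use by both: %d" % port)
    return conflicts


# Hypothetical values; not taken from the real production configs.
firefox = {
    "builders": ["linux_build", "win32_build"],
    "slave_port": 9989,
    "web_port": 8810,
}
tb_before = {
    "builders": ["linux_build", "win32_build"],
    "slave_port": 9989,
    "web_port": 8810,
}
tb_after = {
    "builders": ["tb_linux_build", "tb_win32_build"],
    "slave_port": 9990,
    "web_port": 8910,
}

before = find_conflicts(firefox, tb_before)
after = find_conflicts(firefox, tb_after)
print(before)
print(after)
```

A shared port would stop the second master from starting at all, while a shared builder name would mix the two products' results in the same tinderbox column.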
Attachment #328022 - Flags: review?(ccooper)
Attached patch: the right patch (obsolete)
Attachment #328022 - Attachment is obsolete: true
Attachment #328025 - Flags: review?(ccooper)
Attachment #328022 - Flags: review?(ccooper)
Attachment #328025 - Flags: review?(ccooper) → review+
Attached patch: once more, with feeling (obsolete)
In my Friday-induced haze I forgot that the locks we use are useless, and I forgot to get all of the builders to use the new slaves.
Attachment #328025 - Attachment is obsolete: true
Attachment #328318 - Flags: review?(ccooper)
The clones appeared to be in the post-Nick's setup notes state but did *not* have a snapshot in them. Here's what I did to get them ready:

* On production-patrocles I fixed up the automatic login (which probably broke after our last password change).
* production-{crazyhorse,patrocles} have been fixed up with the right ssh keys, and are connected to the new master.
* I copied over the tbirdbld keys to bm-xserve05.
* On all of the slaves I made sure the tinderbox checkout was RELEASE_AUTOMATION_M10, and made sure the Tb-Mozilla1.8-{l10n,}-Release tree had the appropriate tinderbox-configs checked out.
* Made sure all slaves had stage.mozilla.org, dm-symbolpush01, and cvs.mozilla.org host keys accepted.

I think we're ready to try a 2.0.0.16 on automation once we have the "go".
Comment on attachment 328318
once more, with feeling

Urgh. Let me double check my assumptions here and think back - it may have been intentional that we only used the new slaves for builds...
Attachment #328318 - Flags: review?(ccooper)
OK. This is the last one, promise. I recalled in my conversations with Nick that we are going to use the Firefox slaves for non-build steps to avoid some extra setup on production-crazyhorse.

Sorry for all the iteration here.
Attachment #328318 - Attachment is obsolete: true
Attachment #328331 - Flags: review?(ccooper)
Attachment #328331 - Flags: review?(ccooper) → review+
Comment on attachment 328331
[checked in] use firefox slaves for non-build steps, still remove locks

Checking in tb-master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/automation/production/tb-master.cfg,v  <--  tb-master.cfg
new revision: 1.4; previous revision: 1.3
done
Attachment #328331 - Attachment description: use firefox slaves for non-build steps, still remove locks → [checked in] use firefox slaves for non-build steps, still remove locks
Given that we produced 2.0.0.16 on automation I'm going to call this FIXED :).
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering