Closed Bug 421411 Opened 13 years ago Closed 12 years ago

migrate firefox 1.9 en-US nightlies to release automation

Categories

(Release Engineering :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rhelmer, Assigned: catlee)

References

Details

Attachments

(6 files, 6 obsolete files)

7.79 KB, patch (bhearsum: review+)
704 bytes, patch (rhelmer: review+)
8.37 KB, patch (nthomas: review+)
5.50 KB, patch (nthomas: review+)
2.84 KB, patch
2.07 KB, patch (rhelmer: review+)
The goal here is to move the Firefox builders on the 1.9 tree over to the
release automation system (bug 410936). Initially, these machines will build in
a continuous cycle, although we should have the option of easily moving to only
building on checkin.
(In reply to comment #0)
> The goal here is to move the Firefox builders on the 1.9 tree over to the
> release automation system (bug 410936). Initially, these machines will build in
> a continuous cycle, although we should have the option of easily moving to only
> building on checkin.

Sorry, should be bug 401936 not 410936.

Going to take the same steps as used for the 1.8 tree (bug 417147)
Assignee: nobody → rhelmer
Status: NEW → ASSIGNED
Attachment #307835 - Flags: review?(bhearsum)
Attachment #307838 - Flags: review?(nrthomas)
Attached patch need to use Periodic scheduler (obsolete) — Splinter Review
Can't use BonsaiPoller here yet, need to use Periodic scheduler to mimic current Tinderbox behavior.
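To make the difference concrete, here is a minimal, self-contained Python sketch contrasting a timer-driven scheduler with a change-driven one. It is not Buildbot's API: `periodic_triggers`, `change_triggers` and the `tree_stable_timer` parameter are hypothetical names used only for illustration, and times are plain integers in seconds.

```python
def periodic_triggers(start, end, period):
    """Build start times for a timer-driven scheduler.

    One build starts every `period` seconds in [start, end), whether or not
    anything was checked in. This mimics a continuous build cycle.
    """
    return list(range(start, end, period))


def change_triggers(checkin_times, tree_stable_timer):
    """Build start times for a change-driven scheduler.

    A build starts `tree_stable_timer` seconds after a checkin, unless another
    checkin arrives before the timer expires, in which case the timer restarts.
    """
    triggers = []
    times = sorted(checkin_times)
    for i, t in enumerate(times):
        nxt = times[i + 1] if i + 1 < len(times) else None
        if nxt is None or nxt - t >= tree_stable_timer:
            triggers.append(t + tree_stable_timer)
    return triggers


if __name__ == "__main__":
    # A build every 20 minutes for an hour, with no checkins needed.
    print(periodic_triggers(0, 3600, 1200))
    # Three checkins; the first two are close together and share one build.
    print(change_triggers([100, 150, 2000], 300))
```

With no checkins at all, the change-driven function returns an empty list while the periodic one keeps building, which is the Tinderbox-like behavior wanted here.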
Attachment #307839 - Flags: review?(bhearsum)
Comment on attachment 307835 [details] [diff] [review]
buildbot config for 1.9 nightlies 

This looks fine but I'm wondering if we need to add locks to the release automation builders. The 'rm -rfv /builds/configs' worries me; what happens if this builder does that while release automation is running TinderConfig? We may want to lock there simply because doing two builds at the same time would strain the hardware.

Doesn't look like we're locking on 1.8, either. TinderConfig is pretty quick so the chances of it happening are low but I think we should do it anyways. What do you think?
Also: did you mean to obsolete the first attachment?
(In reply to comment #4)
> Created an attachment (id=307838) [details]
> staging and production configs for 1.9 nightly bootstrap

So what's the first step here? They report to MozillaRelease and don't publish to stage.m.o? I see several differences I didn't expect when diffing against fx-moz18-nightly-bootstrap.cfg (e.g. milestone, from, maybe sshUser/sshServer/BuildTree).
(In reply to comment #6)
> (From update of attachment 307835 [details] [diff] [review])
> This looks fine but I'm wondering if we need to add locks to the release
> automation builders. The 'rm -rfv /builds/configs' worries me; what happens if
> this builder does that while release automation is running TinderConfig? We may
> want to lock there simply because doing two builds at the same time would
> strain the hardware.
> 
> Doesn't look like we're locking on 1.8, either. TinderConfig is pretty quick so
> the chances of it happening are low but I think we should do it anyways. What
> do you think?
> 

I think we should lock everything. When dealing with release stuff, I'd rather start from the position that we lock everything and only explicitly allow shared buildslaves.

We probably need to do a sweep through all the 1.9/1.8 configs and make sure it's set up the way we want.
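As a rough illustration of why a per-slave lock matters, the sketch below uses a plain `threading.Lock` to stand in for a buildslave lock. It is not Buildbot code: the `SharedSlave` class and `run_builders` helper are hypothetical, and a "build" is just a short sleep.

```python
import threading
import time


class SharedSlave:
    """A buildslave shared by several builders (illustrative only)."""

    def __init__(self):
        self.lock = threading.Lock()           # stands in for a per-slave lock
        self._guard = threading.Lock()         # protects the counters below
        self.active = 0                        # builds running right now
        self.max_active = 0                    # highest concurrency observed
        self.log = []                          # builder names, in start order

    def run_build(self, name, duration=0.01, use_lock=True):
        if use_lock:
            with self.lock:                    # only one builder at a time
                self._build(name, duration)
        else:
            self._build(name, duration)        # builders may overlap

    def _build(self, name, duration):
        with self._guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            self.log.append(name)
        time.sleep(duration)                   # pretend to do the build
        with self._guard:
            self.active -= 1


def run_builders(slave, names, use_lock=True):
    """Start one thread per builder and return the peak concurrency."""
    threads = [
        threading.Thread(target=slave.run_build, args=(n,),
                         kwargs={"use_lock": use_lock})
        for n in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return slave.max_active


if __name__ == "__main__":
    print(run_builders(SharedSlave(), ["nightly", "tag"]))
```

With the lock held around every build, the peak concurrency on the slave is always 1, so a nightly builder cannot wipe a directory while a release step is using it. Without the lock the two builds can overlap.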
Attachment #307835 - Attachment is obsolete: true
Attachment #307835 - Flags: review?(bhearsum)
(In reply to comment #8)
> (In reply to comment #4)
> > Created an attachment (id=307838) [details] [details]
> > staging and production configs for 1.9 nightly bootstrap
> 
> So what's the first step here ? They report to MozillaRelease and don't publish
> to stage.m.o ? I see a several differences I didn't expect when diffing against
> fx-moz18-nightly-bootstrap.cfg (eg milestone, from, maybe
> sshUser/sshServer/BuildTree).

Hmm, for production the tree should be Firefox actually. I didn't compare these against the 1.8 configs, I'll do that and post a followup to keep them in better sync. 

Diffs against each other and against 1.8 versions should be clearer now, I hope!
BuildTree should be Firefox-Staging and Firefox, I think.
Attachment #307838 - Attachment is obsolete: true
Attachment #308065 - Flags: review?(nrthomas)
Attachment #307838 - Flags: review?(nrthomas)
Attachment #307839 - Attachment is obsolete: true
Attachment #308067 - Flags: review?(bhearsum)
Attachment #307839 - Flags: review?(bhearsum)
Whiteboard: waiting for review
Comment on attachment 308067 [details] [diff] [review]
add locks around all builders w/ shared buildslaves

Did you attach the wrong patch? This one seems to be the Periodic scheduler one.
(In reply to comment #11)
> Created an attachment (id=308065) [details]
> sync with 1.8 configs and each other
> 
> Diffs against each other and against 1.8 versions should be clearer now, I
> hope!
> BuildTree should be Firefox-Staging and Firefox, I think.

Attached the wrong patch here ? This one's a buildbot config instead of tinderbox.


Yep, wrong patch attached, sorry about that!
Attachment #308065 - Attachment is obsolete: true
Attachment #308412 - Flags: review?(nrthomas)
Attachment #308065 - Flags: review?(nrthomas)
Attachment #308067 - Attachment is obsolete: true
Attachment #308067 - Flags: review?(bhearsum)
The last patch did not include the production master.cfg, sorry about that.
Attachment #308433 - Flags: review?(bhearsum)
Comment on attachment 308433 [details] [diff] [review]
[checked in] add locks, carry forward Periodic scheduler change

Looks good...
Attachment #308433 - Flags: review?(bhearsum) → review+
Comment on attachment 308433 [details] [diff] [review]
[checked in] add locks, carry forward Periodic scheduler change

Checking in production-1.9/master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/automation/production-1.9/master.cfg,v  <--  master.cfg
new revision: 1.17; previous revision: 1.16
done
Checking in staging-1.9/master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/automation/staging-1.9/master.cfg,v  <--  master.cfg
new revision: 1.29; previous revision: 1.28
done
Attachment #308433 - Attachment description: add locks, carry forward Periodic scheduler change → [checked in] add locks, carry forward Periodic scheduler change
I updated the master.cfg on production-1.9-master and got an error about locks; turns out the linux_lock and win32_lock were missing. This patch adds them in. I've commented out the dep builders for now, I don't know if you wanted them turned on.
Attachment #308613 - Flags: review?(rhelmer)
Attachment #308613 - Flags: review?(rhelmer) → review+
Comment on attachment 308613 [details] [diff] [review]
[checked in] add linux_lock and win32_lock

Checking in production-1.9/master.cfg;
/cvsroot/mozilla/tools/buildbot-configs/automation/production-1.9/master.cfg,v  <--  master.cfg
new revision: 1.19; previous revision: 1.18
done
Attachment #308613 - Attachment description: add linux_lock and win32_lock → [checked in] add linux_lock and win32_lock
Comment on attachment 308412 [details] [diff] [review]
sync with 1.8 configs and each other

>Index: fx-moz18-nightly-staging-bootstrap.cfg
>===================================================================
>+bootstrapTag     = RELEASE_AUTOMATION_M7_2

Ben tagged _M8 for Fx20013. This is for tinderbox being updated by buildbot ?

>Index: fx-moz19-nightly-bootstrap.cfg
>===================================================================
>+mozillaCvsroot  = cltbld@cvs.mozilla.org:/cvsroot
>+l10nCvsroot     = cltbld@cvs.mozilla.org:/l10n
>+mofoCvsroot     = cltbld@cvs.mozilla.org:/mofo
>+anonCvsroot	    = :pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot

Dunno how easy you'd want to make diffing between configs, but it's :ext:cltbld... in the equivalent 1.8 config.

> from            = staging-bootstrap@mozilla.org

Should be just bootstrap@mozilla.org

> # where QA updates/builds go
> stagingUser     = cltbld
>-stagingServer   = staging-1.9-master.build.mozilla.org
>+stagingServer   = production-1.9-master.build.mozilla.org

The 1.8 nightly config has stage.m.o for stagingServer, but I don't think it's used for nightlies.

>+externalStagingServer   = stage.mozilla.org

1.8 puts ftp.m.o

> # username and server to push builds
> sshUser         = cltbld

Please use ffxbld here

>+sshServer       = production-1.9-master.build.mozilla.org

This is the one that gets into the tinderbox-config and needs to be stage.m.o

>+testsPhoneHome   = 1

Don't think you can do this when first introducing the builds to the Firefox tree, because of the hard coding of $GraphNameOverride in the 3 tinder-config.pl

>Index: fx-moz19-nightly-staging-bootstrap.cfg
>===================================================================

The diff of this file and fx-moz19-nightly-bootstrap.cfg looks nice and clean, but my helpful suggestions above will screw that up for ya.

>+buildTree       = Firefox-Staging

This guy doesn't exist yet. Going to create it or meant something else ?
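Checks like the ones in this review could be scripted. The following is a minimal sketch, assuming the bootstrap config is a flat list of `key = value` lines with `#` comments, as the diff excerpts above suggest. The `parse_bootstrap_cfg` and `lint` helpers and the `EXPECTED` table are hypothetical; the expected values are simply the ones requested in this review for the production config.

```python
def parse_bootstrap_cfg(text):
    """Parse 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Values requested in the review above (illustrative, production config only).
EXPECTED = {
    "from": "bootstrap@mozilla.org",
    "sshUser": "ffxbld",
    "sshServer": "stage.mozilla.org",
}


def lint(settings, expected=EXPECTED):
    """Return one message per setting that differs from the expected value."""
    problems = []
    for key, want in sorted(expected.items()):
        got = settings.get(key)
        if got != want:
            problems.append("%s: expected %r, found %r" % (key, want, got))
    return problems


if __name__ == "__main__":
    sample = """
    from            = staging-bootstrap@mozilla.org
    # username and server to push builds
    sshUser         = cltbld
    sshServer       = production-1.9-master.build.mozilla.org
    """
    for problem in lint(parse_bootstrap_cfg(sample)):
        print(problem)
```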
Attachment #308412 - Flags: review?(nrthomas) → review-
Priority: -- → P2
(In reply to comment #21)
> >+buildTree       = Firefox-Staging
> 
> This guy doesn't exist yet. Going to create it or meant something else ?

Planning on creating it. Patch incoming for the other stuff :) 

Attachment #308412 - Attachment is obsolete: true
Attachment #308747 - Flags: review?(nrthomas)
Comment on attachment 308747 [details] [diff] [review]
[checked in] bootstrap configs for 1.9 nightlies take 2

Looks good, r+ with one issue to fix below.

>Index: fx-moz19-nightly-staging-bootstrap.cfg
>===================================================================
>+# symbol server variables
>+symbolServer     = staging-1.9-master.build.mozilla.org
>+symbolServerUser = cltbld
>+symbolServerPath = /data/symbols
>+win32_symbolServerKey  = /c/Documents and Settings/cltbld/.ssh/ffxbld_dsa
>+linux_symbolServerKey  = /home/cltbld/.ssh/ffxbld_dsa
>+macosx_symbolServerKey  = /Users/cltbld/.ssh/ffxbld_dsa

I think this will die, because the key doesn't match the account. The ownership of /data/symbols is cltbld, but I think the staging boxes don't have that key. There's a ffxbld user on the master you could also use.
Attachment #308747 - Flags: review?(nrthomas) → review+
Whiteboard: waiting for review
Comment on attachment 308747 [details] [diff] [review]
[checked in] bootstrap configs for 1.9 nightlies take 2

Thanks.. I'll try to figure out how to get the symbol stuff set right on staging.
Attachment #308747 - Attachment description: bootstrap configs for 1.9 nightlies take 2 → [checked in] bootstrap configs for 1.9 nightlies take 2
Whiteboard: testing on staging
This seems ok on staging, but it appears that locking slaves doesn't always work. For instance, Tag right now is supposed to get linux_lock, but I see on staging the nightly builder and tag builder running at the same time :/

I'll see if I can get a simpler test case for this.

Besides this (which is also a problem on 1.8 branch), I think that we're pretty much ready to go. I don't like the way we needed to manually stop the nightly builders for the .13 release, so I'd like to try to get the above figured out before we start trying to roll this out to production.
Bootstrap should handle:

* use test tag for configs
* set BuildTree to MozillaTest

Set in the tinder-config.pl files themselves (to be committed to "test" branch):

* turn off TestsPhoneHome
* publish builds to "nightly/experimental"
* turn off update_pushinfo
Attachment #310282 - Flags: review?(nrthomas)
Whiteboard: testing on staging → preparing production rollout
Comment on attachment 310282 [details] [diff] [review]
turn on 1.9 nightlies in parallel to current builders

>Index: tinderbox-configs/firefox/linux/tinder-config.pl
>===================================================================

These look fine, except the build_hour also changed for 1.8 (to avoid symbol collisions etc)

>Index: release/configs/fx-moz19-nightly-bootstrap.cfg
>===================================================================
>RCS file: /cvsroot/mozilla/tools/release/configs/fx-moz19-nightly-bootstrap.cfg,v
>retrieving revision 1.9
>diff -u -r1.9 fx-moz19-nightly-bootstrap.cfg
>--- release/configs/fx-moz19-nightly-bootstrap.cfg	14 Mar 2008 12:37:05 -0000	1.9
>+++ release/configs/fx-moz19-nightly-bootstrap.cfg	18 Mar 2008 17:25:59 -0000
>@@ -2,7 +2,7 @@
> milestone       = nightly
> # _RCn and _RELEASE will be appended as-needed
> # not used by nightly
>-productTag      = FIREFOX_3_0b5pre
>+productTag      = test
> # Branch name and pull dates to use for base tag
> branchTag       = HEAD

I think you need to change branchTag here, not productTag, so that TinderConfig does the right thing.

symbolServer can also be 'dm-symbolpush01.mozilla.org', now that's verified to work OK.

>-buildTree       = Firefox
>+buildTree       = MozillaTest

r+ with those changes.
Attachment #310282 - Flags: review?(nrthomas) → review+
Checking in release/configs/fx-moz19-nightly-bootstrap.cfg;
/cvsroot/mozilla/tools/release/configs/fx-moz19-nightly-bootstrap.cfg,v  <--  fx-moz19-nightly-bootstrap.cfg
new revision: 1.10; previous revision: 1.9
done
Checking in tinderbox-configs/firefox/linux/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/linux/tinder-config.pl,v  <--  tinder-config.pl
new revision: 1.25.2.1; previous revision: 1.25
done
Checking in tinderbox-configs/firefox/macosx/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/macosx/tinder-config.pl,v  <--  tinder-config.pl
new revision: 1.42.2.1; previous revision: 1.42
done
Checking in tinderbox-configs/firefox/win32/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/win32/tinder-config.pl,v  <--  tinder-config.pl
new revision: 1.32.2.1; previous revision: 1.32
done
Ok, I am going to set up the nightly Tinderbox directories on production and start testing this. Once it's working fine, we just need to switch the BuildTree over to Firefox and run it for a week or two.
Component: Build & Release → Release Engineering
QA Contact: build → release
fx-win32-1.9-slave2's hostname is still win2k3-ref-img, which is what will show up in Tinderbox, so we should change this :)

I changed it to fx-win32-19-s2, as dots are not allowed in windows hostnames, and "slave2" brings it over the character limit for windows hostnames.

This needs a reboot to take effect. Once this has rebooted and everything is green on MozillaTest, let's start publishing the results to Firefox tree.
Current progress:

all three builders are reporting to MozillaTest under the names:
* fx-win32-19-s2
* bm-xserve10
* fx-linux-1

Note the last is because there's a dot in the hostname "fx-linux-1.9-slave2", shouldn't do that :) We'll need to change this to get Tinderbox to report under a more reasonable name. The win32 name is truncated to fit into the allowed length for Windows hostnames.
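The two naming problems above can be checked up front. This is a small sketch under two assumptions: that the name shown is everything before the first dot in the hostname, and that the Windows limit referred to is the 15-character NetBIOS computer name limit. The helper names are hypothetical.

```python
NETBIOS_MAX_LENGTH = 15  # assumed limit for Windows computer names


def short_hostname(hostname):
    """Return the part of the hostname before the first dot."""
    return hostname.split(".", 1)[0]


def hostname_problems(hostname, windows=False):
    """List reasons a hostname would report badly or be rejected."""
    problems = []
    if "." in hostname:
        problems.append(
            "contains a dot; short name becomes %r" % short_hostname(hostname)
        )
    if windows and len(hostname) > NETBIOS_MAX_LENGTH:
        problems.append("longer than %d characters" % NETBIOS_MAX_LENGTH)
    return problems


if __name__ == "__main__":
    for name, is_windows in [
        ("fx-linux-1.9-slave2", False),
        ("fx-win32-19-slave2", True),
        ("fx-win32-19-s2", True),
    ]:
        print(name, hostname_problems(name, windows=is_windows))
```

Under these assumptions, `fx-linux-1.9-slave2` shortens to `fx-linux-1`, `fx-win32-19-slave2` is too long for Windows, and `fx-win32-19-s2` passes both checks.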

I had to do "xhost +" on fx-linux-1.9-slave2 in order to get the tests to run, we should set up X so this works (note that it's been a problem for releases in the past as well), because we're starting from the shell and not e.g. VNC'd into the X session.

I'll change the linux hostname tomorrow, and I think we're basically ready to start publishing to Firefox tree. I had to do a local patch to bootstrap.cfg which I'll post up here later too (need to set whether tests are enabled or not).

I'd like to compare the compile times for these machines against the existing nightlies. A couple of quick comparisons showed these as being slower, so if that's true it's something we'll want to fix before we disable the current nightlies.
We need to either figure out how to have multiple buildslaves shared fairly across different builders, or have separate nightly/release buildslaves. Until that happens, we need to shut these down for the 3.0b5 release. Lowering priority.
Priority: P2 → P3
Whiteboard: preparing production rollout → waiting for 3.0b5 release
Attachment #310300 - Attachment is patch: true
Attachment #310300 - Attachment mime type: application/octet-stream → text/plain
I turned these nightly builders back on now that the slaves are finished with 3.0b5rc2, and the Linux build was complaining about not being able to substitute runMozillaTests. This patch adds that to the bootstrap.cfg, and tweaks the tinder-config.pl for win32 & mac (the extra space means they didn't break).
Attachment #312289 - Flags: review?(rhelmer)
Attachment #312289 - Flags: review?(rhelmer) → review+
Comment on attachment 312289 [details] [diff] [review]
[checked in] fix TinderConfig

Checking in firefox/macosx/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/macosx/tinder-config.pl,v  <--  tinder-config.pl
new revision: 1.42.2.3; previous revision: 1.42.2.2
done
Checking in firefox/win32/tinder-config.pl;
/cvsroot/mozilla/tools/tinderbox-configs/firefox/win32/tinder-config.pl,v  <--  tinder-config.pl
new revision: 1.32.2.3; previous revision: 1.32.2.2
done
Checking in fx-moz19-nightly-bootstrap.cfg;
/cvsroot/mozilla/tools/release/configs/fx-moz19-nightly-bootstrap.cfg,v  <--  fx-moz19-nightly-bootstrap.cfg
new revision: 1.11; previous revision: 1.10
done

& production-1.9-master updated.
Attachment #312289 - Attachment description: fix TinderConfig → [checked in] fix TinderConfig
Assignee: robert → joduinn
Status: ASSIGNED → NEW
Component: Release Engineering: Talos → Release Engineering
Depends on: 419071
Whiteboard: waiting for 3.0b5 release
Assignee: joduinn → armenzg
Blocks: 439778
Priority: P3 → P2
Status: NEW → ASSIGNED
Currently, we don't have dependent/nightly builds under the 1.9 buildbot master because of this commented-out block in the master.cfg (not as in source code):
#depBuildFactory.addStep(ShellCommand,
# description='TinderConfig',
# workdir='build',
# command=['perl', './release', '-o', 'TinderConfig'], 
# timeout=36000,
# haltOnFailure=1,
# env={'CVS_RSH': 'ssh'},
#)

We only have one slave per platform and not two slaves as we have in production 1.8 and staging 1.8. Therefore, we would have to stop dependent/nightly builds to do releases.

More VMs to be added?
(In reply to comment #36)
> Currently, we don't have dependent/nightly builds under 1.9 buildbot master
> because this comment on the master.cfg (not as in source code):
> #depBuildFactory.addStep(ShellCommand,
> # description='TinderConfig',
> # workdir='build',
> # command=['perl', './release', '-o', 'TinderConfig'], 
> # timeout=36000,
> # haltOnFailure=1,
> # env={'CVS_RSH': 'ssh'},
> #)
> 
> We only have one slave per platform and not two slaves as we have in production
> 1.8 and staging 1.8. Therefore, we would have to stop dependent/nightly builds
> to do releases.
> 
> More VMs to be added?

Yes, as far as I know everything is tested and ready, but having to stop coverage while we do release builds is not nice to developers.
Blocks: 444957
Depends on: 446038
Priority: P2 → P3
Summary: migrate firefox 1.9 nightlies to release automation → migrate firefox 1.9 en-US nightlies to release automation
I haven't had time to work on this, putting back into the pool
Assignee: armenzg → nobody
Priority: P3 → --
What's left to do with this?
Assignee: nobody → joduinn
We already have l10n nightlies done this way for 1.9, right? What's left to do in this bug?
This bug is not related to l10n. This is for the en-US nightly builds that are done on 3 dedicated machines driven by tbox.
According to http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox3.0 , we have these 6 machines:

fx-linux-tbox
fxdbug-linux-tbox
fx-win32-tbox
fxdbug-win32-tbox
bm-xserve08
bm-xserve11

...producing dep, dep-debug and nightly builds, for en-US. 

Note, however, that the l10n nightly repacks/builds, and all the release builds, are being produced in our tiny 1.9 pool of build slaves (currently only a pool of one: fx-*-1.9-slave2!). Once bug 462515 and bug 446038 are resolved, we should have enough production 1.9 slaves to also handle these dep and nightly en-US builds. We could then shut down these 6 machines.
Depends on: 462515
Priority: -- → P3
Assignee: joduinn → catlee
Priority: P3 → P2
We're going to leave the 1.9 en-US nightlies on tbox.  l10n repacks and all the release builds will be done via production-1.9-master.

John, Ben and I agreed over the phone, and in an email thread ("sorting out various 1.9 machines") that we wouldn't worry about:

- doing en-us nightlies on different machines to en-us releases
- having one of these dedicated (tbox) machines die between now and EOL for
FF3.0. Belief is that we can quickly rebuild if this happens, so not
worth the effort.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Maybe this should be WontFix->Fixed, as there were 5 checkins...
Product: mozilla.org → Release Engineering