Closed
Bug 429427
Opened 16 years ago
Closed 15 years ago
Redesign linux refimage VM so no additional manual setup needed
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: joduinn, Unassigned)
References
Details
Attachments
(3 files, 16 obsolete files)
4.60 KB, text/plain | catlee: review+ | bhearsum: checked-in+
7.97 KB, patch | catlee: review+ | bhearsum: checked-in+
2.49 KB, text/plain | catlee: review+ | bhearsum: checked-in+
To set up a new Linux VM, we start by cloning an existing refimage VM, which gives us a basic set of OS+toolchain installs. However, we then need to follow the manual instructions at http://wiki.mozilla.org/ReferencePlatforms or http://wiki.mozilla.org/BuildbotTestfarm to install the rest of the toolchain software not included in the refimage VM. All this takes a long time and can be tricky. As we set up more and more machines, this overhead is becoming a problem.

Let's eliminate these manual post-install steps:

1) For software that needs configuring, write a script, run as root on first boot, which fills in details like machine/host name, SMTP configuration, etc.

2) For software that changes rapidly, whatever version is on the refimage is likely to be out of date by the time we get to clone new VMs. Instead, let's design the refimage to contain enough to get started, and include in the refimage a script that checks for updates on first launch and refreshes forward to the latest available at that time. One way of doing this would be to pull specific tagged versions of buildbot from CVS, for example; another would be to pull a tagged version of a text file containing URLs to download with wget. There are probably other ways to do this.

3) After the VM is created with bang-up-to-date versions of software, it will have to keep checking and refreshing forward, or else it will drift out of date. We need a periodic check to verify that the VM is still running the right versions of the toolchain, and to refresh forward if not. This would let us know that all slaves are always in sync with each other.

Open question: how frequently should we be rechecking and refreshing? Once a day seems a reasonable start, but that's just a swag.
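
The periodic check in point 3 could be sketched roughly as below. This is only an illustration, assuming a hypothetical manifest that maps tool names to the wanted version; the function name `find_outdated` and the dictionary layout are made up for this example and do not come from any script attached to this bug.

```python
def find_outdated(wanted, installed):
    """Return (tool, installed_version, wanted_version) for every mismatch.

    `wanted` and `installed` are hypothetical dicts of tool name -> version
    string. A tool missing from `installed` is reported with None.
    """
    outdated = []
    for tool, version in sorted(wanted.items()):
        current = installed.get(tool)
        if current != version:
            outdated.append((tool, current, version))
    return outdated
```

A cron job (or similar) could call something like this once a day and trigger a refresh only for the tools it returns, which keeps the check cheap when nothing has changed.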
Comment 1•16 years ago
I'll take this while doing some buildbot setup on the new moz2 unittest master. I'm writing some scripts to pull and install the files automagically.
Assignee: nobody → rcampbell
Updated•16 years ago
Priority: -- → P2
Comment 2•16 years ago
WIP: a set of scripts that run through installation of the latest versions of python, zope-interface, twisted and buildbot. The scripts download, install, then clean up after themselves. An additional script to configure the user's profile upon successful completion is forthcoming.
Comment 3•16 years ago
tweaked and hopefully improved version of setup scripts.
Attachment #316279 -
Attachment is obsolete: true
Comment 4•16 years ago
TBD: parametrization of installables.
Comment 5•16 years ago
This set of scripts works for the downloadable centos-5-ref-tools VM, and it adds:
1. making directories for logging in grab_files.sh
2. exporting paths to cltbld .bash_profile in post_install.sh
3. using /tools/dist for downloading of packages
Now to look at parametrization and incorporating a CVS file or some way to auto-update.
Comment 6•16 years ago
Attachment #321286 -
Attachment is obsolete: true
Updated•16 years ago
Assignee: rcampbell → lukasblakk
Comment 7•16 years ago
The reason your image is much larger is that there are a whole bunch of 50k files with "._" prefixes on them, probably Mac resource forks.
Comment 8•16 years ago
I ran each of the steps listed in install.sh individually and each seemed to run through without a hitch. The only issues were:
1) cleanup not removing previous versions of python, twisted(-core) and zope-interface.
2) .bash_profile options not picked up on login. Moved contents to .bashrc.
Otherwise, this seems good to go.
Comment 9•16 years ago
also, should add mercurial to these.
Comment 10•16 years ago
whups, ignore that last comment. Mercurial's already been included in the refplatform image.
Comment 11•16 years ago
Did you run them as root and then su - cltbld? That's how I was running them (which means maybe I should have a readme.txt in there), and when I did the user switch, cltbld had the correct paths.
Comment 12•16 years ago
Ignore that last comment, just did some testing and saw the error of my ways. A readme.txt is probably still a good idea though...
Comment 13•16 years ago
So now there's a little ReadMe.txt about running as root, the PATH is exported to .bashrc instead of .bash_profile and the cleanup script should delete the old versions of twisted/zope/python in /tools
Attachment #316454 -
Attachment is obsolete: true
Attachment #321295 -
Attachment is obsolete: true
Comment 14•16 years ago
Line 10 in install_buildboot.sh should create another level of folder:

  mkdir buildbotcustom
+ cd buildbotcustom
  cvs -d:pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot co -d buildbotcustom mozilla/tools/buildbotcustom

This fixes the situation where $PYTHONPATH=/tools/buildbotcustom, which did not allow "import buildbotcustom". To fix it I had to "export PYTHONPATH=/tools:$PYTHONPATH".

NOTE: I believe this script would actually be better as one file, which would allow you to comment it.
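
The import failure above comes from how Python resolves packages: the directory that *contains* the package must be on the search path, not the package directory itself. The sketch below reproduces this in a temporary directory; the helper names `make_layout` and `can_import` are invented for the demonstration, and it assumes no other `buildbotcustom` is already installed on the machine running it.

```python
import importlib
import os
import sys
import tempfile


def make_layout(root):
    """Create root/buildbotcustom/__init__.py and return the package directory."""
    pkg_dir = os.path.join(root, "buildbotcustom")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("")
    return pkg_dir


def can_import(package, path_entry):
    """Return True if `package` imports with `path_entry` prepended to sys.path."""
    saved_path = list(sys.path)
    sys.modules.pop(package, None)
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(package)
        return True
    except ImportError:
        return False
    finally:
        # Restore the interpreter state so the check has no side effects.
        sys.path[:] = saved_path
        sys.modules.pop(package, None)


root = tempfile.mkdtemp()            # plays the role of /tools
pkg_dir = make_layout(root)          # plays the role of /tools/buildbotcustom
inner_works = can_import("buildbotcustom", pkg_dir)   # like PYTHONPATH=/tools/buildbotcustom
parent_works = can_import("buildbotcustom", root)     # like PYTHONPATH=/tools
```

With the package directory itself on the path the import fails, while putting its parent on the path succeeds, which matches the PYTHONPATH=/tools fix described above.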
Comment 15•16 years ago
I used it for my local purposes. It might be good to put a lot of things into variables, like the PROFILE variable I use. It might also be interesting to have another script to "update", since this one is for a first-time run.
Updated•16 years ago
Attachment #323439 -
Attachment mime type: application/octet-stream → text/plain
Comment 16•16 years ago
For some reason I was removing twisted-core-2.4.0 from the /tools dir and that broke the symlink and also removed twisted-core which we kind of need. Also, I made some changes to the ReadMe regarding cvs key issues that arose when trying to use these scripts on the build machines.
Attachment #323176 -
Attachment is obsolete: true
Reporter | ||
Comment 17•16 years ago
Putting back in the pool after triage.
Assignee: lukasblakk → nobody
Component: Release Engineering → Release Engineering: Future
Priority: P2 → P3
Comment 18•16 years ago
Still tweaking on this - added more scripts for mercurial, nagios and autoconf so this can be used for moz2. Also cleaned up some of cleanup.sh - fixed path settings, improved buildbot installation. At this point - run it as root - be sure to have copied ssh keys into ~/.ssh for root and cltbld
Attachment #323439 -
Attachment is obsolete: true
Attachment #326069 -
Attachment is obsolete: true
Comment 19•16 years ago
Er. I'm a little confused. This bug is about automating the post-install setup of the ref platform, right? Nagios, Mercurial, and Autoconf come standard with the ref platform...
Comment 20•16 years ago
This time the PYTHONHOME and PATH are exported in the zope and twisted install scripts because they are not set in the root user's profile. The most important thing to know about running this automated script is that you must have cvs keys in your root .ssh dir. Otherwise buildbot will not check out and thus, not install. This information is in the README as well.
Attachment #329016 -
Attachment is obsolete: true
Comment 21•16 years ago
So, if we were to put the stgbld keys on our ref image (both root and cltbld) - then these scripts would remove the manual steps.
Comment 22•15 years ago
Comment on attachment 332960 [details]
Automated install script for update/new cltbld setup
this is definitely obsolete now
Attachment #332960 -
Attachment is obsolete: true
Comment 23•15 years ago
Comment on attachment 341934 [details]
Automated Install Scripts with stgbld keys
this is definitely obsolete now
Attachment #341934 -
Attachment is obsolete: true
Updated•15 years ago
Assignee: nobody → bhearsum
Comment 24•15 years ago
Catlee, this is very similar to what I showed you last week, with the fixes you suggested and some others. I think this will be a good starting point for when we want to extend it to talk to a web service or some other way of autobalancing. get_default_options is kind of messy, but I'm not sure how to improve it.

I diff'ed the generated buildbot.tac against the existing one on linux-slave03. The only differences were related to the fact that the existing .tac on linux-slave03 was generated a long time ago, before a lot of log handling code was added. When diff'ed against a newer production slave, the only differences aside from ordering of the options and whitespace were the buildmaster_host and slavename values, for obvious reasons.
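
A comparison that ignores option ordering and whitespace, as described above, could look something like the following. This is a rough sketch and not the method actually used in this bug: it assumes the options appear as simple `name = value` lines, and the helpers `parse_options` and `differing_options` are hypothetical.

```python
def parse_options(tac_text):
    """Collect simple `name = value` assignments from tac-like text."""
    options = {}
    for line in tac_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.isidentifier():
            options[key] = value.strip()
    return options


def differing_options(old_tac, new_tac):
    """Return the sorted names of options whose values differ or are missing."""
    old, new = parse_options(old_tac), parse_options(new_tac)
    return sorted(k for k in set(old) | set(new) if old.get(k) != new.get(k))
```

Run against two tac files from different slaves, a helper like this would be expected to report only per-machine values such as buildmaster_host and slavename.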
Attachment #410018 -
Flags: review?(catlee)
Comment 25•15 years ago
This script should be able to handle both CentOS-style init (chkconfig) and being launched from launchd. The start and stop handling is a little strange here, but it seems to work (I copied it from the 'firstboot' service on the ref platform).

I tested this a couple of times on moz2-linux-slave03 and it works as intended: if /builds/slave/buildbot.tac doesn't exist and /etc/sysconfig/buildbot-tac doesn't have 'RUN=YES' in it, it does nothing. It also bails early if the current hostname is in IGNORE_HOSTS. Otherwise, it calls out to buildbot-tac.py, which intelligently generates the tac file based on the hostname.

Note that the password is in this script, as I'm intending it to live in the private puppet-files repository.

Still to do: Puppet manifest updates to maintain a /tools checkout and deploy this script.
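
One reading of the start-up decision described above, written out as a small function. This is a sketch rather than the attached init script: the function name `tac_action` and its parameters are invented here, and the rule about never overwriting an existing buildbot.tac is taken from a later comment in this bug, not from this version of the script.

```python
def tac_action(hostname, tac_exists, control_text, ignore_hosts):
    """Return "skip" or "generate" for one run of the buildbot-tac service.

    hostname      -- current machine name
    tac_exists    -- whether buildbot.tac is already present
    control_text  -- contents of the control file (e.g. /etc/sysconfig/buildbot-tac)
    ignore_hosts  -- collection of hostnames that must never be touched
    """
    if hostname in ignore_hosts:
        return "skip"        # host is explicitly excluded
    if tac_exists:
        return "skip"        # assumption: never overwrite an existing tac file
    if "RUN=YES" not in control_text.split():
        return "skip"        # control file has not enabled generation
    return "generate"
```

Keeping the decision separate from the side effects (writing the file, chown, updating the control file) makes each branch easy to test on its own.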
Attachment #410023 -
Flags: review?(catlee)
Comment 26•15 years ago
Sorry, forgot to :w before submitting
Attachment #410023 -
Attachment is obsolete: true
Attachment #410025 -
Flags: review?(catlee)
Attachment #410023 -
Flags: review?(catlee)
Updated•15 years ago
Attachment #410025 -
Attachment mime type: application/octet-stream → text/plain
Comment 27•15 years ago
Comment on attachment 410018 [details]
buildbot tac generator
A teeny tiny nit:
add a \ after the initial triple quote of your header and footer to prevent extra newlines in the buildbot.tac file.
Looks good otherwise.
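
For readers unfamiliar with the nit: in Python, a triple-quoted string that starts on the line after the opening quotes begins with a newline, and a trailing backslash right after the opening quotes suppresses it. The constant names below are illustrative only and are not taken from buildbot-tac.py.

```python
# Without the backslash, the string starts with an empty line.
HEADER_WITH_GAP = """
# header line
"""

# The backslash joins the opening quotes to the next line, so no leading newline.
HEADER_TIGHT = """\
# header line
"""
```

The same trick on a footer removes the blank line that would otherwise separate it from whatever is written before it.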
Attachment #410018 -
Flags: review?(catlee) → review+
Comment 28•15 years ago
Comment on attachment 410025 [details]
buildbot-tac service with chown
How hard would it be to add some detection and handling for stale lock files? Or does the 'stop' action handle this case when the machine is shutdown/rebooted?
Comment 29•15 years ago
(In reply to comment #27)
> (From update of attachment 410018 [details])
> A teeny tiny nit:
> add a \ after the initial triple quote of your header and footer to prevent
> extra newlines in the buildbot.tac file.

The one on the footer is intentional - it separates the footer from the options above it. I'll fix the header one, though.
Comment 30•15 years ago
Comment on attachment 410018 [details]
buildbot tac generator
This worked great in my testing, too, so in it goes:
changeset: 407:25611d4e0cf9
Attachment #410018 -
Flags: checked-in+
Comment 31•15 years ago
Ok, this version has lockfile aging as you requested. I've also made the action=skip messages better, and made sure we don't override existing tac files (this could've happened when we roll this out). I ended up putting buildbot-tac.py somewhere else, so I've also updated this script to reflect that. Should be pretty straightforward.
Attachment #410025 -
Attachment is obsolete: true
Attachment #410588 -
Flags: review?(catlee)
Attachment #410025 -
Flags: review?(catlee)
Comment 32•15 years ago
These manifests can be a little obtuse, so here's the order of operations:
* Untar build-tools checkout into /tools, set up symlink
* Copy buildbot-tac to /etc/init.d/buildbot-tac and install the service
* Start the buildbot-tac service

On existing machines this should amount to no change, as the buildbot-tac already exists. For those, it will also fill in the control file so that even if we have buildbot.tac out of the way for a while it will never be regenerated. The buildbot service has also been updated to ensure that the buildbot-tac service starts first, thus ensuring the new slaves will come up properly.
Attachment #410592 -
Flags: review?(catlee)
Comment 33•15 years ago
Sorry about all the churn; my last version forgot to update ${CONTROL_FILE} if buildbot.tac already exists.
Attachment #410588 -
Attachment is obsolete: true
Attachment #410593 -
Flags: review?(catlee)
Attachment #410588 -
Flags: review?(catlee)
Comment 34•15 years ago
Comment on attachment 410592 [details] [diff] [review]
puppet manifests to roll out buildbot-tac.py and the init script
Going to be modifying this slightly to make it work better for Mac.
Attachment #410592 -
Attachment is obsolete: true
Attachment #410592 -
Flags: review?(catlee)
Comment 35•15 years ago
Comment on attachment 410593 [details] [diff] [review]
one more time, again
Going to be modifying this slightly to make it work better for Mac.
Attachment #410593 -
Attachment is obsolete: true
Attachment #410593 -
Flags: review?(catlee)
Comment 36•15 years ago
The mac part of this should probably go in bug 429430, but I thought keeping it together would be easier for review purposes.
Attachment #410822 -
Flags: review?(catlee)
Comment 37•15 years ago
Pretty much the same as the last version, just changed the location of the control file on Mac. This script will end up in the puppet-files CVS repo.
Attachment #410824 -
Flags: review?(catlee)
Comment 38•15 years ago
Attachment #410825 -
Flags: review?(catlee)
Comment 39•15 years ago
Comment on attachment 410825 [details] [diff] [review]
buildbot-tac plist launcher
Sorry, wrong bug for this one.
Attachment #410825 -
Attachment is obsolete: true
Attachment #410825 -
Flags: review?(catlee)
Updated•15 years ago
Attachment #410822 -
Flags: review?(catlee) → review+
Updated•15 years ago
Attachment #410824 -
Attachment mime type: application/octet-stream → text/plain
Comment 40•15 years ago
Comment on attachment 410824 [details]
buildbot-tac, again
> if ! `ps ax | awk '{print $1}' | grep -q \`cat ${LOCKFILE}\`` ||
>    `find ${LOCKFILE} -cmin ${MAX_LOCKFILE_AGE} | grep -q ${LOCKFILE}`;
I think you need -cmin +${MAX_LOCKFILE_AGE} here. r=me with that change.
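
The reason for the `+` is that `find -cmin N` only matches a file whose status changed N minutes ago, while `-cmin +N` matches anything older than N minutes. The staleness test being reviewed (owner PID no longer running, or lockfile too old) can be sketched in Python as below. This is an illustration of the idea, not the checked-in script: the function names are hypothetical, it assumes the lockfile contains just a PID, and `os.kill(pid, 0)` as a liveness probe is POSIX-only.

```python
import os
import time


def pid_is_running(pid):
    """Probe a PID with signal 0 (POSIX only); no signal is actually delivered."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def lock_is_stale(lockfile, max_age_minutes, now=None):
    """Return True if the lock owner is gone or the lockfile is too old."""
    now = time.time() if now is None else now
    with open(lockfile) as f:
        pid = int(f.read().strip())
    # st_ctime is the status-change time, the same timestamp `find -cmin` uses.
    age_minutes = (now - os.stat(lockfile).st_ctime) / 60.0
    return (not pid_is_running(pid)) or age_minutes > max_age_minutes
```

The `now` parameter exists only so the age check can be exercised without waiting; a real service would leave it at its default.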
Attachment #410824 -
Flags: review?(catlee) → review+
Comment 41•15 years ago
Comment on attachment 410824 [details]
buildbot-tac, again
checked-in with the fix and the correct password.
Checking in buildbot-tac;
/mofo/puppet-files/shared/buildbot-tac,v <-- buildbot-tac
initial revision: 1.1
done
Attachment #410824 -
Flags: checked-in+
Comment 42•15 years ago
Comment on attachment 410822 [details] [diff] [review]
deploy zero to staging scripts on linux and mac
changeset: 74:2787b357f0c7
Attachment #410822 -
Flags: checked-in+
Updated•15 years ago
Assignee: bhearsum → nobody
Component: Release Engineering: Future → Release Engineering
Comment 43•15 years ago
After getting everything landed and updating the ref platform I had Phong clone a new machine for me, moz2-linux-test01. After he cloned it and turned it on it appeared on staging-master.b.m.o:9010.

This should be the case for all new slaves, provided the buildbot-configs are updated to know about them in advance.

I've also updated the reference platform doc and removed the now-unnecessary manual steps, https://wiki.mozilla.org/ReferencePlatforms/Linux-CentOS-5.0#Post-Install_Setup.

Victory!
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 44•15 years ago
(In reply to comment #43)
> After getting everything landed and updating the ref platform I had Phong clone
> a new machine for me, moz2-linux-test01. After he cloned it and turned it on it
> appeared on staging-master.b.m.o:9010.
>
> This should be the case for all new slaves, provided the buildbot-configs are
> updated to know about them in advance.
>
> I've also updated the reference platform doc and removed the now-unnecessary
> manual steps,
> https://wiki.mozilla.org/ReferencePlatforms/Linux-CentOS-5.0#Post-Install_Setup.
>
> Victory!

Very very sweet! :-)
Assignee | ||
Updated•11 years ago
Product: mozilla.org → Release Engineering