Closed Bug 744067 Opened 12 years ago Closed 12 years ago

Setup new linux32 and linux64 vms in scl3

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P2)

x86
Linux

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coop, Assigned: hwine)

References

Details

(Whiteboard: [puppet][scl3])

Attachments

(4 files)

While testing puppet on new linux VMs in scl3, I ran into the errors reported in https://bugzilla.mozilla.org/show_bug.cgi?id=735381#c26

err: //Node[linux32-temp]/vm/Mount[builds]/ensure: change from present to mounted failed: Execution of '/bin/mount -o noatime /builds' returned 32: mount: special device /dev/sdb1 does not exist

err: //Node[linux64-temp]/vm/Mount[builds]/ensure: change from present to mounted failed: Execution of '/bin/mount -o noatime /builds' returned 32: mount: special device /dev/sdb1 does not exist

We'll need to create a new class for these VMs that doesn't try to mount /builds.
Assignee: nobody → coop
Status: NEW → ASSIGNED
Priority: P3 → P2
Hrmm, I thought there was more to the existing vm class than just mounting /builds, but it turns out not.

Oh well, I'll just get the new VMs into the puppet configs then.
Summary: Need new puppet class for linux32 & 64 build VMs that doesn't mount /builds → Add new linux32 and linux64 vms to scl3 puppet configs
These VMs don't have a separate /builds mount, so they don't need to inherit the vm class. 

That's the only change I've made since testing one VM of each type against a facsimile of this config yesterday in bug 735381.
Attachment #614030 - Flags: review?(jhford)
Attachment #614030 - Flags: review?(jhford) → review+
Comment on attachment 614030 [details] [diff] [review]
Add linux32 and linux64 VMs to scl3 puppet configs

https://hg.mozilla.org/build/puppet-manifests/rev/5adb88d3b22e
Attachment #614030 - Flags: checked-in+
Hal: thanks for taking this on.

The VMs have been set up in bug 735381; now we just need to get them talking to puppet and then hooked up to the new buildbot masters.

Here's the list of slaves with the build/try breakdown from https://bugzilla.mozilla.org/show_bug.cgi?id=735381#c13 :

bld-centos5-32-vmw-[001-022] .build.releng.scl3.mozilla.com
bld-centos5-32-vmw-[023-039] .try.releng.scl3.mozilla.com
bld-centos5-64-vmw-[001-006] .build.releng.scl3.mozilla.com
bld-centos5-64-vmw-[007-011] .try.releng.scl3.mozilla.com
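
For scripting checks across all of these slaves, the bracketed ranges can be expanded into individual hostnames. A minimal Python sketch, assuming the space before each domain suffix in the list above is a formatting artifact and that the numeric part stays zero-padded to three digits; the helper names `expand` and `all_hosts` are hypothetical:

```python
import re

# Range specs copied from the list above; the space before the domain
# is assumed to be a formatting artifact and is dropped here.
SPECS = [
    "bld-centos5-32-vmw-[001-022].build.releng.scl3.mozilla.com",
    "bld-centos5-32-vmw-[023-039].try.releng.scl3.mozilla.com",
    "bld-centos5-64-vmw-[001-006].build.releng.scl3.mozilla.com",
    "bld-centos5-64-vmw-[007-011].try.releng.scl3.mozilla.com",
]

RANGE_RE = re.compile(r"^(?P<prefix>.*)\[(?P<start>\d+)-(?P<end>\d+)\](?P<suffix>.*)$")


def expand(spec):
    """Expand 'name-[001-003].domain' into a list of hostnames."""
    match = RANGE_RE.match(spec)
    if match is None:
        return [spec]  # no bracketed range: return the spec unchanged
    prefix = match.group("prefix")
    suffix = match.group("suffix")
    width = len(match.group("start"))  # preserve the zero padding
    first = int(match.group("start"))
    last = int(match.group("end"))
    return [f"{prefix}{n:0{width}d}{suffix}" for n in range(first, last + 1)]


def all_hosts(specs=SPECS):
    """Flatten every range spec into one list of hostnames."""
    hosts = []
    for spec in specs:
        hosts.extend(expand(spec))
    return hosts


if __name__ == "__main__":
    for host in all_hosts():
        print(host)
```

The output is one hostname per line, so it can be piped into whatever multi-host tool is used for verification.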

One thing you'll need to do is make sure that both sets of slaves (build/try) get the proper ssh keys (and known_hosts) installed. These can be grabbed from any other slave of the same type. I recommend grabbing them from the lion slaves in scl3.

A tool like csshX may be useful too if you need to do verification across multiple slaves at once.

These new slaves will also need to be added to slavealloc.
Assignee: coop → hwine
Depends on: 735381, 744882
Summary: Add new linux32 and linux64 vms to scl3 puppet configs → Setup new linux32 and linux64 vms in scl3
Once the slaves have been puppetized, I can add them to nagios.
ssh keys, known_hosts, and authorized_keys updated per type on all slaves
input for dbimport.py
Attachment #617560 - Flags: review?(coop)
here's the diff from the existing scl version, on which it is based:
$ diff -u puppet.scl*
--- puppet.scl  2010-09-23 11:34:09.000000000 -0700
+++ puppet.scl3 2012-04-24 07:31:53.000000000 -0700
@@ -1,5 +1,5 @@
 # The puppetmaster server
-PUPPET_SERVER=scl-production-puppet.build.scl1.mozilla.com
+PUPPET_SERVER=scl3-production-puppet.srv.releng.scl3.mozilla.com
 
 # If you wish to specify the port to connect to do so here
 #PUPPET_PORT=8140
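
Since the file is a shell-style KEY=VALUE config, checking that a slave points at the scl3 puppetmaster can be scripted. A minimal Python sketch, assuming the file keeps the simple `KEY=VALUE` and `#` comment layout shown in the diff; `parse_sysconfig` and the sample text are illustrative and not part of the attached patch:

```python
# Server name taken from the '+' line of the diff above.
EXPECTED_SERVER = "scl3-production-puppet.srv.releng.scl3.mozilla.com"

# Sample content mirroring the scl3 side of the diff.
SAMPLE = """\
# The puppetmaster server
PUPPET_SERVER=scl3-production-puppet.srv.releng.scl3.mozilla.com

# If you wish to specify the port to connect to do so here
#PUPPET_PORT=8140
"""


def parse_sysconfig(text):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


if __name__ == "__main__":
    settings = parse_sysconfig(SAMPLE)
    if settings.get("PUPPET_SERVER") == EXPECTED_SERVER:
        print("PUPPET_SERVER points at the scl3 puppetmaster")
    else:
        print("unexpected PUPPET_SERVER:", settings.get("PUPPET_SERVER"))
```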
Attachment #617874 - Flags: review?(rail)
Attachment #617874 - Flags: review?(rail) → review+
Attachment #617560 - Attachment mime type: text/csv → text/plain
Attachment #617560 - Flags: review?(coop) → review+
slavealloc update completed - all slaves manually set to disabled
all these new slaves now locked to preproduction master for burn in
All slaves added to nagios for ping, disk, and buildbot process.
(In reply to Hal Wine [:hwine] from comment #10)
> all these new slaves now locked to preproduction master for burn in

Correction - only the build slaves are connected at this time. (The try slaves can't easily be hooked up at the same time; they will be handled as the "next batch" once the build slaves are burned in.)
build slaves being transitioned into production after removal of build dirs (very time consuming on vm)

try slaves being loaded into staging
Attachment #620816 - Flags: review? → review+
Also http://hg.mozilla.org/build/buildbot-configs/rev/4ab5af03cce1
r=aki in irc via pastebin - they need to be in staging as well

passes unit tests
filed bug 751976 - host bld-centos5-32-vmw-036 was ignored during initial slavealloc setup. As noted there, bld-centos5-32-vmw-036 has been manually added to the db, so this is not a blocker.
I backed out http://hg.mozilla.org/build/buildbot-configs/rev/5e9b9414b37a
since we were getting issues with the signing server:
https://tbpl.mozilla.org/php/getParsedLog.php?id=11474398&tree=Mozilla-Inbound

Slave: bld-centos5-32-vmw-011
IP: 10.26.52.139
Duration: 14400
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
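
When a log repeats the same failure like this, a short script can tally the failures per signing server URI. A minimal Python sketch, assuming the log keeps the two-line `URI:` / `token generation failed` pattern shown above; the function name `count_token_failures` is hypothetical:

```python
import re
from collections import Counter

FAIL_RE = re.compile(r"token generation failed, error message: (?P<error>.+)$")

# Two entries copied from the log excerpt above.
SAMPLE_LOG = """\
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
URI: https://signing3.srv.releng.scl3.mozilla.com:9110/token
<buildbotcustom.steps.signing.SigningServerAuthenication instance at 0xf392e18>: token generation failed, error message: 403 Forbidden
"""


def count_token_failures(log_text):
    """Count (uri, error) pairs, pairing each failure with the last URI seen."""
    counts = Counter()
    current_uri = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("URI:"):
            current_uri = line[len("URI:"):].strip()
            continue
        match = FAIL_RE.search(line)
        if match:
            counts[(current_uri, match.group("error"))] += 1
    return counts


if __name__ == "__main__":
    for (uri, error), n in count_token_failures(SAMPLE_LOG).items():
        print(n, uri, error)
```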
I re-added the slaves as catlee mentioned that signing3 was not syncing with puppet.
It is now.
http://hg.mozilla.org/build/buildbot-configs/rev/c6530e968abd
try slaves added to production on 2012-05-07, except those on loan for other purposes. But since they are up and running, closing this issue!
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard