Closed
Bug 737594
Opened 12 years ago
Closed 12 years ago
configure buildbot masters for r5 and linux builders in scl3
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: arich, Assigned: jhford)
References
Details
(Whiteboard: [buildmasters][capacity])
Attachments
(3 files)
2.48 KB, patch | armenzg: review+ | coop: checked-in+
1.12 KB, patch | armenzg: review+ | coop: checked-in+
1.38 KB, patch | armenzg: review+ | coop: checked-in+
As soon as puppet is working in releng.scl3.mozilla.com, I'll be deploying two VMs for buildbot masters: buildbot-master30.srv.releng.scl3.mozilla.com and buildbot-master31.srv.releng.scl3.mozilla.com. In order to get the r5 builders up by the end of this week, we'll need someone to set up at least one of these buildbot masters (we'll also be bringing up 50 linux vmware vm builders in the next week or so). Is there anything else that will be required other than a VM (matching the other buildbot masters in specs) with the appropriate root pw set? Will a third buildbot master be required?
Reporter
Comment 1•12 years ago
The buildbot-master30.srv.releng.scl3.mozilla.com and buildbot-master31.srv.releng.scl3.mozilla.com vms are up with the standard root pw.
Updated•12 years ago
Assignee: nobody → coop
Status: NEW → ASSIGNED
Component: Release Engineering → Release Engineering: Platform Support
OS: Mac OS X → Linux
Priority: -- → P2
QA Contact: release → coop
Whiteboard: [buildmasters][capacity]
Comment 2•12 years ago
Attachment #608791 -
Flags: review?(armenzg)
Comment 3•12 years ago
Attachment #608792 -
Flags: review?(armenzg)
Comment 4•12 years ago
Comment on attachment 608791
Add new masters to production-masters.json

If you already have them up and running, then land as is. If they aren't running yet, could you please land it with "enabled": false? Otherwise, people using fabric will try to reconfigure disabled masters.
Attachment #608791 -
Flags: review?(armenzg) → review+
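The point of the "enabled" flag in comment 4 can be illustrated with a minimal sketch of how a deployment script might skip disabled entries. This is not the actual fabric tooling: the list-of-objects layout and the "hostname" key are assumptions made for the example, and only the "enabled" field comes from the comment above.

```python
import json

# Hypothetical excerpt. The real production-masters.json has its own layout;
# only the "enabled" key is taken from comment 4, "hostname" is assumed.
MASTERS_JSON = """
[
  {"hostname": "buildbot-master30.srv.releng.scl3.mozilla.com", "enabled": true},
  {"hostname": "buildbot-master31.srv.releng.scl3.mozilla.com", "enabled": false}
]
"""


def enabled_masters(raw):
    """Return the hostnames of entries whose "enabled" flag is true."""
    masters = json.loads(raw)
    # Treat a missing flag as disabled so a half-configured entry is skipped.
    return [m["hostname"] for m in masters if m.get("enabled", False)]


targets = enabled_masters(MASTERS_JSON)
print(targets)
```

With a filter like this, a master landed with "enabled": false stays out of any reconfigure run until it is actually up.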
Updated•12 years ago
Attachment #608792 -
Flags: review?(armenzg) → review+
Comment 5•12 years ago
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #4)
> Comment on attachment 608791
> Add new masters to production-masters.json
>
> If you have them already up and running then land as is.
> If you don't have them yet running could you please land it with "enabled":
> false?
> Otherwise, people using fabric will try to reconfigure disabled masters.

I'm aiming to have them running before that would be an issue.
Comment 6•12 years ago
Comment on attachment 608791
Add new masters to production-masters.json

https://hg.mozilla.org/build/tools/rev/b6ba25b1e855
Attachment #608791 -
Flags: checked-in+
Comment 7•12 years ago
Comment on attachment 608792
Add new masters to buildmaster-production.pp

https://hg.mozilla.org/build/puppet-manifests/rev/4a9544063c2b
Attachment #608792 -
Flags: checked-in+
Comment 8•12 years ago
Attachment #608910 -
Flags: review?(armenzg)
Comment 9•12 years ago
Both masters are running now:

http://buildbot-master30.srv.releng.scl3.mozilla.com:8001/
http://buildbot-master31.srv.releng.scl3.mozilla.com:8101/

On reboot, the initial connection to puppet seems to hang, despite it reporting success in /var/log/messages, e.g.:

Mar 23 17:22:11 buildbot-master30 puppet-agent[2766]: Starting Puppet client version 2.6.14
Mar 23 17:22:21 buildbot-master30 puppet-agent[2766]: Finished catalog run in 7.33 seconds

For now, it's enough to know that if we kill that hung process on reboot, the master will start correctly. I'll try to debug this further on Monday.
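To confirm which puppet-agent process logged a finished catalog run (and is therefore the candidate for the hung process described in comment 9), the log lines can be parsed with a short script. This is only a sketch: the regular expressions are inferred from the two sample lines quoted above and may not match other syslog formats.

```python
import re

# Pattern inferred from the two /var/log/messages lines quoted in comment 9.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"puppet-agent\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)
FINISHED_RE = re.compile(r"Finished catalog run in (?P<secs>[\d.]+) seconds")


def finished_runs(lines):
    """Yield (host, pid, seconds) for each completed catalog run."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a puppet-agent line
        done = FINISHED_RE.search(m.group("msg"))
        if done:
            yield m.group("host"), int(m.group("pid")), float(done.group("secs"))


sample = [
    "Mar 23 17:22:11 buildbot-master30 puppet-agent[2766]: Starting Puppet client version 2.6.14",
    "Mar 23 17:22:21 buildbot-master30 puppet-agent[2766]: Finished catalog run in 7.33 seconds",
]
runs = list(finished_runs(sample))
print(runs)
```

If the pid reported here is still alive well after the run finished, that is the process to kill so the master can start.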
Reporter
Comment 10•12 years ago
Right after you did this, all of the buildbot processes on all of the minis in scl3 stopped. I'm not sure what to do about the nagios alerts for now (are things broken, is this expected?), so I'm just going to leave them be and let you or someone else in releng downtime/ack/or fix them, whatever is appropriate.
Comment 11•12 years ago
The slaves hit exceptions because the passwords in the slave buildbot.tac were set to None, which only caused things to break once the masters were available to connect to. Dustin helped me track down the missing password entries in the slavealloc db and get them fixed up. I've also made temporary additions to localconfig.py on both masters to allow the slaves to actually connect, but I'll need to discuss with jhford on Monday how we actually want to handle the new slaves.
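A failure like the one in comment 11 stays hidden until a master is reachable, so it can help to check a generated buildbot.tac for unset values ahead of time. The sketch below is an assumption-laden illustration, not part of slavealloc: it assumes the tac file is plain Python that parses under the interpreter running the check, and that the password lives in a top-level variable named `passwd`.

```python
import ast


def unset_tac_names(tac_source, names=("passwd",)):
    """Return the watched top-level names that are assigned the literal None."""
    unset = []
    for node in ast.parse(tac_source).body:
        if not isinstance(node, ast.Assign):
            continue
        is_none = isinstance(node.value, ast.Constant) and node.value.value is None
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id in names and is_none:
                unset.append(target.id)
    return unset


# Hypothetical tac fragments; the slave name and password are placeholders.
BROKEN_TAC = """
buildmaster_host = 'buildbot-master30.srv.releng.scl3.mozilla.com'
slavename = 'example-slave'
passwd = None
"""
FIXED_TAC = BROKEN_TAC.replace("passwd = None", "passwd = 'example-password'")

print(unset_tac_names(BROKEN_TAC))
print(unset_tac_names(FIXED_TAC))
```

Parsing with `ast` rather than executing the file means the check never imports Twisted or starts a slave.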
Comment 12•12 years ago
I've shut down both of these masters. See bug 739032
Updated•12 years ago
Attachment #608910 -
Flags: review?(armenzg) → review+
Comment 13•12 years ago
Comment on attachment 608910
Ganglia and fileserver changes for new buildbot masters

https://hg.mozilla.org/build/puppet-manifests/rev/b182eed16427
Attachment #608910 -
Flags: checked-in+
Comment 14•12 years ago
Re-assigning to jhford to get the rev5 builders running side-by-side with the existing builders for a little while.
Assignee: coop → jhford
Status: ASSIGNED → NEW
Priority: P2 → P3
Comment 15•12 years ago
(In reply to Chris AtLee [:catlee] from comment #12)
> I've shut down both of these masters. See bug 739032

I disabled them in the JSON, too:
Assignee
Comment 16•12 years ago
Catlee fixed the master-side issues in bug 739032. I've been running the r5 machines on scl1 masters, limited to the build-system branch. I'd like to have the r5 slaves pointing to scl3 masters today. Does anyone have objections to this plan? I have a test build running at http://buildbot-master30.srv.releng.scl3.mozilla.com:8001/builders/OS%20X%2010.7%20build-system%20build/builds/0 to make sure that the keys are still working.
Assignee
Comment 17•12 years ago
(In reply to John Ford [:jhford] from comment #16)
> I'd like to have the r5 slaves pointing to scl3 masters today. Does anyone
> have objections to this plan?

This was done by about 10am on April 4th. These masters are working in production, with a possible issue sending tinderbox email as documented in bug 744462. Bug 744462 is the only remaining work item for this bug, so I think it's time to close this bug.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering
Updated•6 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard