Closed Bug 417058 Opened 16 years ago Closed 15 years ago

consolidate buildmaster instances to shared server

Categories

(Release Engineering :: General, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rhelmer, Unassigned)

Currently we have a ton of buildmasters:

* staging-1.8-master
* staging-1.9-master
* production-1.8-master
* production-1.9-master
* moz2-master

That's 6 separate VMs. We should instead have these be multiple processes on the same two VMs - staging-master and production-master.
The Mobile master will end up here, too.
(In reply to comment #0)
> Currently we have a ton of buildmasters:
> 
> * staging-1.8-master
> * staging-1.9-master
> * production-1.8-master
> * production-1.9-master
> * moz2-master
> 
> That's 6 separate VMs. We should instead have these be multiple processes on
> the same two VMs - staging-master and production-master.

 
Actually 5, because we're leaving the tryserver master alone for now (in its own network), but who's counting? :)
I'm currently running two masters on the same VM for the unittest staging (6 slaves) and leak testing (4 slaves) setups. It's usable but slow-ish. Not sure how 5 masters on the same host will fare.
(In reply to comment #3)
> I'm currently running two masters on the same VM for the unittest staging (6
> slaves) and leak testing (4 slaves) setups. It's usable but slow-ish. Not sure
> how 5 masters on the same host will fare.
> 

Do you know which aspect of it is slow? I think we could bump up CPU, disk, or memory if necessary.
(In reply to comment #1)
> The Mobile master will end up here, too.
There was never, afaik, a plan to create a mobile master. I thought mobile builds would be driven by the staging-1.9-master and production-1.9-master, just like another platform.
(In reply to comment #3)
> I'm currently running two masters on the same VM for the unittest staging (6
> slaves) and leak testing (4 slaves) setups. It's usable but slow-ish. Not sure
> how 5 masters on the same host will fare.

Well, we currently run a slave and a master process on each of these, so we know it's slow :)

Also, it'll be 3 masters on staging-master and 3 on production-master, but your point still stands. We can beef up the machine if we have to; let's try it on staging and see how it goes.
We talked about this on IRC, here's a summary of what we're doing (and who's doing it):
1) (Ben) moz2-master will become production-master.
2) (Rob) A new VM will be created to be 'staging-master'.
3) (Ben) The moz2-master instance will stay on production-master.
4) (Rob) The staging automation masters will move to staging-master. The slaves that run on those machines will stay there.
5) (Rob) If all goes well with staging, the production automation masters will move to production-master. The slaves running on those machines will stay there.
6) (Ben) A temporary mobile master will be used on staging-master while the mobile Buildbot is brought online. This will be integrated with the existing 1.9 master when it is stable.
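To summarize the end state of that plan in one place, here's a purely illustrative sketch (names taken from this bug; the tryserver master stays on its own network per comment 2 and isn't part of this):

# Illustrative only -- end state implied by steps 1-6 above.
PLANNED_LAYOUT = {
    "staging-master": [
        "staging 1.8 automation master",     # step 4
        "staging 1.9 automation master",     # step 4
        "temporary mobile master",           # step 6
    ],
    "production-master": [
        "moz2 master",                       # steps 1 and 3
        "production 1.8 automation master",  # step 5
        "production 1.9 automation master",  # step 5
    ],
}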
Priority: -- → P2
(In reply to comment #7)
> We talked about this on IRC, here's a summary of what we're doing (and who's
> doing it):
> 1) (Ben) moz2-master will become production-master.
> 2) (Rob) A new VM will be created to be 'staging-master'.

Why don't we flip this around and make moz2-master into staging-master? Then we don't have to block on setting up a new VM; we're not going to need production-master until the staging stuff is all up and running, right?
Sounds good to me.
Alright, moz2-master has been renamed to staging-master. Currently the Moz2 master and mobile master exist on it, both in subdirectories of /builds/buildbot/.

The moz2 master uses 8010 for WebStatus, and 9010 as its slave port.
The mobile master uses 8020 for WebStatus, and 9020 as its slave port.

I suggest that we continue with this formula for all of the other masters (and for production-master, when it is set up).
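For concreteness, here's a minimal master.cfg sketch of that port formula, written against the Buildbot 0.7.x-style config API in use at the time (the exact imports and the omitted keys are assumptions, not taken from this bug):

# master.cfg sketch for one master instance on the shared VM.
# Each instance gets its own WebStatus and slave port (8010/9010 for moz2,
# 8020/9020 for mobile, presumably 8030/9030 for the next one, and so on).
from buildbot.status.html import WebStatus

c = BuildmasterConfig = {}

# Port the slaves connect to -- must be unique per master on this host.
c['slavePortnum'] = 9010

# WebStatus HTTP port -- also unique per master on this host.
c['status'] = [WebStatus(http_port=8010)]

# (slaves, schedulers, builders, buildbotURL, etc. omitted for brevity)

Since each master lives in its own subdirectory of /builds/buildbot/ and is started separately (e.g. buildbot start <basedir>), distinct ports are all the instances need to coexist on one VM.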
So, we still need a production-master. We also need to move the release automation masters over to a consolidated server. This depends on bug 415970 (automation should not need slave/ftp on master server).

Looks like we're blocked on that for now.
Depends on: 415970
Priority: P2 → P3
(In reply to comment #12)
> So, we still need a production-master. We also need to move the release
> automation masters over to a consolidated server. This depends on bug 415970
> (automation should not need slave/ftp on master server).
> 
> Looks like we're blocked on that for now.

I'm trying to get 1.8 staging working again (fallout from the binary Talkback switch; I just want to make sure it works before trying to move it!).

There's another issue I thought of - how do we deal with the cvsmirrors? Currently 1.8 and 1.9 each have their own; should they continue to? If so, I think we'd have to move the syncing outside of Buildbot.
Blocks: 420005
No longer blocks: 420005
Component: Release Engineering → Release Engineering: Future
QA Contact: build → release
Consolidated down already (although we did have to split production-master into two separate masters on separate VMs to handle the load).
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Moving closed Future bugs into Release Engineering in preparation for removing the Future component.
Component: Release Engineering: Future → Release Engineering
Product: mozilla.org → Release Engineering