Closed Bug 654844 Opened 13 years ago Closed 13 years ago

Create buildbot-master{07,08,09,10} in SJC1

Categories

(Infrastructure & Operations :: RelOps: General, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: phong)


These should be 64-bit CentOS 5.5 images like the ones used for buildbot-master{04,06}. They can be ESX or KVM guests; it doesn't matter much to me. Can we make them all with 8GB RAM?
The plan is to transition masters from production-master{01,02,03} over to these, and then shut down the old ones.
Phong: I know that we're constrained on power in sjc1. Are we out of vm capacity on the existing vmware servers?
Bug 629377 should have created some VM capacity on the SJC ESX boxes (bm-vmware*).
We've agreed to meet to discuss the requirements of the new buildbot master machines (how much RAM, how many are required) before creating new VMs.
Assignee: server-ops-releng → phong
We've agreed to 6GB RAM, 6GB swap, and 2 CPUs. Let us know if that's not possible.
Until we figure out the memory leak in the buildbot master process, we need these VMs to have the same spec as the buildbot-master04,06 VMs. After getting off phone with Phong, catlee tells me we are going to try reducing these KVM VMs from 8gb to 6gb RAM. So these 4 new buildbot-master{07,08,09,10} VMs should each have: 6gb RAM and 20gb disk. I think we have room because we've recently deleted 18 VMs in bug#629377, which freed up 76gb RAM and some amount of disk space. Phong can you verify, and let us know if you are still blocked?
Blocks: 656413
Phong: any ETA on these VMs?
Note that we only need 6GB on each of these.
(In reply to comment #6)
> Until we figure out the memory leak in the buildbot master process, we need
> these VMs to have the same spec as the buildbot-master04,06 VMs. After
> getting off phone with Phong, catlee tells me we are going to try reducing
> these KVM VMs from 8gb to 6gb RAM. So these 4 new
> buildbot-master{07,08,09,10} VMs should each have: 6gb RAM and 20gb disk.
>
> I think we have room because we've recently deleted 18 VMs in bug#629377,
> which freed up 76gb RAM and some amount of disk space.
>
> Phong can you verify, and let us know if you are still blocked?

Technically you didn't free up that much, since they are shared resources that go through spikes. With the memory leak problem of the buildbots, these will constantly be using all of the RAM assigned.
(In reply to comment #7)
> https://wiki.mozilla.org/ReleaseEngineering/Master_Setup#Hardware

Do we have a template for these, or do they need to be built fresh from CD with CentOS 5.5 64-bit?
I've been using the image that bkero set up on the kvm servers, but that probably doesn't translate here. If we don't have a template already, could you please make one that's the same as the kvm image/template?
Phong: Current policy is one buildbot master instance per VM.
64-bit guest
CentOS 5.5
2 virtual CPUs
6 GB RAM
6 GB swap
30GB partition mounted at /
100MB partition mounted at /boot
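The spec above could be captured as data and compared against what a freshly built VM reports. The following is only a minimal sketch: the `MasterSpec` class, its field names, and the `spec_mismatches` helper are hypothetical names invented for illustration, and how the observed values get collected from a guest is left out.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MasterSpec:
    """Requested per-VM spec from this bug (field names are hypothetical)."""
    os_release: str = "CentOS 5.5"
    arch: str = "x86_64"
    cpus: int = 2
    ram_gb: int = 6
    swap_gb: int = 6
    root_gb: int = 30   # partition mounted at /
    boot_mb: int = 100  # partition mounted at /boot


def spec_mismatches(observed, spec=MasterSpec()):
    """Return one message per field where `observed` differs from `spec`.

    `observed` is a plain dict keyed by the same field names; a missing
    key is reported as a mismatch against None.
    """
    problems = []
    for field, expected in asdict(spec).items():
        found = observed.get(field)
        if found != expected:
            problems.append(f"{field}: expected {expected!r}, found {found!r}")
    return problems


if __name__ == "__main__":
    # Illustrative values only: a guest matching the spec except for the OS release.
    guest = dict(asdict(MasterSpec()), os_release="CentOS 5.6")
    for line in spec_mismatches(guest):
        print(line)
```

A check like this would flag an OS release mismatch before releng starts configuring a master.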
Phong has built buildbot-master07, but I'm waiting for him to clone it (so he can build 08) before releng makes modifications to it. In order to build 09 and 10, phong says he needs to upgrade vmware on the other cluster.
All four buildbot-master servers in sjc1 are now online and waiting for configuration. Please open a new bug when you're ready to have them monitored.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Hi Phong, there are a couple of issues. First, we need CentOS 5.5, and it appears these machines have 5.6 on them.

Secondly, we're getting consistently poor performance on 08 and 10:

Timing buffered disk reads: 4 MB in 3.58 seconds = 1.12 MB/sec

on 08 and 10, while 07 and 09 have about 50 MB/s.

Could you please look into the performance issue (are 08 and 10 on different storage or a different cluster or something?) and rebuild them with 5.5? Thanks!
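The throughput figure quoted above is just megabytes read divided by elapsed seconds. The sketch below recomputes it from a line in that format, which looks like the output of `hdparm -t`, although the comment does not name the tool. The function name and regular expression are illustrative assumptions, not part of any existing releng tooling.

```python
import re

# Matches a line such as:
#   Timing buffered disk reads: 4 MB in 3.58 seconds = 1.12 MB/sec
READ_TIMING = re.compile(
    r"Timing buffered disk reads:\s*([\d.]+)\s*MB in\s*([\d.]+)\s*seconds"
)


def buffered_read_mb_per_sec(line):
    """Recompute MB/sec from the megabytes read and the elapsed seconds."""
    match = READ_TIMING.search(line)
    if match is None:
        raise ValueError(f"unrecognized timing line: {line!r}")
    megabytes = float(match.group(1))
    seconds = float(match.group(2))
    return megabytes / seconds


if __name__ == "__main__":
    reported = "Timing buffered disk reads: 4 MB in 3.58 seconds = 1.12 MB/sec"
    print(f"{buffered_read_mb_per_sec(reported):.2f} MB/sec")
```

Against the roughly 50 MB/s seen on 07 and 09, the figure for 08 and 10 is more than 40 times slower, which is why the storage or cluster placement is the first thing to check.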
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
recreated all 4 with CentOS 5.5.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations