Closed Bug 1273528 Opened 8 years ago Closed 8 years ago

increase number of CPU for releng-puppet1.srv.releng.scl3.mozilla.com

Categories

(Infrastructure & Operations :: Virtualization, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Assigned: cknowles)

We just added 200 new machines that are puppetizing off of this host, and it's having some load issues when too many hit it at once. Can we bump this up to 6 CPUs to handle the additional load, please?
So, looking at VM-side load monitoring, I'm seeing an average of 80+% CPU used over the last day, so I'm not against this increase.

Several questions -
1) It's got a buddy, releng-puppet2.srv.releng.scl3.mozilla.com. Would we want to increase it as well?
2) Is this permanent?
3) If it's permanent: VMs really do fare better when split into multiples. Is having two 4-core servers doing the work of this one an option? (I'm guessing not, but I have to ask.)
4) Assuming the answer to 3 is *no*, when can we take this down to increase the CPU? Expected outage is < 15 minutes.
1) probably not, since the test VLAN is the biggest one, and these are segregated by VLAN because of the way we use CNAMEs for each domain. I don't think releng-puppet2 has been having any load issues, since all the new load has gone to releng-puppet1.

2) yes

3) releng-puppet2 is already the second server in the set, but things are segregated by CNAME/VLAN, and you can't have multiple CNAMEs for puppet.test.releng.scl3.mozilla.com (etc).

4) Flexible. We can sustain a brief outage, so just ping me/buildduty to let us know when it's happening.
Alright, poked in IRC, rebooted, bumped the CPU.

Note for future reference: VM folk get more and more resistant to calls for more CPUs the higher the count goes. Due to inefficiencies you get less bang for your buck, because the hypervisor needs to schedule ALL N CPUs at the same time, so each added CPU buys less performance than the one before it. In general you get less and less happy with performance the more CPUs you get. Counter-intuitive, but there you go.

However, we're always happy to talk and figure out paths forward.  Reach out and we'll do our best to figure *something* out.
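
To put rough numbers on the co-scheduling point above, here is a toy model in Python. It is only a sketch under loud assumptions that are not from this bug: strict co-scheduling (the VM runs only when all of its vCPUs can be placed at once), and each physical core being busy with other work independently with a fixed probability. The function names and the 10% busy figure are hypothetical, and real hypervisors typically relax co-scheduling, so treat the output as an illustration of the trend rather than a measurement.

```python
def coschedule_probability(vcpus: int, core_busy: float) -> float:
    """Chance that `vcpus` physical cores are all free in the same time slice.

    Toy assumption: every core is busy independently with probability
    `core_busy`, and the VM can only run when all its vCPUs fit at once.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if not 0.0 <= core_busy <= 1.0:
        raise ValueError("core_busy must be between 0 and 1")
    return (1.0 - core_busy) ** vcpus


def effective_cpus(vcpus: int, core_busy: float) -> float:
    """Expected CPU capacity the VM actually gets under the toy model."""
    return vcpus * coschedule_probability(vcpus, core_busy)


if __name__ == "__main__":
    # Hypothetical host where each core is busy 10% of the time.
    for n in (2, 4, 6, 8):
        eff = effective_cpus(n, core_busy=0.10)
        print(f"{n} vCPUs -> {eff:.2f} effective CPUs ({eff / n:.0%} per vCPU)")
```

In this model the capacity per vCPU falls as the count rises, which is the "less bang for your buck" effect described in the note.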
Assignee: server-ops-virtualization → cknowles
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
See Also: → 1312025