Closed Bug 759631 Opened 12 years ago Closed 12 years ago

enhance or replicate github-sync1 for both disk/cpu

Categories

(Developer Services :: General, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hwine, Assigned: fox2mike)

As part of the Q2 goal to get repos available via git, it appears more of the initial conversion work will need to be done on poc hardware.

This means I need more disk & CPU, at least temporarily (not sure there's enough data yet to size production steady state). I'm fine with either enhancing the heck out of github-sync1.dmz.releng.scl3.mozilla.com or standing up new boxes for the CPU side of things. Note that the box _requires_ ReiserFS for any type of performance (see bug 731329 and bug 739100).

Disk needs: 
 - 50 "full" repos @ 8.5 GB each = 425 GB (need not be on existing mounts)

RAM needs:
 - each full conversion so far had a max RSS of 2,248,640 KB (per /usr/bin/time)

CPU needs:
 - each "full repo" conversion takes 6 CPU days, and each box dedicates one CPU to keeping the repos on that box up to date. So an 8-way box can convert & serve about 7 repos a week.
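
As a rough sanity check on the sizing figures above, here is a minimal sketch in Python. The repo count, repo size, max RSS, and CPU days come from this request; the constant and function names are made up for illustration, the one-reserved-core-per-box layout follows the CPU note above, and the RSS conversion assumes /usr/bin/time reports kilobytes of 1024 bytes.

```python
import math

# Figures taken from the request above.
FULL_REPOS = 50
GB_PER_REPO = 8.5
MAX_RSS_KB = 2_248_640
CPU_DAYS_PER_CONVERSION = 6

# Assumption for illustration: an 8-core box with one core reserved
# for keeping already-converted repos up to date.
CORES_PER_BOX = 8
SYNC_CORES_PER_BOX = 1


def disk_needed_gb(repos=FULL_REPOS, gb_per_repo=GB_PER_REPO):
    """Total disk for all full repos, in GB."""
    return repos * gb_per_repo


def max_rss_gib(max_rss_kb=MAX_RSS_KB):
    """Peak memory of one conversion in GiB (assumes 1 KB = 1024 bytes)."""
    return max_rss_kb / (1024 * 1024)


def conversion_slots(cores=CORES_PER_BOX, sync_cores=SYNC_CORES_PER_BOX):
    """Cores left for single-threaded conversions on one box."""
    return cores - sync_cores


def days_to_convert_all(repos=FULL_REPOS, slots=None,
                        cpu_days=CPU_DAYS_PER_CONVERSION):
    """Days for one box to convert every repo, running one batch per cycle."""
    slots = conversion_slots() if slots is None else slots
    batches = math.ceil(repos / slots)
    return batches * cpu_days


if __name__ == "__main__":
    print(f"disk needed: {disk_needed_gb():.0f} GB")
    print(f"max RSS per conversion: {max_rss_gib():.2f} GiB")
    print(f"conversion slots per box: {conversion_slots()}")
    print(f"days for one box to convert all repos: {days_to_convert_all()}")
```

Under these assumptions one box runs seven conversions per 6-day cycle, and seven concurrent conversions at the observed peak RSS would need roughly 15 GiB of RAM in total.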
Assignee: server-ops-releng → server-ops-devservices
Component: Server Operations: RelEng → Server Operations: Developer Services
QA Contact: arich → shyam
Dan,

Need an ack from you before I bump up resources on this box.

Hal, 

I can probably throw in 450 GB of disk and bump up RAM, but I'm not sure what you mean by an 8-way box. Is your conversion multi-threaded? How many cores do you need?

Also, with 4 weeks left in the quarter, I'm not sure we can get hardware for this, but if the VM won't cut it, I can see what can be done.
Assignee: server-ops-devservices → server-ops
Component: Server Operations: Developer Services → Server Operations: Virtualization
QA Contact: shyam → dparsons
Shyam, the conversion process is single-threaded and CPU-bound. So I start the initial conversions as separate jobs, and each one will bind to one core.
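
To illustrate the "separate jobs, one core each" approach, here is a minimal sketch in Python. It is not the actual conversion tooling: `run_conversions` is a hypothetical helper, and the commands passed to it are placeholders for whatever starts a single conversion. It does not pin CPU affinity; it only caps the number of concurrent single-threaded jobs and leaves the OS scheduler to spread them across cores.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_conversions(commands, reserved_cores=1, total_cores=None):
    """Run each command as its own process, in parallel.

    At most (total_cores - reserved_cores) jobs run at once, so one core
    stays free for keeping existing repos up to date. Returns the exit
    codes in the same order as `commands`.
    """
    total = total_cores or os.cpu_count() or 1
    workers = max(1, total - reserved_cores)

    def run(cmd):
        # Each job is a separate OS process; the thread only waits on it.
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    # Stand-in jobs: a real run would pass the actual conversion command
    # for each repo instead of these no-op Python processes.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(3)]
    print(run_conversions(jobs, total_cores=8))
```

Threads are used here only to wait on child processes, so the single-threaded, CPU-bound work still runs in separate processes.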

By '8-way' I meant 8-core. I'll leave it to you to decide whether adding cores or adding new VMs is best. FWIW, the conversion times on VMs are the same as we see on real hardware: it's just 6 days of CPU.

My conversion-in-progress is running a bit over that, so I suspect there has been some additional I/O wait time. Keeping the existing conversions up to date is the new addition this time, and both conversions would be competing for the same I/O channel to the filer.
Alright. That makes sense. As far as machines vs VMs go, the only slight downside of VMs is probably the shared storage (vs local storage on machines). The impact is probably going to be minimal considering how state of the art scl3 is :)

So let's bump up the VM for now to the following:

8 cores 
8-16 GB of RAM
450 GB additional disk

Does that sound good?

Dan, once you ack this, I'll make these changes for Hal.
Assignee: server-ops → shyam
Shyam - yep, sounds good with 2 GB/core of RAM. Let me know when you bounce the box - I'll need to restart some things, but they do keep state.
This is a terrible candidate for virtualization. Just because VMs can be spun up faster than physical hardware doesn't negate the fact that this is a high CPU, high disk IO, high RAM server. I know we have many spare blades from the sjc1 move, can those be used?
(In reply to Dan Parsons [:lerxst] from comment #5)
> This is a terrible candidate for virtualization. Just because VMs can be
> spun up faster than physical hardware doesn't negate the fact that this is a
> high CPU, high disk IO, high RAM server. I know we have many spare blades
> from the sjc1 move, can those be used?

Which is why I asked before I did anything :D 

I'll look into options.
Assignee: shyam → server-ops-devservices
Component: Server Operations: Virtualization → Server Operations: Developer Services
QA Contact: dparsons → shyam
Alright, we'll try out a SeaMicro Xeon and see how it performs. I'll set it up.
Assignee: server-ops-devservices → shyam
Hal,

You can log in to github-sync1-dev.dmz.scl3.mozilla.com and try out stuff.
Shyam,

Box looking good - I'll be spinning more repos up on it - where were you going to add the disk? More on github-sync1-dev.dmz.scl3.mozilla.com would be better so I can take advantage of the extra cores.
Blocks: 745989
(In reply to Hal Wine [:hwine] from comment #9)
> Shyam,
> 
> Box looking good - I'll be spinning more repos up on it - where were you
> going to add the disk? More on github-sync1-dev.dmz.scl3.mozilla.com would
> be better so I can take advantage of the extra cores.

So, two paths here. I can get NFS added and see how it performs, or add more disk to the VM. With NFS we have no choice of filesystem, and if you're going to be doing really I/O-heavy stuff, I'm not sure how well it'll perform. Adding disk to the VM is easy and we'll have ReiserFS.

Alternatively, I can get you more machines, but with the same disk capacity.
So far, so good. No improvement in the time of an individual conversion, but the ability to run multiple conversions at once is very good. I will evaluate the need for additional machines after the first conversion cycle is done (6 days), and open a new ticket if needed.

Thanks!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: Developer Services → General
Product: mozilla.org → Developer Services