Closed Bug 592793 Opened 14 years ago Closed 13 years ago

tools-staging-master vm can't keep up with sendchanges

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: anodelman, Assigned: arich)

References

Details

The tools-staging-master VM is hosted on the resource-constrained ESX host in Mountain View, so it is unable to keep up with all of the sendchanges from production build machines. We can keep things afloat here by reducing the talos masters on tools-staging-master to only test moz-central and ignore all other branches/tests. The long-term solution is to have the ESX host upgraded.
Blocks: 586828
Assignee: nobody → mrz
What luck! Hardware just arrived today. This is likely to be after the 42 builders though.
Ben - can you spin up this VM for Alice on the KVM cluster @ Castro?
Assignee: mrz → bkero
Sure, I'll do that this afternoon.
Any news here?
Sorry for the delay, I needed to finish setting up the KVM cluster for you. The machine should be available at tools-staging-master02.build.mtv1.mozilla.com. Let me know if you have any issues with the VM, if you need any hardware (RAM, CPUs, drive space) added, or if you need any SSH keys put on there.
So, we have a new staging master?
Punting this back to Alice. You have a root login to that box.
Assignee: bkero → anodelman
Let's not punt so fast... I'd like tools-staging-master02 to be a copy of tools-staging-master as I've already done a bunch of setup work on tools-staging-master, as have my coworkers. Can you pull a copy of tools-staging-master and apply it to tools-staging-master02?
Assignee: anodelman → server-ops
Component: Talos → Server Operations
Product: Testing → mozilla.org
QA Contact: talos → mrz
Version: unspecified → other
No, not possible (or easy). One's an ESX VM and the other is a KVM VM.
What's the image on tools-staging-master02?
Pretty sure it's a fresh CentOS install but I'll let Ben comment.
Assignee: server-ops → bkero
Yep, it's a fresh CentOS 5.5 install.
How should we proceed on this? If I learn a bit about ESX I should be able to dump the image for tools-staging-master, and import it into KVM.
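For illustration only, the dump-and-import idea above could be sketched as follows. This assumes the ESX disk has already been exported as a VMDK file and that `qemu-img` is installed on the KVM side; the file names and the helper function are hypothetical and are not taken from this bug. The sketch only builds the command line and does not run it.

```python
import shlex


def build_vmdk_to_qcow2_command(vmdk_path, qcow2_path):
    """Return the argv list for converting a VMDK disk image to qcow2.

    Uses `qemu-img convert` with an explicit source format (-f vmdk)
    and output format (-O qcow2). Nothing is executed here.
    """
    return [
        "qemu-img", "convert",
        "-f", "vmdk",    # source format: VMware disk image
        "-O", "qcow2",   # output format commonly used with KVM
        vmdk_path,
        qcow2_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_vmdk_to_qcow2_command(
        "tools-staging-master.vmdk", "tools-staging-master.qcow2"
    )
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(cmd))
```

A converted disk would still need a matching KVM guest definition (CPU, RAM, network), which this sketch does not cover.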
If it's possible to get a copy of the current state of tools-staging-master and put it onto tools-staging-master02 that would be great, but I'm unsure what is involved. Setting up from scratch is going to be time consuming on the auto-tools team side (upwards of a week, I bet, to get all the ducks in a row).
I'm going to need to research ESX to accomplish this. I'll be doing this anyway for another bug. I'm first going to try to do this without any downtime, but will let you know the situation as I proceed.
(In reply to comment #13)
> How should we proceed on this? If I learn a bit about ESX I should be able to
> dump the image for tools-staging-master, and import it into KVM.
If you can take a short downtime, I can copy the VM to another location.
Assignee: bkero → server-ops-releng
Component: Server Operations → Server Operations: RelEng
QA Contact: mrz → zandr
Short downtimes are a-okay if it gets the existing, configured master somewhere where it can run more efficiently.
Assignee: server-ops-releng → phong
Assignee: phong → bkero
Phong, did this ever get copied somewhere? If not, I have enough ESX chops now to copy this somewhere for replication.
I'm going through the releng ops queue and just checking to see where we stand on this bug. Ben/Alice, have the two of you coordinated a downtime and gotten a new VM spun up?
I believe that a downtime was attempted but was not successful...
I can create this VM at any time, except I think this also follows the same problem as talos-addon-master1, where you'll need a new reference image. Would you like me to deploy with the tools-staging-master reference image, or wait for a new one here too?
Depends on: 659512
This is dependent on bug 659512. Having a copy of the old tools-staging-master won't be helpful, as it doesn't work with the latest releng master setup.
Since we started over fresh with tools-staging-master02, where do we stand on this bug?
I think that we could mark this as WONTFIX:
- the new VM should be better able to keep up
- we are no longer linked to the releng masters that were generating the glut of sendchanges and are instead generating sendchanges locally using scripts
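As a rough illustration of generating a sendchange locally, a script might assemble a `buildbot sendchange` command like the one below. This is a sketch and not the script the team actually used: the master address, branch, revision, user name, and file argument are all hypothetical, and the flag names follow the current Buildbot `sendchange` documentation, so they may differ on the Buildbot version deployed at the time.

```python
import shlex


def build_sendchange_command(master, branch, revision, who, files):
    """Return the argv list for a `buildbot sendchange` invocation.

    master is "host:port" of the master's change listener; files are the
    changed-file arguments attached to the change. Nothing is executed.
    """
    cmd = [
        "buildbot", "sendchange",
        "--master", master,
        "--branch", branch,
        "--revision", revision,
        "--who", who,
    ]
    cmd.extend(files)
    return cmd


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    cmd = build_sendchange_command(
        master="localhost:9010",
        branch="mozilla-central",
        revision="abcdef123456",
        who="sendchange-script",
        files=["http://example.invalid/build.tar.bz2"],
    )
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the script easy to dry-run against a staging master before any changes are actually submitted.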
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
Assignee: bkero → arich
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations