Closed Bug 653769 Opened 14 years ago Closed 14 years ago

increase RAM for ns1 and ns2 in scl1

Categories

(Infrastructure & Operations :: RelOps: General, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Assigned: bkero)

Details

(Whiteboard: [buildduty])

We'd like to pick a time to increase the RAM for the two nameserver VMs in scl1. This will require a short downtime for each VM (~2 minutes). Given the sensitivities around the Windows resolver, I wanted to make sure this was clearly communicated and that we all chose a time when the scl1 systems are likely to be less busy (or at least not a highly critical time for the infrastructure in that datacenter). Releng folks, when is a good time to schedule this work?
I'd say early morning EST (Monday and Friday being better in terms of expected load). After 10 AM EST the load starts rising. We would need to gracefully shut down the two masters, which should take 1 hour or less. Buildduty should be in the loop as well.
Whiteboard: [buildduty]
Please completely ignore my previous comment. I will let others reply. I got confused with other hostnames.
Actually, Armen's timing description is good. Buildbot-master04/06 will not need to be taken down, though, so unless something fails, or the Windows resolver is worse than we think, we will have no downtime. At worst, the Windows slaves fail to upload a build and we re-run it. Since those builds are many hours long, that would be bad, but manageable. Ben, I'll leave it to you to pick a precise time.
I'm willing to do this on the weekend too, since it's such a trivial change. If you'd like Monday, I can get up at 6AM PST to reboot these hosts. How does that sound?
Sounds good to me. Even with the hardware troubles, this week's KVM work went quite smoothly, so I'm very comfortable with this.
The RAM upgrade went flawlessly. After the reboot, ns2 had an issue caused by its network configuration, but a second reboot (after fixing that problem) went well.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations