Closed Bug 1061879 Opened 11 years ago Closed 11 years ago

virtualize upload1.dmz.scl3

Categories

(Infrastructure & Operations :: Change Requests, task)

task
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gcox, Assigned: gcox)

References

Details

(Keywords: p2v, Whiteboard: [vm-p2v:1])

When: a TCW, 1 hour duration
System: upload1.dmz.scl3
Impact: Dealer's choice: disruptive cutover/cutback to using upload2 during the p2v, or builds can't upload during p2v.
Plan: established p2v practices
Notifs: standard TCW notices
Point: gcox
Why: Opportunistically p2v'ing a critical system before the warranty runs out. Can be deferred if there's conflicting work.
Flags: cab-review?
Nick - sounds like this is fine by us to happen during a TCW. Would we need to shut down any of our processes, or is it okay to just let them recover and retrigger after the fact? And would you recommend a hard tree closure or a soft one?
Flags: needinfo?(nthomas)
A drain on upload1 probably wouldn't be too disruptive, just a few builds that would fail to upload if they're using the post_upload.py method (which connects multiple times and relies on stuff it puts in /tmp; affects firefox desktop, fennec, and logs from masters, IIRC). Not a huge difference from a hard cut if we're in a TCW anyway. What about the spec of the VM? A nice big network pipe to the Netapp would be super.
Flags: needinfo?(nthomas)
My spec stab: 2 vCPU*, 8G vRAM**, 40G disk ('default'), standard ethernet***
* load doesn't really go over 2 unless there's collisions on rsyncs, and then it's just catchup.
** 16G is what it has now, but it's ALL cache, so that's quite trimmable.
*** when we put in the vNIC it'll get 10GbE and be one switch away.
Flags: cab-review? → cab-review+
Blocks: 1065514
(In reply to Greg Cox [:gcox] (plz don't needinfo me) from comment #3)
> My spec stab: 2 vCPU*, 8G vRAM**, 40G disk ('default'), standard ethernet***

Fwiw we currently have:
/dev/sda3 67G 18G 46G 29% /
on this host, where I wouldn't be shocked if we easily chew up 22+ GB extra in /tmp on occasion in high load-spikey moments. (just informative, not considering it as any reason to alter the plan-of-record)
OK, over on upload2 I shrank / to be smaller, but broke /tmp out to be a separate 40G VMDK so it can't crush the root disk. We'll do the same for upload1.
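For reference, the sizing concern can be worked through as simple arithmetic. This is a rough sketch only: it uses the figures quoted in the comments above, treats the 22 GB spike as an estimate rather than a measurement, and ignores filesystem overhead. The helper name headroom_gb is made up for illustration.

```python
# Figures taken from the comments above; the spike size is an estimate.
ROOT_USED_GB = 18        # "Used" column of the df output for /
TMP_SPIKE_GB = 22        # estimated extra /tmp usage in load-spikey moments
PROPOSED_DISK_GB = 40    # the 'default' disk in the proposed spec
TMP_VMDK_GB = 40         # separate /tmp VMDK, as done on upload2


def headroom_gb(capacity_gb, *usages_gb):
    """Return the space left after subtracting every usage from the capacity."""
    return capacity_gb - sum(usages_gb)


# One 40G disk holding both / and /tmp: a spike would consume all of it.
single_disk_headroom = headroom_gb(PROPOSED_DISK_GB, ROOT_USED_GB, TMP_SPIKE_GB)

# Separate /tmp VMDK: the same spike leaves room and cannot fill /.
tmp_vmdk_headroom = headroom_gb(TMP_VMDK_GB, TMP_SPIKE_GB)

print(single_disk_headroom)  # 0
print(tmp_vmdk_headroom)     # 18
```

Since the estimate was "22+ GB", the single-disk headroom is zero at best, which is why splitting /tmp onto its own VMDK is the safer layout.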
Assignee: server-ops → gcox
Nice catch Callek.
p2v done during TCW, 0900-0950. /tmp put on its own partition, added DRS rules to ensure separation from upload2. Left a note for the oncall to watch for alerts in case we misjudged the size needs.
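For anyone wanting a quick manual check alongside the existing alerts, a minimal sketch of a /tmp usage check follows. The 80% threshold and the function names are assumptions for illustration and are not taken from the actual monitoring configuration.

```python
import shutil

WARN_FRACTION = 0.80  # hypothetical threshold, not the real alert setting


def used_fraction(total_bytes, used_bytes):
    """Return the fraction of the filesystem that is in use."""
    return used_bytes / total_bytes


def check_path(path="/tmp", warn_fraction=WARN_FRACTION):
    """Return (over_threshold, fraction_used) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    fraction = used_fraction(usage.total, usage.used)
    return fraction >= warn_fraction, fraction


if __name__ == "__main__":
    over, fraction = check_path()
    state = "WARNING" if over else "OK"
    print(f"{state}: /tmp is {fraction:.0%} full")
```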
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [vm-p2v:1]
Product: mozilla.org → Infrastructure & Operations
See Also: → 1133638
Change Request: --- → approved
Flags: cab-review+