Closed Bug 425790 Opened 17 years ago Closed 16 years ago

Triadic unit test coverage for Mac OS X

Categories

(Release Engineering :: General, defect, P2)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: coop, Assigned: coop)

References

Details

We need to set up two more machines to match qm-xserve01 to get to triadic coverage here.
Blocks: 425786
Priority: -- → P3
IT: can you please look into acquiring two Mac machines as close in configuration as possible to qm-xserve01? Ideally the machines would be identical, since that's the spirit of this bug and would hopefully allow for cloning. I'm not going to be a real stickler here, since I know you appreciate lots of lead time on hardware acquisition. If we have Mac minis that can go into production sooner vs. waiting on matching hardware, I'm fine with that. Others can pipe up if they are not.
Assignee: nobody → server-ops
two more xserves? copying schrep for $11k (out of budget) approval. didn't we just buy like 4 of these?
Justin: I don't really know the history here. We already have an xserve in production, yes, but I would personally be fine with having 3 minis providing the Mac coverage and recycling that single xserve for something else if that's more fiscally responsible. Maybe robcee knows why we went the xserve route in the first place for qm-xserve01, and whether we need to stay that course?
ok - thought you wanted something like the current xserve. If we are OK replacing them all with minis, and a few days to recover is OK when one goes down, we can go that route. let us know what works.
the history: when we first settled on machinery for qm-xserve01 it was recommended that we go with an Xserve. The rationale was that it's high-availability, mission-critical hardware and they're easier to support in a colo. I believe those were the reasons, Justin might remember the discussion. That said, if we're duplicating the machines and building in redundancy, I think we could go with minis as long as the cycle times are sufficiently fast.
Did you decide on using two Minis for this?
a+ for budget on whatever is the most human admin time effective.
(In reply to comment #6)
> Did you decide on using two Minis for this?

No - xserves. Need the specs of the current ones and we'll get 2 more on order. Robcee - can you get us that?
Assignee: server-ops → justin
we currently have a quad-core Xeon 2.66GHz machine with 4GB RAM and 60GB of RAID storage, iirc. I'm not 100% sure on the RAM or storage, but that should be close enough.
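For anyone wanting to confirm those numbers, here is a small sketch of how the CPU, RAM, and disk layout could be read directly off qm-xserve01 (it assumes shell access to the box; the script itself is illustrative and not something attached to this bug):

    #!/usr/bin/env python
    # Sketch: print the hardware and disk details needed to spec a matching
    # Xserve. Both commands are standard on Mac OS X.
    import subprocess

    def run(cmd):
        """Run a command and return its output as text."""
        return subprocess.check_output(cmd).decode("utf-8", "replace")

    if __name__ == "__main__":
        # CPU model/speed, core count, and installed memory
        print(run(["system_profiler", "SPHardwareDataType"]))
        # Disk and RAID layout
        print(run(["diskutil", "list"]))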
on order.
Status: NEW → ASSIGNED
Assignee: justin → mrz
Status: ASSIGNED → NEW
Whiteboard: waiting for xserves
Coop - do you want to clone qm-xserve01 or use one of the standard OS X images (and then which one)? These two are tentatively called qm-xserve04 and qm-xserve05. The assumption is that these will be on the QA network too. Correct if wrong.
(In reply to comment #11)
> Coop - do you want to clone qm-xserve01 or use one of the standard OS X images
> (and then which one)?

A clone of qm-xserve01 would be ideal. That's what we're trying to achieve with these.

(In reply to comment #11)
> These two are tentatively called qm-xserve04 and qm-xserve05. The assumption is
> that these will be on the QA network too. Correct if wrong.

Yes, that's correct.
Any chance I can take xserve01 offline to take a clean clone of it?
(In reply to comment #13)
> Any chance I can take xserve01 offline to take a clean clone of it?

It's a tier 1 unit test machine, so it'll need to have downtime scheduled and publicly announced for it.
Coop - want to take a downtime hit or try to deal with a hot clone? Actually, I'll try the hot clone and if that fails make you schedule a window :)
Tried three times to clone this running system. Each time, the restore failed to boot on the destination host. The clone takes at least an hour, but I'd budget for 2-3 hours. Passing back to RelEng for scheduling.
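For context on what the clone involves, here is a minimal sketch of the usual image-and-restore flow on Mac OS X using hdiutil and asr; the bug doesn't record exactly which tool was used, and the volume names and image path below are made up for illustration:

    #!/usr/bin/env python
    # Sketch of a cold-clone flow: image the source volume, checksum the
    # image, restore it onto the destination volume. Run as root.
    import subprocess

    SOURCE_VOLUME = "/Volumes/qm-xserve01-boot"      # hypothetical source volume
    IMAGE = "/Volumes/imagemaster/qm-xserve01.dmg"   # hypothetical image path
    TARGET_VOLUME = "/Volumes/qm-xserve04-boot"      # hypothetical destination

    def run(args):
        print("$ " + " ".join(args))
        subprocess.check_call(args)

    if __name__ == "__main__":
        # 1. Capture the source volume into a disk image.
        run(["hdiutil", "create", "-srcfolder", SOURCE_VOLUME, IMAGE])
        # 2. Checksum/prepare the image so asr will accept it for restore.
        run(["asr", "imagescan", "--source", IMAGE])
        # 3. Restore the image onto the destination volume (erases it first).
        run(["asr", "restore", "--source", IMAGE,
             "--target", TARGET_VOLUME, "--erase", "--noprompt"])

A hot clone is the same idea run against the live boot volume instead of booting off the imagemaster drive first, which may be why those restores wouldn't boot.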
Assignee: mrz → nobody
Any chance of making this clone during the outage window/tree closure tomorrow night (Thursday, April 24th, from 8:00-11:00 PM PDT), or will IT have its hands full with the graph server migration?
That's very possible. I think Mark is going to be onsite in the morning - could you move the Firewire drive marked "Xserver imagemaster" to qm-xserve01? I believe the drive is attached to one of the boxes in 101.06. I should be able to boot off that drive at night to make the clone.
Assignee: nobody → mark
Component: Release Engineering: Talos → Server Operations
Yep, the drive is moved.
Assignee: mark → mrz
Cold clone taken, over to Phong to try to restore. Image is on the Desktop in a folder called "qm-xserve01".
Assignee: mrz → phong.tran
Whiteboard: waiting for xserves
Running into a platform/hardware issue. The image from qm-xserve01 will not boot on the new-gen xserves. At this point your choice is either a fresh install of OS X 10.4 or 10.5.
Flags: colo-trip+
Erk, OK. Let us discuss this at tomorrow's build team meeting and get back to you.
phong/mrz: can we get 10.4 installed on one box, have me set it up for unit testing, and then clone that over to the other machine?
Where is this server physically located now?
qm-server04 & 05 are in 101.07.
None of the 10.4.x install discs can see the drives. I was only able to install 10.5 on this server.
Status: NEW → ASSIGNED
Not sure if we can use 10.5; we need these new machines to match the existing qm-xserve01 (which is running 10.4). What exactly is the platform/hardware issue blocking us from using the qm-xserve01 image? Can we find out from Apple why we cannot image/clone an xserve from an existing "identical" xserve?
I am assuming this might be the same issue with not being able to install 10.4.x from the install disc. QM-XSERVE01 and the new server have different hardware configurations.
john: should we offer up the older xserves we've recently freed up for this process? I'd rather go the cloning route if we can. Installing every new set of machines by hand kinda defeats the purpose (and is boring as hell). We can reserve the new xserves for Moz2 work, since we'll need 10.5 there anyway.
Phong - recall what username/password you use on that box?
cltbld/L....... Call me if you can't figure out what I mean by that.
John/Coop - is there anything left for IT to do on this bug? Comment #29 suggests that these will be for moz2 instead?
Assignee: phong.tran → mrz
Status: ASSIGNED → NEW
Punting over the fence. Push back if there's more work for IT.
Assignee: mrz → nobody
Component: Server Operations → Release Engineering
Flags: colo-trip+
Assignee: nobody → ccooper
Priority: P3 → P2
Status: NEW → ASSIGNED
We've got two Macs covering trunk now: qm-xserve01 and qm-xserve06. These machines have been solid relative to their VM counterparts for Linux and Windows. Are people fine with only having 2 machines for coverage here instead of 3? My primary concern is where best to expend effort on unit testing coverage now that 3.0 is out the door. The current xserves I mentioned above are running 10.4, and the newer xserve hardware won't accept a 10.4 image. AFAIK, we don't have any older hardware that is idle -- joduinn: correct me if I'm wrong -- so we'd need to track down a machine to get up to triadic coverage.
hey coop, we talked about this a bit, and considering the relative reliability of the xserves and the newly de-emphasized state of 1.9, I think we can safely ignore an extra machine here. One experiment I've been considering lately is trying out a single Xserve running three slaves under different user accounts. We're not exactly maxing out our hardware here and might be able to get away with 2 or 3 test slaves or builders per machine.
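A rough sketch of what that multi-slave experiment could look like: one buildbot slave per local user account on a single Xserve, all reporting to the same master. The user names, slave names, master host/port, and passwords below are placeholders, and it assumes a buildbot version that still ships the create-slave subcommand:

    #!/usr/bin/env python
    # Sketch: create and start several buildbot slaves, each under its own
    # local user so the jobs stay isolated from one another. Needs sudo.
    import subprocess

    MASTER = "qm-buildbot01:9010"   # placeholder master host:port
    SLAVES = [
        # (local user, slave name, password) -- all placeholders
        ("cltbld1", "qm-xserve06-a", "CHANGEME"),
        ("cltbld2", "qm-xserve06-b", "CHANGEME"),
        ("cltbld3", "qm-xserve06-c", "CHANGEME"),
    ]

    def as_user(user, args):
        """Run a command as the given local user."""
        subprocess.check_call(["sudo", "-u", user] + args)

    if __name__ == "__main__":
        for user, name, password in SLAVES:
            basedir = "/Users/%s/slave" % user
            # Set up the slave directory for this user, then start it.
            as_user(user, ["buildbot", "create-slave", basedir, MASTER, name, password])
            as_user(user, ["buildbot", "start", basedir])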
(In reply to comment #35)
> We talked about this a bit, and considering the relative reliability of the
> xserves and the newly de-emphasized state of 1.9, I think we can safely ignore
> an extra machine here.

Sold. We can file another bug for the multiple-slave-per-xserve experiment.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME
Product: mozilla.org → Release Engineering