Closed
Bug 688838
Opened 14 years ago
Closed 14 years ago
Repurpose 16 OS X 10.5 mac minis for Thunderbird builds (8 10.5/8 10.6)
Categories
(Infrastructure & Operations :: RelOps: General, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jhopkins, Assigned: arich)
References
Details
(Whiteboard: [hardware])
Please repurpose eight OS X 10.5 Mac minis for use as Thunderbird build slaves as soon as possible.
I have a Clonezilla image that we can hopefully use to image these machines.
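For reference, restoring such an image from a Clonezilla live shell usually looks something like the sketch below; the image name and target disk are hypothetical placeholders, and the exact ocs-sr flags depend on how the image was saved:

  # Unattended restore of a saved Clonezilla image onto the first disk.
  # "tbird-ref-image" and "sda" are placeholders, not the actual names.
  sudo /usr/sbin/ocs-sr -g auto -e1 auto -e2 -r -j2 -p true restoredisk tbird-ref-image sda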
Comment 1•14 years ago
Armen: can you figure out where best to pull these slaves from, please? They can be any rev3 minis, i.e. they don't have to be running 10.5 currently, since they'll be reimaged anyway.
If Amy would prefer these minis to be contiguous or in a particular colo, she should weigh in on that.
We can reassign this bug over to relops once we've decided which slaves to repurpose, have stopped buildbot on them, and have tagged them as going over to Thunderbird in slavealloc.
Assignee: nobody → armenzg
OS: Linux → Mac OS X
Priority: -- → P3
Whiteboard: [hardware]
Comment 2•14 years ago
I have disabled the following in slavealloc:
* moz2-darwin9-slave064
* moz2-darwin9-slave065
* moz2-darwin9-slave066
* moz2-darwin9-slave067
* moz2-darwin9-slave069
* moz2-darwin9-slave070
* moz2-darwin9-slave071
* moz2-darwin9-slave072
* moz2-darwin9-slave073
* moz2-darwin9-slave074
* moz2-darwin9-slave075
NOTE: there is no 68
We will have to remove them from nagios, slavealloc, buildbot-configs and puppet.
Feel free to do this at any time.
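For the removal work, something like this hedged sketch could help confirm nothing is left behind (the checkout paths are hypothetical):

  # Find every remaining reference to the retired slaves in local checkouts
  # of the nagios config, buildbot-configs, and puppet manifests.
  for n in 064 065 066 067 069 070 071 072; do
    grep -rn "moz2-darwin9-slave$n" ./nagios ./buildbot-configs ./puppet-manifests
  done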
Assignee: armenzg → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Reporter
Comment 3•14 years ago
Our new minis are being moved to the MPT colo for simplicity, so these older minis should go there as well.
Assignee
Comment 4•14 years ago
These minis are already in sjc1, and so is the rest of the Thunderbird infrastructure. Right now the Thunderbird infra is self-contained, and as far as I understand it we do not route its VLANs across the rest of the sjc1 datacenter. In talking to gozer earlier (also cced on this bug), he said that there's room in the Thunderbird rack and on the switch for more minis. It is merely a question of making sure there's enough power there (I've cced mrz for this purpose). If there is not sufficient power in that rack, we can work to route the Thunderbird VLANs across the datacenter, but this would be more complicated.
There is also still a question of how to get the thunderbird OS image onto these minis, and perhaps jhopkins can be of some help there.
Assignee
Comment 5•14 years ago
I spoke with dmoore, and he says we have space and power in the rack right across from the tbird rack if we do not have space in the rack itself. We can make the hardware side of things work.
Comment 6•14 years ago
Please update here when we have a timeframe for when the minis will be ready.
Assignee
Comment 7•14 years ago
Releng folks: I believe these are good to move now, correct? (There are 11 listed, not 8, btw.) I'll take care of removing them from nagios.
Assignee
Comment 8•14 years ago
Ah, I also have no record of the following 3 machines in DNS or nagios, so maybe that's where the extra machines beyond 8 came from:
* moz2-darwin9-slave073
* moz2-darwin9-slave074
* moz2-darwin9-slave075
Assignee
Comment 9•14 years ago
I've removed the following from nagios:
* moz2-darwin9-slave64
* moz2-darwin9-slave65
* moz2-darwin9-slave66
* moz2-darwin9-slave67
* moz2-darwin9-slave69
* moz2-darwin9-slave70
* moz2-darwin9-slave71
* moz2-darwin9-slave72
I've left moz2-darwin9-slave68 in nagios.
The following don't exist:
* moz2-darwin9-slave73
* moz2-darwin9-slave74
* moz2-darwin9-slave75
Assignee
Updated•14 years ago
Severity: normal → critical
Assignee
Comment 11•14 years ago
Please move these and run networking to the momo router. Please also reimage half of these with 10.5 and half with 10.6 using DeployStudio (we can walk someone through the reimage).
Comment 12•14 years ago
I have set aside another 8 machines (coop & arr are in the loop):
* moz2-darwin9-slave55 - still building
* moz2-darwin9-slave56
* moz2-darwin9-slave57
* moz2-darwin9-slave58
* moz2-darwin9-slave59
* moz2-darwin9-slave60 - still building
* moz2-darwin9-slave61 - still building
* moz2-darwin9-slave62 - still building
* moz2-darwin9-slave63
I have disabled all of them in slavealloc, added a note, and gracefully shut them all down. Except for the 4 machines with builds still going, everything else can go to be reimaged.
Comment 13•14 years ago
To be clear, this new batch should also be split 50/50 between 10.5 and 10.6.
Sorry for the last minute addition.
Assignee
Comment 14•14 years ago
FYI you listed 9 machines, not 8. Should that be:
* moz2-darwin9-slave56
* moz2-darwin9-slave57
* moz2-darwin9-slave58
* moz2-darwin9-slave59
* moz2-darwin9-slave60 - still building
* moz2-darwin9-slave61 - still building
* moz2-darwin9-slave62 - still building
* moz2-darwin9-slave63
?
Comment 15•14 years ago
moz2-darwin9-slave59 doesn't exist, i.e. it doesn't appear in slavealloc or nagios AFAICT.
Assignee
Comment 16•14 years ago
The full list of machines should be:
* moz2-darwin9-slave55
* moz2-darwin9-slave56
* moz2-darwin9-slave57
* moz2-darwin9-slave58
* moz2-darwin9-slave60
* moz2-darwin9-slave61
* moz2-darwin9-slave62
* moz2-darwin9-slave63
* moz2-darwin9-slave64
* moz2-darwin9-slave65
* moz2-darwin9-slave66
* moz2-darwin9-slave67
* moz2-darwin9-slave69
* moz2-darwin9-slave70
* moz2-darwin9-slave71
* moz2-darwin9-slave72
Assignee
Comment 17•14 years ago
The first 8 in this bug imaged without error and are racked in/near the momo rack. At this point, we have a power adapter issue which we'll be working to solve tomorrow morning. We'll also reimage the other 8 machines tomorrow morning and move them in/next to the momo rack. Once that's done, we'll cable them up to the momo switch, and the configuration changes that gozer made to momo dns/dhcp should allow them to boot up on the tbird build network.
Note that they'll be configured just like Firefox build slave minis, so the passwords will be the same, and they will be trying to talk to the releng puppet server. jhopkins is going to get the passwords from the releng team and work on getting the tbird builds working on these machines after they're up.
A huge thanks to everyone in IT (mrz, phong, jabba, dmoore, ravi, dustin, gozer, et al.) who's scrambled to make this happen at the last minute.
Assignee
Updated•14 years ago
Assignee: server-ops-releng → arich
Assignee
Updated•14 years ago
Summary: Repurpose 8 OS X 10.5 mac minis for Thunderbird builds → Repurpose 16 OS X 10.5 mac minis for Thunderbird builds (8 10.5/8 10.6)
Comment 18•14 years ago
Yes, thanks from the Vancouver & TB teams as well.
Comment 19•14 years ago
Internal DNS names/IPs have been allocated for these:
; Network 10.200.80.0
tb2-darwin9-slave55 IN A 10.200.80.55
tb2-darwin9-slave56 IN A 10.200.80.56
tb2-darwin9-slave57 IN A 10.200.80.57
tb2-darwin9-slave58 IN A 10.200.80.58
tb2-darwin9-slave60 IN A 10.200.80.60
tb2-darwin9-slave61 IN A 10.200.80.61
tb2-darwin9-slave62 IN A 10.200.80.62
tb2-darwin9-slave63 IN A 10.200.80.63
tb2-darwin9-slave64 IN A 10.200.80.64
tb2-darwin9-slave65 IN A 10.200.80.65
tb2-darwin9-slave66 IN A 10.200.80.66
tb2-darwin9-slave67 IN A 10.200.80.67
tb2-darwin9-slave69 IN A 10.200.80.69
tb2-darwin9-slave70 IN A 10.200.80.70
tb2-darwin9-slave71 IN A 10.200.80.71
tb2-darwin9-slave72 IN A 10.200.80.72
I kept the names the same (s/moz2/tb2/) to keep things simple.
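As a hedged sanity check once the zone is live, each name can be resolved with dig (the sj.mozillamessaging.com domain is the one these hosts use later in this bug):

  # Each lookup should print the matching 10.200.80.x address.
  for n in 55 56 57 58 60 61 62 63 64 65 66 67 69 70 71 72; do
    dig +short tb2-darwin9-slave$n.sj.mozillamessaging.com
  done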
Comment 20•14 years ago
Networking is ready for them; just hook them up in order on momo-core3 (an HP ProCurve), starting at port 5.
tb2-darwin9-slave55 5
tb2-darwin9-slave56 6
tb2-darwin9-slave57 7
tb2-darwin9-slave58 8
tb2-darwin9-slave60 9
tb2-darwin9-slave61 10
tb2-darwin9-slave62 11
tb2-darwin9-slave63 12
tb2-darwin9-slave64 13
tb2-darwin9-slave65 14
tb2-darwin9-slave66 15
tb2-darwin9-slave67 16
tb2-darwin9-slave69 17
tb2-darwin9-slave70 18
tb2-darwin9-slave71 19
tb2-darwin9-slave72 20
They should get everything via DHCP.
Comment 21•14 years ago
(In reply to Amy Rich [:arich] from comment #17)
> A huge thanks to everyone in IT (mrz, phong, jabba, dmoore, ravi, dustin,
> gozer, et.al.) who's scrambled to make this happen at the last minute.
Seriously! Well executed :D
Comment 22•14 years ago
(In reply to Amy Rich [:arich] from comment #17)
> The first 8 in this bug imaged without error and are racked in/near the momo
> rack. At this point, we have a power adapter issue which we'll be working
> to solve tomorrow morning. We'll also reimage the other 8 machines tomorrow
> morning and move them in/next to the momo rack. Once that's done, we'll
> cable them up to the momo switch, and the configuration changes that gozer
> made to momo dns/dhcp should allow them to boot up on the tbird build
> network.
Based on comment #20, does IT have all the information from gozer now to get these networked this morning? jhopkins already has login info for these machines, so he's ready to start as soon as these machines are up.
> A huge thanks to everyone in IT (mrz, phong, jabba, dmoore, ravi, dustin,
> gozer, et.al.) who's scrambled to make this happen at the last minute.
Yes, thanks to everyone who has helped out on this firedrill.
Comment 23•14 years ago
(In reply to Chris Cooper [:coop] from comment #22)
> Based on comment #20, does IT have all the information from gozer now to get
> these networked this morning? jhopkins already has login info for these
> machines, so he's ready to start as soon as these machines are up.
Yes, that's underway now.
As a reminder, inventory will need to be updated when the dust settles. I'm happy to help with that - let's just make sure we have the necessary info (switchports, rack locations).
Assignee
Comment 24•14 years ago
Here are the MACs:
host moz2-darwin9-slave55 { hardware ethernet 00:16:cb:af:a1:c0; }
host moz2-darwin9-slave56 { hardware ethernet 00:16:cb:af:a2:06; }
host moz2-darwin9-slave57 { hardware ethernet 00:16:cb:af:a1:ec; }
host moz2-darwin9-slave58 { hardware ethernet 00:16:cb:b0:75:66; }
host moz2-darwin9-slave60 { hardware ethernet 00:16:cb:af:5b:90; }
host moz2-darwin9-slave61 { hardware ethernet 00:16:cb:af:40:39; }
host moz2-darwin9-slave62 { hardware ethernet 00:16:cb:af:24:cd; }
host moz2-darwin9-slave63 { hardware ethernet 00:16:cb:af:6c:04; }
host moz2-darwin9-slave64 { hardware ethernet 00:16:cb:af:24:7f; }
host moz2-darwin9-slave65 { hardware ethernet 00:16:cb:af:71:72; }
host moz2-darwin9-slave66 { hardware ethernet 00:16:cb:af:9d:d5; }
host moz2-darwin9-slave67 { hardware ethernet 00:16:cb:af:9d:83; }
host moz2-darwin9-slave69 { hardware ethernet 00:16:cb:ae:af:08; }
host moz2-darwin9-slave70 { hardware ethernet 00:16:cb:ae:26:ff; }
host moz2-darwin9-slave71 { hardware ethernet 00:1f:f3:46:c6:cd; }
host moz2-darwin9-slave72 { hardware ethernet 00:16:cb:af:6b:ff; }
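If these entries are also meant to hand out the static addresses from comment 19, each dhcpd host block would presumably carry a fixed-address line as well; a sketch for the first machine, with the block layout assumed rather than taken from the actual dhcpd.conf:

  host tb2-darwin9-slave55 {
    hardware ethernet 00:16:cb:af:a1:c0;
    fixed-address 10.200.80.55;
  }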
Comment 25•14 years ago
As a note, the stickers on these say "tb-dar.." not "tb2-dar..". I'll add a note in inventory to that effect to head off future confusion.
If we can note rack position here, that would be great. Adam, when you find these on the switches, if you can add switchport info, that'd also be great. If you don't get a chance, I can go back later to find them, too.
Reporter
Comment 26•14 years ago
We have two of the above build slaves (one 10.5, one 10.6) networked, configured with minimal changes, and running tryserver builds at the moment. There might be a bit of tweaking yet, but things are looking pretty good so far.
Comment 27•14 years ago
Here are the OS layouts. Sorry it's kinda random:
tb-darwin9-slave55 Leopard 10.5
tb-darwin9-slave56 Leopard 10.5
tb-darwin9-slave57 Leopard 10.5
tb-darwin9-slave58 Leopard 10.5
tb-darwin9-slave60 Snow Leopard 10.6
tb-darwin9-slave61 Snow Leopard 10.6
tb-darwin9-slave62 Snow Leopard 10.6
tb-darwin9-slave63 Snow Leopard 10.6
tb-darwin9-slave64 Snow Leopard 10.6
tb-darwin9-slave65 Leopard 10.5
tb-darwin9-slave66 Snow Leopard 10.6
tb-darwin9-slave67 Leopard 10.5
tb-darwin9-slave69 Snow Leopard 10.6
tb-darwin9-slave70 Leopard 10.5
tb-darwin9-slave71 Snow Leopard 10.6
tb-darwin9-slave72 Leopard 10.5
All are done and I verified the OS booted up. The hostnames above match the stickers on the minis. I did not set hostnames in the OS on any of them.
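To double-check the 10.5/10.6 split against this list, a hedged sketch, assuming ssh access as the build user (sw_vers is the stock OS X version tool):

  # Print each slave's installed OS X version next to its name.
  for n in 55 56 57 58 60 61 62 63 64 65 66 67 69 70 71 72; do
    printf 'tb2-darwin9-slave%s: ' "$n"
    ssh tb2-darwin9-slave$n.sj.mozillamessaging.com sw_vers -productVersion
  done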
Assignee
Comment 28•14 years ago
jabba and adam finished the network setup on these, so they're all reachable from within the tbird network now. Over to gozer/jhopkins for more build magic.
Comment 29•14 years ago
Port information:
moz2-darwin9-slave55 0016cb-afa1c0 - asx103-07b:37
moz2-darwin9-slave56 0016cb-afa206 - asx103-07b:35
moz2-darwin9-slave57 0016cb-afa1ec - asx103-07b:33
moz2-darwin9-slave58 0016cb-b07566 - asx103-07a:10
moz2-darwin9-slave60 0016cb-af5b90 - asx103-07a:3
moz2-darwin9-slave61 0016cb-af4039 - asx103-07a:5
moz2-darwin9-slave62 0016cb-af24cd - asx103-07a:7
moz2-darwin9-slave63 0016cb-af6c04 - asx103-07a:9
moz2-darwin9-slave64 0016cb-af247f - ?????????????
moz2-darwin9-slave65 0016cb-af7172 - asx103-07a:48
moz2-darwin9-slave66 0016cb-af9dd5 - asx103-07a:43
moz2-darwin9-slave67 0016cb-af9d83 - asx103-07a:45
moz2-darwin9-slave69 0016cb-aeaf08 - asx103-07a:42
moz2-darwin9-slave70 0016cb-ae26ff - asx103-07a:46
moz2-darwin9-slave71 001ff3-46c6cd - asx103-07a:44
moz2-darwin9-slave72 0016cb-af6bff - asx103-07a:47
Comment 30•14 years ago
Inventory updated/verified.
Comment 31•14 years ago
(In reply to Adam Newman from comment #29)
> Port information:
> moz2-darwin9-slave64 0016cb-af247f - ?????????????
This one is plugged into the momo procurve switch, port number 5, down on the 14th floor in the momo rack.
Comment 32•14 years ago
(In reply to Justin Dow [:jabba] from comment #31)
> (In reply to Adam Newman from comment #29)
> > Port information:
>
> > moz2-darwin9-slave64 0016cb-af247f - ?????????????
>
> This one is plugged into the momo procurve switch, port number 5, down on
> the 14th floor in the momo rack.
Is the plan to leave that one there?
Comment 33•14 years ago
All hosts are up and on our network, yay!
Host tb2-darwin9-slave55.sj.mozillamessaging.com (10.200.80.55) appears to be up.
Host tb2-darwin9-slave56.sj.mozillamessaging.com (10.200.80.56) appears to be up.
Host tb2-darwin9-slave57.sj.mozillamessaging.com (10.200.80.57) appears to be up.
Host tb2-darwin9-slave58.sj.mozillamessaging.com (10.200.80.58) appears to be up.
Host tb2-darwin9-slave60.sj.mozillamessaging.com (10.200.80.60) appears to be up.
Host tb2-darwin9-slave61.sj.mozillamessaging.com (10.200.80.61) appears to be up.
Host tb2-darwin9-slave62.sj.mozillamessaging.com (10.200.80.62) appears to be up.
Host tb2-darwin9-slave63.sj.mozillamessaging.com (10.200.80.63) appears to be up.
Host tb2-darwin9-slave64.sj.mozillamessaging.com (10.200.80.64) appears to be up.
Host tb2-darwin9-slave65.sj.mozillamessaging.com (10.200.80.65) appears to be up.
Host tb2-darwin9-slave66.sj.mozillamessaging.com (10.200.80.66) appears to be up.
Host tb2-darwin9-slave67.sj.mozillamessaging.com (10.200.80.67) appears to be up.
Host tb2-darwin9-slave69.sj.mozillamessaging.com (10.200.80.69) appears to be up.
Host tb2-darwin9-slave70.sj.mozillamessaging.com (10.200.80.70) appears to be up.
Host tb2-darwin9-slave71.sj.mozillamessaging.com (10.200.80.71) appears to be up.
Host tb2-darwin9-slave72.sj.mozillamessaging.com (10.200.80.72) appears to be up.
There is a small inconsistency in naming: the 10.6 ones should be called darwin10, but I'll fix that at a later time.
slave64 and slave65 are currently connected to the try master and running their first builds now. If these turn green, we can move these minis into production and start moving the other ones in as well.
It's a bit of a manual process, but not too bad.
NOTE: We are changing passwords and nuking Firefox keys from these machines.
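The "appears to be up" lines above match the output of an nmap ping sweep; a minimal sketch of the equivalent command (flag spelling per nmap versions of that era):

  # Ping-scan the tb2 slave address range; live hosts are reported as
  # "appears to be up".
  nmap -sP 10.200.80.55-72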
Comment 34•14 years ago
Inventory updated with new names and a new location for slave64. As adam noted, the switchports were correct.
Comment 35•14 years ago
(In reply to Philippe M. Chiasson (:gozer) from comment #33)
> All hosts are up and on our network, yay!
...
> slave64 and slave65 are currently connected to the try master and running
> their first builds now. If these turn green, we can move these minis into
> production and start moving the other ones in as well.
...
more sweet progress, thanks. I'll keep my fingers crossed for green builds! :-)
Assignee
Comment 36•14 years ago
I don't think gozer put his last update in before he was off for the night, but we had a green build on the 10.5 machine:
http://build.mozillamessaging.com/buildbot/try/builders/OS%20X%2010.5.2%20try%20build/builds/262
He was going to push more try builds into the queue to give them some exercise. If all goes well, he was expecting to be able to put them into the production pool in the morning.
I didn't see any comment from him on the 10.6 build before he left for the evening.
Reporter
Comment 37•14 years ago
Status of the most recent tryserver build on each:
55 - ok - 10.5
56 - ok - 10.5
57 - ok - 10.5
58 - ok - 10.5
65 - ok - 10.5
67 - ok - 10.5
72 - ok - 10.5
60 - fail - 10.6
62 - fail - 10.6
64 - fail - 10.6
66 - fail - 10.6
69 - fail - 10.6
70 - fail - 10.5
61 - no builds - 10.6
63 - no builds - 10.6
71 - no builds - 10.6
Reporter
Comment 38•14 years ago
I've stopped the buildbot client on the systems listed above that are marked "ok" so that new builds will test the others.
Most of the others were failing on yasm being out of date. It turns out there is a puppet recipe to upgrade it to 1.1.0. I got the .dmg from rail and installed it on all out-of-date build slaves.
slave64 was having an auto-login issue so buildbot didn't start automatically. Fixed.
slave70 is still broken - it needs a reinstall or fix of mercurial. /opt/local/bin/hg is missing. Copying that file from a couple of other systems didn't work. I've stopped the buildbot client on this build slave.
I pushed enough changes to get builds running on all the "failed" or "no builds" slaves above, except slave70. They all seem to have made it to the compile stage so far.
Since yasm was out of date, we should do an audit of the required package versions on these systems to make sure there aren't other older dependencies that need upgrading.
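For the audit suggested above, a hedged sketch; the tools checked are the ones mentioned in this bug, and the host list and ssh access are assumptions:

  # Compare key toolchain versions across all the new slaves.
  for n in 55 56 57 58 60 61 62 63 64 65 66 67 69 70 71 72; do
    echo "== tb2-darwin9-slave$n =="
    ssh tb2-darwin9-slave$n.sj.mozillamessaging.com \
      'yasm --version | head -1; hg --version | head -1; xcodebuild -version'
  done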
Comment 39•14 years ago
(In reply to John Hopkins (:jhopkins) from comment #38)
> slave70 is still broken - it needs a reinstall or fix of mercurial.
> /opt/local/bin/hg is missing. Copying that file from a couple of other
> systems didn't work. I've stopped the buildbot client on this build slave.
slave70's mercurial is fixed. Not sure what went on there; I had to rsync /tools/ from another 10.6 host.
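A sketch of that fix, run on slave70 and assuming slave69 (one of the 10.6 hosts per comment 27) as the known-good source; the actual source host isn't stated:

  # Mirror the /tools tree from a working 10.6 slave onto slave70.
  rsync -av tb2-darwin9-slave69.sj.mozillamessaging.com:/tools/ /tools/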
Comment 40•14 years ago
Per quick phone call w/jhopkins just now:
(In reply to John Hopkins (:jhopkins) from comment #37)
> Status of the most recent tryserver build on each:
>
> 55 - ok - 10.5
> 56 - ok - 10.5
> 57 - ok - 10.5
> 58 - ok - 10.5
> 65 - ok - 10.5
> 67 - ok - 10.5
> 72 - ok - 10.5
These are looking good. We should be able to make a "go/nogo" decision on moving these into Thunderbird production ~1pm PDT today.
> 60 - fail - 10.6
> 62 - fail - 10.6
> 64 - fail - 10.6
> 66 - fail - 10.6
> 69 - fail - 10.6
> 70 - fail - 10.5
> 61 - no builds - 10.6
> 63 - no builds - 10.6
> 71 - no builds - 10.6
These machines still being investigated.
Comment 41•14 years ago
gozer:
(In reply to John Hopkins (:jhopkins) from comment #38)
> I've stopped the buildbot client on the systems listed above that are marked
> "ok" so that new builds will test the others.
>
> Most of the others were failing on yasm being out of date. Turns out there
> is a puppet recipe to upgrade it to 1.1.0. Got the .dmg from rail and
> installed it on all out of date build slaves.
>
> slave64 was having an auto-login issue so buildbot didn't start
> automatically. Fixed.
>
> slave70 is still broken - it needs a reinstall or fix of mercurial.
> /opt/local/bin/hg is missing. Copying that file from a couple of other
> systems didn't work. I've stopped the buildbot client on this build slave.
>
> I pushed enough changes to get builds running on all the "failed" or "no
> builds" slaves above, except slave70. They all seem to have made it to the
> compile stage so far.
>
> Since yasm was out of date, we should do an audit of the required package
> versions on these systems to make sure there aren't other older dependencies
> that need upgrading.
(In reply to Philippe M. Chiasson (:gozer) from comment #39)
> (In reply to John Hopkins (:jhopkins) from comment #38)
> > slave70 is still broken - it needs a reinstall or fix of mercurial.
> > /opt/local/bin/hg is missing. Copying that file from a couple of other
> > systems didn't work. I've stopped the buildbot client on this build slave.
>
> slave70 mercurial fixed, not sure what went on there, had to rsync /tools/
> from another 10.6 host.
It's great to see these issues identified and fixed. However, I'm curious: what happened in the imaging/setup process that allowed these machines to be imaged incorrectly like this? What changes to the imaging process do we need to make before we are confident that we can image machines successfully next time?
Comment 42•14 years ago
On 10.5, we are happy and have just moved 7 slaves to the production pool, leaving the other one in try:
production:
tb2-darwin9-slave55
tb2-darwin9-slave56
tb2-darwin9-slave57
tb2-darwin9-slave58
tb2-darwin9-slave65
tb2-darwin9-slave67
tb2-darwin9-slave70
try:
tb2-darwin9-slave72
Comment 43•14 years ago
(In reply to John O'Duinn [:joduinn] from comment #41)
> gozer:
> [...]
>
> (In reply to Philippe M. Chiasson (:gozer) from comment #39)
> > (In reply to John Hopkins (:jhopkins) from comment #38)
> > > slave70 is still broken - it needs a reinstall or fix of mercurial.
> > > /opt/local/bin/hg is missing. Copying that file from a couple of other
> > > systems didn't work. I've stopped the buildbot client on this build slave.
> >
> > slave70 mercurial fixed, not sure what went on there, had to rsync /tools/
> > from another 10.6 host.
>
> Its great to see these issues identified and fixed. However, I'm curious -
> what happened in the imaging/setup process that allowed these machines to be
> imaged incorrectly like this?
Except for the mercurial strangeness, it wasn't an imaging problem. Just differences between TB's refplatform and this one.
> What changes to imaging process do we need to
> make before we are confident that we can image machines successfully next
> time?
Not sure what happened on slave70; it could have been operator error. What I do know is that there was a mercurial symlink under /tools that was pointing to a versioned /tools/hg-n.m.o directory that wasn't there on that box, but was on the others.
Comment 44•14 years ago
(In reply to Philippe M. Chiasson (:gozer) from comment #32)
> (In reply to Justin Dow [:jabba] from comment #31)
> > (In reply to Adam Newman from comment #29)
> > > Port information:
> ...
> >
> > This one is plugged into the momo procurve switch, port number 5, down on
> > the 14th floor in the momo rack.
>
> Is the plan to leave that one there ?
Just to clarify, that one *needs* to stay there so I can stick it in the calendar VLAN later.
Reporter
Comment 45•14 years ago
I've done a software version compare (YVR minis vs. replacement minis) and gone over it with gozer. The only potential 10.6 version issue is that Xcode is version 3.2.2 on YVR minis but 3.2.1 on replacement minis. I'll ask standard8 whether this is a concern. If it is, we'll need an Xcode 3.2.2 .dmg to install from.
As gozer mentioned above, we've deployed the 10.5 minis to production tests; however, we are holding off on allowing these to do comm-1.9.2 builds until they've soaked for a while.
Once the 10.6 minis are ok'd (re: xcode version above) and have green builds, we'll move these to comm-central builds for a while to let them soak, then move them to release builds once we're absolutely happy with them (they will not be used for 7.0.1 builds, so that QA can focus on the product changes, without concern for build environment changes).
gozer is working through a virtualenv fix, and a test build is running right now.
Comment 46•14 years ago
Okay, found out why builds are a bit slow on these: they all have only 1 GB of RAM, so swapping ensues...
But they are building fine so far!
Comment 47•14 years ago
The 10.5 builders aren't quite right; they are failing some unit tests:
http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTrunk/1317290353.1317291834.6648.gz
http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTrunk/1317246832.1317248141.16426.gz
http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTrunk/1317240584.1317241901.1839.gz#err2
(note the test_smtpPasswordFailure3.js is a random orange).
The older minis in Vancouver are passing those tests just fine.
Now that I look at the results, this looks like a umask issue on those builders, which I believe we've had before.
Comment 48•14 years ago
(In reply to Mark Banner (:standard8) from comment #47)
> The 10.5 builders aren't quite right, they are failing some unit tests:
>
> http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTrunk/1317290353.
> 1317291834.6648.gz
> http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTrunk/1317246832.
> 1317248141.16426.gz
> http://tinderbox.mozilla.org/showlog.cgi?log=ThunderbirdTrunk/1317240584.
> 1317241901.1839.gz#err2
>
> (note the test_smtpPasswordFailure3.js is a random orange).
>
> The older minis in Vancouver are passing those tests just fine.
>
> And now I look at the results - this looks like a umask issue on those
> builders, which I believe we've had before.
Right on, that was the problem. Umask fixed (umask = 002) on all tb2-darwin* slaves.
Also, mozmill (and friends) was missing for older (not-in-tree) mozmill runs, so I fixed that too.
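The umask change itself isn't shown; a hedged sketch of one common way to apply it, assuming the slave process runs under a login shell for the build user (the exact mechanism depends on how buildbot is launched on these minis, and the slave basedir below is an assumption):

  # Make new files group-writable for the build user, then restart the
  # slave so spawned build processes inherit the new umask.
  echo 'umask 002' >> ~/.bash_profile
  buildslave restart /builds/slave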
Comment 49•14 years ago
tb2-darwin9-slave70:
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird-Release-Release/1317309283.1317309289.15663.gz
abort: couldn't find mercurial libraries in [/opt/local/lib/python2.5/site-packages /opt/local/bin /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python25.zip /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5 /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-darwin /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-mac /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/plat-mac/lib-scriptpackages /System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-tk /System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload /Library/Python/2.5/site-packages /System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python/PyObjC]
(check your install and PYTHONPATH)
There's still something not quite right there.
Comment 50•14 years ago
(In reply to Mark Banner (:standard8) from comment #49)
> tb2-darwin9-slave70:
>
> http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird-Release-Release/
> 1317309283.1317309289.15663.gz
> There's still something not quite right there.
That slave hadn't quite picked up the new PATH; fixed now.
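For future reference, a quick hedged check for this class of failure, run on the slave as the build user:

  # Confirm hg resolves to the expected install and that its Python
  # libraries import cleanly under the interpreter in use.
  which hg
  hg --version
  python -c 'import mercurial; print mercurial.__path__'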
Comment 51•14 years ago
As a gentle reminder, please update inventory to match the darwin10 machines' new hostnames.
Comment 52•14 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #51)
> As a gentle reminder, please update inventory to match the darwin10
> machines' new hostnames.
Inventory gently updated. Noted that the labels say "tb-..." and not "tb2-...".
Comment 53•14 years ago
After the RAM upgrades, the 10.6 try slaves have been producing green builds! Last step is moving them into the production pool.
Comment 54•14 years ago
(In reply to Philippe M. Chiasson (:gozer) from comment #48)
> Also, mozmill (and friends) was missing for older (not in tree) mozmill
> runs, so fixed that too.
I was looking at the comm-release and comm-beta mozmill tests for those Macs today; all the new Macs are failing when running the legacy mozmill installation, so this bit isn't quite right yet.
Comment 55•14 years ago
(In reply to Mark Banner (:standard8) from comment #54)
> (In reply to Philippe M. Chiasson (:gozer) from comment #48)
> > Also, mozmill (and friends) was missing for older (not in tree) mozmill
> > runs, so fixed that too.
>
> I was looking at the comm-release and comm-beta mozmill tests for those Macs
> today - all the new macs are failing running the legacy mozmill
> installation, so this bit isn't quite right yet.
I mistakenly installed the wrong version of mozmill (1.5.x) on these slaves; rolling it back fixed the problem.
Reporter
Comment 56•14 years ago
These Mac Minis are on their way from Vancouver to MPT:
2 x OSX 10.6 minis via FedEx air
ETA 10:30am Friday Sept. 30
6 x OSX 10.5 minis via FedEx ground
ETA 3 business days
Assignee
Comment 57•14 years ago
I think the work that this bug was intended for is now complete, and I can close it. Agreed, gozer, jhopkins, coop, joduinn?
Comment 58•14 years ago
gozer: +1
Reporter
Comment 59•14 years ago
(In reply to Amy Rich [:arich] [:arr] from comment #57)
> I think the work that this bug was intended for is now complete, and I can
> close it. Agreed? (gozer, jhopkins, coop, joduinn)?
Agreed
Comment 60•14 years ago
Per IRC w/gozer, jhopkins:
1) The corrected/rolled-back mozmill is now giving us green builds. standard8, if you're still seeing any problems, let us know.
2) The 10.5 machines in SJ are running in production, alongside the machines in Vancouver.
3) The 10.6 machines are still being worked on.
jhopkins/gozer to give status on what (if anything) is left to do before powering off the machines in the Vancouver office.
Amy: Unclear at this time if there is anything left for IT to do here. I'd like to keep this open to track whatever work remains.
Reporter
Comment 61•14 years ago
Status is being tracked in bug 688230. The 16 minis have been successfully repurposed so I am closing this bug; confirmed with joduinn.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•12 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations