Closed
Bug 635964
Opened 14 years ago
Closed 14 years ago
re-purpose 16 darwin9 minis for darwin10 service
Categories: Release Engineering :: General, defect, P4
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: dustin, Assigned: dustin
Whiteboard: [slaveduty][buildslaves]
Attachments (4 files, 1 obsolete file)
10.39 KB, patch | bhearsum: review+
1.25 KB, patch | catlee: review+, dustin: checked-in+
3.06 KB, patch | dustin: review+, mozilla: checked-in+
5.50 KB, patch | dustin: review+, mozilla: checked-in+
The following slaves were killed in bug 604497 (which is secret for unrelated reasons), and will be brought back up as 10.6 builders and try systems.
moz2-darwin9-slave29
moz2-darwin9-slave30
moz2-darwin9-slave31
moz2-darwin9-slave32
moz2-darwin9-slave33
moz2-darwin9-slave34
moz2-darwin9-slave35
moz2-darwin9-slave36
moz2-darwin9-slave37
try-mac-slave20
try-mac-slave21
try-mac-slave22
try-mac-slave23
try-mac-slave24
try-mac-slave25
try-mac-slave26
try-mac-slave27
try-mac-slave28
try-mac-slave29
Spencer is currently cracking them and adding RAM in the IT area.
Comment 1 • 14 years ago
In addition to the RAM upgrade, these machines need to be reimaged with OS X 10.6 and renamed to:
moz2-darwin10-slave{53...61}
try-mac64-slave{27...36}
Comment 2 • 14 years ago
The following Mac minis were re-imaged and had their RAM upgraded from 2 GB to 4 GB:
QM-PLEOPARD-TRY12
QM-PLEOPARD-TRY13
QM-PLEOPARD-TRY14
QM-PLEOPARD-TRY15
QM-PLEOPARD-TRY16
QM-PUBUNTU-TRY12 (could not upgrade RAM; screw is stripped)
QM-PUBUNTU-TRY13
QM-PUBUNTU-TRY14
QM-PUBUNTU-TRY15
QM-PUBUNTU-TRY16
MOZ2-DRAWIN9-SLAVE30
MOZ2-DRAWIN9-SLAVE29
MOZ2-DRAWIN9-SLAVE31
MOZ2-DRAWIN9-SLAVE32
MOZ2-DRAWIN9-SLAVE33
MOZ2-DRAWIN9-SLAVE34
MOZ2-DRAWIN9-SLAVE35
MOZ2-DRAWIN9-SLAVE36
MOZ2-DRAWIN9-SLAVE37
ZACK-TESTING
Comment 3 • 14 years ago (Assignee)
(In reply to comment #2)
> The following Mac minis were re-imaged and had their RAM upgraded from 2 GB to 4 GB:
I'm assuming these are old names for the machines?
> QM-PLEOPARD-TRY12
> QM-PLEOPARD-TRY13
> QM-PLEOPARD-TRY14
> QM-PLEOPARD-TRY15
> QM-PLEOPARD-TRY16
> QM-PUBUNTU-TRY12 (could not upgrade RAM; screw is stripped)
> QM-PUBUNTU-TRY13
> QM-PUBUNTU-TRY14
> QM-PUBUNTU-TRY15
> QM-PUBUNTU-TRY16
So the above came from another pool of unused minis?
> MOZ2-DRAWIN9-SLAVE30
> MOZ2-DRAWIN9-SLAVE29
> MOZ2-DRAWIN9-SLAVE31
> MOZ2-DRAWIN9-SLAVE32
> MOZ2-DRAWIN9-SLAVE33
> MOZ2-DRAWIN9-SLAVE34
> MOZ2-DRAWIN9-SLAVE35
> MOZ2-DRAWIN9-SLAVE36
> MOZ2-DRAWIN9-SLAVE37
Presumably those are the same moz2-darwin9-slaveNN mentioned in comment 0.
> ZACK-TESTING
and this was in our inventory already.
These have been reimaged, but what names have they been given? And what happened to the try-mac-slaveNN mentioned in comment 0? Did you do any additional diagnostics on try-mac-slave28?
Comment 4 • 14 years ago
The QM- hostnames were on the front of the machines; those are actually the try-mac-slave machines listed in comment 0.
Spencer, you'll find them in inventory under the try-mac names that are on the backs of the machines. Please give them new names per comment 1 and update inventory accordingly.
As to try-mac-slave28, let's see what happens post-imaging.
Comment 5 • 14 years ago
(In reply to comment #4)
> As to try-mac-slave28, let's see what happens post-imaging.
iirc, this slave was also causing grief before imaging. Maybe it needs to go to greener pastures?
Comment 6 • 14 years ago (Assignee)
zandr/ssh are on top of it.
Comment 7 • 14 years ago
All Macs have been renamed, relabeled, and updated in inventory:
moz2-darwin10-slave{53...61}
try-mac64-slave{27...36}
John said he had trouble trying to ssh into TRY-MAC64-SLAVE35.
Updated • 14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment 8 • 14 years ago (Assignee)
Thanks, spencer!
Now we need to bring these back up and attach them to masters - a slaveduty job.
Assignee: shui → server-ops-releng
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 9 • 14 years ago (Assignee)
Oops, back to the releng component for bear.
Assignee: server-ops-releng → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Comment 10 • 14 years ago
Power and network would be a good first step, actually. /me goes back to the server closet to make that happen.
Comment 11 • 14 years ago (Assignee)
Is nagios updated to monitor these new slaves? Can we set that up once they're powered up? New bug, or on this one?
Comment 13 • 14 years ago
Found in triage with zandr:
(In reply to comment #10)
> Power and network would be a good first step, actually. /me goes back to the
> server closet to make that happen.
1) Instead of putting these in the new server room, we will just put them back on the shelf in the QA lab and connect them to power/network there.
2) Nagios is not yet done.
Not sure why this is in RelEng; moving to IT to resolve.
Assignee: nobody → shui
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Whiteboard: [slaveduty][buildslaves] → [slaveduty][buildslaves][subject to embargo]
Comment 14 • 14 years ago
Powered up and networked in a much neater AFK.
It's Friday at 5; I'll investigate DNS/DHCP later.
Comment 15 • 14 years ago
I started looking at these Friday night, and inventory/dns/hostname are inconsistent, with at least two missing from inventory altogether.
Spencer, could you verify each of
moz2-darwin10-slave{53...61}
try-mac64-slave{27...36}
and make sure that each machine:
* is in inventory with a correct IP address
* is in DNS with a correct IP address and hostname
* has its hostname set to match
* is labeled to match the above.
Since everything else is software, I'd start by comparing labels to MAC addresses on the physical boxes, and work through DHCP to DNS to inventory.
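The cross-check described above (labels to MAC addresses, then DHCP, DNS, and inventory) could be sketched roughly as follows. This is only an illustration: the function name `find_mismatches`, the source names, and the sample MAC addresses (taken from the documentation range) are assumptions, not the tooling actually used here. It also assumes MAC addresses are already normalized to one spelling.

```python
def find_mismatches(sources):
    """Compare hostname records keyed by MAC address across several sources.

    `sources` maps a source name (e.g. "label", "dhcp") to a dict of
    MAC address -> hostname. Returns a dict of MAC -> {source: hostname or None}
    for every MAC where the sources disagree or a source has no record.
    """
    all_macs = set()
    for records in sources.values():
        all_macs.update(records)

    mismatches = {}
    for mac in sorted(all_macs):
        # A missing record shows up as None, which also counts as a mismatch.
        seen = {name: records.get(mac) for name, records in sources.items()}
        if len(set(seen.values())) != 1:
            mismatches[mac] = seen
    return mismatches


# Hypothetical sample data; only the hostnames come from this bug.
sources = {
    "label": {"00:00:5e:00:53:01": "try-mac64-slave27",
              "00:00:5e:00:53:02": "try-mac64-slave29"},
    "dhcp": {"00:00:5e:00:53:01": "try-mac64-slave27",
             "00:00:5e:00:53:02": "try-mac64-slave29"},
    "dns": {"00:00:5e:00:53:01": "try-mac64-slave27",
            "00:00:5e:00:53:02": "try-mac64-slave30"},
    "inventory": {"00:00:5e:00:53:01": "try-mac64-slave27"},
}

for mac, seen in find_mismatches(sources).items():
    print(mac, seen)
```

Keying everything on the MAC address follows the suggestion to start from the physical boxes, since that is the one value that cannot drift in software.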
Comment 16 • 14 years ago
OK, verification complete.
Inventory/DNS/DHCP updated and verified against hostname and physical labeling.
Over to Releng for setup of new slaves:
moz2-darwin10-slave53-61
try-mac64-slave27,29-36
NB: There is no try-mac64-slave28. That machine (the former try-mac-slave29) has a stripped screw that is currently preventing the RAM upgrade.
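For anyone scripting against the compact lists above, a spec such as `27,29-36` can be expanded into full hostnames with a few lines of Python. This is a minimal sketch; the helper name `expand_slaves` is hypothetical, and it assumes ranges are inclusive, as they are used in this bug.

```python
def expand_slaves(prefix, spec):
    """Expand a spec such as "27,29-36" into a list of full hostnames.

    Each comma-separated part is either a single number or an inclusive
    "first-last" range.
    """
    names = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        last = last or first  # a single number is a range of one
        for n in range(int(first), int(last) + 1):
            names.append(f"{prefix}{n}")
    return names


builders = expand_slaves("moz2-darwin10-slave", "53-61")
try_slaves = expand_slaves("try-mac64-slave", "27,29-36")
print(len(builders), len(try_slaves))
```

With the specs from this comment, that gives nine builders and nine try slaves, with try-mac64-slave28 absent from the second list.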
Assignee: shui → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Updated • 14 years ago
Status: REOPENED → NEW
Priority: -- → P3
Comment 17 • 14 years ago (Assignee)
This is still waiting for setup. Bump the priority if we're seeing wait times that suggest these slaves will be useful.
Priority: P3 → P4
Comment 18 • 14 years ago (Assignee)
I'll get puppet set up, and leave the re-mastering to later.
Assignee: nobody → dustin
Attachment #521859 -
Flags: review?(bhearsum)
Comment 19 • 14 years ago (Assignee)
New version with the old machines removed
Attachment #521859 -
Attachment is obsolete: true
Attachment #521859 -
Flags: review?(bhearsum)
Attachment #521865 -
Flags: review?(bhearsum)
Updated • 14 years ago
Attachment #521865 -
Flags: review?(bhearsum) → review+
Comment 20 • 14 years ago (Assignee)
OK, these machines are all talking to mv-production-puppet now, but not doing any buildbottish work.
Comment 21 • 14 years ago (Assignee)
This should add these machines to the staging configuration, for testing purposes.
Attachment #521905 -
Flags: review?(catlee)
Comment 22 • 14 years ago (Assignee)
I'm seeing
Mar 25 12:51:07 production-puppet puppetmasterd[19135]: Allowing unauthenticated client try-mac64-slave38.build.mozilla.org(10.250.48.198) access to puppetca.getcert
Mar 25 12:51:07 production-puppet puppetmasterd[19135]: Certificate request does not match existing certificate; run 'puppetca --clean try-mac-slave28.build.mozilla.org'.
in the mpt-production-puppet logs. Is this the missing try-mac64-slave28? Or the previously problematic try-mac-slave28? Or both?
198.48.250.10.in-addr.arpa domain name pointer try-mac64-slave38.build.mozilla.org.
No forward DNS for either hostname
? (10.250.48.198) at 00:16:CB:B0:75:23 [ether] on eth0
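The PTR record name in the lookup output above is the IP address with its octets reversed under `in-addr.arpa`. As a small sketch, Python's standard `ipaddress` module can derive that name from the address seen in the puppet log, which is handy when checking reverse zones by hand:

```python
import ipaddress

# Address taken from the puppetmasterd log lines in this comment.
addr = ipaddress.ip_address("10.250.48.198")

# reverse_pointer gives the name of the PTR record for this address.
ptr_name = addr.reverse_pointer
print(ptr_name)
```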
Updated • 14 years ago (Assignee)
Assignee: dustin → nobody
Updated • 14 years ago
Attachment #521905 -
Flags: review?(catlee) → review+
Updated • 14 years ago (Assignee)
Attachment #521905 -
Flags: checked-in+
Updated • 14 years ago (Assignee)
Assignee: nobody → dustin
Comment 23 • 14 years ago (Assignee)
moz2-darwin10-slave53-61 are up and running on sm02 (via slave alloc, at that!)
Comment 24 • 14 years ago (Assignee)
try-mac64-slave27,29-36 are up and running on sm02 as well.
Comment 25 • 14 years ago
/etc/hosts entries for the re-tasked hosts removed from bm-admin01, and their nagios configuration modified.
Comment 26 • 14 years ago (Assignee)
Let's hold off on moving these to production for a moment - joduinn is suggesting re-re-purposing them for developers, since we're now over-capacity on these builders.
Comment 27 • 14 years ago
Attachment #523403 -
Flags: review?(dustin)
Updated • 14 years ago (Assignee)
Attachment #523403 -
Flags: review?(dustin) → review+
Comment 28 • 14 years ago
Comment on attachment 523403 [details] [diff] [review]
new try mac64 slaves
http://hg.mozilla.org/build/buildbot-configs/rev/9eaa42cbbc7c
Attachment #523403 -
Flags: checked-in+
Comment 29 • 14 years ago (Assignee)
As of bug 647051, this bug is now only about
moz2-darwin10-slave53
moz2-darwin10-slave54
moz2-darwin10-slave55
moz2-darwin10-slave56
try-mac64-slave27
try-mac64-slave29
try-mac64-slave30
try-mac64-slave31
which are all still in AFK, powered on, and in staging.
Updated • 14 years ago (Assignee)
Attachment #523641 -
Flags: review?(dustin) → review+
Comment 31 • 14 years ago
Comment on attachment 523641 [details] [diff] [review]
adjust try macs, add moz2-darwin10 macs
http://hg.mozilla.org/build/buildbot-configs/rev/912ae2b49222
Attachment #523641 -
Flags: checked-in+
Comment 32 • 14 years ago (Assignee)
aki, if I switch the ssh keys and select a production pool in slavealloc, are the production masters ready to welcome these new slaves?
Whiteboard: [slaveduty][buildslaves][subject to embargo] → [slaveduty][buildslaves]
Comment 33 • 14 years ago (Assignee)
In production now, via slavealloc (keys changed too, don't worry):
moz2-darwin10-slave53
moz2-darwin10-slave54
moz2-darwin10-slave55
moz2-darwin10-slave56
try-mac64-slave27
try-mac64-slave29
try-mac64-slave30
try-mac64-slave31
Updated • 14 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Comment 34 • 13 years ago
? (10.250.48.198) at 00:16:CB:B0:75:23 [ether] on eth0
This is physically labeled as try-mac64-slave28, and forward DNS is correct for this host. Reverse, not so much. Fixed.
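A forward/reverse mismatch like this one can be caught with a small consistency check. The sketch below is an assumption-laden illustration, not the procedure used here: the function name `check_dns` is hypothetical, it assumes fully qualified names on both sides, and the resolvers are passed in so the example runs against stubs rather than real DNS.

```python
import socket


def check_dns(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of problems found for `name` (empty if DNS is consistent)."""
    try:
        ip = forward(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return [f"no forward DNS for {name}: {exc}"]
    try:
        ptr_name = reverse(ip)[0]
    except OSError as exc:  # socket.herror is a subclass of OSError
        return [f"no reverse DNS for {ip}: {exc}"]
    if ptr_name.rstrip(".").lower() != name.rstrip(".").lower():
        return [f"{ip} reverses to {ptr_name}, expected {name}"]
    return []


# Stub resolvers reproducing the pre-fix state described in this bug:
# forward DNS correct, reverse DNS pointing at try-mac64-slave38.
def stub_forward(name):
    return "10.250.48.198"


def stub_reverse(ip):
    return ("try-mac64-slave38.build.mozilla.org", [], [ip])


print(check_dns("try-mac64-slave28.build.mozilla.org", stub_forward, stub_reverse))
```

Calling `check_dns(name)` without the stub arguments would use the system resolver instead.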
Comment 35 • 13 years ago (Assignee)
^^ copied to bug 652983
Comment 36 • 13 years ago
I am removing many of these slaves from our configs and from slave-alloc in bug 700705.
Updated • 11 years ago
Product: mozilla.org → Release Engineering