Closed Bug 857042 Opened 11 years ago Closed 11 years ago

Please rack and cable 50 ix chassis for w7 and xp testing

Categories

(Infrastructure & Operations :: DCOps, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Unassigned)

References

Details

(Whiteboard: [50/50 chassis have been racked and cabled])

We have 50 new iX chassis on order and destined for scl3 (no ETA yet).  There should be stub entries for all hosts in DNS and inventory already. Please make sure that IPMI is configured and all hosts are set to boot from disk by default.

These hosts should be racked/cabled/VLANed as follows:

VLAN 240:

talos-w7-ix-001.wintest.releng.scl3.mozilla.com
..
talos-w7-ix-100.wintest.releng.scl3.mozilla.com


talos-xp-ix-001.wintest.releng.scl3.mozilla.com
..
talos-xp-ix-100.wintest.releng.scl3.mozilla.com


VLAN 216

talos-w7-ix-001-mgmt.inband.releng.scl3.mozilla.com
..
talos-w7-ix-100-mgmt.inband.releng.scl3.mozilla.com


talos-xp-ix-001-mgmt.inband.releng.scl3.mozilla.com
..
talos-xp-ix-100-mgmt.inband.releng.scl3.mozilla.com
Blocks: 857064
Blocks: 857065
colo-trip: --- → scl3
Finney says that the bulk of these should be delivered by about the 22nd with the possibility that some batches will come in sooner.
colo-trip: scl3 → ---
Just to make sure, do we have switches et al. all ready for these?
colo-trip: --- → scl3
(In reply to Amy Rich [:arich] [:arr] from comment #2)
> Just to make sure, do we have switches et al. all ready for these?

We have racks already deployed for ~35 of them, and quotes are pending for switches to support the remainder.
Networking quotes approved. We'll probably take up two new racks, and expand into the underutilized r102-23.
Whiteboard: the bulk of these should be delivered by about the 22nd
9 servers out of the 50 should have been delivered Friday.
Whiteboard: the bulk of these should be delivered by about the 22nd → [9/50] the bulk of these should be delivered by about the 22nd
Would that be 36 nodes?

Do we have ETA for the remaining 41? Thanks for the update!
The remaining chassis are targeted to arrive 4/22.
No longer blocks: win7-ix-releng
Whiteboard: [9/50] the bulk of these should be delivered by about the 22nd → [19/50] the bulk of these should be delivered by about the 22nd
10 more systems were delivered today, which brings us to 19/50.

From the vendor:  
"We should have three more ready on Monday and currently I expect the
balance to be delivered towards the end of next week (possibly early the
following week).  The problem has been a back order on the MB's for this
build, but I'm told we should have the remaining boards over the next
few days."
We've changed the naming scheme on these.  They will now be called:

VLAN 240:

t-w732-ix-001.wintest.releng.scl3.mozilla.com
..
t-w732-ix-100.wintest.releng.scl3.mozilla.com


t-xp32-ix-001.wintest.releng.scl3.mozilla.com
..
t-xp32-ix-100.wintest.releng.scl3.mozilla.com


VLAN 216

t-w732-ix-001-mgmt.inband.releng.scl3.mozilla.com
..
t-w732-ix-100-mgmt.inband.releng.scl3.mozilla.com


t-xp32-ix-001-mgmt.inband.releng.scl3.mozilla.com
..
t-xp32-ix-100-mgmt.inband.releng.scl3.mozilla.com
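
For anyone scripting against the new names, here is a minimal sketch that expands the ranges above into full host lists. The function name `host_names` is made up for illustration, and it assumes every range runs 001 through 100 with three-digit zero padding and that the last w732 host follows the same pattern as the first (t-w732-ix-100).

```python
def host_names(prefix, count=100, mgmt=False):
    """Return the fully qualified names for one host family.

    prefix is "t-w732-ix" or "t-xp32-ix". With mgmt=True the IPMI
    (VLAN 216) names are returned instead of the VLAN 240 names.
    """
    # Assumption: the only difference between the two families of names
    # is the "-mgmt.inband" vs ".wintest" part shown in the comment above.
    suffix = "-mgmt.inband" if mgmt else ".wintest"
    domain = "releng.scl3.mozilla.com"
    return [f"{prefix}-{n:03d}{suffix}.{domain}" for n in range(1, count + 1)]


w732 = host_names("t-w732-ix")
xp32_mgmt = host_names("t-xp32-ix", mgmt=True)
print(w732[0])
print(xp32_mgmt[-1])
```

The two flags keep the VLAN 240 and VLAN 216 lists generated from the same numbering, so a host and its management interface cannot drift apart.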
2 more systems were delivered Friday 4/19, which brings us to 21/50.
Whiteboard: [19/50] the bulk of these should be delivered by about the 22nd → [21/50] the bulk of these should be delivered by about the 22nd
We have the first 9 chassis up and ready to go. Inventory has been updated and the switch ports have been tagged with the proper VLAN.

t-xp32-ix-001.wintest.releng.scl3.mozilla.com
...
t-xp32-ix-018.wintest.releng.scl3.mozilla.com


t-w732-ix-001.wintest.releng.scl3.mozilla.com
...
t-w732-ix-018.wintest.releng.scl3.mozilla.com
For the rest of the deliveries, please finish up populating w732 (up to 100) before adding any more to xp32 machines.  We're ready to deploy on w732 but not on xp32.
Inventory updated for 1-18 of both xp and w7. I'll try some w732 installs tomorrow after the task sequence is replicated over and let you guys know if I run into any issues with connectivity.
The next set of iX systems will be held back a bit as we continue to figure out which cabling method will work best. Also the switch rack mount kits are on back order so we haven't been able to build out the new cabinets.
Van, what is the ETA for the switch rack mount kits?
>Van, what is the ETA for the switch rack mount kits?

current ETA is this Friday.
>For the rest of the deliveries, please finish up populating w732 (up to 100) before adding any more to xp32 machines.  We're ready to deploy on w732 but not on xp32.

Since I turned over 18 xp32 machines to you, do you want me to rename them and reconfigure the iLO to be useable with w732?
Please adjust this summary if I got it wrong as it is the bug that I'm least familiar with.

Summary:
########
* 21 out of 50 chassis have been delivered
* 9 chassis out of 21 are up and running
** 18 Win7 nodes and 18 WinXP nodes

* We need rack mount kits for cabinets which will arrive this Friday 4/26
* Once the rack mount kits arrive we need to build the cabinets
* We can then rack remaining 12 chassis
* We are waiting on remaining 29 chassis to be delivered
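
The chassis and node counts in this summary can be cross-checked with a little arithmetic. This is only a sketch: the figure of 4 nodes per chassis is not stated outright in this bug, it is inferred from 9 chassis giving 18 + 18 nodes.

```python
# Assumption: 4 nodes per chassis, inferred from 9 chassis -> 18 + 18 nodes.
NODES_PER_CHASSIS = 4


def nodes(chassis):
    """Convert a chassis count into a node count."""
    return chassis * NODES_PER_CHASSIS


ordered, delivered, racked = 50, 21, 9

print(nodes(racked))               # 36 nodes up and running
print(nodes(delivered - racked))   # 48 nodes delivered but not yet racked
print(nodes(ordered - delivered))  # 116 nodes still to be delivered
print(nodes(ordered))              # 200 nodes in total (100 w7 + 100 xp)
```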
Whiteboard: [21/50] the bulk of these should be delivered by about the 22nd → [21/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/29th) and 29 remaining chassis (ETA?)
Whiteboard: [21/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/29th) and 29 remaining chassis (ETA?) → [21/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/29th) and 29 remaining chassis (4/26)
(In reply to Van Le [:van] from comment #17)

If the expectation is that we will have 100 w7 nodes available by the end of next week, no, it's more churn than it's worth to shuffle around the xp nodes (you can't just rename stuff in inventory since the other hosts already exist).  If it's going to be longer than that, then repurposing might be appropriate and we should evaluate that based on how long it will be until we will have 100 w7 nodes available.
s/29/26/ for rack mount delivery.
Whiteboard: [21/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/29th) and 29 remaining chassis (4/26) → [21/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/26th) and 29 remaining chassis (4/26)
Just FYI, holding off till the task sequence is fixed before trying any imaging.  Should have something by Monday, I hope.
Whiteboard: [21/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/26th) and 29 remaining chassis (4/26) → [33/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/26th) and 29 remaining chassis (4/26)
Whiteboard: [33/50 chassis have been delivered] - summary on comment 17 - waiting on delivery of rack mount kits (4/26th) and 29 remaining chassis (4/26) → [33/50 chassis have been delivered] - summary on comment 18 - waiting on delivery of rack mount kits 4/29 (?)
Thanks for the whiteboard updates!

I assume from it that the rack mount kits did not arrive. Do we know if they're coming sometime this week?
Whiteboard: [33/50 chassis have been delivered] - summary on comment 18 - waiting on delivery of rack mount kits 4/29 (?) → [41/50 chassis have been delivered] - rack mount kits shipping 4/30
Whiteboard: [41/50 chassis have been delivered] - rack mount kits shipping 4/30 → [49/50 chassis have been delivered] - rack mount kits shipping 4/30
I turned over another rack to Amy last night. Unfortunately we were just informed by BCT that the rack mount kits have been delayed an additional week.
Whiteboard: [49/50 chassis have been delivered] - rack mount kits shipping 4/30 → [49/50 chassis have been delivered] - rack mount kits shipping 5/6
Hi guys,
Would you please let us know the delivery ETA for the rack mounts? (I believe that 5/6 is the ship date but not necessarily the delivery date).

And how long do you estimate it will take to build the cabinets once they arrive and have the machines ready for relops?

I keep on getting asked and want to give an informed answer.

Thanks in advance!
Armen, we usually receive the rack mounts one to three days after they ship. It depends on which distribution center fulfills our order, and we won't know that until we receive the tracking numbers. Hopefully, we'll get that information today.

Assuming they ship today and we receive them by 5/9, Van feels comfortable that we can hand off another batch of servers around 5/14.

He just confirmed we have 88 nodes waiting for installation, in addition to 4 nodes which have not yet been delivered.
I tried to install some of the t-w732-ix today and ran into some issues:

49 - couldn't get a dhcp address
90 - didn't have the nic in the boot order at all
81,83,85,87,89,91,93,95 - all had the nic before the disk in the boot order.

I think all of those except 93 and 95 are fixed now.

We're also trying to track down some possible inventory issues with 72, 73, 74 (wrong MAC addresses).
93 and 95 are fixed now, and all of the bad MACs in inventory are fixed as well.
(In reply to Derek Moore from comment #25)
> Armen, we usually receive the rack mounts one to three days after they ship.
> It depends on which distribution center fulfills our order, and we won't
> know that until we receive the tracking numbers. Hopefully, we'll get that
> information today.
> 
> Assuming they ship today and we receive them by 5/9, Van feels comfortable
> that we can hand off another batch of servers around 5/14.
> 
> He just confirmed we have 88 nodes waiting for installation, in addition to
> 4 nodes which have not yet been delivered.

Thanks Derek for the reply. This is very useful!
Last 4 nodes have been delivered today by iX Systems.
Whiteboard: [49/50 chassis have been delivered] - rack mount kits shipping 5/6 → [50/50 chassis have been delivered] - rack mount kits shipping 5/6
Any news if the kits were shipped? Thanks!
(In reply to Armen Zambrano G. [:armenzg] (Release Enginerring) from comment #30)
> Any news if the kits were shipped? Thanks!

They've arrived and have been installed. We'll be completing the cabling, configuration, and inventory over the next two days. I'll have another progress update for you on Friday.
Thanks Derek. Looking forward to it.
Hi, Derek, do you know if the cabling for the other racks was completed?
All the w732 nodes are complete. 
All the xp32 nodes up to t-xp32-ix-096 are complete.

t-xp32-ix-[097-100] are going to be racked in 102-22, but we need to retrofit that cabinet with new CDUs.
So, first pass, ipmi is not available for the following:

t-xp32-ix-031-mgmt.inband.releng.scl3.mozilla.com
t-xp32-ix-042-mgmt.inband.releng.scl3.mozilla.com
t-w732-ix-026-mgmt.inband.releng.scl3.mozilla.com
t-w732-ix-033-mgmt.inband.releng.scl3.mozilla.com
t-w732-ix-036-mgmt.inband.releng.scl3.mozilla.com

I'll be trying to install the rest of the w732 nodes today and will comment when I run into more issues.
t-w732-ix-041.wintest.releng.scl3.mozilla.com had the wrong primary nic MAC in inventory.

t-w732-ix-030.wintest.releng.scl3.mozilla.com appears to have the wrong boot order.
Apparently 041 still isn't working because 030 claims to have 00:25:90:c5:f2:46.  So some combination of primary nic/management nic is wrong for those two.  Please fix them so that they have the correct matching information in inventory.
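
A clash like this is easy to spot mechanically if the inventory can be exported as (host, MAC) pairs. The sketch below is only illustrative: the export format, the `duplicate_macs` helper and the third row are assumptions, and only the 00:25:90:c5:f2:46 address comes from this bug.

```python
from collections import defaultdict


def duplicate_macs(records):
    """Return {mac: [hosts]} for every MAC claimed by more than one host."""
    by_mac = defaultdict(list)
    for host, mac in records:
        # Normalise case so "C5:F2" and "c5:f2" compare equal.
        by_mac[mac.lower()].append(host)
    return {mac: hosts for mac, hosts in by_mac.items() if len(hosts) > 1}


# Hypothetical inventory rows for illustration.
inventory = [
    ("t-w732-ix-030", "00:25:90:c5:f2:46"),
    ("t-w732-ix-041", "00:25:90:C5:F2:46"),
    ("t-w732-ix-042", "00:25:90:00:00:01"),  # made-up address
]

for mac, hosts in duplicate_macs(inventory).items():
    print(mac, "is claimed by", ", ".join(hosts))
```

This only reports that two entries collide; working out which host really owns the address still needs a look at the hardware.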
30 and 41 are fixed per Q. Fixed the other inbands as well.

[vle@admin1a.private.scl3 ~]$ fping t-w732-ix-0{26,33,36}-mgmt.inband.releng.scl3.mozilla.com
t-w732-ix-026-mgmt.inband.releng.scl3.mozilla.com is alive
t-w732-ix-033-mgmt.inband.releng.scl3.mozilla.com is alive
t-w732-ix-036-mgmt.inband.releng.scl3.mozilla.com is alive
[vle@admin1a.private.scl3 ~]$ fping t-xp32-ix-0{31,42}-mgmt.inband.releng.scl3.mozilla.com
t-xp32-ix-031-mgmt.inband.releng.scl3.mozilla.com is alive
t-xp32-ix-042-mgmt.inband.releng.scl3.mozilla.com is alive
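
When checking a longer list, the fping output can be filtered down to just the hosts that did not answer. A rough sketch, assuming the default "HOST is alive" / "HOST is unreachable" output lines; the `unreachable` helper and the second sample line are made up for illustration.

```python
def unreachable(fping_output):
    """Return the hosts that fping did not report as alive."""
    down = []
    for line in fping_output.splitlines():
        # Result lines look like "<host> is <status>"; shell prompts and
        # other lines without " is " are skipped.
        host, _, status = line.strip().partition(" is ")
        if host and status and status != "alive":
            down.append(host)
    return down


sample = """\
t-w732-ix-026-mgmt.inband.releng.scl3.mozilla.com is alive
t-example-000-mgmt.inband.releng.scl3.mozilla.com is unreachable
"""
print(unreachable(sample))
```

Note that a ping reply only shows the management interface has an address; it does not prove that IPMI itself is usable.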
I'm still unable to get to t-xp32-ix-031-mgmt.inband.releng.scl3.mozilla.com.
Subnet mask was incorrect for t-xp32-ix-031-mgmt.inband.releng.scl3.mozilla.com.  I've updated it to the correct subnet.
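
For reference, here is one way a wrong mask can make a management interface unreachable from other networks: the default gateway ends up outside what the interface believes is its local subnet. All addresses and masks below are hypothetical; the actual values for 031 are not recorded in this bug.

```python
import ipaddress


def gateway_on_link(address, netmask, gateway):
    """Return True if the gateway lies inside the interface's own subnet."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in iface.network


# Hypothetical values: a /23 network whose gateway sits in the upper half.
print(gateway_on_link("10.0.16.31", "255.255.254.0", "10.0.17.1"))  # True
print(gateway_on_link("10.0.16.31", "255.255.255.0", "10.0.17.1"))  # False
```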
Whiteboard: [50/50 chassis have been delivered] - rack mount kits shipping 5/6 → [50/50 chassis have been delivered] - 1 chassis left to rack and cable
With the XP machines that have been racked so far, I have found the following issues:

t-xp32-ix-016 - wrong boot order
t-xp32-ix-018 - wrong boot order
t-xp32-ix-081 - wrong vlan
t-xp32-ix-082 - wrong vlan
t-xp32-ix-083 - wrong vlan
t-xp32-ix-084 - wrong vlan
t-xp32-ix-085 - wrong vlan
t-xp32-ix-086 - wrong vlan
t-xp32-ix-087 - wrong vlan
t-xp32-ix-088 - wrong vlan
t-xp32-ix-089 - wrong vlan
t-xp32-ix-090 - wrong vlan
t-xp32-ix-091 - wrong vlan
t-xp32-ix-092 - wrong vlan
t-xp32-ix-093 - wrong vlan
t-xp32-ix-094 - wrong vlan
t-xp32-ix-095 - wrong vlan
t-xp32-ix-096 - wrong vlan

I've also run into an issue with t-xp32-ix-077, but I'm not sure if that's bad inventory/DNS info or something else yet.  Still investigating.
sorry about the VLANs, can you try again?

+   interface-range iX_xp32 {
+       member-range ge-0/0/7 to ge-0/0/14;
+       member-range ge-0/0/25 to ge-0/0/32;
+       unit 0 {
+           family ethernet-switching {
+               port-mode access;
+               vlan {
+                   members releng-wintest;
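
As a sanity check on the fragment above, the two member-range lines can be expanded into individual ports. This is a sketch, not a Junos parser: `expand_member_range` is a made-up helper that only handles simple "X/N to X/M" ranges on the same FPC/PIC.

```python
import re


def expand_member_range(spec):
    """Expand 'ge-0/0/7 to ge-0/0/14' into a list of port names."""
    m = re.fullmatch(r"(\S+/)(\d+) to (\S+/)(\d+)", spec.strip())
    if not m or m.group(1) != m.group(3):
        raise ValueError(f"unsupported member-range: {spec!r}")
    first, last = int(m.group(2)), int(m.group(4))
    return [f"{m.group(1)}{n}" for n in range(first, last + 1)]


ports = (expand_member_range("ge-0/0/7 to ge-0/0/14")
         + expand_member_range("ge-0/0/25 to ge-0/0/32"))
print(len(ports))  # 16
```

The range covers 16 ports, which is the same as the number of hosts (081 through 096) reported above with the wrong VLAN, although the port-to-host mapping is not stated in this bug.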
t-xp32-ix-016 and t-xp32-ix-018 now have the following boot order: 1: Hard drive, 2: Network.
82 and 84 have the wrong boot order.
Had Mark boot them into the BIOS screen so I could fix them.
(In reply to Van Le [:van] from comment #34)

The last chassis is waiting on the same PDU work as bug 848995.
There is one chassis remaining to be mounted. Due to rack capacity, this chassis is standalone and will be mixed into the other equipment in r102-22.

We would like to complete Bug 872179 before installing this chassis. However, if demand is critical, we can install immediately.
Depends on: 872179
The last node has been racked and cabled. Inventory is located here: https://docs.google.com/spreadsheet/ccc?key=0Au66gf0GRi1MdHhWamxRczBiTVdZZ0ltdjVlMWE2dHc#gid=1

Please let me know of any issues.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I've added the information to inventory, but none of the IPMIs were configured (I managed to do 97,99,100) and it doesn't appear as though they're on the right VLAN.

Can someone please check the ipmi config for 98, fix the vlan for all of them, and either verify the boot order or leave them at the bios so I can do it myself?

Thanks
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
All hardware now checks out and I've completed all the installs, thanks!
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Whiteboard: [50/50 chassis have been delivered] - 1 chassis left to rack and cable → [50/50 chassis have been racked and cabled]
Product: mozilla.org → Infrastructure & Operations