Closed Bug 804312 Opened 13 years ago Closed 13 years ago

linux-ix-slave10-mgmt.build.scl1 issues

Categories

(Infrastructure & Operations Graveyard :: NetOps, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Unassigned)

Details

I suspect this is a repeat of what we saw with kvm* and vlan40 yesterday during/after the maintenance window. The immediate symptom is that linux-ix-slave10-mgmt.build.scl1 is not pingable from outside its VLAN.

From bug 804053:

---
linux-ix-slave10.build.scl1 is pingable, true, but only from admin1a/b. It sounds like a default gateway problem, or potentially like the problem we had with the KVM hosts.

The host is configured from DHCP. Watching broadcast traffic on VLAN48, if I enable NTP with an off-network address, expecting it to ARP for the gateway, I get:

13:03:27.437585 00:25:90:09:7f:64 (oui Unknown) > 00:10:db:ff:10:01 (oui Unknown), ethertype 802.1Q (0x8100), length 94: vlan 48, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)

which matches ARP on admin1a:

? (10.12.48.1) at 00:10:db:ff:10:00 [ether] on bond0.48
---

Wait, why are those packets, which are unicast at the MAC level, being flooded broadly enough for admin1a to see them? This suggests a switching issue to me.
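For reference, "unicast at the MAC level" can be checked mechanically: a destination MAC is unicast when the least significant bit of its first octet (the I/G bit) is 0. A minimal sketch in Python follows; the helper name is_unicast_mac is made up for illustration and is not part of any tool used in this bug.

```python
def is_unicast_mac(mac: str) -> bool:
    """Return True if the I/G bit (lowest bit of the first octet) is 0.

    Hypothetical helper; expects colon-separated notation as printed
    by tcpdump, e.g. "00:10:db:ff:10:01".
    """
    first_octet = int(mac.split(":")[0], 16)
    return (first_octet & 0x01) == 0


# Destination MAC taken from the capture quoted above.
dst = "00:10:db:ff:10:01"
print(is_unicast_mac(dst))                  # True: individual (unicast) address
print(is_unicast_mac("ff:ff:ff:ff:ff:ff"))  # False: broadcast address
```

A switch that has already learned the destination MAC would normally forward such a frame out of one port only, which is why seeing it on admin1a looked suspicious.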
Assignee: server-ops → network-operations
Component: Server Operations → Server Operations: Netops
QA Contact: shyam → ravi
typo:

> linux-ix-slave10.build.scl1 is pingable, true, but only from admin1a/b. It
                   ^-mgmt

Amy adds a good point that the IPMI and host MACs appear on the same switchport. The top-of-rack switch seems to have the right data in its MAC table, though, so this is probably irrelevant:

sw-3a.scl1# sh mac-address 19

 Status and Counters - Port Address Table - 19

  MAC Address
  -------------
  002590-0974de
  002590-097f64

sw-3a.scl1# sh mac-address 0010-dbff-1000

 Status and Counters - Address Table - 0010db-ff1000

  MAC Address : 0010db-ff1000
  Located on Port : 49
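The switch prints MACs as 002590-097f64 while tcpdump prints them as 00:25:90:09:7f:64, so comparing by eye is error-prone. A rough sketch of normalizing both notations before comparing is below; normalize_mac is a hypothetical helper written for this comment, not a switch or tcpdump feature.

```python
import re


def normalize_mac(mac: str) -> str:
    """Convert any common MAC notation to lowercase colon-separated form.

    Hypothetical helper: strips every non-hex character, then regroups
    the remaining 12 hex digits into six octets.
    """
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


# MACs the switch reported on port 19, and the source MAC from the capture.
port19 = ["002590-0974de", "002590-097f64"]
capture_src = "00:25:90:09:7f:64"

seen_on_port19 = {normalize_mac(m) for m in port19}
print(normalize_mac(capture_src) in seen_on_port19)  # True
```

This only confirms that the source MAC in the capture is one of the two addresses learned on port 19; it says nothing about which of the two belongs to the IPMI interface.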
(to be clear, this is, so far, the only host affected, so this isn't in need of escalation, but may be a clue to a more significant phenomenon)
Ugh, so warm and cold cycling the IPMI system didn't fix this, but resetting it to factory settings did. So this is clear now, and was a host issue after all. h/t to mrz for bouncing ideas around with me on this.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Summary: network unreachability in scl1 → linux-ix-slave10-mgmt.build.scl1 issues
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard