t-w864-ix-207 is unreachable

RESOLVED FIXED

Status

Product: Infrastructure & Operations
Component: DCOps
Status: RESOLVED FIXED
Opened: 2 years ago
Modified: 2 years ago

People

(Reporter: Release Engineering SlaveAPI Service, Assigned: van)

Details

(Whiteboard: hardware failure)

(Assignee)

Comment 1

2 years ago
back online.

[vle@admin1b.private.scl3 ~]$ fping !$
fping t-w864-ix-207.wintest.releng.scl3.mozilla.com
t-w864-ix-207.wintest.releng.scl3.mozilla.com is alive
[vle@admin1b.private.scl3 ~]$ ssh !$
ssh t-w864-ix-207.wintest.releng.scl3.mozilla.com
The authenticity of host 't-w864-ix-207.wintest.releng.scl3.mozilla.com (10.26.42.100)' can't be established.
RSA key fingerprint is 28:d4:73:90:7f:86:fb:cd:b6:29:62:09:db:84:9e:66.
Are you sure you want to continue connecting (yes/no)?
Assignee: server-ops-dcops → vle
Status: NEW → RESOLVED
colo-trip: --- → scl3
Last Resolved: 2 years ago
QA Contact: cshields
Resolution: --- → FIXED
Machine went down again after beginning to take a job; the job was interrupted:
[Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 3

2 years ago
host crashed. running memtest to see if it is related to memory. these hosts are out of warranty so we'll need to pay for a burn-in test if memtest and disk diags don't find anything, unless you plan to decomm the host.
Whiteboard: running diags
(Assignee)

Comment 4

2 years ago
host shut itself off during memtest with a red LED blinking. this usually means hardware failure. do you want to spend any money on this host as it is no longer under warranty? we can have iX run additional system diags to replace the faulty hardware since we don't have any diagnostic tools specific to these iX servers.
Whiteboard: running diags → hardware failure
ni-ing Amy about this.
Flags: needinfo?(arich)
Since we're in the process of decreasing the pool for these (and we'll be getting new hardware in the next quarter), I think we just keep this one for parts. Coop?
Flags: needinfo?(arich) → needinfo?(coop)

Comment 7

2 years ago
(In reply to Amy Rich [:arr] [:arich] from comment #6)
> Since we're in the process of decreasing the pool for these (and we'll be
> getting new hardware in the next quarter), I think we just keep this one for
> parts. Coop?

Yeah, we can decomm this one.
Flags: needinfo?(coop)
(Assignee)

Comment 8

2 years ago
host unracked and decommissioned.

mozillas-MacBook-Air-2:invtool vle$ invtool decommission --comment "BUG 1307392" t-w864-ix-207.wintest.releng.scl3.mozilla.com --commit
http_status: 200 (Success)
comment: BUG 1307392
systems: t-w864-ix-207.wintest.releng.scl3.mozilla.com
http_status: 200
commit: True
Decommission options used:
	decommission_system_status: decommissioned
	convert_to_sreg: True
	remove_dns: True
	decommission_sreg: True
Additional information returned by Inventory:
Decommission actions for t-w864-ix-207.wintest.releng.scl3.mozilla.com
	Cleared values for operating_system, allocation, oob_ip, switch_ports, and oob_switch_port
	Set system status to decommissioned
	Deleting: 120.18.26.10.in-addr.arpa.                IN  (SREG) PTR t-w864-ix-207-mgmt.inband.releng.scl3.mozilla.com.
	Deleting: t-w864-ix-207-mgmt.inband.releng.scl3.mozilla.com.  IN  (SREG) A 10.26.18.120
	Deleting: t-w864-ix-207-mgmt.build.mozilla.org.     IN  CNAME t-w864-ix-207-mgmt.inband.releng.scl3.mozilla.com.
		Dissabling mac 00:25:90:CD:75:24 in DHCP
		Deleting dhcp_scope key(s) scl3-releng-vlan216
	Deleting: 100.42.26.10.in-addr.arpa.                IN  (SREG) PTR t-w864-ix-207.wintest.releng.scl3.mozilla.com.
	Deleting: t-w864-ix-207.wintest.releng.scl3.mozilla.com.  IN  (SREG) A 10.26.42.100
	Deleting: t-w864-ix-207.build.mozilla.org.          IN  CNAME t-w864-ix-207.wintest.releng.scl3.mozilla.com.
		Dissabling mac 00:25:90:C6:BC:FE in DHCP
		Deleting dhcp_scope key(s) scl3-releng-vlan240
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
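
For follow-up cleanup it can help to pull the removed DNS records out of the decommission output mechanically. The following is a minimal sketch, assuming the "Deleting:" line format shown in the output above; the function name `deleted_records` is hypothetical and is not part of invtool.

```python
import re
from typing import List, Tuple

# Assumed line format, taken from the decommission output in this bug:
#   Deleting: <name>  IN  (SREG) A <address>
#   Deleting: <name>  IN  CNAME <target>
# Lines such as "Deleting dhcp_scope key(s) ..." have no colon after
# "Deleting" and are deliberately not matched.
_DELETING = re.compile(
    r"^\s*Deleting:\s+(?P<name>\S+)\s+IN\s+(?:\(SREG\)\s+)?"
    r"(?P<rtype>A|PTR|CNAME)\s+(?P<target>\S+)\s*$"
)


def deleted_records(output: str) -> List[Tuple[str, str, str]]:
    """Return (name, record type, target) for each deleted DNS record."""
    records = []
    for line in output.splitlines():
        match = _DELETING.match(line)
        if match:
            records.append((match["name"], match["rtype"], match["target"]))
    return records
```

Feeding it the saved command output gives a list of names that other systems (monitoring, configuration management) may still reference.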
This wasn't removed from nagios, so it broke releng-nagios. I've removed it and reran puppet.
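
A leftover reference like this can be caught before it breaks monitoring by scanning the Nagios configuration for the hostname after a decommission. This is only a sketch under stated assumptions: the configuration is assumed to live in `*.cfg` files under one directory, and `find_host_references` is a hypothetical helper, not an existing releng or Nagios tool.

```python
from pathlib import Path
from typing import List, Tuple


def find_host_references(config_dir: str, short_name: str) -> List[Tuple[str, int, str]]:
    """Return (file, line number, line) for every config line naming the host.

    Assumption: config files end in .cfg and sit somewhere under config_dir.
    Plain substring matching is used, so a short name that is a prefix of
    another host's name would also be reported.
    """
    hits = []
    for path in sorted(Path(config_dir).rglob("*.cfg")):
        text = path.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if short_name in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling it with the short name `t-w864-ix-207` also reports the `-mgmt` interface entries, since they share that prefix. An empty result suggests the host is gone from the scanned files.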