620948 - computers that need physical intervention

linux-ix-slave37.build

moz:~ bear$ ssh cltbld@linux-ix-slave37.build.mozilla.org
ssh: connect to host linux-ix-slave37.build.mozilla.org port 22: Operation timed out
moz:~ bear$ ping linux-ix-slave37.build.mozilla.org
PING linux-ix-slave37.build.scl1.mozilla.com (10.12.48.231): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
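The manual ssh check above could be scripted for a batch of slaves. What follows is only a minimal sketch, assuming that a TCP connection to port 22 is a good enough stand-in for "responds to ssh". The function names `ssh_port_open` and `unreachable` are hypothetical and are not part of any existing build tooling.

```python
import socket


def ssh_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only proves that something accepts connections
    on the port, not that a login would work.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS lookup failures.
        return False


def unreachable(hosts, port=22, timeout=5.0):
    """Return the hosts that did not accept a connection, in input order."""
    return [h for h in hosts if not ssh_port_open(h, port, timeout)]


# Example call (not executed here, since it needs network access):
# unreachable(["linux-ix-slave37.build.mozilla.org"])
```

A host that shows up in the result still needs someone to look at its console, as the later comments show: the cause can be anything from a puppet loop to a failed drive.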

Chris Cooper [:coop] (he/him)

Updated

•

15 years ago

Assignee: nobody → server-ops

Component: Release Engineering → Server Operations

QA Contact: release → mrz

Zandr Milewski [:zandr]

Assignee

Comment 2

•

15 years ago

(In reply to comment #1)
> linux-ix-slave37.build

console says:

puppetd returned non-zero, sleeping for 60...trying again

repeated continuously.

linux-ix-slave08 is saying the same thing, immediately after a reimage.

My first thought was that this was due to a hostname issue, but if you can't even ping it, there's no way to get in and fix hostname.

bhearsum@mozilla.com (:bhearsum)

Comment 3

•

15 years ago

Probably needs its hostkey cleared on the scl and/or mpt puppet masters.

bhearsum@mozilla.com (:bhearsum)

Comment 4

•

15 years ago

oops, i mean MV, not scl

Mike Taylor [:bear]

Reporter

Comment 5

•

15 years ago

(In reply to comment #3)
> Probably needs its hostkey cleared on the scl and/or mpt puppet masters.

let me do that now

Mike Taylor [:bear]

Reporter

Comment 6

•

15 years ago

linux-ix-slave37 has attached to the master and is ok

Rail Aliiev [:rail]

Comment 7

•

15 years ago

talos-r3-fed-031
talos-r3-fed-032
talos-r3-fed64-001
talos-r3-fed64-011
talos-r3-fed64-018
talos-r3-fed64-022

Zandr Milewski [:zandr]

Assignee

Comment 8

•

15 years ago

(In reply to comment #7)
> talos-r3-fed64-001

Seriously? This was reimaged *last night*.

Did it do any work overnight?

Rail Aliiev [:rail]

Comment 9

•

15 years ago

moz2-darwin10-slave01

Rail Aliiev [:rail]

Comment 10

•

15 years ago

(In reply to comment #8)
> (In reply to comment #7)
> > talos-r3-fed64-001
>
> Seriously? This was reimaged *last night*.
>
> Did it do any work overnight?

At least I couldn't reach it and Nagios reports it as unpingable for 19 hours.

Justin Dow [:jabba]

Updated

•

15 years ago

Assignee: server-ops → zandr

Rail Aliiev [:rail]

Comment 11

•

15 years ago

talos-r3-w7-036

Chris Cooper [:coop] (he/him)

Comment 12

•

15 years ago

An updated and collated list:

bm-xserve18
linux-ix-slave13
linux-ix-slave14
linux-ix-slave16
moz2-darwin10-slave53
moz2-darwin10-slave54
mv-moz2-linux-ix-slave05
talos-r3-fed-031
talos-r3-fed-032
talos-r3-fed64-001
talos-r3-fed64-011
talos-r3-fed64-018
talos-r3-fed64-022
talos-r3-fed64-047
talos-r3-fed64-055
talos-r3-leopard-003
talos-r3-w7-036
w32-ix-slave03
w32-ix-slave08
w32-ix-slave41
talos-r3-fed-031
talos-r3-fed-032
talos-r3-fed64-001
talos-r3-fed64-011
talos-r3-fed64-018
talos-r3-fed64-022
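The list above repeats the six talos hosts from comment 7 at the end. As a rough sketch of how such reports could be collated without duplicates, assuming each report is just whitespace-separated hostnames (the `collate` helper is hypothetical, not an existing tool):

```python
def collate(*reports):
    """Merge whitespace-separated hostname reports into one sorted list.

    Hypothetical helper: a set drops names that were reported more than
    once, and sorting groups hosts of the same pool together.
    """
    names = set()
    for report in reports:
        names.update(report.split())
    return sorted(names)


merged = collate(
    "talos-r3-fed-031 talos-r3-fed-032",
    "talos-r3-w7-036",
    "talos-r3-fed-031 talos-r3-fed-032",  # repeated report
)
for name in merged:
    print(name)
```

Plain string sorting is a simplification; it would place slave10 before slave2 if the numbers were not zero-padded.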

Zandr Milewski [:zandr]

Assignee

Comment 13

•

15 years ago

talos-r3-fed-031: "Couldn't mount root filesystem": reimaged
talos-r3-fed-032:
talos-r3-fed64-001: rebooted: responds to ssh
talos-r3-fed64-011: weird dhcp problem, reimaged
talos-r3-fed64-018: same hang at USB: reimaged
talos-r3-fed64-022: "Couldn't mount root filesystem": reimaged
talos-r3-fed64-047: not in scl1, may not exist, cc :jhford for comment
talos-r3-fed64-055: not in scl1, may not exist, cc :jhford for comment
talos-r3-leopard-003: WFM: responds to ssh, vnc
talos-r3-w7-036:
w32-ix-slave41: presumed drive failure. Imaging at 20MB/*minute* and falling. see bug 615744

Zandr Milewski [:zandr]

Assignee

Comment 14

•

15 years ago

Oops, hit save too early.

talos-r3-fed-032: was hung at grey boot screen: rebooted
talos-r3-w7-036: was hung at grey boot screen: rebooted

And that's it for scl1.

Zandr Milewski [:zandr]

Assignee

Comment 15

•

15 years ago

linux-ix-slave14 is also MIA. Not present in 650.

Justin Lazaro [:jlaz] (use needinfo)

Comment 16

•

15 years ago

That machine was given to IX to investigate the issues in bug 596366 (comment 11)