kickstarting talos-linux64-ix-057 is very slow

RESOLVED FIXED

Status

RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: arich, Assigned: heulalia)

(Reporter)

Description

6 years ago
Trying to kickstart this host takes a very long time, pointing to some sort of hardware issue.  We've already tried replacing the cable and changing switch ports without effect.  I asked hubear to run diagnostics on it to see if we can figure out what the issue is (network, disk, RAM, etc).
(Reporter)

Updated

6 years ago
Blocks: 849985
(Assignee)

Updated

6 years ago
colo-trip: --- → scl3
I went ahead and filed a ticket for iX Systems and described the problem above. As soon as I get a reply, I'll update the bug. They're pretty good at responding quickly.

Ticket ID: DHG-111212
colo-trip: scl3 → ---
Whiteboard: [Waiting on response from iX Systems]
iX Systems suggested we try the following before requesting a replacement:

Remove the node for at least 20 seconds and place it back into the chassis.

If you continue to experience issues afterward, please let us know and we will promptly schedule a pick-up (for repair/replacement).
(Assignee)

Comment 3

6 years ago
We have already reseated it a couple of times, with no luck.
(Assignee)

Updated

6 years ago
Assignee: server-ops-dcops → heulalia

Updated

6 years ago
colo-trip: --- → scl3
Whiteboard: [Waiting on response from iX Systems] → [Picked up by iX Systems for repair]

Comment 4

5 years ago
Host is back from iX Systems with its motherboard replaced. The nic0 and mgmt MAC addresses have been updated in inventory. Please kickstart the host and close the bug if the issue is resolved.
Status: NEW → ASSIGNED
(Reporter)

Comment 5

5 years ago
This machine still doesn't seem happy.  I wonder if it's a bad disk, too?  Or maybe something that's not seated properly?

Comment 6

5 years ago
:hubear, can you grab a drive from one of the 10 new iX systems that came in and swap it into this host? Set the suspect drive aside so we can RMA it if necessary.

Updated

5 years ago
Whiteboard: [Picked up by iX Systems for repair]
(Assignee)

Comment 7

5 years ago
Hard drive swapped. Please kickstart the host and close the bug if the issue is resolved.
(Reporter)

Comment 8

5 years ago
Now it doesn't even PXE boot.  All I get after the initial power on screen is a blinking cursor.
I filed a ticket with iX Systems and let them know it's still acting up.
:arr, iX Systems suggested we try this:
Blank screen usually means there is an invalid boot block on the boot device. Please try booting the system and while the BIOS screens are up press F12 repeatedly. This will instruct BIOS to PXE boot instead of following the standard boot path.
:ashlee, :hubear

Can you two put a monitor on this host and perform some onsite debugging? We want to try reseating all the components and confirming we can actually reach the PXE boot screen before returning it to :arr. She can't do anything remotely until the host can reach that screen.
(Assignee)

Comment 12

5 years ago
:dmoore

The host is now on the PXE boot screen.
(Reporter)

Comment 13

5 years ago
Still slow.
We'll try moving it to one of the chassis we received this week in order to troubleshoot the problem.

Comment 15

5 years ago
:arr, I swapped the location of talos-linux64-ix-057.test.releng.scl3.mozilla.com and talos-linux32-ix-100.test.releng.scl3.mozilla.com. Can you kickstart both and let me know of any issues?
(Reporter)

Comment 16

5 years ago
This seems to be functioning normally now after some reseating.
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations