Closed
Bug 1205227
Opened 10 years ago
Closed 10 years ago
Please take a look at t-w732-ix-194
Categories
(Infrastructure & Operations :: DCOps, task)
Infrastructure & Operations
DCOps
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: aselagea, Unassigned)
References
Details
(Whiteboard: imaging issues)
Attachments
(3 files)
For some reason, every job it touches times out. It has been re-imaged multiple times over the past few months, and the memory and disk diagnostics did not find any issues.
Updated•10 years ago
Assignee: relops → server-ops-dcops
Severity: major → normal
Component: RelOps → DCOps
QA Contact: arich
Comment 1•10 years ago
Filed a ticket with IX for a burn-in test.
Ticket ID: #SBK-126-27184
colo-trip: --- → scl3
Whiteboard: #SBK-126-27184
Comment 2•10 years ago
I can't find this host in inventory. Did it get renamed to something else?
Comment 3•10 years ago
:vinh, see svn revision 108640 for the new host name:
- 'talos-linux32-ix-022.test.releng.scl3.mozilla.com' => {
+ 't-w732-ix-194.wintest.releng.scl3.mozilla.com' => {
Comment 4•10 years ago
dropped off at IX
Updated•10 years ago
Summary: Please take a look at talos-linux32-ix-022 → Please take a look at t-w732-ix-194
Updated•10 years ago
Whiteboard: #SBK-126-27184 → #SBK-126-27184 - burn in test at iX
Comment 5•10 years ago
Update from IX:
"Hello Sal,
I wanted to update the ticket to reflect the current status of the node.
We placed it into our burn-in for an extended test and everything has tested smoothly.
The node is in the final stages of our tests and if all continues to test smoothly, the node will be placed at Will-Call for pick-up at your convenience.
An update will be posted to the ticket to confirm such.
If you have any further questions or concerns, we are here to help.
Thanks"
Comment 6•10 years ago
Picked up the host from IX today; it passed the burn-in tests.
Reimaging.
Whiteboard: #SBK-126-27184 - burn in test at iX → reimaging
Comment 7•10 years ago
Noticed that the monitor is currently connected to the integrated video adapter on the motherboard and not to the dedicated one (see attachment).
Comment 8•10 years ago
Updated•10 years ago
Blocks: t-w732-ix-194
Comment 9•10 years ago
Changed the video output to the external adapter and reimaged, as I don't believe the external video drivers were installed properly. Still reimaging.
Comment 10•10 years ago
Any known issues with w7 reimaging?
It looks like I'm running into the same issue after a reimage. The host gets stuck at the Windows splash screen (it never reaches the login prompt) and hangs there. At that point the host is still pingable and sshable. However, if I reboot the host, it boots to the same Windows splash screen, then the OS crashes and the screen goes black. After that there is no ping and no ssh, even after subsequent reboots.
I saw a message that complained of a bad driver after a reimage. Let me try to get the exact message if possible.
Flags: needinfo?(q)
Flags: needinfo?(arich)
Updated•10 years ago
Flags: needinfo?(arich)
Comment 12•10 years ago
Comment 13•10 years ago
>If we can get a pic that would be great
Attached. Did a reimage; the host rebooted after the installation completed, then failed to boot up completely. Performed a startup repair to get these messages.
Comment 14•10 years ago
A remote-command reimage of the machine worked (looking at it now). However, we can't load the NVidia control panel with a monitor plugged into the onboard card, since "there is no display attached to an NVidia gpu". To fool the system, we have to have the onboard card set to "disabled" and no monitor plugged in.
Updated•10 years ago
Flags: needinfo?(q) → needinfo?(vle)
Comment 16•10 years ago
>However, we can't load the NVidia control panel with a monitor plugged into the onboard card, since "there is no display attached to an NVidia gpu". To fool the system, we have to have the onboard card set to "disabled" and no monitor plugged in.
This is a win7 host, so the onboard card was disabled prior to the reimage, and no monitor is connected to the onboard card since no video would be redirected there. I went ahead and started another reimage and removed the monitor (the one connected to the external adapter); will check back in ~30 minutes.
Flags: needinfo?(vle)
Comment 17•10 years ago
Same issue. Rebooted and reimaged the host with no external cables attached, and the host still hangs during the 2nd reboot (after the initial OS install). The host is unresponsive on KVM, so it seems like it's crashing somewhere.
:arr, I'm not able to ssh in to the few hosts I tested from the w7 servers you retasked this morning. Are they imaging fine for you?
mozillas-MacBook-Air-2:~ vle$ ssh !$
ssh t-w732-ix-204.wintest.releng.scl3.mozilla.com
Received disconnect from 10.26.41.254: 2: Handshake failed
Disconnected from 10.26.41.254
mozillas-MacBook-Air-2:~ vle$ ssh t-w732-ix-205.wintest.releng.scl3.mozilla.com
Received disconnect from 10.26.42.19: 2: Handshake failed
Flags: needinfo?(arich)
Comment 18•10 years ago
New error message when boot-up fails and the system attempts to fix itself; I don't see the bad driver error this time.
Comment 19•10 years ago
This doesn't look like it ran a new install (which should wipe the disk); it looks like it tried to pick up an incomplete previous install.
Comment 20•10 years ago
van: Some of the reimages worked, some didn't. You can see which by looking at the parent bug.
Flags: needinfo?(arich)
Comment 21•10 years ago
>This doesn't look like it ran a new install (which should wipe the disk) this looks like it tried to pickup an incomplete previous install
I went ahead and PXE booted, chose the local disk format option, let it do its thing, and it rebooted. Confirmed the OS was no longer present by letting it complete the boot process. Rebooted and reimaged, with the same errors/issues.
Whiteboard: reimaging → imaging issues
Comment 22•10 years ago
Let's try replacing the video card?
Comment 23•10 years ago
I've replaced the video card, reimaging in progress.
Comment 24•10 years ago
Host back online
vhua$ ssh t-w732-ix-194.wintest.releng.scl3.mozilla.com
The authenticity of host 't-w732-ix-194.wintest.releng.scl3.mozilla.com (10.26.42.18)' can't be established.
RSA key fingerprint is 3b:f9:39:8d:96:4f:0c:c8:4d:be:df:9c:2c:44:09:95.
Are you sure you want to continue connecting (yes/no)?
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 25•10 years ago
Did that fix the issue?
Comment 26•10 years ago
Looks much better!
Comment 27•10 years ago
I rebooted the host 3 times and each time it booted up to the correct video card and resolution.
Comment 28•10 years ago
back in the slave pool