Closed
Bug 890317
(t-w732-ix-064)
Opened 12 years ago
Closed 10 years ago
t-w732-ix-064 problem tracking
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Unassigned)
References
Details
(Whiteboard: [buildduty][buildslaves][capacity])
Attachments
(1 file)
96.93 KB,
image/png
Trying a PDU reboot.
Reporter
Comment 1•12 years ago
Didn't work; needs IT help.
Comment 2•12 years ago
Per bug 890333, the NIC is still being "weird", but it is taking jobs right now.
I gave IT the OK to yank it as needed, and I've disabled it in slavealloc so future boots don't start jobs.
Comment 3•12 years ago
CC'ing the sheriffs in case the issue mentioned in comment 2 shows up on our radar.
Comment 4•12 years ago
Re-enabled in slavealloc.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
Assignee
Updated•12 years ago
Product: mozilla.org → Release Engineering
Reporter
Comment 5•11 years ago
Slave is ready for production...but didn't come back from a reboot.
Depends on: 912206
Comment 6•11 years ago
Slave is in production, but not ready for it - failing webgl tests, timing out xperf tests, timing out cloning talos, generally acting like even after a reimage to catch it up with all the things it missed over the last two months, it'll still be in pretty bad shape.
Disabled in slavealloc.
Reporter
Comment 7•11 years ago
(In reply to Phil Ringnalda (:philor) from comment #6)
> Slave is in production, but not ready for it - failing webgl tests, timing
> out xperf tests, timing out cloning talos, generally acting like even after
> a reimage to catch it up with all the things it missed over the last two
> months, it'll still be in pretty bad shape.
>
> Disabled in slavealloc.
Cheers. Back to diagnostics.
Reporter
Comment 8•11 years ago
Diagnostics showed no errors. Not sure what to do now.
Comment 9•11 years ago
Since diags did a reimage and saw no errors, I tried a reboot, but that failed too, so we'll need a human touch again anyway.
[jwood@cruncher.srv.releng.scl3 ~]$ for i in 064; do curl http://slaveapi-dev1.srv.releng.scl3.mozilla.com:8000/slave/t-w732-ix-$i/action/reboot; done
{
"reboots": {
"55975568": {
"state": 3,
"text": "Attempting SSH reboot...Failed.\nAttempting IPMI reboot...Failed.\nCan't do anything else, human intervention needed."
}
}
}
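For anyone triaging several of these responses at once, here is a minimal sketch of how the output above could be summarized. It assumes only the response shape shown in this comment (a "reboots" map keyed by request ID, each entry with a "text" field). The helper name `summarize_reboot` is hypothetical and not part of slaveapi, and the meaning of the "state" value is not documented here, so the sketch ignores it.

```python
import json

# Response body copied from the comment above, with the wrapped "text"
# value joined onto one line so that it is valid JSON.
RESPONSE = r'''{
  "reboots": {
    "55975568": {
      "state": 3,
      "text": "Attempting SSH reboot...Failed.\nAttempting IPMI reboot...Failed.\nCan't do anything else, human intervention needed."
    }
  }
}'''


def summarize_reboot(body: str) -> dict:
    """Hypothetical helper: map each request ID to the reboot methods that
    failed and whether the text asks for human intervention."""
    summary = {}
    for request_id, info in json.loads(body).get("reboots", {}).items():
        lines = info.get("text", "").splitlines()
        failed = [
            line.split()[1]  # "Attempting SSH reboot...Failed." -> "SSH"
            for line in lines
            if line.startswith("Attempting ") and line.endswith("Failed.")
        ]
        needs_human = any("human intervention" in line for line in lines)
        summary[request_id] = {"failed": failed, "needs_human": needs_human}
    return summary


if __name__ == "__main__":
    print(summarize_reboot(RESPONSE))
```

The sketch matches on the message wording rather than on "state", so it would need adjusting if the message text differs.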
Updated•11 years ago
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Comment 10•11 years ago
I rebooted it again.
Comment 11•11 years ago
Still having problems. Disabled.
https://tbpl.mozilla.org/php/getParsedLog.php?id=29988866&tree=Mozilla-Inbound
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 12•11 years ago
2 monitors.
dxdiag shows acceleration disabled.
Need IT's intervention.
Reporter
Comment 13•11 years ago
The graphics setup should be correct now - I put the machine back in production.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 14•11 years ago
Sure about that?
https://tbpl.mozilla.org/php/getParsedLog.php?id=30437374&tree=Mozilla-Central
Disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter
Comment 15•11 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #14)
> Sure about that?
> https://tbpl.mozilla.org/php/getParsedLog.php?id=30437374&tree=Mozilla-Central
>
> Disabled.
Of course I'm not.
Comment 16•11 years ago
Definitely not a one-off fluke.
https://tbpl.mozilla.org/php/getParsedLog.php?id=30438553&tree=Mozilla-Aurora
Comment 17•11 years ago
Hard drive has been replaced and the machine has been re-imaged.
Back into production.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 18•10 years ago
Attempting SSH reboot...Failed.
Attempting IPMI reboot...Failed.
Filed IT bug for reboot (bug 1188656)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 19•10 years ago
Reenabled after another disk replacement/reimage.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Comment 20•10 years ago
But after multiple reboots, it still isn't taking jobs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 21•10 years ago
Re-imaged the slave and enabled it in slavealloc. So far, it has successfully completed two jobs.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard