Closed
Bug 786366
(tegra-307)
Opened 12 years ago
Closed 11 years ago
decommission tegra-307
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Unassigned)
Details
(Whiteboard: [buildduty])
Didn't come back from a PDU reboot.
Reporter
Updated•12 years ago
Whiteboard: [buildduty]
Reporter
Comment 1•12 years ago
Back in production.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 2•12 years ago
Didn't come back from a PDU reboot.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 3•12 years ago
IT handled this, back to taking jobs
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 4•12 years ago
nagios has been yapping about it not responding to pings for so long that even I'm getting tired of the noise.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 5•12 years ago
Apparently sometime during the intervening week it managed to respond once, since it was back to yapping again.
Comment 6•12 years ago
(In reply to Phil Ringnalda (:philor) from comment #5)
> Apparently sometime during the intervening week it managed to respond once,
> since it was back to yapping again.
During the week it was physically moved, and it was restarted by a power event in mtv1.
Reporter
Comment 7•12 years ago
tried a pdu reboot, don't know what else to do.
Assignee: nobody → bugspam.Callek
Updated•12 years ago
Reporter
Comment 9•12 years ago
pdu reboot
start_cp.sh
Reporter
Comment 10•12 years ago
Didn't come back, off to recovery (again):
10:16 < nagios-releng> Thu 07:16:39 PST [494] tegra-307.build.mtv1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Depends on: 825335
Comment 11•12 years ago
the switch was set to BATT instead of NORM so the board wasn't powered on. reimaged SD card anyways.
Comment 13•12 years ago
did a pdu reboot again, waiting for this thing to come back up
Comment 14•12 years ago
(mass change: filter on tegraCallek02reboot2013)
I just rebooted this device, hoping that many of the ones I'm doing tonight come back automatically. I'll check back in tomorrow to see if it did; if it did not, I'll triage the next step manually on a per-device basis.
---
Command I used (with a manual patch to the fabric script to allow this command)
(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046 048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra
The command does the reboots one at a time from the foopy each device is connected to, with one ssh connection per foopy.
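For anyone who would rather not hand-edit the shell loop, here is a rough Python sketch that builds the same argument list. The flags (-j15, -f, -D) and the reboot_tegra action are copied from the command above; build_reboot_args is a hypothetical helper name, not part of the fabric scripts, and the manual patch mentioned above is still assumed to be needed.

```python
# Sketch only: rebuilds the argument list that the shell loop above produces.
# build_reboot_args is a hypothetical helper, not part of manage_foopies.py.

def build_reboot_args(device_numbers, jobs=15, devices_file="devices.json"):
    """Return the argv for manage_foopies.py, one '-D tegra-NNN' pair per device."""
    args = ["python", "manage_foopies.py", "-j%d" % jobs, "-f", devices_file]
    for number in device_numbers:
        # Keep the zero-padded three-digit form used in the device names.
        args += ["-D", "tegra-%03d" % int(number)]
    args.append("reboot_tegra")
    return args


if __name__ == "__main__":
    # Short illustrative subset of the device list above.
    print(" ".join(build_reboot_args(["021", "032", "307"])))
```

The resulting list could be handed to subprocess.call() from inside the fabric virtualenv; that part is left out here since it depends on the local setup.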
Comment 15•12 years ago
now taking jobs
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Reporter
Comment 16•12 years ago
9 days, 16:30:17 since last job
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter
Comment 17•12 years ago
neither pdu reboot nor recovery helped, dunno what to do
Reporter
Comment 18•12 years ago
back in production
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 19•12 years ago
Came up, but with a LOT of problems through the day per https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?class=test&type=tegra&name=tegra-307
It's currently unpingable.
tegra-307 - not running - enabled - 'Automation Error: Unable to ping device after 5 attempts'
Comment 21•11 years ago
reflashed and reimaged.
Updated•11 years ago
Assignee
Updated•11 years ago
Product: mozilla.org → Release Engineering
Reporter
Comment 22•11 years ago
ping and agent checks failing, pdu reboot didn't help
Depends on: 912682
Comment 23•11 years ago
Tegra-307 keeps overheating, and there is also a strong burning smell (tried different power supplies).
I'd rather decom this tegra to avoid something bad
Reporter
Comment 24•11 years ago
(In reply to Salvador Espinoza [:sal] from comment #23)
> Tegra-307 keeps overheating, also a strong burning smell (tried different
> power supplies).
>
>
> I'd rather decom this tegra to avoid something bad
Sounds good, thanks for looking at it!
Summary: tegra-307 problem tracking → decommission tegra-307
Reporter
Comment 25•11 years ago
Updated buildbot-configs and devices.json to reflect decomm.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Comment 26•11 years ago
I also removed tegra-307 from foopy28
(on foopy28: rm -rf /builds/tegra-307)
Currently we do not have an automatic mechanism to keep the foopy device directories in sync with the devices.json file.
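Until something like that exists, a check along these lines could at least list the leftovers. This is only a sketch: it assumes devices.json maps each device name to an object with a "foopy" key, which is not confirmed anywhere in this bug, and stale_device_dirs is a hypothetical helper name. The sample data in the example is made up for illustration.

```python
# Sketch only: find device directories on a foopy that devices.json no longer
# lists for that foopy. The devices.json layout used here is an ASSUMPTION:
#   {"tegra-NNN": {"foopy": "foopyNN", ...}, ...}

def expected_devices(devices, foopy):
    """Names of the devices that devices.json assigns to the given foopy."""
    return {name for name, info in devices.items() if info.get("foopy") == foopy}


def stale_device_dirs(devices, foopy, present_dirs):
    """Return tegra-* directory names present on disk but not expected on this foopy."""
    expected = expected_devices(devices, foopy)
    return sorted(
        d for d in present_dirs if d.startswith("tegra-") and d not in expected
    )


if __name__ == "__main__":
    # Made-up sample data; on a real foopy, present_dirs would come from
    # something like os.listdir("/builds") and devices from json.load().
    sample = {
        "tegra-306": {"foopy": "foopy28"},
        "tegra-308": {"foopy": "foopy28"},
    }
    on_disk = ["tegra-306", "tegra-307", "tegra-308"]
    print(stale_device_dirs(sample, "foopy28", on_disk))
```

The sketch only reports candidates and deletes nothing, so the rm -rf step stays a manual decision.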
Updated•11 years ago
QA Contact: other → armenzg
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard