Closed
Bug 778804
(tegra-270)
Opened 12 years ago
Closed 10 years ago
tegra-270 problem tracking
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: coop, Unassigned)
References
()
Details
(Whiteboard: [buildduty][buildslaves][capacity])
No description provided.
Comment 1•12 years ago
This had been down for a month and a half as far as jobs are concerned. I just logged into its foopy and found the "buildslave returned 1" issue and a 2-day-old hung clientproxy (verify.py, actually). I killed verify.py, saw the buildslave issue, then cycled cp (stop/start) and tailed twistd.log; it is now taking a job. Not sure if it will persist, but I'll resolve for now.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 2•12 years ago
Made it through 8 jobs, then it's done a bit over 50 red with a sprinkling of purple. I'd say we were better off without it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 4•12 years ago
Back in production.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 5•12 years ago
Made it 20 green-and-orange runs before breaking again, but the last 7 in a row have been red.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 6•12 years ago
ran stop_cp.sh on it.
Comment 8•12 years ago
Had a hung SUTAgent; PDU cycled, and now SUTAgent isn't even starting [afaict]. Slating for reimage one last time before I call it dead as a doorknob.
Assignee: bugspam.Callek → nobody
Depends on: 787407
Comment 9•12 years ago
This tegra seems to be working fine the last few days.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 10•12 years ago
https://secure.pub.build.mozilla.org/buildapi/recent/tegra-270 - six reds in a row.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 11•12 years ago
11.
Comment 12•12 years ago
stop ran here
Comment 13•12 years ago
IT handled this, start_cp run now
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 14•12 years ago
10 red in a row, SD card, or the trash pile?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: tegra-270 problem tracking → [disable me] tegra-270 problem tracking
Comment 15•12 years ago
stop_cp run on foopy23.
Summary: [disable me] tegra-270 problem tracking → tegra-270 problem tracking
Comment 16•12 years ago
22 retries in a row.
Updated•12 years ago
Summary: tegra-270 problem tracking → [disable me] tegra-270 problem tracking
Updated•12 years ago
Summary: [disable me] tegra-270 problem tracking → tegra-270 problem tracking
Comment 17•12 years ago
Based on its history, I'll be back within 50 jobs.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 18•12 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=15902674&tree=Mozilla-Aurora should mark the start
Comment 19•12 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=15903073&tree=Mozilla-Aurora
Comment 20•12 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=15905301&tree=Mozilla-Aurora
Comment 21•12 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=15938407&tree=Mozilla-Beta
Updated•12 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 22•12 years ago
This tegra seems to be running jobs fine, until Philor shows me how wrong I am.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 23•12 years ago
84% green, which is the sort of thing which is making me start to think that somehow the results of recovery are... less determinate than we would like. Sort of random, even.
Comment 24•11 years ago
No jobs taken on this device for >= 7 weeks
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 25•11 years ago
(mass change: filter on tegraCallek02reboot2013)

I just rebooted this device, hoping that many of the ones I'm doing tonight come back automatically. I'll check back in tomorrow to see if it did; if it did not, I'll triage the next step manually on a per-device basis.

---

Command I used (with a manual patch to the fabric script to allow this command):

(fabric)[jwood@dev-master01 fabric]$ python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046 048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra

The command does the reboot one at a time from the foopy each device is connected to, with one ssh connection per foopy.
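For readers unfamiliar with the backtick loop in the command above: it only expands a list of device numbers into repeated `-D tegra-NNN` arguments. A minimal Python sketch of that expansion follows. The helper name `device_args` is hypothetical and not part of the fabric scripts, and the other flags are copied from the comment without assuming what they mean.

```python
def device_args(ids):
    """Expand device numbers into repeated '-D tegra-NNN' arguments.

    Hypothetical helper for illustration; it mirrors the shell loop
    `for i in ...; do echo '-D' tegra-$i; done` from the comment.
    """
    args = []
    for i in ids:
        # Keep ids as strings so leading zeros (e.g. "021") survive.
        args += ["-D", f"tegra-{i}"]
    return args


# A short subset of the device list, for illustration only.
ids = "021 032 270".split()
cmd = ["python", "manage_foopies.py", "-j15", "-f", "devices.json",
       *device_args(ids), "reboot_tegra"]
print(" ".join(cmd))
```

Building the argument list as a list rather than a single string avoids relying on shell word splitting, which is what the backtick form depends on.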
Comment 26•11 years ago
had to cycle clientproxy to bring this back
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Comment 27•11 years ago
10 days, 14:03:35 since last job
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 28•11 years ago
Back in production.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 29•11 years ago
Sending this slave to recovery -->Automated reopening of bug
Comment 30•11 years ago
flashed and reimaged.
Updated•11 years ago
Product: mozilla.org → Release Engineering
Comment 32•11 years ago
back in production
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 33•11 years ago
One last recovery attempt; if this fails out before the mtv move date (Jan 11, I think), we can call it decomm-worthy.
Comment 34•11 years ago
SD card has been replaced and reimaged/flashed.
Comment 35•11 years ago
Back in production
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 36•10 years ago
Hasn't taken a job for two weeks.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•10 years ago
QA Contact: armenzg → bugspam.Callek
Comment 37•10 years ago
Disabled in slavealloc to stop the pointless stream of reboots.
Comment 38•10 years ago
SD card formatted, tegra flashed and reimaged.

vle@vle-10516 ~ $ telnet tegra-270.tegra.releng.scl3.mozilla.com 20701
Trying 10.26.85.218...
Connected to tegra-270.tegra.releng.scl3.mozilla.com.
Escape character is '^]'.
$>^]
telnet> q
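The telnet session above is a manual check that the device accepts connections on port 20701 after the reimage. A rough scripted equivalent, assuming a plain TCP connect is a good enough signal, might look like the sketch below. The function name `port_is_open` and the timeout value are illustrative assumptions, not part of any releng tooling.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Illustrative helper only; it checks reachability and does not talk
    to whatever service is listening on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Example call, using the host and port from the telnet session:
# port_is_open("tegra-270.tegra.releng.scl3.mozilla.com", 20701)
```

A successful connect only shows that something is listening. It does not confirm the `$>` prompt seen in the telnet session.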
Comment 39•10 years ago
Reenabled.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard