Closed Bug 747694 (tegra-159) Opened 12 years ago Closed 10 years ago

tegra-159 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

ARM
Android
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

Details

(Whiteboard: [buildduty][capacity])

Hit four retries yesterday afternoon, and since then has been pure red (which seems to be the new pattern for Tegra failures).
I gracefulled from the webUI on the master, and powercycled while connected to foopy05.

Hopefully one or both of those fix this up; otherwise someone will be tasked with officially taking this tegra offline.
Mike, FYI for when you start on my deploys for this, look at the log to see what went wrong here.

2012-04-21 23:20:02 tegra-159 p  INACTIVE   active  OFFLINE :: SUTAgent not present;
2012-04-21 23:24:04 tegra-159 p    online INACTIVE  OFFLINE :: CP 0d 511s;

Right after I did the above.
This has been restarted, and my deploy should remove most (all?) of the ways CP can go inactive.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Release Engineering → Release Engineering: Machine Management
QA Contact: armenzg
I don't know exactly what this means is busted, but tegra-159 has taken over the job of being every single instance in bug 789751: if a reftest fails to load, it's because it's running on tegra-159. Something's wrong with it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 438871
Whiteboard: [buildduty][capacity] → [buildduty][capacity][orange]
Actively running jobs.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
(In reply to Ben Hearsum [:bhearsum] from comment #9)
> Actively running jobs.

Typo there, it's spelled "ruining" with an i and only one n.
OK, I pulled it out again. Let's get it into recovery and see if that helps.
Status: RESOLVED → REOPENED
Depends on: 806950
Resolution: FIXED → ---
(In reply to Ben Hearsum [:bhearsum] from comment #11)
> OK, I pulled it out again. Let's get it into recovery and see if that helps.

I'm 95% sure I'll regret this, but now that it's back from recovery I started it again and it's back in the production pool.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
38% green.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 808437
No longer blocks: 438871
Whiteboard: [buildduty][capacity][orange] → [buildduty][capacity]
Ran ./stop_cp.sh
Blocks: 808468
Blocks: 813012
No longer blocks: 813012
back to life
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
...but needs cpr again, please reimage.
Status: RESOLVED → REOPENED
Depends on: 817995
Resolution: FIXED → ---
Tegra-159 reimaged

--- tegra-159.build.mtv1.mozilla.com ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
No longer blocks: 808468
Depends on: 808468
Rebooted but still cannot connect.
Status: RESOLVED → REOPENED
Depends on: 884380
Resolution: FIXED → ---
Back in production running jobs.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Agent check failing; PDU reboot didn't help.
Status: RESOLVED → REOPENED
Depends on: 949447
Resolution: FIXED → ---
SD card is wiped. Flashed and reimaged.
Jobs are green.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard