Closed Bug 808437 Opened 12 years ago Closed 12 years ago

Something has broken in tegra recovery

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

ARM
Android
task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

Last week, we had three tegra recovery bugs: bug 806950, bug 807663 and bug 807965. Every single tegra involved in those now has its tracking bug reopened because it is broken, doing horribly, failing more than 50% of the time and doing so in suspicious ways like timing out in reftests and mochitests and failing to even initialize the browser in talos. Particularly telling are tegra-093 and tegra-182, because from looking at buildapi/recent, there's no evidence I can see that they had a problem at all, but then they got recovered, and have turned into broken tegras. The one before those was bug 802655, which did three tegras, two of which I persuaded Callek were bad hardware and should be scrapped, the third of which I regretted having not included in that tar-brushing a day later. The one before that was bug 792692, which only did two of the tegras in the 300s, which are all awful and thus hard to tell about, but one of the two, tegra-336, seems to have been restarted on November 1st and to be running okay. So, what could have changed between 2012-09-21, the last time we did a successful reimage, and now?
tegra-064 got reimaged in bug 807963 rather than a tegra-recovery bug, but it's broken just the same.
Blocks: tegra-064
Blocks: 808468
Depends on: 808474
> So, what could have changed between 2012-09-21, the last time we did a successful reimage, and now?
> tegra-064 got reimaged in bug 807963 rather than a tegra-recovery bug, but it's broken just the same.
:philor, is there a way to check if the latest image was used in tegra-064? I know there are several images on the imaging netbook and want to confirm the latest image has been used since 9-21.
s/philor/Callek/, since I'm a volunteer who looks at logs after test jobs finish, not a releng employee with access to anything.
Flags: needinfo?(bugspam.Callek)
tegra-057 got a reimage in bug 807962, and is also busted.
Blocks: tegra-057
Hi, I just reimaged tegra-057 and tegra-064 with what is supposed to be the correct image. Is there a way you can run tests on them to confirm they're working as normal? Thanks, Van
(In reply to Van Le [:van] from comment #5)
> I just reimaged tegra-057 and tegra-064 with what is supposed to be the
> correct image. Is there a way you can run tests on them to confirm they're
> working as normal?
Apparently we crossed streams, and I never took down 057 first -- but no big worry there, it will pick up a new job soon. I've just started up 064 as well. *leaving* their problem tracking bugs open for now
Flags: needinfo?(bugspam.Callek)
The fact that it's difficult to say whether or not 057 and 064 got another "bad image" doesn't bode well for that email thread about verifying that tegras are in good shape before putting them back in service. 064 still has a busted sd card, whether because it got one bad one replaced by another, or it has a busted slot, or something less imaginable - every other test run was failing by not being able to write to the card. The ones that did run... there were only four, only one failed, but in a suspicious way. 057 is probably busted in the bad-image way - it's done 15 green runs and 12 non-green, which is a bit higher than the average success rate for broken ones, but well below the average for unbroken ones. But if the eventual post-image verification process takes two days to be sure a tegra is healthy, that's not going to be very handy.
057 is busted in the bad-image way, but it's ugly that it takes this long to be sure.
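A rough illustration of why it takes this long to be sure, using the 15 green / 12 non-green counts reported for 057 above. This is only a sketch: the Wilson score interval is a technique chosen here for illustration and is not something the thread or the recovery process uses, and the function names are hypothetical.

```python
import math


def green_rate(green: int, non_green: int) -> float:
    """Fraction of runs that came back green."""
    total = green + non_green
    if total == 0:
        raise ValueError("no runs recorded")
    return green / total


def wilson_interval(green: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for the true green rate.

    Assumption: runs are treated as independent pass/fail trials, which
    real tegra jobs may not be.
    """
    if total == 0:
        raise ValueError("no runs recorded")
    p = green / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Counts for tegra-057 as reported in the comment above.
rate = green_rate(15, 12)
low, high = wilson_interval(15, 15 + 12)
print(f"observed green rate: {rate:.1%}")
print(f"plausible range for the true rate: {low:.1%} to {high:.1%}")
```

With only 27 runs the plausible range is very wide, so the observed rate alone cannot cleanly separate a merely unlucky tegra from a broken one until many more jobs have run.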
Blocks: 813012
No longer blocks: 813012
From what I can tell, the system is working as intended, and nothing unexpected changed. This bug is closeable.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
No longer blocks: 808468
Depends on: 808468
Product: mozilla.org → Release Engineering
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard