Closed Bug 792318 Opened 12 years ago Closed 12 years ago

Stop running verify.py twice during Android Talos

Categories

(Release Engineering :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: Callek)

Attachments

(1 file, 2 obsolete files)

For unittests, we run verify.py once, at the start of the run, to be sure we have a live and up to date tegra to run the tests on. For talos, we run verify.py at the start of the run, to be sure we have a live and up to date tegra to run the tests on, and again at the end of the run, to... I think the theory was to increase the likelihood that it would be ready for its next run, but in fact what it seems to do is increase the likelihood that a perfectly good talos run will end up red from bug 686084 or bug 781419, or purple from bug 660480. The bug 781419 situation is particularly interesting to me, because I turned that into an auto RETRY for unittests in bug 791477, but I wasn't willing to do that for talos because of this - we would do a perfectly good talos run, getting exactly what we wanted out of it, and then if the tegra happened to die between then and the time we ran verify.py on it a second time, we would set RETRY and pointlessly run the job a second time.
It's doubtful this is the most elegant way to do it, but I was already thinking it would be a good way to reduce end-to-end time and increase sanity. The fact that philor filed the bug for us increased my desire to write something *now* rather than *tomorrow*.
Assignee: nobody → bugspam.Callek
Status: NEW → ASSIGNED
Attachment #662431 - Flags: review?(armenzg)
Attachment #662431 - Attachment is patch: true
This time, qref'd
Attachment #662431 - Attachment is obsolete: true
Attachment #662431 - Flags: review?(armenzg)
Attachment #662436 - Flags: review?(armenzg)
What about if we remove verify.py from buildbotcustom altogether?
(In reply to Armen Zambrano G. [:armenzg] from comment #3)
> What about if we remove verify.py from buildbotcustom altogether?

To answer here: we can't easily. It is catching many intermittent device failures, and taking the device out of the pool when they happen. If we had the ability to properly graceful the tegra jobs, so that verify always ran from outside of buildbot, we'd be in better shape, but all my attempts at that have run up against buildbot state bugs.

To clarify, verify.py does many important things:
* Verifies the tegra is pingable
* Verifies we can telnet to the SUTAgent port [and get a prompt]
* Verifies we have the expected SUTAgent version [if not, we attempt a SUT upgrade]
* Verifies the SDCard is mounted and writeable
* Cleans up errant procs/files on the device and the foopy

If any of those things fail, we set an error.flg, which forcibly takes the device out of the buildbot rotation until/unless the outside-of-buildbot verify.py run succeeds. Unfortunately there are still a few cases where verify.py can time out or fail without killing off buildbot, but those are less common than the many cases where it does [properly].
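For readers who have not seen verify.py, the flow described above (ordered checks, stop at the first failure, leave an error.flg behind) can be sketched roughly as follows. This is not the real sut_tools code: `port_answers`, `run_checks`, the flag file contents, and the removal of a stale flag on success are all assumptions made for illustration, and the sketch assumes Python 3.

```python
import os
import socket


def port_answers(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical stand-in for the "can we telnet to the SUTAgent port" check.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks, flag_path):
    """Run (name, callable) checks in order and stop at the first failure.

    On failure, write the failing check's name to flag_path and return False.
    On success, remove any stale flag file and return True (an assumption
    about how a successful out-of-buildbot run clears the flag).
    """
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            # A check that blows up counts as a failed check.
            ok = False
        if not ok:
            with open(flag_path, "w") as flag:
                flag.write("verify failed: %s\n" % name)
            return False
    if os.path.exists(flag_path):
        os.remove(flag_path)
    return True
```

A caller would pass the checks in the order listed above (ping, SUTAgent port, SUTAgent version, SDCard, cleanup), so a dead device is flagged before any of the more expensive steps run.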
Attachment #662436 - Attachment is obsolete: true
Attachment #662436 - Flags: review?(armenzg)
Attachment #662611 - Flags: review?(bugspam.Callek)
Comment on attachment 662611
run verify.py only at the beginning

Review of attachment 662611:
-----------------------------------------------------------------

::: process/factory.py
@@ +5246,5 @@
> + name="verify_tegra_state",
> + description="Running verify.py",
> + command=['python', '/builds/sut_tools/verify.py'],
> + workdir='build',
> + haltOnFailure=True,

nit: add |log_eval_func=lambda c,s: regex_log_evaluator(c, s, tegra_errors),|
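The nit asks for a log evaluator so that known device errors in the verify.py output decide the step's status, rather than the exit code alone. The general idea can be sketched as below. This is a simplified stand-in, not buildbotcustom's `regex_log_evaluator`: the result codes, the `example_errors` patterns, and the `evaluate_log` signature are hypothetical, and the real `tegra_errors` list is not shown in this bug.

```python
import re

# Stand-in result codes for this sketch, ordered by severity.
# They are not imported from buildbot and may not match its values.
SUCCESS, WARNINGS, FAILURE, EXCEPTION, RETRY = range(5)

# Hypothetical (pattern, result) pairs, playing the role of tegra_errors.
example_errors = (
    (re.compile(r"Remote Device Error"), RETRY),
    (re.compile(r"Traceback \(most recent call last\)"), EXCEPTION),
)


def evaluate_log(return_code, log_text, patterns):
    """Return the most severe status implied by the exit code and the log."""
    status = SUCCESS if return_code == 0 else FAILURE
    for pattern, result in patterns:
        # Escalate only; a match never downgrades the status.
        if pattern.search(log_text) and result > status:
            status = result
    return status
```

With an evaluator of this shape attached to the single verify step at the start of the run, a device that is already dead would produce RETRY before any talos work is done, instead of after a good run has finished.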
Attachment #662611 - Flags: review?(bugspam.Callek) → review+
This was deployed last week.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General