Closed Bug 792318 Opened 12 years ago Closed 12 years ago

Stop running verify.py twice during Android Talos

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: philor, Assigned: Callek)

References

Details

Attachments

(1 file, 2 obsolete files)

[custom] Make a preCleanup* def that is called here.; 12 years ago; Justin Wood (:Callek); 1.79 KB, patch
[custom] Make a preCleanup* def that is called here. (v2, properly qref'd); 12 years ago; Justin Wood (:Callek); 1.87 KB, patch
run verify.py only at the beginning; 12 years ago; Armen [:armenzg]; 2.94 KB, patch; flags: Callek: review+, armenzg: checked-in+

Phil Ringnalda (:philor)

Reporter

Description

12 years ago

For unittests, we run verify.py once, at the start of the run, to be sure we have a live and up to date tegra to run the tests on. For talos, we run verify.py at the start of the run, to be sure we have a live and up to date tegra to run the tests on, and again at the end of the run, to... I think the theory was to increase the likelihood that it would be ready for its next run, but in fact what it seems to do is increase the likelihood that a perfectly good talos run will end up red from bug 686084 or bug 781419, or purple from bug 660480. The bug 781419 situation is particularly interesting to me, because I turned that into an auto RETRY for unittests in bug 791477, but I wasn't willing to do that for talos because of this - we would do a perfectly good talos run, getting exactly what we wanted out of it, and then if the tegra happened to die between then and the time we ran verify.py on it a second time, we would set RETRY and pointlessly run the job a second time.

Justin Wood (:Callek)

Assignee

Comment 1

12 years ago

Attached patch [custom] Make a preCleanup* def that is called here. (obsolete)

Doubtful this is the most elegant way to do it, but I was already thinking it would be a good end-to-end reducer and a sanity-increaser. The fact that philor filed the bug for us increased my desire to write something *now* rather than *tomorrow*.

Assignee: nobody → bugspam.Callek

Status: NEW → ASSIGNED

Attachment #662431 - Flags: review?(armenzg)

Justin Wood (:Callek)

Assignee

Updated

12 years ago

Attachment #662431 - Attachment is patch: true

Justin Wood (:Callek)

Assignee

Comment 2

12 years ago

Attached patch [custom] Make a preCleanup* def that is called here. (v2, properly qref'd) (obsolete)

This time, qref'd

Attachment #662431 - Attachment is obsolete: true

Attachment #662431 - Flags: review?(armenzg)

Attachment #662436 - Flags: review?(armenzg)

Armen [:armenzg]

Comment 3

12 years ago

What about if we remove verify.py from buildbotcustom altogether?

Justin Wood (:Callek)

Assignee

Comment 4

12 years ago

(In reply to Armen Zambrano G. [:armenzg] from comment #3)
> What about if we remove verify.py from buildbotcustom altogether?

To answer here, we can't easily. It is catching many intermittent device failures and taking the device out of the pool when they happen. If we had the ability to properly graceful the tegra jobs, so that verify always ran from outside of buildbot, we'd be in better shape, but all my attempts at that have hit buildbot state bugs.

To clarify, verify.py does many important things:
* Verifies the tegra is pingable
* Verifies we can telnet to the SUTAgent port [and get a prompt]
* Verifies we have the expected SUTAgent version [if not, we attempt a SUT upgrade]
* Verifies the SDCard is mounted and writeable
* Cleans up errant procs/files on the device and the foopy

If any of those things fail, we set an error.flg, which forcibly takes the device out of the buildbot rotation until/unless the outside-of-buildbot verify.py run succeeds. Unfortunately there are still a few cases where verify.py can time out or fail without killing off buildbot, but that is less common than the many cases where it does [properly].
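
The checklist above could be sketched roughly as follows. This is not the real verify.py: the names `port_answers` and `run_checks`, the flag-file contents, and the choice to clear a stale flag on success are all assumptions made for illustration. Only a telnet-style reachability check is shown; the version, SDCard, and cleanup steps would be further entries in the same list.

```python
import os
import socket
import tempfile


def port_answers(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks, flag_path):
    """Run (name, callable) checks in order and stop at the first failure.

    On failure, write a flag file naming the failed check and return False.
    On success, remove any stale flag file (an assumption) and return True.
    """
    for name, check in checks:
        if not check():
            with open(flag_path, "w") as flag:
                flag.write("check failed: %s\n" % name)
            return False
    if os.path.exists(flag_path):
        os.remove(flag_path)
    return True


if __name__ == "__main__":
    # Stand-in checks so the example runs without a device attached.
    flag_path = os.path.join(tempfile.mkdtemp(), "error.flg")
    checks = [
        ("device is pingable", lambda: True),
        ("agent port answers", lambda: False),
    ]
    print(run_checks(checks, flag_path), os.path.exists(flag_path))
```

Stopping at the first failure mirrors the idea that any single failed check is enough to pull the device out of rotation.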

Armen [:armenzg]

Comment 5

12 years ago

Attached patch run verify.py only at the beginning

Attachment #662436 - Attachment is obsolete: true

Attachment #662436 - Flags: review?(armenzg)

Attachment #662611 - Flags: review?(bugspam.Callek)

Justin Wood (:Callek)

Assignee

Comment 6

12 years ago

Comment on attachment 662611
run verify.py only at the beginning

Review of attachment 662611:
-----------------------------------------------------------------

::: process/factory.py
@@ +5246,5 @@
> + name="verify_tegra_state",
> + description="Running verify.py",
> + command=['python', '/builds/sut_tools/verify.py'],
> + workdir='build',
> + haltOnFailure=True,

nit: add |log_eval_func=lambda c,s: regex_log_evaluator(c, s, tegra_errors),|
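
For readers unfamiliar with the nit: a log evaluator of this kind scans a step's output for known error patterns and maps a match to a result status. The sketch below is a standalone illustration, not the `regex_log_evaluator` used in the patch. The status constants, the `evaluate_log` name, and the sample patterns are hypothetical, and the real `tegra_errors` list is not reproduced here.

```python
import re

# Hypothetical status values, ordered so that a larger number is "worse".
SUCCESS, WARNINGS, FAILURE, RETRY = 0, 1, 2, 3

# Hypothetical (pattern, status) pairs; not the real tegra_errors list.
SAMPLE_ERRORS = (
    (re.compile(r"device not responding"), RETRY),
    (re.compile(r"disk is full"), FAILURE),
)


def evaluate_log(exit_code, log_text, error_patterns):
    """Return the worst status implied by the exit code and any matching pattern."""
    status = SUCCESS if exit_code == 0 else FAILURE
    for pattern, pattern_status in error_patterns:
        if pattern.search(log_text):
            status = max(status, pattern_status)
    return status
```

An evaluator along these lines would let a recognizable device failure in the verify step be reported as a retry rather than as a generic failure, which is presumably what the nit is after.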

Attachment #662611 - Flags: review?(bugspam.Callek) → review+

Armen [:armenzg]

Comment 7

12 years ago

Comment on attachment 662611
run verify.py only at the beginning

http://hg.mozilla.org/build/buildbotcustom/rev/698145470ef7

Attachment #662611 - Flags: checked-in+

Justin Wood (:Callek)

Assignee

Comment 8

12 years ago

This deployed last week

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

11 years ago

Product: mozilla.org → Release Engineering

Nobody; OK to take it and work on it

Updated

7 years ago

Component: General Automation → General

Categories

(Release Engineering :: General, defect)
