Closed Bug 892563 Opened 11 years ago Closed 11 years ago

Add a timeout parameter to mozharness' run_command

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jgriffin, Assigned: jgriffin)

References

Details

Attachments

(1 file, 4 obsolete files)

We have a lot of B2G test runs getting killed during the test by the 1200s timeout, and this prevents the mozharness script from being able to dump logcat, which leaves us with very little information to go on.

I think we need to add a 'timeout' parameter to the run_command method so we can kill the command when it has exceeded a certain threshold without output, and then continue with the mozharness script.

Alternately, we could implement some new mozprocess-based command that could be used selectively in scripts that have installed mozprocess.
Essentially a dup of bug 840305, though this bug has the better summary for non-TBPL folk.
Assignee: nobody → jgriffin
This isn't exactly what I want for a final version, but I will run this on ash to see if anything breaks horribly.
Is 203 test failures in a push horrible enough? ;)

Just an import error with mozprocess:
https://tbpl.mozilla.org/php/getParsedLog.php?id=25200546&tree=Ash
Yep that's bad, I'm trying to figure out what went wrong. :)
So, I made some fixes and it looks like desktop tests are running green.  Android tests are busted, though.  Will do some more investigation.
Looks like I fixed the Android bustage.  Now I just need to add some code to make the error handling more intelligent.
Attachment #774374 - Attachment is obsolete: true
Added better error handling; will trigger another run on ash.
Attachment #774996 - Attachment is obsolete: true
Everything was green on the last run, except the B2G emu tests, which were red due to gps' resource monitoring patches.  I'm going to merge mozharness to ash-mozharness, and trigger that set of tests again.
Attachment #775030 - Attachment is obsolete: true
Comment on attachment 775962 [details] [diff] [review]
WIP 0.4: add mozprocess_run_command to mozharness

I'm going to do another full run on ash with a couple of tweaks I added here, but I think this version is suitable for feedback.

Questions:  are we ok defining a mozprocess_run_command command, like this patch does?  Or would we rather just add a mozprocess arg to run_command, or something else?

Should we have a not-None default for outputTimeout?

If we do that, should we make run_command smart enough so that it detects if mozprocess is available in site-packages, and if so, use the mozprocess method by default?
Attachment #775962 - Flags: feedback?(aki)
Comment on attachment 775962 [details] [diff] [review]
WIP 0.4: add mozprocess_run_command to mozharness

a) Thanks for taking this on.  I've taken some stabs at this but haven't had a fix that has stuck, yet.  It seems to be a large source of tbpl comments, and this will help me make my hg-git stuff more stable, too.

b) We have some similar constructs in resource monitoring:
http://hg.mozilla.org/build/mozharness/file/f7fba2c98999/mozharness/base/python.py#l360
http://hg.mozilla.org/build/mozharness/file/f7fba2c98999/mozharness/base/python.py#l383
and mock:
http://hg.mozilla.org/build/mozharness/file/f7fba2c98999/mozharness/mozilla/mock.py#l92

c) I think my only real comment is, could you rename outputTimeout to output_timeout to match the rest of the variable names?

(In reply to Jonathan Griffin (:jgriffin) from comment #12)
> Questions:  are we ok defining a mozprocess_run_command command, like this
> patch does?  Or would we rather just add a mozprocess arg to run_command, or
> something else?
>
> Should we have a not-None default for outputTimeout?
> 
> If we do that, should we make run_command smart enough so that it detects if
> mozprocess is available in site-packages, and if so, use the mozprocess
> method by default?

If you wanted to rework this, allowing for run_command to use mozprocess if a) output_timeout is not None, and b) mozprocess is installed, would be a good workflow.  This workflow of mozprocess_run_command is still better than not having an output timeout at all, though.
Attachment #775962 - Flags: feedback?(aki) → feedback+
Blocks: 840305
I've refactored per your last suggestion, and changed outputTimeout to output_timeout.  I'm running another set of tests on ash.
Attachment #775962 - Attachment is obsolete: true
Comment on attachment 777207 [details] [diff] [review]
WIP 0.5: add output_timeout to run_command

The run on ash is looking very green.

We could file a follow-up bug to enable output_timeout support on posix systems without mozprocess, if desired.

Would you like a test for this?  I could potentially add one that is hardcoded to fetch mozprocess from puppetagain, but since none of the other tests operate like this, I'm not sure if you want to add that.
Attachment #777207 - Flags: review?(aki)
Comment on attachment 777207 [details] [diff] [review]
WIP 0.5: add output_timeout to run_command

Thanks!  I imagine we'll iterate on this, but I'd love to see how this affects the intermittent orange bugs dependent on this bug.

I'm not entirely sure how to test this; I don't see an easy way to use mock, for example.  We could potentially add mozprocess to the list of unittest dependencies in requirements.txt, to allow for testing.
Attachment #777207 - Flags: review?(aki) → review+
Comment on attachment 777207 [details] [diff] [review]
WIP 0.5: add output_timeout to run_command

Landed on default so I can watch it on cedar:

https://hg.mozilla.org/build/mozharness/rev/879cf481df30

Aki, if you want me to write a test by adding mozprocess to requirements.txt, let me know and I'll do that in a follow-up patch.
Attachment #777207 - Flags: checked-in+
Merged to production.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Blocks: 847727
Product: mozilla.org → Release Engineering
Depends on: 904839
Component: General Automation → Mozharness
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: