Closed Bug 1020384 Opened 10 years ago Closed 10 years ago

Give Jetpack tests a shorter maxTime than 2 hours

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: emorley)

Details

(Whiteboard: [capacity])

Attachments

(1 file)

Jetpack tests have a habit of hanging their harness and waiting until the job maxtime expires (bug 942111 and bug 926264 are the current dumping grounds). That's an expensive way for a 15-35 minute test suite to fail; we should switch to a 60 minute maxtime instead.
I'm not sure this is a good solution, since according to [1] our Linux x64 Debug test runs for just over an hour.

I filed bug 1020401 instead.


[1] https://tbpl.mozilla.org/?tree=Jetpack
Freaky. I figured Win debug would be the slowest since it always is. We could still do 90 minutes instead of 120, but that's a bit less exciting of a gain.
We could conceivably have per-platform limits if it was really important.

Or we fix the harness to kill after some reasonable amount of time?
I believe Erik fixed this issue in bug 1020458, so you can close this one, IMHO.
That bug is separate but complementary to this.

We can still reduce maxtime to reduce impact of failing jobs that don't get caught by the changes there.
I'm not trying to argue here, so here is my understanding from the Jetpack point of view:

We have 3 different kinds of tests, and one of them (testaddons) didn't enforce timeouts in the Python part of our test harness. This was an oversight on our part, which Erik has now fixed.

There are no bugs that I know of suggesting this will still be an issue, because that would require both Firefox and our Python code to hang, which I don't remember ever seeing happen.

So I suggest closing this for now, until new evidence shows up that refutes my claims. ;)
Each job in buildbot has a "max time without output" and "max runtime for job no matter what happens". In order to prevent regressions either from unhandled hangs or unintended drastic increases in runtime, we set both for all jobs, regardless of how effective the test harness is. The max runtime for jetpack is already set at 2 hours - we can just reduce that to 90 mins. We've caught regressions in other suites this way (eg tests taking 30% longer to run due to console regressions).

https://hg.mozilla.org/build/buildbotcustom/file/0936f69f9608/process/factory.py#l4835
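
For illustration only, here is a rough sketch of how the two limits described above interact. This is not buildbot's implementation; `run_with_limits`, `idle_timeout` and `max_time` are hypothetical names standing in for "max time without output" and "max runtime for job no matter what happens".

```python
import subprocess
import sys
import threading
import time


def run_with_limits(cmd, idle_timeout, max_time, poll_interval=0.05):
    """Run cmd and kill it if it is silent or slow for too long.

    idle_timeout: seconds allowed without any output (hypothetical name).
    max_time: seconds allowed in total, output or not (hypothetical name).
    Returns (reason, returncode); reason is "finished", "idle_timeout"
    or "max_time".
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    start = time.monotonic()
    last_output = [start]  # one-element list so the reader thread can update it

    def pump():
        # Each line of output resets the "no output" clock.
        for _line in iter(proc.stdout.readline, b""):
            last_output[0] = time.monotonic()

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    reason = "finished"
    while proc.poll() is None:
        now = time.monotonic()
        if now - start > max_time:
            reason = "max_time"
        elif now - last_output[0] > idle_timeout:
            reason = "idle_timeout"
        if reason != "finished":
            proc.kill()
            break
        time.sleep(poll_interval)

    proc.wait()
    reader.join(timeout=1)
    return reason, proc.returncode


if __name__ == "__main__":
    # A silent child is stopped by the idle limit long before max_time.
    silent = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_limits(silent, idle_timeout=1, max_time=10)[0])
```

A job that keeps printing never trips the idle limit, so only the overall limit bounds it; that is the case a shorter maxTime is meant to cover.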
> because that would require both firefox and our python code to hang, 

OK, I stand corrected. This can also happen when the Python code fails to kill Firefox, as in:

https://tbpl.mozilla.org/php/getParsedLog.php?id=40908351&tree=Mozilla-Release

Feel free to ignore me.
The longest Jetpack runtimes are seen on Linux debug runs.

Looking at:
https://tbpl.mozilla.org/?showall=1&jobname=jetpack

And searching logs for "Finished 'python jetpack/bin/cfx ...' (results: 0, elapsed:"

Gives:
* 44-47 mins for the 'testpkgs' step
* 6-7 mins for the 'testaddons' step

This patch sets the maxTime for the former to 75 mins (buffer of ~25-30mins) and the latter to 30 mins (buffer of ~20mins).
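
The buffer arithmetic can be checked with a few lines of Python. The durations and limits are the ones quoted above; the names `OBSERVED`, `MAX_TIME` and `headroom` are made up for this sketch and are not part of the patch.

```python
# Observed Linux debug step durations quoted above, in minutes (fastest, slowest).
OBSERVED = {"testpkgs": (44, 47), "testaddons": (6, 7)}

# maxTime values proposed by the patch, in minutes.
MAX_TIME = {"testpkgs": 75, "testaddons": 30}


def headroom(step):
    """Return (smallest, largest) buffer in minutes between a run and its limit."""
    fastest, slowest = OBSERVED[step]
    limit = MAX_TIME[step]
    return limit - slowest, limit - fastest


for step in OBSERVED:
    low, high = headroom(step)
    print(f"{step}: {low}-{high} min buffer under a {MAX_TIME[step]} min maxTime")
# testpkgs: 28-31 min buffer under a 75 min maxTime
# testaddons: 23-24 min buffer under a 30 min maxTime
```

These figures are in line with the rounded buffers quoted in the comment.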
Attachment #8437005 - Flags: review?(catlee)
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Summary: Give Jetpack tests a one hour maxtime instead of two → Give Jetpack tests a shorter maxTime than 2 hours
> Each job in buildbot has a "max time without output" 

It seems this part doesn't work well? For example, in the above log, I see a 15 minute run of the test, then a failure to exit/kill Firefox, and then a timeout after two hours. If this was working properly, wouldn't that time out much sooner?


> This patch sets the maxTime for the former to 75 mins (buffer of ~25-30mins)
> and the latter to 30 mins (buffer of ~20mins).

Isn't the 2 hour timeout for the whole Jetpack suite, not for parts of it?
(In reply to Tomislav Jovanovic [:zombie] from comment #10)
> > Each job in buildbot has a "max time without output" 
> 
> It seems this part doesn't work well? For example, in the above log, I see a
> 15 minute run of the test, then a failure to exit/kill Firefox, and then a
> timeout after two hours. If this was working properly, wouldn't that time out
> much sooner?

Agreed; it may not be set for jetpack - catlee: I forget what variable determines it?

> > This patch sets the maxTime for the former to 75 mins (buffer of ~25-30mins)
> > and the latter to 30 mins (buffer of ~20mins).
> 
> isn't the 2 hour timeout for the whole jetpack suite, not for parts of it?

For each substep - the diff should make things clearer? :-)
Attachment #8437005 - Flags: review?(catlee) → review+
Thank you for the review :-)

remote:   https://hg.mozilla.org/build/buildbotcustom/rev/a118174e53fb
In prod with reconfig on 2014-06-12 10:46 PT
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General