Closed Bug 481207 Opened 15 years ago Closed 12 years ago

Unit test runners should kill wayward processes after test runs, not time out/go orange

Categories

(Testing :: Mochitest, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 501034

People

(Reporter: johnath, Unassigned)

References

Details

One of the more ignorable test failures we see right now is:

> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.1/1236087486.1236093954.13069.gz&fulltext=1#err0

with this log text:

> *** 83307 INFO Passed: 75980
> *** 83308 INFO Failed: 0
> *** 83309 INFO Todo:   1523
> *** 83310 INFO SimpleTest FINISHED
> 
> command timed out: 300 seconds without output
> program finished with exit code 1
> TinderboxPrint: mochitest<br/>75980/0/1523
> buildbot.slave.commands.TimeoutError: command timed out: 300 seconds without output
> TinderboxPrint: mochitest<br/><em class="testfail">timeout</em>

This could be because of a legitimate shutdown hang or a process management bug somewhere in python/windows, but either way it's not having a positive effect to report it as orange right now.  The timeout obscures the actual, completed mochitest results, and people have learned to ignore the timeout itself.

If we want better data about shutdown time, we should solve bug 480413 and directly measure Tshutdown, but doing it indirectly by turning this box orange sometimes doesn't seem desirable. 

I recommend we wait however long for the browser to shut down naturally, but failing that we kill it forcefully and report the test results, instead of going orange on the timeout.  Logging the forced shutdown seems fine to me if it doesn't change the box status.
5 minutes is an awfully long shutdown time. If the browser is not actually hung, that's crazytown. If it's hung, then upping the timeout won't help any. I guess we can not go orange on a shutdown hang, but that's sadmaking. (Also, we would still want to go orange if we hang in the middle of mochitest, which will complicate things.)
I think a --shutdown-wait=200 or similar option in runtests.py could work.  Wait that long and attempt to kill afterwards.

One point to note:  the final test in mochitest is the bloat view check, which relies on the browser shutting down.  Killing the browser would skip that test.  Then again, not killing the browser is skipping that test as well.

Since we're tracking Tshutdown in bug 480413, I'll look into adding this option.  I think it should TinderboxPrint this fact in case it starts to happen on a regular basis.
Assignee: nobody → aki
Upon further investigation, this looks like it belongs in http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/tests/SimpleTest/quit.js

Does anyone with stronger js skills than me want this?
What would we do there? That file already asks the browser to shut down, there isn't much else it can do. (Since it runs in the browser process.)
Blocks: 438871
Whiteboard: [orange]
I don't currently have time for this; throwing back in the pool.
Assignee: aki → nobody
What I'd like to see is an option to the test harnesses like Aki suggests in comment #2.  The harness should kill firefox if it takes more than N seconds to shutdown.

And if it has to kill it off, I still think we should be raising a warning about it.  Tshutdown and mochitests do very different things with the browser, so a shutdown problem caused by running mochitests may not be caught by Tshutdown.
Component: Release Engineering → Mochitest
Product: mozilla.org → Testing
QA Contact: release → mochitest
Version: other → Trunk
Can I dupe this to bug 501034? The harness will kill the browser if it doesn't output anything for a certain amount of time now. This will produce a stack on Linux (and on Windows when I fix that again). I don't think we should silently ignore the browser taking >5 minutes to shut down.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.