Closed Bug 584470 Opened 14 years ago Closed 13 years ago

[exit] Global timeout should allow to proceed the next test instead of killing the complete test-run

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: k0scist)

References

Details

(Whiteboard: [mozmill-2.0+][CLI])

Attachments

(1 file, 1 obsolete file)

With the fix for bug 581733 we made sure that we can kill the browser even for restart tests. It doesn't catch 100% of possible cases, but that should be OK and is not part of this bug. Right now we kill the whole restart test-run. In the future, if a global timeout is hit, we should kill only the currently running restart test and continue with the next restart test.

This is something post Mozmill-1.4.2. Putting on the radar for 2.0
Depends on: 584464
There are a few related issues that give us latitude (maybe too much latitude) on how to structure this:

 - how does this play with per test timeout? https://bugzilla.mozilla.org/show_bug.cgi?id=574871 Is this really better somewhere at that level?

 - currently, the CLI functions as a controller that monitors for the exception, kills mozmill, and (sorta) cleans up; it might be better to have Mozmill and/or MozmillRestart function as controllers. The CLI is an interface, and at best a reflective one. It shouldn't do much controlling.

Also, as noted in the dependencies, this has to do with restructuring our cleanup.
Whiteboard: [mozmill-2.0?] → [mozmill-2.0?][CLI]
Summary: Global timeout should allow to proceed the next restart test instead of killing the complete test-run → [exit] Global timeout should allow to proceed the next restart test instead of killing the complete test-run
Assignee: nobody → jhammel
This really feels like you're asking for the moon here.

If something goes wrong and we time out, then ending the entire test run seems like a fine thing to do.  All the other harnesses work that way, why should mozmill be any different?  Why is it clearly better to time out on one test (which means there are clearly substantial underlying problems) and keep going?
With all of the refactoring for 2.0, this shouldn't be too technically challenging (i.e. if all tests are atomic, then if one dies, recovery should be straightforward).

Whether we *want* to do this...I'm not at all sure. It doesn't seem particularly mandatory for 2.0. If we did allow this, I would prefer it to be optional, since if one test fails, the same underlying cause (no internet, broken shared-modules, etc.) could cause all tests to fail. So I'm cautious at best as to whether this is a good idea (technical implementation aside).
(In reply to comment #2)
> If something goes wrong and we time out, then ending the entire test run seems
> like a fine thing to do.  All the other harnesses work that way, why should
> mozmill be any different?  Why is it clearly better to time out on one test

Eh, no other harness is able to run restart tests? So how could those work that way? :)
So, with our refactor of restart tests this makes more sense:
if a test js file times out then we can continue with the rest of the test js files that were specified for this mozmill run.  That's how I propose solving this.  It is equivalent to what mochitest and reftest do.
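The proposal above could be sketched roughly as follows. This is only an illustration, not the patch that landed: `TestTimeout`, `run_all`, and the `run_test_file` callable are hypothetical names, and the real harness would also have to restart the application between files.

```python
class TestTimeout(Exception):
    """Hypothetical: raised when one test file exceeds the global timeout."""


def run_all(test_files, run_test_file):
    """Run every test file; a timeout fails that file only.

    `run_test_file` is a hypothetical callable that runs a single file
    and raises TestTimeout if the file takes too long.
    """
    results = {"passed": [], "failed": []}
    for path in test_files:
        try:
            run_test_file(path)
        except TestTimeout:
            # Record the failure and move on instead of aborting the run.
            results["failed"].append(path)
        else:
            results["passed"].append(path)
    return results
```

The point of the design is that the timeout is handled inside the loop over files, so the remaining files still get a chance to run.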
Whiteboard: [mozmill-2.0?][CLI] → [mozmill-2.0+][CLI]
There's nothing particular that has to do with restart tests in this bug, so renaming (in fact, for mozmill 2.0, it is vague what a restart test is).
Summary: [exit] Global timeout should allow to proceed the next restart test instead of killing the complete test-run → [exit] Global timeout should allow to proceed the next test instead of killing the complete test-run
We'll need a better way of ensuring that we are in the state we think we are in (stopped, started) before we can do this. Otherwise, in a case where a browser can't close, the harness will just chug away, failing to start JSbridge for the remainder of the test run. Robustness >> feature.
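One way to check the "stopped" state, sketched with the Python 3 standard library only: `ensure_stopped` and the `grace` parameter are hypothetical names, and this assumes the harness holds a `subprocess.Popen` handle for the browser, which may not match how mozmill actually manages the process.

```python
import subprocess


def ensure_stopped(proc, grace=5.0):
    """Return True only once the child process has really exited."""
    if proc.poll() is None:          # still running
        proc.terminate()             # ask politely first
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()              # force it
            proc.wait()
    return proc.poll() is not None
```

The harness would call this before launching the next test and refuse to continue if it returns False.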

We should probably add something like --stop-on-failure so that you can more easily get to the failing test. Alternatively (a better idea, albeit more work), we could develop and ship tools with mozmill that can better interpret the results you get back.
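A minimal sketch of how such a flag might behave. The `--stop-on-failure` option here only mirrors the suggestion above and is not an existing mozmill option; `parse_args`, `run`, and `run_one` are hypothetical names.

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Hypothetical flag, named after the suggestion in this comment.
    parser.add_argument("--stop-on-failure", action="store_true")
    parser.add_argument("tests", nargs="*")
    return parser.parse_args(argv)


def run(tests, run_one, stop_on_failure=False):
    """Run tests in order; return the (path, ok) pairs actually run."""
    ran = []
    for path in tests:
        ok = run_one(path)           # hypothetical: True on pass
        ran.append((path, ok))
        if not ok and stop_on_failure:
            break                    # leave the failing test as the last entry
    return ran
```

With the flag off, the loop keeps going after a failure, which is the default behaviour this bug asks for.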
Depends on: 639991, 640010
We should also consider when/how to denote the jsbridge exception.  With an (effectively) empty test, we get the following:

│mozmill -t test_globalshutdown/
Xlib:  extension "GLX" missing on display ":0.0".
TEST-START | /home/jhammel/mozmill/src/mozmill/mutt/mutt/tests/js/test_globalshutdown/test_long.js | setupModule
TEST-START | /home/jhammel/mozmill/src/mozmill/mutt/mutt/tests/js/test_globalshutdown/test_long.js | testTakesTooLong
INFO | Timeout: bridge.execFunction("283b5956-920f-11e0-946c-00262df16844", bridge.registry["{9e191a47-185b-417d-b7d9-257162d6c21a}"]["runTestFile"], ["/home/jhammel/mozmill/src/mozmill/mutt/mutt/tests/js/test_globalshutdown/test_long.js", false, null])
INFO | 
INFO | DEBUG: stop is calling kill
INFO | 
TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
INFO | Passed: 0
INFO | Failed: 1
INFO | Skipped: 0
ERROR | Traceback (most recent call last):
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 631, in run
ERROR |     mozmill.run(*self.manifest.tests)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 363, in run
ERROR |     self.run_tests(*tests)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 345, in run_tests
ERROR |     self.run_test_file(frame, test['path'])
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 303, in run_test_file
ERROR |     frame.runTestFile(path, False, name)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/jsbridge/jsbridge/jsobjects.py", line 131, in __call__
ERROR |     response = self._bridge_.execFunction(self._name_, args)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/jsbridge/jsbridge/network.py", line 216, in execFunction
ERROR |     return self.run(_uuid, 'bridge.execFunction('+ ', '.join(exec_args)+')', interval)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/jsbridge/jsbridge/network.py", line 193, in run
ERROR |     raise JSBridgeDisconnectError("Connection timed out")
ERROR | JSBridgeDisconnectError: Connection timed out

So currently we raise an exception that is displayed after the results.  What do we actually want to do? (Could be "we don't care")
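One possible answer, sketched under assumptions: catch the disconnect where the test file is run and report it as an ordinary failure line, so no traceback follows the summary. `BridgeDisconnectError` is a stand-in defined here, not the real jsbridge class, and `run_reporting_disconnects`, `run_one`, and `log` are hypothetical names.

```python
class BridgeDisconnectError(Exception):
    """Stand-in for JSBridgeDisconnectError (not the real class)."""


def run_reporting_disconnects(tests, run_one, log):
    """Run each test; log a disconnect as a failure instead of raising."""
    failed = 0
    for path in tests:
        try:
            run_one(path)
        except BridgeDisconnectError as exc:
            failed += 1
            log("TEST-UNEXPECTED-FAIL | %s | %s" % (path, exc))
    log("INFO | Failed: %d" % failed)
    return failed
```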
Attached patch: this would work if... (obsolete)
I forgot this bug had a hard dependency on bug 639991. This patch, while requiring some associated cleanup, should do mostly what is desired. However, you can't actually stop the runner correctly, since that depends on endRunner being called. So that should be fixed. As such, a run with this patch gives the following traceback and does not proceed to the next test:

│mozmill -m manifest.ini 
Xlib:  extension "GLX" missing on display ":0.0".
TEST-START | /home/jhammel/mozmill/src/mozmill/mutt/mutt/tests/js/test_globalshutdown/test_long.js | setupModule
TEST-START | /home/jhammel/mozmill/src/mozmill/mutt/mutt/tests/js/test_globalshutdown/test_long.js | testTakesTooLong
INFO | Timeout: bridge.execFunction("d8d445f6-921f-11e0-aafd-00262df16844", bridge.registry["{d8bf353f-0173-46db-86a4-eeb8911e5211}"]["runTestFile"], ["/home/jhammel/mozmill/src/mozmill/mutt/mutt/tests/js/test_globalshutdown/test_long.js", false, null])
INFO | 
INFO | Timeout: bridge.set("fd63ab1e-921f-11e0-aafd-00262df16844", Components.utils.import('resource://mozmill/modules/mozmill.js'))
INFO | 
INFO | DEBUG::Calling wait for finish now
INFO | 
ERROR | Wait timed out, attempting kill
ERROR | 
INFO | DEBUG: stop is calling kill
INFO | 
TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
INFO | Passed: 0
INFO | Failed: 1
INFO | Skipped: 0
ERROR | Traceback (most recent call last):
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 634, in run
ERROR |     mozmill.run(*self.manifest.tests)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 366, in run
ERROR |     self.run_tests(*tests)
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 353, in run_tests
ERROR |     self.stop_runner()
ERROR |   File "/home/jhammel/mozmill/src/mozmill/mozmill/mozmill/__init__.py", line 438, in stop_runner
ERROR |     raise Exception('endRunner was never called. There must have been a failure in the framework')
ERROR | Exception: endRunner was never called. There must have been a failure in the framework
Attachment #538129 - Attachment is obsolete: true
Attachment #545782 - Flags: review?(ctalbert)
Attachment #545782 - Flags: review?(ctalbert) → review+
pushed to master: https://github.com/mozautomation/mozmill/commit/ff2110480b421af428fa7bb2775d29efd96ececa
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: Testing → Testing Graveyard