Closed Bug 972912 Opened 11 years ago Closed 10 years ago

Mozmill fails to run all expected tests

Categories

(Testing Graveyard :: Mozmill, defect, P1)


Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: whimboo, Unassigned)

References

Details

(Whiteboard: [mozmill-2.1+])

Attachments

(1 file, 4 obsolete files)

This test has most likely been failing over the last few days because the master password dialog may come up and block the execution of the test. We are using the same profile, so the master password test under restart tests might not reset the master password. Someone should investigate this.

06:52:19 TEST-START | testPreferences/testPasswordSavedAndDeleted.js | setupModule
06:52:19 TEST-START | testPreferences/testPasswordSavedAndDeleted.js | testSaveAndDeletePassword
06:54:22 NOTE: child process received `Goodbye', closing down
06:54:22 Timeout: bridge.execFunction("d8957811-9587-11e3-b3b5-3c07540e8d1c", bridge.registry["{1258f97a-e74d-f541-a336-853f01bc9f40}"]["runTestFile"], ["/Users/mozauto/jenkins/workspace/mozilla-central_functional/data/mozmill-tests/firefox/tests/functional/testPreferences/testPasswordSavedAndDeleted.js", null])
06:54:22
06:54:22 Timeout: bridge.set("fda36a38-9587-11e3-8b96-3c07540e8d1c", Components.utils.import("resource://mozmill/modules/frame.js"))
06:54:22
This doesn't always happen, and might need a couple of testruns to be seen.
Assignee: nobody → andrei.eftimie
I can reproduce this issue if I don't run test3 from the Master Password suite (where we remove the password). Checking some reports points to the same issue: http://mozmill-daily.blargon7.com/#/functional/report/0100465a763e4862d4004053841c87ac
There test3.js hasn't been run at all. I've checked the console log of such a testrun, and there's no additional information there:

> 06:49:59 TEST-START | restartTests/testPreferences_masterPassword/test1.js | setupModule
> 06:49:59 TEST-START | restartTests/testPreferences_masterPassword/test1.js | testSetMasterPassword
> 06:50:05 TEST-PASS | restartTests/testPreferences_masterPassword/test1.js | testSetMasterPassword
> 06:50:05 TEST-START | restartTests/testPreferences_masterPassword/test1.js | teardownModule
> 06:50:05 TEST-END | restartTests/testPreferences_masterPassword/test1.js | finished in 5422ms
> 06:50:09 TEST-START | restartTests/testPreferences_masterPassword/test2.js | setupModule
> 06:50:09 TEST-START | restartTests/testPreferences_masterPassword/test2.js | testInvokeMasterPassword
> 06:50:13 TEST-PASS | restartTests/testPreferences_masterPassword/test2.js | testInvokeMasterPassword
> 06:50:13 TEST-START | restartTests/testPreferences_masterPassword/test2.js | teardownModule
> 06:50:13 TEST-END | restartTests/testPreferences_masterPassword/test2.js | finished in 3785ms
> 06:50:19 TEST-START | restartTests/testRestartChangeArchitecture/test1.js | setupModule

I'll look into what could be causing this.
This is pretty bad. We should really figure out why we miss some test modules sometimes. Andrei, what was the other bug/issue for that problem? It should certainly block this failure.
Let's change the summary to reflect the real issue here.
Component: Mozmill Tests → Mozmill
Product: Mozilla QA → Testing
Summary: Disconnect of Mozmill in testPasswordSavedAndDeleted.js because master password dialog may block execution → [mozmill] Mozmill fails to ran all expected tests
Actually we should have created a new bug for Mozmill and marked all the other test failures as dependent on it. Well, first we should know whether that's really a Mozmill bug.
Summary: [mozmill] Mozmill fails to ran all expected tests → Mozmill fails to ran all expected tests
Whiteboard: [mozmill-test-failure]
Blocks: 977010
Severity: normal → critical
Blocks: 977068
No longer blocks: 977068
Andrei, why are Mozmill test failures getting duped against this Mozmill bug? Those should really depend on it. Please get this fixed and don't dupe even more. Thanks.
The fallout from this bug can lead to _any_ Mozmill test failure. I didn't want to pollute the test failures we have, but I see your point. This is now at the top of my list.
Status: NEW → ASSIGNED
I see 2 possible reasons for this:

1) We do not get a correct list while parsing the manifest file (or somewhere along this stack). Basically we _start_ with an incomplete list.
2) We fail silently somewhere, in such a way that we fail to even report the test as being run.

I haven't been able to reproduce this at all yet.

For 1) I should be able to run this very fast multiple times by just printing the length of the initial list without actually running any test. Since this will be fast, I expect to have lots of results so we can determine whether or not the problem is here.

In the meantime I'll try to figure out all the places where we could fail silently, in order to reproduce 2). There shouldn't be many.
I've tried to verify option 1). I've run 500 testruns and about 7000 runs directly through Mozmill, cancelling the actual test execution and just counting the tests that Mozmill receives here: https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/__init__.py#L401

This was on OS X. I've always received the right number of tests, so 1) is probably not the problem we're seeing here (unless there are some specific variables that I haven't managed to reproduce).
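For reference, the counting experiment amounts to something like the sketch below. `collect_tests` here is a hypothetical stand-in for Mozmill's manifest collection step, not its real API:

```python
# Sketch only: repeatedly collect the test list without running anything and
# check that the count stays stable. A flaky manifest parser would show up as
# more than one distinct count.
def count_is_stable(collect_tests, runs=500):
    counts = {len(collect_tests()) for _ in range(runs)}
    return len(counts) == 1, sorted(counts)

# Example with a dummy collector that always returns the same list:
stable, counts = count_is_stable(lambda: ['test1.js', 'test2.js'], runs=50)
print(stable, counts)  # True [2]
```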
Happened today with mm-osx-109-1, also with Aurora, in restartTests/testRestartChangeArchitecture: http://mozmill-daily.blargon7.com/#/functional/report/a438ea29b921b2e8124749eda953f1dc
Happened again today with Beta on mm-osx-106-3 in restartTests/testAddons_enableDisableExtension/: test2.js never ran. http://mozmill-release.blargon7.com/#/functional/report/306025ceedc6c6a3994dcaa01137cd06
I have experienced this issue today locally. Unfortunately I was testing something entirely different and didn't have any special logging set up :( It didn't fail again afterwards at all. I would propose doing something similar in effect to bug 768239, but instead of having a ticker I would like to log the initial test list _before_ we start running tests and compare it to the list of tests that _actually ran_. This way we would know whether the problem lies in parsing the manifest files or whether we're silently failing somewhere.
Do you still have the console output around? At least that would help, given that we are often not sure whether the output we see in Jenkins is complete, due to caching.
I don't have it around. But I know exactly what it was, and it was the same as we see in Jenkins. Basically I had one test missing completely. I only noticed this because it was test1 from testAddons_uninstallTheme, and because it didn't run, both test2 and test3 failed miserably. There was no indication of this test at all in the console log. Just the previous test marked as PASSED, and /restartTests/testAddons_uninstallTheme/test2.js failing because of a missing add-on (or some similar failure).
Attached patch WIP_completeness_check.patch (obsolete) — Splinter Review
Here's a possible route we could take. I've added a "completeness check" (I'm not sure about the naming yet) at the end of a testrun. This is a basic check that tells us whether all tests have been run. (We could go further and save the initial test list and compare it against the full list to see exactly which test was not run, but a basic length check should do the trick for now.)

After we show the final RESULTS we also show the Completeness level (True|False). This will be False when we have a disconnect and fail to finish the testrun.

I see 2 additional things needed: 1) making sure to mark the build as failed in Jenkins (so we'll get the failure mail), and 2) adding support in the dashboard to easily spot these incomplete testruns.

====

This will greatly help us:
a) see the extent of the problem we're facing. We currently only catch this if a missed test is in a module where the remaining tests depend on it (which generates additional failures)
b) know whether the problem lies in the manifest parser / collect_tests method OR whether the problem is in Mozmill and we fail silently (I currently assume it's the latter)

What do you guys think of adding such a check?
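In outline, the proposed check boils down to roughly the following (a minimal sketch with made-up names, not the actual patch code):

```python
def check_completeness(initial_tests, results):
    """Basic length check: did every collected test produce a result?

    `initial_tests` is the list built from the manifests before the run,
    `results` the list of per-test result entries after the run.
    """
    complete = len(results) >= len(initial_tests)
    print("Completeness: %s (%d of %d tests reported)"
          % (complete, len(results), len(initial_tests)))
    return complete

# Example: one test never reported back, so the check prints False.
check_completeness(['test1.js', 'test2.js', 'test3.js'],
                   [{'filename': 'test1.js'}, {'filename': 'test2.js'}])
```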
Attachment #8397766 - Flags: feedback?(hskupin)
Attachment #8397766 - Flags: feedback?(daniel.gherasim)
Attachment #8397766 - Flags: feedback?(cosmin.malutan)
Attachment #8397766 - Flags: feedback?(andreea.matei)
Comment on attachment 8397766 [details] [diff] [review]
WIP_completeness_check.patch

Review of attachment 8397766 [details] [diff] [review]:
-----------------------------------------------------------------

It may be an approach, but this will not give us the details we need to further investigate and fix it. What we really have to know is which test module has not been executed. So my proposal would be to wait for an endModule event (not implemented yet, but worked on in another bug) after the module has been done. If we do not get it, we can directly add a failure to the report for the missed module. This would be way more informative.
Attachment #8397766 - Flags: feedback?(hskupin)
Attachment #8397766 - Flags: feedback?(daniel.gherasim)
Attachment #8397766 - Flags: feedback?(cosmin.malutan)
Attachment #8397766 - Flags: feedback?(andreea.matei)
Attachment #8397766 - Flags: feedback-
Blocks: 966235
Blocks: 972814
I hoped to get something today, as I've seen several failures with this issue over the weekend on mm-osx-109-3. I took the machine offline this morning, and on the first testrun, lo and behold, it failed! I plugged in some changes to monitor the testrun. Mainly I have a patch that could potentially show us whether we fail somewhere between calling frame.js and actually running the test, and I logged the initial test list received from parsing the manifest files. With both of these we would be able to correctly determine whether the problem is really in the manifest parser or whether we silently fail in Mozmill. Unfortunately the problem didn't manifest itself again at all; I ran around 20 testruns. The machine is back online. I've left the folder with my tests on the machine and will try again at a later date to see if I can reproduce this.
This bug might be related to bug 974971. Otherwise I'm not sure how this can actually happen. Usually you should see a jsbridge disconnect. Not sure why that isn't the case here.
I ran 50 testruns with the latest Aurora on a Mac OS X 10.6 node with the WIP patch and other logging added by me, but got no failure. We'll continue the investigation.
Blocks: 904610
Attached patch WIP 2 (obsolete) — Splinter Review
Another attempt. We now get _all_ missed tests: we take the initial test list and compare it with the actual results, and from the difference we create failed test results. Here's how it looks: http://mozmill-crowd.blargon7.com/#/functional/report/c69af0ad3436e0d472c1ca8a0614e205

With bug 994040 fixed, this issue is the one affecting our results the most.

==

There's one downside: we can't count tests that don't have a Pass message (we have a few test1 files in restart tests that only set things up and don't do anything). This has an easy fix in setting up a dummy testModule with an assert.pass().
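The approach can be approximated with the sketch below: diff the initial list against the reported results and turn the difference into synthetic failures. The dict layout is only an approximation of Mozmill's report format, not the actual patch:

```python
def build_missed_results(initial_tests, alltests):
    """Create synthetic failure entries for tests that never reported back."""
    collected = set(test['path'] for test in initial_tests)
    executed = set(test['filename'] for test in alltests)

    return [{'filename': path,
             'passes': [],
             'fails': [{'exception': {'message': 'Test has not been run'}}],
             'passed': 0,
             'failed': 1}
            for path in sorted(collected - executed)]

# Example: test1.js was collected but never showed up in the results.
missed = build_missed_results([{'path': 'restartTests/foo/test1.js'}], [])
print(missed[0]['filename'])  # restartTests/foo/test1.js
```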
Attachment #8397766 - Attachment is obsolete: true
Attachment #8435806 - Flags: feedback?(hskupin)
Attached patch WIP 3 (obsolete) — Splinter Review
Some improvements. In the console we get a list:

> TEST-UNEXPECTED-FAIL | Missed tests: ['/var/folders/9l/sn_p3bw914s360j602z20jsc0000gq/T/tmpx0cno3.workspace/mozmill-tests/firefox/tests/functional/restartTests/testSoftwareUpdateAutoProxy/test1.js']

If all were run we get:

> TEST-PASS | Missed tests: []

If there's a missed test we'll get notified in the dashboard as well: http://mozmill-crowd.blargon7.com/#/functional/report/3ed251269efbb25c65b0fc5d9d082d6b
Attachment #8435806 - Attachment is obsolete: true
Attachment #8435806 - Flags: feedback?(hskupin)
Attachment #8438254 - Flags: feedback?(hskupin)
Comment on attachment 8438254 [details] [diff] [review]
WIP 3

Review of attachment 8438254 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozmill/mozmill/__init__.py
@@ +404,2 @@
> while tests:
>     test = tests.pop(0)

While seeing this code here I'm afraid that this pop action might be the reason why we see tests dropped. Something I would like to see here is that we should move to using a current index when iterating through tests. That way we also don't need a separate list like the one you create above.

@@ +589,5 @@
> + # check for consistency of executed tests
> + initial_tests = map(lambda item: item['path'], self.test_list)
> + final_tests = map(
> +     lambda item: item['filename'].encode('ascii'), self.results.alltests)
> + missing_tests = list(set(initial_tests).difference(set(final_tests)))

How do you ensure that multiple tests in the same module have been executed? I don't see that this is checked at the moment.

@@ +592,5 @@
> +     lambda item: item['filename'].encode('ascii'), self.results.alltests)
> + missing_tests = list(set(initial_tests).difference(set(final_tests)))
> +
> + self.results.misses = map(lambda path: {u'passes': [],
> +                                         u'fails': [{'exception': {'stack': '',

This code is complex to read with all the lambda statements. For long ones convert them to closures. Also move the code to its own method, so it doesn't pollute the run method.

@@ +605,5 @@
> +     u'failed': len(missing_tests),
> +     u'passed': len(initial_tests) - len(missing_tests)
> +     }, missing_tests)
> + self.results.alltests.extend(self.results.misses)
> + self.results.fails.extend(self.results.misses)

May be better to do that when we are building up the results, and not extend it afterward.

::: mozmill/mozmill/logger.py
@@ +224,5 @@
> + l = "TEST-PASS" if len(results.misses) is 0 else "TEST-UNEXPECTED-FAIL"
> + level = self.custom_levels[l]
> + self.logger.log(
> +     level, "Missed tests: %s" % map(lambda test: test['filename'],
> +                                     results.misses))

Please use a simple if condition here, similar to the fatal one before. It should just print out what's wrong. The number of missed tests should be printed out together with the passes, failed, and skipped ones below.
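The first review point (index-based iteration instead of popping) amounts to something like this sketch; `run_one` is a placeholder for whatever actually executes a single test, not the real harness code:

```python
def run_all(tests, run_one):
    """Walk the collected test list by index without mutating it.

    Keeping the original list intact means it can later be compared against
    the results to find tests that were silently dropped.
    """
    for index, test in enumerate(tests):
        run_one(test)  # stand-in for the real per-test execution
    return len(tests)

# Example with a stand-in runner:
executed = run_all(['test1.js', 'test2.js', 'test3.js'], lambda test: None)
assert executed == 3
```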
Attachment #8438254 - Flags: feedback?(hskupin) → feedback+
Andrei, it would be great if you could take care of this bug over the next two weeks, so that we might be able to land it when I'm back. If you can get to it by end of this week, even better.
This is still a P1 and the bug which is causing most of our failing tests, at least for beta. I asked for updates a month ago, but until now nothing has happened on this bug. So please make this one of your P1 items for the Mozmill-related goal this quarter. Thanks. This is blocking the 2.1 release of Mozmill.
Blocks: 970594
QA Contact: mozmill-tests → hskupin
Whiteboard: [mozmill-2.1+]
After a 2.5 month hiatus, let's push to get some progress here.
Updated the patch in response to the feedback received and for compatibility with recent Mozmill changes.

(In reply to Henrik Skupin (:whimboo) from comment #28)
> Comment on attachment 8438254 [details] [diff] [review]
> WIP 3
>
> Review of attachment 8438254 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: mozmill/mozmill/__init__.py
> @@ +404,2 @@
> >  while tests:
> >      test = tests.pop(0)
>
> While seeing this code here I'm afraid that this pop action might be the
> reason why we see tests dropped. Something I would like to see here is that
> we should move to using a current index when iterating through tests. That
> way we also don't need a separate list, as what you create above.

I've removed tests.pop in favor of iterating through them (I haven't seen any indication that the pop could be the cause). Nevertheless I agree this is better than pop.

> @@ +589,5 @@
> > + # check for consistency of executed tests
> > + initial_tests = map(lambda item: item['path'], self.test_list)
> > + final_tests = map(
> > +     lambda item: item['filename'].encode('ascii'), self.results.alltests)
> > + missing_tests = list(set(initial_tests).difference(set(final_tests)))
>
> How do you ensure that multiple tests in the same module have been executed?
> I don't see that this is checked at the moment.

This is a valid point, yet I'm not sure how to handle it at the moment. In Mozmill a `test` is a file; the modules are executed by frame, so we'll probably need to make some changes there as well.

> @@ +592,5 @@
> > +     lambda item: item['filename'].encode('ascii'), self.results.alltests)
> > + missing_tests = list(set(initial_tests).difference(set(final_tests)))
> > +
> > + self.results.misses = map(lambda path: {u'passes': [],
> > +                                         u'fails': [{'exception': {'stack': '',
>
> This code is complex to read with all the lambda statements. For long ones
> convert them to closures.

I'm not sure creating methods that are used only once to transform an array into another form would help readability here...

> Also move the code to its own method, so it
> doesn't pollute the run method.

I've moved this to a separate method -- this is called in stop(), not run().

> @@ +605,5 @@
> > +     u'failed': len(missing_tests),
> > +     u'passed': len(initial_tests) - len(missing_tests)
> > +     }, missing_tests)
> > + self.results.alltests.extend(self.results.misses)
> > + self.results.fails.extend(self.results.misses)
>
> May be better to do that when we are building up the results, and not extend
> it afterward.

We can't. We need the testrun to be finished; we compare self.alltests against self.tests to know whether we missed something, and we can't do that during the run.

> ::: mozmill/mozmill/logger.py
> @@ +224,5 @@
> > + l = "TEST-PASS" if len(results.misses) is 0 else "TEST-UNEXPECTED-FAIL"
> > + level = self.custom_levels[l]
> > + self.logger.log(
> > +     level, "Missed tests: %s" % map(lambda test: test['filename'],
> > +                                     results.misses))
>
> Please use a simple if condition here, similar to the fatal one before. It
> should just print out what's wrong. The number of missed tests should be
> printed out together with the passes, failed, and skipped ones below.

Done.

===

Some things:
- I can no longer reproduce the issue locally as I did in the past.
- Again, this won't fix the problem we have, just point it out in a more obvious manner.
- I'll want to add a mutt test. It will need to be a Python test.
- As mentioned above, we might want to check modules as well... not sure how to yet.
- mozmill-tests will need some updates to tests that don't do anything (a simple assert.pass() will be enough) so the test will be correctly reported as being run (and will be shown in the dashboard as well).
Attachment #8438254 - Attachment is obsolete: true
Attachment #8484981 - Flags: feedback?(hskupin)
Some significant changes here: I've mapped out all the places in frame.js where, if something failed, we did not get notified at all (exactly the symptom from this bug). I verified this by injecting a TypeError or ReferenceError into each method. I've enhanced the methods that allowed a silent failure with a try/catch and now issue a fail event. With this change we'll get notified that an error occurred, and we'll be able to pinpoint each failure and fix it.

I've also left in the completeness check, since it will still be good to be notified that we didn't run _all_ tests.

I've thought about mutt tests for the 2 issues changed here:
1) Since we're dealing with a framework failure, I don't see how I could add a test for that (I would need to alter framework code from frame.js).
2) The completeness check is done on all mutt tests, so if one is missing we'll get an error. I feel this should be enough coverage, so no need for a specific test.
Attachment #8484981 - Attachment is obsolete: true
Attachment #8484981 - Flags: feedback?(hskupin)
Attachment #8485699 - Flags: review?(hskupin)
Summary: Mozmill fails to ran all expected tests → Mozmill fails to run all expected tests
There were a lot of failures today because of this issue, mainly on the mm-osx-108-1 and mm-ub-1404-64-4 hosts, in functional testruns for the Firefox 35.0 candidate (build1); it seems that all locales are affected. Another thing I have noticed is that different hosts fail with different failures or in different tests. I haven't noticed this issue on any Windows host; most of the failures were on OS X hosts (mm-osx-106-1, mm-osx-106-2, mm-osx-107-1, mm-osx-107-3, mm-osx-108-1, mm-osx-108-2, mm-osx-108-3) and a few on Ubuntu (mm-ub-1404-64-4, mm-ub-1404-64-2).

Reports:

mm-osx-108-1:
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc780445
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc76c4f5
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc7b3234
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc7bee0c
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc7edb02
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc8400a6
http://mozmill-release.blargon7.com/#/functional/report/04e54d608fc589a28217304fe603dff5

mm-ub-1404-64-4:
http://mozmill-release.blargon7.com/#/functional/report/04e54d608fc589a28217304fe602ec93

mm-osx-107-3:
http://mozmill-release.blargon7.com/#/functional/report/5c7e7a22f9984b46554bc3b9fc7ce4a1

mm-osx-106-2:
http://mozmill-release.blargon7.com/#/functional/report/04e54d608fc589a28217304fe605809f
No longer blocks: 866599
Assignee: andrei → nobody
Status: ASSIGNED → NEW
Comment on attachment 8485699 [details] [diff] [review]
0005-Bug-972912-Catch-previously-unhandled-errors-in-fram.patch

Review of attachment 8485699 [details] [diff] [review]:
-----------------------------------------------------------------

It's clearly an annoying issue, and I can see it happen about 5 times a day across all the daily builds. But we do not have the manpower right now to get this fully investigated and fixed. :( For now I will reset the review flag but leave the bug open, just in case we see a spike and really have to get it fixed.
Attachment #8485699 - Flags: review?(hskupin)
We are not going back to using Mozmill for our UI testing of Firefox, and this hasn't occurred for a while now given that we only run our update tests. So there is nothing we want to do on this bug anymore.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: Testing → Testing Graveyard