Bug 951628
Opened 11 years ago
Closed 11 years ago
When Firefox gets closed via shutdownApplication() we do not wait until runner process has been quit
Categories
(Testing Graveyard :: Mozmill, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: andrei, Assigned: whimboo)
References
Details
(Whiteboard: [mozmill-2.0.3+])
Attachments
(1 file)
patch, 1.59 KB (davehunt: review+, andrei: feedback+)
Mozmill 2.X is failing with a Jsbridge Disconnect Error.
This is intermittent, but we're seeing it very often.
This is what we see on the staging CI server:
> 04:15:39 Traceback (most recent call last):
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 349, in run
> 04:15:39 self.run_tests()
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 573, in run_tests
> 04:15:39 TestRun.run_tests(self)
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 300, in run_tests
> 04:15:39 self._mozmill.run(tests, self.options.restart)
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/mozmill/__init__.py", line 409, in run
> 04:15:39 frame = self.run_test_file(frame or self.start_runner(),
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/mozmill/__init__.py", line 326, in start_runner
> 04:15:39 self.create_network()
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/mozmill/__init__.py", line 287, in create_network
> 04:15:39 self.jsbridge_port)
> 04:15:39 File "/Users/mozilla/mozmill-ci/jenkins-master/jobs/mozilla-central_functional/workspace/mozmill-env-mac/python-lib/jsbridge/__init__.py", line 44, in wait_and_create_network
> 04:15:39 raise Exception("Cannot connect to jsbridge extension, port %s" % port)
> 04:15:39 Exception: Cannot connect to jsbridge extension, port 58833
I have seen this on Windows.
There is always the following message:
> IO Completion Port unexpectedly closed
Then a notification stating:
> "Firefox is already running, but is not responding. To open a new window, you must first close the existing Firefox process, or restart the system."
Afterwards it fails with the jsbridge disconnect error mentioned above.
We've had similar failures before. See bug 865690.
Comment 1 • Assignee • 11 years ago
Andrei, by any chance do you have a minimized testcase? That would help me a lot to get this investigated and fixed.
Comment 2 • Reporter • 11 years ago
Not at the moment.
It looks related to restarts.
I'll check with the testcase I made in bug 872414.
Comment 3 • Assignee • 11 years ago
I will check with staging again, and if it still fails I might take an OS X node offline in the production cluster for a better investigation.
Comment 4 • Assignee • 11 years ago
It might be that we indeed continue too fast in Python when the Firefox process exits. So the next call to run() will produce this.
Comment 5 • Assignee • 11 years ago
So this always happens for restart tests, and specifically for the last test module. I assume we somehow shut down Firefox incorrectly and are running into a timing issue. Interestingly, I'm not able to reproduce this issue on any of our Mac minis in mozmill-ci production. It only happens on master in mozmill-staging. Let's see if I can debug some stuff, because it seems to always fail there.
Comment 6 • Assignee • 11 years ago
Ok, I found the issue here, and it makes total sense. Not sure why we never noticed it before! It has existed since the very early days of Mozmill. So what happens here is:
When we shut down Firefox from within a test via frame.shutdownApplication(), a JSBridgeDisconnectError is thrown on the Python side:
https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/__init__.py#L354
We handle that correctly, but we don't take into account that the application may not have finished closing yet. So we happily continue with the next test and do NOT wait until the current mozrunner process has quit. This is exactly what causes the 'Profile already in use' disconnect we have faced a lot in the past.
So the solution here is to call runner.wait() before we continue. I'm testing a patch right now and will upload it soon.
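To illustrate the idea (this is not the attached patch), here is a minimal Python 3 sketch that uses subprocess in place of mozrunner. DisconnectError, run_test() and run_test_and_wait() are hypothetical names made up for this example; the only point is that a disconnect is followed by a wait on the process before anything else is started.

```python
import subprocess
import sys


class DisconnectError(Exception):
    """Hypothetical stand-in for JSBridgeDisconnectError."""


def run_test(proc):
    """Pretend the test shut the application down, dropping the bridge."""
    raise DisconnectError("bridge connection lost")


def run_test_and_wait(proc, timeout=30):
    """Run one test; on a disconnect, block until the process is gone."""
    try:
        run_test(proc)
    except DisconnectError:
        # The disconnect only tells us the bridge went away, not that the
        # process has exited, so wait for it before the next launch.
        proc.wait(timeout=timeout)
    return proc.returncode


if __name__ == "__main__":
    # A short-lived child process plays the role of the application.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(0.5)"]
    )
    print("exit code:", run_test_and_wait(child))
```

Without the wait() call, the caller could start the next process while the old one still holds the profile.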
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Hardware: x86 → All
Summary: Testrun fails with Disconnect Error with Firefox already running notification → When Firefox gets closed via shutdownApplication() we do not wait until runner process has been quit
Whiteboard: [mozmill-2.0.3+]
Comment 7 • Assignee • 11 years ago
With this patch I do not see this disconnect anymore, given that we now wait for the process to shut down. Andrei, please test on those machines where you have seen it.
Attachment #8349975 - Flags: review?(dave.hunt)
Attachment #8349975 - Flags: feedback?(andrei.eftimie)
Updated • 11 years ago
Attachment #8349975 - Flags: review?(dave.hunt) → review+
Comment 8 • Reporter • 11 years ago
Comment on attachment 8349975
Patch v1
Review of attachment 8349975:
-----------------------------------------------------------------
Works fine for me.
I can now complete a testrun without disconnects:
http://mozmill-crowd.blargon7.com/#/functional/report/b646dc9797659302414b7b8a9d11e710
The fix I initially uploaded for bug 950003 now introduces another failure:
http://mozmill-crowd.blargon7.com/#/functional/report/b646dc9797659302414b7b8a9d12dd67
But I highly suspect that's another problem.
We don't properly handle a dialog window here.
I remember seeing this dialog window remain open, but it would eventually disappear when we restarted Firefox later on.
It's possible that now that we wait for the process to finish, we end up with a timeout because of the unhandled window.
I'll raise another bug for this issue.
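For reference, a bounded wait could look roughly like the Python 3 sketch below. wait_or_kill() is a hypothetical helper built on subprocess, not something Mozmill or mozrunner provides. It only shows how a process that never exits (for example because a dialog keeps it alive) turns into a timeout instead of a hang.

```python
import subprocess
import sys


def wait_or_kill(proc, timeout):
    """Wait up to `timeout` seconds; kill the process if it is still alive.

    Returns True if the process exited on its own, False if it was killed.
    """
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the killed process
        return False


if __name__ == "__main__":
    # This child would outlive the timeout, like an app stuck on a dialog.
    stuck = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    print("exited cleanly:", wait_or_kill(stuck, timeout=0.5))
```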
Attachment #8349975 - Flags: feedback?(andrei.eftimie) → feedback+
Comment 9 • Assignee • 11 years ago
Landed as:
https://github.com/mozilla/mozmill/commit/327c7a42d3f18750f4ea1ce0de9a0788df2e8bc3 (master)
https://github.com/mozilla/mozmill/commit/a898570bf83a8e0dd9b3998be83003e8733b726a (hotfix-2.0)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated • 8 years ago
Product: Testing → Testing Graveyard