Closed
Bug 1290372
Opened 8 years ago
Closed 8 years ago
wait_for_port() doesn't take process status of application into account (eg. return early if process does not exist)
Categories
(Remote Protocol :: Marionette, defect)
Tracking
(firefox52 fixed)
RESOLVED
FIXED
mozilla52
Tracking | Status | |
---|---|---|
firefox52 | --- | fixed |
People
(Reporter: whimboo, Assigned: whimboo)
References
Details
Attachments
(1 file)
Right now when we call wait_for_port() we wait until the socket timeout, and trying constantly to create a socket connection. We could drastically reduce the duration of that check if the status of the process would be taken into account. It means if the returncode is not None, the process has been shutdown or died. In such a case it doesn't make sense to wait longer for the socket server to be started. I will first check how to improve the socket timeout situation over on bug 1284457 before getting to this bug.
Assignee | ||
Updated•8 years ago
|
Summary: wait_for_port() doesn't take process status of application into account (return early of process does not exist) → wait_for_port() doesn't take process status of application into account (eg. return early if process does not exist)
Assignee | ||
Comment 1•8 years ago
|
||
The code I'm referring to is here: https://dxr.mozilla.org/mozilla-central/rev/506facea63169a29e04eb140663da1730052db64/testing/marionette/client/marionette_driver/marionette.py#645 As far as I can see the code always gets called for an already running or freshly started instance. So in case of a startup crash we definitely would run into a timeout, and waste about 120s of our time for nothing. Maybe we should get the loop out of transport.py and implement it in wait_for_port(). We can then rename the method in transport.py to something like check_for_port() which returns immediately. With that change wait_for_port() could also check for the process status (returncode) because the Marionette class has the runner instance. Andreas, do you agree with that? If yes, I would like to make it a mentored bug. Thanks.
Flags: needinfo?(ato)
Comment 2•8 years ago
|
||
The return code is only something I believe we can check for Firefox, as it isn’t really a thing for Fennec (or was for B2G). transport.wait_for_port is as far as I can tell only used in marionette.py, i.e. there’s no internal use in transport.py of it, so I do have some sympathy towards making what is essentially a “has Gecko started up yet and spun up the Marionette server, and can we speak to it” smarter. So I have no problems with detaching transport.wait_for_port from transport, or making it look at more things than just socket connection. But I need clarification if this will work for mobile on device and emulator (maybe it’s fine to skip those checks there). Are there more things we should look at?
Flags: needinfo?(ato)
Assignee | ||
Comment 3•8 years ago
|
||
I do not see a difference between Firefox and Fennec. Both are applications we launch and should have the returncode for when the process ends. So at any time we know if the process is running or not. Even in case of B2G the returncode might have been always None (running), so it shouldn't harm us. I don't see anything else we should check for at this point. I just want to make sure that we do not run into a timeout only because the process is not alive. The time we spent here can be better used for other jobs. Thanks for your quick reply!
Mentor: hskupin
Whiteboard: [lang=py]
Comment 4•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #3) > I do not see a difference between Firefox and Fennec. Both are applications > we launch and should have the returncode for when the process ends. So at > any time we know if the process is running or not. The “process” for Fennec is of the Android emulator, right? I didn’t think the exit code of the emulator corresponded to the exit code of the Fennec Android app.
Assignee | ||
Comment 5•8 years ago
|
||
What we care about here is the process status or exit code of Fennec only. That is the application Marionette works with.
You should be able to treat Fennec and Desktop instances the same way in Marionette. When dealing with Fennec in Marionette, self.instance.runner.returncode and .is_running refer to the remote process (Fennec on the emulator/device) and self.instance.runner.device refers to the Emulator process.
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
Assignee | ||
Updated•8 years ago
|
Attachment #8793728 -
Flags: review?(ato)
Comment 8•8 years ago
|
||
mozreview-review |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 ::: testing/marionette/client/marionette_driver/marionette.py:646 (Diff revision 1) > except socket.error: > return False > finally: > s.close() > > - def wait_for_port(self, timeout=None): > + def wait_for_port(self, host=None, port=None, timeout=None): I can’t see that this is ever called with the `host` or `port` arguments? Also I don’t think we should make this public. ::: testing/marionette/client/marionette_driver/marionette.py:647 (Diff revision 1) > return False > finally: > s.close() > > - def wait_for_port(self, timeout=None): > + def wait_for_port(self, host=None, port=None, timeout=None): > + """Wait for the specified host/port to become available.""" This documentation is wrong now that it picks up `self.host` and `self.port`. ::: testing/marionette/client/marionette_driver/marionette.py:648 (Diff revision 1) > + host = host or self.host > + port = port or self.port I’m not a fan of this `or` notation. If the first variable is falsy, this will select the second variable. So for example with `wait_for_port(port=0)`, 0 will evaluate to false and it will pick `self.port` instead of the passed port. Instead we need to do: ```python if host is None: host = self.host ``` ::: testing/marionette/client/marionette_driver/marionette.py:653 (Diff revision 1) > + port = port or self.port > timeout = timeout or self.DEFAULT_STARTUP_TIMEOUT > - return transport.wait_for_port(self.host, self.port, timeout=timeout) > > + poll_interval = 0.1 > + runner = self.instance.runner if self.instance else None `if self.instance is not None` ::: testing/marionette/client/marionette_driver/marionette.py:658 (Diff revision 1) > + runner = self.instance.runner if self.instance else None > + starttime = datetime.datetime.now() > + > + while datetime.datetime.now() - starttime < datetime.timedelta(seconds=timeout): > + # If the instance we want to connect to is not running return immediately > + if runner and not runner.is_running(): `if runner is not None` ::: testing/marionette/client/marionette_driver/marionette.py:672 (Diff revision 1) > + if ":" in data: > + return True > + except socket.error: > + pass > + finally: > + if sock: This if-condition is unnecessary. `sock` is always assigned. ::: testing/marionette/client/marionette_driver/marionette.py:673 (Diff revision 1) > + try: > + sock.shutdown(socket.SHUT_RDWR) > + except: > + pass Will this ever throw?
Comment 9•8 years ago
|
||
mozreview-review |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80356
Attachment #8793728 -
Flags: review?(ato) → review+
Assignee | ||
Comment 10•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > I can’t see that this is ever called with the `host` or `port` arguments? > > Also I don’t think we should make this public. The method was already public before my changes. So it would mean we change the API. I'm happy to do so in case it is wanted, but would have to check for side-effects. Please let me know. Regarding the options I will remove them. Now that the code is in the same file, we don't need both options anymore. > This if-condition is unnecessary. `sock` is always assigned. The if condition was already present in the old code. So I kept it. Keep in mind that we can have it as `None` here if the call to `sock = socket.socket()` fails above in the try block. So I will leave it in but make it explicitly a `None` check again. > Will this ever throw? Yes, you cannot call `sock.shutdown()` twice on the same socket. I don't see why it should happen here, but I would keep it for safety. Just in case something else shutdown the port.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 12•8 years ago
|
||
Andreas, please let me know if you like the rename of the methods. Also I would like to get the feedback for the remaining open questions. Thanks.
Flags: needinfo?(ato)
Assignee | ||
Updated•8 years ago
|
Mentor: hskupin
Whiteboard: [lang=py]
Comment 13•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > The method was already public before my changes. So it would mean we change the API. I'm happy to do so in case it is wanted, but would have to check for side-effects. Please let me know. > > Regarding the options I will remove them. Now that the code is in the same file, we don't need both options anymore. We are already changing the API. > The if condition was already present in the old code. So I kept it. Keep in mind that we can have it as `None` here if the call to `sock = socket.socket()` fails above in the try block. So I will leave it in but make it explicitly a `None` check again. A ctor in the standard library is pretty much guaranteed to never have adverse side effects FWIW.
Comment 14•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #12) > Andreas, please let me know if you like the rename of the methods. Also I > would like to get the feedback for the remaining open questions. Thanks. I think we should unpublish it and by prepending a _. I consider this to be an internal method. Other than that, I think I’ve addressed the remaining open questions.
Flags: needinfo?(ato)
Assignee | ||
Comment 15•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > A ctor in the standard library is pretty much guaranteed to never have adverse side effects FWIW. Well, what should happen if an exception is thrown in the socket's __init__ method? Then sock will still be None and we will fail with the if check in the finally block.
Assignee | ||
Comment 16•8 years ago
|
||
(In reply to Andreas Tolfsen ‹:ato› from comment #14) > I think we should unpublish it and by prepending a _. I consider this to be > an internal method. That was actually already done with the last uploaded version of the patch. :) So once we are on agreement for the last remaining issue, we can get this patch landed. Thanks.
Comment 17•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > Well, what should happen if an exception is thrown in the socket's __init__ method? Then sock will still be None and we will fail with the if check in the finally block. This is not supposed to happen. > Yes, you cannot call `sock.shutdown()` twice on the same socket. I don't see why it should happen here, but I would keep it for safety. Just in case something else shutdown the port. Okay, but I guess there is a pretty remote chance of this happening since `sock` isn’t exposed.
Comment 18•8 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #16) > So once we are on agreement for the last remaining issue, we can get this > patch landed. Thanks. I don’t think there is any disagreement. I already gave the patch an r+.
Assignee | ||
Comment 19•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > Okay, but I guess there is a pretty remote chance of this happening since `sock` isn’t exposed. I checked this more and I tend to agree now. It's also that this socket only exists for a split of a second, and no client can connect to it. So lets remove this code.
Comment hidden (mozreview-request) |
Comment 21•8 years ago
|
||
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/198c4bf0c8df _wait_for_connection() has to return early if instance is not running. r=ato
Comment 22•8 years ago
|
||
sorry had to back this out for failures like https://treeherder.mozilla.org/logviewer.html#?job_id=5150943&repo=autoland
Flags: needinfo?(hskupin)
Comment 23•8 years ago
|
||
Backout by cbook@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a7fc516956bc Backed out changeset 198c4bf0c8df for failing wpt tests
Assignee | ||
Comment 24•8 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #22) > sorry had to back this out for failures like > https://treeherder.mozilla.org/logviewer.html#?job_id=5150943&repo=autoland It looks like maybe even other harnesses could make use of the wait_for_port() method. Making it private might not have been a good choice. So I will get this reverted to keep API compatibility. At the same time I will also rename the method back to its original name.
Flags: needinfo?(hskupin)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 26•8 years ago
|
||
I triggered another try build with wp-tests included: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7bfddb301448
Assignee | ||
Comment 27•8 years ago
|
||
Andreas, the latest changes look better now for wp-tests. I'm also more confident with the current state as before. Can you please have another look at it? Sadly I cannot re-request r? from you.
Flags: needinfo?(ato)
Comment 29•8 years ago
|
||
Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/dabadf63cb44 wait_for_port() has to return early if instance is not running. r=ato
Comment 30•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/dabadf63cb44
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Updated•1 year ago
|
Product: Testing → Remote Protocol
You need to log in
before you can comment on or make changes to this bug.
Description
•