Closed Bug 1290372 Opened 9 years ago Closed 9 years ago

wait_for_port() doesn't take process status of application into account (eg. return early if process does not exist)

Categories

(Remote Protocol :: Marionette, defect)

Product:

Component:

Version:

Version 3

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

(firefox52 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla52

Tracking Flags:

Tracking

Status

firefox52

---

fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

Attachments

(1 file)

Bug 1290372 - wait_for_port() has to return early if instance is not running. 9 years ago Henrik Skupin [:whimboo][⌚️UTC+2] 58 bytes, text/x-review-board-request	ato : review+	Details

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Description

•

9 years ago

Right now when we call wait_for_port() we wait until the socket timeout, and trying constantly to create a socket connection. We could drastically reduce the duration of that check if the status of the process would be taken into account. It means if the returncode is not None, the process has been shutdown or died. In such a case it doesn't make sense to wait longer for the socket server to be started. I will first check how to improve the socket timeout situation over on bug 1284457 before getting to this bug.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

9 years ago

Summary: wait_for_port() doesn't take process status of application into account (return early of process does not exist) → wait_for_port() doesn't take process status of application into account (eg. return early if process does not exist)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 1

•

9 years ago

The code I'm referring to is here: https://dxr.mozilla.org/mozilla-central/rev/506facea63169a29e04eb140663da1730052db64/testing/marionette/client/marionette_driver/marionette.py#645 As far as I can see the code always gets called for an already running or freshly started instance. So in case of a startup crash we definitely would run into a timeout, and waste about 120s of our time for nothing. Maybe we should get the loop out of transport.py and implement it in wait_for_port(). We can then rename the method in transport.py to something like check_for_port() which returns immediately. With that change wait_for_port() could also check for the process status (returncode) because the Marionette class has the runner instance. Andreas, do you agree with that? If yes, I would like to make it a mentored bug. Thanks.

Flags: needinfo?(ato)

Andreas Tolfsen ❲:ato❳

Comment 2

•

9 years ago

The return code is only something I believe we can check for Firefox, as it isn’t really a thing for Fennec (or was for B2G). transport.wait_for_port is as far as I can tell only used in marionette.py, i.e. there’s no internal use in transport.py of it, so I do have some sympathy towards making what is essentially a “has Gecko started up yet and spun up the Marionette server, and can we speak to it” smarter. So I have no problems with detaching transport.wait_for_port from transport, or making it look at more things than just socket connection. But I need clarification if this will work for mobile on device and emulator (maybe it’s fine to skip those checks there). Are there more things we should look at?

Flags: needinfo?(ato)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 3

•

9 years ago

I do not see a difference between Firefox and Fennec. Both are applications we launch and should have the returncode for when the process ends. So at any time we know if the process is running or not. Even in case of B2G the returncode might have been always None (running), so it shouldn't harm us. I don't see anything else we should check for at this point. I just want to make sure that we do not run into a timeout only because the process is not alive. The time we spent here can be better used for other jobs. Thanks for your quick reply!

Mentor: hskupin

Whiteboard: [lang=py]

Andreas Tolfsen ❲:ato❳

Comment 4

•

9 years ago

(In reply to Henrik Skupin (:whimboo) from comment #3) > I do not see a difference between Firefox and Fennec. Both are applications > we launch and should have the returncode for when the process ends. So at > any time we know if the process is running or not. The “process” for Fennec is of the Android emulator, right? I didn’t think the exit code of the emulator corresponded to the exit code of the Fennec Android app.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 5

•

9 years ago

What we care about here is the process status or exit code of Fennec only. That is the application Marionette works with.

Maja Frydrychowicz :impossibus (was :maja_zf) (needinfo me)

Comment 6

•

9 years ago

You should be able to treat Fennec and Desktop instances the same way in Marionette. When dealing with Fennec in Marionette, self.instance.runner.returncode and .is_running refer to the remote process (Fennec on the emulator/device) and self.instance.runner.device refers to the Emulator process.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

9 years ago

Depends on: 1301661

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

9 years ago

Blocks: 1283906

Comment hidden (mozreview-request)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

9 years ago

Assignee: nobody → hskupin

Status: NEW → ASSIGNED

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

9 years ago

Attachment #8793728 - Flags: review?(ato)

Andreas Tolfsen ❲:ato❳

Comment 8

•

9 years ago

mozreview-review

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 ::: testing/marionette/client/marionette_driver/marionette.py:646 (Diff revision 1) > except socket.error: > return False > finally: > s.close() > > - def wait_for_port(self, timeout=None): > + def wait_for_port(self, host=None, port=None, timeout=None): I can’t see that this is ever called with the `host` or `port` arguments? Also I don’t think we should make this public. ::: testing/marionette/client/marionette_driver/marionette.py:647 (Diff revision 1) > return False > finally: > s.close() > > - def wait_for_port(self, timeout=None): > + def wait_for_port(self, host=None, port=None, timeout=None): > + """Wait for the specified host/port to become available.""" This documentation is wrong now that it picks up `self.host` and `self.port`. ::: testing/marionette/client/marionette_driver/marionette.py:648 (Diff revision 1) > + host = host or self.host > + port = port or self.port I’m not a fan of this `or` notation. If the first variable is falsy, this will select the second variable. So for example with `wait_for_port(port=0)`, 0 will evaluate to false and it will pick `self.port` instead of the passed port. Instead we need to do: ```python if host is None: host = self.host ``` ::: testing/marionette/client/marionette_driver/marionette.py:653 (Diff revision 1) > + port = port or self.port > timeout = timeout or self.DEFAULT_STARTUP_TIMEOUT > - return transport.wait_for_port(self.host, self.port, timeout=timeout) > > + poll_interval = 0.1 > + runner = self.instance.runner if self.instance else None `if self.instance is not None` ::: testing/marionette/client/marionette_driver/marionette.py:658 (Diff revision 1) > + runner = self.instance.runner if self.instance else None > + starttime = datetime.datetime.now() > + > + while datetime.datetime.now() - starttime < datetime.timedelta(seconds=timeout): > + # If the instance we want to connect to is not running return immediately > + if runner and not runner.is_running(): `if runner is not None` ::: testing/marionette/client/marionette_driver/marionette.py:672 (Diff revision 1) > + if ":" in data: > + return True > + except socket.error: > + pass > + finally: > + if sock: This if-condition is unnecessary. `sock` is always assigned. ::: testing/marionette/client/marionette_driver/marionette.py:673 (Diff revision 1) > + try: > + sock.shutdown(socket.SHUT_RDWR) > + except: > + pass Will this ever throw?

Andreas Tolfsen ❲:ato❳

Comment 9

•

9 years ago

mozreview-review

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80356

Attachment #8793728 - Flags: review?(ato) → review+

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 10

•

9 years ago

mozreview-review-reply

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > I can’t see that this is ever called with the `host` or `port` arguments? > > Also I don’t think we should make this public. The method was already public before my changes. So it would mean we change the API. I'm happy to do so in case it is wanted, but would have to check for side-effects. Please let me know. Regarding the options I will remove them. Now that the code is in the same file, we don't need both options anymore. > This if-condition is unnecessary. `sock` is always assigned. The if condition was already present in the old code. So I kept it. Keep in mind that we can have it as `None` here if the call to `sock = socket.socket()` fails above in the try block. So I will leave it in but make it explicitly a `None` check again. > Will this ever throw? Yes, you cannot call `sock.shutdown()` twice on the same socket. I don't see why it should happen here, but I would keep it for safety. Just in case something else shutdown the port.

Comment hidden (mozreview-request)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 12

•

9 years ago

Andreas, please let me know if you like the rename of the methods. Also I would like to get the feedback for the remaining open questions. Thanks.

Flags: needinfo?(ato)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Updated

•

9 years ago

Mentor: hskupin

Whiteboard: [lang=py]

Andreas Tolfsen ❲:ato❳

Comment 13

•

9 years ago

mozreview-review-reply

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > The method was already public before my changes. So it would mean we change the API. I'm happy to do so in case it is wanted, but would have to check for side-effects. Please let me know. > > Regarding the options I will remove them. Now that the code is in the same file, we don't need both options anymore. We are already changing the API. > The if condition was already present in the old code. So I kept it. Keep in mind that we can have it as `None` here if the call to `sock = socket.socket()` fails above in the try block. So I will leave it in but make it explicitly a `None` check again. A ctor in the standard library is pretty much guaranteed to never have adverse side effects FWIW.

Andreas Tolfsen ❲:ato❳

Comment 14

•

9 years ago

(In reply to Henrik Skupin (:whimboo) from comment #12) > Andreas, please let me know if you like the rename of the methods. Also I > would like to get the feedback for the remaining open questions. Thanks. I think we should unpublish it and by prepending a _. I consider this to be an internal method. Other than that, I think I’ve addressed the remaining open questions.

Flags: needinfo?(ato)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 15

•

9 years ago

mozreview-review-reply

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > A ctor in the standard library is pretty much guaranteed to never have adverse side effects FWIW. Well, what should happen if an exception is thrown in the socket's __init__ method? Then sock will still be None and we will fail with the if check in the finally block.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 16

•

9 years ago

(In reply to Andreas Tolfsen ‹:ato› from comment #14) > I think we should unpublish it and by prepending a _. I consider this to be > an internal method. That was actually already done with the last uploaded version of the patch. :) So once we are on agreement for the last remaining issue, we can get this patch landed. Thanks.

Andreas Tolfsen ❲:ato❳

Comment 17

•

9 years ago

mozreview-review-reply

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > Well, what should happen if an exception is thrown in the socket's __init__ method? Then sock will still be None and we will fail with the if check in the finally block. This is not supposed to happen. > Yes, you cannot call `sock.shutdown()` twice on the same socket. I don't see why it should happen here, but I would keep it for safety. Just in case something else shutdown the port. Okay, but I guess there is a pretty remote chance of this happening since `sock` isn’t exposed.

Andreas Tolfsen ❲:ato❳

Comment 18

•

9 years ago

(In reply to Henrik Skupin (:whimboo) from comment #16) > So once we are on agreement for the last remaining issue, we can get this > patch landed. Thanks. I don’t think there is any disagreement. I already gave the patch an r+.

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 19

•

9 years ago

mozreview-review-reply

Comment on attachment 8793728 [details] Bug 1290372 - wait_for_port() has to return early if instance is not running. https://reviewboard.mozilla.org/r/80410/#review80232 > Okay, but I guess there is a pretty remote chance of this happening since `sock` isn’t exposed. I checked this more and I tend to agree now. It's also that this socket only exists for a split of a second, and no client can connect to it. So lets remove this code.

Comment hidden (mozreview-request)

Comment 21

•

9 years ago

Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/198c4bf0c8df _wait_for_connection() has to return early if instance is not running. r=ato

Carsten Book [:Tomcat]

Comment 22

•

9 years ago

sorry had to back this out for failures like https://treeherder.mozilla.org/logviewer.html#?job_id=5150943&repo=autoland

Flags: needinfo?(hskupin)

Comment 23

•

9 years ago

Backout by cbook@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a7fc516956bc Backed out changeset 198c4bf0c8df for failing wpt tests

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 24

•

9 years ago

(In reply to Carsten Book [:Tomcat] from comment #22) > sorry had to back this out for failures like > https://treeherder.mozilla.org/logviewer.html#?job_id=5150943&repo=autoland It looks like maybe even other harnesses could make use of the wait_for_port() method. Making it private might not have been a good choice. So I will get this reverted to keep API compatibility. At the same time I will also rename the method back to its original name.

Flags: needinfo?(hskupin)

Comment hidden (mozreview-request)

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 26

•

9 years ago

I triggered another try build with wp-tests included: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7bfddb301448

Henrik Skupin [:whimboo][⌚️UTC+2]

Assignee

Comment 27

•

9 years ago

Andreas, the latest changes look better now for wp-tests. I'm also more confident with the current state as before. Can you please have another look at it? Sadly I cannot re-request r? from you.

Flags: needinfo?(ato)

Andreas Tolfsen ❲:ato❳

Comment 28

•

9 years ago

r+

Flags: needinfo?(ato)

Comment 29

•

9 years ago

Pushed by hskupin@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/dabadf63cb44 wait_for_port() has to return early if instance is not running. r=ato

Carsten Book [:Tomcat]

Comment 30

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/dabadf63cb44

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla52

Updated

•

2 years ago

Product: Testing → Remote Protocol

You need to log in before you can comment on or make changes to this bug.