Closed Bug 1025897 Opened 10 years ago Closed 10 years ago

Get a free port in moznetwork

Categories

(Testing :: Mozbase, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: ato, Assigned: ato)

Details

Attachments

(1 file)

We should consider uplifting one of the many functions for getting a free port on the local system to moznetwork.

There already exists many implementations of such functions and it would be good to have one canonical function to refer to.  My suggestion is to uplift get_free_port from wptrunner.  I think it's a fine starting point.
So where is this code you are referring to? Does it release the socket after acquiring a free port, or does it keep the socket open and pass it back to the caller? If it frees it up, this can cause a race condition as Ted pointed out last week.
It does cause a race condition, but that's unavoidable. Most servers won't let you pass a socket object in, only a port number. So at some point you have to close the socket that you probed and reopen a socket with the same port number in the server.

The only alternative option is to always use an OS-provided free port i.e. "port 0". But there are significant tradeoffs there too e.g. you can't be sure that the port will be open in any external firewall and the UX is very poor (you have to learn a new port number for each invocation of the program, as does your browser location bar). The get_free_port in wptrunner tries to use ports in a range, so it's rather predictable what will be chosen and these problems are avoidable.

Basically the tradeoff seems to be a problem in rare racy cases (which you could deal with if you were careful by catching the exception and retrying with a different port number) or a problem in every single case. Given that, I think the race is OK.
I have to say I'm pretty sceptical about this, we've been getting along pretty well just using the trick that :whimboo mentioned for a long time. Firewalls have never really been an issue for us and I don't really understand why copypasting a randomly generated url/port combination is so difficult.

Maybe if I heard a really compelling use case / story I could be convinced, but this seems to fall under "when in doubt, leave it out" to me.
Indeed. I'd rather promote best practices than put footguns in mozbase.
Assignee: nobody → ato
Status: NEW → ASSIGNED
Attachment #8440734 - Flags: review?(james)
Attachment #8440734 - Flags: review?(james) → feedback?(james)
Comment on attachment 8440734 [details] [diff] [review]
0001-Bug-1025897-Add-moznetwork.get_free_port.-r-jgraham.patch

Review of attachment 8440734 [details] [diff] [review]:
-----------------------------------------------------------------

I suggest getting someone else to review this since there appears to be skepticism about the general idea. FWIW I think it's a rather low risk of races, but the cases I really care about it makes more sense to require a specific port and just fail if that is already in use (which is exactly what happens if this races, which is why I can't get too worried about that. Indeed it seems to me that every time you write a program that doesn't pick the port in the same operation as binding the socket there is some risk of race conditions, but no one complains about programs that take a fixed port number on the command line).
Attachment #8440734 - Flags: feedback?(james)
I attached a proof of concept patch that shows what we're thinking.

My argument is that the convenience of providing a canonical implementation for this in moznetwork outweighs the case of a rare race condition which can be managed by the normal cautiousness one should exhibit when doing low-level socket programming.

Since there are real use cases for this in many different programs we write it would be beneficial for them to have a single implementation to refer to, instead of us somewhat randomly patching different programs when we're made aware of further shortcomings.

The proposed approach is to let the start_port default to 0 but to allow the user to begin probing for free ports from a given integer, optionally excluding a set of ports.
I don't think we should do this, full stop. I think programs should either pass port zero down to the OS to reliably get a free port (c.f. mozhttpd) or use a hardcoded port (and allow commandline overrides) if that's important for some reason.

(In reply to James Graham [:jgraham] from comment #6)
> the port in the same operation as binding the socket there is some risk of
> race conditions, but no one complains about programs that take a fixed port
> number on the command line).

I actually see people complain about this not-infrequently because we as programmers are terrible about choosing port numbers and tend to wind up using 8000/8080 for lots of things, so it's very easy to wind up with those already in use.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #8)

> I actually see people complain about this not-infrequently because we as
> programmers are terrible about choosing port numbers and tend to wind up
> using 8000/8080 for lots of things, so it's very easy to wind up with those
> already in use.

But using a port that was already in use when the program started isn't a race condition. This code would in fact prevent that kind of bug and still give you a reasonably memorable port like 8001 or 8081, rather than something unmemorable like 18673.

It would only be a race condition if you verified before starting the program that port 8000 was free, specified that port on the command line, and then, before the port was assigned by that program, another program came along and assigned itself port 8000. Which is possible, but not really something that people worry about. So I'm not sure why it's more worrying in this case.
(In reply to Andreas Tolfsen (:ato) from comment #7)
> My argument is that the convenience of providing a canonical implementation
> for this in moznetwork outweighs the case of a rare race condition which can
> be managed by the normal cautiousness one should exhibit when doing
> low-level socket programming.

Except if you provide a function like this, you have to know to handle the exception case, and put in the code to handle it correctly each time. I trust you and James to do this, but I don't know who's going to be writing new automation code in the future. I'd rather just keep things simple and avoid giving people tools to shoot themselves in the foot. In a sense I *like* the fact that we don't have a method for this functionality in mozbase, as it often provides an opportunity to teach people about robustness and correctness when they ask where it is.

I can see how this is an argument that can go either way and perhaps if I had a slightly different set of experiences I would take your side. But given that this has come up before and we came to the current decision (i.e. not to include something like this) and we're not seeing any really new information or data which directly contradicts our original reasoning (just various arguments about "ease of use" that could IMO go either way), I'm very much inclined to just leave things as they are.
I still think the robustness argument is being oversold; exactly the same argument would lead to you reject most uses of os.path.exists, for example. Inspection of the mozbase code suggests that isn't policy.

Having said that, I think that for most cases where a random port is bad, using a prespecified port, and dealing with any conflict by failing, is the best option. And when the approach in this patch is the best option, it's not very hard to roll your own. Which turns out to be what people do. So I don't mind this not going in mozbase, but I don't expect it to have much effect on what people actually do.
(In reply to James Graham [:jgraham] from comment #11)
> I still think the robustness argument is being oversold; exactly the same
> argument would lead to you reject most uses of os.path.exists, for example.
> Inspection of the mozbase code suggests that isn't policy.

Well, we certainly would reject a hand-rolled "CreateTempFile" implementation that didn't give you the same race protection as tempfile.NamedTemporaryFile.

If we see people using this anti-pattern in code we ought to be calling it out in review and rejecting it. There *is* a correct solution (ask the OS to give you the free port when binding the socket), so we should use it. If we put this code in mozbase then we explicitly endorse doing things in a bad way, which is a problem.

If people are writing code doing this, we should instead fix the code to do the right thing, so others aren't copying their mistakes.
So I have examined the case where I actually use this — which I should have done earlier in the thread — and it's unclear to me what I should do differently given the constraints.

I need to start Firefox and connect to it over marionette. Firefox is the server in this case, and it has the port set in the profile. So I need to write a profile file with some port set, start Firefox, and then initiate a marionette connection with that same port. The only way to avoid this would be to have firefox pick a random port and then have it signal which port it picked, which afaict it doesn't do. I guess I could patch Firefox to write the marionette port to stdout or something in this case. But exactly the same problem occurs whenever you want to interact with any third party software over a socket and there is only a port input but no output.

As discussed at length here this approach doesn't eliminate the possibility that the external program will fail to bind to the specified port. But it makes it much less likely to fail, and the error handling you have to write is basically the same in either case. So unless there is some way of dealing with this I'm not seeing the choice is basically "make failure relatively likely" or "make failure relatively unlikely". Certainly one could argue that the first is more will motivate people to handle failure well, but the evidence is that it can also motivate them to just make the failure relatively unlikely.
(In reply to James Graham [:jgraham] from comment #13)
> I need to start Firefox and connect to it over marionette. Firefox is the
> server in this case, and it has the port set in the profile. So I need to
> write a profile file with some port set, start Firefox, and then initiate a
> marionette connection with that same port. The only way to avoid this would

Well, whatever option we finally choose I don't see how your approach makes this situation different. In both ways we try to find a free port on the Python side. Each of that can fail. So to be 100% sure, the Marionette code in Firefox itself would have to choose the port, and signal it back as you said in your last comment. So the example is not that well chosen. But I agree with other code like os.path.exist() in mozfile. Ideally you should not use it but really enclose the code in try/raise. File can be gone at any time, so that the next remove() action will also fail.
There seems to be a fairly strong consensus against putting something like this in moznetwork.  For that reason I'm resolving this request as WONTFIX.

In principle I agree with what's being said about delegating the random port selection to the OS at the time when the program needs to use the socket.  In practice it's not uncommon you have a client that needs to tell a server to open a socket it can connect on.  Using a predefined or hardcoded port certainly addresses that, but in a situation where you have multiple servers running in parallel on the same system, it doesn't give you a solution on how to mitigate race conditions.

A central theme here is error handling.  I'd like to highlight the point made earlier about the best option being to use prespecified/hardcoded ports and dealing with any conflict by failing:  This is quite often not a practical solution because of the overhead the server might have in setup- and teardown time (which is significant in the case of Firefox).

It leads to people trying other solutions, such as the brute force method of first probing for a free port in the client and passing that along to the server.  Which in the common case has minimal risk of a race condition, which turns out is why people do it.  If it does fail, it forces you to have to do the same error handling you would need in the first place of the server failing to bind to a predefined port.  And thus you have guards and retries and probing all over the place, both in the client and the server.

So I wish we had a better solution in place for situations similar to the use case of Marionette that jgraham mentions.  It's unclear to me what sort of infrastructure we need to put in place to have the server binding to a random port signal back to the client which port to connect on.

Writing the port to stdout and having the client parse that seems potentially very error prone and cumbersome for the client to implement safely.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
If that's the specific problem in need of a solution then we should figure out how to build a solution to that problem. Off the top of my head I could imagine opening a socket on a local free port, passing that down to the child process, and having the child send back its listen socket back.

We could also tweak Marionette to allow it to be used in a call-back manner like that, where it connects to an existing listen socket on startup instead of listening on its own socket. That would eliminate a whole class of problems we hit trying to figure out if Marionette is ready for a connection. That seems like it would be a generally better architecture.
I put together a proof-of-concept implementation in Marionette in bug 1027022.
That sounds great, although there are non-marionette cases where one still has to interact with a program that wants to take a port number and start a server on that port. For example wptrunner does this with pywebsocket and chromedriver (when testing Chrome). These are third party programs that we can't rearchitect like we can for marionette. So whilst I agree that the proposed change to marionette is a huge win, it doesn't solve all the use cases here.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: