Closed Bug 895542 Opened 6 years ago Closed 6 years ago

httpd.js / nsServerSocket automatically select client inaccessible port

Categories

(Testing :: General, defect)

defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla25

People

(Reporter: gps, Assigned: gps)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

In bug 884421 I modified Sync's unit tests to have httpd.js select a port automatically. Things initially appeared to work. However, some intermittent failures were occurring on our Windows XP testers:

https://tbpl.mozilla.org/php/getParsedLog.php?id=25442339&branch=mozilla-inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=25439034&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=25443653&branch=mozilla-inbound

The common signature for all the failures is:

10:50:03     INFO -  Sync.Resource	WARN	Caught an error in asyncOpen: Component returned failure code: 0x804b0013 (NS_ERROR_PORT_ACCESS_NOT_ALLOWED) [nsIChannel.asyncOpen] Stack trace: _doRequest()@resource://gre/modules/services-sync/resource.js:224 < Res__request()@resource://gre/modules/services-sync/resource.js:388 < Res_put()@resource://gre/modules/services-sync/resource.js:420 < upload()@resource://services-sync/record.js:62 < generateAndUploadKeys()@C:/slave/test/build/tests/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js:140 < test_wipeRemote_syncAndReportErrors_server_maintenance_error()@C:/slave/test/build/tests/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js:1346 < _run_next_test()@C:\slave\test\build\tests\xpcshell\head.js:1160 < do_execute_soon/<.run()@C:\slave\test\build\tests\xpcshell\head.js:444 < <file:unknown>
10:50:03  WARNING -  TEST-UNEXPECTED-FAIL |  | Error: Component returned failure code: 0x804b0013 (NS_ERROR_PORT_ACCESS_NOT_ALLOWED) [nsIChannel.asyncOpen] - See following stack:
10:50:03     INFO -  Res_put@resource://gre/modules/services-sync/resource.js:420
10:50:03     INFO -  upload@resource://services-sync/record.js:62
10:50:03     INFO -  generateAndUploadKeys@C:/slave/test/build/tests/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js:140
10:50:03     INFO -  test_wipeRemote_syncAndReportErrors_server_maintenance_error@C:/slave/test/build/tests/xpcshell/tests/services/sync/tests/unit/test_errorhandler.js:1346
10:50:03     INFO -  _run_next_test@C:\slave\test\build\tests\xpcshell\head.js:1160


The ports being selected were 2049, 2049, and 2043.

MXR revealed this error is thrown in one place: NS_CheckPortSafety.https://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsNetUtil.h#733

Digging into httpd.js source code, it appears it eventually calls nsServerSocket::InitSpecialConnection() (https://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsServerSocket.cpp#260) to create the listening socket.

Unfortunately, the automatically selected port via PR_SetNetAddr() appears to be selecting a port that clients aren't allowed to connect to! (I would appreciate confirmation from someone in Necko this is correct - I'm not an expert on the network stack!)

I reckon that nsServerSocket or httpd.js should be modified to only listen on client-accessible ports. Otherwise, tests will continue to randomly fail due to selecting a "banned" port. Fortunately, nsIIOService exposes allowPort (https://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsIIOService.idl#102), so it should be possible to easily modify httpd.js to hack around this if that is the appropriate remedy.

This issue blocks making the xpcshell tests run in parallel, which is Mihnea's summer intern project. So, a quick resolution is desired.
2049 is in the disallowed blacklist of ports: http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsIOService.cpp#132

So yes, we should not use port 2049, 4045, or 6000 on the server side. I don't know what's up with 2043.
This patch adds a retry loop in httpd.js for instantiating ServerSocket
instances.

This patch is one of several ways we could solve this problem. I /think/
it is the easiest.

I don't know how we can reliably test this code, so no explicit tests
were added.

I'm not sure if this patch leaks a ServerSocket if we create multiple
instances. If we have to call socket.close() even if
socket.asyncListen() isn't called, then it leaks. The docs in
nsIServerSocket.idl don't make this clear so I'm assuming we only need
to call close() if asyncListen() is called.
Attachment #778156 - Flags: review?(jwalden+bmo)
Assignee: nobody → gps
Comment on attachment 778156 [details] [diff] [review]
Only automatically select ports that aren't blocked from clients

Review of attachment 778156 [details] [diff] [review]:
-----------------------------------------------------------------

Ugh, this behavior kind of throws a monkey wrench at the whole problem.  I kind of wonder if port = -1 should really be part of the server-socket interface, or the round-robin pick-a-port thing should be at a higher level.  Oh well, this works enough.

::: netwerk/test/httpserver/httpd.js
@@ +536,5 @@
> +      // ourselves to finite attempts just so we don't loop forever.
> +      var ios = Cc["@mozilla.org/network/io-service;1"]
> +                  .getService(Ci.nsIIOService);
> +      var socket;
> +      for (var i = 100; i; i--) {

Style in this file has the brace on a new line.

@@ +542,5 @@
>                                      loopback, // true = localhost, false = everybody
>                                      maxConnections);
> +
> +        var allowed = ios.allowPort(port.port, "http");
> +        if (!allowed) {

Brace on new line.

@@ +547,5 @@
> +          dumpn(">>>Warning: obtained ServerSocket listens on a blocked " +
> +                "port: " + socket.port);
> +        }
> +
> +        if (!allowed && this._port == -1) {

{ on new line.

@@ +549,5 @@
> +        }
> +
> +        if (!allowed && this._port == -1) {
> +          dumpn(">>>Throwing away ServerSocket with bad port.");
> +          continue;

Looking at the implementation, you're definitely going to temporarily leak the forbidden port if you don't .close() the socket here.  The interface comments should be clarified about exactly what grabs the port, and what un-grabs it -- separate bug -- but for now add a temp.close() here.

@@ +558,5 @@
> +      }
> +
> +      if (!socket) {
> +        throw new Error("No socket server available. Are there no available ports?");
> +      }

{ on new line.  But more importantly, to match the interface contract, you should do this instead:

  dumpn(">>> Failed to find a usable port to start the server");
  throw Cr.NS_ERROR_NOT_AVAILABLE;
Attachment #778156 - Flags: review?(jwalden+bmo) → review+
Component: Networking → httpd.js
Product: Core → Testing
I think the problem here is that ServerSocket is a generic socket implementation, so giving you any port is okay. It's only our HTTP stack that will refuse to connect to these blacklisted ports. It might be nice to be able to tell ServerSocket "give me any free port that's not blacklisted".
(In reply to Ted Mielczarek [:ted.mielczarek] (away July 19th-28th) from comment #4)
> I think the problem here is that ServerSocket is a generic socket
> implementation, so giving you any port is okay. It's only our HTTP stack
> that will refuse to connect to these blacklisted ports. It might be nice to
> be able to tell ServerSocket "give me any free port that's not blacklisted".

I had this thought as well, which is why I initially filed this under Core :: Networking. If I was competent at writing Gecko C++, I might have attempted this. But, I haven't authored a single C++ patch for Gecko, knew how to hack httpd.js, and wanted to unblock a summer internship project, so I went with the httpd.js monkey wrench.

(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #3)
> @@ +558,5 @@
> > +      }
> > +
> > +      if (!socket) {
> > +        throw new Error("No socket server available. Are there no available ports?");
> > +      }
> 
> { on new line.  But more importantly, to match the interface contract, you
> should do this instead:
> 
>   dumpn(">>> Failed to find a usable port to start the server");
>   throw Cr.NS_ERROR_NOT_AVAILABLE;

It's barely captured in the diff context (I can't wait until ReviewBoard with VCS integration so we can expand diff context during reviews), but this whole block is inside a try..catch which logs the inner exception and throws a Cr.NS_ERROR_NOT_AVAILABLE. So, the current exception throw should be fine.
(In reply to Gregory Szorc [:gps] from comment #5)
> I had this thought as well, which is why I initially filed this under Core
> :: Networking. If I was competent at writing Gecko C++, I might have
> attempted this. But, I haven't authored a single C++ patch for Gecko, knew
> how to hack httpd.js, and wanted to unblock a summer internship project, so
> I went with the httpd.js monkey wrench.

I fully agree with this plan, let's get the work unblocked. Maybe file a followup bug on fixing it at a lower level?
https://hg.mozilla.org/integration/mozilla-inbound/rev/bf1f7d342f5b
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla25
Blocks: 895969
Blocks: 895970
https://hg.mozilla.org/mozilla-central/rev/bf1f7d342f5b
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Blocks: 896093
Component: httpd.js → General
You need to log in before you can comment on or make changes to this bug.