Closed Bug 1279259 Opened 4 years ago Closed 3 years ago

TURN server fixed port 8191 prevents simultaneous multiple test runners

Categories

(Testing :: Mochitest, defect)

defect
Not set

Tracking

(firefox51 fixed, firefox52 fixed, firefox53 fixed, firefox54 fixed)

RESOLVED FIXED
mozilla54
Tracking Status
firefox51 --- fixed
firefox52 --- fixed
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: bc, Assigned: bc)

References

Details

Attachments

(2 files)

I ran into this while diagnosing issues with Autophone Mm autophone-mochitest-media.

bug 1231981 introduced the TURN server for mochitest CI.

This introduced startWebsocketProcessBridge(...) https://hg.mozilla.org/mozilla-central/rev/aea71aa8552a#l2.48 to testing/mochitest/runtests.py which creates the socket using the fixed port 8191.

https://hg.mozilla.org/mozilla-central/rev/aea71aa8552a#l2.65
sock = socket.create_connection(("127.0.0.1", 8191))

This is problematic for systems such as Autophone which can run multiple instances of test suites simultaneously since this socket creation will fail with 

Traceback (most recent call last):
  File "websocketprocessbridge/websocketprocessbridge.py", line 96, in <module>
    reactor.listenTCP(8191, txws.WebSocketFactory(bridgeFactory))
  File "/mozilla/projects/.virtualenvs/autophone/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 478, in listenTCP
    p.startListening()
  File "/mozilla/projects/.virtualenvs/autophone/lib/python2.7/site-packages/twisted/internet/tcp.py", line 984, in startListening
    raise CannotListenError(self.interface, self.port, le)
twisted.internet.error.CannotListenError: Couldn't listen on any:8191: [Errno 98] Address already in use.

There also appears to be another instance of this issue at:

https://dxr.mozilla.org/mozilla-central/source/dom/media/tests/mochitest/pc.js#1843

Autophone can detect in use ports and could in principle choose an unused port when running the tests if there was a command line option for selecting the port. I'm not sure how to deal with the fixed port in the pc.js file though.
I would expect us to also have problems with port 8888 (the port used to serve the mochitest pages) here.
The http and ssl ports for the web server run by the unittests are configurable via the command line arguments --http-port and --ssl-port. I make an effort to find open ports and to specify those instead of the defaults when running the unittests though it is not yet perfect.

I'm testing an approach where I block execution of the webrtc related mochitests using an exclusive file lock but this is very much less than perfect since it can result in serial execution of the tests which effectively increases the total time required by a factor of N where N is the number of devices on the host which are attempting to execute the test.

What we need is the ability to specify port used by the TURN server in the same way that we do with the http and ssl ports.

We need to:

* change https://dxr.mozilla.org/mozilla-central/source/testing/tools/websocketprocessbridge/websocketprocessbridge.py#96 to take a command line argument for the port

* change https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1086 to pass the port to the command line

* change https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/mochitest_options.py to add the new port

I'll finish up my current testing and will return to this.

* change https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py to use the new port
Assignee: nobody → bob
Status: NEW → ASSIGNED
Blocks: 1325698
Attachment #8829803 - Flags: review?(jmaher)
I ran a test locally with a local build + local patched tests and found no socket/port errors. It did select random ports instead of the default however. Without the need for locking, things ran much faster with multiple devices running tests which start the websocket bridge.

I did a try run in production though it only tested the default port since the changes to autophone are not on the production servers. I included a number of platforms and mochitest tests. This should prove the changes do not affect the default use.

I also used the same try run from my local instance of autophone which did include the changes to select the port.

Both sets of results can be found at:

https://treeherder.allizom.org/#/jobs?repo=try&author=bclary@mozilla.com&fromchange=d5192bc8cb0043e117aae057a659754353a14f65&exclusion_profile=false&filter-resultStatus=success&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=running&filter-resultStatus=pending&filter-resultStatus=runnable&filter-resultStatus=coalesced

Autophone only:
https://treeherder.allizom.org/#/jobs?repo=try&author=bclary@mozilla.com&fromchange=d5192bc8cb0043e117aae057a659754353a14f65&exclusion_profile=false&filter-resultStatus=success&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=running&filter-resultStatus=pending&filter-resultStatus=runnable&filter-resultStatus=coalesced&filter-searchStr=autophone
Attachment #8829809 - Flags: review?(jmaher)
Comment on attachment 8829803 [details] [diff] [review]
bug-1279259-turn-server-port-mc-v1.patch

Review of attachment 8829803 [details] [diff] [review]:
-----------------------------------------------------------------

please push this to try, there could be a flake8 issue with this patch, but otherwise, this is great
Attachment #8829803 - Flags: review?(jmaher) → review+
Comment on attachment 8829809 [details] [diff] [review]
bug-1279259-autophone-v1.patch

Review of attachment 8829809 [details] [diff] [review]:
-----------------------------------------------------------------

looks good!
Attachment #8829809 - Flags: review?(jmaher) → review+
I did as you can see in comment 4. Thanks.
oh, you win!
Pushed by bclary@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b2835a87db21
make TURN server port configurable, r=jmaher.
https://hg.mozilla.org/mozilla-central/rev/b2835a87db21
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Reopening until I can land on Autophone...

In order to completely resolve this bug and land the Autophone related changes, I need to land bug-1279259-turn-server-port-mc-v1.patch in mozilla-aurora, mozilla-beta and mozilla-release. Since this is a test only change, a=testonly should be sufficient. I've applied the first "mc" patch to aurora, beta and release and a part from a bit of fuzz on release it applies cleanly.

It isn't clear to me if I can land this directly myself and if I should wait until after "merge week". aurora and release appear to be open but beta is still marked as "closed for uplift week".

Ryan, Tomcat: should I just wait until next week? I've never pushed to any release branch before. Are their any permissions I need other than normal access?
Status: RESOLVED → REOPENED
Flags: needinfo?(ryanvm)
Flags: needinfo?(cbook)
Resolution: FIXED → ---
Ryan gave me the low down on irc. Yay!
Flags: needinfo?(ryanvm)
Flags: needinfo?(cbook)
Whiteboard: [checkin-needed-aurora][checkin-needed-beta][checkin-needed-release]
https://hg.mozilla.org/releases/mozilla-aurora/rev/2cc634c87009
Flags: in-testsuite-
Whiteboard: [checkin-needed-aurora][checkin-needed-beta][checkin-needed-release] → [checkin-needed-beta][checkin-needed-release]
https://hg.mozilla.org/releases/mozilla-beta/rev/14e0a18e9ee0
Whiteboard: [checkin-needed-beta][checkin-needed-release] → [checkin-needed-release]
the in tree changes have been pushed to everything but mozilla-release. I really want to get this deployed so I'm *assuming* they will be pushed at the same time as the next release build.

https://github.com/mozilla/autophone/commit/bb8bb3c37ec6637601bc363eb53eaa436f4654c6
deployed 2017-01-27 ~00:30
This is fixed.
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.