Closed Bug 865296 Opened 11 years ago Closed 8 years ago

Add a TURN server for running mochitest automation

Categories

(Release Engineering :: General, defect, P2)


Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1231981

People

(Reporter: jsmith, Assigned: catlee)

References

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1922] [qa-automation-blocked, p=5, ft:webrtc])

For our mochitest automation to run against a TURN server, we need a TURN server stood up that our automation will hit during CI runs. Note that we do not want this to be accessible outside the VPN, given the cost of keeping a TURN server up and the fact that this server requires credentials. We will eventually move towards a model that removes the need for a remote TURN server, but for now we want to get something up for our automation to hit.
Blocks: 864118
Whiteboard: [qa-automation-blocked]
Jason and Eric, do you have further information about what's required? What type of software is needed? How should it be configured? Where would it have to live? I assume all of our automation is running in MPT?
I'll redirect the questions in comment 1 to Eric - he knows the details about this far better than I do.
Flags: needinfo?(ekr)
(In reply to Henrik Skupin (:whimboo) from comment #1)
> Jason and Eric, do you have further information about what's required? What
> type of software is needed?
I use restund.
> How should it be configured?
That depends on the environment. Roughly, it needs to do STUN long-term authentication with credentials that mochitest knows.
> Where would it have to live? I assume all of our automation is running in
> MPT?
I don't understand this question. It needs to be stood up on a server that's accessible from mochitests but not from the internet.
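For readers unfamiliar with STUN long-term authentication: in the long-term credential mechanism of RFC 5389, both sides derive a shared key from the username, realm, and password, and that key is then used for the HMAC-SHA1 MESSAGE-INTEGRITY attribute. The following is a minimal sketch of the key derivation only. It assumes ASCII-only credentials (so the SASLprep step on the password is skipped), and the function name is made up for illustration; it is not taken from restund or from the mochitest harness.

```python
import hashlib


def long_term_key(username: str, realm: str, password: str) -> bytes:
    """Derive the long-term credential key: MD5(username ":" realm ":" password).

    Assumption: the password is plain ASCII, so SASLprep is omitted here.
    """
    material = f"{username}:{realm}:{password}".encode("utf-8")
    return hashlib.md5(material).digest()


if __name__ == "__main__":
    # Hypothetical credentials; a real setup would share these between
    # the TURN server configuration and the test harness.
    key = long_term_key("user", "realm", "pass")
    print(key.hex())
```

Whatever credentials the server is configured with would have to be made available to the tests in some form, which is part of why the server should not be reachable from the internet.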
Flags: needinfo?(ekr)
Is this the same as bug 844994?
(In reply to Chris AtLee [:catlee] from comment #4)
> Is this the same as bug 844994?
Nope. This is focusing on a TURN server. Bug 844944 has to do with a DNS entry request for a STUN server that automation can reference.
Product: mozilla.org → Release Engineering
Blocks: 894565
Assigning to catlee (to figure out who will handle this)
Assignee: nobody → catlee
To clarify: I think what we need is a TURN server which is accessible from tbpl so that unit tests and/or mochitests running as part of the build can utilize this TURN server.
One proposal would be to modify our tests to only try to exercise these TURN tests if a special environment variable points to a TURN server. That would allow flexible usage of other TURN servers in different test environments.
Efforts to get a TURN server for steeplechase and potentially one for engineers in the office networks can be tracked separately.
Links to TURN server implementations:
http://www.creytiv.com/restund.html
http://turnserver.sourceforge.net/
https://code.google.com/p/rfc5766-turn-server/
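The environment-variable proposal above could look roughly like the sketch below on the Python side of a test harness. Everything named here is an assumption for illustration: the variable name `TEST_TURN_SERVER`, the `host[:port]` value format, and the helper function do not exist in mochitest or mozharness. Port 3478 is the standard default port for STUN/TURN. IPv6 literals are not handled.

```python
import os
import unittest

# Hypothetical variable name; the bug does not specify one.
TURN_ENV_VAR = "TEST_TURN_SERVER"
DEFAULT_TURN_PORT = 3478


def turn_server_from_env(environ=os.environ):
    """Return (host, port) if a TURN server is configured, else None."""
    value = environ.get(TURN_ENV_VAR, "").strip()
    if not value:
        return None
    host, sep, port = value.rpartition(":")
    if not sep:
        # No explicit port given: the whole value is the host name.
        return value, DEFAULT_TURN_PORT
    return host, int(port)


class TurnRelayTests(unittest.TestCase):
    def setUp(self):
        self.turn = turn_server_from_env()
        if self.turn is None:
            self.skipTest(f"{TURN_ENV_VAR} is not set; skipping TURN tests")

    def test_server_is_configured(self):
        host, port = self.turn
        self.assertTrue(host)
        self.assertTrue(0 < port < 65536)
```

With this shape, environments without a TURN server report the tests as skipped rather than failed, and each environment (tbpl, steeplechase, an office network) can point at its own server.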
As I understand the request, you're looking to set up a service on the releng network that's available to all testers, and which essentially hosts "tunnels to nowhere" that tests can use to verify TURN client functionality. It sounds like there's a desire to also host a public version of this service for tests run outside of the releng environment -- but that's a separate request.
Within the releng network, we're reluctant to stand up new services because those services immediately become tree-critical. So, it's not just building one VM running a TURN server -- it's building a redundant set of VMs, configuring some HA mechanism, monitoring, a runbook for monitoring failures, and training for those who are expected to keep the system running. We'd also need to work through the inevitable bugs due to variable network latencies and reliabilities from AWS and scl3. All of that is a substantial commitment and well worth thinking about in advance.
I'm sorry that "in advance" is 1 year after filing the bug - I only just became aware of it.
I think it's generally agreed that tests should not depend on external resources. In comment 1 Jason mentioned a model that doesn't need an external TURN server. This is probably the optimal solution. In the short term, I think the next-best solution is to support running a local TURN server on each test machine, perhaps by installing the software using puppet and writing code in the mozharness scripts to start and stop the service. We already do this with Apache for talos: talos runs against http://localhost.
We could fall back to a one-off TURN server somewhere on the network, but that would be limited to use from your project branch or try, and would need to be a time-definite stopgap.
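As a rough illustration of the "start and stop the service from the test scripts" idea, a harness written in Python could wrap the server process in a context manager like the one below. This is a generic sketch, not mozharness code: `local_service` is a hypothetical helper, and the command that would actually launch a TURN server (binary name, flags, config path) is deliberately left to the caller because the bug does not specify it.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def local_service(command, stop_timeout=10):
    """Start `command` as a child process and stop it when the block exits."""
    proc = subprocess.Popen(command)
    try:
        yield proc
    finally:
        # Ask the process to exit; force-kill it if it does not comply in time.
        proc.terminate()
        try:
            proc.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


if __name__ == "__main__":
    # Stand-in for a real server command: a child process that just sleeps.
    placeholder = [sys.executable, "-c", "import time; time.sleep(60)"]
    with local_service(placeholder) as proc:
        print("service running:", proc.poll() is None)
    print("service stopped:", proc.poll() is not None)
```

A server started this way lives and dies with the test job on a single machine, so it covers only the single-machine case.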
(In reply to Dustin J. Mitchell [:dustin] from comment #8)
> As I understand the request, you're looking to set up a service on the
> releng network that's available to all testers, and which essentially hosts
> "tunnels to nowhere" that tests can use to verify TURN client functionality.
> It sounds like there's a desire to also host a public version of this
> service for tests run outside of the releng environment -- but that's a
> separate request.
This is a separate issue.
> Within the releng network, we're reluctant to stand up new services because
> those services immediately become tree-critical. So, it's not just building
> one VM running a TURN server -- it's building a redundant set of VMs,
> configuring some HA mechanism, monitoring, a runbook for monitoring
> failures, and training for those who are expected to keep the system
> running. We'd also need to work through the inevitable bugs due to variable
> network latencies and reliabilities from AWS and scl3. All of that is a
> substantial commitment and well worth thinking about in advance.
>
> I'm sorry that "in advance" is 1 year after filing the bug - I only just
> became aware of it.
I recognize all this, but nevertheless, TURN is a critical piece of WebRTC and therefore needs to be tested. Ultimately, this means system tests and that means we need a separate TURN server.
> I think it's generally agreed that tests should not depend on external
> resources.
If by "external" you mean "outside Mozilla's control", then yes, I agree. If by "external" you mean "not on the machine running the test", then no, I don't agree. To the contrary, multi-machine tests with off-machine network resources are absolutely critical for testing WebRTC.
> In comment 1 Jason mentioned a model that doesn't need an
> external TURN server. This is probably the optimal solution.
I don't really know what he's referring to here. As I said above, a topology with separate TURN servers is what's required.
> In the short
> term, I think the next-best solution is to support running a local TURN
> server on each test machine, perhaps by installing the software using puppet
> and writing code in the mozharness scripts to start and stop the service.
> We already do this with Apache for talos: talos runs against
> http://localhost.
This doesn't solve the problem of multi-machine testing.
I'm afraid Dustin is referring to this part of Jason's initial statement for this bug:
(In reply to Jason Smith [:jsmith] from comment #0)
> We will eventually move towards a model that will remove the need to have a
> remote TURN server stood up, but for now, we want to get something up for
> our automation to hit.
From my perspective as the QA responsible for WebRTC testing, this statement is no longer true (let's leave aside whether it ever was). We need a TURN server for tests.
If anyone needs proof of how important/urgent this is: Bug 1000858 revealed, several months after TURN support was added to Firefox, that it has never worked on Windows.
Priority: -- → P2
Whiteboard: [qa-automation-blocked] → [qa-automation-blocked, p=5, ft:webrtc]
Target Milestone: --- → B2G C1 (to 19nov)
Target Milestone: B2G C1 (to 19nov) → ---
Summary: Add a TURN server for running mochitest automation under a VPN → Add a TURN server for running mochitest automation
Whiteboard: [qa-automation-blocked, p=5, ft:webrtc] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1922] [qa-automation-blocked, p=5, ft:webrtc]
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Component: General Automation → General