Closed Bug 1131779 Opened 9 years ago Closed 9 years ago

Webrtc stops using relay port after permission error response

Categories

(Core :: WebRTC: Networking, defect, P2)

35 Branch
x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: nzli, Assigned: drno)

Details

Attachments

(2 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.111 Safari/537.36

Steps to reproduce:

Trying to enable Webrtc call in Firefox with our own TURN server. 


Actual results:

Firefox stops sending pair checks on the relay port after receiving a Permission Error Response. However, it should really only fail the pair that is associated with the failed permission. In our case, our TURN server returned Forbidden error for internal IP addresses. 


Expected results:

Firefox should continue to use this relay port for checking other pairs.
Severity: normal → major
Component: Untriaged → WebRTC: Networking
Product: Firefox → Core
Not sure what the ICE & WebRTC specs say for this case.  CC-ing relevant people.
5766 seems kind of vague on this point, to be honest, but I tend to think that
a 403 should be ignored and other responses should cause failure.
Agreed that 403 should be just ignored (or only fail the associated pair). From rfc 5766:

   The server MAY impose restrictions on the IP address allowed in the
   XOR-PEER-ADDRESS attribute -- if a value is not allowed, the server
   rejects the request with a 403 (Forbidden) error.

It would seem silly that other IP addresses that are allowed for the same allocation are affected.
Any update? Would like to see it fixed in 38 if possible.
I don't believe there's any real chance that there is going to be a fix for this in FF 38, since that's presently in beta and we don't even have a patch yet. If we had a patch now it might appear in FF 39 since we could uplift it to Aurora, but I don't know if any of the ICE maintainers are going to have time to write such a patch in the next few weeks.
Flags: needinfo?(mreavy)
Hi Naizhi -- Which TURN server are you using?  Do you know how widely it is used?  Do you have the option of patching the TURN server?  

I'll talk with our ICE developers (Byron and Nils) next week when they are back from PTO to estimate level of effort and timing. I'm making this a "P2" for now which means it's most likely to land in Fx41. This may change once we hear answers back from Naihi and after I talk with Byron and Nils.  

Doing a needinfo to just Nils (as Byron is away from bugzilla this week) to ask for his thoughts and a rough level of effort.
Rank: 29
Flags: needinfo?(nzli)
Flags: needinfo?(mreavy)
Flags: needinfo?(drno)
Priority: -- → P2
I could see this happening if nr_turn_client_error_cb() gets called because of such a 403 response. But from looking at the code nr_turn_client_error_cb() only gets invoked for turn specific request like allocation etc. So just from looking at code I don't quite understand how we get to the described problem. 

If Naizhi can not provide us with a test server I need to setup a manipulated TURN server to replicate the scenario and then find out what is happening here (which is probably a few days effort in total).
Flags: needinfo?(drno)
Sorry I mis-read the description to some degree initially.

I think it is now pretty clear what it is going on here:

The permissions request registers the nr_turn_client_error_cb when sending out the permission request:
https://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.c#950
That callback then marks the whole turn relay as failed. A patch specifically for the 403 in this scenario should not be too hard.
So my idea is to replace the generic turn client error callback here: https://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.c#952
with a specific one for the permission request and then handle the 403 in the error callback as a separate case from other error replies.

Question is:
- should we just treat the 403 as a R_RETRY, which would be the quick and easy hack, but would result in potentially further permission requests for the same destination being send.
- or should we add some status flag to the nr_turn_permission_ struct and that way turn the 403 into a stored rejection, which would allow us to internally reject all further request for that target address.

Byron, any thoughts?
Flags: needinfo?(docfaraday)
So, we don't want to get into a challenge loop, since that would be silly. The conventional wisdom here is that when you get a 403 but have no new credentials, you give up. We probably want to permanently fail just that permission.
Flags: needinfo?(docfaraday)
Yes, I think that's reasonable.

I think you should do the second one of these, I.e., record that it failed.

I agree that you should replace the error callback, and then in the non-403 case,
call the generic error callback.
Assignee: nobody → drno
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached file MozReview Request: bz://1131779/drno (obsolete) —
/r/7829 - Bug 1131779: 403 permission response don't fail the TURN client

Pull down this commit:

hg pull -r ac59f86f1d8d5d9581818fb0b18a6f98f2313652 https://reviewboard-hg.mozilla.org/gecko/
Attachment #8599108 - Flags: review?(docfaraday)
Seeking feedback early to check if I'm on track here...
https://reviewboard.mozilla.org/r/7827/#review6617

::: media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.h:53
(Diff revision 1)
> +  nr_turn_permission *perm;

Instead of creating another slot here, why not pass the nr_turn_permission\* as the cb_arg. It has a pointer to the nr_stun_client_ctx so you should be able to fail the context if necessary.
Naizhi as this pretty hard to test for us it would help us a lot if you could download a Nightly build from this try build page: https://treeherder.mozilla.org/#/jobs?repo=try&revision=118dd8803a98 (click on the green 'B' right next to your platform and then on the link in the "build" line in the lower left corner)
and let me know if this works with your TURN server or not.
(In reply to Eric Rescorla (:ekr) from comment #14)
> https://reviewboard.mozilla.org/r/7827/#review6617
> 
> ::: media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.h:53
> (Diff revision 1)
> > +  nr_turn_permission *perm;
> 
> Instead of creating another slot here, why not pass the nr_turn_permission\*
> as the cb_arg. It has a pointer to the nr_stun_client_ctx so you should be
> able to fail the context if necessary.

That was my first attempt, until I realized that the cb_arg also gets passed to nr_turn_stun_ctx_cb https://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.c#225
Seems like it might be more elegant to split the cb_arg values instead.
Attachment #8599108 - Flags: review?(docfaraday)
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

https://reviewboard.mozilla.org/r/7827/#review6663

::: media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.h:53
(Diff revision 1)
> +  nr_turn_permission *perm;

How about we add an int last_error_code field here that is filled out by the error cb. This would be generally useful (I'm thinking about stats). Then, when something is about to use a permission, it can follow perm->stun->last_error_code to determine whether it wants to give up, and if not, perhaps log something. This way, we don't need any extra callbacks or fields on nr_turn_permission.
(In reply to Byron Campen [:bwc] from comment #18)
> Comment on attachment 8599108 [details]
> MozReview Request: bz://1131779/drno
> 
> https://reviewboard.mozilla.org/r/7827/#review6663
> 
> ::: media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.h:53
> (Diff revision 1)
> > +  nr_turn_permission *perm;
> 
> How about we add an int last_error_code field here that is filled out by the
> error cb. This would be generally useful (I'm thinking about stats). Then,
> when something is about to use a permission, it can follow
> perm->stun->last_error_code to determine whether it wants to give up, and if
> not, perhaps log something. This way, we don't need any extra callbacks or
> fields on nr_turn_permission.

Byron and I discussed this online and I agree that this is the most elegant
version, though you'll need to be careful to set this properly initially.
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

/r/7829 - Bug 1131779: 403 permission response don't fail the TURN client
/r/8003 - Bug 1131779: tests for permission denied code

Pull down these commits:

hg pull -r 8097cdcf699e2c1d4e070dfeac05801d51e95e87 https://reviewboard-hg.mozilla.org/gecko/
Attachment #8599108 - Flags: review?(docfaraday)
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

https://reviewboard.mozilla.org/r/7827/#review6775

::: media/mtransport/test/turn_unittest.cpp:382
(Diff revision 2)
> +  perm = STAILQ_FIRST(&turn_ctx_->permissions);

Does this unit-test ever create any permissions? I don't see anything in this file that does, nor do I see any stuff that tests return traffic.
Attachment #8599108 - Flags: review?(docfaraday)
https://reviewboard.mozilla.org/r/7827/#review6793

> Does this unit-test ever create any permissions? I don't see anything in this file that does, nor do I see any stuff that tests return traffic.

Yes it does, because our turn client implicitly creates them for you, but without waiting for response: https://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/stun/turn_client_ctx.c#790

But I'm working on a better test for faking a 403 response.
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

https://reviewboard.mozilla.org/r/7827/#review6795

Ship It!
Attachment #8599108 - Flags: review+
Attachment #8599108 - Flags: review+ → review?(docfaraday)
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

/r/7829 - Bug 1131779: 403 permission response don't fail the TURN client
/r/8003 - Bug 1131779: tests for permission denied code

Pull down these commits:

hg pull -r f1ce5517b212a9fe9a1628a5332df98d0595c385 https://reviewboard-hg.mozilla.org/gecko/
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

https://reviewboard.mozilla.org/r/7827/#review6869

::: media/mtransport/test/turn_unittest.cpp:235
(Diff revisions 2 - 3)
> +    ASSERT_NE(offset, std::string::npos);

Switch order here.
Attachment #8599108 - Flags: review?(docfaraday)
Comment on attachment 8599108 [details]
MozReview Request: bz://1131779/drno

https://reviewboard.mozilla.org/r/7827/#review6871

Ship It!
Attachment #8599108 - Flags: review+
Amazing job here! Thanks a lot guys. I will give the nightly build a try. 

To answer a question asked earlier - yes we are able to work around the problem by sending back success instead.
Flags: needinfo?(nzli)
(In reply to Naizhi from comment #28)
> Amazing job here! Thanks a lot guys. I will give the nightly build a try. 
> 
> To answer a question asked earlier - yes we are able to work around the
> problem by sending back success instead.

We haven't quite landed this yet. We're very close. However, you can test our solution right now using the build off the try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8b8469d83294.  

If you haven't ever tested using a Firefox try build, it's pretty easy.  Follow the link above, click on the "B" for the platform you want to test and then in the lower left click on the "Build:" line (which will open up a tab with a directory listing -- and you download the build from there).

Nils -- Is there anything else you'd like Naizhi (the reporter) to try?  Am I correct that this is likely to land today or tomorrow?  

Thanks!
Flags: needinfo?(drno)
Yes a quick test run with the build from the treeherder link would really appreciated.
And yes we should land this in 40.
Flags: needinfo?(drno)
https://hg.mozilla.org/mozilla-central/rev/0cc4812a1cd7
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Attachment #8599108 - Attachment is obsolete: true
Attachment #8619419 - Flags: review+
Attachment #8619420 - Flags: review+
You need to log in before you can comment on or make changes to this bug.