Open Bug 1034964 Opened 10 years ago Updated 4 months ago

Use ICE regular nomination when peer is ICE-Lite

Categories

(Core :: WebRTC: Networking, defect, P3)

33 Branch
defect


People

(Reporter: ibc, Unassigned)


User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36

Steps to reproduce:

Firefox generates an SDP offer and receives an SDP answer indicating "a=ice-lite".


Actual results:

Firefox starts ICE procedures using aggressive nomination. This is incorrect and can cause ICE to fail, since an ICE-Lite server must receive only a single Binding request with USE-CANDIDATE (that is, ICE regular nomination).


Expected results:

Firefox should use ICE regular nomination as RFC 5245 section 8.1.1 states:

"If its peer has a lite implementation, an agent MUST use a regular nomination algorithm."

http://tools.ietf.org/html/rfc5245#section-8.1.1
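
For illustration only, here is a minimal sketch of the expected behavior: scan the remote SDP for "a=ice-lite" and fall back to regular nomination when it is present. The names `Nomination`, `PeerIsIceLite` and `ChooseNomination` are hypothetical and are not taken from Firefox or nICEr; the line scan is a simplification, not a full SDP parser.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Nomination modes described in RFC 5245 section 8.1.1.
enum class Nomination { Regular, Aggressive };

// Returns true if the SDP contains an "a=ice-lite" attribute line.
// Hypothetical helper: a plain line scan, not a full SDP parse.
bool PeerIsIceLite(const std::string& sdp) {
  std::istringstream in(sdp);
  std::string line;
  while (std::getline(in, line)) {
    // SDP lines end in CRLF; drop the trailing '\r' if present.
    if (!line.empty() && line.back() == '\r') line.pop_back();
    if (line == "a=ice-lite") return true;
  }
  return false;
}

// Picks the nomination mode for the controlling agent.
// `preferred` is the mode the agent would use against a full-ICE peer.
Nomination ChooseNomination(const std::string& remote_sdp,
                            Nomination preferred) {
  return PeerIsIceLite(remote_sdp) ? Nomination::Regular : preferred;
}
```

With this sketch, an answer containing "a=ice-lite" always yields `Nomination::Regular`, whatever the agent would otherwise prefer.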

There is a thread about this topic in rtcweb ML: http://www.ietf.org/mail-archive/web/rtcweb/current/msg12583.html
I believe this may be causing a problem with projectsquared.com which responds with 'ice-lite'.

To reproduce you can use Firefox 35 on Windows while logged into a VPN. We are seeing many connection failures in this case, and I think specifying USE-CANDIDATE might fix this.
We've discussed this briefly; the current plan is to stop doing aggressive nomination entirely.

We need two things:

1. the ability to send on any valid pair prior to nomination

2. stop using USE-CANDIDATE aggressively

The first is tricky, because it affects the state machines we have (we potentially need an additional "connected, but not done" state) and that might be disruptive.

It's not clear why aggressive nomination is causing the ICE-lite server specific issues.  I'm guessing that this depends on how the ICE-lite implementation handles USE-CANDIDATE, since Ethan notes that Squared is OK with our shotgun-style nomination.
OS: Mac OS X → All
Hardware: x86 → All
Martin -- Is there a bug that covers what you outline in comment 2?  If so, can we dup this to that -- or (if not) use this bug to cover the work?
Status: UNCONFIRMED → NEW
backlog: --- → webRTC+
Rank: 25
Ever confirmed: true
Flags: needinfo?(martin.thomson)
Priority: -- → P2
Rather than dup it out, I think that we should just link a few of these.  I think that there might be another bug regarding the new nombis work, but I couldn't find that.
Flags: needinfo?(martin.thomson)
See Also: → 1138559
Any update on this? Nightly 46.0a1 does not yet support negotiating ICE with an ICE-Lite server.
There is also bug 1213442.

I guess Martin was referring to passive-aggressive nomination in comment #2, which the Google folks started to implement, with interesting side effects. But I don't think we have a separate ticket for that. We probably should, as this bug refers to ice-lite interop.

AFAIK the Cisco folks basically have implemented a workaround on their ice-lite end, by not responding to aggressive nomination requests until they have done X round trip checks on the pair which makes their ICE lite regular nomination happy.
See Also: → 1213442
See Also: → 1238249
I am facing a similar issue with my ICE-Lite WebRTC gateway.

1. call initiated from a webrtc client towards gateway
 - Even after the answer is processed successfully, no STUN connectivity checks are initiated by the client, which should be in the ICE-Controlling state. Is there a way to debug what ICE state the client is in, as well as which nomination policy it has selected?
(In reply to hotshot47 from comment #7)
> Is there a way to debug what ICE state the client is in, as well as which
> nomination policy it has selected?

As this bug points out, Firefox currently supports only aggressive nomination, so there is nothing to pick. One way to find out about the ICE state is to look at the logging: https://wiki.mozilla.org/Media/WebRTC/Logging
> It's not clear why aggressive nomination is causing the ICE-lite server specific issues.  I'm guessing that this depends on how the ICE-lite implementation handles USE-CANDIDATE, since Ethan notes that Squared is OK with our shotgun-style nomination.

The ICE-Lite server receives a STUN with USE-CANDIDATE over its IPv4 UDP port, so that becomes the selected path and DTLS ClientHello is sent by the server over that transport.

A few ms later another STUN with USE-CANDIDATE arrives over the server's IPv6 UDP port.

- Should the server change the sending address to the IPv6 one?

Even more: Firefox, for whatever reason, prefers the IPv6 path and starts sending SRTP over it. Should the server still send its SRTP over the IPv4 path? Or what?

This is 100% unclear because the spec breaks down when an ICE endpoint using aggressive nomination talks to an ICE-Lite server.
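
To make the ambiguity concrete, here is a small sketch that replays the two Binding requests above under two possible server policies. Both policies, the `BindingRequest` struct and the "udp4"/"udp6" labels are hypothetical illustrations; neither policy is claimed to be what any particular ICE-Lite server does.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// One incoming STUN Binding request as seen by an ICE-Lite server.
struct BindingRequest {
  std::string local_transport;  // illustrative label, e.g. "udp4" or "udp6"
  bool use_candidate;           // true if USE-CANDIDATE is present
};

// Two hypothetical ways a server could react to repeated nominations.
enum class Policy { FirstNominationWins, LastNominationWins };

// Replays the requests in arrival order and returns the transport the
// server ends up sending on, or nullopt if nothing was nominated.
std::optional<std::string> SelectedTransport(
    const std::vector<BindingRequest>& requests, Policy policy) {
  std::optional<std::string> selected;
  for (const BindingRequest& req : requests) {
    if (!req.use_candidate) continue;  // plain check, no nomination
    if (!selected || policy == Policy::LastNominationWins) {
      selected = req.local_transport;
    }
  }
  return selected;
}
```

Fed the IPv4 request followed by the IPv6 request, the first policy stays on "udp4" and the second switches to "udp6", so the two ends can easily disagree about the media path.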
Any news about proper ICE support in Firefox?
I think the plan right now is rather to implement passive-aggressive nomination in bug 1238249 and then implement full nomination support. But supporting passive-aggressive nomination would result in Firefox no longer sending Binding requests with USE-CANDIDATE attributes by default.

Currently none of these is actively worked on. Patches are always welcome.
Mass change P2->P3 to align with new Mozilla triage process.
Priority: P2 → P3
This bug causes significant issues with the ICE-lite implementation in the MCU used by Blackboard.  We don't have problems with IPv6, as that is not currently supported by our infrastructure, but if the FF endpoint has multiple IPv4 addresses (e.g. ethernet + wireless, or an active VPN connection) then connection attempts are highly unreliable.  The MCU will use the first candidate that sends USE-CANDIDATE (after the SDP answer is received), while Firefox will end up using the highest-priority candidate that is successful; often these are not the same, resulting in a failed connection.
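
The mismatch described here can be sketched as follows. The pair-priority formula is the one from RFC 5245 section 5.7.2; `NominatedPair`, `ServerChoice` and `ClientChoice` are hypothetical names that model "first USE-CANDIDATE wins" on the server and "highest-priority successful pair wins" on the client, under the assumption that the pairs are listed in the order the server saw them nominated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Candidate pair priority from RFC 5245 section 5.7.2.
// g: priority of the controlling agent's candidate,
// d: priority of the controlled agent's candidate.
std::uint64_t PairPriority(std::uint32_t g, std::uint32_t d) {
  const std::uint64_t lo = std::min(g, d);
  const std::uint64_t hi = std::max(g, d);
  return (lo << 32) + 2 * hi + (g > d ? 1 : 0);
}

// A successful pair, listed in the order the server saw USE-CANDIDATE.
struct NominatedPair {
  std::uint32_t controlling_prio;
  std::uint32_t controlled_prio;
};

// Server model: latch onto the first nomination (list must be non-empty).
std::size_t ServerChoice(const std::vector<NominatedPair>&) { return 0; }

// Client model: end on the highest-priority pair (list must be non-empty).
std::size_t ClientChoice(const std::vector<NominatedPair>& pairs) {
  std::size_t best = 0;
  for (std::size_t i = 1; i < pairs.size(); ++i) {
    if (PairPriority(pairs[i].controlling_prio, pairs[i].controlled_prio) >
        PairPriority(pairs[best].controlling_prio,
                     pairs[best].controlled_prio)) {
      best = i;
    }
  }
  return best;
}
```

Whenever the check on a lower-priority interface (say, the VPN) happens to complete first, `ServerChoice` and `ClientChoice` return different indices, which corresponds to the failed connections reported above.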
Hi, so this issue (which affects real scenarios and deployments) has been moved from P2 to P3? Is there any plan to fix it instead of waiting for bug 1238249 (which has also been moved from P2 to P3 and has zero activity)?

This is important, guys.
P3 is basically the equivalent of the backlog. Implementing ICE bis per RFC 8445 is in the backlog right now. To support that we will have to implement proper full nomination, which would solve this ticket. But right now it is not clear when this will be actively worked on.
Nils: What are the missing pieces? How can we help?
If I understand correctly, the plan is still to replace aggressive nomination with regular nomination. Correct?

Is nICEr ready for regular nomination?
I can see that there is a test case for it:
https://github.com/resiprocate/nICEr/blob/b14598f34d12373069693a1f0535fe354a3f1fd5/src/test/test-remote-ekr.conf#L6

And this is where aggressive nomination is set in the code:
mozilla-central/media/mtransport/nricectx.cpp
 line 595:  
UINT4 flags = NR_ICE_CTX_FLAGS_AGGRESSIVE_NOMINATION;

Can you point out the missing parts that we would need to write to fix it?
So that test case is not being run by anyone (that I know of) right now. Also, the version on reSIProcate is kinda dead from a development perspective; the copy in hg.mozilla.org is the most recent one.

We expect that nICEr's regular nomination will need some work to be considered "ready".

Taking more than 6 years to properly support ICE-Lite endpoints is somewhat unexpected IMHO.

Because of this bug we have to force Firefox to relay so that users don't run into unexpected behavior. One scenario where this happens is when UDP is blocked inbound but open outbound, which is not an uncommon network scenario.

Jitsi should be interested in this as well. There are all the Firefox users we force to use TURN because of this, which means we degrade privacy and user experience for them by default :(

In the BigBlueButton project, we're attempting to switch our SFU from Kurento (which has a full ice implementation) to MediaSoup (which uses ice-lite) and have hit this issue in practice (for some reason, my personal network environment triggers this issue quite reliably!).

We're currently looking into a workaround on our end, either via hacks in the ice-lite implementation to allow switching the media path to late-received nominated candidates (which makes it non-compliant with the specs) https://github.com/versatica/mediasoup/issues/650 or by forcing Firefox to use only relay candidates, which isn't desirable for the reasons mentioned earlier, along with the need for more capacity in TURN servers.

Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 12 votes.
:bwc, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(docfaraday)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(docfaraday)

This bug is definitely still relevant, see Calvin Walton's message regarding BigBlueButton's move from Kurento (full ICE) to MediaSoup (ICE lite).

Hi, I would like to add that there is an issue closely related to the one described here.

According to the specs, aggressive nomination MUST NOT be used in this case either:
https://datatracker.ietf.org/doc/html/rfc5245#section-8.1.1

"If its peer is using ICE options (present in an ice-options attribute from the peer) that the agent does not understand, the agent MUST use a regular nomination algorithm."

Or the equivalent in the new spec, https://datatracker.ietf.org/doc/html/rfc8445#section-8.1.1

"The usage of the 'ice2' ICE option (Section 10) by endpoints supporting this specification is supposed to prevent controlling agents that are implemented according to RFC 5245 from using aggressive nomination."

Currently our project sends the following in its SDP:

a=ice-options:trickle ice2

But Firefox does not react to that and keeps using aggressive nomination.
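
As a rough sketch of the rule quoted above, the check could look like this. `MustUseRegularNomination` and the `understood` set are hypothetical and not part of any Firefox or nICEr API; the function only inspects a single "a=ice-options:" line and treats 'ice2' or any option outside `understood` as a reason to use regular nomination.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Returns true if an "a=ice-options:" line obliges the controlling agent
// to use regular nomination: it lists 'ice2' (RFC 8445 section 8.1.1) or
// an option the agent does not understand (RFC 5245 section 8.1.1).
// `understood` is the hypothetical set of options this agent implements.
bool MustUseRegularNomination(const std::string& line,
                              const std::set<std::string>& understood) {
  const std::string prefix = "a=ice-options:";
  // Any other SDP line carries no ICE options, so it imposes nothing.
  if (line.compare(0, prefix.size(), prefix) != 0) return false;
  // Options are whitespace-separated tokens after the prefix.
  std::istringstream tokens(line.substr(prefix.size()));
  std::string option;
  while (tokens >> option) {
    if (option == "ice2" || understood.count(option) == 0) return true;
  }
  return false;
}
```

For an agent that understands only "trickle", the line "a=ice-options:trickle ice2" returns true, while "a=ice-options:trickle" alone returns false.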

Are there plans to implement the regular nomination logic as described in the specs?

Flags: needinfo?(docfaraday)

We do, but we have our plate full with other stuff right now.

Flags: needinfo?(docfaraday)