Open Bug 1748953 Opened 3 years ago Updated 3 years ago

ICE mode 2 can fail on dual stacks when target for default local address lookup is an ipv6 link-local via mDNS

Categories

(Core :: WebRTC: Networking, defect, P4)

Firefox 95
defect

Tracking

()

UNCONFIRMED

People

(Reporter: eric, Unassigned)

Details

The way ICE mode 2 is implemented, specifically nr_ice_get_default_address in nICEr, fails when the page was served by a host reached through an mDNS name (e.g. somehost.local) and the peer address ultimately resolves to an IPv6 link-local address. A couple of things are going on here. First, the fact that the IP is link-local is lost by the time it gets to the linked function (although the fe80 prefix is also an indicator). It is lost because when HttpBaseChannel::GetRemoteAddress calls mPeerAddr.ToStringBuffer, only the IP bytes are converted into a string; the link-local scope id is not included. Second, nr_ice_get_default_address then binds to ::0 and tries to connect to that IPv6 address, which results in a "host unreachable" error because the scope id isn't present.
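To illustrate the first point, here is a minimal Python sketch (not the Firefox code) of what happens when only the 16 address bytes are turned back into a string. The helper names split_scope, is_link_local_v6 and bytes_only_string are hypothetical and exist only for this illustration.

```python
import ipaddress


def split_scope(addr: str):
    """Split 'fe80::1%en0' into ('fe80::1', 'en0'); scope is None if absent."""
    host, sep, scope = addr.partition("%")
    return host, (scope if sep else None)


def is_link_local_v6(addr: str) -> bool:
    """True if addr (with or without a scope id) is in fe80::/10."""
    host, _ = split_scope(addr)
    try:
        return ipaddress.IPv6Address(host).is_link_local
    except ipaddress.AddressValueError:
        return False


def bytes_only_string(addr: str) -> str:
    """Hypothetical stand-in for formatting just the address bytes:
    the scope id is not part of those bytes, so it is dropped."""
    host, _ = split_scope(addr)
    packed = ipaddress.IPv6Address(host).packed  # 16 raw bytes, no scope
    return str(ipaddress.IPv6Address(packed))


if __name__ == "__main__":
    peer = "fe80::1870:7be6:8af9:d1c3%en0"
    print(split_scope(peer))        # scope id is known here
    print(bytes_only_string(peer))  # scope id is gone here
    print(is_link_local_v6(bytes_only_string(peer)))  # prefix still tells us
```

The fe80 prefix survives the conversion, so a callee can still tell the address is link-local, but it can no longer tell which interface to use.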

In my current experiments, this happens fairly easily on a single LAN segment with two hosts that can resolve each other using mDNS. In my setup, I have two MacBooks: one serves a web page, and the other accesses that page over the host's mDNS address and initiates a peer connection (see the GitHub repository for a minimal repro). When the host accessing the web page has a dual-stack configuration, it seems, at least on a Mac, to default to resolving to the IPv6 link-local address (e.g. fe80::1870:7be6:8af9:d1c3%en0), and ICE immediately fails with the standard error WebRTC: ICE failed, add a STUN server.... Under the covers, and consistent with the description above, the failure that ultimately breaks the ICE gathering process is a PR_Connect failure with a host unreachable error (-5927). I edited the source code to check the description above, and I can confirm it is attempting to connect the socket to the resolved IPv6 address without the scope id.

The difference between the Google WebRTC implementation, which does not suffer from this issue, and Mozilla's is simply that Mozilla is being RFC compliant while Google is not. Google always reaches out to its DNS servers to resolve the default local address instead of using them only as a fallback (see QueryDefaultLocalAddress).

I think that, to be even more compliant here, nr_ice_get_default_address should in this "host unreachable" case fall back to the known public addresses, in both network families. A better fix is to make it possible to include the scope id in the socket connection so that the right network interface is used. Normally this would not be okay for an address that is shared with others, since the scope id is unique to the host in question, but here it fits.
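The fallback idea can be sketched roughly as follows. This is a Python illustration under stated assumptions, not nICEr code: default_local_address is a hypothetical helper that connects a UDP socket to each candidate target in turn (no packets are sent by a UDP connect) and returns the local address the OS picked for the first target that is reachable. A target given with its scope id, such as "fe80::...%en0", keeps the interface information because getaddrinfo fills in the scope id of the socket address.

```python
import socket


def default_local_address(targets):
    """Return the local address the OS would use to reach the first
    reachable target, or None if every target fails.

    targets: list of (numeric_host, port) tuples, tried in order.
    A link-local host may carry a scope id, e.g. 'fe80::1%en0'.
    """
    for host, port in targets:
        try:
            # AI_NUMERICHOST: parse only, never hit DNS. For IPv6 the
            # returned sockaddr carries the scope id parsed from '%zone'.
            infos = socket.getaddrinfo(
                host, port, socket.AF_UNSPEC, socket.SOCK_DGRAM,
                0, socket.AI_NUMERICHOST,
            )
        except OSError:
            continue  # unparsable target: fall through to the next one
        for family, socktype, proto, _canon, sockaddr in infos:
            try:
                with socket.socket(family, socktype, proto) as s:
                    s.connect(sockaddr)  # UDP connect only selects a route
                    return s.getsockname()[0]
            except OSError:
                continue  # e.g. host/network unreachable: try the next
    return None


if __name__ == "__main__":
    # 192.0.2.1 and 2001:db8::1 are documentation addresses used here as
    # placeholders for the "known public addresses"; they are assumptions.
    candidates = [
        ("fe80::1870:7be6:8af9:d1c3%en0", 9),  # peer address with scope id
        ("192.0.2.1", 9),
        ("2001:db8::1", 9),
    ]
    print(default_local_address(candidates))
```

Trying the scoped peer address first and the public addresses afterwards combines both suggestions: the right interface is used when the scope id is available, and the lookup still succeeds when it is not.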

Some user workarounds for anyone reading this:

  • Don't use mDNS to load the page
  • Turn off the dual stack if you don't need it (e.g. networksetup -setv6off Wi-Fi)

Some possibly helpful macOS tips while debugging:
I noticed that, for some odd reason (I didn't take the time to find out why), after killing Firefox and restoring the session, gathering will actually complete and it will use the IPv4 address instead! This can be frustrating to debug with, so if you need to make it resolve to the IPv6 link-local address again, this usually works for me: sudo killall -HUP mDNSResponder; sudo killall mDNSResponderHelper; sudo dscacheutil -flushcache.

Summary: ICE mode 2 can fail on dual stacks when target for default local address lookup is a ipv6 link-local via mDNS → ICE mode 2 can fail on dual stacks when target for default local address lookup is an ipv6 link-local via mDNS

Thanks for the detailed report. This is an interesting corner case, but probably not a high priority compared to the other stuff we're working on right now.

Severity: -- → S4
Priority: -- → P4

Understood. I'm happy to write a fix if there's an agreed-upon solution!
