Open Bug 1964973 Opened 13 days ago Updated 11 days ago

Webrtc ICE candidate priority incorrect in STUN BINDING

Categories

(Core :: WebRTC: Networking, defect, P2)

Firefox 138
defect

Tracking

()

UNCONFIRMED

People

(Reporter: kventers, Assigned: bwc)

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36 Edg/137.0.0.0

Steps to reproduce:

Make a WebRTC call using Firefox on an IPv4-only network with iceTransportPolicy: 'relay'.

Actual results:

Firefox sends STUN BINDING requests for multiple candidate pairs with identical priority.
As Firefox is the ICE CONTROLLER, the remote side non-deterministically considers one of these pairs the highest-priority pair, depending on the order in which the requests arrive over the network.
Media connectivity drops.

Firefox nominates pair COmx with priority 57e1fff, and keeps using this:

[Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890) /CAND-PAIR(5wlz): Pairing candidate IP4:20.202.1.223:50604/UDP (57e1fff):IP4:52.113.121.225:3480/UDP (337ffff) priority=231935376698916863 (337ffff0afc3fff)
[Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE-PEER(PC:{37eefef4-d272-4df0-93b1-780841bcd334} 1746367271465000 (id=12884901890 url=https://latest-webclient.skype.com/?userId=0):default)/STREAM(PC:{37eefef4-d272-4df0-93b1-780841bcd334} 1746367271465000 (id=12884901890) transport-id=transport_0 - 8c6a51a8:0c2a03ea4414621ba1f509584350eb64)/COMP(1)/CAND-PAIR(COmx): nominated pair is COmx|IP4:20.202.1.223:50783/UDP|IP4:52.113.121.225:3480/UDP(turn-relay(IP4:10.11.142.98:0/TCP|IP4:20.202.1.223:50783/UDP)|candidate:1 1 UDP 54001663 52.113.121.225 3480 typ relay raddr 10.0.16.9 rport 3480)

However, it sends two STUN BINDING requests to the peer, one for UDP and one for TCP:

Packet#71902 2025-05-04 17:00:41,151986    10.11.142.98      20.202.1.6  STUN  194         7953,443    Binding Request user:
DATA
Attribute Type: DATA
Attribute Length: 96
Value: 0001004c2112a4429cb3d82fca9c9f1c88ce2b410006000d7876356c3a38663066613762370000000025000000240004057e1fff802a00082f55c84291b96db10008001472c7fc44a09dd225b46e0007a60ea55898a07055802800048d41be5f
Session Traversal Utilities for NAT
[Response In: 72034]
Message Type: 0x0001 (Binding Request)
Message Length: 76
Message Cookie: 2112a442
Message Transaction ID: 9cb3d82fca9c9f1c88ce2b41
[STUN Network Version: RFC-5389/8489 (3)]
Attributes
USE-CANDIDATE
PRIORITY
Attribute Type: PRIORITY
Attribute Length: 4
Priority: 92151807
ICE-CONTROLLING
MESSAGE-INTEGRITY
FINGERPRINT

Packet#71972 2025-05-04 17:00:41,230900    10.11.142.98      20.202.1.223      STUN  182   50195,3478        Binding Request user:
DATA
Attribute Type: DATA
Attribute Length: 96
Value: 0001004c2112a4428be068488fbbcd27389585bf0006000d7876356c3a38663066613762370000000025000000240004057e1fff802a00082f55c84291b96db100080014954017deb7df6a3ce4345e3270d72a22556f21c2802800040a3d800e
Session Traversal Utilities for NAT
[Duplicated original message in: 71881]
[Response In: 72033]
Message Type: 0x0001 (Binding Request)
Message Length: 76
Message Cookie: 2112a442
Message Transaction ID: 8be068488fbbcd27389585bf
[STUN Network Version: RFC-5389/8489 (3)]
Attributes
USE-CANDIDATE
PRIORITY
Attribute Type: PRIORITY
Attribute Length: 4
Priority: 92151807
ICE-CONTROLLING
MESSAGE-INTEGRITY
FINGERPRINT
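To double-check the dissector output, the PRIORITY attribute can be read straight out of the two DATA payloads above. This is a minimal sketch, assuming the hex values are complete STUN messages as captured; the names `stun_attributes`, `priority_of`, `PACKET_71902` and `PACKET_71972` are made up for illustration and are not part of Firefox or Wireshark.

```python
import struct

STUN_MAGIC_COOKIE = 0x2112A442
ATTR_PRIORITY = 0x0024  # PRIORITY attribute type used by ICE


def stun_attributes(message: bytes):
    """Yield (type, value) for each attribute in a STUN message."""
    _msg_type, length, cookie = struct.unpack_from("!HHI", message, 0)
    if cookie != STUN_MAGIC_COOKIE:
        raise ValueError("not a STUN message")
    offset, end = 20, 20 + length  # 20-byte header, then the attributes
    while offset < end:
        attr_type, attr_len = struct.unpack_from("!HH", message, offset)
        yield attr_type, message[offset + 4 : offset + 4 + attr_len]
        # Attribute values are padded to a 4-byte boundary.
        offset += 4 + attr_len + (-attr_len % 4)


def priority_of(hex_payload: str) -> int:
    """Return the PRIORITY value carried in a hex-encoded STUN message."""
    for attr_type, value in stun_attributes(bytes.fromhex(hex_payload)):
        if attr_type == ATTR_PRIORITY:
            return int.from_bytes(value, "big")
    raise ValueError("no PRIORITY attribute")


# DATA values copied from packets 71902 and 71972 above.
PACKET_71902 = (
    "0001004c2112a4429cb3d82fca9c9f1c88ce2b410006000d7876356c3a3866306661"
    "3762370000000025000000240004057e1fff802a00082f55c84291b96db100080014"
    "72c7fc44a09dd225b46e0007a60ea55898a07055802800048d41be5f"
)
PACKET_71972 = (
    "0001004c2112a4428be068488fbbcd27389585bf0006000d7876356c3a3866306661"
    "3762370000000025000000240004057e1fff802a00082f55c84291b96db100080014"
    "954017deb7df6a3ce4345e3270d72a22556f21c2802800040a3d800e"
)

priorities = [priority_of(PACKET_71902), priority_of(PACKET_71972)]
print([hex(p) for p in priorities])
```

Both payloads carry the same 4-byte PRIORITY value, 0x057e1fff (92151807 in decimal), which matches the dissected output.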

Expected results:

Firefox sends STUN BINDING with unique priorities for each candidate pair, so that ICE CONTROLLED side can correctly select the highest priority pair to use.

Component: Untriaged → WebRTC: Networking
Product: Firefox → Core

As Firefox is the ICE CONTROLLER, the remote side non-deterministically considers one of these pairs the highest-priority pair, depending on the order in which the requests arrive over the network.

Firefox uses aggressive nomination, so I believe this works as intended: the first path that proves connectivity wins (though I've been wrong before so let me cc Byron to confirm).

The controlled peer chooses the first valid pair when the controlling side uses aggressive nomination. It does not look at the PRIORITY attribute in the incoming checks; it sorts its checklist using a formula and then honors the first check that arrives with USE-CANDIDATE.

Priority: 92151807

RFC 8445 section 7.1.1 says PRIORITY is "for the local candidate". In contrast, pair priority is derived from it but never sent on the wire AFAIK.
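For reference, here is a minimal sketch of how the pair priority is derived from the two candidate priorities, using the RFC 8445 formula 2^32 * MIN(G, D) + 2 * MAX(G, D) + (G > D ? 1 : 0), where G is the controlling agent's candidate priority and D the controlled agent's. The function name `pair_priority` is made up for illustration; the input values are the ones from the 5wlz pairing line in the description, assuming Firefox is the controlling side.

```python
def pair_priority(controlling: int, controlled: int) -> int:
    """Pair priority per RFC 8445: computed locally, never sent on the wire."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


# Candidate priorities from the 5wlz pairing log line.
local_priority = 0x57E1FFF   # Firefox relay candidate (controlling side)
remote_priority = 0x337FFFF  # peer relay candidate (controlled side)

priority = pair_priority(local_priority, remote_priority)
print(priority, hex(priority))
```

This reproduces the logged value priority=231935376698916863 (337ffff0afc3fff), so two pairs that share the same remote candidate and the same local candidate priority end up with the same pair priority.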

Flags: needinfo?(docfaraday)

Thanks for the comment Jan-Ivar! I think I am starting to understand what is happening:

Consider these two pairs htoS and 5wlz:

Line   4148: [Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890)/CAND-PAIR(5wlz): Pairing candidate IP4:20.202.1.223:50604/UDP (57e1fff):IP4:52.113.121.225:3480/UDP (337ffff) priority=231935376698916863 (337ffff0afc3fff)
Line   4150: [Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890)/CAND-PAIR(htoS): Pairing candidate IP4:20.202.1.234:52455/UDP (7e1fff):IP4:52.113.121.225:3480/UDP (337ffff) priority=35501027250667518 (7e1fff066ffffe)

STUN BINDING requests for these are fired in short succession with identical priority, as noted in the bug description.

We see that the controlled peer does choose the first valid pair; the following is from our internal logs on the peer for the candidate pairs where traffic is expected. The Binding for the pair with port 52455 is received 150 ms earlier than the one for port 50604, making htoS the first successful aggressive-nomination pair:

05-04-2025 17:00:40 2025-05-04 14:00:40.184 :ProcessBindingResponse: Pair is Ready and updated to Succeeded, IceCandidatePair{ P:0x0337ffff0afc3fff L:IceCandidate{D, F:1 Rtp:{ {IP:10.0.16.x:3480, ID:{d9a70e4b48805179}}, base:10.0.16.x:3480, rel:10.0.16.x:3480, bw:0, p:0x0337ffff, pipe:UDP, nic:Ethernet}, Rtcp:{Mux}} R:IceCandidate{F: Rtp:{PeerDerived, {IP:20.202.1.x:52455}, base:20.202.1.x:52455, rel:20.202.1.x:52455, bw:0, p:0x057e1fff, pipe:UDP, nic:Other}, Rtcp:{Mux}} DL:52.113.121.x:3480,52.113.121.x:3480}
05-04-2025 17:00:40 2025-05-04 14:00:40.651 :ProcessBindingResponse: Pair is Ready and updated to Succeeded, IceCandidatePair{ P:0x0337ffff0afc3fff L:IceCandidate{D, F:1 Rtp:{ {IP:10.0.16.x:3480, ID:{d9a70e4b48805179}}, base:10.0.16.x:3480, rel:10.0.16.x:3480, bw:0, p:0x0337ffff, pipe:UDP, nic:Ethernet}, Rtcp:{Mux}} R:IceCandidate{D, F:1 Rtp:{TurnUDP, {IP:20.202.1.x:50604}, base:<null>, rel:20.202.1.x:50604, bw:0, p:0x057e1fff, pipe:None, nic:Other}, Rtcp:{Mux}} DL:52.113.121.x:3480,52.113.121.x:3480}

Back on the browser side, however, the STUN BINDING success responses are received in the opposite order, a single packet apart:

Packet# 72033	2025-05-04 17:00:41,291503	20.202.1.223	10.11.142.98	STUN	150	3478,50195		Binding Success Response XOR-MAPPED-ADDRESS: 20.202.1.223:50604	
Packet# 72034	2025-05-04 17:00:41,291503	20.202.1.6	10.11.142.98	STUN	162		443,7953	Binding Success Response XOR-MAPPED-ADDRESS: 20.202.1.234:52455	

It appears that, based on this network response order, Firefox considers pair 5wlz as #1 and htoS as #2 (the opposite of the peer).
Firefox then proceeds to cancel htoS:

Line  23987: [Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE-PEER(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890):default)/CAND-PAIR(5wlz): setting pair to state SUCCEEDED: 5wlz|IP4:20.202.1.223:50604/UDP|IP4:52.113.121.225:3480/UDP(turn-relay(IP4:10.11.142.98:50195/UDP|IP4:20.202.1.223:50604/UDP)|candidate:1 1 UDP 54001663 52.113.121.225 3480 typ relay raddr 10.0.16.9 rport 3480)
Line  23988: [Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE-PEER(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890 ):default)/STREAM(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890) transport-id=transport_0 - 8f0fa7b7:85598e91161b0277805b0396650f8c77)/COMP(1)/CAND-PAIR(5wlz): nominated pair is 5wlz|IP4:20.202.1.223:50604/UDP|IP4:52.113.121.225:3480/UDP(turn-relay(IP4:10.11.142.98:50195/UDP|IP4:20.202.1.223:50604/UDP)|candidate:1 1 UDP 54001663 52.113.121.225 3480 typ relay raddr 10.0.16.9 rport 3480)
Line  23995: [Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE-PEER(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890):default)/CAND-PAIR(htoS): setting pair to state SUCCEEDED: htoS|IP4:20.202.1.234:52455/UDP|IP4:52.113.121.225:3480/UDP(turn-relay(IP4:10.11.142.98:0/TCP|IP4:20.202.1.234:52455/UDP)|candidate:1 1 UDP 54001663 52.113.121.225 3480 typ relay raddr 10.0.16.9 rport 3480)
Line  24096: [Socket 4972: Socket Thread]: D/mtransport (ice/ERR) CAND-PAIR(htoS): pair htoS|IP4:20.202.1.234:52455/UDP|IP4:52.113.121.225:3480/UDP(turn-relay(IP4:10.11.142.98:0/TCP|IP4:20.202.1.234:52455/UDP)|candidate:1 1 UDP 54001663 52.113.121.225 3480 typ relay raddr 10.0.16.9 rport 3480): state=SUCCEEDED, priority=0x7e1fff066ffffe
Line  24102: [Socket 4972: Socket Thread]: D/mtransport (ice/INFO) ICE-PEER(PC:{4c8e264f-f525-45cd-9acc-0d60b3ba49b5} 1746367238748000 (id=12884901890):default)/CAND-PAIR(htoS): setting pair to state CANCELLED: htoS|IP4:20.202.1.234:52455/UDP|IP4:52.113.121.225:3480/UDP(turn-relay(IP4:10.11.142.98:0/TCP|IP4:20.202.1.234:52455/UDP)|candidate:1 1 UDP 54001663 52.113.121.225 3480 typ relay raddr 10.0.16.9 rport 3480)

Ok, so you're saying it looks like we're assigning the same priority to TURN UDP and TURN TCP candidates? That's definitely not what the code is trying to do, but I'll look into it.

Assignee: nobody → docfaraday
Flags: needinfo?(docfaraday)
Severity: -- → S3
Priority: -- → P2

Ok, I see the problem now. We're computing candidate priority from the candidate's base, but when a relay candidate finishes allocating, its base becomes the external IP address of the TURN server, transforming it into a UDP base, which upsets the priority (and results in the duplicate). This shouldn't be a super difficult fix.
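To illustrate the effect on the numbers in this bug, the sketch below splits a candidate priority into its fields using the RFC 8445 formula priority = 2^24 * (type preference) + 2^8 * (local preference) + (256 - component ID). The helper names `PriorityParts`, `split_priority` and `make_priority` are hypothetical and not taken from the Firefox source, and which type preference Firefox intends for each relay transport is an assumption not stated in this bug; the two input values are the local candidate priorities from the htoS and 5wlz pairing lines.

```python
from typing import NamedTuple


class PriorityParts(NamedTuple):
    type_preference: int   # top 8 bits
    local_preference: int  # middle 16 bits
    component_id: int      # stored on the wire as 256 - component ID


def split_priority(priority: int) -> PriorityParts:
    """Decompose a 32-bit ICE candidate priority into its three fields."""
    return PriorityParts(
        type_preference=priority >> 24,
        local_preference=(priority >> 8) & 0xFFFF,
        component_id=256 - (priority & 0xFF),
    )


def make_priority(parts: PriorityParts) -> int:
    """Recombine the fields into a candidate priority."""
    return (
        (parts.type_preference << 24)
        + (parts.local_preference << 8)
        + (256 - parts.component_id)
    )


# Local candidate priorities as logged when the pairs were formed.
htos = split_priority(0x7E1FFF)   # relay allocated over TCP (htoS)
wlz = split_priority(0x57E1FFF)   # relay allocated over UDP (5wlz)
print(htos)
print(wlz)

# Recomputing the htoS candidate with the other type preference
# reproduces the duplicate value seen in the STUN requests.
recomputed = make_priority(htos._replace(type_preference=wlz.type_preference))
print(hex(recomputed))
```

The two logged priorities differ only in the type-preference field (0 versus 5), so recomputing the htoS candidate with the other preference yields 0x57e1fff, the value seen in both STUN BINDING requests.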
