RTP Header Extension IDs in Offer/Answer Exchange

RESOLVED FIXED in Firefox 55

Status


defect
P1
normal
Rank:
15
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: paulej, Assigned: drno)

Tracking

({cisco-spark})

55 Branch
mozilla55
Points:
---

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
When Firefox 55 (current nightly build) sends an offer, the SDP looks like this (abbreviated):

v=0
o=mozilla...THIS_IS_SDPARTA-55.0a1 3439260750675219731 0 IN IP4 ...
...
m=video 49540 UDP/TLS/RTP/SAVPF 120 121 126 97
...
a=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:2 urn:ietf:params:rtp-hdrext:toffset

The answer Cisco Spark is returning looks like this:

v=0
o=linus 0 1 IN IP4 ...
...
m=video 5004 UDP/TLS/RTP/SAVPF 126
...
a=extmap:3/sendrecv urn:ietf:params:rtp-hdrext:toffset

Noting the difference in value from "2" to "3" for the extension ID, Firefox is returning "Answer changed id for extmap attribute at level 1 (urn:ietf:params:rtp-hdrext:toffset) from 2 to 3."  Indeed, the server is using a different ID.

Consulting with the product team, the justification for the server's approach includes:
1) header extension IDs are called "local identifiers" per RFC 5285, suggesting they can be different values for each endpoint in a conference (and between the endpoint and conference server in this case);
2) nowhere in the RFC does it say the answer has to match the offer;
3) the text that says, "Identifiers values in the valid range MUST NOT be altered (remapped)" was taken to mean only that an endpoint cannot change its local ID from 1 to 3 from one offer to the next in the same call (i.e., it is not a restriction placed on what value the other endpoint assigns locally); and
4) in legacy 3PCC scenarios where a call control function initiates outbound calls to two endpoints in order to connect them (but might not be party to the media flow), there is no way to guarantee that the two endpoints will select the same identifier values.
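
For readers who want to reproduce the comparison Firefox is making, here is a minimal sketch in Python. It is not the Firefox implementation: the helper names `parse_extmaps` and `changed_ids` are made up for illustration, the SDP is reduced to the extmap lines quoted above, and m-section levels and offered IDs in the 4096-4351 range are ignored.

```python
import re

# Matches "a=extmap:<id>[/<direction>] <uri>"; anything after the URI is ignored.
EXTMAP_RE = re.compile(r"^a=extmap:(\d+)(?:/(\w+))?\s+(\S+)")


def parse_extmaps(sdp):
    """Return {uri: id} for every a=extmap line in the SDP text (hypothetical helper)."""
    ids = {}
    for line in sdp.splitlines():
        match = EXTMAP_RE.match(line.strip())
        if match:
            ids[match.group(3)] = int(match.group(1))
    return ids


def changed_ids(offer_sdp, answer_sdp):
    """List (uri, offered_id, answered_id) where the answer uses a different ID."""
    offered = parse_extmaps(offer_sdp)
    answered = parse_extmaps(answer_sdp)
    return [
        (uri, offered[uri], answered_id)
        for uri, answered_id in answered.items()
        if uri in offered and offered[uri] != answered_id
    ]


OFFER = """\
a=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:2 urn:ietf:params:rtp-hdrext:toffset
"""
ANSWER = """\
a=extmap:3/sendrecv urn:ietf:params:rtp-hdrext:toffset
"""

for uri, old, new in changed_ids(OFFER, ANSWER):
    print(f"Answer changed id for extmap attribute ({uri}) from {old} to {new}")
```

Run against the two excerpts, the sketch flags only toffset, since abs-send-time is absent from the answer.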
(Reporter)

Updated

2 years ago
Keywords: cisco-spark
(Assignee)

Comment 1

2 years ago
After reading 5285 I tend to agree that RTP header extensions don't need to be aligned between offerer and answerer. It's not great because you then need to look at the sending and the receiving table.

@Byron: what do you think?
Rank: 15
Depends on: 1344556
Flags: needinfo?(docfaraday)
Priority: -- → P1
Comment hidden (mozreview-request)
If the answerer is permitted to change the ids, it means that the offerer is unable to properly interpret RTP header extensions until the answer arrives, which is a violation of the general rule of "Be ready to receive pre-answer media."

As for 3PCC, if we assume that answerers don't change the values, putting the same thing in both offers should have the desired effect, so this doesn't clear things up either.

I dunno, I can see arguments for either side.

abr? What's your take?
Flags: needinfo?(docfaraday) → needinfo?(adam)
As a specific example of the expectation that RTP header extensions be intelligible pre-answer, consider the mid RTP extension. A big part of the rationale for the mid RTP extension was to allow bundled RTP/RTCP to be demuxed prior to reception of the answer; if the answerer gets to choose the id, however, that doesn't work out.
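
To make the demux concern concrete, here is a rough Python sketch of reading a MID out of an RFC 5285 one-byte-header extension block. The function names and the extension ID 3 in the usage lines are assumptions for illustration only. The point is that the packet carries just the numeric ID, so the receiver can only find the MID if it already knows which ID the sender uses for it.

```python
import struct


def one_byte_extensions(ext_block):
    """Parse an RFC 5285 one-byte-header extension block into {id: payload}.

    ext_block starts at the 0xBEDE marker, i.e. right after the fixed RTP
    header and any CSRC entries.
    """
    profile, words = struct.unpack_from("!HH", ext_block, 0)
    if profile != 0xBEDE:
        return {}
    data = ext_block[4:4 + 4 * words]
    elements = {}
    pos = 0
    while pos < len(data):
        # High nibble is the local ID, low nibble is (payload length - 1).
        ext_id, length = data[pos] >> 4, (data[pos] & 0x0F) + 1
        if ext_id == 0:   # padding byte, skip it
            pos += 1
            continue
        if ext_id == 15:  # reserved value: stop parsing
            break
        elements[ext_id] = data[pos + 1:pos + 1 + length]
        pos += 1 + length
    return elements


def mid_of(ext_block, mid_ext_id):
    """Return the MID string if an element with the given local ID is present."""
    payload = one_byte_extensions(ext_block).get(mid_ext_id)
    return payload.decode("ascii") if payload is not None else None


# One 32-bit word of extension data: ID 3, two bytes "v1", one padding byte.
block = b"\xbe\xde\x00\x01" + bytes([0x31]) + b"v1" + b"\x00"
print(mid_of(block, 3))  # receiver knows the sender's ID for MID
print(mid_of(block, 2))  # receiver guessed a different ID and finds nothing
```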
(Assignee)

Comment 5

2 years ago
(In reply to Byron Campen [:bwc] from comment #3)
> If the answerer is permitted to change the ids, it means that the offerer is
> unable to properly interpret RTP header extensions until the answer arrives,
> which is a violation of the general rule of "Be ready to receive pre-answer
> media."

Just to make sure we are on the same page: I think the Linus devs say the values of the IDs are not something you have to agree on. One side can send with ID 2 and the other with 5, as long as both indicated in the signaling/SDP that they are willing to support that extension.

Pre-answer media is a good point. But:
1) 5285 even allows inserting header extensions which have not been agreed on (an example is an extension for a relay) - I guess with the assumption/requirement that the receiver is supposed to ignore anything it doesn't understand
2) and 5285 mandates that header extensions only carry non-vital information. And I think it's true that we can start rendering or at least decoding audio or video without understanding any of the header extensions we are using so far.
(Assignee)

Comment 6

2 years ago
(In reply to Byron Campen [:bwc] from comment #4)
> As a specific example of the expectation that RTP header extensions be
> intelligible pre-answer, consider the mid RTP extension. A big part of the
> rationale for the mid RTP extension was to allow bundled RTP/RTCP to be
> demuxed prior to reception of the answer; if the answerer gets to choose the
> id, however, that doesn't work out.

Point taken. I think MID violates the 5285 requirement of only carrying non-vital information.
(Reporter)

Comment 7

2 years ago
Even if an offer contains the ID values for some extension, it doesn't mean the offerer will know what to do with RTP header extensions when they arrive.  If you look at the example in section 6 of RFC 5285, you'll see these lines in the offer:

   a=extmap:1 URI-toffset
   a=extmap:14 URI-obscure
   a=extmap:4096 URI-gps-string
   a=extmap:4096 URI-gps-binary
   a=extmap:4097 URI-frametype
 
The answering device in that example returned this in the answer:

   a=extmap:1 URI-toffset
   a=extmap:2/recvonly URI-gps-string
   a=extmap:3 URI-frametype

Since the offer did not offer ID=2 or ID=3 explicitly, it would have to wait for the SDP answer to arrive before it could interpret those RTP header extensions.  We simply do not have a mechanism in any SDP-based offer/answer protocol today that can ensure signaling happens before media.  Being able to handle early media is desirable, but we need to invent a new protocol to make it mandatory.
(Assignee)

Comment 8

2 years ago
(In reply to Paul E. Jones from comment #7)
> Even if an offer contains the ID values for some extension, it doesn't mean
> it will know what to do with RTP header extensions when they arrive.  If you
> look at the example in section 6 of RFC 5285, you'll see these lines in the
> offer:
> 
>    a=extmap:1 URI-toffset
>    a=extmap:14 URI-obscure
>    a=extmap:4096 URI-gps-string
>    a=extmap:4096 URI-gps-binary
>    a=extmap:4097 URI-frametype
>  
> The answering device in that example returned this in the answer:
> 
>    a=extmap:1 URI-toffset
>    a=extmap:2/recvonly URI-gps-string
>    a=extmap:3 URI-frametype
> 
> Since the offer did not offer ID=2 or ID=3 explicitly, it would have to wait
> for the SDP answer to arrive before it could interpret those RTP header
> extensions.  We simply do not have a mechanism in any SDP-based offer/answer
> protocol today that can ensure signaling happens before media.  Being able
> to handle early media is desirable, but we need to invent a new protocol to
> make it mandatory.

Just to clarify: Firefox never offers anything in the 4000 range, where the answerer has to move the extension into the valid range in its answer. So this is just a theoretical example for discussion purposes only. :-)

In general it would be good/nice if we could treat media before the SDP answer as optional. The real point I think Byron tried to make is that the ID value for the MID header extension needs to be known on a bundled transport.

In other words: by using bundle and 5285 you have to throw away any media which arrives before the SDP answer, because you have no clue to which of your media pipelines you have to route the RTP packet to.

Comment 9

2 years ago
Not clearing my NI? yet because I haven't had time to dig into the relevant RFCs; however, I did want to answer this:

> In general it would be good/nice if we could treat media before
> the SDP answer as optional.

When using both ICE and DTLS-SRTP (as we must in WebRTC), you can't process media before the answer. Pasting from a different conversation I had on the topic:

This has been an on-again-off-again topic in WebRTC for several years, with certain parties claiming that early media can arise under certain poorly-explained situations, and others laying out what sound like reasonable explanations of what data gets where to show that it cannot happen. See, for example, Peter Thatcher's two comments starting at <https://github.com/w3c/webrtc-pc/issues/849#issuecomment-290514459>.

The MMUSIC discussion (thread currently ending at <https://www.ietf.org/mail-archive/web/mmusic/current/msg17801.html>) also seems to be reaching the conclusion that the situation cannot arise in browsers, which would mean that this is not a concern for WebRTC at all; I've attempted to summarize the situation here: <https://github.com/w3c/webrtc-pc/pull/1026#issuecomment-291194679>.
(Assignee)

Comment 10

2 years ago
Good point. I forgot about the certificate fingerprints for DTLS. So we have to rely on the SDP answer being present, which then tells us the remote's header extension IDs.
But that still leaves us with the question of whether the IDs need to be matched or not...
Right, and those points mean that those of us in webrtc-land tend to forget about early media, and the requirements to make it work. But these specs we're debating are not just for webrtc.

Moreover, there is actually a case where early media is possible. Consider an established session that does not use bundle (let's say because it has only one m-section). ICE and DTLS are done, and media flows. Then, one side sends an offer that has an additional m-section (let's say it has the same payload types), which it bundles with the pre-existing m-section (the pre-existing m-section is the bundle-tag here). The transport for the pre-existing m-section is reusable here for the new m-section, but now the offerer needs to see some RTP MID header extensions to demux.
(not that it is the end of the world if we can't demux the new stream; but still)
tl;dr -- They have to be symmetrical.

I've read over section 6 of RFC 5285 several times now, and Paul is right that there is no normative prohibition on asymmetric IDs. HOWEVER, there are two facts that point towards a need for the IDs being the same in both directions.

The first is that RFC 5285 does not indicate whether the number is what the SDP generator expects to *receive* for the associated URI, or what it is announcing it will *send* for the associated URI. If the numbers are allowed to be asymmetrical, then we would need a clear indication here. There is precedent for both: for PTs, you announce what the other party must send. For MIDs and RIDs, you announce what you will send. Without clear language here, it's not clear whether Paul's SDP in Comment 0 means that they expect to *send* toffset as 2 and *receive* it as 3, or to *send* as 3 and *receive* as 2. This points to a need for symmetric values.

Even more compelling is the fact that the technique of remapping header extensions from the 4096-4351 range into the valid range simply doesn't work otherwise. As much as I hate to rely on examples, the example in section 6 serves as a very useful illustration here: the offerer indicates "4097 URI-frametype" in its offer, giving the answerer the opportunity to turn it on if it so desires. The answerer moves this to "3 URI-frametype" (n.b.: THIS IS SENDRECV), indicating that it intends to both *send* URI-frametype with an ID of 3, and *receive* URI-frametype with an ID of 3. This simply doesn't work -- it *can't* work -- unless IDs are required to be symmetrical.
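
As an illustration of the symmetric reading, the following Python sketch checks an answer against an offer using the section 6 example. It is only a sketch under assumptions: `check_answer` and its message strings are invented here, direction attributes are dropped, and each SDP is reduced to a list of (id, uri) pairs. IDs offered in the valid range must come back unchanged, while IDs offered in 4096-4351 must be moved by the answerer to an unused valid ID, which then applies in both directions.

```python
REMAP_RANGE = range(4096, 4352)  # IDs the answerer has to replace (4096-4351)


def check_answer(offer, answer):
    """offer/answer: lists of (id, uri). Return a list of problems found."""
    offered = {uri: ext_id for ext_id, uri in offer}
    # IDs the offerer already pinned in the valid range; the answerer may not reuse them.
    fixed_ids = {i for i in offered.values() if i not in REMAP_RANGE}
    problems = []
    for ext_id, uri in answer:
        if uri not in offered:
            problems.append(f"{uri}: not offered")
        elif offered[uri] in REMAP_RANGE:
            if ext_id in REMAP_RANGE or ext_id in fixed_ids:
                problems.append(f"{uri}: {ext_id} is not a free valid ID")
        elif ext_id != offered[uri]:
            problems.append(f"{uri}: changed from {offered[uri]} to {ext_id}")
    return problems


RFC_OFFER = [
    (1, "URI-toffset"),
    (14, "URI-obscure"),
    (4096, "URI-gps-string"),
    (4096, "URI-gps-binary"),
    (4097, "URI-frametype"),
]
RFC_ANSWER = [(1, "URI-toffset"), (2, "URI-gps-string"), (3, "URI-frametype")]

print(check_answer(RFC_OFFER, RFC_ANSWER))

# The situation from Comment 0: toffset offered as 2, answered as 3.
TOFFSET = "urn:ietf:params:rtp-hdrext:toffset"
print(check_answer([(2, TOFFSET)], [(3, TOFFSET)]))
```

Under this reading the RFC example passes cleanly, and the answer from Comment 0 is the only case reported as a changed ID.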

I suspect what happened here is that the authors didn't even imagine that implementations would try to do this in an asymmetrical pattern, so they didn't think to expressly prohibit it.

I also agree with Paul that this scheme doesn't work with 3PCC. A lot doesn't work with 3PCC. 3PCC is a trick that JDR noticed could be played with SIP rather than a designed-in feature. It works, sometimes, kind of, under certain circumstances. The use of RTP header extensions falls outside those circumstances.
Flags: needinfo?(adam)
FWIW, on careful reading, I'm pretty sure the adjective "local" in the term "local identifier" is meant to convey "local to this media session," not "local to a single network element."

Comment 15

2 years ago
mozreview-review
Comment on attachment 8863593 [details]
Bug 1361206: warn about non-matching RTP header extension IDs.

https://reviewboard.mozilla.org/r/135364/#review140224
Attachment #8863593 - Flags: review?(docfaraday)
(Assignee)

Comment 16

2 years ago
Thanks Adam, your arguments make a lot of sense to me. That would mean we leave the ID verification code in Firefox. But that raises the question of how quickly Cisco can fix this on their end.

Paul: please let us know if we should temporarily disable this new code, until Cisco is able to fix this.

Comment 17

2 years ago
I agree with Adam. RFC seems pretty clear on this.
(Assignee)

Updated

2 years ago
Blocks: 1363900
Comment hidden (mozreview-request)
(Assignee)

Comment 19

2 years ago
The plan is to turn this into a warning for now, to give Cisco enough time to fix it, and then turn it back into an error later (see bug 1363900).
Assignee: nobody → drno

Comment 20

2 years ago
mozreview-review
Comment on attachment 8863593 [details]
Bug 1361206: warn about non-matching RTP header extension IDs.

https://reviewboard.mozilla.org/r/135364/#review141570
Attachment #8863593 - Flags: review?(docfaraday) → review+

Comment 21

2 years ago
Pushed by drno@ohlmeier.org:
https://hg.mozilla.org/integration/autoland/rev/3d48e05919b3
warn about non-matching RTP header extension IDs. r=bwc

Comment 22

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/3d48e05919b3
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55