DTLS message reassembly & retransmission could be better

NEW
Unassigned

Status

()

Core
WebRTC: Networking
P5
normal
Rank:
42
5 years ago
10 months ago

People

(Reporter: Andrey Kovalenko, Unassigned)

Tracking

22 Branch
x86
Windows 7
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

5 years ago
Created attachment 772077 [details]
Handshake packet trace, Firefox is a client

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0 (Beta/Release)
Build ID: 20130618035212

Steps to reproduce:

Try to establish a PeerConnection, which involves a DTLS handshake, over an LTE network.


Actual results:

On most networks, DTLS packets arrive in the correct order and the session is established correctly.
But some networks (LTE, for example) frequently reorder packets in transit. In that case Firefox does not sort them by sequence number, and the whole flight is discarded. Since UDP gives no guarantee that packets arrive in order, the whole session can fail even though every packet was transmitted successfully.


Expected results:

DTLS packets should have been sorted by sequence number, the whole message should have been reconstructed, and the DTLS handshake should have continued.
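
The expected behavior can be illustrated with a small reordering buffer. This is only a sketch and not Firefox or NSS code: the class name `HandshakeReorderBuffer` and its methods are hypothetical, it keys messages on the handshake `message_seq` value, and it assumes each message arrives whole (fragment reassembly by offset is left out).

```python
from typing import Dict, List


class HandshakeReorderBuffer:
    """Hypothetical buffer that releases handshake messages in message_seq order."""

    def __init__(self, first_seq: int = 0) -> None:
        self.next_seq = first_seq            # next sequence number we can deliver
        self.pending: Dict[int, bytes] = {}  # early arrivals, keyed by sequence number

    def receive(self, seq: int, payload: bytes) -> List[bytes]:
        """Accept one message and return every message that is now deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # already delivered or already buffered: ignore the duplicate
        self.pending[seq] = payload
        ready: List[bytes] = []
        # Drain the buffer for as long as the next expected message is present.
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

With arrivals in the order 1 2 4 3, message 4 is held back until 3 arrives, and then both are delivered together, so no retransmission is needed.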
(Reporter)

Updated

5 years ago
Summary: DTLS messages not combined correctly if packets are received in wrong order → WebRTC: DTLS messages not combined correctly if packets are received in wrong order

Comment 1

5 years ago
Do you think you could provide this as regular PCAP? I would like to look at it using tcpdump.
(Reporter)

Comment 2

5 years ago
Created attachment 774061 [details]
Same as previous, but in pcap format
Attachment #772077 - Attachment is obsolete: true
(Reporter)

Comment 3

5 years ago
Created attachment 774073 [details]
DTLS handshake in correct packet order

This is a packet trace of a DTLS handshake with the same server software sending the same certificate, but with the packets received in the correct order. So the certificate itself and each individual packet seem to be correct.

Updated

5 years ago
Component: Untriaged → WebRTC: Networking
Product: Firefox → Core

Comment 4

5 years ago
Is this the entire trace? Both sides should be retransmitting their
flights.

The DTLS stack currently does ignore out of order HS messages, so what
should happen is that it should accept the ServerHello and Certificate
but ignore the ServerHelloDone and CertificateRequest because they come
after the Certificate is complete. Then, on the retransmit, it should
accept them.
(Reporter)

Comment 5

5 years ago
Yes, it's the entire trace. The original one was bigger, but I extracted the DTLS packets from it. For a further 30 seconds there were no retransmissions from either side.

We're working now on fixing it on our side (adding server retransmissions). I'll report if it makes any difference.

But there's no guarantee that retransmitted flights will arrive in the correct order. Will the Firefox DTLS stack ignore them as well?

Comment 6

5 years ago
What it does is process any in-order packet, so if you get, say,

1 2 4 3

It stores and processes 1 2 3

Then when 1 2 4 3 is retransmitted it processes 4
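
The behavior described above can be modeled in a few lines. This is a simplified simulation, not the actual stack: `InOrderOnlyReceiver` and `receive_flight` are hypothetical names, and packets are represented by bare sequence numbers.

```python
from typing import Iterable, List


class InOrderOnlyReceiver:
    """Hypothetical model: accept a packet only if it is the next one expected."""

    def __init__(self, first_seq: int = 1) -> None:
        self.next_seq = first_seq
        self.processed: List[int] = []

    def receive_flight(self, arrival_order: Iterable[int]) -> List[int]:
        """Process one (re)transmission; return the packets accepted this time."""
        accepted: List[int] = []
        for seq in arrival_order:
            if seq == self.next_seq:
                self.processed.append(seq)
                accepted.append(seq)
                self.next_seq += 1
            # Anything else is a duplicate or out of order: dropped, not buffered.
        return accepted


rx = InOrderOnlyReceiver()
first = rx.receive_flight([1, 2, 4, 3])   # packet 4 is dropped on the first pass
second = rx.receive_flight([1, 2, 4, 3])  # the retransmission supplies packet 4
```

In this model the handshake only completes if the peer retransmits the flight, which is why the missing server retransmissions mattered in the trace.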

Comment 7

5 years ago
BTW, I believe that Firefox is behaving correctly here, modulo the fact
that it could try harder to reassemble. I.e., it should not be retransmitting
since the fact that the ServerHello got through means that the client
(Firefox's) data got through. it's the server's job to retransmit.
(Reporter)

Comment 8

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #6)
> What it does is process any in-order packet, so if you get, say,
> 
> 1 2 4 3
> 
> It stores and processes 1 2 3
> 
> Then when 1 2 4 3 is retransmitted it processes 4

That's great news. Everything should work fine at last.
(Reporter)

Comment 9

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #7)
> BTW, I believe that Firefox is behaving correctly here, modulo the fact
> that it could try harder to reassemble. I.e., it should not be retransmitting
> since the fact that the ServerHello got through means that the client
> (Firefox's) data got through. it's the server's job to retransmit.

I'm not sure about that. The RFC states that
"Partial reads (whether partial messages or only some of the messages in the
      flight) do not cause state transitions or timer resets."

So, if I understand it correctly, it should keep retransmitting until it gets all the packets of a particular flight (since it drops packet 4 in your example).

Comment 10

5 years ago
Huh. You're right it does say that. I will try to remember why we wrote that,
given my reasoning above.
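
One way to read the quoted RFC text is that the retransmit timer stays armed until the peer's flight is complete. The sketch below follows that reading only. `FlightTimer` is a hypothetical name, the set of expected message numbers is assumed to be known up front (a real stack infers completion from the message types), and the 1-second initial timeout with doubling up to 60 seconds is an assumption based on the RFC 6347 recommendation.

```python
from typing import Set


class FlightTimer:
    """Hypothetical tracker: partial reads never cancel or reset the timer."""

    def __init__(self, expected: Set[int],
                 initial_timeout: float = 1.0,
                 max_timeout: float = 60.0) -> None:
        self.expected = set(expected)      # messages that make up the peer's flight
        self.received: Set[int] = set()
        self.timeout = initial_timeout     # assumed default, in seconds
        self.max_timeout = max_timeout     # assumed cap, in seconds
        self.armed = True

    def on_message(self, seq: int) -> None:
        """Record a message; only a complete flight disarms the timer."""
        if seq in self.expected:
            self.received.add(seq)
        if self.received == self.expected:
            self.armed = False             # full flight: state transition

    def on_timeout(self) -> bool:
        """Return True if our last flight should be retransmitted."""
        if not self.armed:
            return False
        self.timeout = min(self.timeout * 2, self.max_timeout)  # exponential backoff
        return True
```

Under this reading, a client that has seen only packets 1 2 3 of a four-packet flight would retransmit on timeout, which in turn prompts the server to resend its flight.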
(Reporter)

Comment 11

5 years ago
(In reply to akvakh from comment #9)
> (In reply to Eric Rescorla (:ekr) from comment #7)
> > BTW, I believe that Firefox is behaving correctly here, modulo the fact
> > that it could try harder to reassemble. I.e., it should not be retransmitting
> > since the fact that the ServerHello got through means that the client
> > (Firefox's) data got through. it's the server's job to retransmit.
> 
> I'm not sure about that. RFC states that 
> "Partial reads (whether partial messages or only some of the messages in the
>       flight) do not cause state transitions or timer resets."
> 
> So, if I get it correctly, it should keep retransmitting until it gets all
> packets from particular flight (since it drops packet 4 in your example).

Anyway, it doesn't say "must not", so in the real world both approaches seem to be quite reasonable.
(Reporter)

Comment 12

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #6)
> What it does is process any in-order packet, so if you get, say,
> 
> 1 2 4 3
> 
> It stores and processes 1 2 3
> 
> Then when 1 2 4 3 is retransmitted it processes 4

We implemented server retransmissions on our side, and it helps to avoid this issue.
Changing title to reflect the remaining issue.
Status: UNCONFIRMED → NEW
backlog: --- → webRTC+
Rank: 42
Ever confirmed: true
Priority: -- → P4
Summary: WebRTC: DTLS messages not combined correctly if packets are received in wrong order → DTLS message reassembly & retransmission could be better
Mass change P4->P5 to align with new Mozilla triage process.
Priority: P4 → P5