Closed Bug 1529612 Opened 11 months ago Closed 8 months ago

RTCDataChannel.bufferedAmount is updated too soon after sending data

Categories

(Core :: WebRTC: Signaling, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: bwc, Assigned: bwc)

References

Details

Attachments

(5 files)

The spec says:

"Even though the data transmission can occur in parallel, the returned value MUST NOT be decreased before the current task yielded back to the event loop to prevent race conditions."

This is causing failures in testing/web-platform/tests/webrtc/RTCDataChannel-bufferedAmount.html

It seems that we need to cache this value on main, increase it when calling send(), and decrease it by dispatching events to main.

Priority: -- → P2
Assignee: nobody → docfaraday
Attachment #9059607 - Attachment description: Bug 1529612: (WIP) Re-implement bufferedAmount to only update on main. → Bug 1529612 - Part 0: Re-enable a test-case. r?mjf

Still seeing failures on linux mochitest. Probably some timing bug.

Yeah, mochitest on linux is seeing failures on try, but I cannot reproduce on linux locally. Going to need to figure this out via code inspection.

At least part of this is bad code in pc.js, but it is not that simple.

Blocks: 1548272

Ok, none of the usual ways to enable logging on try are working.

This patch does not make it happen:

https://hg.mozilla.org/try/rev/153184fa3292751223ddb562093cd1b7c1b02dbe

Nor does passing a MOZ_LOG env variable to mach try like so:

--env="MOZ_LOG=DataChannel:5"

I'm going to need to hard-code a modification to the logging system, I think.

ahal, is there some new way of doing this?

Flags: needinfo?(ahal)

Right. So even after hard-coding the logging system to use DataChannel at 5, there's still no logging output in those try runs. I'm going to guess that MOZ_LOG is just not logging on try right now. R_LOG seems to work, so maybe printf will work.

I'm not familiar with Gecko's logging systems. Maybe someone replaced the MOZ_LOG env and forgot to change mochitest? Though I don't recall any announcements to that effect. Does setting MOZ_LOG locally still work? I can't think what would be different on try vs local.

Flags: needinfo?(ahal)

MOZ_LOG locally still works. Something is filtering out or disabling MOZ_LOG on try however.

It really does looks like DataChannel messages are being processed by the receiver so quickly on linux that they often are processed before the sender can dispatch an update to the buffered amount on main. This seems to be caused by the fact that we are calling usrsctp_sendv on main most of the time. What must be happening is as follows:

  1. Call usrsctp_sendv on main, from the JS callstack that called send.
  2. This causes a dispatch to STS to actually send the packet, that begins processing right away and sends.
  3. STS receives the packet, and hands it off to the SCTP stack.
  4. SCTP stack dispatches the received message to main.
  5. The stack that called usrsctp_sendv (see step 1) has not returned yet, and dispatches an update to the buffered amount.

I think I can work around this by moving the work in step 5 off of main, and do it during step 2.

It would be really nice if the sctp conn_output callback gave us the amount of non-SCTP data in that packet. That would make it easy to fix this problem, and would also allow us to give a better estimate of how much data was buffered; right now we don't count any data that is buffered in usrsctp.

How hard would it be to implement something like this? Alternately, is there a utility function that can count the amount of non-SCTP data in an SCTP packet?

Flags: needinfo?(tuexen)

Hi Byron,

my understanding is that bufferedAmount is the number of bytes the application has given to the JS layer via the send method, which is buffered with the JS layer or the SCTP layer and which is not sent yet. I think we can avoid looking at the DTLS layer...

However, it is incremented when the user calls send() and is decremented when the data is sent the first time. Do we agree on this?

Assuming that we do, I don't think the functionality you are asking for is really giving you the correct values. The point is that you not only need to know how many bytes of user data are in an SCTP packet, but how many bytes of user data, which are not retransmissions are in the packet. This is something you can't simply deduce by parsing the packet again.

I think what you need is
(a) keep track of the number of bytes queued in the JS layer.
(b) a way to ask the SCTP stack, how many unsent bytes are queued for a specific SCTP stream.
The sum of (a) and (b) gives you what you want. Is my understanding correct?

Implementing (b) requires some additional stat counters in the SCTP stack and a new socket option to
retrieve these numbers. I guess this has to be added to the SCTP stack. I can try to implement this.
However, it might take some time (lets say until the next IETF or so) to get it implemented.

Does this help?

Flags: needinfo?(tuexen)

That would be useful, yes. I'm not going to block this bug on that, since it isn't a huge deal for the decrease in bufferedAmount to happen a little early.

Seeing some sort of infra problem on that try push. Retriggering some stuff.

Try looks fine after the retriggers.

Pushed by bcampen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f19730eee267
Part 0: Re-enable a test-case. r=jib
https://hg.mozilla.org/integration/autoland/rev/138bf6e11c05
Part 1: Refactor nsDOMDataChannel so DataChannel will be able to tell how large a blob is without needing to stream the whole thing. r=mjf
https://hg.mozilla.org/integration/autoland/rev/2375b78d0f7a
Part 2: Keep/update bufferedAmount on main only. Some simplifications. r=mjf
https://hg.mozilla.org/integration/autoland/rev/5ea4e8902e7e
Part 3: Update a mochitest to expect bufferedamountlow events even when the threshold is 0 (Part 2 fixed this bug). r=mjf
https://hg.mozilla.org/integration/autoland/rev/074678d618bb
Part 4: Work around issue where receiver would get message before sender saw its bufferedAmount drop. r=mjf
You need to log in before you can comment on or make changes to this bug.