Open Bug 1888232 Opened 1 year ago Updated 1 year ago

New wpt crashes in /webrtc/RTCDataChannel-send-close.html

Categories

(Core :: WebRTC, defect)

defect

Tracking

()

People

(Reporter: wpt-sync, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [wpt])

Syncing wpt PR 45357 found new crashes in CI

Affected Tests

  • /webrtc/RTCDataChannel-send-close.html [wpt.fyi]: CRASH [Gecko-android-em-7.0-x86_64-lite-qr-opt-geckoview, Gecko-android-em-7.0-x86_64-qr-debug-geckoview, Gecko-android-em-7.0-x86_64-qr-opt-geckoview, Gecko-windows11-32-2009-qr-debug, Gecko-windows11-32-2009-qr-opt], OK [Gecko-linux1804-64-qr-debug, Gecko-windows11-64-2009-qr-debug], TIMEOUT [Gecko-linux1804-64-qr-opt, Gecko-windows11-64-2009-qr-opt, GitHub]

CI Results

Gecko CI (Treeherder)
GitHub PR Head

Notes

Getting the crash signature into these bug reports is a TODO; sorry

These updates will be on mozilla-central once bug 1888067 lands.

Note: this bug is for tracking fixing the issues and is not
owned by the wpt sync bot.

This bug is linked to the relevant tests by an annotation in
https://github.com/web-platform-tests/wpt-metadata. These annotations
can be edited using the wpt interop dashboard
https://jgraham.github.io/wptdash/

If this bug is split into multiple bugs, please also update the
annotations, otherwise we are unable to track which wpt issues are
already triaged. Resolving as duplicate or closing this issue should
be cause the bot to automatically update or remove the annotation.

Severity: -- → S3

The (android) crashes are most likely from this busy-spin loop in our current version of RTCDataChannel-send-close.html, and because we don't appear to have any buffer limit on our outgoing send buffer.

The upstream version doesn't have that busy-loop any longer, so the tests will stop crashing with the next wpt merge. But the tests still fail, so I've filed a separate bug 1911667 on that.

Here in this bug we might wish to reconsider our seemingly unbounded send buffer?

https://w3c.github.io/webrtc-pc/#datachannel-send says: "If queuing data is not possible because not enough buffer space is available, throw an OperationError. "

Both Chrome and Safari appear to throw somewhere about 20MB, i.e. they appear to implement an artificial limit, which is one interpretation of the spec I suppose. Though is it the right one?

Firefox can apparently queue 4GB or more. But crashing when this happens seems bad.

Flags: needinfo?(rjesup)

Given we can only send data so fast, even over a LAN, we could set a limit. It does mean that if something is trying to send a lot of data over a channel, and it gets ahead of the pipe (pretty easy), you can hit the limit. And applications might not code to handle that case (and there's no 'good' way to deal with that case, unless you actively monitor the amount of queued data - which I think you can do now, though not when we originally wrote the no-limits code. Backpressure is actually preferable, so maybe some limit is good here, now that we can monitor. My suggestion is to set it to say 2x Chrome/Safari's limit, which shouldn't be problematic, but pretty much guarantees we don't hit the limit before they do.

Flags: needinfo?(rjesup)
You need to log in before you can comment on or make changes to this bug.