Terrible video quality after negotiating video tracks 3 times
Categories: Core :: WebRTC: Audio/Video, defect, P2
People: Reporter: nazar; Assigned: mjf
References: Regression
Keywords: regression
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Steps to reproduce:
- Go to https://jsbin.com/huxibegado/edit?js,output
- Add video track
- [optional] Remove video track (optional because it doesn't impact the results with these steps)
- Add second video track
- [optional] Remove video track
- Add third video track
Actual results:
Regardless of the optional steps above, video quality and FPS go down dramatically; with active motion I saw frame rates as low as 2 FPS.
From the testing I've done, audio (if captured) also starts crackling, not just video.
Expected results:
Video quality should be consistent no matter how many renegotiations happen (especially assuming video tracks are removed).
Comment 1•2 years ago
The Bugbug bot thinks this bug should belong to the 'Core::Audio/Video: Playback' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.
Comment 2•2 years ago (Reporter)
Reproducible on macOS 11.2 in Firefox Stable and on Linux in Firefox Nightly, which leads me to believe that it is not platform-specific and is not a regression.
It causes major issues for our users at Restream.io that are using our Studio product for live streaming from browser.
Comment 3•2 years ago
[Tracking Requested - why for this release]: Serious video quality degradation in WebRTC web conferencing that can hit after other participants join or leave. Likely to affect sites that use the API in a Plan-B compatible manner.
Thanks Nazar for the jsbin! I've instrumented it in https://jsfiddle.net/jib1/caznsg28/3/ and was able to find the regression range.
Workaround
Uncommenting the following line in the fiddle near the bottom makes the problem go away for me:
// pc1.getTransceivers().find(tc => tc.sender == sender).stop();
You'll also need this modification in your original jsbin to make the video still disappear when removed:
event.track.onended =
event.streams[0].onremovetrack = () => document.body.removeChild(video);
To explain the workaround: in Unified Plan, pc.removeTrack(sender) is insufficient to make senders and their receivers on the other end disappear entirely. Instead, it merely nulls the sender.track, mutes the receiver.track, and temporarily removes it from its stream(s). To remove the sender and receiver entirely, and end the receiver.track, transceiver.stop() must be called.
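A minimal sketch of the workaround (the helper name removeAndStop is mine, not from the fiddle):

```javascript
// Hypothetical helper combining the two calls described above.
// pc.removeTrack(sender) alone only nulls sender.track and mutes the
// remote receiver.track; the transceiver and its m-section survive.
// Stopping the transceiver retires them entirely once renegotiation runs.
function removeAndStop(pc, sender) {
  pc.removeTrack(sender);
  const tc = pc.getTransceivers().find(t => t.sender === sender);
  if (tc) {
    tc.stop();
  }
  return tc; // returned for convenience; renegotiation still has to follow
}
```

After calling this, a renegotiation (createOffer/setLocalDescription and so on) is still needed for the remote side to see the transceiver end.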
Sorry about this inconvenience! We hope this workaround may help until we can fix this performance regression.
Mjf, what do we think is going on here? This is happening to regular non-simulcast video.
Comment 4•2 years ago (Reporter)
Quality is still very bad even with suggested workaround and tracks/videos removed in your jsfiddle.
Just move from right to left in front of the camera and the FPS will drop significantly. That is the primary issue I'm concerned with.
Comment 5•2 years ago
Hi Nazar, note in JSFiddle you have to hit the ▶ Run button in the upper left after uncommenting the line, unlike JSBin.
Are you still seeing the frame rate drop to half with the workaround in https://jsfiddle.net/jib1/caznsg28/4/ ?
I'm waving my hand back and forth in front of my face, while I:
1. click "add video track"
2. observe "Frame rate:"
3. click "remove video track"
4. go to 1
In comment 3, I'm seeing output: Frame rate: ~30 fps, ~30 fps, ~15 fps, ~15 fps.
In comment 5, I'm seeing output: Frame rate: ~30 fps, ~30 fps, ~30 fps, ~30 fps.
Just making sure we're seeing the same thing.
Comment 7•2 years ago (Assignee)
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #3)
Mjf, what do we think is going on here? This is happening to regular non-simulcast video.
My first guess is that since the mitigation for Bug 1674463 involved lowering the min bitrate for HD, a different (higher) resolution is picked. If stopping the transceiver makes the problem better, it may be that multiple higher-resolution video tracks are continuing to fight for available bandwidth in the non-stop case.
I'll try to add some ad-hoc logging on my end to see if this theory holds up.
Comment 8•2 years ago (Reporter)
Jan, I think it is potentially a different issue, but yes, the way you describe it does indeed work fine.
But if you add 4 videos, quality degrades permanently even if you start removing them afterwards.
Comment 9•2 years ago
Wow, you're right. Even with the comment 5 fiddle, pressing "add video track" 4 times used to produce 4 smooth videos before the regression:
Frame rate: 29.08
Frame rate: 28.85
Frame rate: 28.92
Frame rate: 28.92
...whereas now it's abysmal:
Frame rate: 9.02
Frame rate: 7.98
Frame rate: 8.98
Frame rate: 8.06
Interestingly, pressing "remove video track" 3 times does not recover the frame rate. Clicking "remove video track" a fourth time, waiting a bit, and then pressing "add video track" seems to recover it, but not reliably.
Comment 10•2 years ago (Reporter)
Thanks for confirming!
Most of our Firefox users have this issue in our https://restream.io/studio product because of the complex functionality we offer there.
If there is a workaround or a way to speed up a solution for this, I'd be very thankful!
Comment 11•2 years ago (Assignee)
(In reply to Michael Froman [:mjf] from comment #7)
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #3)
Mjf, what do we think is going on here? This is happening to regular non-simulcast video.
My first guess is that since the mitigation for Bug 1674463 involved lowering the min bitrate for HD, a different (higher) resolution is picked. If stopping the transceiver makes the problem better, it may be that multiple higher-resolution video tracks are continuing to fight for available bandwidth in the non-stop case.
I'll try to add some ad-hoc logging on my end to see if this theory holds up.
Well, resolution was the wrong thing to key on here, but I suppose it makes sense that if our min bitrate is allowed to drop to 200kbps (whereas previously only allowed to drop to 600kbps), the frame rate will drop for a given resolution. I've run both cases (with the patch from Bug 1674463 included and not), and we're choosing the same line from kResolutionAndBitrateLimits[1]. Given that, I guess the question becomes what is causing the available bandwidth to decrease?
Comment 12•2 years ago (Assignee)
To follow up: we can also consider raising the minimum bitrate on that line, with a potential degradation in our simulcast switching performance vs Chrome when using Jitsi.
Comment 13•2 years ago
what is causing the available bandwidth to decrease?
The bandwidth can't be decreasing because before the patch we observe the same code sending 4 videos at 30 fps without problem.
Also, at the end of comment 9, we've called pc.removeTrack and transceiver.stop() on all but one transceiver, and renegotiated. All transceivers but one have transceiver.sender.track == null at this point, so there's no real competition for actual bandwidth, yet the remaining video is still transmitting at poor quality and low frame rate. There's also no simulcast being done, so why is simulcast logic kicking in?
Comment 14•2 years ago (Assignee)
(In reply to Jan-Ivar Bruaroey [:jib] (needinfo? me) from comment #13)
what is causing the available bandwidth to decrease?
The bandwidth can't be decreasing because before the patch we observe the same code sending 4 videos at 30 fps without problem.
I guess I misspoke - something appears to be acting as if bandwidth is lower. The previous minimum allowed bandwidth was set to 600k, which explains why the 4 videos continue to work at ~30fps without issues. The new lower minimum is allowing the framerate to drop too low.
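The floor behavior described above can be illustrated with a toy clamp (an illustration only, not libwebrtc code; the numbers are the ones discussed in this bug):

```javascript
// Toy illustration: the encoder's target bitrate is the congestion
// estimate clamped into the stream's [min, max] range, so the configured
// minimum decides how far a collapsing estimate can drag the encoder down.
const clampBitrate = (estimateBps, minBps, maxBps) =>
  Math.min(maxBps, Math.max(minBps, estimateBps));

// With a collapsed estimate of 10 kbps, the old 600 kbps floor keeps the
// encoder at a usable rate, while the new 200 kbps floor does not.
```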
Also, at the end of comment 9, we've called pc.removeTrack and transceiver.stop() on all but one transceiver, and renegotiated. All transceivers but one have transceiver.sender.track == null at this point, so there's no real competition for actual bandwidth, yet the remaining video is still transmitting at poor quality and low frame rate. There's also no simulcast being done, so why is simulcast logic kicking in?
Since I don't have that much experience with the guts of simulcast I'm not sure if this is distinctly simulcast logic or not. I thought the simulcast logic dealt with switching streams as the available bandwidth changed. Since we're not doing simulcast here, I don't know why the encoder would suddenly start using less bandwidth unless it is receiving input that would indicate that it should do so.
Comment 15•2 years ago (Assignee)
I'm going to try to get some info on what is driving the used bandwidth down.
Comment 16•2 years ago (Assignee)
Progress report:
After a bit of debugging, I can verify that something is driving the reported bitrate down to the minimum allowed for the stream. VideoSender::SetChannelParameters receives a target_bitrate_bps, and that results in VP8EncoderImpl::SetRateAllocation being called with an updated BitrateAllocation. This happens with the 3rd click of "Add video track" whether or not other tracks have been removed. I do not ever see the bitrate go above the minimum from that point on.
I'll keep digging.
Comment 17•2 years ago (Assignee)
Progress report:
In the last call to SendSideCongestionController::MaybeTriggerOnNetworkChanged after the second "Add video track" click, we see bitrate: 4,852,111. Suspiciously, when we click "Add video track" the 3rd time, the next call to SendSideCongestionController::MaybeTriggerOnNetworkChanged sees bitrate: 2,426,055, exactly half the previous value. The values fall quickly from there to a number (10,000) far below the minimum allowed bitrate (200,000) for the stream.
Comment 18•2 years ago (Assignee)
Progress report:
Upon adding the 3rd video track, the call to DelayBasedBwe::IncomingPacketFeedbackVector is triggering the delayed_feedback code path here[1], which then halves the estimated bandwidth here[2]. This pattern (5 calls in a row with delayed_feedback) continues permanently from this point forward.
[1] https://searchfox.org/mozilla-central/rev/e84baf5c730abd453be5d50dcb0aba1e743bac47/third_party/libwebrtc/webrtc/modules/congestion_controller/delay_based_bwe.cc#157-161
[2] https://searchfox.org/mozilla-central/rev/e84baf5c730abd453be5d50dcb0aba1e743bac47/third_party/libwebrtc/webrtc/modules/congestion_controller/delay_based_bwe.cc#176
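A simplified model of the backoff at [2] (an assumption-laden sketch, not the libwebrtc source; the threshold of 5 is taken from the limit cited later in this bug):

```javascript
// Simplified model of delay-based BWE backoff: each feedback vector whose
// packets have no recorded send time counts as a "delayed feedback", and
// once enough arrive consecutively the estimate is halved.
const kMaxConsecutiveDelayedFeedbacks = 5; // assumed to match the cited limit

function makeBwe(initialBitrateBps) {
  let estimateBps = initialBitrateBps;
  let consecutiveDelayed = 0;
  return {
    onFeedback(sendTimesFound) {
      if (sendTimesFound) {
        consecutiveDelayed = 0; // normal path resets the counter
      } else if (++consecutiveDelayed >= kMaxConsecutiveDelayedFeedbacks) {
        estimateBps = Math.floor(estimateBps / 2); // the halving step
        consecutiveDelayed = 0;
      }
      return estimateBps;
    },
    estimate: () => estimateBps,
  };
}
```

In this model, starting from the 4,852,111 bps seen in comment 17, five consecutive delayed feedbacks drop the estimate to 2,426,055 bps, matching the observed halving.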
Comment 19•2 years ago (Assignee)
On a lark, I allowed more (up to 20) consecutive delayed feedbacks here[1]. I see a very interesting pattern emerge. With each remove/add track cycle after adding the first track, the number of consecutive delayed feedbacks increases by 2.
Example:
click "Add video track"
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 1 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 1 consecutive_delayed_feedbacks_
<snip>
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 1 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 1 consecutive_delayed_feedbacks_
click "Remove video track"
click "Add video track"
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 3 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 3 consecutive_delayed_feedbacks_
<snip>
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 3 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 3 consecutive_delayed_feedbacks_
click "Remove video track"
click "Add video track"
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 5 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 5 consecutive_delayed_feedbacks_
<snip>
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 5 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 5 consecutive_delayed_feedbacks_
click "Remove video track"
click "Add video track"
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 7 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 7 consecutive_delayed_feedbacks_
<snip>
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 7 consecutive_delayed_feedbacks_
DelayBasedBwe::IncomingPacketFeedbackVector calling MaybeUpdateEstimate after 7 consecutive_delayed_feedbacks_
Comment 20•2 years ago (Assignee)
With each remove/add track cycle we're making 2 more calls to TransportFeedbackAdapter::GetPacketFeedbackVector right on top of each other, i.e. with the same (or off-by-1) clock_->TimeInMilliseconds() value.
Comment 21•2 years ago (Assignee)
It appears that after each additional remove/add track cycle, we have an additional WebrtcVideoConduit object. However, each of the WebrtcVideoConduit objects is using the same Call object.
Comment 22•2 years ago (Assignee)
Each WebrtcVideoConduit's ReceivedRTCPPacket is called twice with the same packet (data ptr and len). We end up with 3 WebrtcVideoConduit objects, each calling DeliverPacket on the same Call object twice. We get one execution of DeliverPacket that works correctly, and then each of the other 5 calls fails to look up the send times for the packets in the feedback vector, which registers as a delayed feedback. Our limit for consecutive_delayed_feedbacks_ is 5, so this trips the OnLongFeedbackDelay call, which halves the reported available bandwidth.
I'm not sure if the 2 WebrtcVideoConduit objects that correspond to removed tracks should continue to receive RTCP packets, but this is definitely a combination of issues.
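The duplicate-delivery failure mode can be sketched with a toy adapter (an illustrative assumption about the mechanism, not the TransportFeedbackAdapter source):

```javascript
// Toy model: send times are recorded per packet and consumed on first
// lookup, so a second DeliverPacket call with the same feedback finds no
// send times, which the BWE then counts as a delayed feedback.
function makeFeedbackAdapter() {
  const sendTimesMs = new Map(); // sequence number -> send time (ms)
  return {
    onPacketSent(seq, timeMs) {
      sendTimesMs.set(seq, timeMs);
    },
    // Returns true only if every packet in the feedback had a known send time.
    processFeedback(seqs) {
      let allFound = true;
      for (const seq of seqs) {
        if (sendTimesMs.has(seq)) {
          sendTimesMs.delete(seq); // first lookup consumes the entry
        } else {
          allFound = false; // duplicate delivery: entry already consumed
        }
      }
      return allFound;
    },
  };
}
```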
Comment 23•2 years ago (Reporter)
Is there a chance to have some solution for this in time for Firefox 87?
We had to add a large banner discouraging people from using Firefox in the meantime :(
Comment 24•2 years ago
I suspect that fixing this will require fixing bug 1654112. If that is the case, there's no way this will be in 87. If we can fix this without bug 1654112, it will be pretty hard to do this in 87, since we're coming up on the release date, and doing a beta uplift that close to release is hard to justify, especially if it is already broken on release for a couple of cycles at least.
Comment 25•2 years ago
fix-optional for 87 based on comment 24, but we'll keep tracking it for 88.
Comment 26•2 years ago (Reporter)
Is there a way to sponsor or somehow differently accelerate fixes for this and some other WebRTC-related issues?
Comment 27•2 years ago (Assignee)
(In reply to Byron Campen [:bwc] from comment #24)
I suspect that fixing this will require fixing bug 1654112. If that is the case, there's no way this will be in 87. If we can fix this without bug 1654112, it will be pretty hard to do this in 87, since we're coming up on the release date, and doing a beta uplift that close to release is hard to justify, especially if it is already broken on release for a couple of cycles at least.
Based on some added instrumentation, it appears this problem is fixed with the updates coming in bug 1654112 (at least on a linux build). I'm in the process of trying to pull and build for macOS.
Comment 28•2 years ago (Assignee)
I've confirmed the fix on macOS as well.
Comment 29•2 years ago
If this depends on the libwebrtc update to be fixed, with no other short-term options in the meantime, we're still at least a cycle or two away from seeing a fix here.
Comment 30•1 year ago
Bug 1654112 has landed and I've verified this now works as expected on Nightly.