Closed Bug 831831 Opened 11 years ago Closed 11 years ago

WebRTC Video calls freeze

Categories

(Core :: WebRTC: Audio/Video, defect, P1)

Tracking

()

VERIFIED FIXED
mozilla21

People

(Reporter: standard8, Assigned: jesup)

References

Details

(Whiteboard: [WebRTC], [blocking-webrtc+])

Attachments

(3 files, 2 obsolete files)

STR

1. Start a webrtc video call via one of the demos
2. Once connected, leave it running for a short time

Expected Results:

Audio and Video keep working

Actual Results:

Audio keeps working, video freezes.

The latest test I've done is on a local network with 2013-01-16 builds, one on Windows and one on Mac.


I've been seeing this for a while, and so have others. It feels like it has got worse in the last week or so, so I'm going to test some older versions.
Assignee: nobody → rjesup
Priority: -- → P1
Whiteboard: [WebRTC], [blocking-webrtc+]
Just tried with tokeshu on latest nightly (21.0a1 (2013-01-17)): 
no audio, no video with Multi host PeerConnection on http://mozilla.github.com/webrtc-landing/. 

Test between Macbook Air and Windows 7 laptop

Debug is as follows:

Incoming call with offer [object Object]

Got onaddstream of type audio

Got onaddstream of type video

setRemoteDescription, creating answer

created Answer and setLocalDescription {"type":"answer","sdp":"v=0\r\no=Mozilla-SIPUA 22005 0 IN IP4 0.0.0.0\r\ns=SIP Call\r\nt=0 0\r\na=ice-ufrag:16ad712d\r\na=ice-pwd:ec732ccbfe824b7692c5a5d5aeb063de\r\na=fingerprint:sha-256 A5:88:F7:5F:CD:39:BA:BB:96:1D:CA:7B:E0:41:57:E3:FF:EA:D2:18:7A:95:EA:4C:A2:53:BB:67:78:FE:1A:F5\r\nm=audio 54895 RTP/SAVPF 109 101\r\nc=IN IP4 128.79.9.60\r\na=rtpmap:109 opus/48000/2\r\na=ptime:20\r\na=rtpmap:101 telephone-event/8000\r\na=fmtp:101 0-15\r\na=sendrecv\r\na=candidate:0 1 UDP 2113601791 192.168.1.5 54895 typ host\r\na=candidate:1 1 UDP 1694236671 128.79.9.60 54895 typ srflx raddr 192.168.1.5 rport 54895\r\na=candidate:0 2 UDP 2113601790 192.168.1.5 55168 typ host\r\na=candidate:1 2 UDP 1694236670 128.79.9.60 55168 typ srflx raddr 192.168.1.5 rport 55168\r\nm=video 56344 RTP/SAVPF 120\r\nc=IN IP4 128.79.9.60\r\na=rtpmap:120 VP8/90000\r\na=sendrecv\r\na=candidate:0 1 UDP 2113601791 192.168.1.5 56344 typ host\r\na=candidate:1 1 UDP 1694236671 128.79.9.60 56344 typ srflx raddr 192.168.1.5 rport 56344\r\na=candidate:0 2 UDP 2113601790 192.168.1.5 64008 typ host\r\na=candidate:1 2 UDP 1694236670 128.79.9.60 64008 typ srflx raddr 192.168.1.5 rport 64008\r\nm=application 64006 SCTP/DTLS 5001 \r\nc=IN IP4 128.79.9.60\r\na=sendrecv\r\na=candidate:0 1 UDP 2113601791 192.168.1.5 64006 typ host\r\na=candidate:1 1 UDP 1694236671 128.79.9.60 64006 typ srflx raddr 192.168.1.5 rport 64006\r\na=candidate:0 2 UDP 2113601790 192.168.1.5 62782 typ host\r\na=candidate:1 2 UDP 1694236670 128.79.9.60 62782 typ srflx raddr 192.168.1.5 rport 62782\r\n"}
(In reply to Jb Piacentino from comment #1)
> Just tried with tokeshu on latest nightly (21.0a1 (2013-01-17)): 
> no audio, no video with Multi host PeerConnection on
> http://mozilla.github.com/webrtc-landing/. 

If you're not getting any video or audio whatsoever, then that's a separate bug. This one is about the freezing.
Can we get explicit steps to reproduce? Also, please try the standard tests (webrtc-landing local peerconnection, and also Multihost peerconnection).

I tried multihost for 1/2 hour with no problem, plus a bunch of other tests.

A debug build with NSPR logs on (to a file, upload here) using mtransport:5,signaling:5 would be a good start to tracking down what is happening.
No video and no audio is most likely caused by a set of too-restrictive firewalls.
(In reply to Randell Jesup [:jesup] from comment #3)
> Can we get explicit steps to reproduce?

For what I can reproduce: Generally I see this most of the time when running video over the network. I don't see it when running video with two separate instances of Firefox on the same machine.

Both machines are on wifi hosted by a Netgear router. I'm currently using the webrtc-demo (aka social api demo). One of my machines is a Mac 10.8.2, the other a Windows 7 box. However, I've seen this freeze when talking over the internet to other users on Mac machines.

At the moment I can't give much more detail. AFAIK I'm not doing anything special.

I can try a debug build in my morning and get the logs etc.
Attached is a first log file from when Randell and I tried the multi host demo. The log is Firefox logging with export NSPR_LOG_MODULES=mtransport:5; export NSPR_LOG_FILE=/tmp/nspr_logs;
I did a long test with the multihost peer connection demo between two laptops connected via wifi. The first was running 21.0a1 (2013-01-15) on Ubuntu and the second was running 21.0a1 (2013-01-16) on Windows 7.

I can reproduce video freezing via these steps:

  - Place a call between two machines
  - Wait for a minute and see that the video keeps going
  - Start to download a big file to consume bandwidth
  - See both videos freeze
  - Stop the download
  - Wait a few minutes to see one video stream come back to life, sometimes both, and then one freeze again.

Once you have a freeze, it's really hard to get both streams back to normal.
I don't know whether downloading a file is relevant to reproducing the problem, as Mark and I experienced these freezes in « normal » circumstances.
Attachment #703755 - Flags: review?(ekr)
To recap the email/IRC conversations:

There were two bugs:
1) RTCP packets when received weren't being processed
2) When sending, the code believed all the sends were failing (but without error) because they returned 0 from Audio/VideoConduit::Send(RTCP)Packet()

#2 caused some minor-ish problems with mis-estimation of bitrates and stats (not sure how much that affected things higher up)
#1 caused failure to recover from packet loss on video, which was the primary issue.
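To make the two bugs above concrete, here is a minimal, self-contained sketch. It is not the actual conduit code: `EngineStats`, `SketchConduit`, and `Simulate` are hypothetical names, and the sketch assumes the engine interprets the value returned from the send callback as the number of bytes sent.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the media engine's send/receive bookkeeping.
// Assumption: the engine reads the transport's return value as "bytes sent".
struct EngineStats {
  std::size_t bytesSent = 0;
  int apparentSendFailures = 0;
  int rtcpPacketsProcessed = 0;

  void OnSendResult(int returned, int len) {
    if (returned == len) {
      bytesSent += static_cast<std::size_t>(len);
    } else {
      ++apparentSendFailures;  // a return of 0 looks like "nothing was sent"
    }
  }
};

// Hypothetical conduit. With fixed == false it models both bugs described
// above; with fixed == true it models the intended behavior.
class SketchConduit {
public:
  explicit SketchConduit(bool fixed) : mFixed(fixed) {}

  // Send path: the packet is assumed to reach the network either way; only
  // the value reported back to the engine differs.
  int SendPacket(const void* data, int len) {
    assert(data != nullptr && len > 0);
    return mFixed ? len : 0;  // bug 2: returning 0 although the send worked
  }

  // Receive path: incoming RTCP has to be handed to the engine, otherwise
  // the sender never learns about loss and video cannot recover.
  void ReceivedRTCPPacket(const void* data, int len, EngineStats& engine) {
    assert(data != nullptr && len > 0);
    if (!mFixed) {
      return;  // bug 1: feedback silently dropped
    }
    ++engine.rtcpPacketsProcessed;
  }

private:
  bool mFixed;
};

// Sends three 100-byte packets and receives two RTCP packets.
EngineStats Simulate(bool fixed) {
  SketchConduit conduit(fixed);
  EngineStats engine;
  const unsigned char packet[100] = {0};
  for (int i = 0; i < 3; ++i) {
    engine.OnSendResult(conduit.SendPacket(packet, 100), 100);
  }
  for (int i = 0; i < 2; ++i) {
    conduit.ReceivedRTCPPacket(packet, 100, engine);
  }
  return engine;
}
```

Calling `Simulate(false)` leaves the engine believing every send failed and that no RTCP ever arrived, while `Simulate(true)` accounts for all bytes and processes the feedback. That matches the split described above: the send-side miscount skews bitrate estimates and stats, and the dropped RTCP is what prevents recovery from packet loss.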
Comment on attachment 703755 [details] [diff] [review]
Don't ignore incoming RTCP; don't make webrtc code think no bytes were sent

I pushed this patch to the try server per Jb's request:
https://tbpl.mozilla.org/?tree=Try&rev=69a46eb33445
(In reply to rgauthier from comment #8)
> Once you have a freeze, it's really hard to get both streams back to
> normal.
> I don't know whether downloading a file is relevant to reproducing the
> problem, as Mark and I experienced these freezes in « normal » circumstances.

My normal circumstances tend to include at least one tbpl instance open, etherpads open, Thunderbird running as well as irc. Plus at least my phone is connected to the same network, and at times other items. So I suspect mine is quite noisy in some respects.
Priority: P1 → --
Priority: -- → P1
I've just done the same test as previously, but with the try builds referenced above. Unfortunately, it still hangs after a few moments, so here is the log. Let me know if you need more logging options turned on.

I'll also investigate a quieter local network and see if that changes things.
Contrary to what I previously said, the patch may be working. We're still working through a bunch of test cases over irc to try and characterise what is going on.
Comment on attachment 703755 [details] [diff] [review]
Don't ignore incoming RTCP; don't make webrtc code think no bytes were sent

Review of attachment 703755 [details] [diff] [review]:
-----------------------------------------------------------------

Jesup, I think you're missing the change from mEngineReceiving to mEngineTransmitting in the audio conduit
Attachment #703755 - Flags: review?(ekr) → review-
Attachment #703755 - Attachment is obsolete: true
Comment on attachment 703881 [details] [diff] [review]
Don't ignore incoming RTCP; don't make webrtc code think no bytes were sent

Fixed.

Odd thing though: in a 700MB nspr.log file, there were no errors from WebrtcAudioConduit.  I've noticed the usage between video and audio of the transport is inconsistent on send, though I'm not sure how that could affect receive.  I'll investigate.
Attachment #703881 - Flags: review?(ekr)
Comment on attachment 703881 [details] [diff] [review]
Don't ignore incoming RTCP; don't make webrtc code think no bytes were sent

Review of attachment 703881 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/media-conduit/AudioConduit.cpp
@@ +492,5 @@
>  WebrtcAudioConduit::ReceivedRTPPacket(const void *data, int len)
>  {
>    CSFLogDebug(logTag,  "%s : channel %d", __FUNCTION__, mChannel);
>  
> +  if(mEngineTransmitting)

Ooops. This is ReceivedRTPPacket
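For readers following the review, the point about the guard flags can be sketched as below. This is an illustration only, not the patch: `GuardedConduit`, `Result`, and the `Start*` methods are hypothetical, and only the flag names `mEngineTransmitting` and `mEngineReceiving` come from the review. The assumption is that each path should be gated by the flag for its own direction.

```cpp
#include <cassert>

// Hypothetical outcome type for the sketch.
enum class Result { Delivered, Dropped };

// Hypothetical conduit with one flag per direction.
class GuardedConduit {
public:
  void StartTransmitting() { mEngineTransmitting = true; }
  void StartReceiving() { mEngineReceiving = true; }

  // Receive path: gated by the receiving flag. Checking the transmitting
  // flag here would drop incoming media on a receive-only endpoint.
  Result ReceivedRTPPacket(const void* data, int len) {
    assert(data != nullptr && len > 0);
    return mEngineReceiving ? Result::Delivered : Result::Dropped;
  }

  // Send path: gated by the transmitting flag.
  Result SendPacket(const void* data, int len) {
    assert(data != nullptr && len > 0);
    return mEngineTransmitting ? Result::Delivered : Result::Dropped;
  }

private:
  bool mEngineTransmitting = false;
  bool mEngineReceiving = false;
};
```

With only `StartReceiving()` called, an incoming packet is delivered and an outgoing one is dropped. Swapping the flags between the two paths would invert that.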
Attachment #703881 - Flags: review?(ekr) → review-
Attachment #703881 - Attachment is obsolete: true
Comment on attachment 703883 [details] [diff] [review]
Don't ignore incoming RTCP; don't make webrtc code think no bytes were sent

Review of attachment 703883 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #703883 - Flags: review+
Update on our results of testing so far:

- The try server build with the patch seems more stable than a nightly build without the patch.
- It seems to recover from video hangs, whereas it wouldn't previously.
- Doing video over Wifi with two clients on one router appears to induce more hangs than wired connections to the same router.
-- The hangs can be multiple minutes, and can affect one or both ends.
-- On a Windows - Mac call, the video received by the Mac tends to hang most frequently.
https://hg.mozilla.org/integration/mozilla-inbound/rev/8a7a3de45a94
Whiteboard: [WebRTC], [blocking-webrtc+] → [WebRTC], [blocking-webrtc+][webrtc-uplift]
Target Milestone: --- → mozilla21
Blocks: 832567
https://hg.mozilla.org/mozilla-central/rev/8a7a3de45a94
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Blocks: 832683
Keywords: verifyme
Verified on Nightly by running a 10-minute video call between two machines through apprtc.appspot.com without seeing the video freeze.
Status: RESOLVED → VERIFIED
Keywords: verifyme
Whiteboard: [WebRTC], [blocking-webrtc+][webrtc-uplift] → [WebRTC], [blocking-webrtc+]