Closed Bug 962338 Opened 10 years ago Closed 9 years ago

[B2G][WebRTC] Firefox OS to Desktop Communication Fails to Send Audio Intermittently

Categories

(Core :: WebRTC: Networking, defect)

28 Branch
ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED INVALID
Tracking Status
b2g-v1.3 --- affected
b2g-v1.3T --- affected
b2g-v1.4 --- affected
b2g-v2.0 --- affected

People

(Reporter: nkhristoforov, Unassigned)

References

Details

(Whiteboard: [mwcdemo2014], permafail)

Attachments

(6 files, 3 obsolete files)

Attached file Logcat
The user cannot establish an AppRTC connection using Firefox OS devices on http://apprtc.appspot.com/. When two Firefox OS devices enter the same room, a "Connecting" message appears but the two devices hang there indefinitely. For the Firefox OS simulator, the user is stuck on the "Initializing" message. The Google Chrome desktop browser was able to connect properly.

Pre-requisites:
1) Two Firefox OS devices that have the "pref('media.peerconnection.enabled', true);" preferance added to the device (copy and paste text inside the quotations including the semicolon). 

Repro Steps:
1) Updated both Buris to Build ID: 20140121004137
2) Make sure both devices are connected to Wifi.
3) Open the Browser app on both devices.
4) On one device (Device A) go to: http://apprtc.appspot.com/
5) On the other device (Device B) go to the URL shown at the bottom of the screen on Device A.

Actual:
There's a "Connecting" message on both devices that hangs there indefinitely.

Expected:
The "Connecting" message would be there temporarily then the devices would connect.

Environmental Variables
Device: Buri v1.3 MOZ RIL
Build ID: 20140121004137
Gecko: http://hg.mozilla.org/releases/mozilla-aurora/rev/6f7dfe36ab6c
Gaia: 47049555282a9a01fb60d1e1421b57e2810c96f5
Platform Version: 28.0a2
Firmware Version: V1.2_device.cfg

Repro frequency: 100%
Link to failed test case: https://moztrap.mozilla.org/manage/case/11219/
See attached: Logcat and Screenshot
This issue does not reproduce on 1.2 because this feature was not implemented yet.
I confirmed the webrtc pref was definitely turned on in the build he tested on.
Component: Gaia::Browser → WebRTC: Networking
Product: Firefox OS → Core
Version: unspecified → 28 Branch
I think this could potentially just be a bug with apprtc, not a bug in the webrtc code.

If you ignore the fact that connecting is still being shown after two peers connect, then can you eventually communicate audio between each FxOS device?
Flags: needinfo?(nkhristoforov)
No, I was not able to communicate with audio while the "Connecting" message was present. After 5 minutes audio did not go through between the devices.
Flags: needinfo?(nkhristoforov)
Can you try the following network scenarios using http://mysecondwebrtc.appspot.com by setting up an audio P2P call for 5 minutes?

* Desktop Firefox Nightly --> Firefox OS - wifi
* Firefox OS --> Desktop Firefox Nightly - wifi
* Firefox OS --> Firefox OS - wifi
Flags: needinfo?(nkhristoforov)
(In reply to Jason Smith [:jsmith] from comment #6)
> Can you try the following network scenarios using
> http://mysecondwebrtc.appspot.com by setting up an audio P2P call for 5
> minutes?
> 
> * Desktop Firefox Nightly --> Firefox OS - wifi
> * Firefox OS --> Desktop Firefox Nightly - wifi
> * Firefox OS --> Firefox OS - wifi

Also - do each of three test cases on the same wifi network & on different wifi networks. So that's a total of 6 test cases to run here.
The Firefox OS --> Firefox OS - wifi tests passed for both wifi networks.

The connection between Firefox OS and Desktop Firefox was harder to establish. I was able to connect the two together about 60% of the time. And that's both Firefox OS --> Desktop Firefox Nightly and Desktop Firefox Nightly --> Firefox OS. Also, when a connection was established, the Firefox OS device did not receive audio from the Desktop Firefox Nightly. I think that's a hardware issue with the laptop because the Data Channel Status field had an Open value which I think means that the connection was established. The connection problems and audio issues occurred on both wifi networks.

Device: Buri v1.3 MOZ RIL
Build ID: 20140121004137
Gecko: http://hg.mozilla.org/releases/mozilla-aurora/rev/6f7dfe36ab6c
Gaia: 47049555282a9a01fb60d1e1421b57e2810c96f5
Platform Version: 28.0a2
Firmware Version: V1.2_device.cfg
Flags: needinfo?(nkhristoforov)
(In reply to Nikolai Khristoforov from comment #8)
> The Firefox OS --> Firefox OS - wifi tests passed for both wifi networks.
> 
> The connection between Firefox OS and Desktop Firefox was harder to
> establish. I was able to connect the two together about 60% of the time. And
> that's both Firefox OS --> Desktop Firefox Nightly and Desktop Firefox
> Nightly --> Firefox OS. Also, when a connection was established, the Firefox
> OS device did not receive audio from the Desktop Firefox Nightly. I think
> that's a hardware issue with the laptop because the Data Channel Status
> field had an Open value which I think means that the connection was
> established. The connection problems and audio issues occurred on both wifi
> networks.
> 
> Device: Buri v1.3 MOZ RIL
> Build ID: 20140121004137
> Gecko: http://hg.mozilla.org/releases/mozilla-aurora/rev/6f7dfe36ab6c
> Gaia: 47049555282a9a01fb60d1e1421b57e2810c96f5
> Platform Version: 28.0a2
> Firmware Version: V1.2_device.cfg

I've actually seen this bug before, which leads to believe that there's a legitimate bug here, not a hardware issue. I just reopened bug 952237 to track this.
The last thing I think we need to do here is to get logs when this reproduces to determine if this is a problem with apprtc itself or a problem in the webrtc codebase. I'll work with Nikolai to get the logs.
Keywords: qawanted
QA Contact: jsmith
Discussed with Nikolai offline - he's working on getting signaling logs.
QA Contact: jsmith → nkhristoforov
Wireshark trace of the Firefox OS device connecting to the Firefox Desktop Nightly on apprtc.appspot.com
Wireshark trace of the Firefox Desktop Nightly connecting to the Firefox OS device on apprtc.appspot.com
Keywords: qawanted
Steven - Can you look at the wireshark traces & comment on what you think might be the problem here?
Flags: needinfo?(slee)
Hi Jason,

From the 2 logs, I can see rtp packets carrying audio data in both directions. So that the connection should be established correctly. I can try it on my devices later and see if there is any problems.
Flags: needinfo?(slee)
(In reply to StevenLee[:slee] from comment #15)
> Hi Jason,
> 
> From the 2 logs, I can see rtp packets carrying audio data in both
> directions. So that the connection should be established correctly. I can
> try it on my devices later and see if there is any problems.

Strange. From an end-user perspective, the audio isn't heard from either peer. So why would the rtp packets be sent using audio data, but the audio wouldn't be heard by the peer receiving the audio?

Nikolai - Can you get signaling logs following the directions below?

Run the following commands:

adb shell
export NSPR_LOG_MODULES=signaling:5,mtransport:5
export NSPR_LOG_FILE=/data/local/tmp/myLogFile
stop b2g
/system/bin/b2g.sh

Reference: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging#Firefox_OS_phones

Then, reproduce each bug. After reproducing the bug, pull the myLogFile from each device using adb pull (adb pull /data/local/tmp/myLogFile) & attach the log files to the bug.
Flags: needinfo?(nkhristoforov)
Attachment #8366368 - Attachment description: AppRTC B2G to Desktop → AppRTC B2G to Desktop Wireshark
So I can definitely reproduce the b2g --> desktop failure here. Although strangely enough, I'm not getting signaling log output with the above commands.
Flags: needinfo?(nkhristoforov)
(In reply to Jason Smith [:jsmith] from comment #17)
> So I can definitely reproduce the b2g --> desktop failure here. Although
> strangely enough, I'm not getting signaling log output with the above
> commands.

Nikolai mentioned offline that 2 FxOS devices works here. So I think the original bug involved using the simulator --> device, rather than device --> device.
Figured out a different way to signaling logs working.
Attachment #8369035 - Attachment description: Logcat with Signaling Logs → Logcat with Signaling Logs - FxOS
Attachment #8369035 - Attachment description: Logcat with Signaling Logs - FxOS → Logcat with Signaling Logs - FxOS (mysecondwebrtc.appspot.com)
Attachment #8369036 - Attachment description: Signaling Logs - Desktop → Signaling Logs - Desktop (mysecondwebrtc.appspot.com)
Attachment #8369036 - Attachment is obsolete: true
Attachment #8369035 - Attachment is obsolete: true
On FxOS side, I'm seeing a huge dump of this in the log:

2014-02-01 08:08:28.964813 UTC - 17990128[432fbf00]: Flow[ff0730894239774f:1,rtp(none)]; Layer[dtls]: Would have blocked

Which is coming from:

http://mxr.mozilla.org/mozilla-central/source/media/mtransport/transportlayerdtls.cpp#659
That's annoying but normal.

(In reply to Jason Smith [:jsmith] from comment #24)
> On FxOS side, I'm seeing a huge dump of this in the log:
> 
> 2014-02-01 08:08:28.964813 UTC - 17990128[432fbf00]:
> Flow[ff0730894239774f:1,rtp(none)]; Layer[dtls]: Would have blocked
> 
> Which is coming from:
> 
> http://mxr.mozilla.org/mozilla-central/source/media/mtransport/
> transportlayerdtls.cpp#659
This is interesting...  repeated a *lot* (~250ms) on the FxOS side:

02-01 00:08:31.660: W/AudioFlinger(140): Thread AudioIn_55 cannot connect to the power manager service
02-01 00:08:31.660: W/AudioFlinger(140): Thread AudioIn_55 cannot connect to the power manager service
02-01 00:08:31.670: I/PRLog(1173): 2014-02-01 08:08:31.681796 UTC - 17990128[432fbf00]: Flow[ff0730894239774f:1,rtp(none)]; Layer[dtls]: Would have blocked
02-01 00:08:31.680: W/AudioFlinger(140): Thread AudioIn_55 cannot connect to the power manager service
02-01 00:08:31.690: W/AudioHardwareMSM76XXA(140): rpc_snd_set_device(3, 1, 1)

Perhaps it's innocuous, but it seems like it shouldn't be complaining that often about something.

I'll note the FxOS logs start in the middle of the call, so if it's a startup issue we can't see the logs.  If we get logs that start before the call, it might help (can't be sure)
Michael - Do you know what the errors seen in comment 26 imply?

Nils - Can you help finish off the investigation of this bug? I'm at a work week this week, so I won't get to this until next week. However, I've seen repeated intermittent failures to have audio be sent from desktop --> fxos in a fxos --> desktop call, so I think there's a bug here.
Flags: needinfo?(mwu)
Flags: needinfo?(drno)
MWC demo test run showed evidence of audio failing from FxDesktop to FxOS in a bunch of calls. If we can't figure out the root cause here in time, then QA will advise to pull the demo from MWC.
Whiteboard: burirun1.3-2 → burirun1.3-2 [mwcdemo2014]
I am able to reproduce one of the problems:
If fxos starts the session at apprtc and the desktop joins that session I get only audio from the desktop to the fxos (Helix device with 1.3), but no audio from fxos to the desktop, although media RTP flows in both directions.

Note: of the desktop browser starts the apprtc session and fxos joins audio works as expected in both directions.

I'll try to investigate the first problem.
(In reply to Randell Jesup [:jesup] from comment #26)
> This is interesting...  repeated a *lot* (~250ms) on the FxOS side:
> 
> 02-01 00:08:31.660: W/AudioFlinger(140): Thread AudioIn_55 cannot connect to
> the power manager service
> 02-01 00:08:31.660: W/AudioFlinger(140): Thread AudioIn_55 cannot connect to
> the power manager service
> 02-01 00:08:31.670: I/PRLog(1173): 2014-02-01 08:08:31.681796 UTC -
> 17990128[432fbf00]: Flow[ff0730894239774f:1,rtp(none)]; Layer[dtls]: Would
> have blocked
> 02-01 00:08:31.680: W/AudioFlinger(140): Thread AudioIn_55 cannot connect to
> the power manager service
> 02-01 00:08:31.690: W/AudioHardwareMSM76XXA(140): rpc_snd_set_device(3, 1, 1)
This is harmless although it's annoying.

(In reply to Nils Ohlmeier [:drno] from comment #29)
> I am able to reproduce one of the problems:
> If fxos starts the session at apprtc and the desktop joins that session I
> get only audio from the desktop to the fxos (Helix device with 1.3), but no
> audio from fxos to the desktop, although media RTP flows in both directions.
Hi Nil,

Can you provide the settings of desktop and device, fps and frame size? As I know, helix is not able to encode and decode video with VGA size smoothly.
Flags: needinfo?(mwu)
(In reply to StevenLee[:slee] from comment #30)
> 
> (In reply to Nils Ohlmeier [:drno] from comment #29)
> > I am able to reproduce one of the problems:
> > If fxos starts the session at apprtc and the desktop joins that session I
> > get only audio from the desktop to the fxos (Helix device with 1.3), but no
> > audio from fxos to the desktop, although media RTP flows in both directions.
> Hi Nil,
> 
> Can you provide the settings of desktop and device, fps and frame size? As I
> know, helix is not able to encode and decode video with VGA size smoothly.

This doesn't have anything to do with video - this bug was reproduced doing audio calls only.
I tried to reproduce the problem with audio-only on buri today. I found that when it is connected, the CPU usage of b2g and browser is more than 85%. I am not sure if it is related to the problem. I will try to profile why the CPU usage is so high and use g711 codec tomorrow.
(In reply to StevenLee[:slee] from comment #30)
> Hi Nil,
> 
> Can you provide the settings of desktop and device, fps and frame size? As I
> know, helix is not able to encode and decode video with VGA size smoothly.

Jason is correct: I was and am referring to audio only. If we have a video issue I think we should track it in another ticket.
I found the problem is resulted from that the "volume" in [1] is 0.0. After setting the default value of "volume" to 1.0 in both device and desktop, I can hear the sounds. I also found that when I connect to my gupshup with audio-only(by changing the js code of GetUserMedia with audio only), I can hear sounds on both sides. I am not familiar with MediaStreamGraph, is anyone know it can help?

[1] http://dxr.mozilla.org/mozilla-central/source/content/media/MediaStreamGraph.cpp#895
(In reply to StevenLee[:slee] from comment #34)
> I found the problem is resulted from that the "volume" in [1] is 0.0. After
> setting the default value of "volume" to 1.0 in both device and desktop, I
> can hear the sounds. I also found that when I connect to my gupshup with
> audio-only(by changing the js code of GetUserMedia with audio only), I can
> hear sounds on both sides. I am not familiar with MediaStreamGraph, is
> anyone know it can help?
> 
> [1]
> http://dxr.mozilla.org/mozilla-central/source/content/media/MediaStreamGraph.
> cpp#895

Good analysis - sounds like we've identified the root cause.

Jesup & Roc could probably help with this.

It sounds like this is going to legitimately block hearing sound between peers, which means this is a blocker to shipping webrtc p2p on 1.3. Nominating.
blocking-b2g: --- → 1.3?
Flags: needinfo?(drno)
jesup or roc - can one of you look into this? This apparently appears to be a problem in media stream graph.
Flags: needinfo?(roc)
Flags: needinfo?(rjesup)
Reworking title to the root cause of the problem.
Summary: [B2G][WebRTC][Browser] Firefox OS devices cannot establish a connection on http://apprtc.appspot.com/ → [B2G][WebRTC] Firefox OS to Desktop Communication Fails to Send Audio Intermittently
Volume from the AudioOutputs (volume is the summation of the stream AudioOutputs) being zero makes me wonder if it's an application (JS) problem - i.e. something isn't hooked up or is muted.  apprtc may well mute via volume until "something" happens....  You could add logs around the volume commands (SetAudioOutputVolume/etc) to verify

apprtc has been known to have "issues" with audio-only calls in the past (wasn't really designed for it, though it works generally).  The fact that it works in the (much simpler) gupshup is a good indication this is likely.

Roc: what ways can volume here be zero?  what should they look for in the JS (or mediastreamgraph:5 logs) to verify?

slee: can you add those logs to see why it's zero?

padenot: anything you can think of?
Flags: needinfo?(rjesup) → needinfo?(slee)
If you're not using Web Audio, I don't know where the volume value of 0 would come from. Could it possibly be the 'volume' setting of the <audio> element itself? It shouldn't be that hard to debug with some logging, assertions or breakpoints in AudioChunk.
Flags: needinfo?(roc)
Do we know why this isn't reproducing in FxOS --> FxOS call then? I've never seen this problem in that call setup. It only happens in the FxOS --> Desktop case.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #39)
> If you're not using Web Audio, I don't know where the volume value of 0
> would come from. Could it possibly be the 'volume' setting of the <audio>
> element itself? It shouldn't be that hard to debug with some logging,
> assertions or breakpoints in AudioChunk.

Looked at the source for apprtc and saw:

https://code.google.com/p/webrtc/source/browse/trunk/samples/js/apprtc/js/main.js#579

Which indicates that audio tracks are muted if <hotkey>-D is pressed.
ROC,

Would it be possible to chime in with risk analysis of fix?
Flags: needinfo?(roc)
(In reply to Preeti Raghunath(:Preeti) from comment #42)
> ROC,
> 
> Would it be possible to chime in with risk analysis of fix?

I think we need to first understand with concrete evidence if this is an app bug or a gecko bug.
(In reply to Preeti Raghunath(:Preeti) from comment #42)
> Would it be possible to chime in with risk analysis of fix?

No, because we don't know what the problem is. It sounds like Steven Lee should be able to debug it.
Flags: needinfo?(roc)
I think I found the root cause. If I am wrong, please correct me.

The reason why we cannot hear sound is that there is no MediaStream callbacked when we connect to apprtc and request audio device only. That's because the expected tracks are not matched and it is waiting for complete tracks, [1]. The audio packets are sent between the peers so that from the capture log of wireshark, we can see the rtp packets.

But in comment 34, I said that when I exchange the peer information by gupshup and request audio device only, I can hear sounds. On gupshup, it does not specify the "remote" media types. On apprtc, it adds extra-constraints in [2], and it changes the expected types in [1].

[1] http://dxr.mozilla.org/mozilla-central/source/content/media/DOMMediaStream.cpp#335
[2] https://code.google.com/p/webrtc/source/browse/trunk/samples/js/apprtc/js/main.js#17
Flags: needinfo?(slee)
That analysis makes sense - media constraints would restrict something like this. Thanks for the clarification. I'll close this as invalid, as we're technically doing the right behavior here.
Status: NEW → RESOLVED
blocking-b2g: 1.3? → ---
Closed: 10 years ago
Resolution: --- → INVALID
This doesn't actually sound correct.

Those constraints are what you are willing to accept, not requirements.
If one side does OfferToReceiveAudio and OfferToReceiveVideo and the
other side only delivers video, then you should get a callback with
a video-only stream.
Okay, so this is legitimate bug then. Sorry for the confusion.

Although this won't block 1.3 - this problem has probably been present for quite some time since desktop has been around.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Whiteboard: burirun1.3-2 [mwcdemo2014] → burirun1.3-2 [mwcdemo2014] burirun1.3-3
(In reply to StevenLee[:slee] from comment #45)
> The reason why we cannot hear sound is that there is no MediaStream
> callbacked when we connect to apprtc and request audio device only. That's
> because the expected tracks are not matched and it is waiting for complete
> tracks, [1]. The audio packets are sent between the peers so that from the
> capture log of wireshark, we can see the rtp packets.

It sounds like PeerConnectionObserverDispatch::Run, where it creates tracksAvailableCallback, is passing the wrong hints. We shouldn't add hints for tracks that are not actually going to be added to the MediaStream.
Whiteboard: burirun1.3-2 [mwcdemo2014] burirun1.3-3 → burirun1.3-2 [mwcdemo2014] burirun1.3-3 burirun1.4-2
Whiteboard: burirun1.3-2 [mwcdemo2014] burirun1.3-3 burirun1.4-2 → burirun1.3-2 [mwcdemo2014] burirun1.3-3 burirun1.4-2 1.3tarakorun2
Whiteboard: burirun1.3-2 [mwcdemo2014] burirun1.3-3 burirun1.4-2 1.3tarakorun2 → [mwcdemo2014], permafail
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][lead-review+]
This code has changed completely. This should no longer be an issue.
Status: REOPENED → RESOLVED
Closed: 10 years ago9 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: