Closed Bug 1067437 Opened 10 years ago Closed 10 years ago

[Loop] The video call shows a black image after few seconds of call

Categories

(Core :: WebRTC, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
mozilla35
blocking-b2g 2.0+
Tracking Status
firefox32 --- wontfix
firefox33 --- wontfix
firefox34 --- fixed
firefox35 --- fixed
b2g-v2.0 --- verified
b2g-v2.0M --- verified
b2g-v2.1 --- verified
b2g-v2.2 --- verified

People

(Reporter: mbarone976, Assigned: jesup)

References

(Blocks 1 open bug)

Details

(Whiteboard: [platform][blocking][feedback-requested])

Attachments

(6 files)

Attached image FlameA.png
Device: Flame
Build: v2.0 with kk v180 (gecko-2d2ca12 gaia-7edd3b0)
Loop version: c28eaef

STR
1. Flame A starts a video call with Flame B (different wifi networks)
2. Flame B answers the call

ACTUAL RESULT
After a few seconds (if the end user moves the device) the video fades to black and some parts of the screen look like big pixels.
Attached file Flame B logcat.txt
Whiteboard: [ → [mobile app]
Severity: normal → critical
Priority: -- → P1
Attached file Flame A logcat.txt
I am also hitting this bug when making video Loop calls on the same WiFi network.

I think this issue has not been reproduced with Jelly Bean builds. Diego, Randell, are you aware of any issue that makes this happen?
Flags: needinfo?(rjesup)
Flags: needinfo?(dwilson)
Whiteboard: [mobile app] → [mobile app][blocking][feedback-requested]
OS: Mac OS X → Gonk (Firefox OS)
Hardware: x86 → ARM
This seems like the dynamic resolution bug.

Jay, wdyt?
Flags: needinfo?(dwilson) → needinfo?(jaywang)
Can you check if the fixes in https://bugzilla.mozilla.org/show_bug.cgi?id=1063883 are included?
Flags: needinfo?(jaywang)
(In reply to Jay from comment #5)
> Can you check if the fixes in
> https://bugzilla.mozilla.org/show_bug.cgi?id=1063883 are included?

Based on the bug description, I would say those fixes are included, as the Gecko commit of the build is:

https://github.com/mozilla/gecko-dev/commit/2d2ca127a19e47a996d20a33dd9f089b96d2ca0a

which corresponds to:

https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/13e04ab68621

and by then, the fixes for bug 1063883 were already uplifted.
yes, bug 1063883 is included in 2.0/b2g32.  https://tbpl.mozilla.org/?tree=Mozilla-B2g32-v2.0&rev=de70f9a40834

Jay, you indicated there was an issue with the encoder or decoder "a while" after we switch resolutions - could this be that bug?
Flags: needinfo?(rjesup) → needinfo?(jaywang)
A few questions for TEF:
1) have you ever *not* seen this with v180?  Do you see this on JB base builds?
2) I presume from the report above that audio continues to be received, and that video is received initially
3) as a quick test, can you set media.navigator.load_adapt to false in the prefs?
4) can you turn on NSPR_LOG_MODULES=signaling:6 in b2g.sh and get a logcat?  (it will be a lot larger).

Thanks!
Flags: needinfo?(mbarone976)
Component: Gaia::Loop → WebRTC
Product: Firefox OS → Core
Could this be related to bug 1068394?  Jay?
Note: With a fresh b2g32 build I see the same issue now.  I'm going to turn on a bunch of logging
Ok: as I suspected, I see a switch from 320x240 to 192x256.  The encoder switches, apparently cleanly, then the decoder sees the IDR (SPS/PPS/Iframe), and continues to decode - though I note it was still decoding 320x240, which is what we configured it to initially (which is I presume the maximum output dimension).  I had a patch that would reconfigure the decoder as well, but per Jay that shouldn't be needed.  

The final testing of the resolution change was being done by jay since my KK builds were broken at the time (this patch was just before v180 came out); is it possible there's a decoder (or encoder) update that we don't have in v180?  Also, a repeat of the previous question: is this the issue Jay indicated he saw "a while" after switching?  This appears to happen immediately on switching, however.  Jay/Diego, is this happening in your builds/tests?

Thanks!

(mbarone/tef: we don't need the logs; thanks)
Hi Randell, I'm replying instead of Massimo.
My tested devices are fire-e with JB 2.0, and flame with KK v180 2.0
1) We see this happening on KK base builds with v180. It also happens with JB base builds for 2.0: video freezes but audio keeps playing.
2) Yes, audio was still being received, and video is initially received on fire-e, then freezes, then goes green. Flame plays video all the time.
3) Yes, setting media.navigator.load_adapt to false solves the problem; now we can see video on fire-e.
4) I will capture logs with that variable set now.
Flags: needinfo?(mbarone976)
ok, I won't send logs then. Thanks!
(In reply to Oscar Patiño González from comment #12)
> Hi Randell, I'm replying instead of Massimo.
> My tested devices are fire-e with JB 2.0, and flame with KK v180 2.0
> 1) we see this happening on KK base builds with v180. It happens too with JB
> base builds for 2.0, video freezes but audio keeps playing. 

Ok, this says to me that it's happening on fire-e with JB, and flame with v180.

> 2)yes audio was still being received and video is initially received on
> fire-e, then freezes then goes green. Flame plays video all the time.

This confused me about flame.  Does it work, or not?

> 3)yes, setting media.navigator.load_adapt to false solves the problem, now
> we can see video on fire-e.

That's useful.  Is this fire-e to fire-e?
(In reply to Randell Jesup [:jesup] from comment #7)
> yes, bug 1063883 is included in 2.0/b2g32. 
> https://tbpl.mozilla.org/?tree=Mozilla-B2g32-v2.0&rev=de70f9a40834
> 
> Jay, you indicated there was an issue with the encoder or decoder "a while"
> after we switch resolutions - could this be that bug?
Randell, 

As I remember, bug 1063883 is the one that causes the black pixel blocks. The other issue that I am aware of is an encoder failure, which should cause the screen to freeze.

The encoder failure can be identified from the kernel log.

If you see the following error messages in the kernel log, they indicate the encoder failure. This requires a new ADSP image. We are currently working to get it mainlined.

<7>[  328.926396] msm_vidc: 1: Core f2244800 and inst e7878000 are in bad state
<7>[  328.926485] msm_vidc: 1: Core f2244800 and inst e787b000 are in bad state
<7>[  328.926713] msm_vidc: 1: Core is in bad state can't change the state
<7>[  328.926722] msm_vidc: 1: Failed to move from state: 17 to 13
<7>[  328.926731] msm_vidc: 1: Failed to move inst: e787b000 to release res done
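
As a rough aid for spotting this in a captured kernel log, the check can be sketched in Python. This is only a sketch: the function name find_encoder_failures is hypothetical, and the patterns are copied from the msm_vidc lines quoted above, so other kernel versions may word these messages differently.

```python
import re
from typing import List

# Substrings taken from the msm_vidc lines quoted above (assumption: other
# kernel versions may phrase these messages differently).
_ENCODER_FAILURE = re.compile(r"msm_vidc:.*(in bad state|Failed to move)")


def find_encoder_failures(kernel_log: str) -> List[str]:
    """Return the kernel log lines that match the encoder-failure patterns."""
    return [line for line in kernel_log.splitlines()
            if _ENCODER_FAILURE.search(line)]
```

The input could be, for example, text saved from `adb shell dmesg`; an empty result only means none of these particular messages were found.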
Flags: needinfo?(jaywang)
One thing that comes to mind is that your ADSP image may be old. There were a few fixes in ADSP a while back that relate to SPS/IDR.
Please check the metabuild info with the following ADB commands:
 adb root 
 adb shell cat /firmware/verinfo/ver_info.txt
Flags: needinfo?(rjesup)
Flags: needinfo?(opatinobugzilla)
Flags: needinfo?(mbarone976)
M8610AAAAANFYD1520.3 

I may have found a smoking gun.

After I reconfigure the encoder for 192x256, the next Encoded() callback we get is marked as an iframe, but even after scanning it with isParamSets() we didn't find SPS/PPS in the buffer, so we inserted the old SPS/PPS.  You can verify they weren't there because we send the inserted ones (len 26 and 16), then the iframe (1172x4 + 56); if they were there and we missed them, you'd see two each SPS and PPS.
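
For reference, one way to check a buffer for parameter sets is sketched here in Python. It assumes the encoder output is an Annex B byte stream (NAL units separated by 00 00 01 or 00 00 00 01 start codes); the helper names are hypothetical and this is not the isParamSets() implementation in Gecko. NAL unit types 7, 8 and 5 are SPS, PPS and IDR slice in H.264.

```python
from typing import List

SPS, PPS, IDR = 7, 8, 5  # H.264 nal_unit_type values


def nal_unit_types(buf: bytes) -> List[int]:
    """Return the nal_unit_type of every NAL unit in an Annex B buffer."""
    types = []
    pos = 0
    while True:
        # A 4-byte start code (00 00 00 01) also ends in 00 00 01,
        # so searching for the 3-byte form finds both.
        pos = buf.find(b"\x00\x00\x01", pos)
        if pos < 0 or pos + 3 >= len(buf):
            break
        # The low 5 bits of the first byte after the start code are the type.
        types.append(buf[pos + 3] & 0x1F)
        pos += 3
    return types


def has_param_sets(buf: bytes) -> bool:
    """True if the buffer carries both an SPS and a PPS."""
    types = nal_unit_types(buf)
    return SPS in types and PPS in types
```

With a check like this, an I-frame buffer that reports type 5 but neither 7 nor 8 would match the situation described above.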

I'll attach the filtered log; I have the full one if you want
Flags: needinfo?(rjesup) → needinfo?(jaywang)
Attached file no_sps
(In reply to Randell Jesup [:jesup] from comment #17)
> M8610AAAAANFYD1520.3 
> 
> I may have found a smoking gun.
> 
> After I reconfigure the encoder for 192x256, the next Encoded() callback we
> get is marked as an iframe, but even after scanning it with isParamSets() we
> didn't find SPS/PPS in the buffer, so we inserted the old SPS/PPS.  You can
> verify they weren't there because we send the inserted ones (len 26 and 16),
> then the iframe (1172x4 + 56); if they were there and we missed them, you'd
> see two each SPS and PPS.
> 
> I'll attach the filtered log; I have the full one if you want
M8610AAAAANFYD1520.3 has the old ADSP. The SPS/PPS prefix support is in M8610AAAAANFYD1530.1 meta. I think this is the reason that SPS/PPS is missing.

However, even with the 1530.1 meta, in some cases the encoder can freeze during a resolution change (when it goes down to 144x192). As mentioned in comment #15, we are trying to mainline this fix. I suggest that we disable load adaptation, or at least resolution changes, until the new meta is released.
Flags: needinfo?(jaywang)
[Blocking Requested - why for this release]:
Until we have base releases with the new DSP code rev, we need to disable resolution adaptation for h.264 (only).
blocking-b2g: --- → 2.0?
Flags: needinfo?(opatinobugzilla)
Flags: needinfo?(mbarone976)
Whiteboard: [mobile app][blocking][feedback-requested] → [mobile app][blocking][feedback-requested][leave-open]
Randell,

This is a follow-up on the OMXCodecWrapper: MediaCodec error:-38 that we discussed over IRC.

I looked at this a little bit more and I think I understand what causes this error.

1. When OMXVideoEncoder::ConfigureDirect() is called from webRTC to re-configure the codec, ::ConfigureDirect() actually stops the codec itself.
2. While the codec is stopped, DrainOutput() on another thread keeps calling dequeueOutputBuffer(). This causes MediaCodec::handleDequeueOutputBuffer(), which is called by dequeueOutputBuffer(), to return the error "INVALID_OPERATION(-38)" since the codec is in the stopped state.

I see that DrainOutput() just returns without doing anything once it sees the error condition, so it should not impact functionality. However, ideally, the DrainOutput() thread should be stopped while the codec re-configuration is taking place. Can this be done easily?
Flags: needinfo?(rjesup)
(In reply to Randell Jesup [:jesup] from comment #14)
> (In reply to Oscar Patiño González from comment #12)
> > Hi Randell, I'm replying instead of Massimo.
> > My tested devices are fire-e with JB 2.0, and flame with KK v180 2.0
> > 1) we see this happening on KK base builds with v180. It happens too with JB
> > base builds for 2.0, video freezes but audio keeps playing. 
> 
> Ok, this says to me that it's happening on fire-e with JB, and flame with
> v180.
> 
> > 2)yes audio was still being received and video is initially received on
> > fire-e, then freezes then goes green. Flame plays video all the time.
> 
> This confused me about flame.  Does it work, or not?
> 
To be honest, I don't know whether the flame (KK) encoder stops or the fire-e (JB) decoder stops. In any case, it means that flame correctly decodes a correctly formed bitstream from fire-e.
> > 3)yes, setting media.navigator.load_adapt to false solves the problem, now
> > we can see video on fire-e.
> 
> That's useful.  Is this fire-e to fire-e?
This case is flame to fire-e. The change was made on flame (KK).
(In reply to Jay from comment #21)
> I see that DrainOutput() just returns without doing anything once it sees
> error condition so it should not impact the functionality. However, ideally,
> the DrainOutput() thread should be stopped while codec re-configuring is
> taking place. Can this be done easily?

Not too hard though one must be careful with multithreaded/locked code; likely just wait on a monitor if a new state variable (mPaused) is true.  Gotta be careful we don't miss transitions from Paused to Ending, etc.  A new bug; since there's no negative impact other than some CPU and logs in a rare-ish case this can get done later.
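
A minimal sketch of that idea, written in Python rather than the Gecko C++ for brevity. The class and method names are hypothetical; `_paused` plays the role of the proposed mPaused, and `_ending` covers the Paused-to-Ending transition. pause() waits until the drain thread has actually parked, so no dequeue call can overlap the reconfigure.

```python
import threading


class PausableDrain:
    """Drain loop that can be parked while the codec is reconfigured."""

    def __init__(self, dequeue):
        self._dequeue = dequeue        # stands in for dequeueOutputBuffer()
        self._cond = threading.Condition()
        self._paused = False           # hypothetical analogue of mPaused
        self._parked = False           # True while the thread waits in pause
        self._ending = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def pause(self):
        """Block until the drain thread is parked (or not running)."""
        with self._cond:
            self._paused = True
            while self._thread.is_alive() and not (self._parked or self._ending):
                self._cond.wait()

    def resume(self):
        with self._cond:
            self._paused = False
            self._cond.notify_all()

    def stop(self):
        with self._cond:
            self._ending = True
            self._cond.notify_all()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                # Re-check both flags after every wake-up so a transition
                # from paused straight to ending is not missed.
                while self._paused and not self._ending:
                    self._parked = True
                    self._cond.notify_all()   # tell pause() we are parked
                    self._cond.wait()
                self._parked = False
                if self._ending:
                    return
            self._dequeue()                   # called without holding the lock
```

In this sketch a reconfigure would be bracketed by pause() and resume(). The dequeue call is made outside the lock so that a slow dequeue does not block stop() or resume() from setting their flags.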
Flags: needinfo?(rjesup)
Blocks: 1069443
Comment on attachment 8491667 [details] [diff] [review]
Disable resolution changes on OMX H.264 until OMX DSP code is updated

I've tested this (on OpenH264 with the ifdef removed) on VP8 (resolution changes) and H.264 (no resolution change)
Attachment #8491667 - Flags: review?(pkerr)
Attachment #8491667 - Flags: review?(pkerr) → review+
Whiteboard: [mobile app][blocking][feedback-requested][leave-open] → [blocking][feedback-requested][leave-open]
Whiteboard: [blocking][feedback-requested][leave-open] → [platform][blocking][feedback-requested][leave-open]
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb5bd78a635f

Note: temporary hack until we're sure to be running on an encoder/decoder that handles this properly. If we have to continue to support OMX h.264 *without* resolution changes, this would need to be a pref. I would strongly prefer not to support that combination, and we have no easy way to check the DSP version.
Target Milestone: --- → mozilla35
Comment on attachment 8491667 [details] [diff] [review]
Disable resolution changes on OMX H.264 until OMX DSP code is updated

[Approval Request Comment]
Bug caused by (feature/regressing bug #): webrtc content analysis

User impact if declined: Video broken after resolution changes (until DSP firmware 1530.1 lands)

Testing completed: tested by myself and QC

Risk to taking this patch (and alternatives if risky): Very low risk; just removed the callback that changes the resolution.  Also, h.264 is only enabled on QC's repo or manually by a user or developer on Flame (for now).

String or UUID changes made by this patch: none
Attachment #8491667 - Flags: approval-mozilla-b2g32?
Attachment #8491667 - Flags: approval-mozilla-aurora?
(In reply to mbarone from comment #0)
> Created attachment 8489423 [details]
> FlameA.png
> 
> Device: Flame
> Build: v2.0 with kk v180 (gecko-2d2ca12 gaia-7edd3b0)
> Loop version: c28eaef
> 
> STR
> 1. Flame A starts a video call with Flame B (different wifi networks)
> 2. Flame B answers the call
> 
> ACTUAL RESULT
> After a few seconds (if the end user moves the device) the video fades to black
> and some parts of the screen look like big pixels.

Can you please help us verify that the issue is fixed now on a nightly, before we uplift on branches? Thanks!
Flags: needinfo?(mbarone976)
I can't verify this until bug 1067442 (H.264 OMX video isn't encoded in Webrtc on b2g 2.2) is fixed.
Flags: needinfo?(mbarone976)
Massimo, can you verify it in master as bug 1067442 has just landed?
Thanks a lot!
Flags: needinfo?(mbarone976)
blocking-b2g: 2.0? → 2.0+
MOZ_WEBRTC_OMX is not defined in video_engine_core.gypi, so it does not apply to vie_encoder.cc and the code that disables resolution changes has no effect.
Opening a new bug with the patch.
Please refer to the new patch: https://bugzilla.mozilla.org/show_bug.cgi?id=1073486

@Massimo: with that patch, video on the KitKat Android version with master no longer collapses when transmitting high-complexity (high-movement) video.
Depends on: 1073486
Still waiting on verification here before approving for branches. Massimo, any update?
Hi, the video with the patch works fine. We have tested in both conditions: Device vs Device and Desktop vs Device.

Device vs Device: The video works, but no audio is transmitted.
Device vs Desktop: Video works fine and the audio is transmitted but with delay (3/4 seconds of delay)
Flags: needinfo?(mbarone976)
This change should have no impact on audio, so any issue there would likely be different/existing/network-related.
Attachment #8491667 - Flags: approval-mozilla-b2g32?
Attachment #8491667 - Flags: approval-mozilla-b2g32+
Attachment #8491667 - Flags: approval-mozilla-aurora?
Attachment #8491667 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/53eb97ab8c3a
https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/e3545ca967ae

Setting the b2g statuses to fixed for tracking purposes. Feel free to set them back to affected if and when there's more to uplift here.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [platform][blocking][feedback-requested][leave-open] → [platform][blocking][feedback-requested]
Hi Jay, any update on when the new DSP code version that fixes this issue will be provided to the OEM?
Flags: needinfo?(jaywang)
(In reply to Maria Angeles Oteo (:oteo) from comment #38)
> hi Jay, any update about the date the new DSP code version that fixes this
> issue will be provided to the OEM?

Tentatively sometime next week.
Flags: needinfo?(jaywang)
This issue has been verified successfully on Flame 2.0, 2.1, 2.2 and woodduck 2.0.
The Loop version:bd8f1c2
See attachment: 1713.MP4
Reproducing rate: 0/5

Step:
1. Flame A starts a video call with Flame B (different wifi networks).
2. Flame B answers the call.

Actual result:
The screen still displays normally during the call; the black image does not appear.

Woodduck version:
Gaia-Rev        ead3b72a84512750bc5faff4e9e8faa1715c0d05
Gecko-Rev       8d40d6480ee0e628b0f7655dcd6ff79a2f2fbcfc
Build-ID        20141211050313
Version         32.0
Device-Name     jrdhz72_w_ff
FW-Release      4.4.2
FW-Incremental  1418245573
FW-Date         Thu Dec 11 05:06:41 CST 2014

Flame 2.1 version:
Gaia-Rev        c226db212db4d824c09617cd6dc407b2d4258d9b
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-2g34_v2_1/rev/cf8bebfa4703
Build-ID        20141210001201
Version         34.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141210.035300
FW-Date         Wed Dec 10 03:53:11 EST 2014
Bootloader      L1TC00011880

Flame 2.2 version:
Gaia-Rev        e17c5656dbf517d48fb61ac9bc92119e023fd717
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/be1f49e80d2d
Build-ID        20141210040201
Version         37.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141210.074809
FW-Date         Wed Dec 10 07:48:20 EST 2014
Bootloader      L1TC00011880

Flame 2.0 version:
Gaia-Rev        856863962362030174bae4e03d59c3ebbc182473
Gecko-Rev       https://hg.mozilla.org/releases/mozilla-2g32_v2_0/rev/2d0860bd0225
Build-ID        20141210000202
Version         32.0
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141210.034839
FW-Date         Wed Dec 10 03:48:50 EST 2014
Bootloader      L1TC00011880
Attached video 1713.MP4