Closed Bug 989945 Opened 7 years ago Closed 6 years ago

[B2G][WebRTC] long video lag when using H.264 codec.

Categories

(Core :: WebRTC: Audio/Video, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
mozilla32
feature-b2g 2.0

People

(Reporter: jhlin, Assigned: jesup)

References

Details

(Whiteboard: [p=0, POVB, ft:loop])

Attachments

(22 files, 6 obsolete files)

11.10 KB, application/pdf
Details
1.21 KB, application/x-shellscript
Details
15.41 KB, application/pdf
Details
59.71 KB, application/pdf
Details
4.46 KB, patch
Details | Diff | Splinter Review
3.53 KB, text/plain
Details
15.62 KB, application/pdf
Details
4.63 MB, text/plain
Details
7.48 MB, text/plain
Details
5.22 KB, patch
Details | Diff | Splinter Review
5.96 KB, patch
Details | Diff | Splinter Review
5.16 KB, patch
Details | Diff | Splinter Review
24.16 KB, application/pdf
Details
1.45 MB, text/plain
Details
30.82 KB, application/pdf
Details
3.95 KB, patch
jhlin
: review+
Details | Diff | Splinter Review
8.06 KB, patch
jhlin
: review+
Details | Diff | Splinter Review
5.17 KB, patch
jhlin
: review+
Details | Diff | Splinter Review
2.13 MB, text/plain
Details
33.73 KB, application/pdf
Details
4.38 MB, application/x-zip-compressed
Details
49.79 KB, application/pdf
Details
Apply patches in bug 911046 and start a video call. The remote video has several seconds lag.
John -- Are we still seeing this problem?
Flags: needinfo?(jolin)
(In reply to Maire Reavy [:mreavy] (Please needinfo me if you need my attention on a bug) from comment #1)
> John -- Are we still seeing this problem?

Yes, although the situation is much better in browser to flame video call.
I think we should address frame dropping issue (bug 989944) before checking this one.
Flags: needinfo?(jolin)
Conclusion of sync up meeting with maire, we will use this bug to trace low latency setting
Update here to make sure everyone on the same page
Attached file omx.pdf
Here's a summary of the state of play.

1. As Jesup observes, the encoder is exceeding the target bit rate.
2. This causes ViE to do a lot of frame dropping in order to bring
the bit rate down. You can see this in the WebRTC logs as:
"Drop frame due to bitrate", thus the effective frame rate is
very low, even if you introduce frames at 30fps.
3. The encoder latency is somewhat high (>100ms), but this is a
secondary problem.
4. The PPS packet is being dropped because of a timestamp glitch(see bug: 1005418).
Fixing this does not appear to fix the decoder latency problem (#5)
5. The decoder has very high latency. The first decoded frame is
around 5 seconds after the initial frame passed to the decoder.
Initial experiments indicate that this is some sort of buffering:
the first frame pops out reliably around 20 frames into the decoding
process. Injecting fewer frames and waiting does not result in frames
being output.

I have attached a figure which shows what happens over time. The lines represent:

inserted -- frames inserted
encode -- frames presented to the OMX encoder
encoded -- frame emitted from the OMX encoder (includes settings frames)
encoded_frames -- actual frames emitted from the OMX encoder
decode -- frames presented to the OMX decoder
decoded -- frames emitted by the OMX decoder

As you can see, there is excessive latency on the decoder long before ViE
starts to throttle incoming frames.


I have developed a test program which allows easy diagnosis of this in
a reproducible fashion. It runs as a standalone test. The next comment will
contain details of how to run it.
More information on this. I've built a test harness which you can find at:
https://github.com/ekr-webrtc/gecko-dev/tree/omx_benchmark2.

To run it:
1. Build and then flash gecko.
2. Copy the script (attached) to the phone in /data/
3. Create a test YUV file and copy it to the phone:

You can use any 176x144 y4m file as input
and rescale/convert as described in:

https://bugzilla.mozilla.org/show_bug.cgi?id=890419

I got mine from:
http://downloads.webmproject.org/ietf_tests/niklas_1280_720_30.yuv.xz

And then converted to y4m with:
mplayer -demuxer rawvideo -rawvideo w=1280:h=720:format=i420 niklas_1280_720_30.yuv -vo yuv4mpeg

And then rescaled with yuvscaler.

4. Run the benchmark with adb as:
adb shell sh /data/b2g-benchmark-nolog.sh -video-benchmark-file /data/<filemame>.y4m -video-benchmark-receive -video-benchmark-frames 300



The salient output value is the output "First frame output delay", as-in:

"First frame output delay 5016"

This number is in ms and should be more like "30"
I tweaked OMXCodecWrapper.cpp to set the bitrate to 300K; it definitely dropped the rate (though I'm not sure it's enough); it was 2Mbps.  I am noticing that iframes are coming every 30 frames (3 seconds), while OMXCodecWrapper configured them for 1/second and we told it 30fps I believe.  That may be another clue.  

We should check all the defaults set up in OMXCodecWrapper
Dropping the bitrate appears to have gotten rid of the initial longer delay decoding - a graph now shows a consistent 1-second delay, with a drop to around 1/2 second at an IDR.

https://developer.qualcomm.com/forum/qdevnet-forums/multimedia-optimization-qdsp/12686
suggests
setPictureOrder(int pictureOrder) with QOMX_VIDEO_DECODE_ORDER
This requires the patch from bug 1005637; WIP with a couple of things short-circuited like framerate to match what I'm seeing
Depends on: 1005637
(In reply to Randell Jesup [:jesup] from comment #9)
> Dropping the bitrate appears to have gotten rid of the initial longer delay
> decoding - a graph now shows a consistent 1-second delay, with a drop to
> around 1/2 second at an IDR.

Well, sort of.

Basically, the first frame of the decoder is delayed by 20 frames,
but once you reduce the bit rate, you get less pre-encoder
frame dropping and hence a lower time delay, but the same frame-level
delay. As you say, delay drops every second or so, presumably at
IDR frames, but what actually happens is a burst of decoded frames
being emitted in one shot. I will attach a graph showing this shortly.



> https://developer.qualcomm.com/forum/qdevnet-forums/multimedia-optimization-
> qdsp/12686
> suggests
> setPictureOrder(int pictureOrder) with QOMX_VIDEO_DECODE_ORDER
Attached file log5.pdf
First graph is deltas between "Decoder inserted" logs, and "Decoder output" logs (these are effectively the inter-frame pacing, modified by any delays caused by waiting for buffers.

Second graph is delay between a specific frame being inserted for decode and when that frame pops out of the decoder.

The third graph is bits/frame and running average bits/frame with a 10-frame decay period (avg = (avg*9 + new_value)/10)
(In reply to Randell Jesup [:jesup] from comment #9)
> Dropping the bitrate appears to have gotten rid of the initial longer delay
> decoding - a graph now shows a consistent 1-second delay, with a drop to
> around 1/2 second at an IDR.
> 
> https://developer.qualcomm.com/forum/qdevnet-forums/multimedia-optimization-
> qdsp/12686
> suggests
> setPictureOrder(int pictureOrder) with QOMX_VIDEO_DECODE_ORDER

This configuration parameter is floating around in OMX_QCOMExtns.h...

   OMX_QcomIndexParamVideoDecoderPictureOrder = 0x7F000010

However, my preliminary investigation doesn't show it being exposed
to Gecko...
Thats ok. Just import the constant, detect that its QC silicon/driver and then use it.
You also need to directly access the OMX object so you can do
setParameter() and that's hidden behind a couple of layers of
libstagefright.
I hacked libstagefright to (in theory) set OMX do DECODE_ORDER rather than DISPLAY_ORDER with the patch above. You need to push the libstagefright*.so
files onto the phone directly (./flash.sh gecko doesn't do it). Unfortunately, this doesn't seem to help. There may well be something stupidly wrong with my patch, since I'm just guessing at the parameters. Note that I am setting both
the input and output ports, which probably shouldn't be necessary, but I didn't know which one.
For your reference.
(In reply to Eric Rescorla (:ekr) from comment #11)
> (In reply to Randell Jesup [:jesup] from comment #9)
> > Dropping the bitrate appears to have gotten rid of the initial longer delay
> > decoding - a graph now shows a consistent 1-second delay, with a drop to
> > around 1/2 second at an IDR.
> 
> Well, sort of.
> 
> Basically, the first frame of the decoder is delayed by 20 frames,
> but once you reduce the bit rate, you get less pre-encoder
> frame dropping and hence a lower time delay, but the same frame-level
> delay. As you say, delay drops every second or so, presumably at
> IDR frames, but what actually happens is a burst of decoded frames
> being emitted in one shot. I will attach a graph showing this shortly.

 Burst of decoded frames is because of I-frame as you suspected. It won't happen if I

  format->setInt32("i-frame-interval", 0x7FFFFFFF / aFrameRate);
(In reply to John Lin[:jolin][:jhlin] from comment #19)
> (In reply to Eric Rescorla (:ekr) from comment #11)
> > (In reply to Randell Jesup [:jesup] from comment #9)
> > > Dropping the bitrate appears to have gotten rid of the initial longer delay
> > > decoding - a graph now shows a consistent 1-second delay, with a drop to
> > > around 1/2 second at an IDR.
> > 
> > Well, sort of.
> > 
> > Basically, the first frame of the decoder is delayed by 20 frames,
> > but once you reduce the bit rate, you get less pre-encoder
> > frame dropping and hence a lower time delay, but the same frame-level
> > delay. As you say, delay drops every second or so, presumably at
> > IDR frames, but what actually happens is a burst of decoded frames
> > being emitted in one shot. I will attach a graph showing this shortly.
> 
>  Burst of decoded frames is because of I-frame as you suspected. It won't
> happen if I
> 
>   format->setInt32("i-frame-interval", 0x7FFFFFFF / aFrameRate);

Can you please supply a patch for this? It's not clear from this comment where you
want the change.
(In reply to Eric Rescorla (:ekr) from comment #20)
> (In reply to John Lin[:jolin][:jhlin] from comment #19)
> > (In reply to Eric Rescorla (:ekr) from comment #11)
> > > (In reply to Randell Jesup [:jesup] from comment #9)
> > > > Dropping the bitrate appears to have gotten rid of the initial longer delay
> > > > decoding - a graph now shows a consistent 1-second delay, with a drop to
> > > > around 1/2 second at an IDR.
> > > 
> > > Well, sort of.
> > > 
> > > Basically, the first frame of the decoder is delayed by 20 frames,
> > > but once you reduce the bit rate, you get less pre-encoder
> > > frame dropping and hence a lower time delay, but the same frame-level
> > > delay. As you say, delay drops every second or so, presumably at
> > > IDR frames, but what actually happens is a burst of decoded frames
> > > being emitted in one shot. I will attach a graph showing this shortly.
> > 
> >  Burst of decoded frames is because of I-frame as you suspected. It won't
> > happen if I
> > 
> >   format->setInt32("i-frame-interval", 0x7FFFFFFF / aFrameRate);
> 
> Can you please supply a patch for this? It's not clear from this comment
> where you
> want the change.

I patched this into OMXCodecWrapper. The result is a constant delay but still
highly delayed.e
Please note that the Flame device has a JB based Gonk whereas the launch device Flame will be KK based. The Flame device has an older version of the DSP.

I've gotten a working build with the following components running on KK on 8610:
1) tip of master as of yesterday
2) Added the patches in Randell's list [1]
3) Added some extra debugs in WebrtcOMXH264VideoCodec.cpp (Randell, do you have a patch to turn on the timestamps of encoded/decoded packets to determine latencies that you had sent me a log earlier?)
4) Using the right version of the DSP that we will be using.

Using the above build. I'm able to select the H264 codec using Randell's test webpage and the "one way call" [2].
I do see some initial delays and the encode patch is definitely sluggish.

I'm attaching a logcat log of what I'm seeing. 

[1] http://hg.mozilla.org/users/rjesup_wgate.com/h264_omx
[2] http://mozilla.github.io/webrtc-landing/pc_test.html
Attached file webrtc_2.1.txt
Log of H264 loopback
(In reply to Mandyam Vikram from comment #23)
> Please note that the Flame device has a JB based Gonk whereas the launch
> device Flame will be KK based. The Flame device has an older version of the
> DSP.

Hmm... Can we get more modern devices?


> I've gotten a working build with the following components running on KK on
> 8610:
> 1) tip of master as of yesterday
> 2) Added the patches in Randell's list [1]
> 3) Added some extra debugs in WebrtcOMXH264VideoCodec.cpp (Randell, do you
> have a patch to turn on the timestamps of encoded/decoded packets to
> determine latencies that you had sent me a log earlier?)
> 4) Using the right version of the DSP that we will be using.
> 
> Using the above build. I'm able to select the H264 codec using Randell's
> test webpage and the "one way call" [2].
> I do see some initial delays and the encode patch is definitely sluggish.

If you use the test programs I sent you in:
https://bugzilla.mozilla.org/show_bug.cgi?id=989945#c6

The relevant metric will be printed on stderr.


> I'm attaching a logcat log of what I'm seeing. 
> 
> [1] http://hg.mozilla.org/users/rjesup_wgate.com/h264_omx
> [2] http://mozilla.github.io/webrtc-landing/pc_test.html
(In reply to Mandyam Vikram from comment #23)
> Please note that the Flame device has a JB based Gonk whereas the launch
> device Flame will be KK based. The Flame device has an older version of the
> DSP.
> 
> I've gotten a working build with the following components running on KK on
> 8610:
> 1) tip of master as of yesterday
> 2) Added the patches in Randell's list [1]
> 3) Added some extra debugs in WebrtcOMXH264VideoCodec.cpp (Randell, do you
> have a patch to turn on the timestamps of encoded/decoded packets to
> determine latencies that you had sent me a log earlier?)

edit b2g.sh to turn on logging:

adb remount
adb pull /system/bin/b2g.sh /tmp
[ modify b2g.sh to add "export NSPR_LOG_MODULES=signaling:5,WebrtcOMXH264VideoCodec:6,timestamp"
adb push /tmp/b2g.sh /system/bin
adb shell chmod 0777 /system/bin/b2g.sh
(In reply to Eric Rescorla (:ekr) from comment #25)
> (In reply to Mandyam Vikram from comment #23)
> > Please note that the Flame device has a JB based Gonk whereas the launch
> > device Flame will be KK based. The Flame device has an older version of the
> > DSP.
> 
> Hmm... Can we get more modern devices?

Or a new base build to replace V10E, based on KK?
(In reply to Mandyam Vikram from comment #23)
> Please note that the Flame device has a JB based Gonk whereas the launch
> device Flame will be KK based. The Flame device has an older version of the
> DSP.
> 
> I've gotten a working build with the following components running on KK on
> 8610:
> 1) tip of master as of yesterday
> 2) Added the patches in Randell's list [1]
> 3) Added some extra debugs in WebrtcOMXH264VideoCodec.cpp (Randell, do you
> have a patch to turn on the timestamps of encoded/decoded packets to
> determine latencies that you had sent me a log earlier?)
> 4) Using the right version of the DSP that we will be using.

I don't think you're using the right set of patches from [1], as I don't see debugs like:

I/PRLog   ( 1161): 2014-05-04 13:34:39.234837 UTC - 23203624[b3eacf00]: [OMXOutputDrain|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:589: Encoded frame: 2431 bytes, 240x320, is_param 0, is_iframe 0, timestamp 502007660, captureTimeMs 2945590704

or 

I/PRLog   ( 1161): 2014-05-04 13:34:39.298323 UTC - 23198952[b14f9080]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:330: Decoder input: 2283 bytes (NAL 0x41), time 12764461 (254551872), flags 0x0

This would be in the "omx_logs" patch
 
> [1] http://hg.mozilla.org/users/rjesup_wgate.com/h264_omx

Reminder: Apply them in the order given in 'series'
Setting up call and emailed question on vendor plans to support low latency decode, which impacts the  2 usability bugs.  May be fixed with new firmware designed for KitKat.  Requested access.
Bug 989944 - [B2G][WebRTC] Serious frame dropping when enabling HW H.264 video in real-time mode.
Bug 989945 - [B2G][WebRTC] long video lag when using H.264 codec.
Priority: -- → P2
Whiteboard: [s=fx32, p=0]
Target Milestone: --- → mozilla32
Yes, I had missed applying the omx_logs patch. I'm attaching updated logs.
Attached file webrtc_log.txt
(In reply to Mandyam Vikram from comment #31)
> Created attachment 8419063 [details]
> webrtc_log.txt

I'll note that this doesn't seem to be generated with the NSPR_LOG_MODULES I gave, and I'm not sure why they're "I/Gecko" instead of "I/PRLog".  Without the timing info from "timestamp" there's not a lot useful I can do with it.  (I'm also wonder that at some level of extra logging beyond what we need it might start to be a noticeable hit on performance, but perhaps this is fine).

Also: other patches in my queue are missing it appears.  The encoder is generating way over-bitrate results as if configureDirect and use_configure patches aren't applied.  The initial iframe is 6KB, you have a packet or two at a reasonable bitrate (1215, 2221 bytes), then by frame 5 the frame size is averaging 14KB (about 1MB/s if it's 10fps, 3MB/s if it's 30fps).   It should be starting at either 300Kbps, and then slowly ramp from 300 to 600.  Per the other bug I filed, it appears as if setting the bitrate isn't working if we started at 2Mbps (configured starting rate set without those two patches).

I'm going add a patch to change the 'bitrate' patch  to make the default max to 600 (I was doing it via a pref, but it's easier to not have to mess with a user_pref by hand).  I'll also make it make h264 the default video codec.

Please make sure you have current versions of all the patches applied, and set NSPR_LOG_MODULES as stated, and dump the logs that way (I use simple adb logcat to a file, starting right before I hit "Start").  Thanks!
Attached file Log of loopback with latest KK build (obsolete) —
:jesup,

I'm using the same setup as Vikram in comment 23 plus the NSOR_LOG_MODULES set. Please find the log attached.

I do see some extra WebRTC messages in the log. Does it look like what you expected?
Flags: needinfo?(rjesup)
Attachment #8419802 - Flags: feedback?(rjesup)
Diego, Manyam, Jesup,

Can I ask what we're hoping to get out of the NSPR logs?

It's pretty clear that:

(a) We are injecting encoded frames into the MediaCodec decoder
(b) We aren't getting decoded frames until we inject ~20 frames?
(c) This happens even when you adjust the bitrate downward so that
VideoEngine doesn't throttle input frames.

You can see this starting from the very first decode frame, so this
seems to pretty strongly implicate something inside of MediaCodec.
(In reply to Diego Wilson [:diego] from comment #33)
> Created attachment 8419802 [details]
> Log of loopback with latest KK build
> 
> :jesup,
> 
> I'm using the same setup as Vikram in comment 23 plus the NSOR_LOG_MODULES
> set. Please find the log attached.
> 
> I do see some extra WebRTC messages in the log. Does it look like what you
> expected?

yes, and my tool processes it into a graph, which I'll attach.  I presume this is the newer DSP code?

The bitrate seems better controlled than I see here, though I don't love the "wave" form.  And we ramped up to 600Kbps (search for "set bitrate") and stayed there; the "wave" continued until (it appears) the camera frame rate shifted from 15fps to 30fps.  At that point, the bandwidth smoothed out (mostly) and the delay actually dropped from ~0.75-1 second to ~0.5 seconds.  (Still way too long, but better.)

The bitrate doesn't track the configured value very well, even at 30fps (have to do more calculations, but it appears to be averaging close to 300-400K), but it's less "wavey"

Note my graph has 'x' in frames, not seconds.  (seconds would take some more work on the script.)  You can infer the rate from the frame timings (and iframe timing).

Note we can't control the framerate of the camera (this is another currently-open question - what to do if the input rate varies from the configured rate).
Flags: needinfo?(rjesup)
(In reply to Eric Rescorla (:ekr) from comment #34)
> Diego, Manyam, Jesup,
> 
> Can I ask what we're hoping to get out of the NSPR logs?
> 
> It's pretty clear that:
> 
> (a) We are injecting encoded frames into the MediaCodec decoder
> (b) We aren't getting decoded frames until we inject ~20 frames?
> (c) This happens even when you adjust the bitrate downward so that
> VideoEngine doesn't throttle input frames.
> 
> You can see this starting from the very first decode frame, so this
> seems to pretty strongly implicate something inside of MediaCodec.

The logs are merely a tool; I've found some useful bits from it (like the over-bitrate issue and others), but you're absolutely correct: nothing has touched the basic problem of decoder delay.

Is the control mentioned above to set the decode order active in this codebase?  (JB or KK versions?)  Can we access it in any way other than modifying stagefright?
Attached file diego.pdf (obsolete) —
Attachment #8419802 - Flags: feedback?(rjesup)
(In reply to Randell Jesup [:jesup] from comment #36)
> (In reply to Eric Rescorla (:ekr) from comment #34)
> > Diego, Manyam, Jesup,
> > 
> > Can I ask what we're hoping to get out of the NSPR logs?
> > 
> > It's pretty clear that:
> > 
> > (a) We are injecting encoded frames into the MediaCodec decoder
> > (b) We aren't getting decoded frames until we inject ~20 frames?
> > (c) This happens even when you adjust the bitrate downward so that
> > VideoEngine doesn't throttle input frames.
> > 
> > You can see this starting from the very first decode frame, so this
> > seems to pretty strongly implicate something inside of MediaCodec.
> 
> The logs are merely a tool; I've found some useful bits from it (like the
> over-bitrate issue and others),

Totally. I think it's really helped us home into this problem.


> but you're absolutely correct: nothing has
> touched the basic problem of decoder delay.
> 
> Is the control mentioned above to set the decode order active in this
> codebase?  (JB or KK versions?)  Can we access it in any way other than
> modifying stagefright?

Right now I'd settle for having to modify stagefright. As you can see
from comment 17, my attempt to do that totally failed.
Randell,

Are you planning on landing your H.264 WebRTC floating patches [1]? Maybe we can track that in another bug?

[1] http://hg.mozilla.org/users/rjesup_wgate.com/h264_omx
Flags: needinfo?(rjesup)
Attachment #8417034 - Attachment is obsolete: true
(In reply to Diego Wilson [:diego] from comment #39)
> Randell,
> 
> Are you planning on landing your H.264 WebRTC floating patches [1]? Maybe we
> can track that in another bug?

I've updated my patch queue (just cleanup and reordering), and some are up for review.  Also I made put a some others up on bugs (like the omx logging patch here).  A couple are only for debugging this problem (in particular the 8x10 tweaks for decode delay).  The 8x10 settings for bitrate is a temporary patch until we can add per-codec bitrate limits (which will be a new bug); that patch also sets the default codec to h264.
Flags: needinfo?(rjesup)
The idea is to add a function to ExtendedCodec that exposes parameters supported by Qualcomm and call it in ACodec::setParameters(). It also exposes key strings for those parameters to avoid hard-coding in Gecko.
Attachment #8420855 - Flags: feedback?(rjesup)
Attachment #8420855 - Flags: feedback?(ekr)
Attachment #8420855 - Flags: feedback?(dwilson)
Configure OMX codec with parameters provided in attachment 8420855 [details] [diff] [review].
The part that sets NAL max size is for bug 997947.
Attachment #8420858 - Flags: feedback?(rjesup)
Attachment #8420858 - Flags: feedback?(ekr)
Replaces SPS from encoder with SPS-modified-to-add-VUI.  This is a hack!
Comment on attachment 8421510 [details] [diff] [review]
hack to force SPS values from encoder to be replaced with versions with VUI NOT FOR CHECKIN

This adds VUI to the SPS generated by the encoder.  Verified using h264_analyze.  Note: while the logs show that the decoder is seeing SPS with VUI at CheckSPSForVUI(), this appears to have no impact on delay.


The old SPS is:
0x67 0x42 0xc0 0x0c 0x96 0x54 0x1e 0x14 0x88 
The new SPS is:
0x67 0x42 0xc0 0x0c 0x96 0x54 0x1e 0x14 0x90 0x0f 0x16 0x2e 0xa0 

The VUI added is:

0000 0000 1111 0001 0110 0010 1110 101x
0x00      0xf1      0x62      0xea

aspect = 0
overscan = 0
video = 0
chroma = 0

timing = 0
nal hrd = 0
vcl hrd = 0
pic struct = 0

bitstream_restriction = 1
motion vecs = 1
bytes_per_pic_denom = 0
bytes_per_mb_denom = 0
max_mv_length_horizontal = 10 
max_mv_length_vertical = 10 
max_num_reorder_frames = 0
max_dec_frame_buffering = 1
stop bit = 1

exp-Golumb codes used:
0  = 1xxxxxxx
1  = 010xxxxx
10 = 0001011x
This is a graph from last night with the VUI mods; note that delay sits at around 1 second consistently (2nd graph).  Very low light; camera was running around 10fps; encoder configured for 10fps
(In reply to Randell Jesup [:jesup] from comment #47)
> Created attachment 8421874 [details]
> graphs with the VUI munging
> 
> This is a graph from last night with the VUI mods; note that delay sits at
> around 1 second consistently (2nd graph).  Very low light; camera was
> running around 10fps; encoder configured for 10fps

So the summary here is that this does not make things better?
Randell,

I updated with the latest from your patch queue which includes the VUI patch. It feels like the delay is reduced to between 1 and 2 seconds.
Attachment #8419802 - Attachment is obsolete: true
Attachment #8419840 - Attachment is obsolete: true
Attachment #8421949 - Flags: feedback?(rjesup)
Sorry, forgot to enable the logs as per comment 26. Uploading enabled log.
Attachment #8421949 - Attachment is obsolete: true
Attachment #8421949 - Flags: feedback?(rjesup)
Attachment #8421988 - Flags: review?(rjesup)
Attachment #8421988 - Flags: review?(rjesup) → feedback?(rjesup)
Attached file logcat.txt.pdf
Attachment #8421988 - Flags: feedback?(rjesup)
Randell,

Do you think jolin's stagefright patch (attachement 8420855) is still necessary?
Flags: needinfo?(rjesup)
Attachment #8420855 - Flags: feedback?(dwilson)
jolin,

Patches to stagefright (and the AOSP HAL in general) would need to be rebased on each AOSP upgrade and have to be pulled by each hardware vendor individually. So they carry an expensive maintenance price. 

We should really make sure we have no other option first. jesup, pkerr and I are currently making progress on the lag issue.
Currently on KK + newer DSP code, there's still around a 200ms delay decoding frames.

jolin's stagefright patch probably isn't necessary, merely a nice-to-have if a sender doesn't include the VUI parameters (which they should include).
Flags: needinfo?(rjesup)
Blocks: 1011683
Comment on attachment 8420855 [details] [diff] [review]
[WIP] Proposal to modify stagefright to expose more codec parameters supported by Qualcomm DSP..

Review of attachment 8420855 [details] [diff] [review]:
-----------------------------------------------------------------

Removing f? until we have a better idea of what we want to do.
Attachment #8420855 - Flags: feedback?(ekr)
Comment on attachment 8420858 [details] [diff] [review]
[WIP] Configure OMX codec parameters supported by Qualcomm.

Review of attachment 8420858 [details] [diff] [review]:
-----------------------------------------------------------------

Removing f? until we are more sure what we want to do.
Attachment #8420858 - Flags: feedback?(ekr)
Attachment #8420480 - Attachment is obsolete: true
Comment on attachment 8427517 [details] [diff] [review]
Use configureDirect to set OMX HW H264 encoder config correctly

Note: a bunch of the config gets updated by later patches in the queue
Attachment #8427517 - Flags: review?(jolin)
Attachment #8420477 - Attachment is obsolete: true
Attachment #8427519 - Flags: review?(jolin)
Attachment #8427520 - Flags: review?(jolin)
Attachment #8427517 - Flags: review?(jolin) → review+
Comment on attachment 8427519 [details] [diff] [review]
increase logging for H264 OMX code

Review of attachment 8427519 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/media-conduit/WebrtcOMXH264VideoCodec.cpp
@@ +37,5 @@
>  #define DRAIN_THREAD_TIMEOUT_US  (1000 * 1000ll) // 1s.
>  
>  #define LOG_TAG "WebrtcOMXH264VideoCodec"
>  #define CODEC_LOGV(...) CSFLogInfo(LOG_TAG, __VA_ARGS__)
> +#define CODEC_LOGD(...) CSFLogError(LOG_TAG, __VA_ARGS__) // fix before review

Do you still want this?
Attachment #8427519 - Flags: review?(jolin) → review+
Attachment #8427520 - Flags: review?(jolin) → review+
Should have been leave-open.  We've fixed what we can in the current JB-based flame release, but this does not resolve the problem.  We need new drops of code from upstream for sending VUI and other issues being worked on there to get the decode delay down to an acceptable value.  Until that is available, it's unclear if more changes for this issue will be needed in our code.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Duplicate of this bug: 989944
feature-b2g: --- → 2.0
Whiteboard: [s=fx32, p=0] → [p=0, POVB, ft:loop]
Finally loaded an ADSP with VUI support. The delay is better but we may have room for improvement. With jesup's graphing tool I see it starts at around 150ms then goes up to 250ms.
Attachment #8431171 - Flags: feedback?(rjesup)
Comment on attachment 8431171 [details]
Log of loopback with latest KK/ADSP

Thanks for the raw log!

So I looked at this, and also updated the graphing tool in the patch queue to run delay x-axis in seconds instead of frames, and to also graph decode queue depth.  Depth is the number of frames in the decode queue at any given moment, at the time of enqueue, so the value can never be less than 1, and ideally would always be 1 or close to.

I'll attach the graph, but the decode depth typically is sitting at 3; and hits 1 only for a single frame.  The worst value is 9, and at 9 seconds it goes from mostly 3 to 6-8, and then down to 4-6 with excursions to 9!  

It's unclear what the trigger for the increased delay was, as bandwidth doesn't seem to have gotten that high.  I'll note that at 9 seconds is where we're forced to re-configure the encoder for 30fps (we dropped to 20 at ~4.5 seconds when the frame rate went down to 13; and back above 24fps at ~9 seconds, and this is where the worst delay starts.

So, not surprisingly, this is where the delay is (and really is nothing more than a reflection of the delay values seen in the first graph), and these values are way too high (though a huge improvement on JB!)
Attachment #8431171 - Flags: feedback?(rjesup)
Attached file new-adsp.pdf
Blocks: 1024453
dis_arbitrary_with_10_buf.txt has the CSF log with OMX decoder arbitrary mode disabled and increasing the native window buffer to 10. These changes reduce the decoder latency.
Randell,

Can you help to generate the latency waveform with your analysis tool?
Flags: needinfo?(rjesup)
Depends on: 1031500
Attached file smaller.pdf
Looks much better!  It would be nice to see with the patch I just landed on inbound to increase the number of buffers - that may solve the glitches like around 8 seconds.

Something odd happens around 40 seconds or so; I'm not sure if it's the graph tool or a real issue - I suspect the tool and so cut that part off until I look closer.

Great!
Flags: needinfo?(rjesup)
(In reply to Randell Jesup [:jesup] from comment #72)
> Created attachment 8447562 [details]
> smaller.pdf
> 
> Looks much better!  It would be nice to see with the patch I just landed on
> inbound to increase the number of buffers - that may solve the glitches like
> around 8 seconds.
> 
> Something odd happens around 40 seconds or so; I'm not sure if it's the
> graph tool or a real issue - I suspect the tool and so cut that part off
> until I look closer.
> 
> Great!
Randell,
Thanks for the graph. I already increased buffer number to 10 in my build based on https://bugzilla.mozilla.org/show_bug.cgi?id=1009410#c38 when I captured this log. Your https://bugzilla.mozilla.org/show_bug.cgi?id=1031500 patch looks same as  https://bugzilla.mozilla.org/show_bug.cgi?id=1009410#c38 except for using a constant define to define the number of buffer.
This looks good and sounds like the patch that was discussed to solve this issue.  Is the only thing waiting to close this it landing on central and more testing?
Flags: needinfo?(rjesup)
closing with patch Randell and Randy landed - 1031500 that is already blocking-b2g 2.0+ for uplift.
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Flags: needinfo?(rjesup)
Attachment #8420855 - Flags: feedback?(rjesup)
Attachment #8420858 - Flags: feedback?(rjesup)
You need to log in before you can comment on or make changes to this bug.