Closed Bug 1087605 Opened 11 years ago Closed 11 years ago

WebRTC video freezes after 5mins

Tracking

Status: VERIFIED FIXED

Milestone: mozilla36

Project Flags: blocking-b2g 2.1+

Tracking Flags:

Flag       Tracking  Status
firefox34  ---       fixed
firefox35  ---       fixed
firefox36  ---       fixed
b2g-v2.1   ---       verified
b2g-v2.2   ---       verified

People

(Reporter: jaywang, Assigned: jesup)

References

Details

Attachments

(20 files, 5 obsolete files)

webrtc.zip 11 years ago Jay 3.14 MB, application/x-zip-compressed
less_audio_logging 11 years ago Randell Jesup [:jesup] (needinfo me) 3.66 KB, patch
trace_packetization 11 years ago Randell Jesup [:jesup] (needinfo me) 9.34 KB, patch
trace_packetization_printfs 11 years ago Randell Jesup [:jesup] (needinfo me) 6.12 KB, patch
frame_match.el 11 years ago Randell Jesup [:jesup] (needinfo me) 1.79 KB, text/plain
8x26_v2.1_more_trace1.txt.zip 11 years ago Jay 1.06 MB, application/x-zip-compressed
webrtc_trace2.zip 11 years ago Jay 5.81 MB, application/x-zip-compressed
webrtc_trace3.zip 11 years ago Jay 2.75 MB, application/x-zip-compressed
logcat log - Log around MediaCodec::renderOutputBufferAndRelease() 11 years ago Sotaro Ikeda [:sotaro] 15.18 KB, text/plain
log patch: gecko log 11 years ago Sotaro Ikeda [:sotaro] 16.99 KB, patch
log patch: gonk log 11 years ago Sotaro Ikeda [:sotaro] 27.78 KB, patch
thread priority of Firefox Hello app durin WebRTC 11 years ago Sotaro Ikeda [:sotaro] 5.18 KB, text/plain
patch - Raise ALooper thread priority 11 years ago Sotaro Ikeda [:sotaro] 2.09 KB, patch
patch - Raise OMX CallbackDispatcherThread thread priority 11 years ago Sotaro Ikeda [:sotaro] 796 bytes, patch
logout log during H.264 WebRTC with thread priority raise 11 years ago Sotaro Ikeda [:sotaro] 183.01 KB, text/plain
patch - Increase ACodec input buffer 11 years ago Sotaro Ikeda [:sotaro] 1.49 KB, patch
patch - Raise OMX hal thread priority 11 years ago Sotaro Ikeda [:sotaro] 2.25 KB, patch
Application's threads during H.264 WebRTC on b2g v2.0 11 years ago Sotaro Ikeda [:sotaro] 5.25 KB, text/plain
Application's threads during H.264 WebRTC on b2g v2.1 11 years ago Sotaro Ikeda [:sotaro] 5.02 KB, text/plain
Threads' cpu usage during H.264 WebRTC on b2g v2.0 11 years ago Sotaro Ikeda [:sotaro] 439.02 KB, text/plain
Threads' cpu usage during H.264 WebRTC on b2g v2.1 11 years ago Sotaro Ikeda [:sotaro] 49.32 KB, text/plain
Callstack when cprAdjustRelativeThreadPriority() is called 11 years ago Sotaro Ikeda [:sotaro] 7.08 KB, text/plain
don't bump sipcc thread priorities on B2G 11 years ago Randell Jesup [:jesup] (needinfo me) 1.79 KB, patch (gcp: feedback+)
don't try to set the priority of the CCApp thread (which doesn't exist) 11 years ago Randell Jesup [:jesup] (needinfo me) 1.25 KB, patch (bwc: review+, lsblakk: approval-mozilla-aurora+, lsblakk: approval-mozilla-beta+, bajaj: approval-mozilla-b2g34+)
WIP NOT FOR CHECKIN: nuke all traces of the network-activity indicator 11 years ago Randell Jesup [:jesup] (needinfo me) 1.87 KB, patch

Jay

Reporter

Description

•

11 years ago

Attached file webrtc.zip — Details

[Blocking Requested - why for this release]:

Test case:
- Enable the WebRTC CSF logging
- Start the WebRTC video call
- Let it run for 5 mins
- Observe that the video freezes

It looks like the encoded frame was re-packed to a different size before getting to the decoder, and that this causes the freeze. Here, four encoded frames are sent from the 8x10 device with v2.0. They have sizes of 1214, 1053, 962 and 992 bytes, respectively. When the frames get to the 8x26 device with v2.1, the sizes become 1214, 1053, 975 and 1007 instead. For some reason the last two frame sizes were altered. This happens occasionally throughout the call duration. The problem was only seen on the v2.1 decoding path, not on v2.0.

You can see this from the attached log. The log was captured from two devices simultaneously during the WebRTC call.
8x26_v2.1.txt: 8x26 with v2.1 SW
8x10_v2.0.txt: 8x10 with v2.0 SW

I suspect that the fix for https://bugzilla.mozilla.org/show_bug.cgi?id=1068394 did not completely fix the issue, or has some corner case that wasn't taken care of.

Here is the partial log that shows the discrepancy in frame size between the 8x10 encoder output and the 8x26 decoder input.
I/PRLog ( 1444): 2014-10-22 20:19:53.318228 UTC - 11009168[af268b00]: [OMXOutputDrain|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 1214 bytes, 240x320, is_param 0, is_iframe 0, timestamp 946415800, captureTimeMs 0
I/PRLog ( 1444): 2014-10-22 20:19:53.354942 UTC - 11009168[af268b00]: [OMXOutputDrain|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 1053 bytes, 240x320, is_param 0, is_iframe 0, timestamp 946418860, captureTimeMs 0
I/PRLog ( 1444): 2014-10-22 20:19:53.387776 UTC - 11009168[af268b00]: [OMXOutputDrain|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 962 bytes, 240x320, is_param 0, is_iframe 0, timestamp 946421560, captureTimeMs 0
I/PRLog ( 1444): 2014-10-22 20:19:53.409812 UTC - 11009168[af268b00]: [OMXOutputDrain|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 992 bytes, 240x320, is_param 0, is_iframe 0, timestamp 946424530, captureTimeMs 0
I/PRLog ( 4179): 2014-10-22 20:19:54.195645 UTC - 20090448[b122f700]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1214 bytes (NAL 0x41), time 12705511966 (1143496077), flags 0x0
I/PRLog ( 4179): 2014-10-22 20:19:54.223383 UTC - 20090448[b122f700]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1053 bytes (NAL 0x41), time 12705545966 (1143499137), flags 0x0
I/PRLog ( 4179): 2014-10-22 20:19:54.329301 UTC - 20090448[b122f700]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 975 bytes (NAL 0x41), time 12705705966 (1143513537), flags 0x0
I/PRLog ( 4179): 2014-10-22 20:19:54.351217 UTC - 20090448[b122f700]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1007 bytes (NAL 0x41), time 12705737966 (1143516417), flags 0x0

Jay

Reporter

Updated

•

11 years ago

Flags: needinfo?(rjesup)

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 1

•

11 years ago

I looked carefully at this, and am still looking. Using OpenH264, I never see a disparity on sizes.

With OMX, there are two main things going on that can affect the size. On Encode(), we call getNextNALUnit() to separately send each NAL. This will grab NALs, and may strip extra/long start codes (like 16 0's followed by a 1). All of these appear to be a single NAL for the frame, and most are smaller than the packet size, so there's no FU-A breaking and reassembly.

Data transferred in RTP has no start codes; they're re-inserted at the receiving end. After start-code insertion, the data should be passed to the decoder, which again (in 2.1) uses getNextNALUnit() to break apart multiple NALs. Normally this will occur only on iframes, and the frames where we're seeing this are not iframes.

I'll check more closely into the size generation for reception, but other than start-code insertion it should be straightforward.

Is this only seen in 2.0->2.1 calls, or 2.1->2.1?
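The start-code handling described above can be sketched as follows. This is a minimal illustration in Python, not the getNextNALUnit() implementation itself; the names split_nal_units and reframe and the sample bytes are hypothetical. It only shows why the byte count of a frame can differ between sender and receiver when an over-long start code is stripped and a fixed-length one is re-inserted.

```python
def split_nal_units(data: bytes) -> list:
    """Split an Annex B style byte stream into NAL payloads.

    Start codes (two or more 0x00 bytes followed by 0x01) are dropped,
    however long they are. Hypothetical helper for illustration only.
    """
    nals = []
    start = None  # index of the first payload byte of the current NAL
    i = 0
    while i + 3 <= len(data):
        if data[i:i + 3] == b"\x00\x00\x01":
            if start is not None:
                end = i
                # Extra zero bytes before 00 00 01 belong to a long start code.
                while end > start and data[end - 1] == 0:
                    end -= 1
                nals.append(data[start:end])
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        nals.append(data[start:])
    return nals


def reframe(nals, start_code=b"\x00\x00\x00\x01"):
    """Re-insert a fixed-length start code in front of each NAL payload."""
    return b"".join(start_code + nal for nal in nals)


# A made-up frame: 16 zero bytes followed by a 1, then a 3-byte NAL (first byte 0x41).
sent = b"\x00" * 16 + b"\x01" + b"\x41\xaa\xbb"
received = reframe(split_nal_units(sent))
print(len(sent), len(received))
```

In this sketch the payload bytes are unchanged end to end; only the framing overhead differs, which is the kind of size change the comment expects to be harmless.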

Flags: needinfo?(rjesup)

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 2

•

11 years ago

> Using OpenH264, I never see a disparity on sizes.

Note: that's desktop, m-c -> m-c, though I believe there's no real difference between m-c and beta/34/2.1 here.

Jay

Reporter

Comment 3

•

11 years ago

> > Is this only seen in 2.0->2.1 calls, or 2.1->2.1?

The problem is seen on both 2.0->2.1 and 2.1->2.1.

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 4

•

11 years ago

Ok, in my self-call test, I never see the size disparities you're seeing, so we may still need to look at that. However, on a Flame running beta (still very close to b2g34), I see a buildup of in-decode frames. Eventually, the circular map of 30 entries in generic_encoder.cc (_timestampMap) overflows, newly-decoded frames can't be found in it, and they get dropped on the floor (causing all video output to cease). (You can see increasing delay as well.)

The cause of the decode buildup is the important part. At the start of the call, each frame is decoded and rendered almost immediately. The sequence is:
1) Start decode of frame X: mCodec->queueInputBuffer( ... <timestamp N> )
2) Find that frame X was decoded: mCodec->dequeueOutputBuffer(... <timestamp N> ...)
3) Ask OMX codec to render it in a buffer and call us back: mCodec->renderOutputBufferAndRelease(index);
4) Codec calls OnNewFrame():
   // Will be called when MediaCodec::RenderOutputBufferAndRelease() returns
   // buffers back to native window for rendering.
   void OnNewFrame() {...}
5) OnNewFrame passes the frame up to generic_encoder's callback.

So all that happened very quickly at the start. As time went by, the delay before step 4 (the OnNewFrame callback) got longer and longer. Eventually it happens with more than 30 other frames queued, and at that point the map in generic_encoder overflows and video stops.

The current log I have doesn't have timestamps, but you can see the frames coming and going between #3 and #4. In my log, OnNewFrame's log line goes from being the next line after #2/3's log to being 2500 lines later (and that's just logging some PRLog and jitter buffer stuff, with all b2g OS/etc stuff filtered out). There were 31 "Decoder output" logs between them, which matches the analysis.

Sotaro: any idea what could cause this? (Also adding jhlin in case he has any thoughts.) Why would this be different in 2.1? (We should verify that it is, and that it's not somehow an artifact of my test setup, but I can't see how it would be.)

#2/3: media/webrtc/signaling/src/media-conduit/WebrtcOMXH264VideoDecoder.cpp:485 (DrainOutput())
#4: media/webrtc/signaling/src/media-conduit/WebrtcOMXH264VideoDecoder.cpp:527 (OnNewFrame())
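The overflow described above can be modelled with a small sketch. This is not the WebRTC _timestampMap code; TimestampRing and oldest_frame_survives are hypothetical names, and the overwrite-oldest policy is an assumption chosen to match the behaviour reported in the comment (a frame can no longer be found once more than 30 frames are in flight).

```python
class TimestampRing:
    """Fixed-size circular map from frame timestamp to per-frame info."""

    def __init__(self, capacity=30):
        self._slots = [None] * capacity
        self._next = 0  # slot that the next add() will overwrite

    def add(self, timestamp, info):
        # When the ring is full this silently overwrites the oldest entry.
        self._slots[self._next] = (timestamp, info)
        self._next = (self._next + 1) % len(self._slots)

    def pop(self, timestamp):
        """Return and remove the info for timestamp, or None if it is gone."""
        for i, slot in enumerate(self._slots):
            if slot is not None and slot[0] == timestamp:
                self._slots[i] = None
                return slot[1]
        return None


def oldest_frame_survives(frames_in_flight, capacity=30):
    """Queue frames_in_flight frames, then look up the first one."""
    ring = TimestampRing(capacity)
    for ts in range(frames_in_flight):
        ring.add(ts, "frame %d" % ts)
    return ring.pop(0) is not None


print(oldest_frame_survives(30), oldest_frame_survives(31))
```

With 30 frames queued the first frame is still found; with 31 its slot has been reused, so a late OnNewFrame callback for it would have nothing to match against.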

Flags: needinfo?(sotaro.ikeda.g)

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 5

•

11 years ago

Attached patch less_audio_logging — Details — Splinter Review

first of three patches to help debug packetization issues

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 6

•

11 years ago

Attached patch trace_packetization — Details — Splinter Review

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 7

•

11 years ago

Attached patch trace_packetization_printfs — Details — Splinter Review

To use this and debug packetization issues, add this to b2g.sh:
export NSPR_LOG_MODULES=mediapipeline:5,signaling:6
export COMMAND_PREFIX=logwrapper

To avoid tons of random stuff, I use:
adb logcat | egrep "(/b2g|/PRLog)" >/tmp/log

Randell Jesup [:jesup] (needinfo me)

Assignee

Comment 8

•

11 years ago

Attached file frame_match.el — Details

Emacs keyboard macro to process a logfile (captured as above) into two buffers, one of "Encoded frame" lines and one of "Decoder input" lines. I then use "cut -b 135-139 lll > decoder2" and "cut -b 135-139 kkk > encoded2" to get the frame sizes out for diffing (adjust the 'cut' params as needed). Note that iframes will show as NNNN on the encoder side, and "17 8 MMMM" on the decode side, where MMMM is ~25 less than NNNN; this is expected. I then use "ediff-buffers" to compare.
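For anyone without Emacs at hand, the same comparison can be sketched in Python. This is an assumed alternative, not what the attached frame_match.el does: the regular expressions are written against the "Encoded frame: N bytes" and "Decoder input: N bytes" messages quoted in the description, and frame_sizes and diff_sizes are hypothetical helper names. The sample lines are shortened excerpts of those log messages. The iframe case noted above is not handled specially; the extra decode-side entries would simply appear in the diff.

```python
import difflib
import re

ENCODED_RE = re.compile(r"Encoded frame: (\d+) bytes")
DECODED_RE = re.compile(r"Decoder input: (\d+) bytes")


def frame_sizes(log_text, pattern):
    """Return the frame sizes matched by pattern, in log order."""
    return [int(m.group(1)) for m in pattern.finditer(log_text)]


def diff_sizes(encoded, decoded):
    """Return a unified diff of the two size sequences as a list of lines."""
    return list(difflib.unified_diff(
        [str(size) for size in encoded],
        [str(size) for size in decoded],
        fromfile="encoded", tofile="decoded", lineterm=""))


# Shortened excerpts of the sender and receiver messages from the description.
SENDER_LOG = """\
WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 1214 bytes, 240x320
WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 1053 bytes, 240x320
WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 962 bytes, 240x320
WebrtcOMXH264VideoCodec.cpp:699: Encoded frame: 992 bytes, 240x320
"""
RECEIVER_LOG = """\
WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1214 bytes (NAL 0x41)
WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1053 bytes (NAL 0x41)
WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 975 bytes (NAL 0x41)
WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1007 bytes (NAL 0x41)
"""

encoded = frame_sizes(SENDER_LOG, ENCODED_RE)
decoded = frame_sizes(RECEIVER_LOG, DECODED_RE)
for line in diff_sizes(encoded, decoded):
    print(line)
```

Matching on the message text rather than on fixed byte columns avoids having to adjust the 'cut' offsets when the log prefix changes width.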

Jay

Reporter

Comment 9

•

11 years ago

Attached file 8x26_v2.1_more_trace1.txt.zip — Details

Randell, this log has your trace_packetization patch. Also, this time I see there is a 1 sec delay from output to newframe.
I/PRLog ( 1424): 2014-10-24 18:05:09.907252 UTC - 24825424[b1205380]: [DecodingThread|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:403: Decoder input: 1139 bytes (NAL 0x41), time 7922389533 (713015058), flags 0x0
I/PRLog ( 1424): 2014-10-24 18:05:09.952827 UTC - 24831264[b1329500]: [OMXOutputDrain|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:466: Decoder output: 8 bytes, offset 0, time 7922389533, flags 0x0
I/PRLog ( 1424): 2014-10-24 18:05:10.879652 UTC - 24830096[b1329100]: [CodecLooper|WebrtcOMXH264VideoCodec] WebrtcOMXH264VideoCodec.cpp:528: Decoder NewFrame: 240 bytes, 320x240, timestamp -5761779888365240000, renderTimeMs 713015058
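The delay between these log lines can be computed from their wall-clock prefixes. The following is a minimal sketch, assuming the "YYYY-MM-DD HH:MM:SS.ffffff UTC" prefix format seen in the lines above; log_time and delay_seconds are hypothetical helper names, and the sample strings are shortened copies of the three log lines.

```python
import re
from datetime import datetime

STAMP_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) UTC")


def log_time(line):
    """Parse the wall-clock timestamp out of a PRLog line."""
    match = STAMP_RE.search(line)
    if match is None:
        raise ValueError("no timestamp found in line: %r" % line)
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")


def delay_seconds(earlier_line, later_line):
    """Seconds elapsed between two log lines."""
    return (log_time(later_line) - log_time(earlier_line)).total_seconds()


# Shortened copies of the three lines quoted in this comment.
decoder_input = "2014-10-24 18:05:09.907252 UTC - Decoder input: 1139 bytes"
decoder_output = "2014-10-24 18:05:09.952827 UTC - Decoder output: 8 bytes"
new_frame = "2014-10-24 18:05:10.879652 UTC - Decoder NewFrame: 240 bytes"

print("input -> output:   %.6f s" % delay_seconds(decoder_input, decoder_output))
print("output -> newframe: %.6f s" % delay_seconds(decoder_output, new_frame))
```

For these lines the decode itself takes about 0.05 s, while the gap from "Decoder output" to "Decoder NewFrame" is about 0.93 s, which is the roughly one-second delay mentioned above.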