Closed Bug 816664 Opened 7 years ago Closed 7 years ago

Video frames reordered in MediaStreamGraph

Categories

(Core :: WebRTC: Audio/Video, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: ekr, Assigned: padenot)

Details

(Whiteboard: [WebRTC],[blocking-webrtc-interop][qa-])

Attachments

(7 files, 2 obsolete files)

No description provided.
I am seeing frame reordering during video calls with Chrome/Firefox. It's not clear why it happens in just this scenario.

The patch applied won't exactly apply against m-i trunk but it's obvious what it does. I'll also attach the output file of a sample run and some analysis output of that.
Summary: Video frame tainting → Video frames reordered in MediaStreamGraph
Attached file The timestamp output
Attached file analysis output
Note that this creates terribly bad video when we try to call chrome from firefox.
[15:14]	ekr	oh, so timestamp is 0
[15:14]	ekr	and render_time is crap b/c I copied from the wrong offset
[15:15]	ekr	I don't think it's needed, but I guess I can fix it if needed
[15:15]	derf	Yes, on Mac, timestamp is always 0.
[15:15]	ekr	since roc ignores render_time
[15:15]	derf	The capture driver never fills it in.
[15:15]	jesup	k, thanks, good to know.
[15:15]	derf	render_time is just going to be the current system time.
FYI, the issue here as reported is that the frame number goes backwards at times; and seems to like certain values/sequences.

As this is measuring from where we replace mImage (used for NotifyPull/AddToTrack) and NotifyQueuedTrackChanges, this appears to be somewhere in the MediaStreamGraph code.

Adding Paul as well in case he has an insight.

Ekr: can you repeat this with the logs going to LOG() and turn on peerconnection and mediastreamgraph (and the conduit logging)?  (So we can see what MediaStreamGraph thinks of all this).
More information. I am also seeing some sketchy segments

(gdb) p *$1
$2 = (mozilla::VideoSegment) {
  <mozilla::MediaSegmentBase<mozilla::VideoSegment, mozilla::VideoChunk>> = {
    <mozilla::MediaSegment> = {
      _vptr$MediaSegment = 0x1060ff940, 
      mDuration = -7368, 
      mType = mozilla::MediaSegment::VIDEO
    }, 
    members of mozilla::MediaSegmentBase<mozilla::VideoSegment, mozilla::VideoChunk>: 
    mChunks = {
      <nsTArray_base<nsTArrayDefaultAllocator>> = {
        mHdr = 0x11d68c440
      }, 
      <nsTArray_SafeElementAtHelper<mozilla::VideoChunk, nsTArray<mozilla::VideoChunk, nsTArrayDefaultAllocator> >> = {<No data fields>}, <No data fields>}
  }, <No data fields>}
(gdb) call $1->mChunks.Length()
$3 = 2
Attached file Gum output (obsolete) —
This shows how aDesiredTime can go backwards, and so invalidates the logic in MediaEngineWebRTCVideo.cpp::NotifyPull, where it calculates a delta for a frame (to avoid the graph blocking) based on the delta of aDesiredTime versus the previous aDesiredTime.

We need a new solution for determining the delta, or eliminate deltas from video streams (i.e. frames are points in times not spans of time).
Attachment #686867 - Attachment is obsolete: true
Wallpaper patch that should avoid the problem for now and will assert if it occurs for any reason
Attachment #686984 - Flags: review?(tterribe)
Attachment #686984 - Flags: review?(roc)
I assume by "wallpaper" you mean that there is some more fundamental problem that we eventually need to fix?
Do we have to worry about that biting us in some other way?
The 'right' fix is to deal with time durations for video frames differently and never block a (gUM) mediastream on them.

This should be a totally safe patch.  In odd instances (negative delta) we might drop a frame.
Whiteboard: [WebRTC],[blocking-webrtc-interop]
OK, so as currently defined it's not necessarily wrong for aDesiredTime to move backward. aDesiredTime is computed based on the optimistic assumption that the stream will not be blocked after mStateComputedTime. But if later we determine the stream will actually be blocked during some of that time, we won't need as much data as we previously thought, so aDesiredTime can validly go backward.

Obviously we don't *want* the stream to be blocked, ever, but that's a separate issue.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #14)
> OK, so as currently defined it's not necessarily wrong for aDesiredTime to
> move backward. aDesiredTime is computed based on the optimistic assumption
> that the stream will not be blocked after mStateComputedTime. But if later
> we determine the stream will actually be blocked during some of that time,
> we won't need as much data as we previously thought, so aDesiredTime can
> validly go backward.

And given this, I believe the patch Randell wrote here is correct.
Comment on attachment 686984 [details] [diff] [review]
wallpaper patch for negative delta times for video frames

Review of attachment 686984 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webrtc/MediaEngineWebRTCVideo.cpp
@@ +14,5 @@
>  #define LOG(msg) PR_LOG(GetMediaManagerLog(), PR_LOG_DEBUG, msg)
>  #else
>  #define LOG(msg)
>  #endif
> +#define LOG_ALL_FRAMES 1

Do we actually want to check this line in?

@@ +115,2 @@
>  #endif
> +  // Don't append if we've already provided a frame that supposedly goes past the current aDesiredTime

If this is the correct thing to do, perhaps we should fix MediaStreamGraphImpl::ExtractPendingInput() to not call NotifyPull() in the first place if it already has enough data, instead of making every MediaStreamListener implementation check this.
Attached patch , test caseSplinter Review
Comment on attachment 687587 [details] [diff] [review]
, test case

Review of attachment 687587 [details] [diff] [review]:
-----------------------------------------------------------------

To exercise this test case, go to: http://mozilla.github.com/webrtc-landing/gum_test.html
Click "audio and video" and it will crash at the assert.
Note the Video works.
Completely untested.

If we take this, we shouldn't need the duration > 0 check in Randell's patch; since the WebRTC NotifyPull should always fill mBuffer up to the last aDesiredTime (at least), a NotifyPull with a smaller aDesiredTime would simply be skipped by this patch.
Attachment #687589 - Flags: review?(rjesup)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #14)
> Obviously we don't *want* the stream to be blocked, ever, but that's a
> separate issue.

I filed bug 817455 to investigate this.
Paul,

Can you please take this bug over. The current consensus is that there are four things that are
needed:

1. Randell's patch.
2. Tim's proposed patch to ditch all but the last video frame in NotifyQueuedTrackChanges
3. Roc's patch.
4. Investigation of why the heck this is happening in the first place and a fix for it (see c19
and now bug 817455)

What's needed here is to review/cleanup/test items 1-3 and then do item #4 and propose a fix.
The concern is that if we just do 1-3, we won't have video time run backwards but we will
have lousy framerates

Roc has agreed to backstop you if you get stuck.
(In reply to Timothy B. Terriberry (:derf) from comment #16)
> Comment on attachment 686984 [details] [diff] [review]
> wallpaper patch for negative delta times for video frames
> 
> Review of attachment 686984 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/media/webrtc/MediaEngineWebRTCVideo.cpp
> @@ +14,5 @@
> >  #define LOG(msg) PR_LOG(GetMediaManagerLog(), PR_LOG_DEBUG, msg)
> >  #else
> >  #define LOG(msg)
> >  #endif
> > +#define LOG_ALL_FRAMES 1
> 
> Do we actually want to check this line in?

Probably.  We're logging all the audio frames.  And having this on in debug builds lets us do a better job tracking down problems.

Perhaps we'll turn it off again later, or we could move it to a different log, but it's ok for now.
Comment on attachment 687589 [details] [diff] [review]
Don't call NotifyPull if we already have enough data

Review of attachment 687589 [details] [diff] [review]:
-----------------------------------------------------------------

Hmmm... Didn't know nsWindow.cpp was so involved in MediaStreamGraphs.  :-)
I agree the concept of not pulling when we don't actually want any new data is a good one.
Attachment #687589 - Flags: review?(rjesup) → review-
Assignee: ekr → paul
Attached patch actual patchSplinter Review
Attachment #687589 - Attachment is obsolete: true
Attachment #687912 - Flags: review?(rjesup)
Attachment #687912 - Flags: review?(rjesup) → review+
Regarding comment 19, roc's patch does not remove the need to jesup's check for the positiveness of |delta|. I still see negative delta, and that asserts later.
(In reply to Paul ADENOT (:padenot) from comment #25)
> Regarding comment 19, roc's patch does not remove the need to jesup's check
> for the positiveness of |delta|. I still see negative delta, and that
> asserts later.

Can you figure out why that is and explain it here?
That is what I'm doing right now, but that implies understanding a lot of MediaStream magic.
https://hg.mozilla.org/integration/mozilla-inbound/rev/be0058cf546a
Whiteboard: [WebRTC],[blocking-webrtc-interop] → [WebRTC],[blocking-webrtc-interop][leave open]
Comment on attachment 686984 [details] [diff] [review]
wallpaper patch for negative delta times for video frames

Review of attachment 686984 [details] [diff] [review]:
-----------------------------------------------------------------

Given the lack of progress on this bug, I suppose we can check in the wall-paper for now, but please leave the bug open afterwards until we actually understand what is going on.
Attachment #686984 - Flags: review?(tterribe) → review+
per roc's comment 14, we do actually understand what's going on here.  It may be unexpected, and probably should be commented more in MediaStreamGraph.h:NotifyPull(), and we should consider if a second NotifyPull call with an earlier time is actually a useful thing for the callee, but we do understand this and given the current MSG implementation, this is being handled correctly now.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Whiteboard: [WebRTC],[blocking-webrtc-interop][leave open] → [WebRTC],[blocking-webrtc-interop]
Whiteboard: [WebRTC],[blocking-webrtc-interop] → [WebRTC],[blocking-webrtc-interop][qa-]
You need to log in before you can comment on or make changes to this bug.