Closed Bug 699316 Opened 8 years ago Closed 2 years ago

Avoid decoding non-keyframes after seeking

Categories

(Core :: Audio/Video: Playback, defect)

Tracking

RESOLVED INACTIVE

People

(Reporter: kinetik, Unassigned)

Details

From bug 696390:
(In reply to Timothy B. Terriberry (:derf) from comment #17)
> (In reply to Matthew Gregan [:kinetik] from comment #16)
> > (In reply to John Koleszar from comment #9)
> > > This patch likely fixes the crash, but this behavior shouldn't be invoked on
> > > well formed streams. Seek bug maybe?
> > 
> > The Clusters pointed to by the Cues in the linked video start with
> > P-frames.  We rely on the decoder to deal with this rather than explicitly
> > skipping forward to the first I-frame before starting to decode.
> 
> Perhaps we should pass an aSkipToKeyframe flag to DecodeToTarget, and set it
> after calling nestegg_track_seek(). It'd probably also be useful to set
> after seeking in Ogg, and in nsBuiltinDecoderReader::DecodeVideoFrame() (so
> that DecodeToFirstData skips to a keyframe for live streams). In fact the
> only time you don't want to use it is when seeking in Ogg if you decide the
> target is already close enough to the current position.
> 
> In theory there's nothing wrong with shoving random packets into the decoder
> (it needs to be robust enough to handle it), but it would be faster to skip
> decoding them, and showing garbage at the beginning of a live stream isn't a
> great user experience, either.
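For illustration only, the skip-to-keyframe idea quoted above could be sketched roughly as follows. This is not the Gecko implementation: `Packet`, `packets_to_decode` and the `skip_to_keyframe` parameter are hypothetical names, and the sketch assumes the demuxer can tell whether a packet is a keyframe without decoding it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Packet:
    """Hypothetical demuxed video packet; not a real Gecko or nestegg type."""
    timestamp: float
    is_keyframe: bool


def packets_to_decode(packets: Iterable[Packet],
                      skip_to_keyframe: bool) -> Iterator[Packet]:
    """Yield only the packets that should be handed to the decoder.

    While skip_to_keyframe is set, packets are dropped until the first
    keyframe arrives. That keyframe clears the flag, so it and every
    later packet are passed through unchanged.
    """
    for packet in packets:
        if skip_to_keyframe:
            if not packet.is_keyframe:
                continue  # P-frame with no reference: skip without decoding
            skip_to_keyframe = False
        yield packet
```

A caller would pass `skip_to_keyframe=True` right after a seek or at the start of a live stream, and `False` when the seek target is close enough to the current position that decoding simply continues.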
(In reply to Randell Jesup [:jesup] from bug 696390 comment #20)
> unknown time. Also, I suppose one could program an encoder to do "rolling"
> refresh (AIR refresh) and thus the decoder might never see an IDR/keyFrame
> if it missed the initial one, but after a bounded period of time the screen
> would be error-free.  Since I assume* that vp8 specifies the decoder, it's

Right, this can be done in theory for Theora, too, but there are no tools to produce such streams, and it requires some creative retconning of the granule pos field in order to communicate the size of the "bounded period of time", so I've been hesitant to officially sanction such streams.

For WebM, since cues are required, you'd have to do something terrible like only have a single entry pointing at the beginning of the stream, and the spec says that would have to be a keyframe anyway. So currently we'd just always decode from the start of the stream after seeking. This won't be any worse than that unless the file is actively violating spec. I'd wait until WebM has a way to specify what that "bounded period of time" is before trying to support the rolling-intra use case. WebM doesn't have a granule pos with a keyframe backpointer to retcon like Ogg does.

RTP streams for WebRTC may be different, but those won't be using nsBuiltinDecoderStateMachine and friends, anyway.
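As background for the "keyframe backpointer" mentioned above, here is a rough sketch of how a Theora granule position splits into a keyframe index and a count of frames since that keyframe. The function name and `keyframe_shift` parameter are made up for this example; the shift value comes from the stream's identification header, and the sketch ignores the frame-numbering offset that differs between Theora bitstream versions.

```python
from typing import Tuple


def split_granulepos(granulepos: int, keyframe_shift: int) -> Tuple[int, int]:
    """Split a Theora-style granule position into its two fields.

    The upper bits hold the index of the most recent keyframe and the
    lower keyframe_shift bits hold the number of frames since it.
    keyframe_shift is an assumed name for the shift value carried in
    the stream header.
    """
    keyframe_index = granulepos >> keyframe_shift
    frames_since_keyframe = granulepos & ((1 << keyframe_shift) - 1)
    return keyframe_index, frames_since_keyframe
```

For example, with a shift of 6, a granule position of `(10 << 6) | 3` splits into keyframe index 10 and an offset of 3 frames. WebM has no equivalent field, which is why the comment above says there is nothing to retcon there.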
CCing jesup so that he sees comment 1.
Component: Audio/Video → Audio/Video: Playback
Mass closing due to inactivity.
Feel free to re-open if still needed.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INACTIVE