Closed Bug 1208054 Opened 4 years ago Closed 4 years ago

[MSE] Can only seek to position where we have both audio and video content

Categories

(Core :: Audio/Video: Playback, defect, P1)

Tracking

RESOLVED INVALID
Tracking Status
firefox44 --- affected

People

(Reporter: jya, Unassigned)

References

(Blocks 1 open bug)

Details

This came up as I was reviewing bug 1208035.

When we attempt to seek, the MediaFormatReader asks the MSE demuxer to seek to a particular position. The MSE demuxer first checks whether the data is available and, if not, returns WAITING_FOR_DATA.

As a consequence, we can only seek when we have both the audio *and* video at the seek position.

The reason for this behaviour is that the MDSM (MediaDecoderStateMachine) will not consider the seek complete until it has received video and audio data that is past the seek target.

Now MSE defines the seekable attribute to be:
"The HTMLMediaElement.seekable attribute returns a new static normalized TimeRanges object created based on the following steps:

If duration equals NaN:
    Return an empty TimeRanges object.
If duration equals positive Infinity:
    If the HTMLMediaElement.buffered attribute returns an empty TimeRanges object, then return an empty TimeRanges object and abort these steps.
    Return a single range with a start time of 0 and an end time equal to the highest end time reported by the HTMLMediaElement.buffered attribute.
Otherwise:
    Return a single range with a start time of 0 and an end time equal to duration."

So it is always a single continuous range; nothing states that we can only seek where we have both audio and video.
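
As a rough illustration of the quoted steps, here is a minimal TypeScript sketch. The `TimeRange` type and the `computeSeekable` function are hypothetical names made up for this example, not Gecko code or MSE API; `buffered` stands in for HTMLMediaElement.buffered as a plain array of [start, end) ranges.

```typescript
// Hypothetical helper type for this sketch: a half-open range [start, end).
type TimeRange = { start: number; end: number };

// Sketch of the quoted seekable steps (assumed shape, not the Gecko implementation).
function computeSeekable(duration: number, buffered: TimeRange[]): TimeRange[] {
  // "If duration equals NaN: return an empty TimeRanges object."
  if (Number.isNaN(duration)) {
    return [];
  }
  // "If duration equals positive Infinity: ..."
  if (duration === Number.POSITIVE_INFINITY) {
    if (buffered.length === 0) {
      return [];
    }
    // Single range from 0 to the highest buffered end time.
    const highestEnd = Math.max(...buffered.map((r) => r.end));
    return [{ start: 0, end: highestEnd }];
  }
  // "Otherwise: a single range from 0 to duration."
  return [{ start: 0, end: duration }];
}
```

With buffered ranges [0, 15)[20, 30) and an infinite duration, this yields the single range [0, 30), so the gap never shows up in seekable.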

Now, what should the behaviour be if, say, the audio source buffer contains [0, 30), the video source buffer contains [0, 15)[20,30), and you attempt to seek to position 16?

Right now we will stall, but shouldn't we seek audio to 16, display the video frame found at position 15, and play audio until it reaches position 20, where we have video again?

This is similar to bug 1144987, except that here our seekable range is continuous.

The specification for seeking with MSE is:
https://w3c.github.io/media-source/#mediasource-seeking

"Wait until the user agent has established whether or not the media data for the new playback position is available"

But what is "media data"? Is audio alone enough, or does it have to be both audio and video?
Chris, JW, what's your take on this?
Flags: needinfo?(jwwang)
Flags: needinfo?(cpearce)
(In reply to Jean-Yves Avenard [:jya] from comment #0)
> Now what could the behaviour be if we have say the audio source buffer
> containing [0, 30) and video is [0, 15)[20,30) and you attempt to seek at
> position 16.
> 
> Right now we will stall, but shouldn't we seek to 16 audio, display the
> video frame found at position 15 and play audio until it reaches position 20
> and we have video again?

We stall because we assume [20,30) arrived via XHR before [15,20), which is about to be appended. That could easily happen during normal playback when a packet gets dropped, causing a segment to arrive late.
Priority: -- → P1
I will take a look.
Flags: needinfo?(jwwang)
Flags: needinfo?(cpearce)
Assignee: nobody → jwwang
MSE is an extension of media element. It should conform to the definitions of the media element.

In your case where
audio = [0, 30)
video = [0, 15)[20,30),

the seekable ranges should be [0, 15)[20,30). A range without audio should not be included in the seekable ranges.
Assignee: jwwang → nobody
(In reply to JW Wang [:jwwang] from comment #4)
> MSE is an extension of media element. It should conform to the definitions
> of the media element.
> 
> In your case where
> audio = [0, 30)
> video = [0, 15)[20,30),
> 
> the seekable ranges should be [0, 15)[20,30). A range without audio should
> not be included in the seekable ranges.

That is not the definition of the seekable range:
http://w3c.github.io/media-source/index.html#htmlmediaelement-extensions

MSE extends the HTMLMediaElement.

I posted the MSE definition in the original message.

The seekable range is defined as [0, duration), or as [0, buffered.end(buffered.length - 1)) when duration is infinite, where buffered is HTMLMediaElement.buffered.

Either way, it is continuous.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INVALID
For the historians.

from IRC:
<jya> the seekable range with MSE is the buffered range if duration is infinite ; or 0 to duration if not.
So we can seek anywhere, really.
However, the way I read the spec, the buffered range is the intersection of the audio track and the video track. We can only play if we have buffered data at currentTime, and since buffered is the intersection, we should only be able to play if we have both audio and video at that point.

So while we can seek, we can't play.
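
To make the intersection reading concrete, here is a small TypeScript sketch using the ranges from comment 0. `TimeRange`, `intersectRanges` and `contains` are hypothetical helpers written for this illustration only; they assume each input is a sorted list of non-overlapping [start, end) ranges and do not model the spec's end-of-stream handling.

```typescript
// Hypothetical helper type for this sketch: a half-open range [start, end).
type TimeRange = { start: number; end: number };

// Intersect two sorted, non-overlapping lists of ranges.
function intersectRanges(a: TimeRange[], b: TimeRange[]): TimeRange[] {
  const result: TimeRange[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    const start = Math.max(a[i].start, b[j].start);
    const end = Math.min(a[i].end, b[j].end);
    if (start < end) {
      result.push({ start, end });
    }
    // Advance whichever range finishes first.
    if (a[i].end < b[j].end) {
      i++;
    } else {
      j++;
    }
  }
  return result;
}

// True if `time` falls inside one of the ranges.
function contains(ranges: TimeRange[], time: number): boolean {
  return ranges.some((r) => time >= r.start && time < r.end);
}

// Ranges from comment 0: audio [0, 30), video [0, 15)[20, 30).
const audioRanges: TimeRange[] = [{ start: 0, end: 30 }];
const videoRanges: TimeRange[] = [{ start: 0, end: 15 }, { start: 20, end: 30 }];
const intersection = intersectRanges(audioRanges, videoRanges);
const canPlayAt16 = contains(intersection, 16);
```

Under these assumptions the intersection is [0, 15)[20, 30), so position 16 is seekable but not buffered, which matches the stall described above.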

What could be a matter of argument is this: since we can seek anywhere in the seekable range, shouldn't we fire the "seeked" event even if we have stalled? Right now, we don't.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Upon checking
http://dev.w3.org/html5/spec-preview/media-elements.html#seeking

"Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is, until it has decoded enough data to play back that position."

So, as the buffered range is empty there, we can assume that we can't decode enough data, and as such, not firing the seeked event is the right thing to do.
Status: REOPENED → RESOLVED
Closed: 4 years ago → 4 years ago
Resolution: --- → INVALID
"Wait until the user agent has established whether or not the media data for the new playback position is available, and, if it is..."

What if it is not? I can't find any spec text covering the "if not" case...
To me, all of that falls into the "wait until" case. The data is not available, so the steps do not complete; it is a blocking loop :)
I see. A seek will never fail due to a timeout. Since a seek is cancelable, it is up to the user to initiate a new seek if the wait is too long.