Closed Bug 1243608 Opened 4 years ago Closed 4 years ago

Can't seek in video with only a single video frame

Categories

(Core :: Audio/Video: Playback, defect, P2)

45 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: jya, Assigned: jya)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(6 files, 2 obsolete files)

Some videos are made of a single frame at the start and an audio track.

Like https://people.mozilla.org/~jyavenard/tests/1frame.html

We can't seek in those videos. instead seek cause playtime to go back to the beginning of the videos.

Those videos are commonly created by screen capture tool, and is used by Vimeo and Dailymotion for music videos (where the first frame is typically the album cover)

All other browsers handle those files properly, allowing to seek into them.
I argued that the video frame should have a duration extending to the end of the file, but the opposite argument is that in the mp4, the track is marked as having a duration of 55s (for this particular video) and per the MP4 spec, the container duration override the information found in the sample
actually we can seek to the proper location. it's not optimum however as we seek audio to where the video seeked, so this causes decoding all audio frames to be decoded from the start until the seek position.

So seeking works, however playback restart from 0.
JW, this is giving me something totally unexpected with Nightly.

With the video above do the following:
- Load https://people.mozilla.org/~jyavenard/tests/1frame.html (do not start playback)
- Seek to some place in the future, I chose 35s.
- Click on play()

the time position will go back to 0 and will start progressing. Note that no audio can be heard.
Once the time position reaches 35s (the original seek position) then audio will start playing.

What control this behaviour ?
Flags: needinfo?(jwwang)
v.canPlayType("video/mp4; codecs="avc1.42E01E, mp4a.40.22") = probably
* v.loadstart
* v.progress
* v.suspend
* v.durationchange
v.duration=55.936
* v.resize
v.height=720
v.width=960
v.loadedmetadata
v.height=720
v.width=960
v.duration=55.936
* v.loadeddata
* v.canplay
* v.canplaythrough  <---- seeking to ~50s
* v.seeking
* v.canplay
* v.canplaythrough
* v.timeupdate (v.currentTime=50.395)
* v.seeked (v.currentTime=50.395)
* v.ratechange  <------ click on play (|>)
* v.play
* v.playing
* v.timeupdate (v.currentTime=0)
* v.timeupdate (v.currentTime=0.18577)
ok..

the problem is here:

https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaDecoderStateMachine.cpp#2086

We correctly seeked to where we wanted, but then pop the time of the last video frame displayed (which here is 0), so currentTime is reset to 0.

This is wrong
Flags: needinfo?(jwwang)
This is a regression introduced by bug 1112438
Depends on: 1112438
Keywords: regression
The original commit was a work around for poorly encoded videos generated by some FFOS devices which could make the first audio frame non-zero. Trying then to make them appear as zero when seeking to start.

There are better ways to get around this problem.
an alternative so to not "regress" the FFOS issue would be to narrow the workaround to situations where audio time and video time are closed to one another, like < 125ms (which is the gap we allow and still skip over)

Happy with this approach too.
Flags: needinfo?(cpearce)
The 1st decoded video frame to come after seeking is [0,42709] and then we get OnNotDecoded() with type=VIDEO. Considering we seek to 35s, it is curious the video frames returned by the reader are totally off the target.
The same problem happens to audio too. The 1st audio frame after seeking is [0,21333]. And MSDM continues to decode and drop audio frames until the one close to the seek target is received.
Oh, I see. There is only one single video frame in the track.
(In reply to JW Wang [:jwwang] from comment #11)
> The same problem happens to audio too. The 1st audio frame after seeking is
> [0,21333]. And MSDM continues to decode and drop audio frames until the one
> close to the seek target is received.

yes, that's because we always seek the audio to the video seek time.

This was done for multiple reason:
1- It's required if fast seeking
2- With YouTube, often their media segment aren't exactly aligned and it was causing broken A/V seek after seeking
(In reply to Jean-Yves Avenard [:jya] from comment #13)
> 2- With YouTube, often their media segment aren't exactly aligned and it was
> causing broken A/V seek after seeking

Isn't the AV sync problem is already handled by MDSM? MDSM is still able to keep A/V sync even in this case where audio time is far from video time after seeking.
(In reply to JW Wang [:jwwang] from comment #14)
> (In reply to Jean-Yves Avenard [:jya] from comment #13)
> > 2- With YouTube, often their media segment aren't exactly aligned and it was
> > causing broken A/V seek after seeking
> 
> Isn't the AV sync problem is already handled by MDSM? MDSM is still able to
> keep A/V sync even in this case where audio time is far from video time
> after seeking.

it should have yes, but somehow that never worked properly.
With the new MSE I thought I could remove the seek audio to video seek , but that caused issue.

(bug 1188313, was for solving fast seek)
The A/V sync issue: bug 1105066. It was also fixing a stall (bug 1173792)
I think we should remove that and rely on MDSM to handle A/V sync correctly.

As your bug 1242841 comment 3, negative timestamps could be a problem. However, they are usually too small to cause A/V sync problem. For our media stack to be more robust, we need to handle abnormal timestamps to keep A/V sync all the time.
(In reply to JW Wang [:jwwang] from comment #16)
> I think we should remove that and rely on MDSM to handle A/V sync correctly.

Yes, I have a pending patch that only seek audio and video in the reader if we are performing a fast seek.

I will upload shortly
This will allow the reader to know if we are performing a fast seek.
Attachment #8713147 - Flags: review?(cpearce)
Also makes it a private member and provide a GetTime() accessor instead.
Attachment #8713148 - Flags: review?(cpearce)
This allows for much faster seek time.
Attachment #8713151 - Flags: review?(cpearce)
Attachment #8713027 - Flags: review?(cpearce) → review+
This will allow the reader to know if we are performing a fast seek.

Rebasing following bug 1244477.
Additionally, realised that InvokeAsync actually pass a reference to the original object without making a copy. This would lead to a racy behaviour. So updating the prototype to use a value rather than a reference.
Attachment #8714213 - Flags: review?(cpearce)
Attachment #8713147 - Attachment is obsolete: true
Attachment #8713147 - Flags: review?(cpearce)
Also makes it a private member and provide a GetTime() accessor instead.

rebasing
Attachment #8714215 - Flags: review?(cpearce)
Attachment #8713148 - Attachment is obsolete: true
Attachment #8713148 - Flags: review?(cpearce)
Comment on attachment 8714213 [details] [diff] [review]
P2. Pass the full SeekTarget object to MediaDecoderReader::Seek.

Review of attachment 8714213 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/SeekTarget.h
@@ +15,5 @@
> +  Observable,
> +  Suppressed
> +};
> +
> +// Stores the seek target; the time to seek to, and whether an Accurate,

This 4 line comment is the same two line comment repeated twice. Can be said only once.
Attachment #8714213 - Flags: review?(cpearce) → review+
Comment on attachment 8714215 [details] [diff] [review]
P3. Make SeekTarget::mTime a TimeUnit object.

Review of attachment 8714215 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaDecoderStateMachine.cpp
@@ +967,5 @@
>        if (!mDropVideoUntilNextDiscontinuity) {
>          // We must be after the discontinuity; we're receiving samples
>          // at or after the seek target.
>          if (mCurrentSeek.mTarget.mType == SeekTarget::PrevSyncPoint &&
> +            mCurrentSeek.mTarget.GetTime().ToMicroseconds() > mCurrentTimeBeforeSeek &&

Makes me wonder if mCurrentTimeBeforeSeek could be a TimeUnit as well.
Attachment #8714215 - Flags: review?(cpearce) → review+
Attachment #8713149 - Flags: review?(cpearce) → review+
Attachment #8713150 - Flags: review?(cpearce) → review+
Attachment #8713151 - Flags: review?(cpearce) → review+
Depends on: 1244639
Blocks: 1245019
Part 2 causes breakage on b2g: 

In file included from ../../dist/include/MediaOmxDecoder.h:9:0,
                 from ../../../../mozilla-inbound/dom/media/DecoderTraits.cpp:32:
../../dist/include/MediaOmxCommonDecoder.h:29:8: error: 'void mozilla::MediaOmxCommonDecoder::CallSeek(mozilla::SeekTarget)' marked override, but does not override
In file included from ../../../../mozilla-inbound/dom/media/DecoderTraits.cpp:8:0:
../../../../mozilla-inbound/dom/media/MediaDecoder.h:608:16: error: 'virtual void mozilla::MediaDecoder::CallSeek(const mozilla::SeekTarget&)' was hidden [-Werror=overloaded-virtual]
In file included from ../../dist/include/MediaOmxDecoder.h:9:0,
                 from ../../../../mozilla-inbound/dom/media/DecoderTraits.cpp:32:
../../dist/include/MediaOmxCommonDecoder.h:29:8: error:   by 'void mozilla::MediaOmxCommonDecoder::CallSeek(mozilla::SeekTarget)' [-Werror=overloaded-virtual]
cc1plus: all warnings being treated as errors
Flags: needinfo?(jyavenard)
Depends on: 1245052
That is weird. I saw that error on try and corrected it. 

Seems that I pushed an old version of the patch. Sorry for that.
Flags: needinfo?(jyavenard)
Flags: needinfo?(cpearce)
Depends on: 1246358
Depends on: 1284700
You need to log in before you can comment on or make changes to this bug.