Unable to play back YouTube video - continually polls/shows AJAX spinner

VERIFIED FIXED in Firefox 41

Status

()

Core
Audio/Video
--
major
VERIFIED FIXED
3 years ago
3 years ago

People

(Reporter: stephend, Assigned: jya)

Tracking

(Blocks: 1 bug)

Trunk
mozilla41
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox40 affected, firefox41 fixed)

Details

(URL)

Attachments

(12 attachments, 5 obsolete attachments)

828.02 KB, text/plain
Details
5.76 MB, application/x-bzip
Details
2.08 KB, patch
Details | Diff | Splinter Review
5.99 KB, patch
Details | Diff | Splinter Review
3.93 KB, patch
Details | Diff | Splinter Review
1.13 KB, patch
Details | Diff | Splinter Review
6.18 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
1.21 KB, patch
kentuckyfriedtakahe
: review+
Details | Diff | Splinter Review
1.57 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
4.52 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
4.10 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
12.85 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
(Reporter)

Description

3 years ago
Created attachment 8603634 [details]
MediaLog.txt

Build ID: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:40.0) Gecko/20100101 Firefox/40.0

GPU info:
* Nvidia Quadro K600
* CPU: Inter Core i7-4470 CPU @ 3.40GHZ
* RAM: 32GB
* Driver version: 347.52

STR:

1. Have previously loaded https://www.youtube.com/watch?v=jEnd8JIMii4 in the browser (so you can hopefully reproduce my setup in this problem)
2. Set https://www.youtube.com/watch?v=jEnd8JIMii4 as your homepage, in Nightly Firefox on Windows
3. Wait for the video to render

Actual Results:

A ton of GETs fire (visible via the Network panel in the Web Console/Dev Tools), but the video -- at least in my case -- never resumes playing

Screencast showing the problem (please note; it doesn't, unfortunately, correspond to the logfile, but does demonstrate the behavior I see consistently, today): http://screencast.com/t/x8kHX626a

BTW, I've also filed bug 1152580 from the same machine, and Firefox Nightly (though a different -- earlier -- build).
(Reporter)

Comment 1

3 years ago
Apologies in advance, Anthony -- was told in #media to ni? you :-)
Flags: needinfo?(ajones)
(Reporter)

Comment 2

3 years ago
(I should also mention that I had e10s *disabled* when I experienced this bug and made this log/screencast.)
How easy is it to reproduce?
Blocks: 778617
Can you please try get a log with "MediaPromise:5" included in the logging?

It looks like we're getting stuck waiting for a seek to complete, but it's not really clear why.
Flags: needinfo?(ajones)
(Reporter)

Comment 5

3 years ago
Created attachment 8604730 [details]
MediaLog.txt.bz2

Here's a logfile from playing two 4K YouTube videos, one of which started "Experiencing Interruptions?" and then remained stuck in the AJAX-loading/spinning state - hopefully that's captured in this log.
(Reporter)

Comment 6

3 years ago
Matt/Anthony; mind taking a look at the new logfile I've attached, with MediaPromise:5? Thx!
Flags: needinfo?(matt.woodrow)
This log seems very different to the previous one, I can't find the same stall in it.

Can you please try reproduce again with only one video playing?
Flags: needinfo?(matt.woodrow)
(Reporter)

Comment 8

3 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #7)
> This log seems very different to the previous one, I can't find the same
> stall in it.
> 
> Can you please try reproduce again with only one video playing?

I caught this again, albeit this time with a tab of https://www.youtube.com/watch?v=jEnd8JIMii4 I detached from my main browser window (which also had GitHub.com loaded in a separate tab).

The video playback stopped twice: 1) 4:04 and 2) 14:15

Here's the corresponding screencast which unfortunately only shows from the 1st stop at 4:04 onward: http://www.screencast.com/t/JvqHxiSHT
(Reporter)

Comment 9

3 years ago
Corresponding logfile with NSPR_LOG_MODULES=MediaDecoder:5,MediaSource:5,nsMediaElement:5,nsMediaElementEvents:5,mediaPromise:5 all set: https://app.box.com/s/7h8fymnyv9hthghdymjmg3kw740wx5i9
(In reply to Stephen Donner [:stephend] from comment #8)
> (In reply to Matt Woodrow (:mattwoodrow) from comment #7)
> > This log seems very different to the previous one, I can't find the same
> > stall in it.
> > 
> > Can you please try reproduce again with only one video playing?
> 
> I caught this again, albeit this time with a tab of
> https://www.youtube.com/watch?v=jEnd8JIMii4 I detached from my main browser
> window (which also had GitHub.com loaded in a separate tab).
> 
> The video playback stopped twice: 1) 4:04 and 2) 14:15
> 
> Here's the corresponding screencast which unfortunately only shows from the
> 1st stop at 4:04 onward: http://www.screencast.com/t/JvqHxiSHT

How long does playback stop for?

At 4:04 I can see that you ran out of decoded video frames and it paused briefly to catch up, shouldn't have stopped for long though.

The log doesn't appear to cover 14:15.
(Reporter)

Comment 11

3 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #10)

<snip>

> How long does playback stop for?
> 

Sorry; I can't recollect - is there a way to get timestamps in the logs?

> At 4:04 I can see that you ran out of decoded video frames and it paused
> briefly to catch up, shouldn't have stopped for long though.
> 
> The log doesn't appear to cover 14:15.

If you just click on https://app.box.com/s/7h8fymnyv9hthghdymjmg3kw740wx5i9 you get a preview -- downloading, I get the full file, ending in:

...
9412[225a5400]: Resolving Then() call made from mozilla::MediaDecoderStateMachine::FinishShutdown [Runnable=8f83b40, Promise=8f78200, ThenValue=a0dd6c0]
9412[225a5400]: MediaPromise::~MediaPromise [this=8f78200]
0[80e800]: ResolveOrRejectRunnable::Run() [this=8f83b40]

Does that help?  I'll try to get a more concerted effort again, tomorrow (it's my work machine, in Mountain View).
(Reporter)

Comment 12

3 years ago
With Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:41.0) Gecko/20100101 Firefox/41.0 https://hg.mozilla.org/mozilla-central/rev/4fb7ff694bf5, I hit a case where it was AJAX-spinning for at least 3 solid minutes, before I shut down Nightly, to report back here.  I've captured that in a logfile and screencast, too:

1) logfile: https://app.box.com/s/5mrlrljd69otcu07fem3vojh3oqnet76
2) screencast: https://app.box.com/s/kvgz9ycvymkv7r726kqv39qavm41vh0i
Interesting piece of the log: https://pastebin.mozilla.org/8834032

Looks like the format change caused youtube to evict all existing data (line 3), and we update the buffered ranges reported to MediaSourceReader (lines 34-37).

The next audio decode fails (line 76), but the buffered ranges we see in SelectDecoder still show the old ranges, so WaitForData returns immediately with haveAudio=1 (line 89) and we try decoding audio again.

This repeats over and over again without changing the media element state, since we always think we have data.

I think this is because MediaFormatReader using cached ranges, and doesn't appear to update them when we get an eviction.

MediaFormatReader only landed recently though, so this looks like a different issue to what was originally reported :(
(Assignee)

Comment 14

3 years ago
Created attachment 8607963 [details] [diff] [review]
Part1. Notify parent decoder that data got removed

Notify MediaSourceDecoder/Reader that data was removed/evicted.
Attachment #8607963 - Flags: review?(matt.woodrow)
(Assignee)

Updated

3 years ago
Assignee: nobody → jyavenard
Status: NEW → ASSIGNED
(Assignee)

Comment 15

3 years ago
Created attachment 8607968 [details] [diff] [review]
Part2. Add MediaDecoderReader::NotifyDataRemoved method

Add MediaDecoderReader::NotifyDataRemoved() method. This allows for the MediaSource's TrackBuffer to notify the sub reader that it's just finished butchering the data. It wasn't required with MP4Reader because calling MP4Reader::GetBuffered() always force to recalculate the index. I intend to scrap this once we get rid of MediaSourceReader and the current TrackBuffer
Attachment #8607968 - Flags: review?(cpearce)
(Assignee)

Comment 16

3 years ago
Created attachment 8607975 [details] [diff] [review]
Part3. Notify mediasource sub-readers that data was evicted

Have the TrackBuffer tells the sub-readers when data were evicted
Attachment #8607975 - Flags: review?(matt.woodrow)
(Assignee)

Comment 17

3 years ago
Created attachment 8607976 [details] [diff] [review]
Part4. Fix MP4TrackDemuxer eviction offset calculations

This is the actual cause of the stall when using the MediaFormatReader. The MP4 Index returns UINT64_MAX when it can't evict... silly bugger
Attachment #8607976 - Flags: review?(matt.woodrow)
(Assignee)

Comment 18

3 years ago
I believe part1 could fix the problem in version 40 (and maybe earlier).

The rest of the patches are only relevant with the MediaFormatReader which is very conservative in refreshing its cached buffered TimeRanges
(Assignee)

Comment 19

3 years ago
may have to uplift part1 if found to be the cause.
Attachment #8607968 - Flags: review?(cpearce) → review+
Blocks: 1166169
(Assignee)

Comment 20

3 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=547323a7bdcd

All green. Doesn't mean much as it was all green before :)
(Assignee)

Comment 21

3 years ago
After about 2.5 hours playback, found playback stalled.
https://www.youtube.com/watch?v=_Q5pWdLAaLg
	mediasource:https://www.youtube.com/9de01f04-e056-cd4b-8d1b-d690750b805c
	currentTime: 4.87619
		SourceBuffer 0
			start=0 end=4.875895
		SourceBuffer 1
			start=0 end=5.005
	Internal Data:
	Dumping data for reader 1435ee800:
		Dumping Audio Track Decoders: - mLastAudioTime: 4.875894
			Reader 0: 14975d800 ranges=[(0.000000, 4.875895)] active=true size=81283
		Dumping Video Track Decoders - mLastVideoTime: 4.337666
			Reader 0: 14976f800 ranges=[(0.000000, 5.005000)] active=true size=1382492

Seems like a rounding issue, which is ultra-surprising since the move to TimeUnit/TimeIntervals are there should be none.

Most likely is the use of fractionals in the MP4 container.
(Assignee)

Comment 22

3 years ago
https://www.youtube.com/watch?v=_Q5pWdLAaLg
	mediasource:https://www.youtube.com/eea01ffe-23df-bb4a-8d21-da5023a9138e
	currentTime: 28.869523
		SourceBuffer 0
			start=0 end=28.722539
		SourceBuffer 1
	Internal Data:
	Dumping data for reader 1483c9000:
		Dumping Audio Track Decoders: - mLastAudioTime: 28.722538
			Reader 1: 159bd8800 ranges=[(0.000000, 28.722539)] active=true size=462037
			Reader 0: 13a1df000 ranges=[] active=false size=0
		Dumping Video Track Decoders - mLastVideoTime: 24.958266
			Reader 0: 128872000 ranges=[] active=true size=43294223

And another case with a 43MB sourcebuffer, yet empty :(
Attachment #8607963 - Flags: review?(matt.woodrow) → review+
Attachment #8607975 - Flags: review?(matt.woodrow) → review+
Attachment #8607976 - Flags: review?(matt.woodrow) → review+
(Assignee)

Comment 23

3 years ago
Created attachment 8609222 [details] [diff] [review]
Part1. Notify parent decoder that data got removed

Rebase
(Assignee)

Updated

3 years ago
Attachment #8607963 - Attachment is obsolete: true
(Assignee)

Comment 24

3 years ago
Created attachment 8609223 [details] [diff] [review]
Part2. Add MediaDecoderReader::NotifyDataRemoved method

Rebase
(Assignee)

Updated

3 years ago
Attachment #8607968 - Attachment is obsolete: true
(Assignee)

Comment 25

3 years ago
Created attachment 8609224 [details] [diff] [review]
Part3. Notify mediasource sub-readers that data was evicted

Rebase
(Assignee)

Updated

3 years ago
Attachment #8607975 - Attachment is obsolete: true
(Assignee)

Comment 26

3 years ago
Created attachment 8609225 [details] [diff] [review]
Part4. Fix MP4TrackDemuxer eviction offset calculations

Rebase
(Assignee)

Updated

3 years ago
Attachment #8607976 - Attachment is obsolete: true
(Assignee)

Comment 27

3 years ago
Created attachment 8609379 [details] [diff] [review]
Part6. Always notify MediaSourceReader of new data on its own task queue

Always ensure MediaSourceReader is notified that new data has arrived (or be removed) on its own task queue. There could be a race where we call the sub-reader's GetBuffered() leading to a discrepancy between what playback expects to find and what's really there. This caused to never come out of WAITING_FOR_DATA mode.
Attachment #8609379 - Flags: review?(matt.woodrow)
(Assignee)

Comment 28

3 years ago
Created attachment 8609381 [details] [diff] [review]
Part7. Never do blocking read if we don't have data

*never* do blocking reads with MSE. From time to time we have a rounding issue (usually by one microsecond) between the time of the last sample and the moof's end time. This caused the MediaSourceReader attempt to decode one more sample that we don't have, entering a blocking read. We ensure that the sample is within the resource's content, or report EOS otherwise.
Attachment #8609381 - Flags: review?(ajones)
(Assignee)

Comment 29

3 years ago
Created attachment 8609382 [details] [diff] [review]
Part7. Never do blocking read if we don't have data

Little simplification
Attachment #8609382 - Flags: review?(ajones)
(Assignee)

Updated

3 years ago
Attachment #8609381 - Attachment is obsolete: true
Attachment #8609381 - Flags: review?(ajones)
(Assignee)

Comment 30

3 years ago
Another interesting stall:
HTMLMediaElement debug data

https://www.youtube.com/watch?v=jEnd8JIMii4
	mediasource:https://www.youtube.com/ca06e66d-cdd0-814f-8403-0f599e77bc94
	currentTime: 153.599995
		SourceBuffer 0
			start=10.0078 end=175.054943
		SourceBuffer 1
			start=150.15 end=175.175
	Internal Data:
	Dumping data for reader 11b54d800:
		Dumping Audio Track Decoders: - mLastAudioTime: 154.621382
			Reader 2: 12ad42000 ranges=[(150.000907, 175.054943)] active=true size=404131
			Reader 1: 13620a000 ranges=[(10.007800, 150.000317)] active=false size=2249647
			Reader 0: 13a5ec000 ranges=[] active=false size=0
		Dumping Video Track Decoders - mLastVideoTime: 152.986166
			Reader 0: 132b83000 ranges=[(150.150000, 175.175000)] active=true size=75663176

We're stuck trying to seek to 152.986166 to get a new sample and this makes us jump to audio reader 1, which then fail with EOS, and MediaSourceReader retries indefinitely. Actually, I believe this is the original bug reported as the behaviour is exactly as described in the first STR.
(Assignee)

Updated

3 years ago
Blocks: 1166836
(Assignee)

Comment 31

3 years ago
Created attachment 8609703 [details] [diff] [review]
Part8. Fix incorrect seconds / microseconds conversion

Fix sourcebuffer eviction, we were evicting the entire sourcebuffer, another microseconds / seconds issue.
Attachment #8609703 - Flags: review?(matt.woodrow)
(Assignee)

Comment 32

3 years ago
Created attachment 8609722 [details] [diff] [review]
Part9. Don't reset demuxer when skipping to next keyframe

Do not reset the demuxer before skipping to the demuxer.
Attachment #8609722 - Flags: review?(cpearce)
(Assignee)

Comment 33

3 years ago
Created attachment 8609723 [details] [diff] [review]
Part10. Properly recalculate next keyframe time after seeking

Properly recalculate next keyframe time in MP4Demuxer after either a reset or skipping to next key frame. This would otherwise trigger the skip-to-nextkeyframe algorithm unecessarily.
Attachment #8609723 - Flags: review?(cpearce)
(Assignee)

Comment 34

3 years ago
Created attachment 8609739 [details] [diff] [review]
Part5: Increase verbosity content for debugging purposes

Had forgotten that one. This is MediaFormatReader with logs, made it easier to debug what was going on. Simplify ::Update(), as we don't need the monitor anymore, NeedInput() didn't have to be where it was (remnant of MP4Reader.
Attachment #8609739 - Flags: review?(cpearce)
(Assignee)

Comment 35

3 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=62ec2a54c4a6

super consistent on all version of mac and windows, still old intermittent with bug 1138555 on linux
Attachment #8609379 - Flags: review?(matt.woodrow) → review+
Attachment #8609703 - Flags: review?(matt.woodrow) → review+
Comment on attachment 8609382 [details] [diff] [review]
Part7. Never do blocking read if we don't have data

Review of attachment 8609382 [details] [diff] [review]:
-----------------------------------------------------------------

How is the resource read triggered in the non-MSE case?

Did you consider either of the options below?

1. Add SampleIterator::IsNextSampleBuffered()
2. Make SampleIterator::GetNext() return something that contains the meta-data but instead of providing the data, implements GetDataPromise() and GetDataIfBuffered() methods.
Duplicate of this bug: 1165808
(Assignee)

Comment 38

3 years ago
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #36)
> Comment on attachment 8609382 [details] [diff] [review]
> Part7. Never do blocking read if we don't have data
> 
> Review of attachment 8609382 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> How is the resource read triggered in the non-MSE case?
> 
> Did you consider either of the options below?
> 
> 1. Add SampleIterator::IsNextSampleBuffered()
> 2. Make SampleIterator::GetNext() return something that contains the
> meta-data but instead of providing the data, implements GetDataPromise() and
> GetDataIfBuffered() methods.

this is a big change compare to what we have. I can't foresee the impact changing this API. It would also render the demuxer incompatible with the existing MP4Reader (which is still there).

Returning nullptr is what we do when we reach an EOS in other cases (such as parsing the moof or the moov). 

The logic I added there is the same as we do in the moofparser:
We check that the data is contained in the current range. There are three cases to consider:
1- MSE, we know the exact buffered range. Here Length() would be the end of the buffered data. We don't want to block, ever (this would lead to stall (MediaFormatReader) or deadlocks (MP4Reader))
2- Plain MP4 from a server reporting the size. Length() returned would be the end of the data reported by the server. Read can be blocking while the resource retrieves the value
3- Plain MP4 from server not reporting the size. Length() will be returned as INT64_MAX, so we block while the resource retrieves the value.
For case 2 and 3, a EOS will be returned after should the data not be present as size read != size expected.

The patch handle all 3 conditions (as it does when reading the moof or any of the incomple boxes)
Attachment #8609722 - Flags: review?(cpearce) → review+
Attachment #8609723 - Flags: review?(cpearce) → review+
Comment on attachment 8609739 [details] [diff] [review]
Part5: Increase verbosity content for debugging purposes

Review of attachment 8609739 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaFormatReader.cpp
@@ +722,5 @@
>    // which overrides our "few more samples" threshold.
>    return
>      !aDecoder.mError &&
>      aDecoder.HasPromise() &&
> +    !aDecoder.mDemuxRequest.Exists() &&

This looks like a behaviour change, rather than a logging change, did you mean to include it in this patch? There's a few chunks like that below.

@@ +997,5 @@
>  MediaFormatReader::ResetDecode()
>  {
>    MOZ_ASSERT(OnTaskQueue());
>  
> +  LOGV("");

Did you mean to log a blank line here, or should you be logging "MediaFormatReader::ResetDecode()" here instead?

@@ +1034,5 @@
>      return;
>    }
>  
>    RefPtr<nsIRunnable> task =
> +    NS_NewRunnableMethodWithArgs<TrackType, MediaData*>(

Here you're dropping the refcounting on MediaData. That means the MediaData is not stored refcounted, and so it could end up being destroyed before you're done using it. You should put that back!
Attachment #8609739 - Flags: review?(cpearce) → review+
(Assignee)

Comment 40

3 years ago
(In reply to Chris Pearce (:cpearce) from comment #39)

> ::: dom/media/MediaFormatReader.cpp
> @@ +722,5 @@
> >    // which overrides our "few more samples" threshold.
> >    return
> >      !aDecoder.mError &&
> >      aDecoder.HasPromise() &&
> > +    !aDecoder.mDemuxRequest.Exists() &&
> 
> This looks like a behaviour change, rather than a logging change, did you
> mean to include it in this patch? There's a few chunks like that below.

It's not a behaviour change. We were checking if mDemuxRequest existed later after calling NeedInput() (in RequestDemuxSamples()) ; so it was cleaner to do so in NeedInput()

NeedInput() taking the meaning: do we need to demux a new frame or feed data to the decoder.

> 
> @@ +997,5 @@
> >  MediaFormatReader::ResetDecode()
> >  {
> >    MOZ_ASSERT(OnTaskQueue());
> >  
> > +  LOGV("");
> 
> Did you mean to log a blank line here, or should you be logging
> "MediaFormatReader::ResetDecode()" here instead?

the macro already adds the name of the function by default.

> 
> @@ +1034,5 @@
> >      return;
> >    }
> >  
> >    RefPtr<nsIRunnable> task =
> > +    NS_NewRunnableMethodWithArgs<TrackType, MediaData*>(
> 
> Here you're dropping the refcounting on MediaData. That means the MediaData
> is not stored refcounted, and so it could end up being destroyed before
> you're done using it. You should put that back!

no no.. That's what bug 1153295 was about. MediaData being refcounted (and as such having a Release/AddRef method, NS_NewRunnableMethodWithArgs will automatically increase the refcount for those.
Attachment #8609382 - Flags: review?(ajones) → review+
(Assignee)

Updated

3 years ago
Duplicate of this bug: 1167932

Updated

3 years ago
Duplicate of this bug: 1168217

Updated

3 years ago
Duplicate of this bug: 1168215
(Reporter)

Comment 48

3 years ago
Sorry for the delay in verifying this, and thanks for all the hard work + patience!

Verified FIXED using Build identifier: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:41.0) Gecko/20100101 Firefox/41.0
Status: RESOLVED → VERIFIED

Updated

3 years ago
Depends on: 1196696
You need to log in before you can comment on or make changes to this bug.