Closed
Bug 1163227
Opened 8 years ago
Closed 8 years ago
Unable to play back YouTube video - continually polls/shows AJAX spinner
Categories
(Core :: Audio/Video, defect)
Core
Audio/Video
Tracking
()
VERIFIED
FIXED
mozilla41
People
(Reporter: stephend, Assigned: jya)
References
(Blocks 1 open bug, )
Details
Attachments
(12 files, 5 obsolete files)
828.02 KB,
text/plain
|
Details | |
5.76 MB,
application/x-bzip
|
Details | |
2.08 KB,
patch
|
Details | Diff | Splinter Review | |
5.99 KB,
patch
|
Details | Diff | Splinter Review | |
3.93 KB,
patch
|
Details | Diff | Splinter Review | |
1.13 KB,
patch
|
Details | Diff | Splinter Review | |
6.18 KB,
patch
|
mattwoodrow
:
review+
|
Details | Diff | Splinter Review |
1.21 KB,
patch
|
ajones
:
review+
|
Details | Diff | Splinter Review |
1.57 KB,
patch
|
mattwoodrow
:
review+
|
Details | Diff | Splinter Review |
4.52 KB,
patch
|
cpearce
:
review+
|
Details | Diff | Splinter Review |
4.10 KB,
patch
|
cpearce
:
review+
|
Details | Diff | Splinter Review |
12.85 KB,
patch
|
cpearce
:
review+
|
Details | Diff | Splinter Review |
Build ID: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:40.0) Gecko/20100101 Firefox/40.0 GPU info: * Nvidia Quadro K600 * CPU: Inter Core i7-4470 CPU @ 3.40GHZ * RAM: 32GB * Driver version: 347.52 STR: 1. Have previously loaded https://www.youtube.com/watch?v=jEnd8JIMii4 in the browser (so you can hopefully reproduce my setup in this problem) 2. Set https://www.youtube.com/watch?v=jEnd8JIMii4 as your homepage, in Nightly Firefox on Windows 3. Wait for the video to render Actual Results: A ton of GETs fire (visible via the Network panel in the Web Console/Dev Tools), but the video -- at least in my case -- never resumes playing Screencast showing the problem (please note; it doesn't, unfortunately, correspond to the logfile, but does demonstrate the behavior I see consistently, today): http://screencast.com/t/x8kHX626a BTW, I've also filed bug 1152580 from the same machine, and Firefox Nightly (though a different -- earlier -- build).
Reporter | ||
Comment 1•8 years ago
|
||
Apologies in advance, Anthony -- was told in #media to ni? you :-)
Flags: needinfo?(ajones)
Reporter | ||
Comment 2•8 years ago
|
||
(I should also mention that I had e10s *disabled* when I experienced this bug and made this log/screencast.)
How easy is it to reproduce?
Blocks: MSE
Comment 4•8 years ago
|
||
Can you please try get a log with "MediaPromise:5" included in the logging? It looks like we're getting stuck waiting for a seek to complete, but it's not really clear why.
Updated•8 years ago
|
Flags: needinfo?(ajones)
Reporter | ||
Comment 5•8 years ago
|
||
Here's a logfile from playing two 4K YouTube videos, one of which started "Experiencing Interruptions?" and then remained stuck in the AJAX-loading/spinning state - hopefully that's captured in this log.
Reporter | ||
Comment 6•8 years ago
|
||
Matt/Anthony; mind taking a look at the new logfile I've attached, with MediaPromise:5? Thx!
Flags: needinfo?(matt.woodrow)
Comment 7•8 years ago
|
||
This log seems very different to the previous one, I can't find the same stall in it. Can you please try reproduce again with only one video playing?
Flags: needinfo?(matt.woodrow)
Reporter | ||
Comment 8•8 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #7) > This log seems very different to the previous one, I can't find the same > stall in it. > > Can you please try reproduce again with only one video playing? I caught this again, albeit this time with a tab of https://www.youtube.com/watch?v=jEnd8JIMii4 I detached from my main browser window (which also had GitHub.com loaded in a separate tab). The video playback stopped twice: 1) 4:04 and 2) 14:15 Here's the corresponding screencast which unfortunately only shows from the 1st stop at 4:04 onward: http://www.screencast.com/t/JvqHxiSHT
Reporter | ||
Comment 9•8 years ago
|
||
Corresponding logfile with NSPR_LOG_MODULES=MediaDecoder:5,MediaSource:5,nsMediaElement:5,nsMediaElementEvents:5,mediaPromise:5 all set: https://app.box.com/s/7h8fymnyv9hthghdymjmg3kw740wx5i9
Comment 10•8 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #8) > (In reply to Matt Woodrow (:mattwoodrow) from comment #7) > > This log seems very different to the previous one, I can't find the same > > stall in it. > > > > Can you please try reproduce again with only one video playing? > > I caught this again, albeit this time with a tab of > https://www.youtube.com/watch?v=jEnd8JIMii4 I detached from my main browser > window (which also had GitHub.com loaded in a separate tab). > > The video playback stopped twice: 1) 4:04 and 2) 14:15 > > Here's the corresponding screencast which unfortunately only shows from the > 1st stop at 4:04 onward: http://www.screencast.com/t/JvqHxiSHT How long does playback stop for? At 4:04 I can see that you ran out of decoded video frames and it paused briefly to catch up, shouldn't have stopped for long though. The log doesn't appear to cover 14:15.
Reporter | ||
Comment 11•8 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #10) <snip> > How long does playback stop for? > Sorry; I can't recollect - is there a way to get timestamps in the logs? > At 4:04 I can see that you ran out of decoded video frames and it paused > briefly to catch up, shouldn't have stopped for long though. > > The log doesn't appear to cover 14:15. If you just click on https://app.box.com/s/7h8fymnyv9hthghdymjmg3kw740wx5i9 you get a preview -- downloading, I get the full file, ending in: ... 9412[225a5400]: Resolving Then() call made from mozilla::MediaDecoderStateMachine::FinishShutdown [Runnable=8f83b40, Promise=8f78200, ThenValue=a0dd6c0] 9412[225a5400]: MediaPromise::~MediaPromise [this=8f78200] 0[80e800]: ResolveOrRejectRunnable::Run() [this=8f83b40] Does that help? I'll try to get a more concerted effort again, tomorrow (it's my work machine, in Mountain View).
Reporter | ||
Comment 12•8 years ago
|
||
With Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:41.0) Gecko/20100101 Firefox/41.0 https://hg.mozilla.org/mozilla-central/rev/4fb7ff694bf5, I hit a case where it was AJAX-spinning for at least 3 solid minutes, before I shut down Nightly, to report back here. I've captured that in a logfile and screencast, too: 1) logfile: https://app.box.com/s/5mrlrljd69otcu07fem3vojh3oqnet76 2) screencast: https://app.box.com/s/kvgz9ycvymkv7r726kqv39qavm41vh0i
Comment 13•8 years ago
|
||
Interesting piece of the log: https://pastebin.mozilla.org/8834032 Looks like the format change caused youtube to evict all existing data (line 3), and we update the buffered ranges reported to MediaSourceReader (lines 34-37). The next audio decode fails (line 76), but the buffered ranges we see in SelectDecoder still show the old ranges, so WaitForData returns immediately with haveAudio=1 (line 89) and we try decoding audio again. This repeats over and over again without changing the media element state, since we always think we have data. I think this is because MediaFormatReader using cached ranges, and doesn't appear to update them when we get an eviction. MediaFormatReader only landed recently though, so this looks like a different issue to what was originally reported :(
Assignee | ||
Comment 14•8 years ago
|
||
Notify MediaSourceDecoder/Reader that data was removed/evicted.
Attachment #8607963 -
Flags: review?(matt.woodrow)
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → jyavenard
Status: NEW → ASSIGNED
Assignee | ||
Comment 15•8 years ago
|
||
Add MediaDecoderReader::NotifyDataRemoved() method. This allows for the MediaSource's TrackBuffer to notify the sub reader that it's just finished butchering the data. It wasn't required with MP4Reader because calling MP4Reader::GetBuffered() always force to recalculate the index. I intend to scrap this once we get rid of MediaSourceReader and the current TrackBuffer
Attachment #8607968 -
Flags: review?(cpearce)
Assignee | ||
Comment 16•8 years ago
|
||
Have the TrackBuffer tells the sub-readers when data were evicted
Attachment #8607975 -
Flags: review?(matt.woodrow)
Assignee | ||
Comment 17•8 years ago
|
||
This is the actual cause of the stall when using the MediaFormatReader. The MP4 Index returns UINT64_MAX when it can't evict... silly bugger
Attachment #8607976 -
Flags: review?(matt.woodrow)
Assignee | ||
Comment 18•8 years ago
|
||
I believe part1 could fix the problem in version 40 (and maybe earlier). The rest of the patches are only relevant with the MediaFormatReader which is very conservative in refreshing its cached buffered TimeRanges
Assignee | ||
Comment 19•8 years ago
|
||
may have to uplift part1 if found to be the cause.
Updated•8 years ago
|
Attachment #8607968 -
Flags: review?(cpearce) → review+
Assignee | ||
Comment 20•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=547323a7bdcd All green. Doesn't mean much as it was all green before :)
Assignee | ||
Comment 21•8 years ago
|
||
After about 2.5 hours playback, found playback stalled. https://www.youtube.com/watch?v=_Q5pWdLAaLg mediasource:https://www.youtube.com/9de01f04-e056-cd4b-8d1b-d690750b805c currentTime: 4.87619 SourceBuffer 0 start=0 end=4.875895 SourceBuffer 1 start=0 end=5.005 Internal Data: Dumping data for reader 1435ee800: Dumping Audio Track Decoders: - mLastAudioTime: 4.875894 Reader 0: 14975d800 ranges=[(0.000000, 4.875895)] active=true size=81283 Dumping Video Track Decoders - mLastVideoTime: 4.337666 Reader 0: 14976f800 ranges=[(0.000000, 5.005000)] active=true size=1382492 Seems like a rounding issue, which is ultra-surprising since the move to TimeUnit/TimeIntervals are there should be none. Most likely is the use of fractionals in the MP4 container.
Assignee | ||
Comment 22•8 years ago
|
||
https://www.youtube.com/watch?v=_Q5pWdLAaLg mediasource:https://www.youtube.com/eea01ffe-23df-bb4a-8d21-da5023a9138e currentTime: 28.869523 SourceBuffer 0 start=0 end=28.722539 SourceBuffer 1 Internal Data: Dumping data for reader 1483c9000: Dumping Audio Track Decoders: - mLastAudioTime: 28.722538 Reader 1: 159bd8800 ranges=[(0.000000, 28.722539)] active=true size=462037 Reader 0: 13a1df000 ranges=[] active=false size=0 Dumping Video Track Decoders - mLastVideoTime: 24.958266 Reader 0: 128872000 ranges=[] active=true size=43294223 And another case with a 43MB sourcebuffer, yet empty :(
Updated•8 years ago
|
Attachment #8607963 -
Flags: review?(matt.woodrow) → review+
Updated•8 years ago
|
Attachment #8607975 -
Flags: review?(matt.woodrow) → review+
Updated•8 years ago
|
Attachment #8607976 -
Flags: review?(matt.woodrow) → review+
Assignee | ||
Comment 23•8 years ago
|
||
Rebase
Assignee | ||
Updated•8 years ago
|
Attachment #8607963 -
Attachment is obsolete: true
Assignee | ||
Comment 24•8 years ago
|
||
Rebase
Assignee | ||
Updated•8 years ago
|
Attachment #8607968 -
Attachment is obsolete: true
Assignee | ||
Comment 25•8 years ago
|
||
Rebase
Assignee | ||
Updated•8 years ago
|
Attachment #8607975 -
Attachment is obsolete: true
Assignee | ||
Comment 26•8 years ago
|
||
Rebase
Assignee | ||
Updated•8 years ago
|
Attachment #8607976 -
Attachment is obsolete: true
Assignee | ||
Comment 27•8 years ago
|
||
Always ensure MediaSourceReader is notified that new data has arrived (or be removed) on its own task queue. There could be a race where we call the sub-reader's GetBuffered() leading to a discrepancy between what playback expects to find and what's really there. This caused to never come out of WAITING_FOR_DATA mode.
Attachment #8609379 -
Flags: review?(matt.woodrow)
Assignee | ||
Comment 28•8 years ago
|
||
*never* do blocking reads with MSE. From time to time we have a rounding issue (usually by one microsecond) between the time of the last sample and the moof's end time. This caused the MediaSourceReader attempt to decode one more sample that we don't have, entering a blocking read. We ensure that the sample is within the resource's content, or report EOS otherwise.
Attachment #8609381 -
Flags: review?(ajones)
Assignee | ||
Comment 29•8 years ago
|
||
Little simplification
Attachment #8609382 -
Flags: review?(ajones)
Assignee | ||
Updated•8 years ago
|
Attachment #8609381 -
Attachment is obsolete: true
Attachment #8609381 -
Flags: review?(ajones)
Assignee | ||
Comment 30•8 years ago
|
||
Another interesting stall: HTMLMediaElement debug data https://www.youtube.com/watch?v=jEnd8JIMii4 mediasource:https://www.youtube.com/ca06e66d-cdd0-814f-8403-0f599e77bc94 currentTime: 153.599995 SourceBuffer 0 start=10.0078 end=175.054943 SourceBuffer 1 start=150.15 end=175.175 Internal Data: Dumping data for reader 11b54d800: Dumping Audio Track Decoders: - mLastAudioTime: 154.621382 Reader 2: 12ad42000 ranges=[(150.000907, 175.054943)] active=true size=404131 Reader 1: 13620a000 ranges=[(10.007800, 150.000317)] active=false size=2249647 Reader 0: 13a5ec000 ranges=[] active=false size=0 Dumping Video Track Decoders - mLastVideoTime: 152.986166 Reader 0: 132b83000 ranges=[(150.150000, 175.175000)] active=true size=75663176 We're stuck trying to seek to 152.986166 to get a new sample and this makes us jump to audio reader 1, which then fail with EOS, and MediaSourceReader retries indefinitely. Actually, I believe this is the original bug reported as the behaviour is exactly as described in the first STR.
Assignee | ||
Comment 31•8 years ago
|
||
Fix sourcebuffer eviction, we were evicting the entire sourcebuffer, another microseconds / seconds issue.
Attachment #8609703 -
Flags: review?(matt.woodrow)
Assignee | ||
Comment 32•8 years ago
|
||
Do not reset the demuxer before skipping to the demuxer.
Attachment #8609722 -
Flags: review?(cpearce)
Assignee | ||
Comment 33•8 years ago
|
||
Properly recalculate next keyframe time in MP4Demuxer after either a reset or skipping to next key frame. This would otherwise trigger the skip-to-nextkeyframe algorithm unecessarily.
Attachment #8609723 -
Flags: review?(cpearce)
Assignee | ||
Comment 34•8 years ago
|
||
Had forgotten that one. This is MediaFormatReader with logs, made it easier to debug what was going on. Simplify ::Update(), as we don't need the monitor anymore, NeedInput() didn't have to be where it was (remnant of MP4Reader.
Attachment #8609739 -
Flags: review?(cpearce)
Assignee | ||
Comment 35•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=62ec2a54c4a6 super consistent on all version of mac and windows, still old intermittent with bug 1138555 on linux
Updated•8 years ago
|
Attachment #8609379 -
Flags: review?(matt.woodrow) → review+
Updated•8 years ago
|
Attachment #8609703 -
Flags: review?(matt.woodrow) → review+
Comment on attachment 8609382 [details] [diff] [review] Part7. Never do blocking read if we don't have data Review of attachment 8609382 [details] [diff] [review]: ----------------------------------------------------------------- How is the resource read triggered in the non-MSE case? Did you consider either of the options below? 1. Add SampleIterator::IsNextSampleBuffered() 2. Make SampleIterator::GetNext() return something that contains the meta-data but instead of providing the data, implements GetDataPromise() and GetDataIfBuffered() methods.
Assignee | ||
Comment 38•8 years ago
|
||
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #36) > Comment on attachment 8609382 [details] [diff] [review] > Part7. Never do blocking read if we don't have data > > Review of attachment 8609382 [details] [diff] [review]: > ----------------------------------------------------------------- > > How is the resource read triggered in the non-MSE case? > > Did you consider either of the options below? > > 1. Add SampleIterator::IsNextSampleBuffered() > 2. Make SampleIterator::GetNext() return something that contains the > meta-data but instead of providing the data, implements GetDataPromise() and > GetDataIfBuffered() methods. this is a big change compare to what we have. I can't foresee the impact changing this API. It would also render the demuxer incompatible with the existing MP4Reader (which is still there). Returning nullptr is what we do when we reach an EOS in other cases (such as parsing the moof or the moov). The logic I added there is the same as we do in the moofparser: We check that the data is contained in the current range. There are three cases to consider: 1- MSE, we know the exact buffered range. Here Length() would be the end of the buffered data. We don't want to block, ever (this would lead to stall (MediaFormatReader) or deadlocks (MP4Reader)) 2- Plain MP4 from a server reporting the size. Length() returned would be the end of the data reported by the server. Read can be blocking while the resource retrieves the value 3- Plain MP4 from server not reporting the size. Length() will be returned as INT64_MAX, so we block while the resource retrieves the value. For case 2 and 3, a EOS will be returned after should the data not be present as size read != size expected. The patch handle all 3 conditions (as it does when reading the moof or any of the incomple boxes)
Updated•8 years ago
|
Attachment #8609722 -
Flags: review?(cpearce) → review+
Updated•8 years ago
|
Attachment #8609723 -
Flags: review?(cpearce) → review+
Comment 39•8 years ago
|
||
Comment on attachment 8609739 [details] [diff] [review] Part5: Increase verbosity content for debugging purposes Review of attachment 8609739 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/MediaFormatReader.cpp @@ +722,5 @@ > // which overrides our "few more samples" threshold. > return > !aDecoder.mError && > aDecoder.HasPromise() && > + !aDecoder.mDemuxRequest.Exists() && This looks like a behaviour change, rather than a logging change, did you mean to include it in this patch? There's a few chunks like that below. @@ +997,5 @@ > MediaFormatReader::ResetDecode() > { > MOZ_ASSERT(OnTaskQueue()); > > + LOGV(""); Did you mean to log a blank line here, or should you be logging "MediaFormatReader::ResetDecode()" here instead? @@ +1034,5 @@ > return; > } > > RefPtr<nsIRunnable> task = > + NS_NewRunnableMethodWithArgs<TrackType, MediaData*>( Here you're dropping the refcounting on MediaData. That means the MediaData is not stored refcounted, and so it could end up being destroyed before you're done using it. You should put that back!
Attachment #8609739 -
Flags: review?(cpearce) → review+
Assignee | ||
Comment 40•8 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #39) > ::: dom/media/MediaFormatReader.cpp > @@ +722,5 @@ > > // which overrides our "few more samples" threshold. > > return > > !aDecoder.mError && > > aDecoder.HasPromise() && > > + !aDecoder.mDemuxRequest.Exists() && > > This looks like a behaviour change, rather than a logging change, did you > mean to include it in this patch? There's a few chunks like that below. It's not a behaviour change. We were checking if mDemuxRequest existed later after calling NeedInput() (in RequestDemuxSamples()) ; so it was cleaner to do so in NeedInput() NeedInput() taking the meaning: do we need to demux a new frame or feed data to the decoder. > > @@ +997,5 @@ > > MediaFormatReader::ResetDecode() > > { > > MOZ_ASSERT(OnTaskQueue()); > > > > + LOGV(""); > > Did you mean to log a blank line here, or should you be logging > "MediaFormatReader::ResetDecode()" here instead? the macro already adds the name of the function by default. > > @@ +1034,5 @@ > > return; > > } > > > > RefPtr<nsIRunnable> task = > > + NS_NewRunnableMethodWithArgs<TrackType, MediaData*>( > > Here you're dropping the refcounting on MediaData. That means the MediaData > is not stored refcounted, and so it could end up being destroyed before > you're done using it. You should put that back! no no.. That's what bug 1153295 was about. MediaData being refcounted (and as such having a Release/AddRef method, NS_NewRunnableMethodWithArgs will automatically increase the refcount for those.
Updated•8 years ago
|
Attachment #8609382 -
Flags: review?(ajones) → review+
Comment 42•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/380baadfdd70 https://hg.mozilla.org/integration/mozilla-inbound/rev/ad14e3d99bd4 https://hg.mozilla.org/integration/mozilla-inbound/rev/6ddd55a8331c https://hg.mozilla.org/integration/mozilla-inbound/rev/a0dcad6b9a4a https://hg.mozilla.org/integration/mozilla-inbound/rev/81f13a822200 https://hg.mozilla.org/integration/mozilla-inbound/rev/d5c950a3f731 https://hg.mozilla.org/integration/mozilla-inbound/rev/48053f713a9d https://hg.mozilla.org/integration/mozilla-inbound/rev/6befcdfdb56c https://hg.mozilla.org/integration/mozilla-inbound/rev/5cf8087b6812 https://hg.mozilla.org/integration/mozilla-inbound/rev/5c2429fd2699
Comment 43•8 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/380baadfdd70 https://hg.mozilla.org/mozilla-central/rev/ad14e3d99bd4 https://hg.mozilla.org/mozilla-central/rev/6ddd55a8331c https://hg.mozilla.org/mozilla-central/rev/a0dcad6b9a4a https://hg.mozilla.org/mozilla-central/rev/81f13a822200 https://hg.mozilla.org/mozilla-central/rev/d5c950a3f731 https://hg.mozilla.org/mozilla-central/rev/48053f713a9d https://hg.mozilla.org/mozilla-central/rev/6befcdfdb56c https://hg.mozilla.org/mozilla-central/rev/5cf8087b6812 https://hg.mozilla.org/mozilla-central/rev/5c2429fd2699
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox41:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Reporter | ||
Comment 48•8 years ago
|
||
Sorry for the delay in verifying this, and thanks for all the hard work + patience! Verified FIXED using Build identifier: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:41.0) Gecko/20100101 Firefox/41.0
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•