walterebert.com/playground/video/hls/ videos don't play

VERIFIED FIXED in Firefox 68

Status

()

defect
P2
normal
VERIFIED FIXED
2 months ago
Last month

People

(Reporter: eliza.balazs, Assigned: cpearce)

Tracking

(Regression, {regression})

Firefox 68
Firefox 69
ARM
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox66 unaffected, firefox67 unaffected, firefox68+ verified, firefox69+ verified)

Details

Attachments

(1 attachment, 1 obsolete attachment)

Environment:
Devices:
Nexus 6P (Android 8.1.0);
Sony Xperia Z5 Premium (Android 7.1.1);

Build: Nightly 68.0a1 (2019-05-15);

Steps to reproduce:

  1. Go to https://walterebert.com/playground/video/hls/
  2. Play a video;
  3. Observe the behavior.

Expected result:
The videos should play.

Actual result:
The videos don't play and the message "Video can't be played because the file is corrupt."/ "No video with supported format and MIME type found." is displayed.

Notes:
Screenshot: https://drive.google.com/file/d/14zAWCO3XX2146-78-tWse6RQN2Nus7Qx/view
After refreshing the page, one video can be played.

[Tracking Requested - why for this release]:
HLS playback is broken on Android, both Fennec and GeckoView.

Hmm, who can look at this? John?

Blocks: HLS
Flags: needinfo?(jolin)
Priority: -- → P2

Hi!

I tested this with Nexus 6P (Android 8.1.0) and I found a regression for this issue:

Last good revision: bb4e78986eed98bb51eaa06e20e9ea032d62f3fe
First bad revision: 90ebd256a92b6abfa5a701dfac54b40e45ca3351

Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=bb4e78986eed98bb51eaa06e20e9ea032d62f3fe&tochange=90ebd256a92b6abfa5a701dfac54b40e45ca3351

cpearce: It seems that Bug 1540573 introduced this issue, can you have a look?

Thanks!

Flags: needinfo?(cpearce)
Regressed by: 1540573

Based on the findings in comment #2 I'm assigning to you Chris for now.

Assignee: nobody → cpearce

Funny. The changes in Bug 1540573 make us download more data faster if we're on WiFi on Android than we used to. So if I turn off WiFi and view that page on 4G, it works as expected. There must be some race in the HLS backend...

The behaviour change from Bug 1540573 was to change the default preload action from "none" to "metadata" when we're on Wifi. When the preload action is "metadata", we'll suspend the download sometime soon after loading the first frame.

The error we see is NS_ERROR_NOT_INITIALIZED. What happens is when we load the page the media loads to HAVE_METADATA and since it's not playing, then goes dormant and we shutdown the decoder. When you hit play, the media comes out of dormant, and seeks to its previous position. We create a new decoder, wrapped by a MediaChangeMonitor. The first frame out of the HLSDemuxer is invalid (according to H264::GetFrameType()), and isn't a keyframe, so the MediaChangeMonitor doesn't have the extradata it needs to create a decoder. The frames which come out of the HLSDemuxer are in AnnexB format (the GeckoHlsVideoRenderer is prepending the initData as expected), and the Java side never set the out of band mExtraData fields.

The problem is that the HLSDemuxer underneath uses ExoPlayer in Java. We have a custom renderer which accepts the still compressed frames and forwards them out to the C++ demuxer. ExoPlayer's seek function doesn't seek to the keyframe before the seek target, it's an accurate seek. So when you seek, there's nothing to ensure the HLSDemuxer gives you a key frame. Additionally, the samples ExoPlayer outputs don't have a duration. So the GeckoHlsVideoRenderer queues samples output by ExoPlayer until it can calculate a duration for them, and only releases samples once they have a duration. The seek code doesn't clear that queue; meaning after a seek, the HlsDemuxer will output samples from the previous playback position for a while.

The GeckoHlsPlayer (Java) gets a callback when ExoPlayer detects a discontinuity, so we can clear the GeckoHlsVideoRenderer's queue of samples there in order to fix that. That however causes us to basically seek the video stream to the keyframe after the seek target. This is particularly jarring in the case where we're loading a new video, as we'll be showing the first frame, go dormant, and the seek to come out of dormant will advance us to the first keyframe after the start time. i.e., we'll skip ahead the first time a user plays. The same thing will happen any time we pause mid-stream long enough to go dormant.

Probably the best thing to do here is change the seek to seek to a point some arbitrary distance behind the seek target, clear the GeckoHlsVideoRenderer's queue of samples, and demux forwards, dropping frames until we hit a keyframe.

Flags: needinfo?(cpearce)

The Java ExoPlayer that we use for HLS support on Android does an accurate
seek, that is, it seeks to the frame at the seek target. This may not be a
keyframe, which we can start decoding at. So change the HLS seek to seek 2
seconds behind the seek target, and drop all frames up to the next keyframe.
This means that after a seek the HLSDemuxer will output a keyframe, and
hopefully (but we can't guarantee of course) it will lie behind the actual seek
target.

We also need to purge the GeckoHlsVideoRenderer's queue of frames which it
is holding onto in order to determine their durations, otherwise after a seek,
we'll get output from this queue of frames. That is, after a seek we would still
get a few frames from the old playback position.

This seek case is particularly problematic as we aggressively shutdown decoders
when the media is paused, including right after the load reaches
loadedmetadata, and we need to seek in order to recover from going dormant.

Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d637ee5ceb51
Change HLS seek to seek before seek target and drop frames until next keyframe. r=jya
Status: NEW → RESOLVED
Closed: Last month
Resolution: --- → FIXED
Target Milestone: --- → Firefox 69

Eliza: are you able to confirm this is fixed in Nightly, once a Nightly build which contains this fix is out?

Flags: needinfo?(jolin) → needinfo?(eliza.balazs)

(In reply to Chris Pearce [:cpearce (GMT+13)] from comment #9)

Eliza: are you able to confirm this is fixed in Nightly, once a Nightly build which contains this fix is out?

We talked about this in the media standup today. jya realised that this is only a regression because of bug 1538508. If not for bug 1538508, we'd ignore the error thrown when the HLS demuxer emits invalid samples. I tested this theory, and it seems correct. So we can fix this more cleanly, and succinctly, by adding a new CreateDecoderParam::Option and changing the MediaChangeMonitor::Decode() to not reject here if the option is set, and resolve in the same was as if we need a keyframe but don't have one. Which is basically what's happening here.

So we can backout d637ee5ceb51, and fix this as above, more succinctly, and without the seek to t-2.

Flags: needinfo?(eliza.balazs)

[Tracking Requested - why for this release]:

Backed out as per request.
backout link: https://hg.mozilla.org/mozilla-central/rev/ac9c72ac6aa02aab8ab9b9987d9bf288fe6a0aa6

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: Firefox 69 → ---

In bug 1538508 we changed the MediaChangeMonitor to raise an error when we try
to create a decoder and we don't yet have the H.264 extra data, to maintain the
assumptions made by OpenH264/WebRTC. This however doesn't play nice with the
HLS demuxer, which won't always give you extra data after a seek.

So change the MediaChangeMonitor to by default assume that it should just drop
frames until it has the extra data it needs to create a decoder. We can flag
this mode off in the WebRTC MediaDataDecoder.

Attachment #9068920 - Attachment is obsolete: true
Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b48d6e1b7507
Make MediaChangeMonitor by default not try to init decoder until it receives extradata. r=jya
Status: REOPENED → RESOLVED
Closed: Last monthLast month
Resolution: --- → FIXED
Target Milestone: --- → Firefox 69

Note that to verify this at the moment, a CI build from Treeherder will need to be used since Fennec Nightly builds are riding the 68 train to ESR.

Hi Chris, please request uplift for Beta68 if you think it's worth shipping this fix in 68.

Flags: needinfo?(cpearce)

Comment on attachment 9070445 [details]
Bug 1552145 - Make MediaChangeMonitor by default not try to init decoder until it receives extradata. r?jya

Beta/Release Uplift Approval Request

  • User impact if declined: HLS video streams may error on load instead of playing in Fennec for users on WiFi.
  • Is this code covered by automated tests?: Unknown
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Ask RyanVM which build you need to download, since it seems Fennec Nightly isn't being built from mozilla-central, open https://walterebert.com/playground/video/hls/ and check that videos play, reload page and check that videos still play.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It only affects HLS on android. It's a simple change that basically partially undoes bug 1538508.
  • String changes made/needed: None.
Flags: needinfo?(cpearce)
Attachment #9070445 - Flags: approval-mozilla-beta?

(In reply to Ritu Kothari (:ritu) from comment #16)

Hi Chris, please request uplift for Beta68 if you think it's worth shipping this fix in 68.

Uplift requested. It would be good if QA could verify this, but they will likely need RyanVM's help to identify what is the appropriate build to test with; I tested Fennec Nightly, and the bug still appears there, but it sounds like Fennec Nightly isn't running m-c at the moment.

Verified as fixed on 69.0a1 (2019-06-10) Fennec build from the CI of mozilla-central with Nexus 6P (Android 8.1.0).
Due to my findings, I'll mark this issue as verified on Firefox 69. Thanks!

Flags: qe-verify+

Comment on attachment 9070445 [details]
Bug 1552145 - Make MediaChangeMonitor by default not try to init decoder until it receives extradata. r?jya

android hls fix, approved for 68.0b10

Attachment #9070445 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Verified as fixed on Beta 68.0b10, Nightly 68.0a1 (2019-06-13) with Nexus 6P (Android 8.1.0).
Due to my findings, I'll mark this issue as verified on Firefox 68. Thanks.

Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.