Stalls or video corruption when using https://github.com/dailymotion/hls.js player

RESOLVED INVALID

Status

P2
normal
4 years ago
3 years ago

People

(Reporter: jya, Unassigned)

Tracking

(Depends on: 1 bug, Blocks: 1 bug)

Firefox Tracking Flags

(firefox44 affected)

(Reporter)

Description

4 years ago
Dailymotion made their HLS -> MSE player open source.

This bug tracks various issues noticed in order to identify if the issue is with our MSE implementation.

Attempting to watch NASA TV stream:
http://dailymotion.github.io/hls.js/demo/?src=http%3A%2F%2Fnasatv-lh.akamaihd.net%2Fi%2FNASA_101%40319270%2Fmaster.m3u8

shows either stalls or visual corruptions.

Comment 1

4 years ago
The first 10 seconds of the Big Buck Bunny video are playing very slowly for me on Windows.
Playback is fast enough after the first 10 seconds.
(Reporter)

Comment 2

4 years ago
Focusing on the stall first ; from https://bugzilla.mozilla.org/show_bug.cgi?id=1163076#c29

"yes initsegment are sent when switching resolution.
i also checked this stream with flashls
http://www.flashls.org/latest/examples/chromeless/?src=http%3A%2F%2Fnasatv-lh.akamaihd.net%2Fi%2FNASA_101%40319270%2Fmaster.m3u8
and I got the following warning msg printed in the console:
WARN:TS: discarding video tag, as AVC HEADER not found yet, fragment not starting with I-Frame ?

most probably the fragment is not starting with a keyframe, leading to this behaviour.
an option would be to filter out video mp4 samples until the first keyframe is found..."

per spec:
https://w3c.github.io/media-source/#sourcebuffer-init-segment-received

"3.3 Set the need random access point flag on all track buffers to true."

So when a new init segment is appended, all samples prior to the next keyframe will be dropped.
If the data being appended doesn't start with a keyframe and too many frames are dropped, this will lead to a gap in the data.

In this particular run, we first had the range (0.670844, 10.005333) appended, then an init segment, and then (10.004977, 20.007644). However, frame 10.004977 isn't a keyframe, and as such it is dropped. The first keyframe is found at 11.743244, so all frames up to that point are dropped (https://w3c.github.io/media-source/#sourcebuffer-coded-frame-processing step 10: "If the need random access point flag on track buffer equals true, then run the following steps: If the coded frame is not a random access point, then drop the coded frame and jump to the top of the loop to start processing the next coded frame.").
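
The dropping rule quoted above can be sketched as follows. This is a simplified model of that one step, not Firefox's or the spec's full coded frame processing algorithm; the `Frame` class and the timestamps 10.9 and 11.8 are made up for illustration, while 10.004977 and 11.743244 come from the run described above.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    pts: float       # presentation timestamp, in seconds
    keyframe: bool   # True if the frame is a random access point


def append_after_init_segment(frames):
    """Return the frames a track buffer would keep right after an init segment."""
    need_random_access_point = True  # set when the init segment is received
    kept = []
    for frame in frames:
        if need_random_access_point:
            if not frame.keyframe:
                continue  # not a random access point: drop it
            need_random_access_point = False
        kept.append(frame)
    return kept


# Hypothetical frame list; only the first and third timestamps are from the bug.
frames = [
    Frame(10.004977, False),
    Frame(10.9, False),
    Frame(11.743244, True),
    Frame(11.8, False),
]
kept = append_after_init_segment(frames)
print([f.pts for f in kept])
```

Everything before the first keyframe disappears, which is where the hole in the buffered range comes from.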

We thus end up with the buffered ranges:
(0.670844, 10.135777), (11.743244, 20.080755)

This is a gap of 1.6s, which we can't skip over as it's greater than our 125ms threshold.

As such we will stall.
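
As a worked check of the arithmetic, here is a small sketch that measures the hole between the two buffered ranges and compares it with the 125ms threshold. The function name `gaps` and the constant name are illustrative only and are not taken from the Firefox source.

```python
GAP_THRESHOLD = 0.125  # seconds; the 125ms figure quoted above


def gaps(ranges):
    """Return (gap_start, gap_end, length) for each hole between sorted ranges."""
    result = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        result.append((prev_end, next_start, next_start - prev_end))
    return result


buffered = [(0.670844, 10.135777), (11.743244, 20.080755)]
for start, end, length in gaps(buffered):
    verdict = "stall" if length > GAP_THRESHOLD else "small enough to skip"
    print(f"gap from {start} to {end}: {length:.3f}s ({verdict})")
```

The single gap measures roughly 1.6s, more than ten times the threshold.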

What we should do when there is such a wide gap in the video stream isn't defined, so there's room for interpretation.

1- You stall
2- If you have audio buffered, you display the last video frame for the entire duration of that gap and continue to play the audio

We do 1.

In the meantime, I strongly advise that hls.js perform the conversion and append data in such a way that no such gap is created.

This could be done by simply not truncating the previous media segment that early. It's perfectly fine to append more data and have the two streams overlap; the overlap will be handled by the sourcebuffer so that the gap is minimal.

For example, here you could have appended, say,
(0.670844, 15.135777), followed by an init segment and then (10.004977, 20.007644).
This would have resulted in a contiguous buffered range of (0.670844, 20.007644) (there could have been a small gap, but it would have been much less than 125ms).
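
To illustrate why the overlap closes the hole, the sketch below models the buffered range as a plain union of appended intervals. That is an assumption made for illustration: a real SourceBuffer also replaces overlapped frames. The second interval starts at 11.743244 because frames before the first keyframe are still dropped.

```python
def merge_ranges(ranges, tolerance=0.0):
    """Union of (start, end) intervals; holes no wider than `tolerance` are joined."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= tolerance:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Truncated early: the second append only survives from its first keyframe on.
print(merge_ranges([(0.670844, 10.135777), (11.743244, 20.080755)]))

# Longer first append, overlapping the second one.
print(merge_ranges([(0.670844, 15.135777), (11.743244, 20.007644)]))
```

With the longer first append, the two intervals overlap and collapse into one contiguous range.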
(Reporter)

Comment 3

4 years ago
(In reply to Marco Castelluccio [:marco] from comment #1)
> The first 10 seconds of the Big Buck Bunny video are playing very slowly for
> me on Windows.
> Playback is fast enough after the first 10 seconds.

I don't know for sure, but HLS, like DASH, being adaptive, the player can't detect how fast your connection / playback really is until playback has started. Maybe hls.js isn't starting with the lowest available resolution, which would ensure you could play it.

Comment 4

4 years ago
(In reply to Jean-Yves Avenard [:jya] from comment #2)
> Focusing on the stall first ; from
> https://bugzilla.mozilla.org/show_bug.cgi?id=1163076#c29
> 
> "yes initsegment are sent when switching resolution.
> i also checked this stream with flashls
> http://www.flashls.org/latest/examples/chromeless/?src=http%3A%2F%2Fnasatv-
> lh.akamaihd.net%2Fi%2FNASA_101%40319270%2Fmaster.m3u8
> and I got the following warning msg printed in the console:
> WARN:TS: discarding video tag, as AVC HEADER not found yet, fragment not
> starting with I-Frame ?
> 
> most probably the fragment is not starting with a keyframe, leading to this
> behaviour.
> an option would be to filter out video mp4 samples until the first keyframe
> is found..."
> 
> per spec:
> https://w3c.github.io/media-source/#sourcebuffer-init-segment-received
> 
> "3.3 Set the need random access point flag on all track buffers to true."
> 
> so when a new init segment is added ; all samples prior a keyframe will be
> dropped.
> If the data being appended doesn't start with a keyframe and too many of
> them are dropped ; this will lead to a gap of the data:
> 
> in this particular run, we had first the range (0.670844, 10.005333)
> appended ; then an init segment and then (10.004977 to 20.007644] ; however
> frame 10.004977 isn't a key frame, and as such dropped. The first keyframe
> is found at 11.743244 so any frames up to that point is dropped
> (https://w3c.github.io/media-source/#sourcebuffer-coded-frame-processing
> step 10: "If the need random access point flag on track buffer equals true,
> then run the following steps:   If the coded frame is not a random access
> point, then drop the coded frame and jump to the top of the loop to start
> processing the next coded frame."
> 
> We end up as such with the buffered range:
> (0.670844, 10.135777), (11.743244, 20.080755)


I guess there was a quality switch between your first and second fragments, as hls.js only appends an init segment in case of a discontinuity or a quality level switch.
You can check this by clicking on 'toggle metrics display' and then on 'metrics permalink';
you will see playlist/fragment loading/parsing/appending timings along with the video events.

You can also press 'toggle quality level controls' and force hls.js to stick with a given quality level. In that case you will see video artifacts at the beginning of playback but not afterwards.

> 
> this is a gap of 1.6s which we can't pass as it's greater than our 125ms
> threshold.
> 
> As such we will stall.
> 
> The behaviour on what we should be doing when there is such a wide gap in
> the video stream isn't defined. So there's room for interpretations.
> 
> 1- You stall
> 2- If you have audio buffered, you display the last video frame for the
> entire duration of that gap and continue to play the audio
> 
> We do 1.


Going for 2 would be great IMHO.
In our catalogs we have videos with a single video keyframe at the beginning of the clip but audio lasting until the end, and currently they do not play back with MSE, whereas Flash handles them without any issue.

> In the mean time, I strongly advise that the hls.js perform the conversion
> and append data such that no such gap is created as a result.
> 
> This could be done by simply not truncating the previous media segment that
> early. It's perfectly fine to append more data and have the two streams
> overlap ; the overlap will be handled by the sourcebuffer such as the gap is
> minimal.
> 

hls.js is not truncating as such; the logic is as follows:
fragment 0 level 0 is loaded, with A/V from t=0 to t=10s => buffer [0,10]
bandwidth is enough to switch to level 1 =>
fragment 1 level 1 is loaded, with A/V from t=10 to t=20s, but with first keyframe @ t=11.6s, ending up with buffer being [0,10][11.6,20]

If we want a defensive approach (to handle fragments not starting with a keyframe), we would have to do the following:
fragment 0 level 0 is loaded, with A/V from t=0 to t=10s => buffer [0,10]
bandwidth is enough to switch to level 1 => still load fragment 1 level 0 as a defensive approach to handle the hypothetical case in which the fragment does not start with a keyframe (which is not that common)
fragment 1 level 0 is loaded, with A/V from t=10 to t=20s => buffer [0,20]
fragment 1 level 1 is loaded, with A/V from t=10 to t=20s, but with first keyframe @ t=11.6s => buffer [0,20]
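
The two loading strategies above can be summarised in a small sketch. The function `fragments_to_load` and its parameters are hypothetical names chosen for illustration; this is not hls.js code or its API.

```python
def fragments_to_load(sn, current_level, target_level, defensive=False):
    """Return the (sequence number, level) pairs to fetch for fragment `sn`."""
    if target_level == current_level:
        return [(sn, current_level)]
    if defensive:
        # Fetch the fragment at the old level first, so the buffer stays
        # contiguous even if the new level's fragment lacks a leading keyframe.
        return [(sn, current_level), (sn, target_level)]
    return [(sn, target_level)]


print(fragments_to_load(1, 0, 1))                  # switch straight away
print(fragments_to_load(1, 0, 1, defensive=True))  # defensive double load
```

The defensive variant costs one extra fragment download per level switch.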


> Like in here; you could have added say
> (0.670844, 15.135777) followed by an init segment and then (10.004977 to
> 20.007644)
> this would have resulted in a contiguous buffered range of (0.670844,
> 20.007644) ; (there could have been a small gap, but it would have been much
> less than 125ms)

Comment 5

4 years ago
(In reply to Marco Castelluccio [:marco] from comment #1)
> The first 10 seconds of the Big Buck Bunny video are playing very slowly for
> me on Windows.
> Playback is fast enough after the first 10 seconds.

@Marco, I would like to get a better understanding of your issue. Could you possibly file a ticket on GitHub?
Or click on 'toggle metrics display' and then on 'metrics permalink' so that I can understand what is going on on your side?
thanks
Mangui
(Reporter)

Comment 6

4 years ago
I've narrowed down the test case here:
http://people.mozilla.org/~jyavenard//tests/mse_mp4/nasa.html

This gives me exactly the same buffered range in FF 42 and Chrome (a 1.6s gap), and Safari will too (though with their crazy calculation, it shows a gap between almost every frame).

Chrome will give a decode error; Firefox will play up to 10s (where the gap is) and stall; and Safari will just hang, requiring a force quit.

Now in FF, if you seek to 15s, on a Mac at least, it will show a frame that is 75% green.

It looks to me like the data being fed is nonsensical.
(Reporter)

Comment 7

4 years ago
(In reply to g.du.pontavice from comment #4)
> going for 2 would be great IMHO.
> In our catalogs we have videos for which we have one video keyframe at the beginning of the clip,
> but audio lasting until the end... and currently they are not playing back with MSE. but Flash 
> handles them without any issue.

The other issue that remains is: how do you seek in those?

Would you seek the video to the first keyframe again and the audio to the right point?
Knowing that the video is made of a single frame isn't that straightforward; the main behaviour is to seek audio and video together to ensure you get proper A/V sync once the seek is complete.
If we were to seek the video to one place but the audio to another, the A/V sync logic would need to be split in two.


> in case we want a defensive approach (to handle fragments not starting with
> keyframe), we would have to do the following :
> fragment 0 level 0 is loaded, with A/V from t=0 to t=10s => buffer [0,10]
> bandwidth is enough to switch to level 1 => still load fragment 1 level 0 as
> a defensive approach to handle the hypothetical case in which frag is not
> starting with KF (which is not that common)
> fragment 1 level 0 is loaded, with A/V from t=10 to t=20s => buffer [0,20]
> fragment 1 level 1 is loaded, with A/V from t=10 to t=20s, but with first
> keyframe @ t=11,6s. => buffer [0,20]

There is no requirement that an HLS segment start with a keyframe, just that a keyframe be found in that segment (though it's recommended that it does).

MPEG-TS being made of 188-byte packets, and assuming most web servers these days handle range requests properly, you could easily download a partial segment based on the published bitrate, so that you only download, say, 2s (which is the maximum recommended gap between keyframes).
So your defensive approach doesn't have to be that aggressive :)
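
A minimal sketch of that calculation, assuming the published bitrate is in bits per second and rounding up to a whole number of 188-byte packets. The function name and the 1,000,000 bps figure are illustrative, and because the published bitrate is only an average, the byte range is an estimate of where 2s of media ends.

```python
import math

TS_PACKET_SIZE = 188  # bytes per MPEG-TS packet


def partial_segment_range(bitrate_bps, seconds=2.0):
    """Return an HTTP Range header value covering roughly `seconds` of media."""
    wanted_bytes = bitrate_bps * seconds / 8
    packets = math.ceil(wanted_bytes / TS_PACKET_SIZE)
    last_byte = packets * TS_PACKET_SIZE - 1  # Range end offsets are inclusive
    return f"bytes=0-{last_byte}"


# Illustrative bitrate only; use the value published in the playlist.
print(partial_segment_range(1_000_000))
```

Aligning the end of the range to a packet boundary avoids handing the demuxer a truncated packet.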
(Reporter)

Comment 8

4 years ago
I've opened bug 1208054 to discuss the seeking behaviour for such videos with gaps.

Comment 9

4 years ago
(In reply to Jean-Yves Avenard [:jya] from comment #3)
> (In reply to Marco Castelluccio [:marco] from comment #1)
> > The first 10 seconds of the Big Buck Bunny video are playing very slowly for
> > me on Windows.
> > Playback is fast enough after the first 10 seconds.
> 
> not knowing for sure ; but DASH being adaptative ; it can't detect how fast
> your connection / playback speed really is until you've started playback.
> Maybe the hls.js isn't starting with the lowest resolution available first
> which would ensure you could play it.

I forgot to mention that playback was also slow on Windows during the first ten seconds when I sought back to the beginning (so I suppose the video was already loaded).

(In reply to g.du.pontavice from comment #5)
> (In reply to Marco Castelluccio [:marco] from comment #1)
> > The first 10 seconds of the Big Buck Bunny video are playing very slowly for
> > me on Windows.
> > Playback is fast enough after the first 10 seconds.
> 
> @Marco, i would like to get a better understanding  of your issue, could you
> eventually report a ticket on github ?
> or click on 'toggle metrics display' and then click on 'metrics permalink'
> so that i could understand what is going on your side ?
> thanks
> Mangui

I will definitely open a ticket on GitHub if I'm able to reproduce it again.

Comment 10

4 years ago
(In reply to Jean-Yves Avenard [:jya] from comment #6)
> I've narrowed the test case there:
> http://people.mozilla.org/~jyavenard//tests/mse_mp4/nasa.html
> 
> this gives me exactly the same buffered range between FF 42 and Chrome (a
> 1.6s gap) and so will Safari (though with their crazy calculation, it shows
> a gap almost between every frame)
> 
> Chrome will give a decode error ; Firefox will play up to 10s (where the gap
> is) and stall and Safari will just hang requiring a force quit
> 
> Now in FF, if you seek to 15s, on a mac at least it will show a grean that
> is 75% green.
> 
> Looks to me that the data fed is nonsensical

Indeed, there must be something wrong with the provided fMP4. I started investigating, and I am demuxing the same NALUs with hls.js as with flashls, with which decoding works fine.
I will investigate further and update once I get a better understanding of what is going on.

Comment 11

4 years ago
FYI, as the issue is not FF specific, I opened a ticket on Github
https://github.com/dailymotion/hls.js/issues/23
will keep you posted

Comment 12

3 years ago
FYI, this NASA stream is now working fine on Chrome, since hls.js now works around fragments not starting with a keyframe (on Chrome only),
i.e. the first video segment appended to the sourceBuffer will always start with a keyframe,
but the following ones might not.

But the stream is still broken with FF42:
http://dailymotion.github.io/hls.js/demo/?src=http%3A%2F%2Fnasatv-lh.akamaihd.net%2Fi%2FNASA_101%40319270%2Fmaster.m3u8

Comment 13

3 years ago
same thing with FF43.0b1
(Reporter)

Comment 14

3 years ago
Chrome is more forgiving with invalid streams, and its use of ffmpeg makes it handle most content no matter how incorrect it is...

I'm highly confident in our MSE code. We did have a change this week in Nightly that will affect the returned buffered ranges, but apart from that the MSE code is virtually identical between 42, 43, 44 and 45 (i.e. there's been no report of anything needing fixing).

Comment 15

3 years ago
ok, what would be the next step to investigate further ? I am willing to dig into the details but I don't know exactly how to step in efficiently.
(Reporter)

Comment 16

3 years ago
I'll have another look tomorrow and report on what's wrong.

Comment 17

3 years ago
thanks :jya, if there is any way I could help you let me know

Comment 18

3 years ago
FYI, I just checked this stream on FF42.0.1/Android Lollipop (Asus Zenfone 2) and the stream played fine for more than 5 minutes, without any visible artifacts. When I played the same stream at the same time on FF42/OS X, I could see artifacts during playback of either the first or the second appended segment.
(Reporter)

Comment 19

3 years ago
The effect you describe happens when you're missing an init segment for the data that follow. 

It works on Android (and it would on Windows) because the decoder takes annex B as input, with the SPS NAL in-band. 

So a side effect is that you never really have to bother with the init segment, so long as the H264 extradata is in the stream. The decoder will automatically reconfigure itself whenever it detects a new format.

The Mac decoder doesn't have such capabilities. 

It would work in chrome because they are using ffmpeg to decode and like the WMF decoder it can also take annex B. 

But those streams are invalid. Whenever the resolution is changing, there must be the associated init segment.
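
For readers unfamiliar with Annex B, the sketch below scans a byte stream for start codes and reads each NAL unit type, which is how an in-band SPS (type 7) can be spotted. It illustrates the format only and is not any decoder's actual code; the sample bytes are truncated placeholders, with the SPS and PPS prefixes borrowed from values appearing later in this bug.

```python
NAL_SPS = 7
NAL_PPS = 8


def annexb_nal_types(data: bytes):
    """Return the nal_unit_type of each NAL unit in an Annex B byte stream."""
    types = []
    pos = 0
    while True:
        # A 4-byte start code (00 00 00 01) also contains the 3-byte one.
        pos = data.find(b"\x00\x00\x01", pos)
        if pos < 0 or pos + 3 >= len(data):
            break
        types.append(data[pos + 3] & 0x1F)  # low 5 bits of the NAL header
        pos += 3
    return types


# Placeholder stream: truncated SPS, truncated PPS, then a stub IDR slice.
stream = (
    b"\x00\x00\x00\x01" + bytes.fromhex("274d401f")
    + b"\x00\x00\x00\x01" + bytes.fromhex("28febc80")
    + b"\x00\x00\x01" + bytes.fromhex("6588")
)
types = annexb_nal_types(stream)
print(types, "SPS in-band:", NAL_SPS in types)
```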

Comment 20

3 years ago
Thanks for your input. The init segment is appended when changing resolution; however, it is not re-appended when the SPS/PPS changes at the same resolution.
Are you suggesting that the observed behavior would match a stream whose SPS/PPS changes at the same resolution?
hls.js does not handle this case for the moment, but I will look into it.
(Reporter)

Comment 21

3 years ago
Yes, that would match, but only if the SPS was different from the first one seen when creating the decoder.

But that would mean they use different encoders within the same stream. How weird.

In theory we have code to detect SPS changes in-band, but a change will only be detected if it's the first NAL of the stream.

Comment 22

3 years ago
As expected,
SPS/PPS are different for each quality level and never change within a quality level.

What is weird is that while playing a single-level stream, we can still see the video artifacts,
for example using this link: http://dailymotion.github.io/hls.js/demo/?src=http%3A%2F%2Fnasatv-lh.akamaihd.net%2Fi%2FNASA_101%40319270%2Findex_1000_av-p.m3u8%3Fsd%3D10%26rebase%3Don


Here is a log with an SPS/PPS dump for this level.

loadSource:http://nasatv-lh.akamaihd.net/i/NASA_101@319270/index_1000_av-p.m3u8?sd=10&rebase=on hls.js:3790:7
attachMedia hls.js:3776:7
media source opened hls.js:1918:7
manifest loaded,1 level(s) found, first bitrate:undefined hls.js:573:11
switching to level 0 hls.js:591:9
(re)loading playlist for level 0 hls.js:597:11
level 0 loaded [144849162,144849164],duration:30 hls.js:1669:7
live playlist - first load, unknown sliding hls.js:1683:11
Loading 144849162 of [144849162 ,144849164],level 0, currentTime:0,bufferEnd:0.000 hls.js:1068:1
Demuxing 144849162 of [144849162 ,144849164],level 0 hls.js:1737:11
level switch detected hls.js:2697:9
manifest codec:undefined,ADTS data:type:2,sampleingIndex:3[48000kHz],channelConfig:2 hls.js:3232:7
parsed codec:mp4a.40.2,rate:48000,nb channel:2 hls.js:3171:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
selected A/V codecs for sourceBuffers:mp4a.40.2,avc1.4d401f hls.js:1768:11
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/1.254/10.097/1.221/9.997/263 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/0.000/9.984/0.000/9.984/468 hls.js:1798:1
media buffered :  hls.js:1855:1
SN just loaded, load next one: 144849163 hls.js:1053:21
Loading 144849163 of [144849162 ,144849164],level 0, currentTime:9.984,bufferEnd:9.984 hls.js:1068:1
Demuxing 144849163 of [144849162 ,144849164],level 0 hls.js:1737:11
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
Video/PTS/DTS adjusted:899700/899700 hls.js:5137:15
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/9.997/20.107/9.997/20.007/300 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/9.984/19.989/9.984/19.989/469 hls.js:1798:1
media buffered : [1.253644,19.989333] hls.js:1855:1
buffer end: 0 is located too far from the end of live sliding playlist, media position will be reseted to: 1.254 hls.js:1001:1
Loading 144849162 of [144849162 ,144849164],level 0, currentTime:0,bufferEnd:1.254 hls.js:1068:1
Demuxing 144849162 of [144849162 ,144849164],level 0 hls.js:1737:11
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/0.000/10.097/0.000/9.997/300 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/0.000/9.984/0.000/9.984/468 hls.js:1798:1
media buffered : [1.253644,10.1068][11.263155,19.989333] hls.js:1855:1
Loading 144849163 of [144849162 ,144849164],level 0, currentTime:1.2536666666666667,bufferEnd:10.107 hls.js:1068:1
Demuxing 144849163 of [144849162 ,144849164],level 0 hls.js:1737:11
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
Video/PTS/DTS adjusted:899700/899700 hls.js:5137:15
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/9.997/20.107/9.997/20.007/300 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/9.984/19.989/9.984/19.989/469 hls.js:1798:1
media buffered : [1.253644,19.989333] hls.js:1855:1
SN just loaded, load next one: 144849164 hls.js:1053:21
Loading 144849164 of [144849162 ,144849164],level 0, currentTime:2.008332,bufferEnd:19.989 hls.js:1068:1
Demuxing 144849164 of [144849162 ,144849164],level 0 hls.js:1737:11
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
Video/PTS/DTS adjusted:1800600/1800600 hls.js:5137:15
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/20.007/30.017/20.007/30.017/300 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/19.989/29.995/19.989/29.995/469 hls.js:1798:1
media buffered : [1.253644,29.994666] hls.js:1855:1
level 0 loaded [144849163,144849165],duration:30 hls.js:1669:7
live playlist sliding:9.997 hls.js:1677:13
SN just loaded, load next one: 144849165 hls.js:1053:21
Loading 144849165 of [144849163 ,144849165],level 0, currentTime:8.115166,bufferEnd:29.995 hls.js:1068:1
Demuxing 144849165 of [144849163 ,144849165],level 0 hls.js:1737:11
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
Video/PTS/DTS adjusted:2701500/2701500 hls.js:5137:15
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/30.017/39.992/30.017/39.992/299 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/29.995/40.000/29.995/40.000/469 hls.js:1798:1
media buffered : [1.253644,40] hls.js:1855:1
level 0 loaded [144849164,144849166],duration:30 hls.js:1669:7
live playlist sliding:20.007 hls.js:1677:13
Loading 144849166 of [144849164 ,144849166],level 0, currentTime:18.169416,bufferEnd:40.000 hls.js:1068:1
Demuxing 144849166 of [144849164 ,144849166],level 0 hls.js:1737:11
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
SPS:274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 hls.js:2977:1
PPS:28febc80 hls.js:3007:1
Video/PTS/DTS adjusted:3608339/3599250 hls.js:5137:15
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:video/40.093/50.002/39.992/50.002/300 hls.js:1798:1
parsed data, type/startPTS/endPTS/startDTS/endDTS/nb:audio/40.000/49.984/40.000/49.984/468 hls.js:1798:1
media buffered : [1.253644,49.984]
(Reporter)

Comment 23

3 years ago
(In reply to g.du.pontavice from comment #22)
> as expected,
> SPS/PPS are different for each quality level and never changing on each
> quality level.

then we're back to the issue of init segment not matching the content.

The only way you can get that type of corruption on Mac is when you open a decoder for a particular SPS/PPS and feed it NALs from a different stream.

Comment 24

3 years ago
then if init segment is not matching the content, that would mean that all other browsers on which this transmuxing is working fine are either able to detect in band SPS/PPS or overcome my potentially badly formatted init segment.

I guess I should focus on potential issues in avc1 box generation ?
https://github.com/dailymotion/hls.js/blob/master/src/remux/mp4-generator.js#L313-L371

do you have a pointer to the FF source code handling this parsing ?
(Reporter)

Comment 25

3 years ago
(In reply to g.du.pontavice from comment #24)
> then if init segment is not matching the content, that would mean that all
> other browsers on which this transmuxing is working fine are either able to
> detect in band SPS/PPS or overcome my potentially badly formatted init
> segment.

As I mentioned before: Chrome uses ffmpeg, which works with Annex B (it looks at the in-band SPS, not the one in the init segment); the same goes for Internet Explorer/Edge, which only run on Windows, where Windows Media Foundation also works with Annex B content: no init segment, checking the SPS/PPS in-band.

I can reproduce the same issue with Safari on mac.

> 
> I guess I should focus on potential issues in avc1 box generation ?
> https://github.com/dailymotion/hls.js/blob/master/src/remux/mp4-generator.
> js#L313-L371
> 
> do you have a pointer ou FF source code handling this parsing ?

Oh, you are generating the SPS/PPS found inside the init segment yourself?
Could it be that you're generating the init segment data incorrectly then?

that would make sense and explain the current behaviour of decoders on mac.

The work we do to convert AVCC to AnnexB is there:
http://mxr.mozilla.org/mozilla-central/source/media/libstagefright/binding/AnnexB.cpp#20
the code to "decode" the SPS is found there:
http://mxr.mozilla.org/mozilla-central/source/media/libstagefright/binding/H264.cpp#468

BTW: https://github.com/dailymotion/hls.js/blob/master/src/remux/mp4-generator.js#L317 looks extremely suspicious to me: it appears that you are assuming that an SPS has a constant size; that is definitely not the case.
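
To illustrate the point about SPS size, here is a minimal sketch of reading an avcC payload by its stored lengths rather than by a fixed size. It is not code from hls.js or Gecko: `hexToBytes` and `parseAvcC` are hypothetical helpers, and the sample bytes are the avcC payload posted in comment 28 of this bug (without the 8-byte box header).

```javascript
// Hypothetical helpers for illustration; not part of hls.js or Gecko.
function hexToBytes(hex) {
  return Uint8Array.from(hex.match(/../g), (pair) => parseInt(pair, 16));
}

// Parse the payload of an avcC box (the bytes after the 8-byte box header).
function parseAvcC(payload) {
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  let offset = 5; // skip version, profile, compatibility, level, lengthSizeMinusOne

  // Each parameter set is preceded by its own 16-bit big-endian length,
  // so the SPS size is read from the data and never assumed.
  const readSets = (count) => {
    const sets = [];
    for (let i = 0; i < count; i++) {
      const length = view.getUint16(offset);
      offset += 2;
      sets.push(payload.subarray(offset, offset + length));
      offset += length;
    }
    return sets;
  };

  const spsCount = payload[offset++] & 0x1f; // count is in the low 5 bits
  const sps = readSets(spsCount);
  const ppsCount = payload[offset++];
  const pps = readSets(ppsCount);
  return { nalLengthSize: (payload[4] & 0x03) + 1, sps, pps };
}

// avcC payload from comment 28.
const payload = hexToBytes(
  "014d081fffc10023" +
  "274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0" +
  "01" + "0004" + "28febc80"
);
const parsed = parseAvcC(payload);
console.log(parsed.nalLengthSize, parsed.sps[0].length, parsed.pps[0].length);
```

A stream with a different SPS would simply carry a different 16-bit length in front of it; nothing else in the parser changes.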
(Reporter)

Comment 26

3 years ago
this is our code converting annexB into AVCC (including generating the H264 extradata)

http://mxr.mozilla.org/mozilla-central/source/media/libstagefright/binding/AnnexB.cpp#217
The first step is calling ParseNALUnits, which scans for the 00000001 (or 000001) start code and replaces it with a 4-byte NAL size (but I'm guessing you already do that, as mp4 can only contain AVCC NALs)

once the data has been converted into AVCC, we then scan the NALs to find the SPS and PPS:
http://mxr.mozilla.org/mozilla-central/source/media/libstagefright/binding/AnnexB.cpp#237
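
The two steps described above can be sketched roughly as follows. This is an illustration only, not the Gecko implementation: `hexToBytes`, `splitAnnexB` and `toLengthPrefixed` are hypothetical names, and the input is a hand-built Annex B stream containing the SPS and PPS logged earlier in this bug.

```javascript
// Hypothetical helpers for illustration; not the code from AnnexB.cpp.
function hexToBytes(hex) {
  return Uint8Array.from(hex.match(/../g), (pair) => parseInt(pair, 16));
}

// Split an Annex B byte stream into raw NAL units.
// A start code is 00 00 01, optionally preceded by an extra 00.
function splitAnnexB(bytes) {
  const starts = [];
  for (let i = 0; i + 2 < bytes.length; i++) {
    if (bytes[i] === 0 && bytes[i + 1] === 0 && bytes[i + 2] === 1) {
      starts.push(i + 3); // first byte after the start code
      i += 2;
    }
  }
  const nals = [];
  for (let n = 0; n < starts.length; n++) {
    let end = n + 1 < starts.length ? starts[n + 1] - 3 : bytes.length;
    // Drop zero bytes in front of the next start code (a NAL never ends in 00).
    while (end > starts[n] && bytes[end - 1] === 0) end--;
    nals.push(bytes.subarray(starts[n], end));
  }
  return nals;
}

// Re-emit the NAL units in AVCC form: 4-byte big-endian size, then the NAL.
function toLengthPrefixed(nals) {
  const total = nals.reduce((sum, nal) => sum + 4 + nal.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  let offset = 0;
  for (const nal of nals) {
    view.setUint32(offset, nal.length); // DataView writes big-endian by default
    out.set(nal, offset + 4);
    offset += 4 + nal.length;
  }
  return out;
}

const spsHex = "274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0";
const stream = hexToBytes("00000001" + spsHex + "000001" + "28febc80");
const nals = splitAnnexB(stream);
const avcc = toLengthPrefixed(nals);

// The NAL type is the low 5 bits of the first byte: 7 is SPS, 8 is PPS.
const spsNals = nals.filter((nal) => (nal[0] & 0x1f) === 7);
const ppsNals = nals.filter((nal) => (nal[0] & 0x1f) === 8);
console.log(nals.length, spsNals.length, ppsNals.length, avcc.length);
```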

Suggestion: use AVC3 rather than AVC1. AVC3 is similar to Annex B in that you don't need to have the SPS/PPS in the init segment; you can have them in-band instead.

It would be much easier for you to generate AVC3 content coming from MPEG-TS and Annex B.
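
As a rough illustration of what switching the sample entry means on the MSE side, the sketch below builds the `codecs` string from the first bytes of the SPS logged in this bug. `codecString` is a hypothetical helper, not an hls.js function; the string layout (sample entry name, then profile, constraint flags and level in hex) is the usual one for AVC. Whether a given browser accepts the avc3 variant is exactly what `MediaSource.isTypeSupported` is meant to answer.

```javascript
// Hypothetical helper: build an AVC codecs string from a raw SPS NAL unit.
// sps[0] is the NAL header; sps[1..3] are profile, constraint flags and level.
function codecString(sps, sampleEntry = "avc1") {
  const hex = (byte) => byte.toString(16).padStart(2, "0");
  return `${sampleEntry}.${hex(sps[1])}${hex(sps[2])}${hex(sps[3])}`;
}

// First four bytes of the SPS logged in this bug.
const spsStart = Uint8Array.from([0x27, 0x4d, 0x40, 0x1f]);
const avc1 = codecString(spsStart);
const avc3 = codecString(spsStart, "avc3");
const mimeType = `video/mp4; codecs="${avc3}"`;
console.log(avc1, avc3);

// Only meaningful in a browser; MediaSource does not exist in Node.
if (typeof MediaSource !== "undefined") {
  console.log(mimeType, MediaSource.isTypeSupported(mimeType));
}
```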

Comment 27

3 years ago
FWIW, Safari may have trouble with AVC3, at least it has with dash.js (not sure whether it's dash.js or Safari), but then Safari has trouble with hls.js as well anyway.

Comment 28

3 years ago
indeed I got the issue only in FF42/OSX, not reproducible in FF42/Win7

regarding the initsegment generation

I rechecked and it seems as expected, unless i misunderstand the spec ?

raw SPS extracted from Annex-B (without 4 byte length): 274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0
raw PPS extracted from Annex-B (without 4 byte length) : 28febc80

mp4 avcC box :

0000003a // box size
61766343 // avcC
01       // configurationVersion    
4d       // AVCProfileIndication
08       // profile_compatibility
1f       // AVCLevelIndication
ff       // ‘111111’b + lengthSizeMinusOne, hardcoded to 4 bytes
c1       // ‘111’b + numOfSequenceParameterSets
0023     // sequenceParameterSetLength
274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 // sequenceParameterSetNALUnit
01       // numOfPictureParameterSets
0004     // pictureParameterSetLength
28febc80 // pictureParameterSetNALUnit

I am not clear whether the raw sequenceParameterSetNALUnit and pictureParameterSetNALUnit NAL units should also be prefixed with their 4-byte length ?

I tried with the following avcC but I got a MEDIA ERROR raised straight away so I might do things incorrectly ...

00000042
61766343
01
4d
08
1f
ff
c1
0027
00000023274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0 // sequenceParameterSetNALUnit prepended by 4 bytes length
01
0008
0000000428febc80 // pictureParameterSetNALUnit prepended by 4 bytes length
(Reporter)

Comment 29

3 years ago
It does need the size before each SPS and PPS NAL, but that size is stored in a word (16 bits).

Here is the relevant bit creating the avcC box content from Annex B (link provided above):
    if (nalType == 0x7) { /* SPS */
      numSps++;
      spsw.WriteU16(nalLen);
      spsw.Write(p, nalLen);
    } else if (nalType == 0x8) { /* PPS */
      numPps++;
      ppsw.WriteU16(nalLen);
      ppsw.Write(p, nalLen);
    }
  }

  if (numSps && sps.length() > 5) {
    extradata->AppendElement(1);        // version
    extradata->AppendElement(sps[3]);   // profile
    extradata->AppendElement(sps[4]);   // profile compat
    extradata->AppendElement(sps[5]);   // level
    extradata->AppendElement(0xfc | 3); // nal size - 1
    extradata->AppendElement(0xe0 | numSps);
    extradata->AppendElements(sps.begin(), sps.length());
    extradata->AppendElement(numPps);
    if (numPps) {
      extradata->AppendElements(pps.begin(), pps.length());
    }
  }
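
For comparison, the same layout can be sketched in JavaScript. This is a loose transcription of the excerpt above under stated assumptions, not the hls.js generator: `buildAvcC`, `wrapBox`, `hexToBytes` and `toHex` are hypothetical names, and the inputs are raw SPS/PPS NAL units (no length prefix), so profile, compatibility and level sit at indices 1 to 3 instead of 3 to 5.

```javascript
// Hypothetical helpers for illustration; not part of hls.js or Gecko.
function hexToBytes(hex) {
  return Uint8Array.from(hex.match(/../g), (pair) => parseInt(pair, 16));
}
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Build the avcC payload from raw SPS and PPS NAL units.
function buildAvcC(spsList, ppsList) {
  const first = spsList[0];
  const parts = [
    1,                     // configurationVersion
    first[1],              // AVCProfileIndication
    first[2],              // profile_compatibility
    first[3],              // AVCLevelIndication
    0xfc | 3,              // reserved bits + lengthSizeMinusOne (4-byte NAL sizes)
    0xe0 | spsList.length, // reserved bits + numOfSequenceParameterSets
  ];
  // Each parameter set is written as a 16-bit big-endian size plus the NAL.
  for (const sps of spsList) parts.push(sps.length >> 8, sps.length & 0xff, ...sps);
  parts.push(ppsList.length); // numOfPictureParameterSets
  for (const pps of ppsList) parts.push(pps.length >> 8, pps.length & 0xff, ...pps);
  return Uint8Array.from(parts);
}

// Wrap a payload in an MP4 box: 32-bit big-endian size, 4-character type, payload.
function wrapBox(type, payload) {
  const box = new Uint8Array(8 + payload.length);
  new DataView(box.buffer).setUint32(0, box.length);
  for (let i = 0; i < 4; i++) box[4 + i] = type.charCodeAt(i);
  box.set(payload, 8);
  return box;
}

const spsHex = "274d401fb9180a00b76022000007d20001d4c1c48001e8480004c4b77bdc07c22114e0";
const ppsHex = "28febc80";
const avcCPayload = buildAvcC([hexToBytes(spsHex)], [hexToBytes(ppsHex)]);
const avcCBox = wrapBox("avcC", avcCPayload);
console.log(toHex(avcCBox));
```

With the SPS and PPS from comment 28 the resulting box is 0x3a bytes long, the same size as the box posted there.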

Regarding AVC3: all BBC DASH streams use avc3; they told me that to get it to work with IE (and maybe Edge) they had to tag it as AVC1.

Comment 30

3 years ago
OK, I updated the avcC box generation, as doing it this way is more efficient and a few things were not identical.
But the SPS/PPS insertion in the avcC box was correct.

see new code here
https://github.com/dailymotion/hls.js/blob/master/src/remux/mp4-generator.js#L318-L344

but I still get video artifacts on FF42/OSX.
http://dailymotion.github.io/hls.js/demo/?src=http%3A%2F%2Fnasatv-lh.akamaihd.net%2Fi%2FNASA_101%40319270%2Findex_1000_av-p.m3u8%3Fsd%3D10%26rebase%3Don

I am quite confident about the SPS/PPS extracted from Annex B and about the avcC creation, as I am creating the same avcC box with flashls (bit-for-bit identical now) and it is working fine.

Are there any other boxes in the init segment that I should check carefully ?

Comment 31

3 years ago
(In reply to Jean-Yves Avenard [:jya] from comment #29)
> Regarding AVC3; all BBC dash streams use avc3; they told me that to get it
> to work with IE (and maybe adage) they had to tag it as AVC1

In Safari masquerading as AVC1 does not help, also according to the BBC guys (they have filed bug reports with Apple). At least Safari 9 now gives the correct answer in `MediaSource.isTypeSupported` - before it returned true for codecs="bogus" as well.
The best you can get for Safari with MP4Box for instance is with `MP4Box -bs-switching merge`.
See: https://github.com/Dash-Industry-Forum/dash.js/issues/734

Comment 32

3 years ago
Jean-Yves - is there still an issue here?
(Reporter)

Comment 33

3 years ago
There is. Guillaume and I have isolated the issue, and it has something to do with the Apple VT decoder. The same issue can be reproduced in Safari.
Depends on: 1239178
(Reporter)

Comment 34

3 years ago
Invalid.

The issue is in hls.js, which is a bit too enthusiastic when demuxing the TS packets.
see:
https://github.com/dailymotion/hls.js/issues/197#issuecomment-175652540
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → INVALID