Closed Bug 1491132 Opened 6 years ago Closed 6 years ago

Cannot finish decoding for a FLAC audio on nightly

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla64
Tracking Status
firefox-esr60 --- verified
firefox62 --- wontfix
firefox63 --- verified
firefox64 --- verified

People

(Reporter: chunmin, Assigned: jya)

References

Details

(Keywords: regression)

Attachments

(1 file)

Step to reproduce:
1. Run `$ MOZ_LOG="MediaDemuxer:5, MediaDecoder:5" ./mach run` to open nightly
2. open https://bit.ly/2p6KGEW
3. Seek audio to 4:55
4. You will find the audio get stuck at the end and the playback button doesn't change from "pause" to "play". The log shows that we have a "NS_ERROR_DOM_MEDIA_DECODE_ERR" error.

Log for the bug
-------------------------------------------------
...

[Child 27156: MediaPlayback #2]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 OnAudioDecoded [295601632,295706121]
[Child 27156: MediaPlayback #2]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 Queueing audio task - queued=7, decoder-queued=0
[Child 27156: MediaPDecoder #2]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetSamples(1) Begin offset=24379238 mParsedFramesDuration=295.706121 mTotalFrameLen=24176109
[Child 27156: MediaPDecoder #2]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] FindNext() Begin offset=24379238 mParsedFramesDuration=295.706121 mTotalFrameLen=24176109
[Child 27156: MediaPDecoder #2]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] FindNext() End time=295.706122 offset=24383136 mParsedFramesDuration=295.810611 mTotalFrameLen=24180007
[Child 27156: MediaPDecoder #2]: D/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetNextFrame() Begin(time=295.706122 offset=24379230 size=3898)
[Child 27156: MediaPDecoder #2]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetSamples() End mSamples.Length=1 aNumSamples=-1 offset=24383136 mParsedFramesDuration=295.810611 mTotalFrameLen=24180007
[Child 27156: MediaPlayback #3]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 OnAudioDecoded [295706122,295810611]
[Child 27156: MediaPlayback #3]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 Queueing audio task - queued=8, decoder-queued=0
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetSamples(1) Begin offset=24383136 mParsedFramesDuration=295.810611 mTotalFrameLen=24180007
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] FindNext() Begin offset=24383136 mParsedFramesDuration=295.810611 mTotalFrameLen=24180007
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] FindNext() End time=295.810612 offset=24387022 mParsedFramesDuration=295.810612 mTotalFrameLen=24183891
[Child 27156: MediaPDecoder #1]: D/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetNextFrame() Begin(time=295.810612 offset=24383128 size=3884)
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetSamples() End mSamples.Length=1 aNumSamples=-1 offset=24387022 mParsedFramesDuration=295.810612 mTotalFrameLen=24183891
[Child 27156: MediaPlayback #3]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 OnAudioDecoded [295810612,295915101]
[Child 27156: MediaPlayback #3]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 Queueing audio task - queued=9, decoder-queued=0
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetSamples(1) Begin offset=24387022 mParsedFramesDuration=295.810612 mTotalFrameLen=24183891
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] FindNext() Begin offset=24387022 mParsedFramesDuration=295.810612 mTotalFrameLen=24183891
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] FindNext() End time=51.631020 offset=24391118 mParsedFramesDuration=295.810612 mTotalFrameLen=24187997
[Child 27156: MediaPDecoder #1]: D/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetNextFrame() Begin(time=51.631020 offset=24387012 size=4106)
[Child 27156: MediaPDecoder #1]: V/MediaDemuxer FlacTrackDemuxer[0x126aeed60] GetSamples() End mSamples.Length=1 aNumSamples=-1 offset=24391118 mParsedFramesDuration=295.810612 mTotalFrameLen=24187997
[Child 27156: MediaPlayback #1]: V/MediaDecoder MediaDecoderStateMachine[0x126a1e000] Decoder=121c91800 OnAudioNotDecoded aError=NS_ERROR_DOM_MEDIA_DECODE_ERR
[Child 27156, MediaPlayback #1] WARNING: Decoder=121c91800 Decode error: NS_ERROR_DOM_MEDIA_DECODE_ERR (0x806e0004) - virtual mozilla::MediaResult mozilla::FFmpegAudioDecoder<46465650>::DoDecode(mozilla::MediaRawData *, uint8_t *, int, bool *, mozilla::MediaDataDecoder::DecodedData &): FFmpeg audio error:-1094995529: file /Users/cchang/Work/gecko-dev/dom/media/MediaDecoderStateMachine.cpp, line 3429
Did it play un 61 or 62 ?

If you let it play without seeking, does the same error occurs as well?
Flags: needinfo?(cchang)
I don't have 61 but it works well on 62.
It still fails even without seeking

...
...
[Child 27585: MediaPlayback #2]: V/MediaDecoder MediaDecoderStateMachine[0x12d38d000] Decoder=12a3b7000 Queueing audio task - queued=0, decoder-queued=1
[Child 27585: MediaPlayback #3]: V/MediaDecoder MediaDecoderStateMachine[0x12d38d000] Decoder=12a3b7000 OnAudioNotDecoded aError=NS_ERROR_DOM_MEDIA_DECODE_ERR
[Child 27585, MediaPlayback #3] WARNING: Decoder=12a3b7000 Decode error: NS_ERROR_DOM_MEDIA_DECODE_ERR (0x806e0004) - virtual mozilla::MediaResult mozilla::FFmpegAudioDecoder<46465650>::DoDecode(mozilla::MediaRawData *, uint8_t *, int, bool *, mozilla::MediaDataDecoder::DecodedData &): FFmpeg audio error:-1094995529: file /Users/cchang/Work/gecko-dev/dom/media/MediaDecoderStateMachine.cpp, line 3429
Flags: needinfo?(cchang)
Sorry, I'm confused. It works in 62 or it fails just the same?

Something weird with the file, with safari the audio plays fine until currentTime gets to 4m55, and then
Assignee: nobody → jyavenard
Blocks: 1488065
Keywords: regression
It works in 62.
In nightly on my laptop, it fails even without seeking.
What version of central were you using? I can't reproduce with today's nightly

There was a botched attempt when bug 1488065 went in when bug 1487792 didn't. If bug 1487792 you would have seen the decoding failure.

So depending the time at which you pulled inbound/autoland it could have failed (not in central though)
No longer blocks: 1488065
Flags: needinfo?(cchang)
Keywords: regression
hmmm, interesting..

I couldn't reproduce the error as I had first stored the file locally. But when using the URL provided (https://bit.ly/2p6KGEW) then it does fail.
Flags: needinfo?(cchang)
I can reproduce with version as far up to 2018-01-01, this isn't a recent regression.
Had a closer look today. It's not the flac demuxer nor the flac decoder.

When reading the end of the file, when it's local you read:
MRI:ReadAt(offset:24387022 aCount:4096 aBytes:679): OK
MRI:ReadAt(offset:24387012 aCount:689 aBytes:689): OK
MRI:ReadAt(offset:24387701 aCount:4096 aBytes:0): OK

The file is 24387701 bytes so this all makes sense.

Now, over the http channel however we see:
MRI:ReadAt(offset:24383128 aCount:3884 aBytes:3884): OK
MRI:ReadAt(offset:24387022 aCount:4096 aBytes:4096): OK
MRI:ReadAt(offset:24391118 aCount:4096 aBytes:0): OK
MRI:ReadAt(offset:24387012 aCount:4106 aBytes:4106): OK

so it reads at offset 24387022 (the same as earlier) but rather than return 679 bytes (24387701-24387022) it returns 4096. It treats the content as if it was 24391118 bytes long.

24391118 is a multiply of 32768 (32768*41077) 32768 being the size of a MediaCache block.

We can see in MediaCacheStream::FlushPartialBlockInternal (https://searchfox.org/mozilla-central/source/dom/media/MediaCache.cpp#2153) is called once MediaCacheStream::NotifyDataEndedInternal

mMediaCache->AllocateAndWriteBlock (https://searchfox.org/mozilla-central/source/dom/media/MediaCache.cpp#2169) will always write a block at a time .

Once the MediaResource attempts to read from the MediaCacheStream, it will successfully read past the end of the real resource, returning here an extra 4775 bytes of rubbish.

This is what's causing the decoding failure later.

The bug is in the MediaCache, and has been for a while by the look of things.

This is a serious bug, as potentially all content read from the MediaCache will contain rubbish at its end
Severity: normal → major
Priority: -- → P2
Prior bug 1416085, reads were clamped to the content's duration (if known). It appears that the new code relied on MediaCacheReadBlockFromCache to properly account for the end of content.

However, this was not the case, the MediaCache always reads (and write) one full block at a time regardles of the size requested (a block is 32768 bytes).

Rather than clamping in the Read() method as it used to be, we clamp in ReadBlockFromCache as such safety will benefit other callers that would have otherwise also returned gargage reads.
Comment on attachment 9009406 [details]
Bug 1491132 - Clamp cache reads to content size.

Gerald Squelart [:gerald] has approved the revision.
Attachment #9009406 - Flags: review+
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e0bd55a3e557
Clamp cache reads to content size. r=gerald
https://hg.mozilla.org/mozilla-central/rev/e0bd55a3e557
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Comment on attachment 9009406 [details]
Bug 1491132 - Clamp cache reads to content size.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: intermittent decoding errors
Fix Landed on Version: 64
Risk to taking this patch (and alternatives if risky): 
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

Approval Request Comment
[Feature/Bug causing the regression]: 1416085
[User impact if declined]: intermittent decoding errors
[Is this code covered by automated tests?]: no
[Has the fix been verified in Nightly?]: no, manually tested by dev
[Needs manual test from QE? If yes, steps to reproduce]: steps in bug description 
[List of other uplifts needed for the feature/fix]: no
[Is the change risky?]: no
[Why is the change risky/not risky?]: we clamp reads to element size, which is what used to be done prior 1416085
[String changes made/needed]: none
Attachment #9009406 - Flags: approval-mozilla-esr60?
Attachment #9009406 - Flags: approval-mozilla-beta?
This seems to be broken since Firefox 59 and just got reported, can't it just ride the trains to 64?
Flags: needinfo?(jyavenard)
(In reply to Pascal Chevrel:pascalc from comment #15)
> This seems to be broken since Firefox 59 and just got reported, can't it
> just ride the trains to 64?

I would prefer not, while it just got reported, there's no doubt lots of people would have been affected by it.
We've had several issues reported about some videos being corrupted, including a rise an intermittent failures in the past 6 months, I wouldn't be surprised they were all related to this bug.
Flags: needinfo?(jyavenard)
Comment on attachment 9009406 [details]
Bug 1491132 - Clamp cache reads to content size.

Approved for 63 Beta 8 as it may be the cause for intermittent failures in the last months, thanks.
Attachment #9009406 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Depends on: 1492365
I managed to reproduce the initial issue using Nightly from 2018-09-13 using steps provided in comment0. I verified that the error does not occur anymore and "pause" button does change to "play" using Latest Beta and Nightly build from taskcluster across platforms (Windows 10 64bit, Ubuntu 18.04 64bit and macOS 10.13). 

I did saw something interesting: If I seek to about 4:50 and leave it to play it will stop at about 4:54 for 1-2 sec and then jump directly to 4:56 (I saw this on both builds on all platforms). Also I saw an error in the logs, not sure if this is something that is concerning:
"[Child 3416: MediaPlayback #1]: V/MediaDecoder MediaDecoderStateMachine[0x119ed0000] Decoder=117367400 OnAudioNotDecoded aError=NS_ERROR_DOM_MEDIA_END_OF_STREAM"
Flags: needinfo?(jyavenard)
With Flac, seeking is done via bisecting in the file, as such it's never exact. The duration is also estimated by looking at the last sample of the stream.

Seeing a EOS error during playback is expected as the file is ended.
Flags: needinfo?(jyavenard)
(In reply to Jean-Yves Avenard [:jya] from comment #20)
> With Flac, seeking is done via bisecting in the file, as such it's never
> exact. The duration is also estimated by looking at the last sample of the
> stream.
> 
> Seeing a EOS error during playback is expected as the file is ended.

Thanks for the clarification, marking accordingly based on comment 19 and comment 20.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Better to leave the qe-verify+ for verification in ESR though. I assume we want it in 60esr.
Flags: qe-verify+
Comment on attachment 9009406 [details]
Bug 1491132 - Clamp cache reads to content size.

Fixes FLAC playback issues, approved for ESR 60.3.
Attachment #9009406 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Also verified that this is also fixed in latest ESR build from taskcluster (https://tools.taskcluster.net/index/gecko.v2.mozilla-esr60.pushdate.2018.10.10.20181010193149.firefox) across platforms (Windows 10 64bit, Ubuntu 18.04 64bit, macOS 10.13.6).
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: