Closed Bug 1258673 Opened 4 years ago Closed 4 years ago

Reading MediaResource cached value can cause invalid reads

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox46 --- fixed
firefox47 --- fixed
firefox48 --- fixed

People

(Reporter: jya, Assigned: jwwang)

References

Details

Attachments

(4 files)

The issue was seen in bug 1256038.

While I haven't fully identified the problem, when playing a MP3 with DirectShow you can hear audio corruption, like somehow playback went back by a few milliseconds.

The issue was resolved by simply removing the MediaResource's pin and reads.

While resolved, it shouldn't happen. It appears that the reads are invalid somehow.

This should be resolved.
Priority: -- → P2
From bug 1256038 comment 56

"Can you catch logs with "NSPR_LOG_MODULES=timestamp:1,MediaCache:5,MediaResource:5" so I can check if there are any abnormal network/cache activities?"
Attached file logs1.txt
Attached file logs.txt
just enabling the logging make the issue less severe.

it took longer in this case for the corruption to be heard, but the corruption was more significant than in the previous run (which had 3 interruptions)
Flags: needinfo?(jwwang)
Can you check if this patch fixes the  stuttering? Thanks!
Flags: needinfo?(jyavenard)
awesome find. I knew there was a race somewhere.

thank you !

perfect playback now
Flags: needinfo?(jyavenard)
Comment on attachment 8734220 [details] [diff] [review]
1258673_fix_media_cache_race.patch

Review of attachment 8734220 [details] [diff] [review]:
-----------------------------------------------------------------

I wasn't asked, but r+ anyway :)

i wonder if this can explain other corruptions / error we're seeing from time to time

::: dom/media/MediaCache.cpp
@@ +2308,2 @@
>    *aBytes = count;
> +  mStreamOffset = streamOffset;

seems to me that Tell() will return an invalid result with two concurrent reads, but I guess it doesn't matter with how it's used.

it's just an approximation anyway.
I wonder if this bug will also cause corrupted video frames or some troubles for the demuxer since the data returned by MediaResource::Read()/ReadAt() could be corrupted.
(In reply to Jean-Yves Avenard [:jya] from comment #6)
> seems to me that Tell() will return an invalid result with two concurrent
> reads, but I guess it doesn't matter with how it's used.
> 
> it's just an approximation anyway.

Right, I plan to forbid the use of MediaResource::Read() which is totally unreliable. The client code should use ReadAt() or its helpers to ensure data is read at the desired position.
Flags: needinfo?(jwwang)
Can just make Read() private, I believe I have removed all call to MediaCache::Read with bug 1190238
Comment on attachment 8734220 [details] [diff] [review]
1258673_fix_media_cache_race.patch

Per comment 6. Thanks for the review!
Attachment #8734220 - Flags: review+
Will need to uplift to 46.
Flags: needinfo?(jwwang)
###!!! ASSERTION: Using a block that's behind the read position?: 'bo->mStreamBlock*BLOCK_SIZE <= bo->mStream->mStreamOffset', file /builds/slave/try-l64-d-00000000000000000000/build/src/dom/media/MediaCache.cpp, line 1644
INFO -  #01: mozilla::MediaCacheStream::Read [dom/media/MediaCache.cpp:2288]
INFO -  #02: mozilla::MediaCacheStream::ReadAt [dom/media/MediaCache.cpp:2322]
INFO -  #03: mozilla::ChannelMediaResource::ReadAt [dom/media/MediaResource.cpp:680]
INFO -  #04: mozilla::MediaResourceIndex::ReadAt [dom/media/MediaResource.cpp:1617]

Hit the assertion in MediaCache::NoteBlockUsage() because mStreamOffset is not updated until the end of MediaCacheStream::Read(). I will pass the offset explicitly to NoteBlockUsage() to avoid the assertion.
couldn't you simple remove this assertion?

it serves no purpose really as mStream->mStreamOffset isn't used anywhere else.
https://hg.mozilla.org/mozilla-central/file/6202ade0e6d688ffb67932398e56cfc6fa04ceb3/dom/media/MediaCache.cpp#l1643

You mean this assertion? It seems to be used to catch abnormal read position.
(In reply to JW Wang [:jwwang] from comment #14)
> https://hg.mozilla.org/mozilla-central/file/
> 6202ade0e6d688ffb67932398e56cfc6fa04ceb3/dom/media/MediaCache.cpp#l1643
> 
> You mean this assertion? It seems to be used to catch abnormal read position.

yes.

but i can understand how it could have made sense after seeking + read.

but this is no long possible.
my english is terrible today.

I meant to write, this is no longer possible.

There's enough check before hand to ensure that the assertion can't be triggered.

passing the offset as argument only help not triggering the assertion, it has no functional use. I'd say drop it
Per comment 12.
Assignee: nobody → jwwang
Attachment #8734545 - Flags: review?(jyavenard)
Attachment #8734545 - Flags: review?(jyavenard) → review+
Thanks for the review!
Need to uplift to 46.
Flags: needinfo?(jwwang)
https://hg.mozilla.org/mozilla-central/rev/08a806c70b5c
https://hg.mozilla.org/mozilla-central/rev/a5066deb9e8d
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment on attachment 8734220 [details] [diff] [review]
1258673_fix_media_cache_race.patch

Approval Request Comment
[Feature/regressing bug #]: 1258673
[User impact if declined]: bad MP3 playback on Windows XP: pauses, stutter, possibly decoding error. It's what the real cause behind bug 1256038 and a regression introduced in 45.
[Describe test coverage new/current, TreeHerder]: in central, manual test. Confirmed resolved by the original bug reporter.
[Risks and why]: very low. this is a bug that has always been there and is a data race. It got exposed recently as timing of events changed. The fix is simple and safe.
[String/UUID change made/needed]: none.
Attachment #8734220 - Flags: approval-mozilla-beta?
Attachment #8734220 - Flags: approval-mozilla-aurora?
Comment on attachment 8734545 [details] [diff] [review]
1258673_part2_fix_assertion.patch

Approval Request Comment
[Feature/regressing bug #]:1258673
[User impact if declined]:bad MP3 playback on Windows XP: pauses, stutter, possibly decoding error. It's what the real cause behind bug 1256038 and a regression introduced in 45.
[Describe test coverage new/current, TreeHerder]:in central, manual test. Confirmed resolved by the original bug reporter.
[Risks and why]: very low. this is a bug that has always been there and is a data race. It got exposed recently as timing of events changed. The fix is simple and safe.
[String/UUID change made/needed]:none
Flags: needinfo?(jwwang)
Attachment #8734545 - Flags: approval-mozilla-beta?
Attachment #8734545 - Flags: approval-mozilla-aurora?
Comment on attachment 8734220 [details] [diff] [review]
1258673_fix_media_cache_race.patch

Fixes media playback issues, Aurora47+
Attachment #8734220 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8734545 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8734220 [details] [diff] [review]
1258673_fix_media_cache_race.patch

Approving on behalf of Liz with the hope that this will be included in 46.0b6 (if not it should be in 46.0b7).
Attachment #8734220 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8734545 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.