Closed
Bug 1119456
Opened 11 years ago
Closed 11 years ago
Stop doing blocking Moof reads from within the mp4 demuxer
Categories
(Core :: Audio/Video, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla37
People
(Reporter: bholley, Assigned: bholley)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 9 obsolete files)
13.64 KB,
patch
|
ajones
:
review+
Sylvestre
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
2.33 KB,
patch
|
bholley
:
review+
Sylvestre
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
In bug 1114383, we had to unlock the mp4 demuxer when calling out to do blocking moof reads. It turns out this doesn't work, because the demuxer code isn't re-entrant. If another thread comes along and rebuilds the index while one thread is doing a blocking moof read, we'll end up using stale locals to compute persistent parser state. This manifested itself when I was debugging the tests in bug 1064128 - chunks of the audio would occasionally play twice because the moof was read twice, causing the audio and video to get out of sync.
It's tempting to "do this right" by making all of this stuff fully asynchronous with promises. I think we don't have the time to do that right now, especially since changing the async interaction patterns here in valid ways might still uncover other bugs. So the goal here is to do the minimal possible change to making things correct, and later do the long-term fix jya suggested in bug 1119208.
I think the least invasive change is to simply move the blocking out of the demuxer and into the caller (MP4Reader::DemuxAudioSample). IIUC this code does need to work for non-MSE Resources (i.e. non-SourceBufferResources), so we may have to get a bit tricky with the API to do the waiting properly. But I think it's relatively doable.
This should fix bug 1114682, as well as other potential a/v sync issues we've been having.
Comment 1•11 years ago
|
||
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #0)
> IIUC this code
> does need to work for non-MSE Resources (i.e. non-SourceBufferResources), so
> we may have to get a bit tricky with the API to do the waiting properly. But
> I think it's relatively doable.
Yes. MacOSX and Windows non-MSE <video> uses this demuxer in Gecko 35 and up.
Assignee | ||
Comment 2•11 years ago
|
||
Update - Anthony and I flirted with the idea of demuxer-per-thread, but we decided at the end of the meeting to just stick with the original plan and get rid of the blocking reads. I have patches that do this - just working out the kinks.
Comment 3•11 years ago
|
||
We can simply do one blocking and one non-blocking demuxer. That is only a two line change. Alternatively you can continue your ironing if you prefer.
Comment 4•11 years ago
|
||
Updated•11 years ago
|
Assignee: bobbyholley → ajones
Status: NEW → ASSIGNED
Assignee | ||
Comment 5•11 years ago
|
||
Comment on attachment 8546387 [details] [diff] [review]
Use a separate MoofParser for demuxing
Review of attachment 8546387 [details] [diff] [review]:
-----------------------------------------------------------------
This doesn't solve the race condition in the seek case, right? Fundamentally I think we need to get rid of the unlocking behavior, or we're going to keep missing things like this.
My patches more or less work. I'm going to massage them a bit tomorrow morning and then I'll put them up for review.
Comment 6•11 years ago
|
||
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #5)
> This doesn't solve the race condition in the seek case, right? Fundamentally
> I think we need to get rid of the unlocking behavior, or we're going to keep
> missing things like this.
When we seek we do:
* Seek Video
* Pop Sample
* Seek Audio
Although this doesn't apply to the MSE case where audio and video are separate. We don't need to worry too much about getting this case perfect right now because fMP4 is rarely used if ever outside of MSE.
It is looking more and more like the proper solution for MSE is to demux everything when it arrives and store it in a big data structure. That is definitely the long term plan.
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #6)
> (In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect
> things) from comment #5)
> > This doesn't solve the race condition in the seek case, right? Fundamentally
> > I think we need to get rid of the unlocking behavior, or we're going to keep
> > missing things like this.
>
> When we seek we do:
> * Seek Video
> * Pop Sample
> * Seek Audio
>
> Although this doesn't apply to the MSE case where audio and video are
> separate.
My point is that, if a decode thread is in the middle of doing a blocking read with the lock released, another thread can come by and seek the index, which will cause local variables to be out of date when the block read stack unwinds. Is there a reason this isn't a problem?
Comment 8•11 years ago
|
||
I'm not saying it isn't a problem. You are right and you should land your solution when it is ready. I'm just adding that there is a potentially annoying seeking issue that we can probably punt into next week.
Assignee | ||
Comment 9•11 years ago
|
||
I've got working patches. Spinning off some sub-bugs.
Assignee | ||
Comment 10•11 years ago
|
||
Attachment #8546946 -
Flags: review?(cpearce)
Assignee | ||
Comment 11•11 years ago
|
||
Attachment #8546947 -
Flags: review?(cpearce)
Assignee | ||
Comment 12•11 years ago
|
||
We've determined that this isn't necessary, since all of the internal monitors
on MediaResources are "leaf monitors" (i.e. they will always be the last monitor
acquired), so we don't need to worry about deadlocks.
Attachment #8546948 -
Flags: review?(cpearce)
Assignee | ||
Updated•11 years ago
|
Assignee: ajones → bobbyholley
Assignee | ||
Comment 13•11 years ago
|
||
With the whole stack applied, I can occasionally seek the minecraft video back and forward enough that the audio sink gets confused and starts doing PlaySilence while it plays the video. Once I figure out the test timeouts in the dependent patch (bug 1120023#c12) I'll retest.
Comment 14•11 years ago
|
||
Comment on attachment 8546946 [details] [diff] [review]
Part 1 - Switch blocking reads to the aWaitingForData model. v1
Review of attachment 8546946 [details] [diff] [review]:
-----------------------------------------------------------------
::: media/libstagefright/binding/Index.cpp
@@ +152,5 @@
>
> nsTArray<Moof>& moofs = mIndex->mMoofParser->Moofs();
> while (true) {
> if (mCurrentMoof == moofs.Length()) {
> + if (!mIndex->mMoofParser->TryToReadNextMoof()) {
It seems to me that you'll now be unable to distinguish between an invalid moof, and waiting for data? So if you're passed an invalid file, you can't fail and playback will stall indefinitely? Seems like you should be passing aWaitingForData through here, and distinguishing the error condition.
@@ +180,5 @@
> mCurrentMoof = 0;
> mCurrentSample = 0;
> Sample* sample;
> + bool ignored;
> + while (!!(sample = Get(&ignored))) {
Should this be ignored? That sounds dangerous to me.
::: media/libstagefright/binding/include/mp4_demuxer/MoofParser.h
@@ +194,5 @@
> void ParseTrak(Box& aBox);
> void ParseMdia(Box& aBox, Tkhd& aTkhd);
> void ParseMvex(Box& aBox);
>
> + bool TryToReadNextMoof();
Pass aWaitingForData through?
Updated•11 years ago
|
Attachment #8546947 -
Flags: review?(cpearce) → review-
Comment 15•11 years ago
|
||
Comment on attachment 8546947 [details] [diff] [review]
Part 2 - Remove now-unused blocking read API from MP4Stream. v1
Review of attachment 8546947 [details] [diff] [review]:
-----------------------------------------------------------------
Oh oops, just kidding, r+.
Attachment #8546947 -
Flags: review- → review+
Updated•11 years ago
|
Attachment #8546948 -
Flags: review?(cpearce) → review+
Updated•11 years ago
|
Attachment #8546946 -
Flags: review?(cpearce) → review+
Assignee | ||
Comment 16•11 years ago
|
||
So this doesn't work, because non-MSE mp4 actually _does_ rely on blocking, and we have no way to do WaitForData for non-MSE.
I went back to a dumber (but simpler) strategy in which we make all demuxer reads non-blocking, and record the parameters of the last failed failed read on the MP4Stream. If we find ourselves erroring out of the demuxer, we check to see if the most recent read failed, and we re-attempt it as a blocking read (while releasing the demuxer). If that works, we just try again. This more or less emulates our current behavior while hoisting the blocking out a few frames.
Assignee | ||
Comment 17•11 years ago
|
||
Attachment #8546387 -
Attachment is obsolete: true
Attachment #8546946 -
Attachment is obsolete: true
Attachment #8546947 -
Attachment is obsolete: true
Attachment #8546948 -
Attachment is obsolete: true
Attachment #8547017 -
Flags: review?(ajones)
Assignee | ||
Comment 18•11 years ago
|
||
Attachment #8547018 -
Flags: review?(ajones)
Assignee | ||
Comment 19•11 years ago
|
||
Attachment #8547019 -
Flags: review?(ajones)
Assignee | ||
Comment 20•11 years ago
|
||
Assignee | ||
Comment 21•11 years ago
|
||
Comment on attachment 8547017 [details] [diff] [review]
Part 1 - Switch all of MP4Demuxer to CachedReadAt and hoist blocking into callers with a hacky retry strategy. v1
I think this strategy works, but I need to tweak the patches a little bit.
Attachment #8547017 -
Attachment is obsolete: true
Attachment #8547017 -
Flags: review?(ajones)
Assignee | ||
Updated•11 years ago
|
Attachment #8547018 -
Attachment is obsolete: true
Attachment #8547018 -
Flags: review?(ajones)
Assignee | ||
Updated•11 years ago
|
Attachment #8547019 -
Attachment is obsolete: true
Attachment #8547019 -
Flags: review?(ajones)
Assignee | ||
Comment 22•11 years ago
|
||
Attachment #8547110 -
Flags: review?(cpearce)
Attachment #8547110 -
Flags: review?(ajones)
Assignee | ||
Comment 23•11 years ago
|
||
Assignee | ||
Comment 24•11 years ago
|
||
Small fix to switch the VLA to dynamic allocation, since the former doesn't
work on MSVC (and probably isn't a good idea anyway).
Attachment #8547114 -
Flags: review?(cpearce)
Attachment #8547114 -
Flags: review?(ajones)
Assignee | ||
Updated•11 years ago
|
Attachment #8547110 -
Attachment is obsolete: true
Attachment #8547110 -
Flags: review?(cpearce)
Attachment #8547110 -
Flags: review?(ajones)
Assignee | ||
Comment 25•11 years ago
|
||
Comment 26•11 years ago
|
||
Comment on attachment 8547114 [details] [diff] [review]
Make MP4Demuxer's blocking reads non-blocking and hoist blocking into callers with a hacky retry strategy. v5
Review of attachment 8547114 [details] [diff] [review]:
-----------------------------------------------------------------
Excellent. That all looks like godo clean fun. I don't like the LastReadFailed() business but I can see why it is needed.
::: media/libstagefright/binding/mp4_demuxer.cpp
@@ +98,5 @@
> +
> + // Read the number of tracks. If we can't find any, make sure to bail now before
> + // attempting any new reads to make the retry system work.
> + size_t trackCount = e->countTracks();
> + if (trackCount == 0) {
Suggestion: Not much value in the intermediate variable but if you're using it then you may as well use it below.
Attachment #8547114 -
Flags: review?(ajones) → review+
Comment 27•11 years ago
|
||
Comment on attachment 8547114 [details] [diff] [review]
Make MP4Demuxer's blocking reads non-blocking and hoist blocking into callers with a hacky retry strategy. v5
Review of attachment 8547114 [details] [diff] [review]:
-----------------------------------------------------------------
kft's r+ is quite sufficient here.
Attachment #8547114 -
Flags: review?(cpearce)
Assignee | ||
Comment 28•11 years ago
|
||
Attachment #8547145 -
Flags: review?(roc)
Assignee | ||
Comment 29•11 years ago
|
||
Comment on attachment 8547145 [details] [diff] [review]
Work around media cache bug. v1
Review of attachment 8547145 [details] [diff] [review]:
-----------------------------------------------------------------
Technically I think this is not a media cache bug. It never promised that Pin would have the property you want it to have. For a pinned stream, the ranges returned by GetCachedDataEnd, GetNextCachedData, GetCachedRanges never decrease and reads in those ranges will always succeed. That's all we promise:
// Cached blocks associated with this stream will not be evicted
// while the stream is pinned.
If you probe with GetCachedRanges etc, mPartialBlockBuffer is not included; it's not a block.
If we want the media cache to promise that "everything read from a pinned stream can be read again", then I think we should modify MediaCache::Read so, when a stream is pinned, we always block until we've reached a block boundary or end-of-stream.
Meanwhile, this workaround is OK, modulo the following comment:
::: dom/media/fmp4/MP4Reader.cpp
@@ +99,4 @@
> MonitorAutoUnlock unlock(*aMonitor);
> + size_t ignored;
> + (void) stream->BlockingReadAt(failure.mOffset, dummyBuffer, bufferSize, &ignored);
> + if (NS_WARN_IF(!stream->BlockingReadAt(failure.mOffset, dummyBuffer, failure.mCount, &ignored))) {
I don't understand why you're doing two reads here.
Why isn't this just
if (!stream->BlockingReadAt(failure.mOffset, dummyBuffer, bufferSize, &ignored)) {
Attachment #8547145 -
Flags: review?(roc) → review-
Assignee | ||
Comment 31•11 years ago
|
||
> I don't understand why you're doing two reads here.
My original concern was that BlockingReadAt would return failure for EOS,
causing us to error out for bytes that weren't actually requested by the
original demuxer read. But I now note that this case doesn't actually cause
us to return error, so I think we should be good with a single best-effort
priming read.
Attachment #8547145 -
Attachment is obsolete: true
Attachment #8547193 -
Flags: review?(roc)
Assignee | ||
Comment 32•11 years ago
|
||
Assignee | ||
Comment 33•11 years ago
|
||
Comment on attachment 8547193 [details] [diff] [review]
Work around the fact that media cache does not quite guarantee the property we want. v2
Actually, I think comment 30 implies r+ on this patch.
Attachment #8547193 -
Flags: review?(roc) → review+
Assignee | ||
Comment 34•11 years ago
|
||
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #29)
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=32b557090cc8
This one is green.
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #32)
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=4cfa0861ac6d
This one includes roc's suggested tweak. Looking at the code now, I'm quite sure that the tweak should not cause any new failures.
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #33)
> Comment on attachment 8547193 [details] [diff] [review]
> Work around the fact that media cache does not quite guarantee the property
> we want. v2
>
> Actually, I think comment 30 implies r+ on this patch.
Yep.
Assignee | ||
Comment 36•11 years ago
|
||
Comment 37•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/19b8c420b13a
https://hg.mozilla.org/mozilla-central/rev/3ffd66a3f1bd
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Comment 38•11 years ago
|
||
Comment on attachment 8547114 [details] [diff] [review]
Make MP4Demuxer's blocking reads non-blocking and hoist blocking into callers with a hacky retry strategy. v5
Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Users more like to see playback stalls with YouTube. Less consistent testing.
[Describe test coverage new/current, TBPL]: Landed on m-c. Manual testing.
[Risks and why]: Risk is moderate. This is the least-invasive fix we could come up with, but it does affect normal mp4 playback code outside MSE.
[String/UUID change made/needed]: None.
This request applies to all patches in this bug.
Attachment #8547114 -
Flags: approval-mozilla-beta?
Updated•11 years ago
|
Updated•11 years ago
|
Attachment #8547114 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Updated•11 years ago
|
Attachment #8547193 -
Flags: approval-mozilla-beta+
Comment 39•11 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•