Closed Bug 1125776 Opened 6 years ago Closed 6 years ago

updateend is fired before decoder is being initialized and errors aren't reported to the parent source buffer.

Categories

(Core :: Audio/Video, defect, P2)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox36 --- unaffected
firefox37 --- fixed
firefox38 --- fixed

People

(Reporter: jya, Assigned: jya)

References

(Blocks 3 open bugs)

Details

Attachments

(12 files, 26 obsolete files)

1.26 KB, patch
Details | Diff | Splinter Review
3.44 KB, patch
Details | Diff | Splinter Review
1.30 KB, patch
Details | Diff | Splinter Review
1.32 KB, patch
Details | Diff | Splinter Review
21.58 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
2.92 KB, patch
Details | Diff | Splinter Review
1.38 KB, patch
Details | Diff | Splinter Review
2.04 KB, patch
Details | Diff | Splinter Review
15.01 KB, patch
Details | Diff | Splinter Review
10.55 KB, patch
Details | Diff | Splinter Review
23.44 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
2.09 KB, patch
mattwoodrow
: review+
Details | Diff | Splinter Review
MediaSource's duration attribute isn't always updated before the update/updateend event is fired.

Normally, when you call SourceBuffer.appendBuffer() with an init segment, the metadata should be read and the mediasource duration updated, and only then update/updatend event be fired.

However, currently the metadata is being read by the MediaDecoderStateMachine in a concurrent thread. This happens in parallel, and mediasource's duration will be initialized in its own time.
The behaviour is as such racy.

When mediasource duration hasn't yet been initialized (it's NaN when MediaSource element is created), and an init segment has been loaded, appendBuffer should wait for the metadata to be read by the MDSM and only then fire update/updateend event.

This is the last task at this stage that cause Youtube test suite 23 and 24 to fail (bug 1116645)
Blocks: 1106776
Priority: -- → P2
Blocks: 1127414
Attached patch Fix initialization of variables (obsolete) — Splinter Review
initialize variables
Attachment #8556763 - Flags: review?(cajbir.bugzilla)
Assignee: nobody → jyavenard
Status: NEW → ASSIGNED
Blocks: 1127674
Attachment #8556763 - Flags: review?(cajbir.bugzilla) → review+
Depends on: 1127775
Blocks: 1092276
Blocks: 1085247
Make our appendBuffer more in-line with W3C segment parser loop algorithm. We ensure we do not fire updateend until TrackBuffer::AppendData has finished its job and initialize the decoder. Metadata are read within appendBuffer and set the mediasource duration only if it's NaN (we used to set as max(audio.duration,video.duration) and only the first init segment parsed should ever set the mediasource duration. Duration is then updated as per buffered data is being added.
Attachment #8557420 - Flags: review?(cajbir.bugzilla)
Add partial initialization segment support. As appendBuffer initialize the newly created decoders before firing updateend, we don't want to call readmetadata on an incomplete init segment as it would otherwise error. We need to detect if an init segment is incomplete and queue it for later initialization.
Attachment #8557423 - Flags: review?(cajbir.bugzilla)
Attached patch Update webref tests (obsolete) — Splinter Review
Update web-ref tests. We only fail those tests due to small mismatch in the media duration calculation (bug 1065207). The only one I can't explain is why the H264 Windows duration test timeout. I should get a fail. At least its now always consistent
Attachment #8557427 - Flags: review?(cajbir.bugzilla)
Blocks: 1096089
Attached patch Update webref tests (obsolete) — Splinter Review
:cajbir
Attachment #8557427 - Attachment is obsolete: true
Attachment #8557427 - Flags: review?(cajbir.bugzilla)
Attachment #8557459 - Flags: review?(cajbir.bugzilla)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c1149744c29

all the intermittent oranges are now non-mse related
the libnestegg solution turned out to not be appropriate under all circumstances. It will choke if any of the data provided is invalid. So if you add an init segment plus half a media segment, nestegg_init will return an error.
So I modified WebMBufferedParser to find the Tracks element and record its end offset ; we assume this is always the end of the init segment.
We will fail further down the track as WebMReader::ReadMetadata will also fail. We should limit the size of the data provided to nestegg_init to the size of the init segment just found. I can wrap this patch and the previous one together. if you want (probbly better that way too).
Attachment #8557478 - Flags: review?(kinetik)
Simplify handling of pending decoders as there really can only be one: mCurrentDecoder
Attachment #8557491 - Flags: review?(cajbir.bugzilla)
limit webm demuxer on the init segment size when reading the metadata.
Attachment #8557492 - Flags: review?(kinetik)
Page to test:
http://people.mozilla.org/~jyavenard/tests/mse_webm/init-only.html?chunks=1&eos_chunk=10&eos_delay=-1&delay_chunk=0&start=-1&size=237

The init segment is 236 bytes here. Feeding 237 bytes would cause readmetadata to fail, and playback wouldn't start and decoder initialization would fail.

we now handle this condition just fine (so long as the rest of the media segments are complete)
Comment on attachment 8557492 [details] [diff] [review]
Limit the data parsed to the size of the init segment

After sleeping on it, I don't like this approach. It breaks the abstraction of the MediaSource component.

The WebMReader has access to that information already as it has parsed the data using WebMBufferedParser already.
Attachment #8557492 - Attachment is obsolete: true
Attachment #8557492 - Flags: review?(kinetik)
Use init segment size if known and limit parser to that size.
Attachment #8557546 - Flags: review?(kinetik)
The MediaSourceReader doesn't need to have an init segment available anymore, as readmetadata is done within the trackbuffer.
Attachment #8557603 - Flags: review?(matt.woodrow)
Don't unecessarily tell the MediaSourceReader we have data when we don't.
Attachment #8557604 - Flags: review?(matt.woodrow)
Blocks: 1128332
Attachment #8557603 - Flags: review?(matt.woodrow) → review+
Attachment #8557604 - Flags: review?(matt.woodrow) → review+
Combine the two patches together as the nestegg approach just doesn't work under all circumstances.
Attachment #8557686 - Flags: review?(kinetik)
Attachment #8557478 - Attachment is obsolete: true
Attachment #8557478 - Flags: review?(kinetik)
Comment on attachment 8557459 [details] [diff] [review]
Update webref tests

Review of attachment 8557459 [details] [diff] [review]:
-----------------------------------------------------------------

I defer to Karl for web platform test changes.
Attachment #8557459 - Flags: review?(cajbir.bugzilla) → review?(karlt)
Summary: mediasource.duration isn't updated before updateend event fired → updateend is fired before decoder is being initialized and errors aren't reported to the parent source buffer.
Disable a test that timesout on Windows {7,8}. The loadeddata event is never fired. I'll keep tracking that one
Attachment #8557707 - Flags: review?(karlt)
Attachment #8557721 - Flags: review?(matt.woodrow)
Comment on attachment 8557459 [details] [diff] [review]
Update webref tests

>   [Test setting same duration multiple times does not fire duplicate durationchange]
>-    disabled: https://bugzilla.mozilla.org/show_bug.cgi?id=1116007
>+    expected: FAIL
> 
>diff --git a/testing/web-platform/tests/media-source/mediasource-duration.html b/testing/web-platform/tests/media-source/mediasource-duration.html
>--- a/testing/web-platform/tests/media-source/mediasource-duration.html
>+++ b/testing/web-platform/tests/media-source/mediasource-duration.html
>@@ -206,12 +206,12 @@
>                   test.expectEvent(mediaElement, 'ended', 'Playback ended');
>                   test.waitForExpectedEvents(function()
>                   {
>                       mediaElement.removeEventListener('durationchange', durationchangeEventHandler);
>                       assert_equals(durationchangeEventCounter, expectedDurationChangeEventCount, 'durationchanges');
>                       test.done();
>                   });
>               });
>-          }, 'Test setting same duration multiple times does not fire duplicate durationchange', {timeout: 2500});
>+          }, 'Test setting same duration multiple times does not fire duplicate durationchange', {timeout: 10000});

We're not meant to change files in the tests directory as these are imported from upstream and so will be overridden.  I suggest marking this subtest and the parent TIMEOUT.  If it is intermittent, then we'll need to disable the whole test file unfortunately.  r+ on the all the rest.
Attachment #8557459 - Flags: review?(karlt) → review+
(In reply to Karl Tomlinson (back 2 Feb:karlt) from comment #21)
> We're not meant to change files in the tests directory as these are imported
> from upstream and so will be [...]

overwritten.
Comment on attachment 8557707 [details] [diff] [review]
Disable test timing out on Windows {8,7}

I don't know what happened to loadeddata, but the annotations are appropriate for what you describe.

Perhaps the missing loaddata is responsible for the timeout in mediasource-duration.html - I haven't looked closely.
Attachment #8557707 - Flags: review?(karlt) → review+
Attachment #8557423 - Attachment is obsolete: true
Attachment #8557423 - Flags: review?(cajbir.bugzilla)
Attachment #8557491 - Attachment is obsolete: true
Attachment #8557491 - Flags: review?(cajbir.bugzilla)
somehow we ended up with two, and I don't know which is which. So re-uploading
Attachment #8557762 - Flags: review?(cajbir.bugzilla)
Attachment #8556763 - Attachment is obsolete: true
Attachment #8557420 - Attachment is obsolete: true
Attachment #8557420 - Flags: review?(cajbir.bugzilla)
:cajbir
Attachment #8557762 - Attachment is obsolete: true
Attachment #8557762 - Flags: review?(cajbir.bugzilla)
:kinetik
Attachment #8557686 - Attachment is obsolete: true
Attachment #8557686 - Flags: review?(kinetik)
Attachment #8557546 - Attachment is obsolete: true
Attachment #8557546 - Flags: review?(kinetik)
Attachment #8557767 - Flags: review?(cajbir.bugzilla)
Attachment #8557770 - Flags: review?(cajbir.bugzilla)
Attachment #8557772 - Flags: review?(kinetik)
Attachment #8557774 - Flags: review?(kinetik)
add part#. Carrying r+
Attachment #8557459 - Attachment is obsolete: true
Attachment #8557604 - Attachment is obsolete: true
Attachment #8557603 - Attachment is obsolete: true
Attachment #8557707 - Attachment is obsolete: true
add part#
Attachment #8557788 - Flags: review?(matt.woodrow)
Attachment #8557721 - Attachment is obsolete: true
Attachment #8557721 - Flags: review?(matt.woodrow)
Attachment #8557788 - Flags: review?(matt.woodrow) → review+
Comment on attachment 8557767 [details] [diff] [review]
Part2. appendBuffer scanning the data before firing updateend

Review of attachment 8557767 [details] [diff] [review]:
-----------------------------------------------------------------

Looks awesome so far, mainly just want to see the promise bit changed.

::: dom/media/mediasource/TrackBuffer.cpp
@@ +550,5 @@
>    mParentDecoder->GetReader()->MaybeNotifyHaveData();
>  
> +  MSE_DEBUG("TrackBuffer(%p): Reader %p activated",
> +            this, aDecoder->GetReader());
> +  mInitializationPromise.ResolveIfExists(aDecoder->GetRealMediaDuration(), __func__);

This promise expects a bool, not a duration.

Use aDecoder->GetRealMediaDuration() != 0 to make the conversion explicit if that's what you meant :)

::: dom/media/mediasource/TrackBuffer.h
@@ +43,5 @@
>    // NotifyDataArrived on the decoder to keep buffered range computation up
>    // to date.  Returns false if the append failed.
> +  bool AppendData(LargeDataBuffer* aData,
> +                  int64_t aTimestampOffset /* microseconds */,
> +                  bool& aGotMedia, nsRefPtr<InitializationPromise>& aPromise);

Lets switch to unconditionally returning the promise, and drop aGotMedia and the bool retval (as discussed on irc).
Comment on attachment 8557770 [details] [diff] [review]
Part3. Add support for partial init segment

Review of attachment 8557770 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/mediasource/TrackBuffer.cpp
@@ +180,5 @@
> +
> +  if (newInitData) {
> +    if (!gotInit) {
> +      // We need a new decoder, but we can't initialize it yet.
> +      nsRefPtr<SourceBufferDecoder> decoder = NewDecoder(aTimestampOffset);

Comment that the new decoder is stored in mDecoders/mCurrentDecoder, so we don't need to do anything with 'decoder'.

@@ +420,2 @@
>      nsRefPtr<dom::TimeRanges> r = new dom::TimeRanges();
> +    mInitializedDecoders[i]->GetBuffered(r);

This change probably should be part of the previous patch :)

@@ +509,5 @@
> +  // As this decoder hasn't been initialized yet, the resource isn't yet in use.
> +  bool wasEnded = aDecoder->GetResource()->IsEnded();
> +  if (!wasEnded) {
> +    aDecoder->GetResource()->Ended();
> +  }

This is a horrible hack. It would be nice if we could just pass a parameter to ReadMetadata to ask it to be non-blocking, but I guess that's quite a hassle to hook up.
Attachment #8557770 - Flags: review?(cajbir.bugzilla) → review+
Comment on attachment 8557772 [details] [diff] [review]
Part4. Add support for partial WebM init segment

Review of attachment 8557772 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/webm/WebMBufferedParser.h
@@ +94,5 @@
>    // Current offset with the stream.  Updated in chunks as Append() consumes
>    // data.
>    int64_t mCurrentOffset;
>  
> +  // Tracks elelement's end offset. This indicates the end of the init segment.

Typo in element

@@ +96,5 @@
>    int64_t mCurrentOffset;
>  
> +  // Tracks elelement's end offset. This indicates the end of the init segment.
> +  // Will only be set if a Segment Information has been found.
> +  uint32_t mInitEndOffset;

Make this int64_t since it's an offset.  Then, if you init it to -1, GetInitEndOffset can drop the !mInitEndOffset test and return the value when mRangeParsers isn't empty.
Attachment #8557772 - Flags: review?(kinetik) → review+
Comment on attachment 8557774 [details] [diff] [review]
Part5. Limit metadata parsing to init segment size if known

Review of attachment 8557774 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/webm/WebMBufferedParser.h
@@ +241,5 @@
>    // dependencies to arrive at aTime.
>    bool GetOffsetForTime(uint64_t aTime, int64_t* aOffset);
>  
> +  // Returns end offset of init segment or -1 if none found.
> +  int32_t GetInitEndOffset();

Being an offset, make it int64_t.
Attachment #8557774 - Flags: review?(kinetik) → review+
Carrying r+ comments. Always return a promise. Much more elegant.
Attachment #8557838 - Flags: review?(matt.woodrow)
Attachment #8557767 - Attachment is obsolete: true
Attachment #8557767 - Flags: review?(cajbir.bugzilla)
Carrying r+, rebase on previous patch changes
Attachment #8557770 - Attachment is obsolete: true
Attachment #8557838 - Flags: review?(matt.woodrow) → review+
Carrying r+
Attachment #8557772 - Attachment is obsolete: true
Attachment #8557774 - Attachment is obsolete: true
Attachment #8557784 - Attachment is obsolete: true
Attachment #8557788 - Attachment is obsolete: true
Attached patch Part10. Handle concurrent aborts (obsolete) — Splinter Review
Handle successive appendBuffer that could have been interrupted half-way through.
A test case would have been something like:
sb.appendBuffer()
sb.OnEvent(updatestarting) {
  sb.abort();
  sb.appendBuffer()
}

The first appendBuffer would have queued the task and set sb.updating to true
on abort(), sb.updating is set to false. When the second sb.appendBuffer() runs : it would set sb.updating to true.
Now that the first queued task runs, sees that sb.updating is true and happily continue. Then the second append buffer queued would run and queue a second initialization task. And boom.

This situation could happen with YouTube when you seek, it starts buffering and immediately change resolution (which calls abort).

We add a small update ID that we record at the time we queue the buffer append task. And if upon execution the two update IDs do not match, we abort.
Attachment #8558297 - Flags: review?(matt.woodrow)
missed a line in the previous rebase
Attachment #8557839 - Attachment is obsolete: true
made a mistake when I carried the r+ comments, and now that mInitEndOffset has a default value of -1, we need to test for positive, not non-zero
Attachment #8557840 - Attachment is obsolete: true
Attached patch Part10. Handle concurrent aborts (obsolete) — Splinter Review
Can't reject the promise in shutdown. It is possible that the SourceBuffer got deleted while shutdown is running ; that would cause the promise to still hold a reference to the SourceBuffer and it would be deleted outside the main thread (causing a crash as SourceBuffer not being thread-safe)
Attachment #8558437 - Flags: review?(matt.woodrow)
Attachment #8558297 - Attachment is obsolete: true
Attachment #8558297 - Flags: review?(matt.woodrow)
Attached patch Fix potential crash (obsolete) — Splinter Review
Got that race condition once. Shutdown completed right before the initialization task got to run. mParentDecoder was set to null, and attempting to get the lock cause a null dereference: [39007] ###!!! ASSERTION: You can't dereference a NULL nsRefPtr with operator->().: 'mRawPtr != 0', file ../../../dist/include/nsRefPtr.h, line 229. This is not the perfect solution, as getting the lock isn't atomic here. Ideally, we should use our own monitor when it comes to shutdown/init thread rather than using the parent decoder one. The likelihood of still happening however is super low.
Attachment #8558438 - Flags: review?(matt.woodrow)
Depends on: 1129224
Comment on attachment 8558437 [details] [diff] [review]
Part10. Handle concurrent aborts

One of the issue this patch tried to get around has been fixed in bug 1129224. So writing a new version
Attachment #8558437 - Attachment is obsolete: true
Attachment #8558437 - Flags: review?(matt.woodrow)
Reworked following resolution of bug 1129224. We promise must be resolved, even if disconnected. Also rename some methods to make them more explicit in what they actually do.
Attachment #8558897 - Flags: review?(matt.woodrow)
Rebase.
Attachment #8558898 - Flags: review?(matt.woodrow)
Attachment #8558438 - Attachment is obsolete: true
Attachment #8558438 - Flags: review?(matt.woodrow)
Comment on attachment 8558897 [details] [diff] [review]
Part10. Handle concurrent aborts

Review of attachment 8558897 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/mediasource/SourceBuffer.h
@@ +32,5 @@
>  class ErrorResult;
>  class LargeDataBuffer;
>  class TrackBuffer;
>  template <typename T> class AsyncEventRunner;
> +typedef MediaPromise<bool, nsresult, /* IsExclusive = */ true> TrackBufferInitializationPromise;

TrackBufferAppendPromise might be more accurate, now that we always return one.
Attachment #8558897 - Flags: review?(matt.woodrow) → review+
Attachment #8558898 - Flags: review?(matt.woodrow) → review+
Duplicate of this bug: 1097252
Comment on attachment 8557838 [details] [diff] [review]
Part2. appendBuffer scanning the data before firing updateend

Requesting 37 uplift of all patches on this bug.

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Dependency of later critical fixes. Possible video playback stalls with YouTube. Less consistent testing.
[Describe test coverage new/current, TreeHerder]: Landed on m-c two weeks ago.
[Risks and why]: This is a significant rewrite code specific to the MSE feature. it's been working fine on nightly.
[String/UUID change made/needed]: None.
Attachment #8557838 - Flags: approval-mozilla-aurora?
Blocks: 1122358
Comment on attachment 8557838 [details] [diff] [review]
Part2. appendBuffer scanning the data before firing updateend

Dependency for other MSE work. Been on m-c for ~2 weeks already without issue. Aurora+
Attachment #8557838 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8557764 [details] [diff] [review]
Part1. Fix initialization of variables

[Triage Comment]
Attachment #8557764 - Flags: approval-mozilla-aurora+
Attachment #8557779 - Flags: approval-mozilla-aurora+
Attachment #8557785 - Flags: approval-mozilla-aurora+
Attachment #8557786 - Flags: approval-mozilla-aurora+
Attachment #8557841 - Flags: approval-mozilla-aurora+
Attachment #8557844 - Flags: approval-mozilla-aurora+
Attachment #8557846 - Flags: approval-mozilla-aurora+
Attachment #8558397 - Flags: approval-mozilla-aurora+
Attachment #8558398 - Flags: approval-mozilla-aurora+
Attachment #8558897 - Flags: approval-mozilla-aurora+
Attachment #8558898 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.