Closed Bug 1163667 Opened 4 years ago Closed 4 years ago

Improve seeking accuracy and performance in MP3TrackDemuxer

Categories

(Core :: Audio/Video: Playback, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox41 --- affected
firefox44 --- fixed

People

(Reporter: esawin, Assigned: esawin)

References

Details

Attachments

(5 files, 13 obsolete files)

14.55 KB, patch
esawin
: review+
Details | Diff | Splinter Review
612 bytes, patch
esawin
: review+
Details | Diff | Splinter Review
7.64 KB, patch
esawin
: review+
Details | Diff | Splinter Review
3.36 KB, patch
esawin
: review+
Details | Diff | Splinter Review
9.09 KB, patch
esawin
: review+
Details | Diff | Splinter Review
With bug 1093815 we've added preliminary seeking support combining fast approximate "jumps" with slow forward-reading.

The fast back-jumping (MP3Demuxer::FastSeek) derives the stream offset based on the average frame length, which leads to inaccuracies with VBR sources.
We should look for ways to improve accuracy, e.g., by reading optional VBR headers when available.

The slow forward-reading (MP3Demuxer::SlowSeek) simply reads the source up to the desired time stamp, which is slow for long forward seeking. We should look for options to increase performance here without regressing seeking accuracy.
Assignee: nobody → esawin
Depends on: 1093815
We should also handle repeated seek requests more gracefully.
Currently, if you drag the seek slider around on the timeline (works also when paused) and let it go eventually, it takes a long time for the demuxer to process all the seek requests and finally continue playback at the last seek location.
Component: Audio/Video → Audio/Video: Playback
(In reply to Eugen Sawin [:esawin] from comment #1)
> We should also handle repeated seek requests more gracefully.
> Currently, if you drag the seek slider around on the timeline (works also
> when paused) and let it go eventually, it takes a long time for the demuxer
> to process all the seek requests and finally continue playback at the last
> seek location.

This has been fixed in bug 1089586.
Blocks: 1210711
Found this article relevant.
http://www.codeproject.com/Articles/8295/MPEG-Audio-Frame-Header

So my explanation in the commit log of bug 1209410 (part4) was incorrect. The reason we see 128 frames is because the first 2 frames are actually a Xing header, and contains the seek table.
Add ByteReader::CanRead32() for convenience.
WIP: Read all VBR header info if available.
WIP: Use VBR/CBR headers TOC for efficient seeking.

This gives us near-instant seeking on most reasonable media lengths, but there are still some quirks to work out.

Both, VBR headers and the MediaResource interface can give us the total stream length, so we need robust fallback/trust mechanics in the case where these values differ.

The TOC doesn't provide enough resolution for second-exact seeking, so this still generates visible artifacts like playback beyond the final duration time.
I've also noticed some instabilities after repeated seeking actions with this patch, which I need to address.
Let's reduce the logging macro name for convenience (also helps with staying within line width limits).
Attachment #8674552 - Flags: review?(jyavenard)
For convenience.
Attachment #8671596 - Attachment is obsolete: true
Attachment #8674553 - Flags: review?(jyavenard)
Read the Xing/Info headers and extract all info we can get.
Attachment #8671598 - Attachment is obsolete: true
Attachment #8674555 - Flags: review?(jyavenard)
Use the Xing/Info TOC for efficient seeking.

This has still some rough edges, but from my testing it seemed stable and gives reasonable accuracy and good seeking performance.
Attachment #8671603 - Attachment is obsolete: true
Attachment #8674556 - Flags: review?(jyavenard)
Comment on attachment 8674552 [details] [diff] [review]
0001-Bug-1163667-Shorten-logging-macro.-r-jya.patch

Review of attachment 8674552 [details] [diff] [review]:
-----------------------------------------------------------------

Should use a generic demuxed log, common to all demuxers.
Attachment #8674552 - Flags: review?(jyavenard) → review+
Comment on attachment 8674553 [details] [diff] [review]
0002-Bug-1163667-Extend-ByteReader-interface-with-CanRead.patch

Review of attachment 8674553 [details] [diff] [review]:
-----------------------------------------------------------------

Should have a CanRead(size_t aCount) really.
Attachment #8674553 - Flags: review?(jyavenard) → review+
Comment on attachment 8674555 [details] [diff] [review]
0003-Bug-1163667-Read-complete-Xing-Info-header-info.-r-j.patch

Review of attachment 8674555 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM, however, seeking doesn't work for me.

Attempting to seek anywhere shows in the LOG:
34926592[120a81680]: MP3Demuxer GetSamples(1) Begin mOffset=37529865 mNumParsedFrames=272 mFrameIndex=89604 mTotalFrameLen=113686 mSamplesPerFrame=1152 mSamplesPerSecond=44100 mChannels=2
34926592[120a81680]: MP3Demuxer FindNext() Begin mOffset=37529865 mNumParsedFrames=272 mFrameIndex=89604 mTotalFrameLen=113686 mSamplesPerFrame=1152 mSamplesPerSecond=44100 mChannels=2
34926592[120a81680]: MP3Demuxer MP3TrackDemuxer::Read(70000214da00 37529865 4096)
34926592[120a81680]: MP3Demuxer MP3TrackDemuxer::Read        -> ReadAt(0)
31707136[127ad8300]: MediaFormatReader(122145a00)::Output: Decoded Audio sample time=2340609566 timecode=2340609566 kf=0 dur=26122
34926592[120a81680]: MP3Demuxer FindNext() Abort/EOS Exceeding MAX_SKIPPED_BYTES without a frame
34926592[120a81680]: MP3Demuxer FindNext() Exit foundFrame=0 mParser.CurrentFrame().Length()=0 
34926592[120a81680]: MP3Demuxer GetNext() Begin({mStart=0 Length()=0})
34926592[120a81680]: MP3Demuxer GetSamples() End mSamples.Size()=0 aNumSamples=0 mOffset=37529865 mNumParsedFrames=272 mFrameIndex=89604 mTotalFrameLen=113686 mSamplesPerFrame=1152 mSamplesPerSecond=44100 mChannels=2
34926592[120a81680]: MediaFormatReader(122145a00)::NotifyNewOutput: Received new Audio sample time:2340609566 duration:26122
34926592[120a81680]: MediaFormatReader(122145a00)::ScheduleUpdate: SchedulingUpdate(Audio)
34926592[120a81680]: MediaFormatReader(122145a00)::NotifyInputExhausted: Decoder has requested more Audio data
34926592[120a81680]: MediaFormatReader(122145a00)::OnDemuxFailed: Failed to demux audio, failure:1
34926592[120a81680]: MediaFormatReader(122145a00)::Update: Processing update for Audio

That's using http://podiobooks.com/title/a-coffee-crusade/

What about the VBRI TOC? Wouldn't it be more accurate than just relying on the number of frames?

::: dom/media/MP3Demuxer.cpp
@@ +869,3 @@
>  int64_t
> +FrameParser::VBRHeader::Offset(float aDurationFac) const {
> +  const float durationPer = 100.0f * std::min(0.98f, std::max(0.0f, aDurationFac));

sounds like you're attempting to do some rounding in here ; that's worth a comment describing the logic this 0.98f

@@ +898,5 @@
>  
>    // We have to search for the Xing header as its position can change.
> +  while (aReader->CanRead32() &&
> +         aReader->PeekU32() != XING_TAG && aReader->PeekU32() != INFO_TAG) {
> +    aReader->Read(1);

Any reason you're removing the upper limit?

That itself could be optimised rather than reading 1 byte at a time, but I guess if it's small enough it won't matter.

@@ +919,5 @@
> +  }
> +  if (flags & TOC && aReader->Remaining() >= TOC_SIZE) {
> +    if (mNumBytes <= 0) {
> +      // We don't have the stream size to calculate offsets, skip the TOC.
> +      aReader->Read(TOC_SIZE);

this seems weird ; the flags tell you that you have a TOC; yet you don't.
How would skipping just 100 bytes will skip the TOC, especially as it's not there to start with.
Shouldn't this be invalid instead?

::: dom/media/MP3Demuxer.h
@@ +218,5 @@
>      VBRHeaderType Type() const;
>  
> +    // Returns the total number of audio frames (excluding the VBR header frame)
> +    // expected in the stream/file.
> +    int32_t NumAudioFrames() const;

The renaming from Blah to BlahAudio seems redundant to me when it can never be anything else but audio AFAICT

@@ +256,5 @@
>      // The total number of frames expected as parsed from a VBR header.
> +    int32_t mNumAudioFrames;
> +
> +    // The total number of bytes expected in the stream.
> +    int32_t mNumBytes;

So a MP3 can never be > 2GB ?

The value is read with ReadU32 , so storing this into a signed int leaves room for non-handled overflow.

You seem to use the fact that a negative value means non initialized.

Using a Maybe<uint32_t> would be much more elegant IMHO, and prevent the issue of signed vs unsigned and overflow

@@ +262,5 @@
> +    // The VBR scale factor.
> +    int32_t mScale;
> +
> +    // The TOC table mapping duration percentage to byte offset.
> +    std::vector<int64_t> mTOC;

If mNumBytes is a signed 32 bits int, why would this need to be a 64 bits signed integer ?

And why signed while at it?
Attachment #8674555 - Flags: review?(jyavenard)
Comment on attachment 8674556 [details] [diff] [review]
0004-Bug-1163667-Use-Xing-TOC-for-efficient-seeking.-r-jy.patch

Review of attachment 8674556 [details] [diff] [review]:
-----------------------------------------------------------------

I'm going to reserve my comment as I couldn't verify if it worked.

::: dom/media/MP3Demuxer.cpp
@@ +101,4 @@
>  // MP3TrackDemuxer
>  
>  MP3TrackDemuxer::MP3TrackDemuxer(MediaResource* aSource)
> +  : mSource(aSource),

have the , at the beginning, aligned with :

@@ +214,1 @@
>      return TimeUnit::FromMicroseconds(-1);

note: IMHO that's bad, a negative timecode is bad and will cause havoc if that happens.

If you can't seek, then the IsSeekable() method should have returned false to start with.

::: dom/media/MP3Demuxer.h
@@ +412,5 @@
> +  // Returns the frame index for the given offset.
> +  int64_t DerivedFrameIndex(int64_t aOffset) const;
> +
> +  // Returns the frame index for the given time.
> +  int64_t DerivedFrameIndex(media::TimeUnit aTime) const;

this function can never return an number < 0... why the signed int?
Attachment #8674556 - Flags: review?(jyavenard)
(In reply to Jean-Yves Avenard [:jya] from comment #13)
> LGTM, however, seeking doesn't work for me.
> That's using http://podiobooks.com/title/a-coffee-crusade/

This stream doesn't have a Xing/Info TOC (or any VBR/CBR headers whatsoever), so we fall back to the speculative (based on average frame length) seeking, which is highly inaccurate. I'll remove this for streams without TOCs, so we fall back to the slow scan instead.
  
> What about the VBRI TOC? Wouldn't it be more accurate than just relying on
> the number of frames?

We don't parse VBRI TOC with this patches yet, but it's a simple addition. My first goal was to make it work well for the most common TOC format (Xing) first.

> Any reason you're removing the upper limit?
> 
> That itself could be optimised rather than reading 1 byte at a time, but I
> guess if it's small enough it won't matter.

We handle each field constraint individually now, which improves code readability. Since only the first 8 bytes of the header are mandatory (tag and flags) there isn't much we can do in terms of performance.
 
> this seems weird ; the flags tell you that you have a TOC; yet you don't.
> How would skipping just 100 bytes will skip the TOC, especially as it's not
> there to start with.
> Shouldn't this be invalid instead?

This is a case where the header contains the TOC (|flags & TOC|) but doesn't contain the NUM_BYTES field (optional). It's an unlikely scenario, but since we can't compute the byte offsets we just skip the TOC field in this case.

> The renaming from Blah to BlahAudio seems redundant to me when it can never
> be anything else but audio AFAICT

I wanted to make it clear that it does not include non-audio frames, like the VBR header itself.

> So a MP3 can never be > 2GB ?
> 
> The value is read with ReadU32 , so storing this into a signed int leaves
> room for non-handled overflow.
> 
> You seem to use the fact that a negative value means non initialized.
> 
> Using a Maybe<uint32_t> would be much more elegant IMHO, and prevent the
> issue of signed vs unsigned and overflow

Thanks for catching this, I will fix this to either using Maybe<> or int64_t, as I would like to avoid plain unsigned types for various reasons.

> If mNumBytes is a signed 32 bits int, why would this need to be a 64 bits
> signed integer ?
> 
> And why signed while at it?

Both, MediaResource and MediaDataDemuxer use int64_t for the byte offsets, I'm just keeping type consistency.
(In reply to Eugen Sawin [:esawin] from comment #15)
> (In reply to Jean-Yves Avenard [:jya] from comment #13)
> > LGTM, however, seeking doesn't work for me.
> > That's using http://podiobooks.com/title/a-coffee-crusade/
> 
> This stream doesn't have a Xing/Info TOC (or any VBR/CBR headers
> whatsoever), so we fall back to the speculative (based on average frame
> length) seeking, which is highly inaccurate. I'll remove this for streams
> without TOCs, so we fall back to the slow scan instead.
>   

I personally would prefer fast seek albeit inaccurate over super slow that makes you wait minutes to complete. But maybe that's just me. 

I don't see the link here however between the failure to seek completely and the fast seek

Can't we keep a compromise between those two situations?
So seeks aren't exceptionally long and still complete. 


> 
> > Any reason you're removing the upper limit?
> > 
> > That itself could be optimised rather than reading 1 byte at a time, but I
> > guess if it's small enough it won't matter.
> 
> We handle each field constraint individually now, which improves code
> readability. Since only the first 8 bytes of the header are mandatory (tag
> and flags) there isn't much we can do in terms of performance.
>  
> > this seems weird ; the flags tell you that you have a TOC; yet you don't.
> > How would skipping just 100 bytes will skip the TOC, especially as it's not
> > there to start with.
> > Shouldn't this be invalid instead?
> 
> This is a case where the header contains the TOC (|flags & TOC|) but doesn't
> contain the NUM_BYTES field (optional). It's an unlikely scenario, but since
> we can't compute the byte offsets we just skip the TOC field in this case.
> 
> > The renaming from Blah to BlahAudio seems redundant to me when it can never
> > be anything else but audio AFAICT
> 
> I wanted to make it clear that it does not include non-audio frames, like
> the VBR header itself.
> 
> > So a MP3 can never be > 2GB ?
> > 
> > The value is read with ReadU32 , so storing this into a signed int leaves
> > room for non-handled overflow.
> > 
> > You seem to use the fact that a negative value means non initialized.
> > 
> > Using a Maybe<uint32_t> would be much more elegant IMHO, and prevent the
> > issue of signed vs unsigned and overflow
> 
> Thanks for catching this, I will fix this to either using Maybe<> or
> int64_t, as I would like to avoid plain unsigned types for various reasons.
> 
> > If mNumBytes is a signed 32 bits int, why would this need to be a 64 bits
> > signed integer ?
> > 
> > And why signed while at it?
> 
> Both, MediaResource and MediaDataDemuxer use int64_t for the byte offsets,
> I'm just keeping type consistency.
(In reply to Jean-Yves Avenard [:jya] from comment #16)
> I personally would prefer fast seek albeit inaccurate over super slow that
> makes you wait minutes to complete. But maybe that's just me. 
> 
> I don't see the link here however between the failure to seek completely and
> the fast seek
> 
> Can't we keep a compromise between those two situations?
> So seeks aren't exceptionally long and still complete.

I absolutely agree and I'm sorry about the confusion. I've assumed by "doesn't work for me" you meant it's not using the new code path, the failure was caused by calling the wrong DerivedFrameIndex version (because I passed seconds/floats instead of TimeUnit). I've fixed that and also streamlined the FastSeek function a bit in the next version.
Sorry for the re-flagging, but I've extended the patch quite a bit. Also, the class contained a wild mix of coding styles, so I've refactored that to use something consistent.
Attachment #8674553 - Attachment is obsolete: true
Attachment #8676311 - Flags: review?(jyavenard)
Now using Maybe<> and some other comments addressed.
Attachment #8674555 - Attachment is obsolete: true
Attachment #8676312 - Flags: review?(jyavenard)
Addressed some comments, renamed FrameIndexFrom* functions to be more explicit and fixed/streamlined FastSeek.
Attachment #8674556 - Attachment is obsolete: true
Attachment #8676314 - Flags: review?(jyavenard)
Comment on attachment 8676311 [details] [diff] [review]
0002-Bug-1163667-Extend-and-refactor-ByteReader-interface.patch

Review of attachment 8676311 [details] [diff] [review]:
-----------------------------------------------------------------

Please separate functional from style patches.

This is not mozilla coding style anyway: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style

The MP3Demuxer introduced its own formatting, which itself was wrong IMHO.

If you want to fix the coding style, it's MP3Demuxer that needs modification.
Attachment #8676311 - Flags: review?(jyavenard) → review-
(In reply to Eugen Sawin [:esawin] from comment #18)
> Created attachment 8676311 [details] [diff] [review]
> 0002-Bug-1163667-Extend-and-refactor-ByteReader-interface.patch
> 
> Sorry for the re-flagging, but I've extended the patch quite a bit. Also,
> the class contained a wild mix of coding styles, so I've refactored that to
> use something consistent.

It seemed perfectly consistent to me:
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style

Empty function defined as {} is used thoroughly across our code.
otherwise 
type func_name
{
}
Comment on attachment 8676312 [details] [diff] [review]
0003-Bug-1163667-Read-complete-Xing-Info-header-info.-r-j.patch

Review of attachment 8676312 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MP3Demuxer.cpp
@@ +939,5 @@
> +    }
> +  }
> +  if (flags & VBR_SCALE && aReader->CanRead32()) {
> +    mScale.emplace(aReader->ReadU32());
> +  }

if the header isn't valid (e.g. contains bytes after the TOC) the ByteReader will cause a NS_ASSERTION ; not as bad as when it used to be a MOZ_ASSERT())

should we handle this case better?

@@ +964,4 @@
>      aReader->Seek(prevReaderOffset + OFFSET);
>      if (aReader->ReadU32() == TAG) {
>        aReader->Seek(prevReaderOffset + FRAME_COUNT_OFFSET);
> +      mNumAudioFrames.emplace(aReader->ReadU32());

I personally prefer mNumAudioFrames = Some(blah);

as emplace will assert if mNumAudioFrames already contains data.

::: dom/media/MP3Demuxer.h
@@ +220,5 @@
>      VBRHeaderType Type() const;
>  
> +    // Returns the total number of audio frames (excluding the VBR header frame)
> +    // expected in the stream/file.
> +    Maybe<uint32_t> NumAudioFrames() const;

you could make those const Maybe<T>& func() ...

would prevent an unnecessary copy
Attachment #8676312 - Flags: review?(jyavenard) → review+
Comment on attachment 8676314 [details] [diff] [review]
0004-Bug-1163667-Use-VBR-CBR-header-TOC-for-efficient-see.patch

Review of attachment 8676314 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MP3Demuxer.cpp
@@ +108,5 @@
> +  , mFrameIndex(0)
> +  , mTotalFrameLen(0)
> +  , mSamplesPerFrame(0)
> +  , mSamplesPerSecond(0)
> +  , mChannels(0)

that seems to be a bit outside the scope of this patch, and likely be better in a separate patch.
but I'm just being anal.

@@ +308,4 @@
>  MP3TrackDemuxer::Reset() {
>    MP3LOG("Reset()");
>  
> +  FastSeek(TimeUnit());

so you don't need to reset the members anymore ?

::: dom/media/MP3Demuxer.h
@@ +414,5 @@
> +  // Returns the frame index for the given offset.
> +  int64_t FrameIndexFromOffset(int64_t aOffset) const;
> +
> +  // Returns the frame index for the given time.
> +  int64_t FrameIndexFromTime(media::TimeUnit aTime) const;

make that const media::TimeUnit& aTime
Attachment #8676314 - Flags: review?(jyavenard) → review+
(In reply to Jean-Yves Avenard [:jya] from comment #23)
> if the header isn't valid (e.g. contains bytes after the TOC) the ByteReader
> will cause a NS_ASSERTION ; not as bad as when it used to be a MOZ_ASSERT())
> 
> should we handle this case better?

I'm not sure I understand. In which case would the ByteReader assert?
The VBR header can contain the optional quality indication field following the TOC field and there are also the extended Lame headers which contain even more info that we don't read. We essentially ignore the trailing "garbage" bytes in the VBR frame until the next MPEG frame sync bytes.

(In reply to Jean-Yves Avenard [:jya] from comment #24)
> that seems to be a bit outside the scope of this patch, and likely be better
> in a separate patch.
> but I'm just being anal.

I think checking coding style is an important part of the review, so thanks for that. More so, a linter/style check should be part of the build chain and the automated test process, but that would require an complete style guide to begin with.

> so you don't need to reset the members anymore ?

The demuxer's Reset() is used more frequently than I've anticipated when I originally wrote it. Not only is it used during initialization, it is also called before each seek operation, which would remove all the meta data we've collected to allow for efficient seeking.

So, I've changed Reset() to reset only the current seek-location-based data to keep all the first-frame/header meta data.
(In reply to Jean-Yves Avenard [:jya] from comment #24)
> ::: dom/media/MP3Demuxer.h
> @@ +414,5 @@
> > +  // Returns the frame index for the given offset.
> > +  int64_t FrameIndexFromOffset(int64_t aOffset) const;
> > +
> > +  // Returns the frame index for the given time.
> > +  int64_t FrameIndexFromTime(media::TimeUnit aTime) const;
> 
> make that const media::TimeUnit& aTime

I agree that all non-POD types should be passed as const-references. But then I would like to fix that on the Media*Demuxer interfaces, too, or we would have inconsistent handling regarding TimeUnit.
Re-resubmitting due to new patch name format.
Attachment #8674552 - Attachment is obsolete: true
Attachment #8677628 - Flags: review+
Minimal ByteReader extension patch, carrying over r+. Further refactoring will be carried out in a follow-up bug.
Attachment #8676311 - Attachment is obsolete: true
Attachment #8677631 - Flags: review+
Addressed all review comments.
Attachment #8676312 - Attachment is obsolete: true
Attachment #8677633 - Flags: review+
I had to add an additional check in FastSeek to prevent calls to StreamLength() during initialization, which would break when using the MockMediaResource during tests.
Carrying over r+ due to minor changes only.
Attachment #8676314 - Attachment is obsolete: true
Attachment #8677636 - Flags: review+
This is an optional change, but I think it should help with efficiency in general when scanning for next frame (during playback and seeking).

We've been reading 4kB chunks when scanning for the next frame header, which is unnecessarily large.

During initialization, we skip over the complete ID3v2 tags, it is enough to read the ID3v2 header which is 10 bytes in size and expected to be at the beginning of the stream.
Mid-stream, we skip over the MPEG frame after reading the complete frame in an additional Read call with the exact size as provided by the MPEG header.
Which means that, on average, we're reading 4kB to find a 4 bytes header which is (usually) found within the first bytes of the chunk and discard the rest of the chunk.
Attachment #8677642 - Flags: review?(jyavenard)
This came up during testing, it looks like we haven't been draining the demuxer queue properly.
Attachment #8677645 - Flags: review?(snorp)
(In reply to Eugen Sawin [:esawin] from comment #25)
> (In reply to Jean-Yves Avenard [:jya] from comment #23)
> > if the header isn't valid (e.g. contains bytes after the TOC) the ByteReader
> > will cause a NS_ASSERTION ; not as bad as when it used to be a MOZ_ASSERT())
> > 
> > should we handle this case better?
> 
> I'm not sure I understand. In which case would the ByteReader assert?
> The VBR header can contain the optional quality indication field following
> the TOC field and there are also the extended Lame headers which contain
> even more info that we don't read. We essentially ignore the trailing
> "garbage" bytes in the VBR frame until the next MPEG frame sync bytes.

If the VBR header is longer than expected, and as you only read a set amount of bytes, when the ByteReader destructor is called and there are still some bytes left, it will assert.

That's why you typically call ByteReader::DiscardRemaining when done.
(In reply to Eugen Sawin [:esawin] from comment #26)
> (In reply to Jean-Yves Avenard [:jya] from comment #24)
> > ::: dom/media/MP3Demuxer.h
> > @@ +414,5 @@
> > > +  // Returns the frame index for the given offset.
> > > +  int64_t FrameIndexFromOffset(int64_t aOffset) const;
> > > +
> > > +  // Returns the frame index for the given time.
> > > +  int64_t FrameIndexFromTime(media::TimeUnit aTime) const;
> > 
> > make that const media::TimeUnit& aTime
> 
> I agree that all non-POD types should be passed as const-references. But
> then I would like to fix that on the Media*Demuxer interfaces, too, or we
> would have inconsistent handling regarding TimeUnit.

The reason TimeUnit are passed in value in the MediaDataDemuxer virtual members is so they can be used with ProxyMediaCall (or whatever it's called now) MozPromises. You can't pass reference there unfortunately, as by the time the task run the original reference is no longer valid.

I don't see why you would need such use, and the function being a private member the need for declaration consistency is rather unimportant IMHO
Comment on attachment 8677642 [details] [diff] [review]
0005-Bug-1163667-5.1-Decrease-buffer-size-when-scanning-f.patch

Review of attachment 8677642 [details] [diff] [review]:
-----------------------------------------------------------------

What about streamed MP3 like typically used with Internet radios. We may start the stream at any point in time. Will we always find the header there if there are one or the start of a MP3 frame?
Attachment #8677642 - Flags: review?(jyavenard) → review+
(In reply to Jean-Yves Avenard [:jya] from comment #33)
> If the VBR header is longer than expected, and as you only read a set amount
> of bytes, when the ByteReader destructor is called and there are still some
> bytes left, it will assert.
> 
> That's why you typically call ByteReader::DiscardRemaining when done.

Oh, right, I see what you mean now. The ByteReader instance is owned by GetNextFrame, which calls DiscardRemaining after the VBR parsing is done, that's how we do it with all Parse* functions.

(In reply to Jean-Yves Avenard [:jya] from comment #34)
> The reason TimeUnit are passed in value in the MediaDataDemuxer virtual
> members is so they can be used with ProxyMediaCall (or whatever it's called
> now) MozPromises. You can't pass reference there unfortunately, as by the
> time the task run the original reference is no longer valid.
> 
> I don't see why you would need such use, and the function being a private
> member the need for declaration consistency is rather unimportant IMHO

I've already updated all TimeUnit arguments in the latest patch versions, but it's good to know the reason behind passing it by value there.

(In reply to Jean-Yves Avenard [:jya] from comment #35)
> What about streamed MP3 like typically used with Internet radios. We may
> start the stream at any point in time. Will we always find the header there
> if there are one or the start of a MP3 frame?

In theory, we can set the buffer arbitrarily small (1 byte) and it should still work since the frame parser is a state machine and keeps track of the parse state. We don't expect to find the first frame at offset 0 and the parser should be rather robust against garbage/fragmented data between frames as long as it doesn't resemble a valid MPEG header.

That said, the choice of the buffer size is mostly a matter of finding a good balance for performance. Ideally, we would only parse enough to include the next header to avoid reading (and re-reading) redundant chunks. A small buffer would have a negative effect on performance for such a live stream only for the first frame - assuming we land right past a frame header, that would be 1-30 read iterations until we reach the next header with a 64 byte buffer.

For well-behaving streams, we're currently reading at least 3 full frames (10 on average), find the 4 bytes of header, discard the rest, re-read the first frame found and then skip past it. This is rather wasteful, I don't think we should optimize for the worst case (landing mid-stream, lots of fragmentation/garbage bytes).
Attachment #8677645 - Flags: review?(snorp) → review+
with this patch, it works very nicely now. seeks are almost instantaneous: awesome !
Duplicate of this bug: 1210711
Priority: -- → P1
I'm splitting off patch 5 (buffer size decreasing) into bug 1217802 as it's not directly related with the seeking improvements.
Attachment #8677642 - Attachment is obsolete: true
Attachment #8677645 - Attachment is obsolete: true
Attachment #8678073 - Flags: review+
Added additional assertions and checks to prevent out-of-bounds access on the TOC vector.
Attachment #8677633 - Attachment is obsolete: true
Attachment #8678953 - Flags: review+
None of the Try bustage seems to be related to the patches https://treeherder.mozilla.org/#/jobs?repo=try&revision=ae12089f2da9
You need to log in before you can comment on or make changes to this bug.