Implement decodeAudioData

RESOLVED FIXED in mozilla21

Status

()

Core
Web Audio
RESOLVED FIXED
5 years ago
7 months ago

People

(Reporter: Ehsan, Assigned: Ehsan)

Tracking

(Depends on: 1 bug, Blocks: 1 bug)

Trunk
mozilla21
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments, 5 obsolete attachments)

So I'm in the process of implementing decodeAudioData.  This function expects a typed array containing the raw bytes of the media resource, presumably obtained through XHR or something.

Now, ideally I would be able to use the new nsMediaSniffer class to get the mime type of the audio buffer and create the right decoder object and throw it at the problem, except that the decoder objects seem to have access to an nsHTMLMediaElement, which is overkill for this purpose.  Is there an easier way that I can decode audio data that is sitting in a buffer?  I don't really want to create one nsHTMLMediaElement per each call to decodeAudioData...
Yes, the decoder objects need some refactoring.

I'd lower the priority of this feature and work on the raw array buffer access instead, and hope we can find someone else to wrangle the decoder objects.
(In reply to comment #1)
> I'd lower the priority of this feature and work on the raw array buffer access
> instead, and hope we can find someone else to wrangle the decoder objects.

OK, sounds good.
Depends on: 810177
Depends on: 812032
Depends on: 812572
Depends on: 813209
Depends on: 813433
Depends on: 813438
Depends on: 814284
Depends on: 814756
Depends on: 814757
Depends on: 814789
Depends on: 815226
Depends on: 816828
Depends on: 818327
Depends on: 818672
Created attachment 690690 [details] [diff] [review]
Patch (v1)

OK, I finally have something which seems to work.

The only part that I have not made sure works correctly is the resampling part.  I am still working on that.  I also need to write tests for this, but I'll probably do that in a separate patch, once I figure out what the correct way of testing this would be.

Requesting review from Boris on the dom parts, the threading stuff (please pay extra attention to those and on the overall architecture.  Requesting review from Chris Pearce and Paul on the decoding and resampling bits.  Requesting review from Chris Double since this patch uses the plugin decoder and he explicitly asked me to let him know on related changes.
Attachment #690690 - Flags: review?(paul)
Attachment #690690 - Flags: review?(cpearce)
Attachment #690690 - Flags: review?(chris.double)
Attachment #690690 - Flags: review?(bzbarsky)
Comment on attachment 690690 [details] [diff] [review]
Patch (v1)

Review of attachment 690690 [details] [diff] [review]:
-----------------------------------------------------------------

r=cpearce if you either explain or address the following:

* How do you limit the amount of memory allocated to the decoded frames? I imagine that a large file with all zero samples would compress very well and it would be every easy to OOM a 32bit Firefox.
* How do you ensure the AudioData are deleted (see my comment in MediaBufferDecoder.cpp)?

::: content/media/webaudio/MediaBufferDecoder.cpp
@@ +46,5 @@
> +namespace mozilla {
> +
> +/**
> + * This class provides a decoder object which decodes a media file that is
> + * The AbstractMediaDecoder class describes the public interface for a media decoder

This comment looks like it's two half comments.

@@ +110,5 @@
> +};
> +
> +NS_IMPL_ISUPPORTS0(BufferDecoder)
> +
> +BufferDecoder::BufferDecoder(nsIThread* aDecodeThread,

Indentation looks off here.

@@ +434,5 @@
> +    return;
> +  }
> +
> +  // decode the media
> +  decoderReader->Init(nullptr);

You need to check if Init() fails, and respond appropriately if it does.

@@ +477,5 @@
> +MediaDecodeTask::AllocateBuffer()
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  AudioData* firstBuffer = mDecodeJob.mDecoderReader->AudioQueue().PopFront();

Should this be AudioQueue().PeekFront()? It looks like you're leaking firstBuffer, and won't copy it in CopyBuffer() below.

@@ +520,5 @@
> +
> +  MediaQueue<AudioData>& audioQueue = mDecodeJob.mDecoderReader->AudioQueue();
> +
> +  uint32_t frameCounter = 0;
> +  while (AudioData* audioData = audioQueue.PopFront()) {

I don't understand how you're deleting the AudioData objects, but they MOZ_COUNT_CTOR, so you should know if you're leaking them...

We always store the pop'd AudioData inside AutoPtrs so that they're deleted, why don't you need to do the same here?
Attachment #690690 - Flags: review?(cpearce) → review+

Comment 5

5 years ago
Comment on attachment 690690 [details] [diff] [review]
Patch (v1)

Please document that the void* arg to SetChannelDataFromArrayBuffer must have come from JS_AllocateArrayBufferContents or JS_StealArrayBufferContents.  Maybe call it SetChannelDataFromArrayBufferContents?

>+++ b/content/media/webaudio/AudioContext.h
>+class DecodeJob;

struct?

>+++ b/content/media/webaudio/AudioContext.cpp
>+  nsContentUtils::HoldJSObjects(this, NS_CYCLE_COLLECTION_PARTICIPANT(AudioContext));

Please document why?  I don't understand why it's needed, for example...

>+    // The spec is not clear what we should do here.  For now, just return.

That seems pretty weird, since we'll never invoke the callbacks....  Does the spec really not define something like "if not throwing must invoke callbacks" or whatnot?

>+  mDecoder.AsyncDecodeMedia(PromiseFlatCString(contentType).get(),

contentType is already flat.  Just contentType.get(), please.

>+  delete aDecodeJob;

Why not make mDecodeJobs an nsTArray< nsAutoPtr<DecodeJob> >?

>+++ b/content/media/webaudio/MediaBufferDecoder.cpp

>+ * This class provides a decoder object which decodes a media file that is

That is what?  The rest of this comment doesn't make sense.

>+BufferDecoder::BufferDecoder(nsIThread* aDecodeThread,
>+                                       MediaResource* aResource)

Please fix indent.  Also, assert that the ctor/dtor are only called on main thread?

>+  MediaDecodeTask(const char* aContentType, void* aBuffer,

Isn't the buffer uin8_t*, here and elsewhere?  I'd vastly prefer that!

This constructor should assert it's on the main thread, right?

>+    MOZ_ASSERT(aLength);

How can you assert that?  That length is passed in from JS; nothing says you can't have 0-length typed arrays.

>+    nsCOMPtr<nsPIDOMWindow> pWindow = do_QueryInterface(mDecodeJob.mContext->GetParentObject());
>+    if (pWindow) {
>+      mPrincipal = pWindow->GetExtantDoc()->NodePrincipal();

Please QI to nsIScriptObjectPrincipal instead, then GetPrincipal() on it.  That will handle various edge cases (like "document has been unloaded") gracefully.

>+  void ReportFailureOnMainThread() {
....
>+        new ReportResultTask(mDecodeJob, &DecodeJob::OnSuccess);

OnFailure, no?

Why do you need to null-check "event" here?  I'd think it can't be null.

>+MediaDecodeTask::Decode()

This is adding an off-main-thread refcount of the principal.  :(  Ah, well, the hope of making that not threadsafe is pretty low anyway.

>+MediaDecodeTask::AllocateBuffer()
>+  AudioData* firstBuffer = mDecodeJob.mDecoderReader->AudioQueue().PopFront();

PeekFront(), perhaps?

>+float Upscale(float in)
>+float Downscale(float in)

These need documentation.  I can't tell whether the magic numbers are right as things stand.

>+MediaDecodeTask::CopyBuffer()
>+      for (uint32_t xxx = 0; xxx < inSamples; ++xxx) {

At least "idx" instead of "xxx"?  ;)

I'm taking on faith that the channel operations here are correct.  Let me know if I shouldn't do that.

DecodeJob methods should assert which thread they're called on, ideally (looks like main-thread for most of them).

>+DecodeJob::AllocateBuffer(uint32_t aFrames)

Again, assuming that someone else is checking that the buffer sizes allocated here are big enough, etc.  Please let me know if not.

Trying to understand how this shutdown stuff works.  What makes sure to notify failure or something when we shut down the MediaBufferDecoder?  And in particular to delete all the DecodeJobs?

Do we really want one threadpool per audiocontext, or a single shared threadpool across audiocontexts?

>+DecodeJob::DecodeJob(const nsACString& aContentType,
>+  , mObj(aBuffer.Obj())

That member seems unused (which is good, since we don't root it!).  Can we remove it?

>+  , mBuffer(aBuffer.Data())

Again, make mBuffer a uint8_t*.

>+DecodeJob::OnSuccess()
>+  // call the failure callback

Fix comment?  Similar in OnFailure.

>+++ b/content/media/webaudio/MediaBufferDecoder.h
>+struct DecodeJob {

It's a bit weird to have mozilla::DecodeJob for something this generic in a header...  And the ctor/dtor logging claimed this was an inner class of AudioContext.  Is that feasible?  If not, at least fix that logging?

>+++ b/dom/webidl/AudioContext.webidl
>+                         optional DecodeErrorCallback errorCallback);

If I were designing this API, I think I would make this be:

                         optional DecodeErrorCallback? errorCallback = null);

so that people can pass in null or undefined to mean "no callback" if they want, fwiw.

r=me with the above either fixed or explained.
Attachment #690690 - Flags: review?(bzbarsky) → review+
(In reply to Chris Pearce (:cpearce) from comment #4)
> Comment on attachment 690690 [details] [diff] [review]
> Patch (v1)
> 
> Review of attachment 690690 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=cpearce if you either explain or address the following:
> 
> * How do you limit the amount of memory allocated to the decoded frames? I
> imagine that a large file with all zero samples would compress very well and
> it would be every easy to OOM a 32bit Firefox.

For the buffers that I allocate myself are all fallible allocations, which never OOM.  I was _assuming_ that the allocations that the decoder reader class does are fallible too.  If they're not, then we should fix them (probably in a separate bug though.)

> * How do you ensure the AudioData are deleted (see my comment in
> MediaBufferDecoder.cpp)?
> 
> @@ +477,5 @@
> > +MediaDecodeTask::AllocateBuffer()
> > +{
> > +  MOZ_ASSERT(NS_IsMainThread());
> > +
> > +  AudioData* firstBuffer = mDecodeJob.mDecoderReader->AudioQueue().PopFront();
> 
> Should this be AudioQueue().PeekFront()? It looks like you're leaking
> firstBuffer, and won't copy it in CopyBuffer() below.

Yes, thanks for catching this!!!  I wasted almost a day looking for this bug... (which is a pretty recent part of my patch.) :/

> @@ +520,5 @@
> > +
> > +  MediaQueue<AudioData>& audioQueue = mDecodeJob.mDecoderReader->AudioQueue();
> > +
> > +  uint32_t frameCounter = 0;
> > +  while (AudioData* audioData = audioQueue.PopFront()) {
> 
> I don't understand how you're deleting the AudioData objects, but they
> MOZ_COUNT_CTOR, so you should know if you're leaking them...
> 
> We always store the pop'd AudioData inside AutoPtrs so that they're deleted,
> why don't you need to do the same here?

I'm not sure if I understand this.  Why do I need to delete them explicitly?  ~MediaQueue() seems to do that for me...  (And I don't see any leaks -- unless something is terribly wrong.)
(In reply to Ehsan Akhgari [:ehsan] from comment #6)
> (In reply to Chris Pearce (:cpearce) from comment #4)
> > Comment on attachment 690690 [details] [diff] [review]
> > Patch (v1)
> > 
> > Review of attachment 690690 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > r=cpearce if you either explain or address the following:
> > 
> > * How do you limit the amount of memory allocated to the decoded frames? I
> > imagine that a large file with all zero samples would compress very well and
> > it would be every easy to OOM a 32bit Firefox.
> 
> For the buffers that I allocate myself are all fallible allocations, which
> never OOM.  I was _assuming_ that the allocations that the decoder reader
> class does are fallible too.  If they're not, then we should fix them
> (probably in a separate bug though.)

We're using operator new inside the Reader classes to when we allocate the AudioDataValue arrays that we store in AudioData::mAudioData. Operator new is infallible right?

It looks like AudioData::mAudioBuffer is allocated with moz_xmalloc, which is infallible, but I'm not sure what paths we use that on.
 

> 
> > @@ +520,5 @@
> > > +
> > > +  MediaQueue<AudioData>& audioQueue = mDecodeJob.mDecoderReader->AudioQueue();
> > > +
> > > +  uint32_t frameCounter = 0;
> > > +  while (AudioData* audioData = audioQueue.PopFront()) {
> > 
> > I don't understand how you're deleting the AudioData objects, but they
> > MOZ_COUNT_CTOR, so you should know if you're leaking them...
> > 
> > We always store the pop'd AudioData inside AutoPtrs so that they're deleted,
> > why don't you need to do the same here?
> 
> I'm not sure if I understand this.  Why do I need to delete them explicitly?
> ~MediaQueue() seems to do that for me...  (And I don't see any leaks --
> unless something is terribly wrong.)


~MediaQueue() only deletes the AudioData objects still in the queue, but you're popping all AudioData objects out of the queue, so ~MediaQueue() won't delete them. AudioData objects aren't refcounted, they need to be deleted explicitly.
(In reply to Boris Zbarsky (:bz) from comment #5)
> >+++ b/content/media/webaudio/AudioContext.cpp
> >+  nsContentUtils::HoldJSObjects(this, NS_CYCLE_COLLECTION_PARTICIPANT(AudioContext));
> 
> Please document why?  I don't understand why it's needed, for example...

Sure, will do.  The reason is that mDecodeJobs holds on to JS objects.

> >+    // The spec is not clear what we should do here.  For now, just return.
> 
> That seems pretty weird, since we'll never invoke the callbacks....  Does
> the spec really not define something like "if not throwing must invoke
> callbacks" or whatnot?

Right, the spec bug here was https://www.w3.org/Bugs/Public/show_bug.cgi?id=20039 and it was fixed, so I should fix this too to call the failure callback.  Note to self: I need to send a message to the event loop to call the callback which is a pain.

> >+  delete aDecodeJob;
> 
> Why not make mDecodeJobs an nsTArray< nsAutoPtr<DecodeJob> >?

I can do that.

> >+    MOZ_ASSERT(aLength);
> 
> How can you assert that?  That length is passed in from JS; nothing says you
> can't have 0-length typed arrays.

If the input buffer is empty, the content type sniffing will fail, so we'll never get to here.  I should probably add a comment about this though...

> >+    nsCOMPtr<nsPIDOMWindow> pWindow = do_QueryInterface(mDecodeJob.mContext->GetParentObject());
> >+    if (pWindow) {
> >+      mPrincipal = pWindow->GetExtantDoc()->NodePrincipal();
> 
> Please QI to nsIScriptObjectPrincipal instead, then GetPrincipal() on it. 
> That will handle various edge cases (like "document has been unloaded")
> gracefully.

Sure.

> >+  void ReportFailureOnMainThread() {
> ....
> >+        new ReportResultTask(mDecodeJob, &DecodeJob::OnSuccess);
> 
> OnFailure, no?

Oops!

> Why do you need to null-check "event" here?  I'd think it can't be null.

Because I'm silly?

> >+MediaDecodeTask::Decode()
> 
> This is adding an off-main-thread refcount of the principal.  :(  Ah, well,
> the hope of making that not threadsafe is pretty low anyway.

Good thing is that the debugging macros will catch us if we attempt to do that...

> >+MediaDecodeTask::AllocateBuffer()
> >+  AudioData* firstBuffer = mDecodeJob.mDecoderReader->AudioQueue().PopFront();
> 
> PeekFront(), perhaps?

Yeah :/

> >+float Upscale(float in)
> >+float Downscale(float in)
> 
> These need documentation.  I can't tell whether the magic numbers are right
> as things stand.

I'm expecting to rewrite this part when Paul tells me how wrong I am in using libspeex anyways.  ;-)

> >+MediaDecodeTask::CopyBuffer()
> >+      for (uint32_t xxx = 0; xxx < inSamples; ++xxx) {
> 
> At least "idx" instead of "xxx"?  ;)

See above! ^

> I'm taking on faith that the channel operations here are correct.  Let me
> know if I shouldn't do that.

Please don't.  Honestly I don't have a lot of confidence in this patch yet, especially since I've not been very successful in testing it properly without playback support. :(

> DecodeJob methods should assert which thread they're called on, ideally
> (looks like main-thread for most of them).

Will do.

> >+DecodeJob::AllocateBuffer(uint32_t aFrames)
> 
> Again, assuming that someone else is checking that the buffer sizes
> allocated here are big enough, etc.  Please let me know if not.

AudioBuffer should do that.  It uses the JS API to allocate the typed arrays, and FallibleTArray to allocate the channels array.

> Trying to understand how this shutdown stuff works.  What makes sure to
> notify failure or something when we shut down the MediaBufferDecoder?

Bug 814789.

>  And in particular to delete all the DecodeJobs?

They're owned by the AudioContext.  Or do you have another concern?

> Do we really want one threadpool per audiocontext, or a single shared
> threadpool across audiocontexts?

I think we want a global thread pool.  I wanted to do that here but this patch was already getting too big and taking too long, so I didn't.  But that's not a good reason!  I filed bug 821062 about that.

> >+DecodeJob::DecodeJob(const nsACString& aContentType,
> >+  , mObj(aBuffer.Obj())
> 
> That member seems unused (which is good, since we don't root it!).  Can we
> remove it?

Yeah, I probably meant to remove it earlier.  :-)

> >+++ b/content/media/webaudio/MediaBufferDecoder.h
> >+struct DecodeJob {
> 
> It's a bit weird to have mozilla::DecodeJob for something this generic in a
> header...  And the ctor/dtor logging claimed this was an inner class of
> AudioContext.  Is that feasible?  If not, at least fix that logging?

Gah, yeah, this used to be an inner class... but I don't remember why I had to take it out.  I think I did that because of some kind of #include hell which I couldn't figure out how to fix...  Worst comes to worse, I can give it a name such as WebAudioDecodeJob.  Would that be fine as a last resort?

> >+++ b/dom/webidl/AudioContext.webidl
> >+                         optional DecodeErrorCallback errorCallback);
> 
> If I were designing this API, I think I would make this be:
> 
>                          optional DecodeErrorCallback? errorCallback = null);
> 
> so that people can pass in null or undefined to mean "no callback" if they
> want, fwiw.

https://www.w3.org/Bugs/Public/show_bug.cgi?id=20372
(In reply to Chris Pearce (:cpearce) from comment #7)
> (In reply to Ehsan Akhgari [:ehsan] from comment #6)
> > (In reply to Chris Pearce (:cpearce) from comment #4)
> > > Comment on attachment 690690 [details] [diff] [review]
> > > Patch (v1)
> > > 
> > > Review of attachment 690690 [details] [diff] [review]:
> > > -----------------------------------------------------------------
> > > 
> > > r=cpearce if you either explain or address the following:
> > > 
> > > * How do you limit the amount of memory allocated to the decoded frames? I
> > > imagine that a large file with all zero samples would compress very well and
> > > it would be every easy to OOM a 32bit Firefox.
> > 
> > For the buffers that I allocate myself are all fallible allocations, which
> > never OOM.  I was _assuming_ that the allocations that the decoder reader
> > class does are fallible too.  If they're not, then we should fix them
> > (probably in a separate bug though.)
> 
> We're using operator new inside the Reader classes to when we allocate the
> AudioDataValue arrays that we store in AudioData::mAudioData. Operator new
> is infallible right?

It is.

> It looks like AudioData::mAudioBuffer is allocated with moz_xmalloc, which
> is infallible, but I'm not sure what paths we use that on.

OK.  Should I file a bug about this stuff?

> > > @@ +520,5 @@
> > > > +
> > > > +  MediaQueue<AudioData>& audioQueue = mDecodeJob.mDecoderReader->AudioQueue();
> > > > +
> > > > +  uint32_t frameCounter = 0;
> > > > +  while (AudioData* audioData = audioQueue.PopFront()) {
> > > 
> > > I don't understand how you're deleting the AudioData objects, but they
> > > MOZ_COUNT_CTOR, so you should know if you're leaking them...
> > > 
> > > We always store the pop'd AudioData inside AutoPtrs so that they're deleted,
> > > why don't you need to do the same here?
> > 
> > I'm not sure if I understand this.  Why do I need to delete them explicitly?
> > ~MediaQueue() seems to do that for me...  (And I don't see any leaks --
> > unless something is terribly wrong.)
> 
> 
> ~MediaQueue() only deletes the AudioData objects still in the queue, but
> you're popping all AudioData objects out of the queue, so ~MediaQueue()
> won't delete them. AudioData objects aren't refcounted, they need to be
> deleted explicitly.

Ah I see...  Right, I'll fix it.

Comment 10

5 years ago
> Sure, will do.  The reason is that mDecodeJobs holds on to JS objects.

It does?  Where?  Where do we trace them?  Just saying we have JS objects is not enough; we have to actually trace them in our trace callback...

> Note to self: I need to send a message to the event loop to call the callback which is a
> pain.

nsRunnableMethod doesn't make it simple enough?

> I should probably add a comment about this though...

Yes, indeed.

> Please don't.  

OK.  I'll sit down and try to sort through what that's doing, but I might have to catch you to talk it over...

> AudioBuffer should do that.

No, I meant making sure that the buffers being allocated are the right size.  That is, that AudioBuffer is being told the right things about buffer sizes.

> They're owned by the AudioContext.  Or do you have another concern?

DecodeJob holds a strong ref to the AudioContext.  AudioContext owns the DecodeJob.  The only way the whole thing can go away is if we land in RemoveFromDecodeQueue for every single DecodeJob, which means landing in OnSuccess or OnFailure, right?  Or am I missing something?

> I think we want a global thread pool.

OK, good.  Followup as you filed is fine.

> WebAudioDecodeJob.  Would that be fine as a last resort?

Absolutely.
(In reply to Boris Zbarsky (:bz) from comment #10)
> > Sure, will do.  The reason is that mDecodeJobs holds on to JS objects.
> 
> It does?  Where?  Where do we trace them?  Just saying we have JS objects is
> not enough; we have to actually trace them in our trace callback...

It holds on to an AudioBuffer (the mOutput member).  We do the tracing inside AudioBuffer itself.  Is that not enough?

> > Note to self: I need to send a message to the event loop to call the callback which is a
> > pain.
> 
> nsRunnableMethod doesn't make it simple enough?

It does!

> > Please don't.  
> 
> OK.  I'll sit down and try to sort through what that's doing, but I might
> have to catch you to talk it over...

Sounds good.

> > AudioBuffer should do that.
> 
> No, I meant making sure that the buffers being allocated are the right size.
> That is, that AudioBuffer is being told the right things about buffer sizes.

I think that should be ok, but I'll double check.

> > They're owned by the AudioContext.  Or do you have another concern?
> 
> DecodeJob holds a strong ref to the AudioContext.  AudioContext owns the
> DecodeJob.  The only way the whole thing can go away is if we land in
> RemoveFromDecodeQueue for every single DecodeJob, which means landing in
> OnSuccess or OnFailure, right?  Or am I missing something?

No, you're right.  Incidentally I figured out that I was not actually doing any leak detection at all when I was testing this before submitting the patch, so I need to recheck all of the lifetimes again.  And this one for sure is at least one of the bugs that I should fix...

Comment 12

5 years ago
> It holds on to an AudioBuffer (the mOutput member).  We do the tracing inside AudioBuffer
> itself.

Then why does AudioContext need to hold/drop js objects?  Presumably the AudioBuffer has done that already, and it has a nonzero refcount, so it will trace them.

Comment 13

5 years ago
OK, so I think I understand the various bits about buffer sizes better now.  I assume the number of channels is constant across our AudioData structs in the queue?

I think it would be clearer if DecodeJob had a member (mResampledFrames?) for the number of frames it will output in the end, that being "destSampleRate * audioData->mFrames / sourceSampleRate".  Then we could make it clearer (e.g. via comments) that we have allocated an AudioBuffer with the relevant number of channels and that many frames.  So each of the typed arrays, one per channel, will contain mResampledFrames entries.

Now in CopyBuffer we go to copy the data into there.  Fine.  frameCounter keeps track of the overall frame we're in (and we should add some assert in the loop from 0 to frameCount that frameCounter never gets bigger than mDecodeJob.mResampledFrames or something).  We're writing the data into the frameCounter entry of each typed array, also fine.

But the data we're writing makes no sense.  Ignoring for the moment the resampling bit, resampledBuffer is whatever is in the AudioBuffer created by EnsureAudioBuffer.  This is only a float* buffer in some cases; in other cases it's an int16_t* buffer.  See how AUDIO_OUTPUT_FORMAT is conditionally defined to different formats and AudioDataValue is set accordingly.  So at the very least, bufferData should be a AudioDataValue*, right?  And you should probably use speex_resampler_process_interleaved_int as needed for resampling.

But past that, the buffer returned by EnsureAudioBuffer has the data in the following form, as far as I can see: first mFrames samples for channel 0, then mFrames samples for channel 1, etc.  Is that really what the interleaved_* resampler wants?  It's not very interleaved....

Anyway, the last problem is that the data we write to each channel's frameCounter entry is resampledBuffer[i+j].  At least in the non-resampled case, that makes no sense if I understood the format of the data in the bufferData correctly.  It doesn't even make sense of I understood it wrong and the bufferData has all the samples for frame 0, then all the ones for frame 1, etc.  Somewhere here we need to multiply one of i and j by either mChannels or frameCount, no?

Oh, last nit: We should also assert that in SetChannelDataFromArrayBuffer once we're done setting up the typed arrays their lengths match the dom::AudioBuffer mLength.
(In reply to comment #12)
> > It holds on to an AudioBuffer (the mOutput member).  We do the tracing inside AudioBuffer
> > itself.
> 
> Then why does AudioContext need to hold/drop js objects?  Presumably the
> AudioBuffer has done that already, and it has a nonzero refcount, so it will
> trace them.

OK, I thought that we need to hold/drop transitively!
Hrm, ok.  Thanks for the analysis, Boris!  It seems like I was relying too much on the media decoder APIs.  :(

Chris, is there a better way for us to tell what format (int16/float) the decoded data is, whether it's interleaved, etc. that guesswork our way throughout the media code?
Matthew Gregan is the best person to ask about audio stuff. Roughly AudioDataValue is float32 on desktop and signed 16 on mobile.

Dunno about how to tell about interleaving though, Matthew might have a better idea.

Comment 17

5 years ago
> OK, I thought that we need to hold/drop transitively!

Ah, no.  All holding does is make sure that when GC happens your trace hook gets called so you can trace your JS objects.  So only the thing with the trace hook needs to hold.

> Chris, is there a better way for us to tell what format (int16/float) the decoded data
> is

Does testing sizeof(AudioDataValue) == sizeof(AudioSampleTraits<AUDIO_FORMAT_FLOAT32>::Type) work perhaps?  And similar with AudioSampleTraits<AUDIO_FORMAT_S16>::Type?

I also don't know about the interleaving stuff.  I was just basing my comments on trying to read the media code that produces that buffer.... and I have no idea what input speex wants.
Comment on attachment 690690 [details] [diff] [review]
Patch (v1)

Review of attachment 690690 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaDecoderReader.h
@@ +344,5 @@
> +  uint32_t ChannelCount() {
> +    ReentrantMonitorAutoEnter mon(mReentrantMonitor);
> +    T* first = PeekFront();
> +    return first->mChannels;
> +  }

Maybe you want to check the number of elements contained in the deque before doing that. Actually, an assert would make sense, since this should not be called at all when the deque is empty.

::: content/media/webaudio/AudioContext.cpp
@@ +146,5 @@
> +                  aBuffer.Data(), aBuffer.Length(),
> +                  contentType);
> +  if (contentType.IsEmpty()) {
> +    // The spec is not clear what we should do here.  For now, just return.
> +    return;

The spec says:

> If a decoding error is encountered due to the audio format not being recognized
> [...] the errorCallback will be scheduled to run on the main thread's event loop and 
> these steps will be terminated.

We are in the _not being recognized_ case here.

In any case, warning the author that the browser could not perform what he expected it to do is a good thing.

Also, note that per spec [1], I made the sniffing function assign "application/octet-stream" if the mimetype could not be determined [2].

[1]: step 10 in the algorithm described at http://mimesniff.spec.whatwg.org/#identifying-a-resource-with-an-unknown-media-type
[2]: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/mediasniffer/nsMediaSniffer.cpp#116

::: content/media/webaudio/MediaBufferDecoder.cpp
@@ +322,5 @@
> +  }
> +
> +  uint32_t GetSourceSampleRate(const AudioData* aAudioData) const
> +  {
> +    return aAudioData->mFrames * 1000000 /

We have the handy USECS_PER_S macro to avoid counting the number of zeros all the time :-).

Also, if a mFrames is big (2 seconds of audio at 48kHz), this will overflow. Reading the call sites, that should no happen, but well.

@@ +534,5 @@
> +    if (sourceSampleRate != destSampleRate) {
> +      resampler = speex_resampler_init(mDecodeJob.mChannels,
> +                                       sourceSampleRate,
> +                                       destSampleRate,
> +                                       SPEEX_RESAMPLER_QUALITY_MAX, nullptr);

We should double check with jmspeex what tradeoff we can make regarding the quality of the resampling in regard to the speed of the processing. iirc, for |playbackRate|, he told me I did not need maximum quality, and a quality of 3 for mobile and 5 for desktop was adequate. Maybe we have different quality/speed concerns here, though.

@@ +538,5 @@
> +                                       SPEEX_RESAMPLER_QUALITY_MAX, nullptr);
> +
> +      frameCount = destSampleRate * audioData->mFrames / sourceSampleRate;
> +
> +      resampledBuffer = new float[mDecodeJob.mChannels * frameCount];

This can allocate a huge array if we are not careful. For example, if someone crafts a file with a very low sample rate, say, 1, that we try to resample to 96k, we will try to allocate an array that is |96000 * sizeof(float) * audioData->mFrames| bytes, which is likely to be in the hundredth of megabytes, and that evil person can do that a few time, effectively DOSing the browser. Maybe we check the sample rate somewhere else, not sure.

@@ +545,5 @@
> +      uint32_t outSamples = frameCount;
> +
> +      for (uint32_t xxx = 0; xxx < inSamples; ++xxx) {
> +        bufferData[xxx] = Upscale(bufferData[xxx]);
> +      }

I'm not sure why you do this dance with the upscaling and the downscaling. iirc, the speex resampler, when used in float mode, expects values between -1.0 and +1.0 (in fact, afaik, when we deal with floats in audio, a signal that does not clip is between -1.0 and +1.0). Conveniently, the output of our decoders, when used in floating point mode, will also give you data in the [-1.0; 1.0] range, and the AudioStream also expect samples in that range.

@@ +548,5 @@
> +        bufferData[xxx] = Upscale(bufferData[xxx]);
> +      }
> +
> +      speex_resampler_process_interleaved_float(resampler, bufferData, &inSamples,
> +                                                resampledBuffer, &outSamples);

Did we agree to do the all the internal processing in floats, even on mobile? My concern here is that we market the fact that Fennec can run on ARMv6, and that ARMv6 has no requirement to have an FPU (presence of an FPU is however required for ARMv7), leading to terrible performance when using floating point arithmetic on those devices. Also, the decoders will give you integers on mobile, so this will not compile. Or I am missing something and you do the conversion elsewhere.

Also, |speex_sampler_process_*| will store the number of sample consumed and produced in |inSamples| and |outSamples| (and is in fact the reason why the functions are passed the address of the variable). There is no guarantee to produce or consume the number of samples you passed it. I've got bitten by this when implementing |playbackRate|, not sure it was when dynamically changing the playback rate or even when it was static. I would sleep better at night if you assert that all the data have been consume/produced as you expect, so we can fix it by using a spill buffer if needed:

> int32_t inSamplesBck = inSamples;
> int32_t outSamplesBck = outSamples;
> speex_resample_process_interleaved_float(...);
> MOZ_ASSERT(inSamplesBck == inSamples && outSamplesBck == outSamples, "We did not get what we expect when resampling.");

Or maybe we can pull that out of the tree and test.

::: content/media/webaudio/MediaBufferDecoder.h
@@ +46,5 @@
> +  bool FinalizeBufferData();
> +
> +  nsCString mContentType;
> +  JSObject* mObj;
> +  void* mBuffer;

Can we try to have this a bit more strictly typed? I don't know the js api very well, maybe it is not worth the trouble.
Attachment #690690 - Flags: review?(paul) → review-
(In reply to Ehsan Akhgari [:ehsan] from comment #15)
> Hrm, ok.  Thanks for the analysis, Boris!  It seems like I was relying too
> much on the media decoder APIs.  :(
> 
> Chris, is there a better way for us to tell what format (int16/float) the
> decoded data is, whether it's interleaved, etc. that guesswork our way
> throughout the media code?

Ehsan, the output of the decoder will be 16 bits integers on mobile (fennec, b2g), and 32bits float on desktop. Everything is done at compile time, in configure.in [1], and then typedef appropriatly in content/media/AudioSampleFormat.h [2].

Also, when we have multiple channels, the decoder spit out interleaved buffers. All the AudioStream code expect interleaved buffers. Thus, we are going to need code to deinterleave and interleave audio buffers.

[1]: http://mxr.mozilla.org/mozilla-central/source/configure.in#5319
[2]: http://mxr.mozilla.org/mozilla-central/source/content/media/AudioSampleFormat.h#27
Thanks!  OK, I have some serious reworking to do on this patch now.  I'll post back on the bug when I have the next iteration ready for review.
Attachment #690690 - Flags: review?(chris.double)
Depends on: 827599
(In reply to Paul Adenot (:padenot) from comment #19)
> Also, when we have multiple channels, the decoder spit out interleaved
> buffers. All the AudioStream code expect interleaved buffers. Thus, we are
> going to need code to deinterleave and interleave audio buffers.

Hmm, Paul, this code <http://mxr.mozilla.org/mozilla-central/source/content/media/MediaDecoderReader.cpp#45> seems to produce a non-interleaved buffer.  Or am I missing something?
I added that deinterleaving code for capturing MediaStreams from media elements. Please use it :-).
(In reply to comment #22)
> I added that deinterleaving code for capturing MediaStreams from media
> elements. Please use it :-).

Good, I'm using it already in my local tree.  Thanks!
(In reply to Paul Adenot (:padenot) from comment #18)
> Also, |speex_sampler_process_*| will store the number of sample consumed and
> produced in |inSamples| and |outSamples| (and is in fact the reason why the
> functions are passed the address of the variable). There is no guarantee to
> produce or consume the number of samples you passed it. I've got bitten by
> this when implementing |playbackRate|, not sure it was when dynamically
> changing the playback rate or even when it was static. I would sleep better
> at night if you assert that all the data have been consume/produced as you
> expect, so we can fix it by using a spill buffer if needed:

So it turns out that it's actually sort of common for the speex resampler to not consume all of its input/output samples.  It frequently returns a few samples fewer than what I request, and it sometimes consumes a few samples fewer than inSamples too.  That will result in "gaps" in the decoded buffer.  Do you know how I should deal with this problem?
Created attachment 700827 [details] [diff] [review]
WIP

So this WIP patch nearly seems to work (except for the resampling bits which I have not thoroughly tested yet.)  To try it out, apply it and build and then use http://people.mozilla.com/~eakhgari/decode.html as a testcase.  The test case shows two audio elements, one playing back the original audio and the other playing back a wave file constructed from the decoded buffer through a blob URL.  The playback is identical in Chrome, but in Firefox there's some choppiness in the sound, and I can't quite figure out why.

Paul, Chris, I'd appreciate if you guys can please take a look and tell me if you have ideas on why this happens.

Thanks!
Attachment #690690 - Attachment is obsolete: true
Attachment #700827 - Flags: feedback?(paul)
Attachment #700827 - Flags: feedback?(cpearce)
(Note that webaudio is currently hidden behind the media.webaudio.enabled pref.)
Ehsan, maybe you have something in you queue that is needed here. The patch, as attached, does not build on my machine (http://www.pastebin.mozilla.org/2051417).
(In reply to Paul Adenot (:padenot) from comment #27)
> Ehsan, maybe you have something in you queue that is needed here. The patch,
> as attached, does not build on my machine
> (http://www.pastebin.mozilla.org/2051417).

I fixed part of the problem in https://hg.mozilla.org/integration/mozilla-inbound/rev/4cb44c4e4629.  I still don't know what's going on with that friend class...
Created attachment 701211 [details] [diff] [review]
WIP

This fixes the g++ friend issue.  Please apply it on mozilla-inbound.
Attachment #700827 - Attachment is obsolete: true
Attachment #700827 - Flags: feedback?(paul)
Attachment #700827 - Flags: feedback?(cpearce)
Attachment #701211 - Flags: review?(paul)
Attachment #701211 - Flags: review?(cpearce)
Depends on: 829724
Comment on attachment 701211 [details] [diff] [review]
WIP

Review of attachment 701211 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaDecoderReader.h
@@ +355,5 @@
> +    ReentrantMonitorAutoEnter mon(mReentrantMonitor);
> +    MOZ_ASSERT(nsDeque::GetSize() > 0);
> +    T* first = PeekFront();
> +    return first->mChannels;
> +  }

This is not needed if you get the channel count from the metadata (see below).

::: content/media/webaudio/MediaBufferDecoder.cpp
@@ +214,5 @@
> +
> +layers::ImageContainer*
> +BufferDecoder::GetImageContainer()
> +{
> +  // no image frame

s/frame/container/

@@ +448,5 @@
> +  }
> +
> +  VideoInfo videoInfo;
> +  MetadataTags* tags;
> +  rv = decoderReader->ReadMetadata(&videoInfo, &tags);

This call gives you the sample rate (|videoInfo.mAudioRate|) and the number of channels (|videoInfo.mAudioChannels|), so you don't have to compute it. This is actually important because computing the original sample rate the way you do it is not very precise: with the file I tested with (using your handy test page), the computed sample rate was between 44101Hz up to 44107Hz for some buffers, while the file is actually a 44001Hz file. Then it got resampled, and should not have been.

@@ +539,5 @@
> +    if (sourceSampleRate != destSampleRate) {
> +      resampler = speex_resampler_init(mDecodeJob.mChannels,
> +                                       sourceSampleRate,
> +                                       destSampleRate,
> +                                       SPEEX_RESAMPLER_QUALITY_DEFAULT, nullptr);

You should call the resampler's init and destroy functions once per decoding task (i.e. keep it for the whole media), and call the resampling functions repeatedly. The resampler is stateful, and I believe restarting the resampler from scratch for every buffer is the cause of the glitches. I might be wrong.

Basically, put init and destroy outside of the loop, that should do the trick.

@@ +553,5 @@
> +
> +      for (uint32_t i = 0; i < audioData->mChannels; ++i) {
> +#ifdef MOZ_SAMPLE_TYPE_S16
> +        speex_resampler_process_int(resampler, i, bufferData, &inSamples,
> +                                    &resampledBuffer[i * mDecodeJob.mChannels],

You meant:

> &resampledBuffer[i * expectedOutSamples],

or something like that, otherwise, your are not going to write to the correct offset. I noticed that I had only an output in the left channel in my headphones, but it is probably easy to miss on the built-in speakers of a laptop.

@@ +557,5 @@
> +                                    &resampledBuffer[i * mDecodeJob.mChannels],
> +                                    &outSamples);
> +#else
> +        speex_resampler_process_float(resampler, i, bufferData, &inSamples,
> +                                      &resampledBuffer[i * mDecodeJob.mChannels],

Of course, same thing here.

@@ +567,5 @@
> +        if (outSamples != expectedOutSamples) {
> +          ReportFailureOnMainThread();
> +          return;
> +        }
> +        */

This is not hit when uncommented, on my machine, but it might be an artifact of the way you called the init, resample and destroy functions.

@@ +577,5 @@
> +      // Convert back to float
> +      for (uint32_t j = 0; j < expectedOutSamples; ++j) {
> +        mDecodeJob.mChannelBuffers[i].second[j + framesCopied] = static_cast<float>
> +          (resampledBuffer[i * expectedOutSamples + j]) / INT16_MAX;
> +      }

Roc wrote handy functions to do that, maybe you could use them? They are in content/media/AudioSampleFormat.h.

@@ +689,5 @@
> +  if (!*aContentType ||
> +      strcmp(aContentType, APPLICATION_OCTET_STREAM) == 0) {
> +    nsCOMPtr<nsIRunnable> event =
> +      new ReportResultTask(aDecodeJob, &WebAudioDecodeJob::OnFailure);
> +    NS_DispatchToMainThread(event, NS_DISPATCH_NORMAL);

Maybe we could report more informative error messages to the authors, like we do in |nsHTMLMediaElement::ReportLoadError|. This is probably not a priority, though, but I have seen a couple occurrence of people facing a problem, with just a "null" return value in the |decodeAudioData| callback, that they had trouble debugging.

@@ +697,5 @@
> +  EnsureThreadPoolInitialized();
> +
> +  nsRefPtr<MediaDecodeTask> task =
> +    new MediaDecodeTask(aContentType, aBuffer, aLength, aDecodeJob, mThreadPool);
> +  mThreadPool->Dispatch(task, nsIThreadPool::DISPATCH_NORMAL);

This will segfault on a null deference if |EnsureThreadPoolInitialized| fails, and it does so silently. If it cannot fail, we should not have to null check it after the |do_CreateInstance|. If it can, we should ensure that we don't call methods on null pointers.

::: dom/webidl/AudioContext.webidl
@@ +33,1 @@
>      // AudioNode creation 

nit: trailing space.
Comment on attachment 701211 [details] [diff] [review]
WIP

Review of attachment 701211 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/MediaBufferDecoder.cpp
@@ +689,5 @@
> +  if (!*aContentType ||
> +      strcmp(aContentType, APPLICATION_OCTET_STREAM) == 0) {
> +    nsCOMPtr<nsIRunnable> event =
> +      new ReportResultTask(aDecodeJob, &WebAudioDecodeJob::OnFailure);
> +    NS_DispatchToMainThread(event, NS_DISPATCH_NORMAL);

Hmm, this stuff is mandated by the spec.  If you think we can do better with error messages, please bring that up on the public-audio list.
Comment on attachment 701211 [details] [diff] [review]
WIP

Review of attachment 701211 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/MediaBufferDecoder.cpp
@@ +394,5 @@
> +
> +#ifdef MOZ_GSTREAMER
> +  // When enabled, use GStreamer for H.264, but not for codecs handled by our
> +  // bundled decoders, unless the "media.prefer-gstreamer" pref is set.
> +  if (DecoderTraits::IsGStreamerSupportedType(mContentType)) {

Most (or all?) of these DecoderTraits::IsXXXSupportedType() functions make a Preference::GetBool() call, which IIRC we're not supposed to do off the main thread. And the WMFReader also relies on the assumption that we're on the main thread (I could fix that however). Can you add another phase which creates the reader and calls its init function on the main thread please?

@@ +428,5 @@
> +#ifdef MOZ_WEBM
> +  if (DecoderTraits::IsWebMType(mContentType)) {
> +    decoderReader = new WebMReader(decoder);
> +  } else
> +#endif

Could you add a block for DecoderTraits::IsWMFSupportedType() here too? ;)

@@ +438,5 @@
> +  if (!decoderReader) {
> +    ReportFailureOnMainThread();
> +    return;
> +  }
> +

You need to call decoderReader->OnDecodeThreadStart() here (and OnDecodeThreadFinish() when the decode is finished). These were added when I landed the Windows Media Foundation backend, and ensure MSCOM is initialized on the decoding thread.

@@ +449,5 @@
> +
> +  VideoInfo videoInfo;
> +  MetadataTags* tags;
> +  rv = decoderReader->ReadMetadata(&videoInfo, &tags);
> +  if (NS_FAILED(rv)) {

You need to delete |tags| here (and initialize it to nullptr before passing it into ReadMetadata!).

Normally we pass off |tags| to the media element to delete its memory, but you're not doing that here obviously.

@@ +484,5 @@
> +  const uint32_t destSampleRate = mDecodeJob.mContext->SampleRate();
> +
> +  mDecodeJob.mResampledFrames = mDecodeJob.mFrames;
> +  if (sourceSampleRate != destSampleRate) {
> +    mDecodeJob.mResampledFrames = destSampleRate * frameCount / sourceSampleRate;

Do you need to handle integer overflow here?

@@ +532,5 @@
> +    AudioDataValue* resampledBuffer = bufferData;
> +
> +    uint32_t inSamples = audioData->mFrames;
> +    const uint32_t expectedOutSamples =
> +      destSampleRate * inSamples / sourceSampleRate;

Is it safe to assume this won't overflow? Do these rates come from metadata (i.e. are they easy to spoof) or from actual data in the media file?

@@ +689,5 @@
> +  if (!*aContentType ||
> +      strcmp(aContentType, APPLICATION_OCTET_STREAM) == 0) {
> +    nsCOMPtr<nsIRunnable> event =
> +      new ReportResultTask(aDecodeJob, &WebAudioDecodeJob::OnFailure);
> +    NS_DispatchToMainThread(event, NS_DISPATCH_NORMAL);

I think Paul was referring to logging messages to the web console to assist web-devs with their debugging, I don't see why we'd need to consult public-audio to do that?
(In reply to Chris Pearce (:cpearce) from comment #32)
> @@ +689,5 @@
> > +  if (!*aContentType ||
> > +      strcmp(aContentType, APPLICATION_OCTET_STREAM) == 0) {
> > +    nsCOMPtr<nsIRunnable> event =
> > +      new ReportResultTask(aDecodeJob, &WebAudioDecodeJob::OnFailure);
> > +    NS_DispatchToMainThread(event, NS_DISPATCH_NORMAL);
> 
> I think Paul was referring to logging messages to the web console to assist
> web-devs with their debugging, I don't see why we'd need to consult
> public-audio to do that?

Ah, right, sorry for misunderstanding.  I can definitely do that!
Boris, should I use localizable strings when reporting errors to the web console?  I'm not sure what the usual practice for that is...

Comment 35

5 years ago
We typically localize those, yes.  See nsContentUtils::ReportToConsole.
I *finally* found out the reason that I was not getting enough samples out of libspeex_resampler, and it's super-embarrassing.  I was setting inSamples and outSamples outside of the loop, initial values being 1024 and 4058 respectively.  Then on the first iteration, speex_resampler_process_float would set inSamples to 1023 and outSamples to 4058, which is what I would expect (since it just didn't need to consume all of the input samples.)  On the second iteration, however, inSamples would be 1023 as opposed to 1024, and therefore libspeex would provide 4055 samples, as opposed to 4058 samples.  I don't know why I didn't see this sooner.  :(

With this, I think this patch is basically done, I need to clean it up, make sure I've addressed all of the comments, add tests, etc and submit it for final review.  I expected to finish doing this either some time tonight or tomorrow...

Comment 37

5 years ago
I see in the patch here that you are using an nsRefPtr to manage the lifetime of a MediaDecoderReader. As noted in bug 831640 the fact that MediaDecoderReader currently has reference counting methods appears to be an error as the lifetime of them is controlled via an nsAutoPtr. We should either use reference counting, or not use reference counting, consistently wherever it is used I think.

In MediaDecodeTask::Decode a MediaDecoderReader is created and at the end assigned to a mDecodeJob.mDecoderReader member. It's not used after that in the method. Is the decode job object copied anywhere, or the mDecoderReader member of that shared? If not, then would an nsAutoPtr controlling the lifetime suffice?

One other drive by comment, I see code like:

+class ReportResultTask : public nsRunnable
+{
+public:
+  ReportResultTask(WebAudioDecodeJob& aDecodeJob,
+                   WebAudioDecodeJob::ResultFn aFunction)
+    : mDecodeJob(aDecodeJob)
+    , mFunction(aFunction)
+  {
+    MOZ_ASSERT(aFunction);
+  }
...
+private:
+  WebAudioDecodeJob& mDecodeJob;
+  WebAudioDecodeJob::ResultFn mFunction;
+};

I find it a bit confusing to have a reference member variable. It's a little harder to immediately notice that a creator of ReportResultTask has to keep the lifetime of the WebAudioDecode job around for the lifetime of the ReportResultTask. This is easier to notice if pointers to WebAudioDecodeJob are used instead. I feel that makes it a little more obvious to a casual reader than there are lifetime issues that they need to be aware of. Can you use pointers? Or add a comment explaining the lifetime requirement of mDecodeJob vs ReportResultTask?
Depends on: 831640
(In reply to Chris Double (:doublec) from comment #37)
> I see in the patch here that you are using an nsRefPtr to manage the
> lifetime of a MediaDecoderReader. As noted in bug 831640 the fact that
> MediaDecoderReader currently has reference counting methods appears to be an
> error as the lifetime of them is controlled via an nsAutoPtr. We should
> either use reference counting, or not use reference counting, consistently
> wherever it is used I think.

Commented on that bug.  Short story, the refcounting stuff is not needed for this patch since I control the lifetime explicitly.

> In MediaDecodeTask::Decode a MediaDecoderReader is created and at the end
> assigned to a mDecodeJob.mDecoderReader member. It's not used after that in
> the method. Is the decode job object copied anywhere, or the mDecoderReader
> member of that shared? If not, then would an nsAutoPtr controlling the
> lifetime suffice?

It would, yes (and this stuff has sort of changed since the last version of the patch here.)

> One other drive by comment, I see code like:
> 
> +class ReportResultTask : public nsRunnable
> +{
> +public:
> +  ReportResultTask(WebAudioDecodeJob& aDecodeJob,
> +                   WebAudioDecodeJob::ResultFn aFunction)
> +    : mDecodeJob(aDecodeJob)
> +    , mFunction(aFunction)
> +  {
> +    MOZ_ASSERT(aFunction);
> +  }
> ...
> +private:
> +  WebAudioDecodeJob& mDecodeJob;
> +  WebAudioDecodeJob::ResultFn mFunction;
> +};
> 
> I find it a bit confusing to have a reference member variable. It's a little
> harder to immediately notice that a creator of ReportResultTask has to keep
> the lifetime of the WebAudioDecode job around for the lifetime of the
> ReportResultTask. This is easier to notice if pointers to WebAudioDecodeJob
> are used instead. I feel that makes it a little more obvious to a casual
> reader than there are lifetime issues that they need to be aware of. Can you
> use pointers? Or add a comment explaining the lifetime requirement of
> mDecodeJob vs ReportResultTask?

The lifetime of WebAudioDecodeJob is guaranteed to end at the same time that ReportResultTask is Run.  I'll add a comment to clarify this.
Once this lands, it needs some fuzz love.
Depends on: 832098
Created attachment 703706 [details] [diff] [review]
Patch (v1)

This should be ready for review.
Attachment #701211 - Attachment is obsolete: true
Attachment #701211 - Flags: review?(paul)
Attachment #701211 - Flags: review?(cpearce)
Attachment #703706 - Flags: review?(paul)
Attachment #703706 - Flags: review?(cpearce)
Attachment #703706 - Flags: review?(bzbarsky)
Created attachment 703707 [details] [diff] [review]
Test case
Attachment #703707 - Flags: review?(bzbarsky)

Comment 42

5 years ago
Is an interdiff against the last thing I reviewed worthwhile, or is it all just too different now?

Note that I promised Neil to spend the next several days reviewing bug 653881, so it might be a bit before I get to this one.  :(
Created attachment 703876 [details] [diff] [review]
interdiff since bz's last review

(In reply to Boris Zbarsky (:bz) from comment #42)
> Is an interdiff against the last thing I reviewed worthwhile, or is it all
> just too different now?

I doubt that Bugzilla gets it right.  Fortunately I do have the full commit history, so here's the interdiff that you requested.  Not sure how useful it is though...

> Note that I promised Neil to spend the next several days reviewing bug
> 653881, so it might be a bit before I get to this one.  :(

OK, thanks for letting me know!
Created attachment 703888 [details] [diff] [review]
Use a plain enum for WebAudioDecodeJob::ErrorCode

Our MOZ_ENUM_CLASS fallbacks do not support nested enums, so I had to switch back to a normal enum for ErrorCode.  Posting this as a separate patch but will land it together with the code changes.
Attachment #703888 - Flags: review?(bzbarsky)

Updated

5 years ago
Attachment #703706 - Flags: review?(paul) → review+
Comment on attachment 703706 [details] [diff] [review]
Patch (v1)

Review of attachment 703706 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/MediaBufferDecoder.cpp
@@ +425,5 @@
> +  // If you change this list to add support for new decoders, please consider
> +  // updating nsHTMLMediaElement::CreateDecoder as well.
> +
> +#ifdef MOZ_GSTREAMER
> +  // When enabled, use GStreamer for H.264, but not for codecs handled by our

This comment really belongs above the declaration of DecoderTraits::IsGStreamerSupportedType in DecoderTraits.h, it's not really relevant here. It should be removed from nsHTMLMediaElement.cpp as well.

@@ +476,5 @@
> +  if (!mDecoderReader) {
> +    return false;
> +  }
> +
> +  mDecoderReader->OnDecodeThreadStart();

Call OnDecodeThreadStart() *after* calling MediaDecoderReader->Init().
Attachment #703706 - Flags: review?(cpearce) → review+
There seems to be a bunch of test failures in this patch which only get triggered in opt builds:

https://tbpl.mozilla.org/?tree=Try&rev=09f9e5cd6c2f
https://tbpl.mozilla.org/?tree=Try&rev=3b9a59cc2ae8

I'm not quite sure what could be causing these yet.
Created attachment 705169 [details] [diff] [review]
Test case

This version of the test case does a fuzzy comparison to mask out the effects of doing a binary comparison over resampled data.  Things to note are that the fuzz tolerance for audio-expected.wav (which will not be resampled) is 0, and that the fuzz tolerance on mobile platforms is higher, which is expected given the fact that we do fixed-point resampling on those platforms.  I had to introduce two fuzz tolerances for each test to make sure that mobile won't cover real failures on desktop platforms.
Attachment #703707 - Attachment is obsolete: true
Attachment #703707 - Flags: review?(bzbarsky)
Attachment #705169 - Flags: review?(bzbarsky)
https://tbpl.mozilla.org/?tree=Try&rev=13043c831767
https://tbpl.mozilla.org/?tree=Try&rev=47d12c61fd97
Created attachment 705380 [details] [diff] [review]
Test case

This version of the test case actually passes, I believe.
Attachment #705169 - Attachment is obsolete: true
Attachment #705169 - Flags: review?(bzbarsky)
Attachment #705380 - Flags: review?(bzbarsky)
All seems to be green on the following try pushes:

https://tbpl.mozilla.org/?tree=Try&rev=886d317e9132
https://tbpl.mozilla.org/?tree=Try&rev=f1f4d190659b

Comment 51

5 years ago
Comment on attachment 703888 [details] [diff] [review]
Use a plain enum for WebAudioDecodeJob::ErrorCode

r=me
Attachment #703888 - Flags: review?(bzbarsky) → review+

Comment 52

5 years ago
Comment on attachment 705380 [details] [diff] [review]
Test case

This seems ok.  The only question I have is whether it makes sense to measure how _far_ off we are for the fuzz, not just the number of bytes where we're off.
Attachment #705380 - Flags: review?(bzbarsky) → review+

Comment 53

5 years ago
Comment on attachment 703706 [details] [diff] [review]
Patch (v1)

Need to move the OnDecodeThreadStart() call to the start of Decode().  Ehsan tells me that's already done locally.

>+    const uint32_t expectedOutSamples = static_cast<uint32_t>(

Why not just use mDecodeJob.mResampledFrames?  Or can we have a different dest sample rate here?  Or is the audioData->mFrames different from audioQueue.FrameCount() in this case (because the latter is the sum over all AudioData bits)?  In any case, worth documenting what this number here actually means.

Please address the nit from the end of comment 13?

With that, r=me
Attachment #703706 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky (:bz) from comment #52)
> Comment on attachment 705380 [details] [diff] [review]
> Test case
> 
> This seems ok.  The only question I have is whether it makes sense to
> measure how _far_ off we are for the fuzz, not just the number of bytes
> where we're off.

Maybe.  I mean, these differences are sometimes in parts of a float variable.  Ultimately we want something like opus_compare, and I actually spent some time to try to understand it and use it here, but that stuff is sort of above my head, and for the purposes of the tests here I mostly care to know if we break something, in which case the change in the output would be sort of drastic...
(In reply to Boris Zbarsky (:bz) from comment #53)
> Comment on attachment 703706 [details] [diff] [review]
> Patch (v1)
> 
> Need to move the OnDecodeThreadStart() call to the start of Decode().  Ehsan
> tells me that's already done locally.
> 
> >+    const uint32_t expectedOutSamples = static_cast<uint32_t>(
> 
> Why not just use mDecodeJob.mResampledFrames?  Or can we have a different
> dest sample rate here?  Or is the audioData->mFrames different from
> audioQueue.FrameCount() in this case (because the latter is the sum over all
> AudioData bits)?  In any case, worth documenting what this number here
> actually means.

The last guess is correct!  I'll document this.

> Please address the nit from the end of comment 13?
> 
> With that, r=me

Will do.  Thanks!
Created attachment 709243 [details] [diff] [review]
Patch to land
https://hg.mozilla.org/integration/mozilla-inbound/rev/083e6703ed56
Landed http://hg.mozilla.org/integration/mozilla-inbound/rev/36304c47f23b as the follow-up because, warnings as errors :(
OS: Mac OS X → All
Hardware: x86 → All
https://hg.mozilla.org/mozilla-central/rev/083e6703ed56
https://hg.mozilla.org/mozilla-central/rev/36304c47f23b
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Depends on: 837427
Depends on: 839175
Depends on: 849106
Depends on: 870174
Mass moving Web Audio bugs to the Web Audio component.  Filter on duckityduck.
Component: Video/Audio → Web Audio
Depends on: 900341
Depends on: 938450
Blocks: 1326003
You need to log in before you can comment on or make changes to this bug.