Closed Bug 834513 Opened 8 years ago Closed 7 years ago

Implement ScriptProcessorNode

Categories

(Core :: Web Audio, defect)

x86
macOS
defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla23

People

(Reporter: ehsan, Assigned: ehsan)

References

Details

(Keywords: dev-doc-complete)

Attachments

(5 files, 12 obsolete files)

3.90 KB, patch
roc
: review+
ehsan
: checkin+
Details | Diff | Splinter Review
582 bytes, text/html
Details
21.06 KB, patch
roc
: review+
Details | Diff | Splinter Review
43.45 KB, patch
roc
: review+
Details | Diff | Splinter Review
11.66 KB, patch
roc
: review+
Details | Diff | Splinter Review
Attached file WIP (obsolete) —
This is stalled for now because of spec issues.  This is my WIP patch, which is no good just yet (it doesn't even build).
Attached patch WIP (obsolete) — Splinter Review
The real patch.
Assignee: nobody → ehsan
Attachment #706162 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attached patch WIP (obsolete) — Splinter Review
Not quite sure what's wrong with me!
Attachment #706163 - Attachment is obsolete: true
One thing that I don't know how to handle here is the event dispatching part.  ProduceAudioBlock is only ever called with 128 frames of input/output, but the spec requires AudioProcessingEvent to be callable with many different buffer sizes, all bigger than 128 frames.  And I don't see how we can reconcile these two...

Also, I was planning on sending a runnable to the main thread and waiting on a monitor right after that.  The runnable would dispatch the event to the handler, extract the buffer from the event's outputBuffer, and then notify the monitor, allowing the MSG thread to wake up.  Does that sound good to you?
Depends on: 853298
Flags: needinfo?(roc)
I think we should choose the buffer size and ignore the buffer size specified by the author.

I was thinking of something like this:
-- In the ScriptProcessorNodeEngine, keep two queues of frames: input frames and output frames. Protect each queue with a lock.
-- At each processing step, append WEBAUDIO_BLOCK_SIZE frames to the input frame queue and take WEBAUDIO_BLOCK_SIZE frames from the output frame queue. If the output frame queue is empty, output silence instead.
-- During a processing step, if the input frame queue length grows to be >= the buffer size, dispatch a processing event runnable to the main thread.
-- In that runnable, take one buffer-full of frames from input queue, fire the processing event, and append any output frames to the output frame queue. After this, if the input queue still has at least one buffer-full of frames, dispatch another runnable.

That'll handle all the "input only" and "output only" cases about as well as can be. Actual audio processing should be OK too, with some latency that will build up over time until it's about the right level to avoid skipping. I hope :-).
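The queue scheme sketched above can be modeled as a small standalone C++ program. This is a hypothetical sketch, not Gecko code: `BlockQueue`, `Block`, and the silence-padding behavior stand in for the real engine classes, and only `WEBAUDIO_BLOCK_SIZE` comes from the actual codebase.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// One Web Audio processing block: 128 frames of float samples.
constexpr size_t WEBAUDIO_BLOCK_SIZE = 128;
using Block = std::vector<float>;

// Hypothetical model of the proposal: a lock-protected frame queue,
// instantiated once for input and once for output.
class BlockQueue {
public:
  void Push(Block aBlock) {
    std::lock_guard<std::mutex> guard(mMutex);
    mBlocks.push_back(std::move(aBlock));
  }
  // Pop one block; if the queue is empty, return silence instead, as the
  // design calls for on the output side.
  Block PopOrSilence() {
    std::lock_guard<std::mutex> guard(mMutex);
    if (mBlocks.empty()) {
      return Block(WEBAUDIO_BLOCK_SIZE, 0.0f);  // pad with silence
    }
    Block front = std::move(mBlocks.front());
    mBlocks.pop_front();
    return front;
  }
  size_t FramesQueued() {
    std::lock_guard<std::mutex> guard(mMutex);
    return mBlocks.size() * WEBAUDIO_BLOCK_SIZE;
  }
private:
  std::mutex mMutex;
  std::deque<Block> mBlocks;
};
```

In this model, each processing step would `Push()` the incoming block onto the input queue and `PopOrSilence()` from the output queue; once `FramesQueued()` on the input side reaches the script's buffer size, a runnable would drain one buffer-full and fire the processing event on the main thread.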
Flags: needinfo?(roc)
(In reply to comment #4)
> I think we should choose the buffer size and ignore the buffer size specified
> by the author.

OK, I agree.  You raised this either in a bug or in some thread on public-audio IIRC, but I can't find it.  Do you have a link handy?  I want to bring it up in the f2f next week.

> I was thinking of something like this:
> -- In the ScriptProcessorNodeEngine, keep two queues of frames: input frames
> and output frames. Protect each queue with a lock.
> -- At each processing step, append WEBAUDIO_BLOCK_SIZE frames to the input
> frame queue and take WEBAUDIO_BLOCK_SIZE frames from the output frame queue. If
> the output frame queue is empty, output silence instead.
> -- During a processing step, if the input frame queue length grows to be >= the
> buffer size, dispatch a processing event runnable to the main thread.
> -- In that runnable, take one buffer-full of frames from input queue, fire the
> processing event, and append any output frames to the output frame queue. After
> this, if the input queue still has at least one buffer-full of frames, dispatch
> another runnable.
> 
> That'll handle all the "input only" and "output only" cases about as well as
> can be. Actual audio processing should be OK too, with some latency that will
> build up over time until it's about the right level to avoid skipping. I hope
> :-).

Hmm, perhaps I don't understand something fundamental here.  Let's say that the buffer size is 1024.  That means that before each buffer is full, we have processed 8 Web Audio blocks, and handed them off to the MSG, which means that MediaStreamGraph::PlayAudio has probably been called at least 7 times before the main thread sees the event.  Now that can't be right!  If I'm reading between the lines correctly here, we need to delay the output somehow (i.e., make ScriptProcessorNode behave like a DelayNode under the hood.)  Is that correct?  In that case, I think I should focus on DelayNode first.  (I haven't actually thought about that at all so far, so pointers there are appreciated too! -- filed bug 853721)
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #5)
> OK, I agree.  You raised this either in a bug or in some thread on
> public-audio IIRC, but I can't find it.  Do you have a link handy?  I want
> to bring it up in the f2f next week.

I don't remember, but I think it was the mailing list.

> Hmm, perhaps I don't understand something fundamental here.  Let's say that
> the buffer size is 1024.  That means that before each buffer is full, we
> have processed 8 Web Audio blocks, and handed them off to the MSG, which
> means that MediaStreamGraph::PlayAudio has probably been called at least 7
> times before the mainthread sees the event.

Those numbers aren't right; each block is only 128/44100 seconds i.e. about 3ms and the MSG runs every 10ms or so, but anyway...
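For reference, the arithmetic roc is correcting, assuming the default 44100 Hz sample rate (a sketch for scale, not code from the patch):

```cpp
#include <cassert>

// One Web Audio block is 128 frames; at 44100 Hz that is about 2.9 ms, so
// the MSG's ~10 ms iteration covers roughly 3-4 blocks, and a 1024-frame
// script buffer fills after 8 blocks, i.e. a bit over two MSG iterations.
constexpr double kSampleRate = 44100.0;
constexpr double kBlockMs = 128.0 / kSampleRate * 1000.0;    // ~2.9 ms
constexpr double kBufferMs = 1024.0 / kSampleRate * 1000.0;  // ~23.2 ms
```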

> Now that can't be right!  If
> I'm reading between the lines correctly here, we need to delay the output
> somehow (i.e., make ScriptProcessorNode behave like a DelayNode under the
> hood.)  Is that correct?  In that case, I think I should focus on DelayNode
> first.  (I haven't actually thought about that at all so far, so pointers
> there are appreciated too! -- filed bug 853721)

"If the output frame queue is empty, output silence instead." I.e. *pad* the output frame queue with silence. Once the script starts producing output, that output is effectively appended to the synthetic silence that was generated. Make sense?
(In reply to comment #6)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #5)
> > OK, I agree.  You raised this either in a bug or in some thread on
> > public-audio IIRC, but I can't find it.  Do you have a link handy?  I want
> > to bring it up in the f2f next week.
> 
> I don't remember, but I think it was the mailing list.
> 
> > Hmm, perhaps I don't understand something fundamental here.  Let's say that
> > the buffer size is 1024.  That means that before each buffer is full, we
> > have processed 8 Web Audio blocks, and handed them off to the MSG, which
> > means that MediaStreamGraph::PlayAudio has probably been called at least 7
> > times before the mainthread sees the event.
> 
> Those numbers aren't right; each block is only 128/44100 seconds i.e. about 3ms
> and the MSG runs every 10ms or so, but anyway...
> 
> > Now that can't be right!  If
> > I'm reading between the lines correctly here, we need to delay the output
> > somehow (i.e., make ScriptProcessorNode behave like a DelayNode under the
> > hood.)  Is that correct?  In that case, I think I should focus on DelayNode
> > first.  (I haven't actually thought about that at all so far, so pointers
> > there are appreciated too! -- filed bug 853721)
> 
> "If the output frame queue is empty, output silence instead." I.e. *pad* the
> output frame queue with silence. Once the script starts producing output, that
> output is effectively appended to the synthetic silence that was generated.
> Make sense?

Yeah I think it does.  Let me see what I can do here.
Attachment #706164 - Attachment is obsolete: true
Attachment #737809 - Flags: review?(roc)
Attached patch WIP (obsolete) — Splinter Review
This still has quite a few issues which I need to fix, but it's at a stage where I'm comfortable getting some feedback on it.  Please don't spend a lot of time reviewing this yet, but I'd like to know what you think about ScriptProcessorNode.{h,cpp}, the buffering approach I've used there, the locking, etc., as well as any other big issues you can see that I should address before proceeding further.  Thanks!
Attachment #737811 - Flags: feedback?(roc)
Whiteboard: [leave open]
Comment on attachment 737809 [details] [diff] [review]
Part 1: Refactor WebAudioDecodeJob::GetJSContext into AudioContext

https://hg.mozilla.org/integration/mozilla-inbound/rev/dbb94129fec6
Attachment #737809 - Flags: checkin+
Comment on attachment 737811 [details] [diff] [review]
WIP

Review of attachment 737811 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/ScriptProcessorNode.h
@@ +45,5 @@
> +  void DispatchAudioProcessEvent(AudioBuffer* aInputBuffer,
> +                                 AudioBuffer* aOutputBuffer);
> +  friend class ScriptProcessorNodeEngine;
> +  // This class implement a queue of input/output AudioBuffers shared between
> +  // the main thread and the Media Stream Graph thread.

It implements two queues, right?

Could we refactor this so there's a single queue class that gets used once for input and once for output?

@@ +111,5 @@
> +    static const unsigned PreAllocatedBuffers = 10;
> +    // Synchronizes access to mInputBuffers and mCurrentInputBuffer
> +    Mutex mInputMutex;
> +    // Synchronizes access to mOutputBuffers, mCurrentOutputBuffer
> +    // and mOutpytBufferToConsume

typo

@@ +129,5 @@
> +    // write to.
> +    OutputBufferList::iterator mCurrentOutputBuffer;
> +    // Iterator pointing to the current output buffer that the engine
> +    // will read from.  Initially this is an invalid iterator.
> +    OutputBufferList::iterator mOutputBufferToConsume;

It's not clear to me why we need these iterators. Can't we just have a queue object that supports writing to one end and reading from the other? i.e. a pipe?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11)
> Comment on attachment 737811 [details] [diff] [review]
> WIP
> 
> Review of attachment 737811 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/media/webaudio/ScriptProcessorNode.h
> @@ +45,5 @@
> > +  void DispatchAudioProcessEvent(AudioBuffer* aInputBuffer,
> > +                                 AudioBuffer* aOutputBuffer);
> > +  friend class ScriptProcessorNodeEngine;
> > +  // This class implement a queue of input/output AudioBuffers shared between
> > +  // the main thread and the Media Stream Graph thread.
> 
> It implements two queues, right?

Yeah, I'll adjust the comment.

> Could we refactor this so there's a single queue class that gets used once
> for input and once for output?

Yes, and that is a great idea.  I did that and I think the code is much simpler now.

> @@ +129,5 @@
> > +    // write to.
> > +    OutputBufferList::iterator mCurrentOutputBuffer;
> > +    // Iterator pointing to the current output buffer that the engine
> > +    // will read from.  Initially this is an invalid iterator.
> > +    OutputBufferList::iterator mOutputBufferToConsume;
> 
> It's not clear to me why we need these iterators. Can't we just have a queue
> object that supports writing to one end and reading from the other? i.e. a
> pipe?

I now have a queue object which has a Produce() method which marks a buffer at the end as ready and a Consume() method which removes a buffer at the start when it can be consumed.

The iterators are necessary since I carefully avoid copying InputBuffer/OutputBuffer objects on the MSG thread, as the AudioBuffer object stored inside them cannot be AddRef'ed/Released on that thread.  So I need "pointers" that I can pass to the main thread and do everything with AudioBuffer there.  I've documented this a bit in the new patch which I'll attach shortly.
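The reason `std::list` iterators can serve as cross-thread "pointers" here is that, unlike `std::vector` iterators, they stay valid while other elements are added or removed; only erasing the pointed-to element invalidates them. A minimal STL demonstration of that guarantee (not the patch's actual classes):

```cpp
#include <list>

// std::list guarantees that inserting or erasing *other* elements never
// invalidates an existing iterator, so an iterator obtained on one thread
// can be handed to another thread as a stable handle to one buffer slot.
struct Buffer { int id; };

std::list<Buffer>::iterator Produce(std::list<Buffer>& aQueue, int aId) {
  aQueue.push_back(Buffer{aId});
  return --aQueue.end();  // stable handle to the element just appended
}
```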
Attached patch WIP 2 (obsolete) — Splinter Review
This version of the patch addresses the comments above and makes things a bit more robust.  I have added an initial delay of 20ms in this patch before the ScriptProcessorNode emits any audio (see the mInitialDelay business inside ScriptProcessorNode::SharedBuffers::GetOutputBuffer()).  Even without this, our performance on 4KB buffers is a _lot_ better than WebKit's (in WebKit, the audio playback is basically choppy all the way through -- I'll attach the testcase after this.)

Note that the 4KB buffer size and the 10 buffer queue size are entirely arbitrary.

I've spent a bunch of time looking into what we can do to optimize this further, and I'm now less convinced that choosing an arbitrary buffer size is a good idea.  Without the initial delay, everything I tried results in choppy playback.  With an initial delay of <10ms, there is not much improvement.  Increasing the delay to 20ms makes things a _lot_ better, giving us almost non-choppy playback.  Anything more than 20ms doesn't bring a noticeable improvement.  The rates of production and consumption of the buffers vary a lot, and I doubt there is much that can be done in this implementation to improve things.  Profiling doesn't show anything in particular being very hot -- and the locking is pretty cheap too.

Given that we can adjust the number of preallocated blocks based on the buffer size that the author requests, I'm beginning to think that perhaps we should honor the buffer size requested by the author and just change the number of blocks to give us enough breathing room for a ~20ms initial delay.  I'm curious to know what you think.

Also, given the quality of the implementation in WebKit (which uses double buffering as opposed to our buffer queue approach), I am beginning to really doubt whether ScriptProcessorNode is going to be useful to any real web application.  Once we settle on the implementation on our side, I'm planning to talk to the WebKit/Blink folks to see if they can improve theirs, but I'm beginning to think that unless authors ask for it, perhaps we should not invest too much in improving things here... :(
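For scale, here is the arithmetic behind the 20ms figure, assuming the default 44100 Hz rate (a sketch; the 4096-frame buffer and 10-buffer queue are the arbitrary values mentioned above):

```cpp
#include <cassert>

constexpr double kSampleRate = 44100.0;
// Frames needed to cover a 20 ms initial delay: 882, i.e. about 7 of the
// 128-frame MSG blocks -- roughly a fifth of a single 4096-frame buffer,
// so 10 preallocated buffers leave ample headroom.
constexpr double kDelayFrames = 0.020 * kSampleRate;
constexpr double kDelayBlocks = kDelayFrames / 128.0;
constexpr double kBufferMs = 4096.0 / kSampleRate * 1000.0;  // ~92.9 ms
```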
Attachment #737811 - Attachment is obsolete: true
Attachment #737811 - Flags: feedback?(roc)
Attachment #738311 - Flags: feedback?(roc)
Attached file testcase
(Note that I have adjusted the testcase to also request 4KB buffers so that a comparison between Firefox and Chrome would be fair!)
Chatted with roc about this on IRC: http://pastebin.mozilla.org/2312718
BTW, Ms2ger told me today that I should use JSAutoRequest here.  Boris, do you mind skimming over the AutoPushJSContext's in the patch and confirming that?  (Also, is AutoPushJSContext the right thing to use in order to get a JSContext*?)
Flags: needinfo?(bzbarsky)
Yeah, you need the JSAutoRequest.

It can't hurt to do the AutoPushJSContext, though if you're not actually running script on it, then it might not actually need to be pushed.  Proving that script won't run is usually too painful to be worth the bother...
Flags: needinfo?(bzbarsky)
(In reply to comment #19)
> Yeah, you need the JSAutoRequest.
> 
> It can't hurt to do the AutoPushJSContext, though if you're not actually
> running script on it, then it might not actually need to be pushed.  Proving
> that script won't run is usually too painful to be worth the bother...

Just to confirm, I need to do both right?
I think that would be best, yes.
Attached patch WIP 3 (obsolete) — Splinter Review
This patch implements the ideas we talked about yesterday.  Now we have no input queue; we just hand off the input buffer to the main thread when it fills up.  The AudioBuffer objects are now created on the main thread when needed, eliminating the need for the iterators.  Every time we produce an output buffer on the main thread, we steal its contents and append it to the output queue.  Every time there is a buffer in the output queue on the MSG side, we output it; otherwise we output silence.

The glitching behavior is improved a _lot_ in this patch, and when glitches do happen, we recover quickly.  I think this might be the best we can achieve with the current spec.  The code is also a lot simpler than the previous patch, which is a win-win situation!

I experimented with different buffer sizes, and I confirmed that the size of the buffers makes no difference to the glitching behavior, so I'm planning to change the patch to respect the buffer size that the author has requested.  I also need to hook things up to the CC, set the AudioProcessingEvent.playbackTime, write tests, and submit the final patch for review.  Other than the above, consider this patch finished.
Attachment #738311 - Attachment is obsolete: true
Attachment #738311 - Flags: feedback?(roc)
Attachment #738861 - Flags: feedback?(roc)
Comment on attachment 738861 [details] [diff] [review]
WIP 3

Review of attachment 738861 [details] [diff] [review]:
-----------------------------------------------------------------

It seems to me the output queue might be simpler if it was just a linked list of AudioChunks.

::: content/media/webaudio/AudioContext.cpp
@@ +104,5 @@
> +AudioContext::CreateScriptProcessor(uint32_t aBufferSize,
> +                                    uint32_t aNumberOfInputChannels,
> +                                    uint32_t aNumberOfOutputChannels)
> +{
> +  // TODO: Handle the number of input and output channels somehow

I think this is actually important. If we don't handle these parameters then we're incompatible with the spec.

Supporting these parameters shouldn't be hard. aNumberOfInputChannels would override AudioNodeStream::ObtainInputBlock's outputChannelCount decision; you'd have to extend AudioNodeStream::ObtainInputBlock to downmix inputs to outputChannelCount if necessary.

aNumberOfOutputChannels is even easier, just use it where you currently have "outputBuffer->InitializeBuffers(2, cx)".

::: content/media/webaudio/AudioProcessingEvent.cpp
@@ +12,5 @@
> +
> +NS_IMPL_CYCLE_COLLECTION_INHERITED_2(AudioProcessingEvent, nsDOMEvent,
> +                                     mInputBuffer, mOutputBuffer)
> +
> +NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(AudioProcessingEvent)

Don't you need to traverse mInputBuffer/mOutputBuffer here?

::: content/media/webaudio/ScriptProcessorNode.cpp
@@ +52,5 @@
> +    mInputWriteIndex += aInput.GetDuration();
> +    if (mInputWriteIndex >= mBufferSize) {
> +      // we now have a full input buffer ready to be sent to the main thread.
> +      mSharedBuffers.SendBuffersToMainThread(mInputChannels[0].forget(),
> +                                             mInputChannels[1].forget());

I think it might be better to pass an AudioChunk directly here. That might be simpler, and in the case where a ScriptProcessorNode is only producing output, we'd pass a Null audio chunk which means we don't have to allocate buffers full of silence here, and if inputBuffer allocates AudioBuffer lazily (and isn't called), we won't ever have to create silent input buffers at all.

@@ +200,5 @@
> +          return NS_OK;
> +        }
> +        // Put out left and right channel data inside it
> +        inputBuffer->SetRawChannelContents(cx, 0, mLeftChannel);
> +        inputBuffer->SetRawChannelContents(cx, 1, mRightChannel);

So you decided not to make this lazy? I kinda think it would be almost as simple just to make allocation of inputBuffer and outputBuffer lazy.

::: content/media/webaudio/ScriptProcessorNode.h
@@ +37,5 @@
> +
> +  uint32_t BufferSize() const
> +  {
> +    // TODO: See if we can choose a better buffer size
> +    return 4096;

Should return the real buffer size

@@ +61,5 @@
> +      Mutex& Lock() { return mMutex; }
> +
> +      size_t ReadyToConsume() const
> +      {
> +        return mBufferList.size();

assert that the lock is held

@@ +67,5 @@
> +
> +      // Returns a refertence to the next buffer being consumed
> +      OutputBuffer& NextToConsume()
> +      {
> +        MOZ_ASSERT(ReadyToConsume() > 0);

assert that the lock is held

@@ +74,5 @@
> +
> +      // Produce one buffer
> +      OutputBuffer& Produce()
> +      {
> +        mBufferList.push_back(OutputBuffer());

assert that the lock is held

@@ +81,5 @@
> +
> +      // Consumes one buffer.
> +      void Consume()
> +      {
> +        MOZ_ASSERT(ReadyToConsume() > 0);

assert that the lock is held

::: dom/webidl/AudioProcessingEvent.webidl
@@ +15,5 @@
> +
> +    readonly attribute double playbackTime;
> +    readonly attribute AudioBuffer inputBuffer;
> +    readonly attribute AudioBuffer outputBuffer;
> +

Are these blank lines really the correct style?
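The "assert that the lock is held" requests above can be modeled in plain C++ like this. This is a hypothetical sketch: Gecko's Mutex has an AssertCurrentThreadOwns() for exactly this purpose, while this standalone version passes a `lock_guard` reference as a compile-time proof of locking instead.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Model of the review request: every method touching the shared buffer
// list demands evidence that the caller holds the lock, so unlocked
// access fails to compile rather than racing silently.
class OutputQueue {
public:
  std::mutex& Lock() { return mMutex; }
  size_t ReadyToConsume(const std::lock_guard<std::mutex>&) const {
    return mBufferList.size();
  }
  // Produce one buffer slot and return a reference for the caller to fill.
  int& Produce(const std::lock_guard<std::mutex>&) {
    mBufferList.push_back(0);
    return mBufferList.back();
  }
  // Consume one buffer from the front of the queue.
  void Consume(const std::lock_guard<std::mutex>& aProof) {
    assert(ReadyToConsume(aProof) > 0);
    mBufferList.pop_front();
  }
private:
  std::mutex mMutex;
  std::deque<int> mBufferList;
};
```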
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #23)
> Comment on attachment 738861 [details] [diff] [review]
> WIP 3
> 
> Review of attachment 738861 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It seems to me the output queue might be simpler if it was just a linked
> list of AudioChunks.

Will do.

> ::: content/media/webaudio/AudioContext.cpp
> @@ +104,5 @@
> > +AudioContext::CreateScriptProcessor(uint32_t aBufferSize,
> > +                                    uint32_t aNumberOfInputChannels,
> > +                                    uint32_t aNumberOfOutputChannels)
> > +{
> > +  // TODO: Handle the number of input and output channels somehow
> 
> I think this is actually important. If we don't handle these parameters then
> we're incompatible with the spec.
> 
> Supporting these parameters shouldn't be hard. aNumberOfInputChannels would
> override AudioNodeStream::ObtainInputBlock's outputChannelCount decision;
> you'd have to extend AudioNodeStream::ObtainInputBlock to downmix inputs to
> outputChannelCount if necessary.
> 
> aNumberOfOutputChannels is even easier, just use it where you currently have
> "outputBuffer->InitializeBuffers(2, cx)".

Oh yeah sorry, this was addressed to myself before I finish this patch.  :-)  I'll definitely do this (and test it) before I submit this for review.

> ::: content/media/webaudio/AudioProcessingEvent.cpp
> @@ +12,5 @@
> > +
> > +NS_IMPL_CYCLE_COLLECTION_INHERITED_2(AudioProcessingEvent, nsDOMEvent,
> > +                                     mInputBuffer, mOutputBuffer)
> > +
> > +NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(AudioProcessingEvent)
> 
> Don't you need to traverse mInputBuffer/mOutputBuffer here?

mInputBuffer doesn't exist any more ;-) but yeah, hence my "hook things up to CC" comment.

> ::: content/media/webaudio/ScriptProcessorNode.cpp
> @@ +52,5 @@
> > +    mInputWriteIndex += aInput.GetDuration();
> > +    if (mInputWriteIndex >= mBufferSize) {
> > +      // we now have a full input buffer ready to be sent to the main thread.
> > +      mSharedBuffers.SendBuffersToMainThread(mInputChannels[0].forget(),
> > +                                             mInputChannels[1].forget());
> 
> I think it might be better to pass an AudioChunk directly here. That might
> be simpler, and in the case where a ScriptProcessorNode is only producing
> output, we'd pass a Null audio chunk which means we don't have to allocate
> buffers full of silence here, and if inputBuffer allocates AudioBuffer
> lazily (and isn't called), we won't ever have to create silent input buffers
> at all.

Good idea; I think we should do this.  Remember, though, that mInputChannels is not an AudioChunk.  But this can easily be done by remembering whether all of the input chunks in a buffer were null, in which case we would pass that information down to here in order to do what you suggest.

> @@ +200,5 @@
> > +          return NS_OK;
> > +        }
> > +        // Put out left and right channel data inside it
> > +        inputBuffer->SetRawChannelContents(cx, 0, mLeftChannel);
> > +        inputBuffer->SetRawChannelContents(cx, 1, mRightChannel);
> 
> So you decided not to make this lazy? I kinda think it would be almost as
> simple just to make allocation of inputBuffer and outputBuffer lazy.

Yeah, I wanted to do this and the point before in a follow-up, but I guess this is very simple.

> ::: content/media/webaudio/ScriptProcessorNode.h
> @@ +37,5 @@
> > +
> > +  uint32_t BufferSize() const
> > +  {
> > +    // TODO: See if we can choose a better buffer size
> > +    return 4096;
> 
> Should return the real buffer size

Yep.  This is just a placeholder.

> @@ +61,5 @@
> > +      Mutex& Lock() { return mMutex; }
> > +
> > +      size_t ReadyToConsume() const
> > +      {
> > +        return mBufferList.size();
> 
> assert that the lock is held

There was a comment here at some point saying that it's the responsibility of this class's callers to lock every access, but I accidentally removed it at some point.  Asserting is better than expecting people to read the comment, though.

> ::: dom/webidl/AudioProcessingEvent.webidl
> @@ +15,5 @@
> > +
> > +    readonly attribute double playbackTime;
> > +    readonly attribute AudioBuffer inputBuffer;
> > +    readonly attribute AudioBuffer outputBuffer;
> > +
> 
> Are these blank lines really the correct style?

I copied these verbatim from the spec.  I can fix these wholesale in the spec and in our copies if you want me to.
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #24)
> > ::: content/media/webaudio/AudioProcessingEvent.cpp
> > @@ +12,5 @@
> > > +
> > > +NS_IMPL_CYCLE_COLLECTION_INHERITED_2(AudioProcessingEvent, nsDOMEvent,
> > > +                                     mInputBuffer, mOutputBuffer)
> > > +
> > > +NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(AudioProcessingEvent)
> > 
> > Don't you need to traverse mInputBuffer/mOutputBuffer here?
> 
> mInputBuffer doesn't exist any more ;-) but yeah, hence my "hook things up
> to CC" comment.

Hmm, wait, I'm traversing both of them with NS_IMPL_CYCLE_COLLECTION_INHERITED_2...
This addresses all of the feedback comments above, and should be ready for review.

The only problem with this patch is that when I'm computing the playback time inside WebAudioUtils::StreamPositionToDestinationTime, I get this assertion: "Don't ask about times where we haven't made blocking decisions yet" <http://mxr.mozilla.org/mozilla-central/source/content/media/MediaStreamGraph.cpp#228>.  As far as I can tell, this is because I'm trying to compute a time that is in the future.  Is this supported?  Or is there a better way to do this?

Note that I still need to write the tests here, but I didn't have enough time to do that today, and was too excited not to submit the patch right now.  I'll submit the tests in a different patch.
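As a rough model of the playbackTime computation being discussed (tick arithmetic only; the real code goes through WebAudioUtils::StreamPositionToDestinationTime, and roc questions below whether both added terms belong, so treat this as what the WIP patch does, not settled behavior):

```cpp
#include <cstdint>

using TrackTicks = int64_t;

// Rough model: the playback time for an audio-process event is the stream's
// current position, plus the block just consumed, plus the one-buffer delay
// inherent in the ScriptProcessorNode design, converted from ticks to seconds.
double PlaybackTime(TrackTicks aCurrentPosition, TrackTicks aBlockDuration,
                    TrackTicks aBufferSize, double aSampleRate) {
  TrackTicks playbackTick = aCurrentPosition + aBlockDuration + aBufferSize;
  return static_cast<double>(playbackTick) / aSampleRate;
}
```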
Attachment #738861 - Attachment is obsolete: true
Attachment #738861 - Flags: feedback?(roc)
Attachment #739383 - Flags: review?(roc)
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #26)
> The only problem with this patch is that when I'm computing the playback
> time inside WebAudioUtils::StreamPositionToDestinationTime, I get this
> assertion: "Don't ask about times where we haven't made blocking decisions
> yet"
> <http://mxr.mozilla.org/mozilla-central/source/content/media/
> MediaStreamGraph.cpp#228>.  As far as I can tell, this is because I'm trying
> to compute a time that is in the future.  Is this supported?  Or is there a
> better way to do this?

You can call GraphTimeToStreamTimeOptimistic to get around that assertion.
(In reply to comment #27)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #26)
> > The only problem with this patch is that when I'm computing the playback
> > time inside WebAudioUtils::StreamPositionToDestinationTime, I get this
> > assertion: "Don't ask about times where we haven't made blocking decisions
> > yet"
> > <http://mxr.mozilla.org/mozilla-central/source/content/media/
> > MediaStreamGraph.cpp#228>.  As far as I can tell, this is because I'm trying
> > to compute a time that is in the future.  Is this supported?  Or is there a
> > better way to do this?
> 
> You can call GraphTimeToStreamTimeOptimistic to get around that assertion.

That's not at all obvious from the name of the function!  ;-)  But it works great!
Attachment #739383 - Attachment is obsolete: true
Attachment #739383 - Flags: review?(roc)
Attachment #739389 - Flags: review?(roc)
Comment on attachment 739383 [details] [diff] [review]
Part 2: Implement ScriptProcessorNode

Review of attachment 739383 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/AudioContext.cpp
@@ +101,5 @@
>  }
>  
> +namespace {
> +
> +bool InvalidBufferSize(uint32_t aBufferSize) {

Better to avoid unnecessary negatives and make this IsValidBufferSize().

@@ +126,5 @@
> +                                    uint32_t aNumberOfOutputChannels,
> +                                    ErrorResult& aRv)
> +{
> +  if (aNumberOfInputChannels == 0 || aNumberOfOutputChannels == 0 ||
> +      aNumberOfInputChannels > 32 || aNumberOfOutputChannels > 32 ||

Why 32? If 32 is the max channels we're going to support, make this a constant somewhere.

Is there a reason to limit to 32? Why not make this something very large so we're just doing a sanity check?

::: content/media/webaudio/ScriptProcessorNode.cpp
@@ +89,5 @@
> +  void FinishProducingOutputBuffer(ThreadSharedFloatArrayBufferList* aBuffer,
> +                                   uint32_t aBufferSize);
> +
> +  // graph thread
> +  AudioChunk GetOutputBuffer();

Since this class is local to a single translation unit you might as well put all method implementations inline. I don't think they'll really be inlined if they're used in more than one place.

@@ +101,5 @@
> +private:
> +  OutputQueue mOutputQueue;
> +  // How much delay we've seen so far.  This measures the amount of delay
> +  // caused by the main thread lagging behind in producing output buffers.
> +  // TRACK_TICKS_MAX means that we have not received our first buffer yet.

Why isn't 0 the initial value?

@@ +140,5 @@
> +    // First, record our input buffer
> +    for (uint32_t i = 0; i < mInputChannels.Length(); ++i) {
> +      if (aInput.IsNull()) {
> +        memset(mInputChannels[i] + mInputWriteIndex, 0,
> +               sizeof(float) * aInput.GetDuration());

Use PodSet

@@ +144,5 @@
> +               sizeof(float) * aInput.GetDuration());
> +      } else {
> +        mSeenNonSilenceInput = true;
> +        memcpy(mInputChannels[i] + mInputWriteIndex, aInput.mChannelData[i],
> +               sizeof(float) * aInput.GetDuration());

Use PodCopy

@@ +160,5 @@
> +      TrackTicks playbackTick = aStream->GetCurrentPosition();
> +      // Add the duration of the current sample
> +      playbackTick += aInput.GetDuration();
> +      // Add the initial delay caused by our buffer size
> +      playbackTick += mBufferSize;

I'm suspicious of having to add both mBufferSize and aInput.GetDuration() here.

@@ +170,5 @@
> +                                                       mSource,
> +                                                       mDestination);
> +
> +      SendBuffersToMainThread(mInputChannels, playbackTime);
> +      mInputWriteIndex %= mBufferSize;

This should be mInputWriteIndex -= mBufferSize, since we're sending exactly one buffer.

@@ +180,5 @@
> +private:
> +  void AllocateInputBlock()
> +  {
> +    for (unsigned i = 0; i < mInputChannels.Length(); ++i) {
> +      mInputChannels[i] = new float[mBufferSize];

skip the allocation if mInputChannels is null (which it can be, if the previous input was silent)

@@ +303,5 @@
> +  return buffer;
> +}
> +
> +void
> +ScriptProcessorNodeEngine::SendBuffersToMainThread(const InputChannels& aInputChannels,

This aInputChannels parameter is unused.

@@ +304,5 @@
> +}
> +
> +void
> +ScriptProcessorNodeEngine::SendBuffersToMainThread(const InputChannels& aInputChannels,
> +                                                   double aPlaybackTime)

I think calculation of aPlaybackTime can be moved into this function.

::: content/media/webaudio/ScriptProcessorNode.h
@@ +60,5 @@
> +
> +  using nsDOMEventTargetHelper::DispatchTrustedEvent;
> +
> +private:
> +  nsAutoPtr<SharedBuffers> mSharedBuffers;

Why not just make this a direct member of ScriptProcessorNode?
Attachment #739383 - Attachment is obsolete: false
(In reply to comment #29)
> @@ +126,5 @@
> > +                                    uint32_t aNumberOfOutputChannels,
> > +                                    ErrorResult& aRv)
> > +{
> > +  if (aNumberOfInputChannels == 0 || aNumberOfOutputChannels == 0 ||
> > +      aNumberOfInputChannels > 32 || aNumberOfOutputChannels > 32 ||
> 
> Why 32? If 32 is the max channels we're going to support, make this a constant
> somewhere.
> 
> Is there a reason to limit to 32? Why not make this something very large so
> we're just doing a sanity check?

What value should I pick?

> ::: content/media/webaudio/ScriptProcessorNode.cpp
> @@ +89,5 @@
> > +  void FinishProducingOutputBuffer(ThreadSharedFloatArrayBufferList* aBuffer,
> > +                                   uint32_t aBufferSize);
> > +
> > +  // graph thread
> > +  AudioChunk GetOutputBuffer();
> 
> Since this class is local to a single translation unit you might as well put
> all method implementations inline. I don't think they'll really be inlined if
> they're used in more than one place.

OK.

> @@ +101,5 @@
> > +private:
> > +  OutputQueue mOutputQueue;
> > +  // How much delay we've seen so far.  This measures the amount of delay
> > +  // caused by the main thread lagging behind in producing output buffers.
> > +  // TRACK_TICKS_MAX means that we have not received our first buffer yet.
> 
> Why isn't 0 the initial value?

Because 0 is a valid value.

> @@ +160,5 @@
> > +      TrackTicks playbackTick = aStream->GetCurrentPosition();
> > +      // Add the duration of the current sample
> > +      playbackTick += aInput.GetDuration();
> > +      // Add the initial delay caused by our buffer size
> > +      playbackTick += mBufferSize;
> 
> I'm suspicious of having to add both mBufferSize and aInput.GetDuration() here.

mBufferSize is added to account for the initial delay caused by the ScriptProcessorNode.  We basically wait for mBufferSize frames initially before we fill up the first AudioBuffer being passed to the main thread.  GetDuration() is added to account for the frames seen this time.

> @@ +170,5 @@
> > +                                                       mSource,
> > +                                                       mDestination);
> > +
> > +      SendBuffersToMainThread(mInputChannels, playbackTime);
> > +      mInputWriteIndex %= mBufferSize;
> 
> This should be mInputWriteIndex -= mBufferSize, since we're sending exactly one
> buffer.

OK, I guess they would both do the same thing here.

> @@ +180,5 @@
> > +private:
> > +  void AllocateInputBlock()
> > +  {
> > +    for (unsigned i = 0; i < mInputChannels.Length(); ++i) {
> > +      mInputChannels[i] = new float[mBufferSize];
> 
> skip the allocation if mInputChannels is null (which it can be, if the previous
> input was silent)

Hmm, no, mInputChannels should never be null.

> @@ +303,5 @@
> > +  return buffer;
> > +}
> > +
> > +void
> > +ScriptProcessorNodeEngine::SendBuffersToMainThread(const InputChannels& aInputChannels,
> 
> This aInputChannels parameter is unused.

Good catch!
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #29)
> ::: content/media/webaudio/ScriptProcessorNode.h
> @@ +60,5 @@
> > +
> > +  using nsDOMEventTargetHelper::DispatchTrustedEvent;
> > +
> > +private:
> > +  nsAutoPtr<SharedBuffers> mSharedBuffers;
> 
> Why not just make this a direct member of ScriptProcessorNode?

Mostly did that in order to be able to move the definition of SharedBuffers to the cpp file.
You're right about not adding mBufferSize to the playback time, since that is already encoded in the stream's position, but we still have to add the input duration to properly account for everything that we've seen so far.
Attachment #739383 - Attachment is obsolete: true
Attachment #739389 - Attachment is obsolete: true
Attachment #739389 - Flags: review?(roc)
Attachment #739624 - Flags: review?(roc)
Depends on: 863884
Comment on attachment 739624 [details] [diff] [review]
Part 2: Implement ScriptProcessorNode

Review of attachment 739624 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/AudioContext.cpp
@@ +23,5 @@
>  #include "nsNetUtil.h"
>  
> +// Note that this number is an arbitrary large value to protect against OOM
> +// attacks.
> +#define MAX_SCRIPT_PROCESSOR_CHANNELS 10000

Use a const int instead of a #define

::: content/media/webaudio/ScriptProcessorNode.cpp
@@ +211,5 @@
> +private:
> +  void AllocateInputBlock()
> +  {
> +    for (unsigned i = 0; i < mInputChannels.Length(); ++i) {
> +      mInputChannels[i] = new float[mBufferSize];

mInputChannels[i] can be null --- if the buffer was filled with silence, forget() won't be called on it and we should just reuse it.

So, skip reallocation if mInputChannels[i] is non-null here.
Attachment #739624 - Flags: review?(roc) → review+
So, in order to satisfy the cycle collector, I need to use a thread-safe WeakPtr instead of an nsMainThreadPtrHolder, since the latter incurs a strong reference.

(Note that ThreadSafeWeakPtr.h is intended to be copied back into mfbt at some point so I just copied mfbt's style there.)
Attachment #739624 - Attachment is obsolete: true
Attachment #739861 - Flags: review?(roc)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #34)
> Comment on attachment 739624 [details] [diff] [review]
> Part 2: Implement ScriptProcessorNode
> 
> Review of attachment 739624 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/media/webaudio/AudioContext.cpp
> @@ +23,5 @@
> >  #include "nsNetUtil.h"
> >  
> > +// Note that this number is an arbitrary large value to protect against OOM
> > +// attacks.
> > +#define MAX_SCRIPT_PROCESSOR_CHANNELS 10000
> 
> Use a const int instead of a #define
> 
> ::: content/media/webaudio/ScriptProcessorNode.cpp
> @@ +211,5 @@
> > +private:
> > +  void AllocateInputBlock()
> > +  {
> > +    for (unsigned i = 0; i < mInputChannels.Length(); ++i) {
> > +      mInputChannels[i] = new float[mBufferSize];
> 
> mInputChannels[i] can be null --- if the buffer was filled with silence,
> forget() won't be called on it and we should just reuse it.
> 
> So, skip reallocation if mInputChannels[i] is non-null here.

I will address both of these before landing of course.

Note that I also used this new node to test AudioBufferSourceNode and it works like a charm.  The test patches need a bit more work and I will attach them once they're ready.
We discussed things on IRC and Ehsan agreed to remove the WeakPtr stuff.
No longer depends on: 863884
Attachment #739861 - Attachment is obsolete: true
Attachment #739861 - Flags: review?(roc)
Attachment #739981 - Flags: review?(roc)
Attachment #739982 - Flags: review?(roc)
Attached patch Part 4: tests (obsolete) — Splinter Review
Attachment #739983 - Flags: review?(roc)
Sorry, forgot to address comment 34.
Attachment #739982 - Attachment is obsolete: true
Attachment #739982 - Flags: review?(roc)
Attachment #739985 - Flags: review?(roc)
Comment on attachment 739981 [details] [diff] [review]
Part 2: Add an AudioNode weak pointer to the AudioNodeEngine class

Review of attachment 739981 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with those cosmetic changes.

::: content/media/AudioNodeEngine.h
@@ +206,5 @@
> +
> +  template <class NodeType>
> +  NodeType* Node() const
> +  {
> +    return static_cast<NodeType*>(mNode);

Assert we're on the main thread.

I'm not sure we should hide the static_cast in here. I think it's probably more honest to move the static_cast up into callers.

::: content/media/webaudio/AudioNode.h
@@ +82,5 @@
>    // this method.
>    virtual void DestroyMediaStream()
>    {
>      if (mStream) {
> +      ReleaseEngineReference();

This name isn't very good since we aren't releasing the engine reference. I would just inline the function into here to avoid the problem.
Attachment #739981 - Flags: review?(roc) → review+
Comment on attachment 739983 [details] [diff] [review]
Part 4: tests

Review of attachment 739983 [details] [diff] [review]:
-----------------------------------------------------------------

I think we'll probably want to factor more test code into webaudio.js --- e.g. code to set up the context, and set up a ScriptProcessorNode to collect test results and actually do the tests, so that testing the output of a node just requires creating and configuring the node and specifying the expected results. Maybe we should do that now?

::: content/media/webaudio/test/webaudio.js
@@ +41,5 @@
> +  for (var i = offset || 0; i < Math.min(buf1.length, (offset || 0) + length); ++i) {
> +    if (!fuzzyCompare(buf1[i + sourceOffset], buf2[i + destOffset])) {
> +      is(buf1[i + sourceOffset], buf2[i + destOffset], "Difference in offset " + i +
> +         " at source offset " + sourceOffset +
> +         " and destination offset " + destOffset);

Probably should only report the first difference. Otherwise, if buffers are completely different, we'll get massive test failure spam.

We may wish to follow reftests and print the number of different samples and the max difference.
(In reply to comment #43)
> I think we'll probably want to factor more test code into webaudio.js --- e.g.
> code to set up the context, and set up a ScriptProcessorNode to collect test
> results and actually do the tests, so that for testing the output of a node
> just requires creating and configuring the node and specifying the expected
> results. Maybe we should do that now?

I have thought about this, and yes, we probably want to come up with a common infrastructure at some point, but I'd rather do that later, once I have a better understanding of what I'd like it to look like.

There's still some experimentation to be done to see how we can make the tests more robust, etc.

> ::: content/media/webaudio/test/webaudio.js
> @@ +41,5 @@
> > +  for (var i = offset || 0; i < Math.min(buf1.length, (offset || 0) + length); ++i) {
> > +    if (!fuzzyCompare(buf1[i + sourceOffset], buf2[i + destOffset])) {
> > +      is(buf1[i + sourceOffset], buf2[i + destOffset], "Difference in offset " + i +
> > +         " at source offset " + sourceOffset +
> > +         " and destination offset " + destOffset);
> 
> Probably should only report the first difference. Otherwise, if buffers are
> completely different, we'll get massive test failure spam.
> 
> We may wish to follow reftests and print the number of different samples and
> the max difference.

OK I'll do that.
Attached patch Part 4: testsSplinter Review
Attachment #739983 - Attachment is obsolete: true
Attachment #739983 - Flags: review?(roc)
Attachment #740015 - Attachment is patch: true
Attachment #740015 - Flags: review?(roc)
I saw an assertion on the try server when dispatching the event <https://tbpl.mozilla.org/php/getParsedLog.php?id=22058857&tree=Try&full=1#error0>:

14:39:43     INFO -  [Parent 894] ###!!! ASSERTION: This is unsafe! Fix the caller!: 'Error', file ../../../../content/events/src/nsEventDispatcher.cpp, line 494
14:39:43     INFO -  nsEventDispatcher::DispatchDOMEvent(nsISupports*, nsEvent*, nsIDOMEvent*, nsPresContext*, nsEventStatus*) [content/events/src/nsEventDispatcher.cpp:698]
14:39:43     INFO -  nsDOMEventTargetHelper::DispatchEvent(nsIDOMEvent*, bool*) [content/events/src/nsDOMEventTargetHelper.cpp:257]
14:39:43     INFO -  nsDOMEventTargetHelper::DispatchTrustedEvent(nsIDOMEvent*) [content/events/src/nsDOMEventTargetHelper.cpp:278]
14:39:43     INFO -  mozilla::dom::ScriptProcessorNodeEngine::SendBuffersToMainThread(mozilla::AudioNodeStream*)::Command::Run()
14:39:43     INFO -  nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627]
14:39:43     INFO -  NS_ProcessNextEvent(nsIThread*, bool) [obj-firefox/xpcom/build/nsThreadUtils.cpp:238]
14:39:43     INFO -  nsThread::Shutdown() [xpcom/threads/nsThread.cpp:473]
14:39:43     INFO -  nsThreadPool::Shutdown() [obj-firefox/dist/include/nsTArray.h:363]
14:39:43     INFO -  mozilla::MediaBufferDecoder::Shutdown() [obj-firefox/dist/include/nsCOMPtr.h:476]
14:39:43     INFO -  nsGlobalWindow::FreeInnerObjects() [dom/base/nsGlobalWindow.cpp:1493]
14:39:43     INFO -  nsGlobalWindow::SetNewDocument(nsIDocument*, nsISupports*, bool) [dom/base/nsGlobalWindow.cpp:2344]
14:39:43     INFO -  nsDocumentViewer::InitInternal(nsIWidget*, nsISupports*, nsIntRect const&, bool, bool, bool) [layout/base/nsDocumentViewer.cpp:940]
14:39:43     INFO -  nsDocumentViewer::Init(nsIWidget*, nsIntRect const&) [layout/base/nsDocumentViewer.cpp:683]
14:39:43     INFO -  nsDocShell::SetupNewViewer(nsIContentViewer*) [docshell/base/nsDocShell.cpp:8189]
14:39:43     INFO -  nsDocShell::Embed(nsIContentViewer*, char const*, nsISupports*) [obj-firefox/dist/include/nsCOMPtr.h:782]
14:39:43     INFO -  nsDocShell::CreateContentViewer(char const*, nsIRequest*, nsIStreamListener**) [docshell/base/nsDocShell.cpp:7976]
14:39:43     INFO -  nsDSURIContentListener::DoContent(char const*, bool, nsIRequest*, nsIStreamListener**, bool*) [docshell/base/nsDSURIContentListener.cpp:123]
14:39:43     INFO -  nsDocumentOpenInfo::TryContentListener(nsIURIContentListener*, nsIChannel*) [uriloader/base/nsURILoader.cpp:658]
14:39:43     INFO -  nsDocumentOpenInfo::DispatchContent(nsIRequest*, nsISupports*) [uriloader/base/nsURILoader.cpp:360]
14:39:43     INFO -  nsDocumentOpenInfo::OnStartRequest(nsIRequest*, nsISupports*) [uriloader/base/nsURILoader.cpp:252]
14:39:43     INFO -  mozilla::net::nsHttpChannel::CallOnStartRequest() [netwerk/protocol/http/nsHttpChannel.cpp:970]
14:39:43     INFO -  mozilla::net::nsHttpChannel::ContinueOnStartRequest2(tag_nsresult) [netwerk/protocol/http/nsHttpChannel.cpp:4958]
14:39:43     INFO -  mozilla::net::nsHttpChannel::OnStartRequest(nsIRequest*, nsISupports*) [netwerk/protocol/http/nsHttpChannel.cpp:4930]
14:39:43     INFO -  nsInputStreamPump::OnStateStart() [netwerk/base/src/nsInputStreamPump.cpp:418]
14:39:43     INFO -  nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) [netwerk/base/src/nsInputStreamPump.cpp:369]
14:39:43     INFO -  nsInputStreamReadyEvent::Run() [obj-firefox/dist/include/nsCOMPtr.h:476]
14:39:43     INFO -  nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627]
14:39:43     INFO -  NS_ProcessPendingEvents(nsIThread*, unsigned int) [obj-firefox/xpcom/build/nsThreadUtils.cpp:188]
14:39:43     INFO -  nsBaseAppShell::NativeEventCallback() [widget/xpwidgets/nsBaseAppShell.cpp:98]
14:39:43     INFO -  nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:388]

The fix is simple: I'll just add a script runner if it's not safe to run scripts.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #42)
> ::: content/media/webaudio/AudioNode.h
> @@ +82,5 @@
> >    // this method.
> >    virtual void DestroyMediaStream()
> >    {
> >      if (mStream) {
> > +      ReleaseEngineReference();
> 
> This name isn't very good since we aren't releasing the engine reference. I
> would just inline the function into here to avoid the problem.

Unfortunately, inlining this function means that I would need to pull AudioNodeStream.h into AudioNode.h, which results in #include hell.  I'll rename this to UnbindFromEngine.
Depends on: 864351
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #47)
> Unfortunately inlining this function means that I would need to pull
> AudioNodeStream.h into AudioNode.h, which results in an #include hell.

I think moving Destroy() into the .cpp file would be a better fix for that. Since it's virtual, there's no strong reason to leave it in the .h file.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #50)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #47)
> > Unfortunately inlining this function means that I would need to pull
> > AudioNodeStream.h into AudioNode.h, which results in an #include hell.
> 
> I think moving Destroy() into the .cpp file would be a better fix for that.
> Since it's virtual, there's no strong reason to leave it in the .h file.

Done: https://hg.mozilla.org/integration/mozilla-inbound/rev/3ff055568d93
Mass moving Web Audio bugs to the Web Audio component.  Filter on duckityduck.
Component: Video/Audio → Web Audio
Blocks: 894848
Blocks: 894856
No longer blocks: 894848