Closed Bug 815643 Opened 12 years ago Closed 11 years ago

Implement ConvolverNode

Categories

(Core :: Web Audio, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla24

People

(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)

References

(Depends on 1 open bug)

Details

(Keywords: doc-bug-filed)

Attachments

(9 files, 3 obsolete files)

12.13 KB, patch
roc
: review+
Details | Diff | Splinter Review
11.56 KB, patch
roc
: review+
Details | Diff | Splinter Review
79.17 KB, patch
roc
: review+
Details | Diff | Splinter Review
6.92 KB, patch
roc
: review+
Details | Diff | Splinter Review
3.01 KB, patch
roc
: review+
Details | Diff | Splinter Review
2.52 KB, patch
roc
: review+
Details | Diff | Splinter Review
8.91 KB, patch
roc
: review+
Details | Diff | Splinter Review
5.49 KB, patch
roc
: review+
Details | Diff | Splinter Review
82.46 KB, patch
roc
: review+
Details | Diff | Splinter Review
ConvolverNode is defined here: <https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#ConvolverNode-section>.  The implementation would be very similar to the other node types that we have implemented.
Attached patch WIP (obsolete) — Splinter Review
Incomplete patch - missing a test and a correct implementation of how it carries out a linear convolution.

A couple of questions. In the specification, 'impulse response' and 'audio data' are mentioned. Am I to assume they're the same?

In the attributes section (4.16.1), under buffer it states:
" At the time when this attribute is set, the buffer and the state of the normalize attribute will be used to configure the ConvolverNode with this impulse response having the given normalization". What's does |given normalization| refer to?

Similarly:
"If the normalize attribute is false when the buffer attribute is set then the ConvolverNode will perform a linear convolution given the exact impulse response contained within the buffer. "  
I'll need help in figuring out how it carries out the linear convolution and where it directs the output. In this WIP, I assume it's carried out in conjunction with the result of |calculateNormalizationScale| when the |buffer| attribute is set.
Assignee: nobody → andrew.quartey
Attachment #730905 - Flags: feedback?(ehsan)
Comment on attachment 730905 [details] [diff] [review]
WIP

Review of attachment 730905 [details] [diff] [review]:
-----------------------------------------------------------------

This is a great start!

::: content/media/webaudio/ConvolverNode.cpp
@@ +51,5 @@
> +    */
> +}
> +
> +/*
> + * Iimplementation of normalization scale algorithm as detailed in :

Nit: typo.

@@ +55,5 @@
> + * Iimplementation of normalization scale algorithm as detailed in :
> + * <https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#ConvolverNode-section>
> + */
> +float
> +ConvolverNode::ComputeNormalizationScale()

So the function inside the spec has been copied from the WebKit source code, and that function has diverged from the spec since then.  I have raised a spec issue about this.
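Since the normalization question comes up repeatedly in this bug, here is a rough sketch of what the spec's (WebKit-derived) algorithm computes: an RMS power estimate over all channels of the impulse-response buffer, clamped, inverted, and multiplied by a calibration constant. The constant values and the exact clamping below are assumptions for illustration, not the reviewed patch (and, per the comment above, the WebKit code has diverged from the spec text anyway).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hedged sketch of the normalization-scale computation. The constants
// are assumed values for illustration, not necessarily what the
// reviewed ComputeNormalizationScale uses.
float CalculateNormalizationScale(const std::vector<std::vector<float>>& aChannels)
{
  if (aChannels.empty() || aChannels[0].empty()) {
    return 1.0f;
  }

  const float GainCalibration = 0.00125f; // assumed perceived-loudness calibration
  const float MinPower = 0.000125f;       // assumed guard against division by ~0

  // RMS power over all channels of the impulse response.
  float power = 0.0f;
  for (const auto& channel : aChannels) {
    for (float sample : channel) {
      power += sample * sample;
    }
  }
  power = std::sqrt(power / (aChannels.size() * aChannels[0].size()));

  // Clamp away NaN/Inf and near-silent impulse responses.
  if (!std::isfinite(power) || power < MinPower) {
    power = MinPower;
  }
  return (1.0f / power) * GainCalibration;
}
```

A louder impulse response thus yields a smaller scale, so convolving with it does not change the perceived output volume.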

@@ +72,5 @@
> +  {
> +    JSObject* channelObj = mBuffer->GetChannelData(i);
> +    JSObject* arrayBuffer = JS_GetArrayBufferViewBuffer(channelObj);
> +    uint8_t* data =  JS_GetArrayBufferData(arrayBuffer);
> +    float* channelData = reinterpret_cast<float*>(data);

static_cast

::: content/media/webaudio/ConvolverNode.h
@@ +42,5 @@
> +
> +private:
> +  nsRefPtr<AudioBuffer> mBuffer;
> +  bool mNormalize;
> +  float ComputeNormalizationScale();

Nit: please separate the private methods and fields like this:

private:
  float method();

private:
  nsRefPtr field;
Attachment #730905 - Flags: feedback?(ehsan) → feedback+
(In reply to Andrew Quartey [:drexler] from comment #1)
> Created attachment 730905 [details] [diff] [review]
> WIP
> 
> Incomplete patch - missing a test and correct implementation of how it
> carries out a linear convolution. 
> 
> A couple of questions. In the specification, 'impulse response' and 'audio
> data' are mentioned. Am I to assume they're the same? 

I think the buffer attribute of ConvolverNode is the impulse response, and the incoming data flowing through it is the audio data.

> In the attributes section (4.16.1), under buffer it states:
> " At the time when this attribute is set, the buffer and the state of the
> normalize attribute will be used to configure the ConvolverNode with this
> impulse response having the given normalization". What does |given
> normalization| refer to?

this.normalize.

> Similarly:
> "If the normalize attribute is false when the buffer attribute is set then
> the ConvolverNode will perform a linear convolution given the exact impulse
> response contained within the buffer. "  

This means that in this case the impulse response will not be normalized and will be used as is.

> I'll need help in figuring out how it carries out the linear convolution and
> where it directs the output. In this WIP, I assume it's carried out in
> conjunction with the result of |calculateNormalizationScale| when the
> |buffer| attribute is set.

Implementing linear convolution is apparently quite challenging, see <https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/convolution.html>.  I talked to Chris Rogers about this last week, and he convinced me to consider borrowing this code from WebKit.  This code seems pretty self-contained, so I might end up doing that.

Are you interested in continuing to work on this beyond the DOM bindings?  If yes, the first step is to create an engine class and deliver arguments to it and make sure that you pass through the input buffer unmodified so that putting a ConvolverNode inside a graph doesn't mute playback.  Then I'd like to review the WebKit code in the mean time to decide what to do in order to implement the actual convolution processing.
> > +  {
> > +    JSObject* channelObj = mBuffer->GetChannelData(i);
> > +    JSObject* arrayBuffer = JS_GetArrayBufferViewBuffer(channelObj);
> > +    uint8_t* data =  JS_GetArrayBufferData(arrayBuffer);
> > +    float* channelData = reinterpret_cast<float*>(data);
> 
> static_cast

Used reinterpret_cast to force the type conversion; otherwise I get error C2440. 

> Are you interested in continuing to work on this beyond the DOM bindings?
Yes. :)

> If yes, the first step is to create an engine class and deliver arguments to
> it and make sure that you pass through the input buffer unmodified so that
> putting a ConvolverNode inside a graph doesn't mute playback.

For this, we're deep-copying the contents of the input buffer to AudioChunk via:
ProduceAudioBlock(AudioNodeStream* aStream,const AudioChunk& aInput,
                  AudioChunk* aOutput, bool* aFinished)

since all prior processing on the input buffer will take place when the |buffer| attribute is set. Correct?
Sorry for the late reply.  I somehow missed this bugmail.

(In reply to Andrew Quartey [:drexler] from comment #4)
> > > +  {
> > > +    JSObject* channelObj = mBuffer->GetChannelData(i);
> > > +    JSObject* arrayBuffer = JS_GetArrayBufferViewBuffer(channelObj);
> > > +    uint8_t* data =  JS_GetArrayBufferData(arrayBuffer);
> > > +    float* channelData = reinterpret_cast<float*>(data);
> > 
> > static_cast
> 
> Used reinterpret_cast to force the type conversion; otherwise I get error
> C2440. 

Ah, ok, yeah you're right. :-)

> > Are you interested in continuing to work on this beyond the DOM bindings?
> Yes. :)
> 
> > If yes, the first step is to create an engine class and deliver arguments to
> > it and make sure that you pass through the input buffer unmodified so that
> > putting a ConvolverNode inside a graph doesn't mute playback.
> 
> For this, we're deep-copying the contents of the input buffer to AudioChunk
> via:
> ProduceAudioBlock(AudioNodeStream* aStream,const AudioChunk& aInput,
>                   AudioChunk* aOutput, bool* aFinished)
> 
> since all prior processing on the input buffer will take place when the
> |buffer| attribute is set. Correct?

Hmm, not sure what you mean by deep copying.  You just need to do:

*aOutput = aInput;

which actually doesn't copy the buffer, since AudioChunk is smart enough to handle that efficiently, but you're right about the processing model of ProduceAudioBlock.  Here's the code which ensures that: <http://mxr.mozilla.org/mozilla-central/source/content/media/MediaStreamGraph.cpp#919>.  We basically divide time into WEBAUDIO_BLOCK_SIZE slices, and for each one we call into the engines for each node in order, one by one, passing the output of one engine (stored in nsAudioNodeStream::mLastChunk) to the next one (see ObtainInputBlock for how that works, if you're interested).

Please ping me if you need help making progress.  And thanks!
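The pass-through step described above can be sketched with stand-in types; the real Gecko AudioChunk and AudioNodeEngine classes have considerably more machinery (shared buffers, volume, channel data), but the essential move for a not-yet-implemented engine is just the assignment.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Stand-in types for illustration only.
const uint32_t WEBAUDIO_BLOCK_SIZE = 128; // Gecko's fixed processing block size

struct AudioChunk {
  std::array<float, WEBAUDIO_BLOCK_SIZE> mSamples{};
};

struct PassThroughEngine {
  // Forward the input unchanged so a node without real processing
  // doesn't mute the graph. In Gecko this assignment shares the
  // underlying buffer rather than copying it; the stand-in type
  // copies, but the observable result is the same.
  void ProduceAudioBlock(const AudioChunk& aInput, AudioChunk* aOutput)
  {
    *aOutput = aInput;
  }
};
```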
Andrew, do you think you can work on this soon? I'd like to drive this to the finish line soon...
Flags: needinfo?(andrew.quartey)
Oops, sorry - your reply got buried in a deluge of emails. I'll devote this weekend to this.
Flags: needinfo?(andrew.quartey)
(In reply to comment #7)
> Oops, sorry - your reply got buried in a deluge of emails. I'll devote this
> weekend to this.

Thanks a lot!  :-)

BTW, I can probably help in stealing the Blink code for the processing.
With your permission, I would like to steal this bug, as this is one of the most important node implementations that we're currently missing.
Assignee: andrew.quartey → ehsan
Blocks: 865241
Mass moving Web Audio bugs to the Web Audio component.  Filter on duckityduck.
Component: Video/Audio → Web Audio
Attachment #730905 - Attachment is obsolete: true
Attachment #760244 - Flags: review?(roc)
Sigh, forgot that MSVC doesn't have std::isinf and std::isnan...
Attachment #760248 - Attachment is obsolete: true
Attachment #760248 - Flags: review?(roc)
Attachment #760256 - Flags: review?(roc)
Comment on attachment 760245 [details] [diff] [review]
Part 2: Refactor our Web Audio FFT code into the FFTFrame class

Review of attachment 760245 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webaudio/FFTFrame.h
@@ +14,5 @@
> +namespace mozilla {
> +
> +// This class defines an FFT frame, loosely modeled after Blink's FFTFrame
> +// class to make sharing code with Blink easy.
> +// Currently it's implemented on top of KissFFT on all platforms.

Do we have to call this "Frame"? That is a vastly overloaded term already.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #21)
> Comment on attachment 760245 [details] [diff] [review]
> Part 2: Refactor our Web Audio FFT code into the FFTFrame class
> 
> Review of attachment 760245 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/media/webaudio/FFTFrame.h
> @@ +14,5 @@
> > +namespace mozilla {
> > +
> > +// This class defines an FFT frame, loosely modeled after Blink's FFTFrame
> > +// class to make sharing code with Blink easy.
> > +// Currently it's implemented on top of KissFFT on all platforms.
> 
> Do we have to call this "Frame"? That is a vastly overloaded term already.

I adopted this name because "FFT frame" is common terminology...  I believe this code is far enough from layout/.  :-)
We've got, at least
-- nsIFrame
-- nsHTMLIFrameElement etc
-- audio frames (e.g. AudioSegment::AppendFrames) (1 sample per channel)
-- Opus frames (5-20ms of data)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #24)
> We've got, at least
> -- nsIFrame
> -- nsHTMLIFrameElement etc
> -- audio frames (e.g. AudioSegment::AppendFrames) (1 sample per channel)
> -- Opus frames (5-20ms of data)

OK, I can rename it to FFTBlock before landing...
Comment on attachment 760245 [details] [diff] [review]
Part 2: Refactor our Web Audio FFT code into the FFTFrame class

Review of attachment 760245 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/AudioNodeEngine.h
@@ +110,5 @@
>                                      const float aScale[WEBAUDIO_BLOCK_SIZE],
>                                      float aOutput[WEBAUDIO_BLOCK_SIZE]);
>  
>  /**
> + * Vectory copy-scaled operation on arbitrary sized buffers.

this comment seems wrong

::: content/media/webaudio/AnalyserNode.cpp
@@ +221,4 @@
>  
>    for (uint32_t i = 0; i < mOutputBuffer.Length(); ++i) {
> +    double scalarMagnitude = sqrt(mAnalysisFrame.RealData(i) * mAnalysisFrame.RealData(i) +
> +                                  mAnalysisFrame.ImagData(i) * mAnalysisFrame.ImagData(i)) *

you should use NS_hypot here
Attachment #760245 - Flags: review?(roc) → review+
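The NS_hypot suggestion above is about numerical robustness, not style: the naive sqrt(re*re + im*im) overflows in the intermediate squares once the inputs exceed roughly sqrt(DBL_MAX), even though the magnitude itself would be representable, whereas hypot() is specified to avoid intermediate overflow and underflow. A quick illustration using the standard std::hypot (which NS_hypot wraps for portability):

```cpp
#include <cassert>
#include <cmath>

// Naive magnitude: overflows in the intermediate products for large
// inputs.
double MagnitudeNaive(double aRe, double aIm)
{
  return std::sqrt(aRe * aRe + aIm * aIm);
}

// hypot-based magnitude: stays finite whenever the result is
// representable.
double MagnitudeHypot(double aRe, double aIm)
{
  return std::hypot(aRe, aIm);
}
```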
Comment on attachment 760256 [details] [diff] [review]
Part 4: Add the Convolution processing implementation to the build system

Review of attachment 760256 [details] [diff] [review]:
-----------------------------------------------------------------

The checkin comment for this patch seems wrong.

::: content/media/AudioNodeEngine.cpp
@@ +116,5 @@
> +void
> +AudioBlockInPlaceScale(float* aBlock,
> +                       uint32_t aChannelCount,
> +                       float aScale,
> +                       uint32_t aSize)

Shouldn't this be called AudioBufferInPlaceScale?

::: content/media/AudioNodeEngine.h
@@ +166,5 @@
>  /**
> + * Return the sum of squares of all of the samples in the input.
> + */
> +float
> +AudioBlockSumOfSquares(const float* aInput, uint32_t aLength);

And shouldn't this be AudioBufferSumOfSquares?
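For context, the two helpers under discussion are small enough to sketch. The signatures match the quoted patch; the bodies below are plausible reference implementations, not necessarily the actual reviewed code.

```cpp
#include <cassert>
#include <cstdint>

// Scale every sample of an interleaved/contiguous buffer in place.
// aSize is the per-channel sample count (assumption for this sketch).
void AudioBufferInPlaceScale(float* aBuffer, uint32_t aChannelCount,
                             float aScale, uint32_t aSize)
{
  for (uint32_t i = 0; i < aSize * aChannelCount; ++i) {
    aBuffer[i] *= aScale;
  }
}

// Return the sum of squares of all of the samples in the input, as the
// quoted doc comment describes.
float AudioBufferSumOfSquares(const float* aInput, uint32_t aLength)
{
  float sum = 0.0f;
  for (uint32_t i = 0; i < aLength; ++i) {
    sum += aInput[i] * aInput[i];
  }
  return sum;
}
```

The sum-of-squares helper is exactly what the normalization-scale computation needs for its power estimate, which is why it lands in the same patch series.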
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #27)
> Comment on attachment 760256 [details] [diff] [review]
> Part 4: Add the Convolution processing implementation to the build system
> 
> Review of attachment 760256 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The checkin comment for this patch seems wrong.

I don't think so...  That's precisely what this commit is doing.  See <http://mxr.mozilla.org/mozilla-central/source/content/media/webaudio/blink/README>.
Attachment #760256 - Attachment is obsolete: true
Attachment #760256 - Flags: review?(roc)
Attachment #760537 - Flags: review?(roc)
Comment on attachment 760537 [details] [diff] [review]
Part 4: Add the Convolution processing implementation to the build system

Review of attachment 760537 [details] [diff] [review]:
-----------------------------------------------------------------

You're replacing a tryLock with an unconditional lock, and the comment there says it'll cause glitching. Have you looked into that at all? I think we currently don't support trylock, so I don't mind this landing, but maybe we should.
Attachment #760537 - Flags: review?(roc) → review+
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #30)
> Comment on attachment 760537 [details] [diff] [review]
> Part 4: Add the Convolution processing implementation to the build system
> 
> Review of attachment 760537 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> You're replacing a tryLock with an unconditional lock, and the comment there
> says it'll cause glitching. Have you looked into that at all? I think we
> currently don't support trylock, so I don't mind this landing, but maybe we
> should.

Good point.  I actually did look into this, and I think the possibility of this code trying to wait on the lock does exist in theory, but so far I've not seen any glitching in my testing.  That being said, perhaps we need to use <http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/lock.h> in order to reintroduce trylocking here (NSPR locking primitives do not support trylock.)

Filed bug 881558 about that.
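The tryLock pattern being weighed here can be sketched as follows, using std::mutex purely for illustration (the Gecko code of this era used NSPR locks, which lack trylock; the names below are made up). The point is that the real-time audio thread must never block: if the main thread holds the lock, e.g. while swapping in a new impulse response, the processing code outputs silence for that one block instead of waiting and glitching.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <vector>

// Process one audio block without ever blocking on the lock.
// Returns false if the block had to be silenced because the lock was
// contended.
bool ProcessBlock(std::mutex& aReverbLock, std::vector<float>& aBlock)
{
  std::unique_lock<std::mutex> lock(aReverbLock, std::try_to_lock);
  if (!lock.owns_lock()) {
    // Impulse response is being swapped on another thread; emit
    // silence for this block rather than risk missing the deadline.
    std::fill(aBlock.begin(), aBlock.end(), 0.0f);
    return false;
  }
  // ... run the convolution on aBlock here ...
  return true;
}
```

An unconditional lock, as in the patch that landed, trades a theoretical glitch under contention for simplicity; bug 881558 tracks restoring the trylock behavior.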
Blocks: 881587
Depends on: 885284
Depends on: 886660
Depends on: 889042
Depends on: 891254
Depends on: 942022
Depends on: 947015
Depends on: 1148230
Depends on: 1221879
Depends on: 1024182