Closed Bug 1140450 Opened 7 years ago Closed 6 years ago

Investigate using less expensive resampling

Categories

(Core :: Web Audio, defect, P2)

32 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: padenot, Assigned: u538282)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(6 files, 13 obsolete files)

7.64 KB, patch
padenot
: feedback+
Details | Diff | Splinter Review
3.33 KB, patch
padenot
: feedback+
Details | Diff | Splinter Review
15.31 KB, patch
padenot
: feedback+
Details | Diff | Splinter Review
214.59 KB, image/png
Details
218.32 KB, image/png
Details
59.23 KB, patch
Details | Diff | Splinter Review
We use a resampling technique that is very expensive, and it kills us in terms of CPU

There is a couple way forward, here:
- Lower the quality in the speex resampler
- Use a linear or other less expensive resampling technique
- For AudioBufferSourceNode that have loop = true, consider caching the resampling from the buffer's intrinsic rate to the AudioContext.sampleRate

Chrome does linear resampling, and people don't seem to complain.
Resampling is #1 when profiling code that has an heavy AudioBufferSourceNode usage (of course, only when playbackRate != 1.0 or detune != 0, or the sample rate of the AudioBuffer set as `.buffer` is not the sample rate of the AudioContext, which is a common case), and this is one of the most popular nodes in the Web Audio API.
Priority: -- → P1
Code is here: https://dxr.mozilla.org/mozilla-central/source/dom/media/webaudio/AudioBufferSourceNode.cpp#242, this is about implementing a linear resampler to replace speex's resampler.

We should make sure that simply lowering the resampler quality is not enough, by profiling, for example.
Attachment #8608559 - Flags: feedback?(padenot)
Comment on attachment 8608559 [details] [diff] [review]
lower_speex_quality.patch.v1

Review of attachment 8608559 [details] [diff] [review]:
-----------------------------------------------------------------

Overall good, but we should restrict the changes to Web Audio API I think. Now I'll have a listen.

::: dom/media/encoder/OpusTrackEncoder.cpp
@@ +172,5 @@
>      int error;
>      mResampler = speex_resampler_init(mChannels,
>                                        aSamplingRate,
>                                        kOpusSamplingRate,
> +                                      SPEEX_RESAMPLER_QUALITY_MIN,

We should not touch this I think, this is for encoding and is done on a separate thread.

::: dom/media/omx/OMXCodecWrapper.cpp
@@ +543,5 @@
>      int error;
>      mResampler = speex_resampler_init(aChannels,
>                                        aInputSampleRate,
>                                        aEncodedSampleRate,
> +                                      SPEEX_RESAMPLER_QUALITY_MIN,

Same, this is for regular media playback.

::: dom/media/webaudio/MediaBufferDecoder.cpp
@@ +342,5 @@
>  
>      resampler = speex_resampler_init(channelCount,
>                                       sampleRate,
>                                       destSampleRate,
> +                                     SPEEX_RESAMPLER_QUALITY_MIN, nullptr);

Maybe we can leave this one off as well, it's also on another thread.
Attachment #8608559 - Flags: feedback?(padenot) → feedback+
Attachment #8608574 - Flags: feedback?(padenot)
Attachment #8608574 - Flags: feedback?(padenot) → feedback+
Listening to some sound, resampled, I cannot really make a difference, and benchmarks numbers are better.

Let's land that for now, keeping this bug open, as students are working on writing a super cheap resampler.
Making sure this does not bust the tests, because maybe we're making some assumptions. The quality should be the same for decodeAudioData, but you never know.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=179de1cb7a1e
Putting leave-open, we're investigating writing an even cheaper resampler now.
Keywords: leave-open
Attached patch interface_ressampling.patch.v1 (obsolete) — Splinter Review
Attachment #8609386 - Flags: review?(padenot)
While working on Speex resampler, I saw this optimisation in the algorithm used to reduce the fraction to lowest terms. Euclid's algorithm is indeed much more efficient than the for-loop. I don't know if it will be useful since we will certainly change the resampler, but I post it however, just in case.
Attachment #8609741 - Flags: feedback?(padenot)
Comment on attachment 8609741 [details] [diff] [review]
improve_speex_fraction_reduction.patch

Review of attachment 8609741 [details] [diff] [review]:
-----------------------------------------------------------------

This won't really matter in terms of performance I think, but we might as well since we're here. Usually we apply patches on top of our copy of speex's resampler, but maybe we can upstream this one?

Jean-Marc, what do you think?

Thibaud, if this is not going to be upstreamed, we'll have to do the little patch + update.sh dance, as you see in media/libspeex_resampler.

::: media/libspeex_resampler/src/resample.c
@@ +1107,5 @@
> +   if(st->num_rate == st->den_rate) {
> +       st->num_rate = 1;
> +       st->den_rate = 1;
> +   } else {
> +       spx_uint32_t a = (st->num_rate > st->den_rate) ? st->num_rate : st->den_rate; // a = max(num_rate, den_rate)

You have readily available IMAX and IMIN macros at the top of this file.
Attachment #8609741 - Flags: feedback?(padenot) → feedback?(jmvalin)
Comment on attachment 8609386 [details] [diff] [review]
interface_ressampling.patch.v1

Review of attachment 8609386 [details] [diff] [review]:
-----------------------------------------------------------------

Looks cool, but the formatting is completely off. Maybe it'd be worthwhile to run ./mach clang-format on the patch?

In any case, the style guide is here: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style

Also, AudioResamplingInterface should be called AudioResampler or something.
Attachment #8609386 - Flags: review?(padenot)
This is a Git patch (sorry if it is a bit different from Mercurial patches, I'll fix that later) with a very first version of the linear resampler. Some functions (ProcessInt, ProcessInterleavedFloat, ProcessInterleavedInt) are not implemented yet since they are very similar to ProcessFloat. I would like to have your first impressions : what is good, what is bad, what should be improved, or am I completely missing the point?
Attachment #8610153 - Flags: feedback?(padenot)
Attached file interface_ressampling.patch.v2 (obsolete) —
Attachment #8609386 - Attachment is obsolete: true
Attachment #8610414 - Flags: feedback?(padenot)
Attachment #8610414 - Attachment is obsolete: true
Attachment #8610414 - Flags: feedback?(padenot)
Attachment #8610465 - Flags: feedback?(padenot)
Attachment #8610465 - Attachment is patch: true
Comment on attachment 8610153 [details] [diff] [review]
linear_resampler.feedback1.gitpatch

Review of attachment 8610153 [details] [diff] [review]:
-----------------------------------------------------------------

The code looks great, a number of style issues need to be fixed, though.

We don't use snake_case for file name, we use CamelCase.

::: linear_resampler.cpp
@@ +4,5 @@
> +
> +
> +LinearResampler::LinearResampler(unsigned int nb_channels,
> +                                 unsigned int in_rate,
> +                                 unsigned int out_rate) {

Style: parameters should start by a 'a', and identifiers should be in CamelCase. This should be fixed for every parameter of this file.

@@ +26,5 @@
> +
> +// The buffer is planar, for example with a 16-frames buffer and a stereo sound:
> +//    LLLLLLLLLLLLLLLLRRRRRRRRRRRRRRR
> +// A frame is a couple (L,R) whereas a sample is a single L or R.
> +// https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API/Basic_concepts_behind_Web_Audio_API#Planar_versus_interleaved_buffers

This is implied, there is no "interleaved" in the function name. Also, prefer putting comments in the header.

@@ +32,5 @@
> +                                  unsigned int* in_len, // the number of *frames* to read
> +                                  float* out,
> +                                  unsigned int* out_len) {
> +    // We use the general method of weighted arithmetic mean explained by the third formula in 
> +    // https://fr.wikipedia.org/wiki/Interpolation_lin%C3%A9aire#Principe

Prefer english wikipedia links :-)

@@ +39,5 @@
> +    //      = f(i)*(1-k) + f(i+1)*k
> +    // with k = j-i (interpolation factor)
> +    // hence
> +    // out[i] = in[i]*(1-k) + in[j+1]*k
> +    // all we have to do is compute the right values of i and j.

You're re-explaining the code with more or less the same form as the wikipedia article and the code below, that's not particularly useful.

@@ +41,5 @@
> +    // hence
> +    // out[i] = in[i]*(1-k) + in[j+1]*k
> +    // all we have to do is compute the right values of i and j.
> +
> +    unsigned int readIndex = 0; // the current position in the input buffer

We usually use stdint.h style integers (uint32_t instead of unsigned int, for example).

Also, the name is explicit enough I think, drop the comment.

@@ +44,5 @@
> +
> +    unsigned int readIndex = 0; // the current position in the input buffer
> +    unsigned int writeIndex = 0; // the current position in the output buffer
> +
> +    unsigned int intAdvance = m_numRate / m_denRate; // the (integer) number of input samples to jump for one step in the output buffer

Class attributes should start with 'm', and be in CamelCase.

We usually put comments on top of declaration, and not after, so it's easier to keep 80 characters columns.

@@ +52,5 @@
> +    unsigned int nbSamplesRead = 0; // the number of samples read from the intput buffer
> +    unsigned int nbSamplesWritten = 0; // the number of samples written in the output buffer
> +
> +    while(writeIndex < *out_len && readIndex < *in_len-1) { // for each of the output samples, as long as we have unread input samples
> +        // FIXME: do we have to assume that out[0] <= in[0] or is there a "stride"? If so, fracAdvanceAccumulator, readIndex and writeIndex have to be initialized to something else than zero...

Drop this comment, I think we talked about that on irc.

@@ +56,5 @@
> +        // FIXME: do we have to assume that out[0] <= in[0] or is there a "stride"? If so, fracAdvanceAccumulator, readIndex and writeIndex have to be initialized to something else than zero...
> +
> +        // if fracAdvanceAccumulator==0, it means we simply have to copy the input value to the output buffer (no resampling needed)
> +        if(fracAdvanceAccumulator == 0) {
> +            out[writeIndex] = in[readIndex];

Maybe it's worth it to not branch here? We'd have to measure, probably with callgrind.

@@ +57,5 @@
> +
> +        // if fracAdvanceAccumulator==0, it means we simply have to copy the input value to the output buffer (no resampling needed)
> +        if(fracAdvanceAccumulator == 0) {
> +            out[writeIndex] = in[readIndex];
> +        }

`else` on the same line.

@@ +60,5 @@
> +            out[writeIndex] = in[readIndex];
> +        }
> +        else { // else, we need to do a resampling
> +            int interpolationFactor = writeIndex - readIndex;
> +            out[writeIndex] = in[readIndex]*(1-interpolationFactor) + in[readIndex+1]*interpolationFactor; // linear resampling

Drop the comment at the end, it's the name of the file, after all :-)

@@ +82,5 @@
> +int LinearResampler::ProcessInt(const short* in,
> +                                unsigned int* in_len,
> +                                short* out,
> +                                unsigned int* out_len) {
> +    // FIXME: not necessary?

Not strictly necessary for a first cut, might be useful, and it's probably not a lot of work, might as well.

@@ +89,5 @@
> +// The buffer is interleaved, for example with a 16-frames buffer and a stereo sound:
> +//    LRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLR
> +// A frame is a couple (L,R) whereas a sample is a single L or R.
> +// The stride is the number of elements between two samples of the same channel (here, stride = nb_channels and this is often the case)
> +// https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API/Basic_concepts_behind_Web_Audio_API#Planar_versus_interleaved_buffers

Drop the comment.

@@ +120,5 @@
> +                                 unsigned int ratio_den,
> +                                 unsigned int in_rate,
> +                                 unsigned int out_rate) {
> +    if(ratio_num == m_numRate && ratio_den == m_denRate &&
> +       in_rate == m_inRate && out_rate == m_outRate)

Always brace control structures, even if they only contain one instruction.

@@ +135,5 @@
> +    if(m_numRate == m_denRate) {
> +        m_numRate = 1;
> +        m_denRate = 1;
> +    } else {
> +        // FIXME: try to use mfbt/MathAlgorithms.h instead

That would be great, yes.

@@ +177,5 @@
> +    *stride = m_outStride;
> +}
> +
> +void LinearResampler::GetInputLatency(unsigned int* input_latency) {
> +    *input_latency = 0; // no latency for a linear resampler

Comment on top, please.

@@ +181,5 @@
> +    *input_latency = 0; // no latency for a linear resampler
> +}
> +
> +void LinearResampler::GetOutputLatency(unsigned int* output_latency) {
> +    *output_latency = 0; // no latency for a linear resampler

Comment on top, please.

@@ +196,5 @@
> +int LinearResampler::ResetMem() {
> +    // nothing to do : no samples memorized
> +}
> +
> +const char* LinearResampler::StrError(int err) {

Do we need this?

::: linear_resampler.h
@@ +7,5 @@
> +    RESAMPLER_EXIT_BAD_STATE    = 2, // never used for LinearResampler
> +    RESAMPLER_EXIT_INVALID_ARG  = 3,
> +    RESAMPLER_EXIT_PTR_OVERLAP  = 4,
> +    RESAMPLER_EXIT_MAX_ERROR
> +};

Can you put this inside the class ?

@@ +9,5 @@
> +    RESAMPLER_EXIT_PTR_OVERLAP  = 4,
> +    RESAMPLER_EXIT_MAX_ERROR
> +};
> +
> +class LinearResampler {

This should inherit from the common interface, right?

@@ +78,5 @@
> +
> +        /**
> +         * Resamples an interleaved float array. The input and output
> +         * buffers
> +         * must *not* overlap.

Fix line wrapping.

@@ +233,5 @@
> +        unsigned int m_denRate;
> +        unsigned int m_nbChannels;
> +
> +        int m_inStride;
> +        int m_outStride;

stdint.h types, and 'm'+CamelCase.
Attachment #8610153 - Flags: feedback?(padenot) → feedback+
Comment on attachment 8610465 [details] [diff] [review]
interface_ressampling.patch.v3

Review of attachment 8610465 [details] [diff] [review]:
-----------------------------------------------------------------

I'm thinking that maybe we could use a template policy to be able to switch between resampler quality at compile time? I don't know if you're comfortable with templates. We need to decide on a dispatch mechanism to switch between resampler anyway.

::: dom/media/webaudio/AudioBufferSourceNode.cpp
@@ +78,5 @@
>    {}
>  
>    ~AudioBufferSourceNodeEngine()
>    {
> +    mResamplerInterface.ResamplerDestroy();

Since we are wrapping everything in a class, we can simply put that in the destructor and use a smart pointer to manage the lifetime for us.

::: dom/media/webaudio/AudioResampler.cpp
@@ +30,1 @@
>      if (mResampler)

I think we should simply MOZ_ASSERT(mResampler), here and everywhere else. If we don't have a resampler, but trying to call inside, it's a rather serious issue, and needs to be fixed.

::: dom/media/webaudio/AudioResampler.h
@@ +13,5 @@
>  class AudioResampler
>  {
>  
>  public:
>    AudioResampler() : mResampler(nullptr) {}

Maybe we could use a factory-style static method that would return the right resampler based on the quality? (speex or the new linear resampler).  Or use templates and do it at compile time?

I seems wasteful to branch all the time: once we have a resampler, we'll stick with the same method of resampling, right?

@@ +15,5 @@
>  
>  public:
>    AudioResampler() : mResampler(nullptr) {}
>  
> +  void ResamplerInit(uint32_t aNbChannels, uint32_t aInRate, uint32_t aOutRate);

I think we should not prefix all function by "Resampler", it add unnecessary clutter.
Attachment #8610465 - Flags: feedback?(padenot) → feedback+
This patch adds the resampler interface and the LinearResampler and enables it by default. Previous remarks have been take in account, except for the use of templates in the interface (still under developpement).
Attachment #8609741 - Attachment is obsolete: true
Attachment #8610153 - Attachment is obsolete: true
Attachment #8609741 - Flags: feedback?(jmvalin)
Attachment #8611221 - Flags: review?(padenot)
Comment on attachment 8611221 [details] [diff] [review]
LinearResampler+Interface.rev1.patch

Review of attachment 8611221 [details] [diff] [review]:
-----------------------------------------------------------------

Better! Waiting for the version that does the dispatch at compile time, now.

::: dom/media/webaudio/AudioBufferSourceNode.cpp
@@ +530,5 @@
>    StreamTime mBeginProcessing;
>    StreamTime mStop;
>    nsRefPtr<ThreadSharedFloatArrayBufferList> mBuffer;
> +
> +  AudioResampler mResamplerInterface;

Just call that mResampler.

::: dom/media/webaudio/AudioResampler.cpp
@@ +4,5 @@
> +
> +#include "AudioResampler.h"
> +
> +
> +

Only one blank line here.

::: dom/media/webaudio/LinearResampler.cpp
@@ +33,5 @@
> +int
> +LinearResampler::ProcessFloat(
> +  const float* aIn,     // a pointer to the right buffer/channel to compute
> +  uint32_t* aInLen, // the number of *frames* to read
> +  float* aOut, uint32_t* aOutLen)

Indentation is off.

@@ +277,5 @@
> +  if (mNumRate == mDenRate) {
> +    mNumRate = 1;
> +    mDenRate = 1;
> +  } else {
> +    // FIXME: try to use mfbt/MathAlgorithms.h instead

!

::: dom/media/webaudio/LinearResampler.h
@@ +50,5 @@
> +   * @param out_len Number of frames in the output buffer.
> +   *                Returns the number of samples written
> +   * @return an error code
> +   */
> +  int32_t ProcessFloat(const float* aIn, uint32_t* aInLen, float* aOut,

Indentation is off, here and below for other methods.

@@ +133,5 @@
> +   * @param ratio_num Numerator of the sampling rate ratio, copied
> +   * @param ratio_den Denominator of the sampling rate ratio, copied
> +   */
> +  void GetRatio(uint32_t* aRationNum, uint32_t* aRatioDen);
> +  

trailing spaces.

@@ +211,5 @@
> +  uint32_t mDenRate;
> +  uint32_t mNbChannels;
> +  uint32_t mInStride;
> +  uint32_t mOutStride;
> +  

here as well.
Attachment #8611221 - Flags: review?(padenot)
Please guys, don't do linear resampling! Sure there may be cases where you don't hear the difference so much but believe me there's other cases where it's absolutely atrocious. Even at its lowest complexity (which I'm still not sure is a good idea), the Speex resampler will be much much better than linear resampling.
(In reply to Jean-Marc Valin (:jmspeex) from comment #22)
> Please guys, don't do linear resampling! Sure there may be cases where you
> don't hear the difference so much but believe me there's other cases where
> it's absolutely atrocious. Even at its lowest complexity (which I'm still
> not sure is a good idea), the Speex resampler will be much much better than
> linear resampling.

Listening to some sound on YouTube and several other websites with the new resampler, I can't hear major differences, only a bit noise sometimes. Do you have any example where the linear resampling would fail to give a good sound? This would be quite interesting.
Moreover, we worked on keeping a switchable interface between Firefox on the one side, and Speex or the linear resampler on the other side, maybe for a future integration into Firefox's Preferences... The two resampler can easily be interchanged.
Thanks in advance!
(In reply to Jean-Marc Valin (:jmspeex) from comment #22)
> Please guys, don't do linear resampling! Sure there may be cases where you
> don't hear the difference so much but believe me there's other cases where
> it's absolutely atrocious. Even at its lowest complexity (which I'm still
> not sure is a good idea), the Speex resampler will be much much better than
> linear resampling.

Note that this is not for general music playback (it's for resampling in-memory short audio files that are going through a processing graph before output, for Web Audio API), and that I'd rather have something fast than something that sounds good.

The characteristics we need for this use-case are:
- speed
- no latency
- ability to adjust the resampling ratio very quickly (per 128 samples block if not at each sample)

Chrome has been shipping for four years with a linear resampler for this use case, and that enables them to play much more samples at the same time, the resampling cost easily having the first place in our profiles, by an order of magnitude.

That said, I haven't listened yet to this new code. I can hear artifacts with Chrome's linear resampler, when comparing to Firefox for sure, but fidelity is not the most important here.
OK, so here's an example of what linear resampling can do. I took a short clip at 8 kHz and upsampled it to 48 kHz using 1) a proper resampler (sox was easy, but the speex resampler would sound the same) 2) a linear resampler. Here's the output
http://jmvalin.ca/misc_stuff/good_resampling.wav
http://jmvalin.ca/misc_stuff/linear_resampling.wav
(make sure to use headphones in case your speakers are bad)

If you can give me an idea of the use case you're trying to optimize for, I may be able to make suggestions on the best thing to use. For example, what sampling rates we talking about (input and output), how frequent are rate changes, what architecture, what kind of optimizations are currently enabled for the Speex resampler, ...
(In reply to Jean-Marc Valin (:jmspeex) from comment #25)
> OK, so here's an example of what linear resampling can do. I took a short
> clip at 8 kHz and upsampled it to 48 kHz using 1) a proper resampler (sox
> was easy, but the speex resampler would sound the same) 2) a linear
> resampler. Here's the output
> http://jmvalin.ca/misc_stuff/good_resampling.wav
> http://jmvalin.ca/misc_stuff/linear_resampling.wav
> (make sure to use headphones in case your speakers are bad)
> 
> If you can give me an idea of the use case you're trying to optimize for, I
> may be able to make suggestions on the best thing to use. For example, what
> sampling rates we talking about (input and output), how frequent are rate
> changes, what architecture, what kind of optimizations are currently enabled
> for the Speex resampler, ...

- input sample rate: 8000Hz to 192kHz, more or less
- output sample rate : arbitrary
- Jumps in resampling ratio are expected to occur rather often (as low as every 128 frames), and can be continuous (like when a low frequency oscillator controls the resampling ratio)
- ARM and x86, has to run on other architecture, but we care less
- As far as I know, we have NEON and SSE code for the speex resampler, and we use the HUGE_MEMORY define
- We used to use it in DESKTOP quality, recently switched to MIN, but for some reason our benchmarking system stopped running, we don't have really precise numbers but we know it's better
About the complexity of the Speex resampler, one important thing to know is that it is roughly proportional (aside from a fixed cost) to the filter length. These are the lengths at quality ranging from 0 to 10:
{8, 16, 32, 48, 64, 80, 96, 128, 160, 192, 256}
The desktop quality you've been using is 5, so it uses a filter length of 80. Just going to quality 2 would divide the complexity by more than half. And quality 0 would be close to a factor of 10 (ignoring the small fixed cost). It's not great quality, but certainly much much better than linear resampling.

About changing the sampling frequency, the Speex resampler supports doing that at any time, with any granularity. The only thing to keep in mind is that there is a fixed cost every time the ratio is changed (recomputing the filters). That cost also depends on the filter length, but at some point (didn't calculate where that point is), the cost of the changes can become greater than the resampling itself. If you're in that situation, than Erik de Castro Lopo's "Secret Rabbit Code" (SRC) resampler becomes more computationally efficient than the Speex resampler because it has free rate changes, at the cost of higher complexity for normal operation.
Ths patch adds the template for choosing the resampler at build time (in AudioBuffer.h).
Attachment #8611221 - Attachment is obsolete: true
Attachment #8612124 - Flags: review?(padenot)
Blocks: webaudioperf
Assignee: nobody → thibaud.backenstrass
Comment on attachment 8612124 [details] [diff] [review]
LinearResampler+Interface.rev2.patch

Review of attachment 8612124 [details] [diff] [review]:
-----------------------------------------------------------------

Looks cool. Now we need to really do some performance testing and profiling.

::: dom/media/webaudio/AudioBufferSourceNode.cpp
@@ +177,4 @@
>      } else {
>        uint32_t currentOutSampleRate, currentInSampleRate;
> +      mResampler.GetRate(&currentInSampleRate,
> +                                           &currentOutSampleRate);

nit: indentation is off.
Attachment #8612124 - Flags: review?(padenot) → review+
Fixed indentation.
Attachment #8612124 - Attachment is obsolete: true
Attachment #8613379 - Flags: review?(padenot)
Fixed a memory leak issue.
Attachment #8613379 - Attachment is obsolete: true
Attachment #8613379 - Flags: review?(padenot)
Attachment #8614072 - Flags: review?(padenot)
Perf's Report for the Speex Resampler with the common interface. We can see that Speex's functions represent at least 16-17% of the execution time.
Attachment #8614145 - Flags: feedback?(padenot)
Perf's Report for the Linear Resampler with the common interface. We can see that Speex's functions represent around 9-10% of the execution time.
Both profiling were made in the exact same conditions (opening Nightly with Padenot's testbench set as default start page, without clicking "Play" on any testcase).
Attachment #8614148 - Flags: feedback?(padenot)
What quality is the Speex resampling set to when it takes 16-17% CPU?
(In reply to Jean-Marc Valin (:jmspeex) from comment #34)
> What quality is the Speex resampling set to when it takes 16-17% CPU?

Good question, I forget to indicate this! The Speex resampler is at its lowest quality (0) on these profiling results.
OK, I don't know what you guys tested exactly, but these numbers are insane. I just quickly ran two tests using the testresample source in libspeexdsp (where the Speex resampler lives):

1) Mono resampling from 96 kHz to 44.1 kHz, quality 0. The amount of CPU required was 0.2%.
2) Mono resampling from 8 kHZ to 48 kHz, quality 0. The amount of CPU required was 0.04%.

Even running test #1 and changing the rate every 2 ms, I still only get 2.7% CPU usage. So I don't know where the problem comes from, but I don't think it's about the resampler being too slow.
(In reply to Jean-Marc Valin (:jmspeex) from comment #36)
> OK, I don't know what you guys tested exactly, but these numbers are insane.
> I just quickly ran two tests using the testresample source in libspeexdsp
> (where the Speex resampler lives):
> 
> 1) Mono resampling from 96 kHz to 44.1 kHz, quality 0. The amount of CPU
> required was 0.2%.
> 2) Mono resampling from 8 kHZ to 48 kHz, quality 0. The amount of CPU
> required was 0.04%.
> 
> Even running test #1 and changing the rate every 2 ms, I still only get 2.7%
> CPU usage. So I don't know where the problem comes from, but I don't think
> it's about the resampler being too slow.

For the profiling, I used Padenot's benchmark (https://github.com/padenot/webaudio-benchmark). I'm working on a quite old computer (4 Go RAM, Intel Centrino 2 Dual-Core 2GHz), if it can help to understand the numbers.
Code clean-up.
Attachment #8614072 - Attachment is obsolete: true
Attachment #8614072 - Flags: review?(padenot)
Attachment #8614652 - Flags: review?(padenot)
A 2 GHz Centrino is still pretty fast for this. It should not make the CPU to from 0.04% to 16%. And if it's too slow to resample even at complexity 0, then it's also too slow for doing anything, including decoding an MP3/Opus file, which is orders of magnitude more complex than resampling. Of course, this is assuming "reasonable" use of the resampler. If you reinitialize it or change the rate every milliseconds, then this performance is not too off. In that case, I see two options:
1) it's an unrealistic benchmark, you update it to not switch that often and find that the complexity of the Speex resampled goes down by a factor of 100.
2) you decide switching every millisecond is really important, in which case you want to use Erik's SRC: http://www.mega-nerd.com/SRC/ . At SRC_SINC_FASTEST, it should be the same kind of speed as linear resampling for the switch every millisecond case, but higher quality. The license is GPL + purchasable exemption to the GPL, so you might just want to check which is best.

In any case, it's still not clear to me what the exact problem is here, but I don't see "use a linear resampler" as the solution to anything.
This is certainly not 17% CPU, since right below in the profile, there is our "mix with gain" function, which is a loop with an addition and a multiplication, and that can't be 3% CPU. It might be 17% of the total time of the benchmark or something, which would make sense since this bench is made to test the resampling speed.

Also, those particular test cases are not changing the sample rate ratio (it is set to a value, and then all the processing happens at this ratio).

I'm going to do some profiling in the next few days, but it seems that we're good, now, with a lower quality.
Use Speex Resampler as default resampler.
Attachment #8614652 - Attachment is obsolete: true
Attachment #8614652 - Flags: review?(padenot)
Attachment #8616580 - Flags: review?(padenot)
Attachment #8614145 - Flags: feedback?(padenot)
Attachment #8614148 - Flags: feedback?(padenot)
Comment on attachment 8616580 [details] [diff] [review]
LinearResampler+Interface.rev6.patch

The plan is now to have speex by default, and land this. Runtime switching will be possible using a pref. This makes sense because there has been some talk on having a configurable resampling quality at the AudioBufferSourceNode level.
Attachment #8616580 - Flags: review?(padenot)
This patch completely changes the approach of an interface with templates.  The class AudioResampler has been rewritten to implement polymorphism.
I added a key in about:config (media.webaudio.resampler) to allow the user to change the resampler used. Possible values are "speex" (default value, using Speex at its lowest quality) or "linear" (using the new LinearResampler). Any other string is accepted, but will lead to the use of SpeexResampler.
Attachment #8616580 - Attachment is obsolete: true
Attachment #8616717 - Flags: review?(padenot)
Comment on attachment 8616717 [details] [diff] [review]
LinearResampler+Interface.rev7.patch

Review of attachment 8616717 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, still a couple things to fix to get to the finish line !

::: dom/media/webaudio/AudioBufferSourceNode.cpp
@@ +4,5 @@
>   * License, v. 2.0. If a copy of the MPL was not distributed with this
>   * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
>  #include "AudioBufferSourceNode.h"
> +#include "mozilla/Preferences.h" // to allow changing the resampler

Drop the comment, that's not very useful.

@@ +18,3 @@
>  #include <limits>
>  #include <algorithm>
> +#include <iostream>

Drop this, please.

@@ +77,5 @@
>      mPlaybackRateTimeline(1.0f),
>      mDetuneTimeline(0.0f),
>      mLoop(false)
> +  {
> +    if(Preferences::GetString("media.webaudio.resampler").LowerCaseEqualsLiteral("linear")) {

Can you use a pref cache (look into the pref header file) ? This can be pretty high traffic sometimes.

@@ +79,5 @@
>      mLoop(false)
> +  {
> +    if(Preferences::GetString("media.webaudio.resampler").LowerCaseEqualsLiteral("linear")) {
> +      mResampler.reset(new LinearResampler());
> +      std::cerr << "\n\nNEW LinearResampler\n\n" << std::endl;

Please remove those debugging dumps.

@@ +257,5 @@
>                                           int32_t aBufferMax) {
>      // TODO: adjust for mStop (see bug 913854 comment 9).
>      uint32_t availableInOutputBuffer =
>        WEBAUDIO_BLOCK_SIZE - *aOffsetWithinBlock;
> +    

nit: trailing spaces.

@@ +311,5 @@
>              // We'll feed in enough zeros to empty out the resampler's memory.
>              // This handles the output latency as well as capturing the low
>              // pass effects of the resample filter.
>              mRemainingResamplerTail =
> +              2 * mResampler->GetInputLatency() - 1;

Are you not setting this to -1 when the input latency is 0 because we're using a linear resampler ?

::: dom/media/webaudio/LinearResampler.h
@@ +140,5 @@
> +  
> +  void SetQuality(int32_t aQuality) override;
> +  
> +  int32_t GetQuality() override;
> +  

nit: some trailing spaces here.

::: modules/libpref/init/all.js
@@ +272,5 @@
>  
>  pref("media.decoder.heuristic.dormant.enabled", true);
>  pref("media.decoder.heuristic.dormant.timeout", 60000);
>  
> +// The quality of the resampler used in Web Audio, see bug 1140450

Strictly speaking, this is the "type" of resampler using for Web Audio.
Attachment #8616717 - Flags: review?(padenot)
Priority: P1 → P2
Read the prefs in AudioContext instead of in AudioBufferSourceNode.
Attachment #8616717 - Attachment is obsolete: true
Attachment #8617377 - Flags: review?(padenot)
Attachment #8617377 - Attachment is obsolete: true
Attachment #8617377 - Flags: review?(padenot)
Attachment #8617847 - Flags: review?(padenot)
Looks good, I went and did a try push, this looks like it breaks the build on android:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b3dd908ef70b
Flags: needinfo?(thibaud.backenstrass)
I fixed the build errors on Android and this should be ok.
However, there is duplicated code in WebAudioUtils.cpp and in AudioResampler.cpp. This is not a problem for now, but maybe the best way to resolve that issue would be to replace all direct calls to the speex resampler with calls to the SpeexResampler object?
Attachment #8617847 - Attachment is obsolete: true
Attachment #8617847 - Flags: review?(padenot)
Flags: needinfo?(thibaud.backenstrass)
Attachment #8621093 - Flags: review?(padenot)
After spending some time thinking on what we should do here, I think the initial patch was good enough. We're now tackling other performance issues, in other parts of the MSG.

Thanks all for working on this!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Attachment #8621093 - Flags: review?(padenot)
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.