Closed Bug 1341254 Opened 7 years ago Closed 7 years ago

Update libspeex_resampler to 79822c

Categories

(Core :: Web Audio, defect, P1)

x86_64
Windows 10
defect

Tracking

()

VERIFIED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: jujjyl, Assigned: padenot)

Details

Attachments

(7 files)

Attached image speex_hotspot.png
Profiling an upcoming WebAssembly + WebGL 2 demo in AMD CodeXL, which uses Web Audio to play back its sound, the highest amount of time samples inside xul.dll is taken in function moz_speex_resampler_set_rate_frac(). See the attached screenshot for CodeXL allocation of the samples within that function.

Would it be possible to optimize that code block to be fast?
Jukka, can you get me a way to repro this (maybe off-band) ?
Flags: needinfo?(jujjyl)
Thanks for the quick reply Paul,

sent a test case via email.
Flags: needinfo?(jujjyl)
Here is a list of hottest xul.dll functions from AMD CodeXL from the profiling run.
Assignee: nobody → padenot
Although moz_speex_resampler_set_rate_frac() is indeed inefficient, I believe it's likely that you're calling it unnecessarily. In most normal use cases, it should not show up in profiling. Can you see how often it's being called per second?
Yes, that's the first thing I'll be doing tomorrow. I don't expect to have to optimize `moz_speex_resampler_set_rate_frac`.
I can't repro this profile using Instruments on a macbook. I'll try again tomorrow. Anything in particular I should do in the demo ?
I added an option to disable all WebGL operations so that the demo stresses computational aspects more heavily. Try running with the "?playback&fakegl&novsync" GET parameter option to run a timedemo without rendering or display refresh rate pauses affecting if you have a fast CPU.

Here's what a run in that mode shows with AMD CodeXL on Windows for biggest hotspots in xul.dll.
Oh, looking at the code in detail, I see that it is a very inefficient algorithm for reducing fractions to minimal form. Perhaps it could be replaced with a google/stackoverflow variant, e.g.

http://andreinc.net/2010/12/11/euclids-algorithm-reducing-fraction-to-lowest-terms/
As the comment points out, I've always been aware of the inefficiency. My concern here is that just fixing that might cover up a more fundamental issue: why this function is called as often as is it now.
Rank: 19
Priority: -- → P1
I guess I should add just to be clear that we'll still be glad to merge a faster algo in upstream.
(In reply to Jean-Marc Valin (:jmspeex) from comment #9)
> As the comment points out, I've always been aware of the inefficiency. My
> concern here is that just fixing that might cover up a more fundamental
> issue: why this function is called as often as is it now.

We'll probably end up doing both: getting a faster algorithm upstream, and investigating why the authors of this demo change the playbackRate of their AudioBufferSourceNode so much (as this is what is happening).
The compiled code uses OpenAL for its audio, so all Web Audio manipulation in the demo should be coming from this file:

https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js

Looking through that file, there are two instances where .playbackRate is set:

https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js#L89
https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js#L583

Application code can animate pitch rate changes per frame, which is done e.g. to implement an accelerating/decelerating car engine sound, so it is expected that .playbackRate could be set each frame.

However this demo does not do that, and I think .playbackRate is static in all audio clips that are played back. It is possible that the engine calls alSourcef(source, AL_PITCH, x); unconditionally for each audio clip it sets to play back even if the audio pitch does not change. Should the best practice be to track these manually, i.e. instead of 

> src.playbackRate.value = newPlaybackRate;

to always test for change first like this?

> if (src.playbackRate.value != newPlaybackRate) src.playbackRate.value = newPlaybackRate;
Looking through UE4 engine source code, there's three audio effects that can animate pitch per frame:
  - audio sample specific pitch changes that game code can do, e.g. that car engine acceleration/deceleration effect
  - velocity based 3D positional doppler effect
  - a global time scale slow down/speedup effect, which is used for bullet time effects and for pausing/resuming game when entering and exiting menus

Tried the if () test above for the demo, but it is animating pitch, it turns out the scene has 3D positional audio elements in it which have doppler effects enabled, and the if () test does not optimize away the changes to setting .playbackRate per frame, it is still being animated per frame by these doppler enabled 3D positional audio effects.
(In reply to Jukka Jylänki from comment #12)
> The compiled code uses OpenAL for its audio, so all Web Audio manipulation
> in the demo should be coming from this file:
> 
> https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js
> 
> Looking through that file, there are two instances where .playbackRate is
> set:
> 
> https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js#L89
> https://github.com/kripken/emscripten/blob/incoming/src/library_openal.
> js#L583
> 
> Application code can animate pitch rate changes per frame, which is done
> e.g. to implement an accelerating/decelerating car engine sound, so it is
> expected that .playbackRate could be set each frame.
> 
> However this demo does not do that, and I think .playbackRate is static in
> all audio clips that are played back. It is possible that the engine calls
> alSourcef(source, AL_PITCH, x); unconditionally for each audio clip it sets
> to play back even if the audio pitch does not change. Should the best
> practice be to track these manually, i.e. instead of 
> 
> > src.playbackRate.value = newPlaybackRate;
> 
> to always test for change first like this?
> 
> > if (src.playbackRate.value != newPlaybackRate) src.playbackRate.value = newPlaybackRate;

That's what is done internally. We only call `moz_speex_resampler_set_rate_frac` if it's necessary:
http://searchfox.org/mozilla-central/source/dom/media/webaudio/AudioBufferSourceNode.cpp#187 (this is what calls moz_speex_resampler_set_rate_frac under the hood).
This simple fix makes the function disappear from the profile. I did a
micro-benchmark that agrees.

This patches our in-tree copy of the resampler, who should I contact and where
should I get this upstreamed? I see multiple copies of speexdsp (one at
xiph.org, one at github).
Attachment #8840848 - Flags: review?(jmvalin)
Comment on attachment 8840848 [details] [diff] [review]
Optimize reducing the resampling ratio to lowest terms in speex's resampler

Switching reviewer to the current upstream maintainer.
Attachment #8840848 - Flags: review?(jmvalin) → review?(le.businessman)
So, this is already upstream. I'm morphing this bug about updating our in-tree copy.
Summary: "FIXME: This is terribly inefficient, but who cares (at least for now)?" code block shows up as the highest overhead in xul.dll in WebAssembly+WebGL 2 demo → Update libspeex_resanmpler to 79822c
Comment on attachment 8840848 [details] [diff] [review]
Optimize reducing the resampling ratio to lowest terms in speex's resampler

This is already upstream, but this patch can be used for uplift, if needed. Jukka, is that necessary ?
Flags: needinfo?(jujjyl)
Attachment #8840848 - Flags: review?(le.businessman)
Karl, we need to update the library to pick up [0], that brings a very nice speedup in Jukka's scenario. I've updated to the last revision. Upstream equivalent code to 

I've updated the library using update.sh, rebasing the patches we apply on top. Next up, I'll see about upstreaming some of our patches, manually rebasing our patches was a bit tedious.

[0]: http://git.xiph.org/?p=speexdsp.git;a=commit;h=d08a14462744654f7097f17943d8e2ddc0dfe93e
Summary: Update libspeex_resanmpler to 79822c → Update libspeex_resampler to 79822c
Comment on attachment 8840952 [details]
Bug 1341254 - Update libspeex_resampler to 79822c.

https://reviewboard.mozilla.org/r/115342/#review117020

Thanks for doing this, Paul.  The fix for bug 1274083 is getting in the way, and a couple of nits.

::: media/libspeex_resampler/hugemem.patch:1
(Diff revision 1)
> +--- a/media/libspeex_resampler/src/resample.c
> ++++ b/media/libspeex_resampler/src/resample.c
> +@@ -56,16 +56,18 @@
> +    (e.g. 2/3), and get rid of the rounding operations in the inner loop.
> +    The latter both reduces CPU time and makes the algorithm more SIMD-friendly.
> + */
> + 
> + #ifdef HAVE_CONFIG_H
> + #include "config.h"
> + #endif
> + 
> ++#define RESAMPLE_HUGEMEM 1
> ++
> + #ifdef OUTSIDE_SPEEX

This is all repeated below.

::: media/libspeex_resampler/src/resample.c:1150
(Diff revision 1)
> -         {
> +            return RESAMPLER_ERR_OVERFLOW;
> -             st->samp_frac_num[i] = st->den_rate-1;

Lost the fix for bug 1274083 here, which didn't have an associated change to update.sh.

https://hg.mozilla.org/mozilla-central/rev/74d13ba37b59

::: media/libspeex_resampler/update.sh:1
(Diff revision 1)
>  # This Source Code Form is subject to the terms of the Mozilla Public

Can you update the revision in README_MOZILLA, please?

::: media/libspeex_resampler/update.sh
(Diff revision 1)
>  patch -p3 < outside-speex.patch
>  patch -p3 < simd-detect-runtime.patch
>  patch -p3 < set-skip-frac.patch
>  patch -p3 < hugemem.patch
>  patch -p3 < remove-empty-asm-clobber.patch
> -patch -p3 < handle-memory-error.patch

Good.  Covered by http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=8c2981fa4527330708fccda4de3490a60168d568

::: media/libspeex_resampler/update.sh
(Diff revision 1)
>  patch -p3 < simd-detect-runtime.patch
>  patch -p3 < set-skip-frac.patch
>  patch -p3 < hugemem.patch
>  patch -p3 < remove-empty-asm-clobber.patch
> -patch -p3 < handle-memory-error.patch
> -patch -p3 < fix-overflow.patch

Yes.  Better version upstream.
http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=d5fade684a4f0ea1419901173d8a254742a6f3a9
http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=335a9a16b8a90fee92002a7256b90bb6d8498622
Attachment #8840952 - Flags: review?(karlt) → review-
Comment on attachment 8840952 [details]
Bug 1341254 - Update libspeex_resampler to 79822c.

https://reviewboard.mozilla.org/r/115342/#review119252

::: media/libspeex_resampler/src/resample.c:1150
(Diff revision 1)
> -         {
> +            return RESAMPLER_ERR_OVERFLOW;
> -             st->samp_frac_num[i] = st->den_rate-1;

Good catch. I've made it into a patch in a separate commit (as it should have done), to be applied before the rebase, and the new version of the update patch updates the newly-created patch against the latest codebase (the function is called `_muldiv` instead of `_muldiv_safe`, in particular).
Actually, that change is *not* a good idea. It turns a minor error (which might cause just a tiny glitch) into an error that breaks resampling completely.
(In reply to Jean-Marc Valin (:jmspeex) from comment #25)
> Actually, that change is *not* a good idea. It turns a minor error (which
> might cause just a tiny glitch) into an error that breaks resampling
> completely.

Agreed, and I plan to do like you said in the other bug, clamp the resampling ratio ahead of time on every call site that uses `speex_resampler_set_rate`.

I filed bug 1344785 to remove our local patch and do a little audit, but I did a quick check and we're hitting this code path in only one use of the resampler (details in the bug).

This bug is just about updating our copy from upstream to fix something unrelated, so I'm going to land the patches as they are (granted Karl does not find something to fix).
(In reply to Jean-Marc Valin (:jmspeex) from comment #25)
> Actually, that change is *not* a good idea. It turns a minor error (which
> might cause just a tiny glitch) into an error that breaks resampling
> completely.

I assume you are referring to the last hunk in this patch upstream,
rather than the fix for that?
http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=335a9a16b8a90fee92002a7256b90bb6d8498622#patch2
Comment on attachment 8844021 [details]
Bug 1341254 - Put the modification made to the resampler code in bug 1274083 into their own patch, and add this patch to update.sh.

https://reviewboard.mozilla.org/r/117588/#review119424
Attachment #8844021 - Flags: review?(karlt) → review+
Comment on attachment 8840952 [details]
Bug 1341254 - Update libspeex_resampler to 79822c.

https://reviewboard.mozilla.org/r/115342/#review119428

::: media/libspeex_resampler/set-rate-overflow-no-return.patch:13
(Diff revision 2)
> -          if (!_muldiv_safe(st->samp_frac_num[i],st->den_rate,old_den))
> +-         if (_muldiv(&st->samp_frac_num[i],st->samp_frac_num[i],st->den_rate,old_den) != RESAMPLER_ERR_SUCCESS)
>  -            return RESAMPLER_ERR_OVERFLOW;
> -+         {
> ++         if (_muldiv(&st->samp_frac_num[i],st->samp_frac_num[i],st->den_rate,old_den) != RESAMPLER_ERR_SUCCESS) {

File style is to have the opening brace on its own line, but that can be sorted out when we decide how this should merge with the safety net below.
Attachment #8840952 - Flags: review?(karlt) → review+
Pushed by paul@paul.cx:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4b1b4e0d8579
Put the modification made to the resampler code in bug 1274083 into their own patch, and add this patch to update.sh. r=karlt
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8df499e005a
Update libspeex_resampler to 79822c. r=karlt
https://hg.mozilla.org/mozilla-central/rev/4b1b4e0d8579
https://hg.mozilla.org/mozilla-central/rev/c8df499e005a
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Attached image after_fix.png
Perfect, retesting after FF Nightly fixes, the performance hotspot is nowhere found anymore.

The demo has since been posted live at http://mzl.la/webassemblydemo

Thanks for the fix!
Flags: needinfo?(jujjyl)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: