Update libspeex_resampler to 79822c

VERIFIED FIXED in Firefox 55

Status

()

P1
normal
Rank:
19
VERIFIED FIXED
2 years ago
2 years ago

People

(Reporter: jujjyl, Assigned: padenot)

Tracking

unspecified
mozilla55
x86_64
Windows 10
Points:
---

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(7 attachments)

(Reporter)

Description

2 years ago
Created attachment 8839432 [details]
speex_hotspot.png

Profiling an upcoming WebAssembly + WebGL 2 demo in AMD CodeXL, which uses Web Audio to play back its sound, the highest amount of time samples inside xul.dll is taken in function moz_speex_resampler_set_rate_frac(). See the attached screenshot for CodeXL allocation of the samples within that function.

Would it be possible to optimize that code block to be fast?
(Assignee)

Comment 1

2 years ago
Jukka, can you get me a way to repro this (maybe off-band) ?
Flags: needinfo?(jujjyl)
(Reporter)

Comment 2

2 years ago
Thanks for the quick reply Paul,

sent a test case via email.
Flags: needinfo?(jujjyl)
(Reporter)

Comment 3

2 years ago
Created attachment 8839445 [details]
codexl_hot_functions.png

Here is a list of hottest xul.dll functions from AMD CodeXL from the profiling run.
(Assignee)

Updated

2 years ago
Assignee: nobody → padenot
Although moz_speex_resampler_set_rate_frac() is indeed inefficient, I believe it's likely that you're calling it unnecessarily. In most normal use cases, it should not show up in profiling. Can you see how often it's being called per second?
(Assignee)

Comment 5

2 years ago
Yes, that's the first thing I'll be doing tomorrow. I don't expect to have to optimize `moz_speex_resampler_set_rate_frac`.
(Assignee)

Comment 6

2 years ago
I can't repro this profile using Instruments on a macbook. I'll try again tomorrow. Anything in particular I should do in the demo ?
(Reporter)

Comment 7

2 years ago
Created attachment 8840235 [details]
xul_hotspots_when_webgl_is_disabled.png

I added an option to disable all WebGL operations so that the demo stresses computational aspects more heavily. Try running with the "?playback&fakegl&novsync" GET parameter option to run a timedemo without rendering or display refresh rate pauses affecting if you have a fast CPU.

Here's what a run in that mode shows with AMD CodeXL on Windows for biggest hotspots in xul.dll.
(Reporter)

Comment 8

2 years ago
Oh, looking at the code in detail, I see that it is a very inefficient algorithm for reducing fractions to minimal form. Perhaps it could be replaced with a google/stackoverflow variant, e.g.

http://andreinc.net/2010/12/11/euclids-algorithm-reducing-fraction-to-lowest-terms/
As the comment points out, I've always been aware of the inefficiency. My concern here is that just fixing that might cover up a more fundamental issue: why this function is called as often as is it now.
Rank: 19
Priority: -- → P1
I guess I should add just to be clear that we'll still be glad to merge a faster algo in upstream.
(Assignee)

Comment 11

2 years ago
(In reply to Jean-Marc Valin (:jmspeex) from comment #9)
> As the comment points out, I've always been aware of the inefficiency. My
> concern here is that just fixing that might cover up a more fundamental
> issue: why this function is called as often as is it now.

We'll probably end up doing both: getting a faster algorithm upstream, and investigating why the authors of this demo change the playbackRate of their AudioBufferSourceNode so much (as this is what is happening).
(Reporter)

Comment 12

2 years ago
The compiled code uses OpenAL for its audio, so all Web Audio manipulation in the demo should be coming from this file:

https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js

Looking through that file, there are two instances where .playbackRate is set:

https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js#L89
https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js#L583

Application code can animate pitch rate changes per frame, which is done e.g. to implement an accelerating/decelerating car engine sound, so it is expected that .playbackRate could be set each frame.

However this demo does not do that, and I think .playbackRate is static in all audio clips that are played back. It is possible that the engine calls alSourcef(source, AL_PITCH, x); unconditionally for each audio clip it sets to play back even if the audio pitch does not change. Should the best practice be to track these manually, i.e. instead of 

> src.playbackRate.value = newPlaybackRate;

to always test for change first like this?

> if (src.playbackRate.value != newPlaybackRate) src.playbackRate.value = newPlaybackRate;
(Reporter)

Comment 13

2 years ago
Looking through UE4 engine source code, there's three audio effects that can animate pitch per frame:
  - audio sample specific pitch changes that game code can do, e.g. that car engine acceleration/deceleration effect
  - velocity based 3D positional doppler effect
  - a global time scale slow down/speedup effect, which is used for bullet time effects and for pausing/resuming game when entering and exiting menus

Tried the if () test above for the demo, but it is animating pitch, it turns out the scene has 3D positional audio elements in it which have doppler effects enabled, and the if () test does not optimize away the changes to setting .playbackRate per frame, it is still being animated per frame by these doppler enabled 3D positional audio effects.
(Assignee)

Comment 14

2 years ago
(In reply to Jukka Jylänki from comment #12)
> The compiled code uses OpenAL for its audio, so all Web Audio manipulation
> in the demo should be coming from this file:
> 
> https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js
> 
> Looking through that file, there are two instances where .playbackRate is
> set:
> 
> https://github.com/kripken/emscripten/blob/incoming/src/library_openal.js#L89
> https://github.com/kripken/emscripten/blob/incoming/src/library_openal.
> js#L583
> 
> Application code can animate pitch rate changes per frame, which is done
> e.g. to implement an accelerating/decelerating car engine sound, so it is
> expected that .playbackRate could be set each frame.
> 
> However this demo does not do that, and I think .playbackRate is static in
> all audio clips that are played back. It is possible that the engine calls
> alSourcef(source, AL_PITCH, x); unconditionally for each audio clip it sets
> to play back even if the audio pitch does not change. Should the best
> practice be to track these manually, i.e. instead of 
> 
> > src.playbackRate.value = newPlaybackRate;
> 
> to always test for change first like this?
> 
> > if (src.playbackRate.value != newPlaybackRate) src.playbackRate.value = newPlaybackRate;

That's what is done internally. We only call `moz_speex_resampler_set_rate_frac` if it's necessary:
http://searchfox.org/mozilla-central/source/dom/media/webaudio/AudioBufferSourceNode.cpp#187 (this is what calls moz_speex_resampler_set_rate_frac under the hood).
(Assignee)

Comment 15

2 years ago
Created attachment 8840848 [details] [diff] [review]
Optimize reducing the resampling ratio to lowest terms in speex's resampler

This simple fix makes the function disappear from the profile. I did a
micro-benchmark that agrees.

This patches our in-tree copy of the resampler, who should I contact and where
should I get this upstreamed? I see multiple copies of speexdsp (one at
xiph.org, one at github).
Attachment #8840848 - Flags: review?(jmvalin)
(Assignee)

Comment 16

2 years ago
Comment on attachment 8840848 [details] [diff] [review]
Optimize reducing the resampling ratio to lowest terms in speex's resampler

Switching reviewer to the current upstream maintainer.
Attachment #8840848 - Flags: review?(jmvalin) → review?(le.businessman)
(Assignee)

Comment 17

2 years ago
So, this is already upstream. I'm morphing this bug about updating our in-tree copy.
(Assignee)

Updated

2 years ago
Summary: "FIXME: This is terribly inefficient, but who cares (at least for now)?" code block shows up as the highest overhead in xul.dll in WebAssembly+WebGL 2 demo → Update libspeex_resanmpler to 79822c
(Assignee)

Comment 18

2 years ago
Comment on attachment 8840848 [details] [diff] [review]
Optimize reducing the resampling ratio to lowest terms in speex's resampler

This is already upstream, but this patch can be used for uplift, if needed. Jukka, is that necessary ?
Flags: needinfo?(jujjyl)
Attachment #8840848 - Flags: review?(le.businessman)
Comment hidden (mozreview-request)
(Assignee)

Comment 20

2 years ago
Karl, we need to update the library to pick up [0], that brings a very nice speedup in Jukka's scenario. I've updated to the last revision. Upstream equivalent code to 

I've updated the library using update.sh, rebasing the patches we apply on top. Next up, I'll see about upstreaming some of our patches, manually rebasing our patches was a bit tedious.

[0]: http://git.xiph.org/?p=speexdsp.git;a=commit;h=d08a14462744654f7097f17943d8e2ddc0dfe93e
(Assignee)

Updated

2 years ago
Summary: Update libspeex_resanmpler to 79822c → Update libspeex_resampler to 79822c

Comment 21

2 years ago
mozreview-review
Comment on attachment 8840952 [details]
Bug 1341254 - Update libspeex_resampler to 79822c.

https://reviewboard.mozilla.org/r/115342/#review117020

Thanks for doing this, Paul.  The fix for bug 1274083 is getting in the way, and a couple of nits.

::: media/libspeex_resampler/hugemem.patch:1
(Diff revision 1)
> +--- a/media/libspeex_resampler/src/resample.c
> ++++ b/media/libspeex_resampler/src/resample.c
> +@@ -56,16 +56,18 @@
> +    (e.g. 2/3), and get rid of the rounding operations in the inner loop.
> +    The latter both reduces CPU time and makes the algorithm more SIMD-friendly.
> + */
> + 
> + #ifdef HAVE_CONFIG_H
> + #include "config.h"
> + #endif
> + 
> ++#define RESAMPLE_HUGEMEM 1
> ++
> + #ifdef OUTSIDE_SPEEX

This is all repeated below.

::: media/libspeex_resampler/src/resample.c:1150
(Diff revision 1)
> -         {
> +            return RESAMPLER_ERR_OVERFLOW;
> -             st->samp_frac_num[i] = st->den_rate-1;

Lost the fix for bug 1274083 here, which didn't have an associated change to update.sh.

https://hg.mozilla.org/mozilla-central/rev/74d13ba37b59

::: media/libspeex_resampler/update.sh:1
(Diff revision 1)
>  # This Source Code Form is subject to the terms of the Mozilla Public

Can you update the revision in README_MOZILLA, please?

::: media/libspeex_resampler/update.sh
(Diff revision 1)
>  patch -p3 < outside-speex.patch
>  patch -p3 < simd-detect-runtime.patch
>  patch -p3 < set-skip-frac.patch
>  patch -p3 < hugemem.patch
>  patch -p3 < remove-empty-asm-clobber.patch
> -patch -p3 < handle-memory-error.patch

Good.  Covered by http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=8c2981fa4527330708fccda4de3490a60168d568

::: media/libspeex_resampler/update.sh
(Diff revision 1)
>  patch -p3 < simd-detect-runtime.patch
>  patch -p3 < set-skip-frac.patch
>  patch -p3 < hugemem.patch
>  patch -p3 < remove-empty-asm-clobber.patch
> -patch -p3 < handle-memory-error.patch
> -patch -p3 < fix-overflow.patch

Yes.  Better version upstream.
http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=d5fade684a4f0ea1419901173d8a254742a6f3a9
http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=335a9a16b8a90fee92002a7256b90bb6d8498622
Attachment #8840952 - Flags: review?(karlt) → review-
(Assignee)

Comment 22

2 years ago
mozreview-review
Comment on attachment 8840952 [details]
Bug 1341254 - Update libspeex_resampler to 79822c.

https://reviewboard.mozilla.org/r/115342/#review119252

::: media/libspeex_resampler/src/resample.c:1150
(Diff revision 1)
> -         {
> +            return RESAMPLER_ERR_OVERFLOW;
> -             st->samp_frac_num[i] = st->den_rate-1;

Good catch. I've made it into a patch in a separate commit (as it should have done), to be applied before the rebase, and the new version of the update patch updates the newly-created patch against the latest codebase (the function is called `_muldiv` instead of `_muldiv_safe`, in particular).
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Actually, that change is *not* a good idea. It turns a minor error (which might cause just a tiny glitch) into an error that breaks resampling completely.
(Assignee)

Comment 26

2 years ago
(In reply to Jean-Marc Valin (:jmspeex) from comment #25)
> Actually, that change is *not* a good idea. It turns a minor error (which
> might cause just a tiny glitch) into an error that breaks resampling
> completely.

Agreed, and I plan to do like you said in the other bug, clamp the resampling ratio ahead of time on every call site that uses `speex_resampler_set_rate`.

I filed bug 1344785 to remove our local patch and do a little audit, but I did a quick check and we're hitting this code path in only one use of the resampler (details in the bug).

This bug is just about updating our copy from upstream to fix something unrelated, so I'm going to land the patches as they are (granted Karl does not find something to fix).
(In reply to Jean-Marc Valin (:jmspeex) from comment #25)
> Actually, that change is *not* a good idea. It turns a minor error (which
> might cause just a tiny glitch) into an error that breaks resampling
> completely.

I assume you are referring to the last hunk in this patch upstream,
rather than the fix for that?
http://git.xiph.org/?p=speexdsp.git;a=commitdiff;h=335a9a16b8a90fee92002a7256b90bb6d8498622#patch2

Comment 28

2 years ago
mozreview-review
Comment on attachment 8844021 [details]
Bug 1341254 - Put the modification made to the resampler code in bug 1274083 into their own patch, and add this patch to update.sh.

https://reviewboard.mozilla.org/r/117588/#review119424
Attachment #8844021 - Flags: review?(karlt) → review+

Comment 29

2 years ago
mozreview-review
Comment on attachment 8840952 [details]
Bug 1341254 - Update libspeex_resampler to 79822c.

https://reviewboard.mozilla.org/r/115342/#review119428

::: media/libspeex_resampler/set-rate-overflow-no-return.patch:13
(Diff revision 2)
> -          if (!_muldiv_safe(st->samp_frac_num[i],st->den_rate,old_den))
> +-         if (_muldiv(&st->samp_frac_num[i],st->samp_frac_num[i],st->den_rate,old_den) != RESAMPLER_ERR_SUCCESS)
>  -            return RESAMPLER_ERR_OVERFLOW;
> -+         {
> ++         if (_muldiv(&st->samp_frac_num[i],st->samp_frac_num[i],st->den_rate,old_den) != RESAMPLER_ERR_SUCCESS) {

File style is to have the opening brace on its own line, but that can be sorted out when we decide how this should merge with the safety net below.
Attachment #8840952 - Flags: review?(karlt) → review+

Comment 30

2 years ago
Pushed by paul@paul.cx:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4b1b4e0d8579
Put the modification made to the resampler code in bug 1274083 into their own patch, and add this patch to update.sh. r=karlt
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8df499e005a
Update libspeex_resampler to 79822c. r=karlt

Comment 31

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/4b1b4e0d8579
https://hg.mozilla.org/mozilla-central/rev/c8df499e005a
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
(Reporter)

Comment 32

2 years ago
Created attachment 8849118 [details]
after_fix.png

Perfect, retesting after FF Nightly fixes, the performance hotspot is nowhere found anymore.

The demo has since been posted live at http://mzl.la/webassemblydemo

Thanks for the fix!
Flags: needinfo?(jujjyl)
(Reporter)

Updated

2 years ago
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.