Closed Bug 1027864 Opened 7 years ago Closed 6 years ago

flush subnormals in feedback loops

Categories

(Core :: Web Audio, defect, P2)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed

People

(Reporter: karlt, Assigned: baku)

References

Details

(Keywords: perf)

Attachments

(1 file, 2 obsolete files)

A feedback loop is a situation where denormals can be generated and attenuation will not remove them, so they remain indefinitely wasting cpu.

DelayNode is probably the best place to flush these, and that may be necessary only while in a cycle.

This is also important for removing the DelayNode's self reference so that the node can be collected when the cycle is no longer producing audible signal.
I think we can just set FTZ in the CPU, right?
Keywords: perf
(In reply to Paul Adenot (:padenot) from comment #1)
> I think we can just set FTZ in the CPU, right?

Perhaps.  I was concerned that flushing subnormals to zero for all arithmetic may cause unexpected results, such as divide-by-zero due to permitting a - b = 0 when
a != b.
(In reply to Karl Tomlinson (:karlt) from comment #2)
> (In reply to Paul Adenot (:padenot) from comment #1)
> > I think we can just set FTZ in the CPU, right?
> 
> Perhaps.  I was concerned that flushing subnormals to zero for all
> arithmetic may cause unexpected results, such as divide-by-zero due to
> permitting a - b = 0 when
> a != b.

We may be fine here, because on OSX, when CoreAudio creates and hands us audio callbacks [0], it sets FTZ and DAZ, which is a sane default for audio.

[0]: http://lists.apple.com/archives/coreaudio-api/2014/Jun/msg00051.html and answers
Attached patch isSilence.patch (obsolete) — Splinter Review
Can you give me a first feedback about this?
A good test is missing.
Attachment #8587451 - Flags: feedback?(padenot)
Comment on attachment 8587451 [details] [diff] [review]
isSilence.patch

I think that works. I'm not sure if 1e-7 is an appropriate number, maybe we can aim higher/lower?
Attachment #8587451 - Flags: feedback?(padenot) → feedback+
Attached patch isSilence.patch (obsolete) — Splinter Review
I'm using 1e-9f instead 1e-7f. Maybe it's better.
Attachment #8587451 - Attachment is obsolete: true
Attachment #8587498 - Flags: review?(padenot)
Assignee: nobody → amarchesini
Comment on attachment 8587498 [details] [diff] [review]
isSilence.patch

Review of attachment 8587498 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me, but Karl knows this better.

::: dom/media/AudioSegment.h
@@ +137,5 @@
> +    }
> +
> +    for (uint32_t i = 0, length = mChannelData.Length(); i < length; ++i) {
> +      const float* channel = static_cast<const float*>(mChannelData[i]);
> +      for (StreamTime frame = 0; frame < mDuration; ++frame) {

Just use a uint32_t here, StreamTime is for, well, MediaStream times.

@@ +138,5 @@
> +
> +    for (uint32_t i = 0, length = mChannelData.Length(); i < length; ++i) {
> +      const float* channel = static_cast<const float*>(mChannelData[i]);
> +      for (StreamTime frame = 0; frame < mDuration; ++frame) {
> +        if (fabs(channel[frame]) >= 1e-9f) {

I don't like hard coded magic values without explanations :-)

::: dom/media/webaudio/test/test_bug1027864.html
@@ +1,4 @@
> +<!DOCTYPE HTML>
> +<html>
> +<head>
> +  <title>Test bug 1027864</title>

Maybe a more explicit name? "Check that unreferenced cycles with no inputs can be collected" ?
Attachment #8587498 - Flags: review?(padenot) → review?(karlt)
I'm going to apply your comment when I'll receive the karlt's review.
Comment on attachment 8587498 [details] [diff] [review]
isSilence.patch

>+  bool IsSilence() const

>+        if (fabs(channel[frame]) >= 1e-9f) {

The Web Audio spec requires floating point representation, and gain nodes
could multiply signals by arbitrary values, so we can't just assume that a
small signal is a silent signal.

However, subnormal floating point values come with a large cost, and are
already being flushed elsewhere, such as in simd operations, so I think we can
justify considering subnormals as silence.  That would mean a cut-off at
FLT_MIN or numeric_limits<float>:min().

Naming the function IsSilentOrSubnormal() would be clearer about
what is happening.

Mochitests don't usually require review, but I have some suggestions.

>+Services.obs.addObserver(observer, "webaudio-node-demise", false);

I didn't know about webaudio-node-demise.
That's very useful for tests, thanks.

>+        Services.obs.removeObserver(observer, "webaudio-node-demise");

IIUC the observer service maintains a global list and so the observer will not
be automatically removed when the document is unloaded.  The test harness tries
to keep running tests in the same process after one test times out, so I think
calling of removeObserver should be handled through
SimpleTest.registerCleanupFunction().

>+  var delay = ac.createDelay();
>+  delay.delayTime.value = 0.3;
>+
>+  var gain = ac.createGain();
>+  gain.gain.value = 0.3;

How long does the test take to run with the change to FLT_MIN?

Reducing delayTime, or reducing the magnitude of the original signal from the oscillator, could make it run considerably faster.  Minimum delay time is one block, which is about 0.003s at 44.1kHz.

The gain in the loop needs to be high enough so that multiplying denorm_min()
by the gain is not zero.  I would have thought that would require a gain >
0.5, unless there is some different rounding happening.

>+setInterval(function() {
>+  forceCC();
>+  forceCC();
>+}, 200);
>+
>+function forceCC() {
>+  SpecialPowers.DOMWindowUtils.cycleCollect();
>+  SpecialPowers.DOMWindowUtils.garbageCollect();
>+  SpecialPowers.DOMWindowUtils.garbageCollect();
>+}

Is it necessary to have all these GC/CC calls?
I would expect a GC to be required before the CC, but
DOMWindowUtils.garbageCollect() does "a garbage collection followed by a cycle
collection" [1], so I would have expected a single call to be enough.

[1] http://hg.mozilla.org/mozilla-central/annotate/a32e3b93c8d8/dom/interfaces/base/nsIDOMWindowUtils.idl#l775
Attachment #8587498 - Flags: review?(karlt) → review-
(In reply to Karl Tomlinson (:karlt) from comment #9)
> Comment on attachment 8587498 [details] [diff] [review]
> isSilence.patch
> 
> >+  bool IsSilence() const
> 
> >+        if (fabs(channel[frame]) >= 1e-9f) {
> 
> The Web Audio spec requires floating point representation, and gain nodes
> could multiply signals by arbitrary values, so we can't just assume that a
> small signal is a silent signal.
> 
> However, subnormal floating point values come with a large cost, and are
> already being flushed elsewhere, such as in simd operations, so I think we
> can
> justify considering subnormals as silence.  That would mean a cut-off at
> FLT_MIN or numeric_limits<float>:min().

I've been thinking about adding an informative note in the spec about that. In practice, I'd expect any audio software to automatically flush denormals to zero.
Attached patch isSilence.patchSplinter Review
Thanks for reviewing the patch. Here your comments applied.
The test takes a couple of seconds with the new values ~30secs (running mach mochitest-chrome ... - debug build). The test itself takes around 10secs.
Attachment #8587498 - Attachment is obsolete: true
Attachment #8595320 - Flags: review?(karlt)
mmm... what I meant to say is:

The test takes ~30secs (running mach mochitest-chrome ... - debug build).
The test itself takes around 10secs.

I don't have stats about the previous patch.
Comment on attachment 8595320 [details] [diff] [review]
isSilence.patch

Sorry, I didn't notice when I first looked, but I now realise this addresses
only some of the issues in comment 0.

I'm marking this r+ because it addresses the important issue of removing the
DelayNode's self reference.  However, if you would like to proceed with this
approach, then please file a new bug for the remaining issues.

The key remaining issue that is not addressed is that subnormals are still not
flushed to zero so the processing is still expensive until CC runs, which is
not necessarily frequent.  There may even be a reference still remaining so CC
may not reap the feedback loop.

While providing a null chunk to mBuffer.Write() may be an option, it doesn't
seem ideal because the chunk consists of a volume (modified by the gain node)
and the channel data.

IsSilentOrSubnormal() is testing only the channel data and ignoring the
volume.

Channel data and volume are combined in DelayBuffer::ReadChannels(), so I'd be
inclined to test for subnormals in DelayBuffer::Read(), where
aOutputChunk->SetNull(WEBAUDIO_BLOCK_SIZE) can be safely called.
Doing that would make the IsSilentOrSubnormal() test in DelayNodeEngine::ProcessBlock() unnecessary.
In that case an AudioBlockIsSilentOrSubnormal() method, declared in
AudioNodeEngine.h may be more appropriate than the AudioChunk method.

(In reply to Andrea Marchesini (:baku) from comment #12)
> The test takes ~30secs (running mach mochitest-chrome ... - debug build).
> The test itself takes around 10secs.

10s for a single test is acceptable, if necessary, but, if reducing the delay further, possibly with a corresponding reduction in the oscillator stop() time speeds up the test, then that is worth doing.

>+  forceCC();
>+  forceCC();
>+}, 200);
>+
>+function forceCC() {
>+  SpecialPowers.DOMWindowUtils.cycleCollect();
>+  SpecialPowers.DOMWindowUtils.garbageCollect();

I doubt the cycleCollect() call here is useful, because JS will need to GC
before the nodes can be collected, and garbageCollect() will do that as well
as the CC.

I would have expected only a single forceCC() to be required unless there are
some odd ownership issues remaining.  If such issues exist, we should identify
them, because Web Audio performance is unfortunately dependent on CC.
Attachment #8595320 - Flags: review?(karlt) → review+
(In reply to Paul Adenot (:padenot) from comment #1)
> I think we can just set FTZ in the CPU, right?

I'm not opposed to doing that.
I'm assuming that can be done on all our target platforms.
If not, then there may be little point, as we'd have to write the code to explicitly flush where necessary anyway.

The comment in DenormalDisabler.h says 

          "For systems using x87 instead of sse, there's no hardware support
        // to flush denormals automatically. Hence, we need to flush
        // denormals to zero manually."

I don't know whether or not that affects us.

Running JS on the audio thread may complicate things a bit as we'd have to set FTZ again after each time the JS runs.
Blocks: 1157635
https://hg.mozilla.org/mozilla-central/rev/3a9e056fb0ec
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in before you can comment on or make changes to this bug.