Closed
Bug 1027864
Opened 9 years ago
Closed 8 years ago
flush subnormals in feedback loops
Categories
(Core :: Web Audio, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox40 | --- | fixed |
People
(Reporter: karlt, Assigned: baku)
References
Details
(Keywords: perf)
Attachments
(1 file, 2 obsolete files)
4.66 KB,
patch
|
karlt
:
review+
|
Details | Diff | Splinter Review |
A feedback loop is a situation where denormals can be generated and attenuation will not remove them, so they remain indefinitely wasting cpu. DelayNode is probably the best place to flush these, and that may be necessary only while in a cycle. This is also important for removing the DelayNode's self reference so that the node can be collected when the cycle is no longer producing audible signal.
Reporter | ||
Comment 2•8 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #1) > I think we can just set FTZ in the CPU, right? Perhaps. I was concerned that flushing subnormals to zero for all arithmetic may cause unexpected results, such as divide-by-zero due to permitting a - b = 0 when a != b.
Comment 3•8 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #2) > (In reply to Paul Adenot (:padenot) from comment #1) > > I think we can just set FTZ in the CPU, right? > > Perhaps. I was concerned that flushing subnormals to zero for all > arithmetic may cause unexpected results, such as divide-by-zero due to > permitting a - b = 0 when > a != b. We may be fine here, because on OSX, when CoreAudio creates and hands us audio callbacks [0], it sets FTZ and DAZ, which is a sane default for audio. [0]: http://lists.apple.com/archives/coreaudio-api/2014/Jun/msg00051.html and answers
Assignee | ||
Comment 4•8 years ago
|
||
Can you give me a first feedback about this? A good test is missing.
Attachment #8587451 -
Flags: feedback?(padenot)
Comment 5•8 years ago
|
||
Comment on attachment 8587451 [details] [diff] [review] isSilence.patch I think that works. I'm not sure if 1e-7 is an appropriate number, maybe we can aim higher/lower?
Attachment #8587451 -
Flags: feedback?(padenot) → feedback+
Assignee | ||
Comment 6•8 years ago
|
||
I'm using 1e-9f instead 1e-7f. Maybe it's better.
Attachment #8587451 -
Attachment is obsolete: true
Attachment #8587498 -
Flags: review?(padenot)
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → amarchesini
Comment 7•8 years ago
|
||
Comment on attachment 8587498 [details] [diff] [review] isSilence.patch Review of attachment 8587498 [details] [diff] [review]: ----------------------------------------------------------------- This looks good to me, but Karl knows this better. ::: dom/media/AudioSegment.h @@ +137,5 @@ > + } > + > + for (uint32_t i = 0, length = mChannelData.Length(); i < length; ++i) { > + const float* channel = static_cast<const float*>(mChannelData[i]); > + for (StreamTime frame = 0; frame < mDuration; ++frame) { Just use a uint32_t here, StreamTime is for, well, MediaStream times. @@ +138,5 @@ > + > + for (uint32_t i = 0, length = mChannelData.Length(); i < length; ++i) { > + const float* channel = static_cast<const float*>(mChannelData[i]); > + for (StreamTime frame = 0; frame < mDuration; ++frame) { > + if (fabs(channel[frame]) >= 1e-9f) { I don't like hard coded magic values without explanations :-) ::: dom/media/webaudio/test/test_bug1027864.html @@ +1,4 @@ > +<!DOCTYPE HTML> > +<html> > +<head> > + <title>Test bug 1027864</title> Maybe a more explicit name? "Check that unreferenced cycles with no inputs can be collected" ?
Attachment #8587498 -
Flags: review?(padenot) → review?(karlt)
Assignee | ||
Comment 8•8 years ago
|
||
I'm going to apply your comment when I'll receive the karlt's review.
Reporter | ||
Comment 9•8 years ago
|
||
Comment on attachment 8587498 [details] [diff] [review] isSilence.patch >+ bool IsSilence() const >+ if (fabs(channel[frame]) >= 1e-9f) { The Web Audio spec requires floating point representation, and gain nodes could multiply signals by arbitrary values, so we can't just assume that a small signal is a silent signal. However, subnormal floating point values come with a large cost, and are already being flushed elsewhere, such as in simd operations, so I think we can justify considering subnormals as silence. That would mean a cut-off at FLT_MIN or numeric_limits<float>:min(). Naming the function IsSilentOrSubnormal() would be clearer about what is happening. Mochitests don't usually require review, but I have some suggestions. >+Services.obs.addObserver(observer, "webaudio-node-demise", false); I didn't know about webaudio-node-demise. That's very useful for tests, thanks. >+ Services.obs.removeObserver(observer, "webaudio-node-demise"); IIUC the observer service maintains a global list and so the observer will not be automatically removed when the document is unloaded. The test harness tries to keep running tests in the same process after one test times out, so I think calling of removeObserver should be handled through SimpleTest.registerCleanupFunction(). >+ var delay = ac.createDelay(); >+ delay.delayTime.value = 0.3; >+ >+ var gain = ac.createGain(); >+ gain.gain.value = 0.3; How long does the test take to run with the change to FLT_MIN? Reducing delayTime, or reducing the magnitude of the original signal from the oscillator, could make it run considerably faster. Minimum delay time is one block, which is about 0.003s at 44.1kHz. The gain in the loop needs to be high enough so that multiplying denorm_min() by the gain is not zero. I would have thought that would require a gain > 0.5, unless there is some different rounding happening. >+setInterval(function() { >+ forceCC(); >+ forceCC(); >+}, 200); >+ >+function forceCC() { >+ SpecialPowers.DOMWindowUtils.cycleCollect(); >+ SpecialPowers.DOMWindowUtils.garbageCollect(); >+ SpecialPowers.DOMWindowUtils.garbageCollect(); >+} Is it necessary to have all these GC/CC calls? I would expect a GC to be required before the CC, but DOMWindowUtils.garbageCollect() does "a garbage collection followed by a cycle collection" [1], so I would have expected a single call to be enough. [1] http://hg.mozilla.org/mozilla-central/annotate/a32e3b93c8d8/dom/interfaces/base/nsIDOMWindowUtils.idl#l775
Attachment #8587498 -
Flags: review?(karlt) → review-
Comment 10•8 years ago
|
||
(In reply to Karl Tomlinson (:karlt) from comment #9) > Comment on attachment 8587498 [details] [diff] [review] > isSilence.patch > > >+ bool IsSilence() const > > >+ if (fabs(channel[frame]) >= 1e-9f) { > > The Web Audio spec requires floating point representation, and gain nodes > could multiply signals by arbitrary values, so we can't just assume that a > small signal is a silent signal. > > However, subnormal floating point values come with a large cost, and are > already being flushed elsewhere, such as in simd operations, so I think we > can > justify considering subnormals as silence. That would mean a cut-off at > FLT_MIN or numeric_limits<float>:min(). I've been thinking about adding an informative note in the spec about that. In practice, I'd expect any audio software to automatically flush denormals to zero.
Assignee | ||
Comment 11•8 years ago
|
||
Thanks for reviewing the patch. Here your comments applied. The test takes a couple of seconds with the new values ~30secs (running mach mochitest-chrome ... - debug build). The test itself takes around 10secs.
Attachment #8587498 -
Attachment is obsolete: true
Attachment #8595320 -
Flags: review?(karlt)
Assignee | ||
Comment 12•8 years ago
|
||
mmm... what I meant to say is: The test takes ~30secs (running mach mochitest-chrome ... - debug build). The test itself takes around 10secs. I don't have stats about the previous patch.
Reporter | ||
Comment 13•8 years ago
|
||
Comment on attachment 8595320 [details] [diff] [review] isSilence.patch Sorry, I didn't notice when I first looked, but I now realise this addresses only some of the issues in comment 0. I'm marking this r+ because it addresses the important issue of removing the DelayNode's self reference. However, if you would like to proceed with this approach, then please file a new bug for the remaining issues. The key remaining issue that is not addressed is that subnormals are still not flushed to zero so the processing is still expensive until CC runs, which is not necessarily frequent. There may even be a reference still remaining so CC may not reap the feedback loop. While providing a null chunk to mBuffer.Write() may be an option, it doesn't seem ideal because the chunk consists of a volume (modified by the gain node) and the channel data. IsSilentOrSubnormal() is testing only the channel data and ignoring the volume. Channel data and volume are combined in DelayBuffer::ReadChannels(), so I'd be inclined to test for subnormals in DelayBuffer::Read(), where aOutputChunk->SetNull(WEBAUDIO_BLOCK_SIZE) can be safely called. Doing that would make the IsSilentOrSubnormal() test in DelayNodeEngine::ProcessBlock() unnecessary. In that case an AudioBlockIsSilentOrSubnormal() method, declared in AudioNodeEngine.h may be more appropriate than the AudioChunk method. (In reply to Andrea Marchesini (:baku) from comment #12) > The test takes ~30secs (running mach mochitest-chrome ... - debug build). > The test itself takes around 10secs. 10s for a single test is acceptable, if necessary, but, if reducing the delay further, possibly with a corresponding reduction in the oscillator stop() time speeds up the test, then that is worth doing. >+ forceCC(); >+ forceCC(); >+}, 200); >+ >+function forceCC() { >+ SpecialPowers.DOMWindowUtils.cycleCollect(); >+ SpecialPowers.DOMWindowUtils.garbageCollect(); I doubt the cycleCollect() call here is useful, because JS will need to GC before the nodes can be collected, and garbageCollect() will do that as well as the CC. I would have expected only a single forceCC() to be required unless there are some odd ownership issues remaining. If such issues exist, we should identify them, because Web Audio performance is unfortunately dependent on CC.
Attachment #8595320 -
Flags: review?(karlt) → review+
Reporter | ||
Comment 14•8 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #1) > I think we can just set FTZ in the CPU, right? I'm not opposed to doing that. I'm assuming that can be done on all our target platforms. If not, then there may be little point, as we'd have to write the code to explicitly flush where necessary anyway. The comment in DenormalDisabler.h says "For systems using x87 instead of sse, there's no hardware support // to flush denormals automatically. Hence, we need to flush // denormals to zero manually." I don't know whether or not that affects us. Running JS on the audio thread may complicate things a bit as we'd have to set FTZ again after each time the JS runs.
Assignee | ||
Comment 15•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc6b0b2bacd4 https://hg.mozilla.org/integration/mozilla-inbound/rev/3a9e056fb0ec /me filing a follow up for the FTZ thing.
Comment 16•8 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/3a9e056fb0ec
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox40:
--- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in
before you can comment on or make changes to this bug.
Description
•