Closed Bug 1493779 Opened 6 years ago Closed 6 years ago

AddressSanitizer: ILL /builds/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:628:3 in NS_ABORT_OOM(unsigned long)

Categories

(Core :: Web Audio, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox-esr60 --- wontfix
firefox62 --- wontfix
firefox63 --- wontfix
firefox64 --- fixed

People

(Reporter: jkratzer, Assigned: padenot)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, testcase)

Attachments

(3 files, 1 obsolete file)

Attached file testcase.html
Testcase found while fuzzing mozilla-central rev 095ec59a8800.

==28996==ERROR: AddressSanitizer: ILL on unknown address 0x7f8cbd681c9b (pc 0x7f8cbd681c9b bp 0x7f8c6abdbc70 sp 0x7f8c6abdbc70 T26)
    #0 0x7f8cbd681c9a in NS_ABORT_OOM(unsigned long) /builds/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:628:3
    #1 0x7f8cbd67f68e in SizeTooBig /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:217:42
    #2 0x7f8cbd67f68e in nsTArrayInfallibleAllocator::ResultTypeProxy nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity<nsTArrayInfallibleAllocator>(unsigned long, unsigned long) /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray-inl.h:128
    #3 0x7f8cc08d4053 in InsertSlotsAt<nsTArrayInfallibleAllocator> /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray-inl.h:344:3
    #4 0x7f8cc08d4053 in InsertElementsAt<nsTArrayInfallibleAllocator> /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:2233
    #5 0x7f8cc08d4053 in nsTArrayInfallibleAllocator::ResultType nsTArray_Impl<float, nsTArrayInfallibleAllocator>::SetLength<nsTArrayInfallibleAllocator>(unsigned long) /builds/worker/workspace/build/src/obj-firefox/dist/include/nsTArray.h:2171
    #6 0x7f8cc64399b5 in WebCore::Reverb::Reverb(mozilla::AudioChunk const&, unsigned long, bool, bool, float) /builds/worker/workspace/build/src/dom/media/webaudio/blink/Reverb.cpp:93:17
    #7 0x7f8cc63d0bb3 in mozilla::dom::ConvolverNodeEngine::SetBuffer(mozilla::AudioChunk&&) /builds/worker/workspace/build/src/dom/media/webaudio/ConvolverNode.cpp:143:19
    #8 0x7f8cc5cd88ca in mozilla::MediaStreamGraphImpl::RunMessagesInQueue() /builds/worker/workspace/build/src/dom/media/MediaStreamGraph.cpp:1271:20
    #9 0x7f8cc5cdd846 in mozilla::MediaStreamGraphImpl::OneIteration(long) /builds/worker/workspace/build/src/dom/media/MediaStreamGraph.cpp:1481:3
    #10 0x7f8cc59e5eaf in mozilla::AudioCallbackDriver::DataCallback(float const*, float*, long) /builds/worker/workspace/build/src/dom/media/GraphDriver.cpp:1017:36
    #11 0x7f8ccdacda40 in _$LT$audioipc_client..stream..CallbackServer$u20$as$u20$audioipc..rpc..server..Server$GT$::process::_$u7b$$u7b$closure$u7d$$u7d$::ha8b9763f461538f5 /builds/worker/workspace/build/src/media/audioipc/client/src/stream.rs:110:24
    #12 0x7f8ccdacda40 in _$LT$futures..future..lazy..Lazy$LT$F$C$$u20$R$GT$$GT$::get::h8a34836a8181130a /builds/worker/workspace/build/src/third_party/rust/futures/src/future/lazy.rs:64
    #13 0x7f8ccdacda40 in _$LT$futures..future..lazy..Lazy$LT$F$C$$u20$R$GT$$u20$as$u20$futures..future..Future$GT$::poll::h347c84b78536ff3f /builds/worker/workspace/build/src/third_party/rust/futures/src/future/lazy.rs:82
    #14 0x7f8ccdacda40 in futures::future::catch_unwind::_$LT$impl$u20$futures..future..Future$u20$for$u20$std..panic..AssertUnwindSafe$LT$F$GT$$GT$::poll::hc66f05d272be80ba /builds/worker/workspace/build/src/third_party/rust/futures/src/future/catch_unwind.rs:49
    #15 0x7f8ccdacda40 in _$LT$futures..future..catch_unwind..CatchUnwind$LT$F$GT$$u20$as$u20$futures..future..Future$GT$::poll::_$u7b$$u7b$closure$u7d$$u7d$::h74789e3cf2a66f9d /builds/worker/workspace/build/src/third_party/rust/futures/src/future/catch_unwind.rs:32
    #16 0x7f8ccdacda40 in std::panicking::try::do_call::h383a81b8d2056618 /checkout/src/libstd/panicking.rs:310
    #17 0x7f8ccdacda40 in __rust_maybe_catch_panic /checkout/src/libpanic_abort/lib.rs:39
    #18 0x7f8ccdacda40 in std::panicking::try::had5865a562e0f04b /checkout/src/libstd/panicking.rs:289
    #19 0x7f8ccdacda40 in std::panic::catch_unwind::h90957eb90b945f34 /checkout/src/libstd/panic.rs:392
    #20 0x7f8ccdacda40 in _$LT$futures..future..catch_unwind..CatchUnwind$LT$F$GT$$u20$as$u20$futures..future..Future$GT$::poll::h8deb4a51dbc3321a /builds/worker/workspace/build/src/third_party/rust/futures/src/future/catch_unwind.rs:32
    #21 0x7f8ccdacda40 in _$LT$futures_cpupool..MySender$LT$F$C$$u20$core..result..Result$LT$$LT$F$u20$as$u20$futures..future..Future$GT$..Item$C$$u20$$LT$F$u20$as$u20$futures..future..Future$GT$..Error$GT$$GT$$u20$as$u20$futures..future..Future$GT$::poll::h41f167feb7838e3a /builds/worker/workspace/build/src/third_party/rust/futures-cpupool/src/lib.rs:325
    #22 0x7f8ccdae23d0 in _$LT$alloc..boxed..Box$LT$F$GT$$u20$as$u20$futures..future..Future$GT$::poll::h644a5e47165a2c68 /builds/worker/workspace/build/src/third_party/rust/futures/src/future/mod.rs:113:12
    #23 0x7f8ccdae23d0 in _$LT$futures..task_impl..Spawn$LT$T$GT$$GT$::poll_future_notify::_$u7b$$u7b$closure$u7d$$u7d$::h5904c75564ea2b63 /builds/worker/workspace/build/src/third_party/rust/futures/src/task_impl/mod.rs:289
    #24 0x7f8ccdae23d0 in _$LT$futures..task_impl..Spawn$LT$T$GT$$GT$::enter::_$u7b$$u7b$closure$u7d$$u7d$::h1b1078111cc7b73a /builds/worker/workspace/build/src/third_party/rust/futures/src/task_impl/mod.rs:363
    #25 0x7f8ccdae23d0 in futures::task_impl::std::set::h07ec201e1f095c67 /builds/worker/workspace/build/src/third_party/rust/futures/src/task_impl/std/mod.rs:78
    #26 0x7f8ccdae23d0 in _$LT$futures..task_impl..Spawn$LT$T$GT$$GT$::enter::h8019cea08c9ddbfb /builds/worker/workspace/build/src/third_party/rust/futures/src/task_impl/mod.rs:363
    #27 0x7f8ccdae23d0 in _$LT$futures..task_impl..Spawn$LT$T$GT$$GT$::poll_future_notify::h4ab45d7b2ac6c51b /builds/worker/workspace/build/src/third_party/rust/futures/src/task_impl/mod.rs:289
    #28 0x7f8ccdae23d0 in futures::task_impl::std::Run::run::h65b702aba83ce2ad /builds/worker/workspace/build/src/third_party/rust/futures/src/task_impl/std/mod.rs:450
    #29 0x7f8ccdae23d0 in futures_cpupool::Inner::work::h8a5b4bffeba6b3c4 /builds/worker/workspace/build/src/third_party/rust/futures-cpupool/src/lib.rs:257
    #30 0x7f8ccdae23d0 in futures_cpupool::Builder::create::_$u7b$$u7b$closure$u7d$$u7d$::h5d9360920a712044 /builds/worker/workspace/build/src/third_party/rust/futures-cpupool/src/lib.rs:427
    #31 0x7f8ccdae23d0 in std::sys_common::backtrace::__rust_begin_short_backtrace::hb23cbd55170464ca /checkout/src/libstd/sys_common/backtrace.rs:136
    #32 0x7f8ccdae126d in std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::hf6e88a5b6bc04a0d /checkout/src/libstd/thread/mod.rs:409:20
    #33 0x7f8ccdae126d in _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h114bfe3a744a6135 /checkout/src/libstd/panic.rs:313
    #34 0x7f8ccdae126d in std::panicking::try::do_call::hac5f5677baa68db3 /checkout/src/libstd/panicking.rs:310
    #35 0x7f8ccdae126d in __rust_maybe_catch_panic /checkout/src/libpanic_abort/lib.rs:39
    #36 0x7f8ccdae126d in std::panicking::try::he9642d51295b5be5 /checkout/src/libstd/panicking.rs:289
    #37 0x7f8ccdae126d in std::panic::catch_unwind::h0da2e14d6d8eb1cd /checkout/src/libstd/panic.rs:392
    #38 0x7f8ccdae126d in std::thread::Builder::spawn::_$u7b$$u7b$closure$u7d$$u7d$::h46336af32520d244 /checkout/src/libstd/thread/mod.rs:408
    #39 0x7f8ccdae126d in _$LT$F$u20$as$u20$alloc..boxed..FnBox$LT$A$GT$$GT$::call_box::hc064f4650c2cac03 /checkout/src/liballoc/boxed.rs:642
    #40 0x7f8cce044a81 in _$LT$alloc..boxed..Box$LT$$LP$dyn$u20$alloc..boxed..FnBox$LT$A$C$$u20$Output$u3d$R$GT$$u20$$u2b$$u20$$u27$a$RP$$GT$$u20$as$u20$core..ops..function..FnOnce$LT$A$GT$$GT$::call_once::h940efee7335c8f36 /checkout/src/liballoc/boxed.rs:652:8
    #41 0x7f8cce044a81 in std::sys_common::thread::start_thread::hac18b9ed82bd359d /checkout/src/libstd/sys_common/thread.rs:24
    #42 0x7f8cce044a81 in std::sys::unix::thread::Thread::new::thread_start::ha15fdd35f58285a3 /checkout/src/libstd/sys/unix/thread.rs:90
    #43 0x7f8ce09546da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
    #44 0x7f8cdf92d88e in clone /build/glibc-OTsEL5/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: ILL /builds/worker/workspace/build/src/xpcom/base/nsDebugImpl.cpp:628:3 in NS_ABORT_OOM(unsigned long)
Thread T26 (AudioIPC0) created by T0 (file:// Content) here:
    #0 0x560c9ab7373d in __interceptor_pthread_create /builds/worker/workspace/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:204:3
    #1 0x7f8cce0447ac in std::sys::unix::thread::Thread::new::h7a2147c32829a17b /checkout/src/libstd/sys/unix/thread.rs:78:18
Flags: in-testsuite?
Paul, does P2 sounds right, or is this lower priority?
Flags: needinfo?(padenot)
Priority: -- → P2
This is just an OOM, but worth fixing because it's trivial to hit. I've toyed with a couple ideas:
1 - Putting a limit on the impulse response length (not amazing and not too reliable)
2 - Allocate a new buffer on the main thread, copy the data and normalize, so that we can signal the OOM early, and then delete the buffer on the rendering thread when we're done with all the FFTs (works, not super elegant, causes transient memory pressure)
3 - Doing the normalization, doing the FFTs and then undoing it (multiplying by the inverse): we can't because of our buffer sharing mechanism (maybe _in practice_ it would be safe, but well...).
4 - Doing the normalization on the input of the convolution process: this is trading a one time cost into a cost per rendering quantum, not great
5 - Use fallible alloc on the rendering thread, and try to signal back the error somehow, without throwing (using the console maybe ?) and turn the ConvolveNode into a noop.
6 - Plumb the normalization factor down the various objects created here, counting on the fact that we're using multi-stage convolution, and do our temporary allocation chunk per chunk, hoping that they are small enough (worst case, 32768 / 2 floats I think, which is around 13kB).

I'm leaning towards option 2 or 6 here. Karl, what do you think? Do you have another idea?
Flags: needinfo?(padenot) → needinfo?(karlt)
I agree this is worth fixing.  I wonder how large a buffer is feasible
with realtime processing, but large buffers are doable with offline
processing.

I agree it is sensible to focus on the allocations that are the full length
of the buffer and leave the smaller FFT allocations infallible.

FFTConvolver and FFTBlock have float arrays of length fftSize, which for
maxFFTSize = 32768 is 128kB for each of those arrays.  Failure in one of those
allocations would qualify for OOM | small on crash-stats.

You've covered the options well.  I don't have other ideas.

Note, however, that ReverbConvolver::m_accumulationBuffer() is also sized to
the length of the buffer.  This could be made fallible, with failure leading
to zero output.  Reporting an error early would require moving Reverb
construction to the main thread, so I think I'd only do 2 if moving the entire
construction.

Option 6 initially looked appealing.  Maybe its worth doing to save one
allocation, but it is not a complete solution because it doesn't address
m_accumulationBuffer.

I don't have any other ideas for signalling an error to the main thread for
option 5, but I think zero output on allocation failure is acceptable.
i.e. option 5 with a console warning is a good option.

I'd either do 5 or 2 with all of Reverb construction.

My thoughts on performance:

The normalization that is performed is not very useful, but it is the default
and so I guess we should be a bit concerned about performance.

Because FFTs are limited to maxFFTSize, the cost of a single process() call
eventually becomes proportional to the size of the impulse buffer.

At least for moderate sized FFTs less than maxFFTSize, the cost of a
normalization (proportional to impulse buffer) may be large or at least
significant compared to process() cost.

That leads me to suspect that moving calculation of the normalization factor
to the main thread would eventually be beneficial for the rendering thread.
(i.e. part of option 2, at least)

Improvements from moving calculation of the normalization factor are not going
to be immediately noticed because all of the FFTs are currently set up on
construction of the Reverb on the rendering thread.  i.e. cost is proportional
to size.  ReverbAccumulationBuffer zero initialization is proportional too.
Moving Reverb construction to the main thread would be better.  Even
better I suspect would be to delay each FFTBlock initialization until
required, to stagger them.  FFTConvolver::process() does not use the fftKernel
until m_readWriteIndex == halfSize.  That would require to some extent
reverting
https://hg.mozilla.org/mozilla-central/rev/cd8681109e9606502d08341f5c9a58e4b74118ab
This is all a separate project, but where we want to be heading may influence
choices here.

I'm not so concerned about efficiency in the offline processing, super large
buffer situation.  For feasible realtime sizes, I suspect the transient memory
pressure would not be a big concern.
Flags: needinfo?(karlt)
When OOMing when allocating the temporary buffer, we return an error from the
ctor via an output parameter, and make the ConvolverNode output silence.
Additionaly, a warning is issued each time we fail to set a buffer to the buffer
property of a ConvolverNode.

This is much more complicated than anticipated, because we can't, from the
ConvolverNodeEngine::SetBuffer method, find an object we can AddRef, to then
move the reference into a runnable, to then dispatch this runnable to the main
thread, to then find the right owner and then report the error.

Instead, we propagate the error back to the AudioNodeStream, grip the stream,
and go from there. This means AudioNodeStream::SetBuffer has to be modified,
along with the other consumers.
Comment on attachment 9012253 [details]
Bug 1493779 - Add a test for ConvolverNode with very large buffer. r?karlt

Karl Tomlinson (:karlt) has approved the revision.
Attachment #9012253 - Flags: review+
Assignee: nobody → padenot
When OOMing when allocating the temporary buffer, we return an error from the
ctor via an output parameter, and make the ConvolverNode output silence.
Additionaly, a warning is issued each time we fail to set a buffer to the buffer
property of a ConvolverNode.

This is much more complicated than anticipated, because we can't, from the
ConvolverNodeEngine::SetBuffer method, find an object we can AddRef, to then
move the reference into a runnable, to then dispatch this runnable to the main
thread, to then find the right owner and then report the error.

Instead, we propagate the error back to the AudioNodeStream, grip the stream,
and go from there. This means AudioNodeStream::SetBuffer has to be modified,
along with the other consumers.
Attachment #9013598 - Attachment is obsolete: true
Pushed by padenot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e380ab03ba8d
Add a test for ConvolverNode with very large buffer. r=karlt
https://hg.mozilla.org/integration/autoland/rev/cefa70540aac
Gracefuly handle OOM when setting a buffer on ConvolverNode.  r=karlt
hrm, sometimes the `createBuffer` itself is ooming. I'm now catching this and finishing the test early.
Flags: needinfo?(padenot)
Pushed by padenot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/53950ea4b105
Add a test for ConvolverNode with very large buffer. r=karlt
https://hg.mozilla.org/integration/autoland/rev/e78d9996b1a4
Gracefuly handle OOM when setting a buffer on ConvolverNode.  r=karlt
Backout by btara@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d9f492389275
Backed out 2 changesets for test_convolverNodeOOM.html failures CLOSED TREE
Backed out 2 changesets (Bug 1493779) for test_convolverNodeOOM.html failures.

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=e78d9996b1a44045dcf8a85722189a3db77033d3

Backout link: https://hg.mozilla.org/integration/autoland/rev/d9f4923892754c462907bdc524efcf37d75306a7

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=203446073&repo=autoland&lineNumber=1590

[task 2018-10-04T17:06:12.537Z] 17:06:12     INFO -  145 INFO TEST-START | dom/media/webaudio/test/test_convolverNodeOOM.html
[task 2018-10-04T17:06:23.160Z] 17:06:23     INFO -  Buffered messages logged at 17:06:11
[task 2018-10-04T17:06:23.162Z] 17:06:23     INFO -  146 INFO TEST-PASS | dom/media/webaudio/test/test_convolverNodeOOM.html | Correct number of channels for expected buffer 0
[task 2018-10-04T17:06:23.163Z] 17:06:23     INFO -  Buffered messages finished
[task 2018-10-04T17:06:23.163Z] 17:06:23     INFO -  147 INFO TEST-UNEXPECTED-FAIL | dom/media/webaudio/test/test_convolverNodeOOM.html | uncaught exception - NS_ERROR_OUT_OF_MEMORY:  at createGraph@http://mochi.test:8888/tests/dom/media/webaudio/test/test_convolverNodeOOM.html:28:19
[task 2018-10-04T17:06:23.163Z] 17:06:23     INFO -  runTestOnContext@http://mochi.test:8888/tests/dom/media/webaudio/test/webaudio.js:202:20
[task 2018-10-04T17:06:23.163Z] 17:06:23     INFO -  testOnNormalContext@http://mochi.test:8888/tests/dom/media/webaudio/test/webaudio.js:222:7
[task 2018-10-04T17:06:23.163Z] 17:06:23     INFO -  runTestFunction@http://mochi.test:8888/tests/dom/media/webaudio/test/webaudio.js:253:5
[task 2018-10-04T17:06:23.164Z] 17:06:23     INFO -  rval@http://mochi.test:8888/tests/SimpleTest/SimpleTest.js:146:17
[task 2018-10-04T17:06:23.164Z] 17:06:23     INFO -      simpletestOnerror@SimpleTest/SimpleTest.js:1635:11
[task 2018-10-04T17:06:23.164Z] 17:06:23     INFO -  148 INFO TEST-OK | dom/media/webaudio/test/test_convolverNodeOOM.html | took 4114ms
Flags: needinfo?(padenot)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2f52f556deda92ad5b139b2ef7e8537e568957c8 green if we put the getChannelData in the try {} block.
Flags: needinfo?(padenot)
QA Contact: drno
QA Contact: drno
Pushed by padenot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2430e0fe8d6d
Add a test for ConvolverNode with very large buffer. r=karlt
https://hg.mozilla.org/integration/autoland/rev/4dbc5f131040
Gracefuly handle OOM when setting a buffer on ConvolverNode.  r=karlt
https://treeherder.mozilla.org/#/jobs?repo=try&revision=23d98cd3996734a96b3059268816da775c8727a2, with the test disabled on Android.
Flags: needinfo?(padenot)
Pushed by padenot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/76ae4cbbb7e4
Add a test for ConvolverNode with very large buffer. r=karlt
https://hg.mozilla.org/integration/autoland/rev/0db994314ad3
Gracefuly handle OOM when setting a buffer on ConvolverNode.  r=karlt
https://hg.mozilla.org/mozilla-central/rev/76ae4cbbb7e4
https://hg.mozilla.org/mozilla-central/rev/0db994314ad3
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Flags: in-testsuite? → in-testsuite+
See Also: → 1576059
See Also: → 1706772
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: