Closed Bug 886618 Opened 11 years ago Closed 11 years ago

Need to audit Web Audio code, especially AudioBuffer, for cases where JS arrays are neutered

Categories

(Core :: Web Audio, defect)

18 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25
Tracking Status
firefox23 --- disabled
firefox24 - fixed
firefox25 - fixed
firefox-esr17 --- unaffected
b2g18 --- unaffected
b2g-v1.1hd --- unaffected
b2g-v1.2 --- unaffected

People

(Reporter: roc, Assigned: roc)

References

Details

(Keywords: sec-high)

Attachments

(2 files)

It's possible to neuter a Float32Array by passing it as a Transferable via postMessage.

So someone could create an AudioBuffer, call getChannelData(), and then neuter the resulting array. If there's then a GC, the data will probably be collected. But StealJSArrayDataIntoThreadSharedFloatArrayBufferList doesn't check for this, so if you try to use the AudioBuffer, we'll probably just crash.

So anywhere we have an API returning a Float32Array that we retain a reference to, we have to think about what happens if the caller neuters it (length set to zero). Passing in a neutered array is just passing in an array of length zero, so that should be OK.

Jesse, you might to add neutering as a feature in your fuzzers...
Attached file testcase —
Plays random garbage, sometimes crashes.
Assignee: nobody → roc
Attached patch fix — — Splinter Review
Attachment #767046 - Flags: review?(ehsan)
I've looked at the code that uses GetThreadSharedData and it looks OK after this patch.

The only other Web Audio object that holds a JS array currently is WaveShaperNode.curve. We copy the contents when it's set, so neutering it after setting has no effect.
Comment on attachment 767046 [details] [diff] [review]
fix

Review of attachment 767046 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #767046 - Flags: review?(ehsan) → review+
I think this was backed out on inbound...
Group: core-security
Keywords: sec-audit
It looks like this bug was the cause of the test failure. But I don't see how it could be going wrong :-(.
I pushed this changeset to add some logging:
https://hg.mozilla.org/try/rev/f07d64c510de

and got this output:
https://tbpl.mozilla.org/php/getParsedLog.php?id=24708139&tree=Try&full=1#error0
06:11:21     INFO -  153207 INFO TEST-START | /tests/content/media/webaudio/test/test_convolverNode_mono_mono.html
06:11:21     INFO -  153208 INFO TEST-INFO | /tests/content/media/webaudio/test/test_convolverNode_mono_mono.html | Tests ConvolverNode processing a mono channel with mono impulse response.
06:11:21     INFO -  AudioBuffer::GetThreadSharedChannelsForRate returning shared channel
06:11:21     INFO -  AudioBuffer::GetThreadSharedChannelsForRate returning shared channel
06:11:21     INFO -  ConvolverNode::SetBuffer data=0x11a8107c0
06:12:09     INFO -  153209 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webaudio/test/test_convolverNode_mono_mono.html | Triangular portion of convolution is not correct.  Max deviation = 0 dB at 87040
06:12:09     INFO -  153210 ERROR TEST-UNEXPECTED-FAIL | /tests/content/media/webaudio/test/test_convolverNode_mono_mono.html | Test signal was not correctly convolved.
06:12:09     INFO -  153211 INFO TEST-END | /tests/content/media/webaudio/test/test_convolverNode_mono_mono.html | finished in 47650ms

Now this is weird:
-- The change to StealJSArrayDataIntoThreadSharedFloatArrayBufferList was never executed.
-- The early return in AudioBuffer::GetThreadSharedChannelsForRate was not executed.
-- The change to ConvolverNode::SetBuffer had no effect, since it only added 'data &&' to a boolean test, and 'data' was always non-null there.
So I'm mystified.

This test seems to take surprisingly long to run. But that appears to be true for the passing cases also: https://tbpl.mozilla.org/php/getParsedLog.php?id=24577268&tree=Mozilla-Inbound&full=1, and in runs before the patch: https://tbpl.mozilla.org/php/getParsedLog.php?id=24573406&tree=Mozilla-Inbound&full=1
Pushed the part of the patch that just removes dead code, to simplify things:
https://hg.mozilla.org/integration/mozilla-inbound/rev/69d48d054a52
Try run with minimal part of the patch required to fix the security issue: https://tbpl.mozilla.org/?tree=Try&rev=feba0dc2a969
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #12)
> Try run with minimal part of the patch required to fix the security issue:
> https://tbpl.mozilla.org/?tree=Try&rev=feba0dc2a969

No failures. So whatever caused those mysterious failures, it's not the key part of this patch. I'll land it. At least it turns uninitialized memory reads into null-pointer derefs.
https://hg.mozilla.org/integration/mozilla-inbound/rev/b684eef8d587

I still have a few pieces of this patch to try-test and land, including the testcase.
Whiteboard: [leave open]
OK, so it's actually adding the *test* that causes test_convolverNode_mono_mono.html to fail!
Update: I determined that the SpecialPowers.gc() calls in my test are what causes test_convolverNode_mono_mono.html to fail (much later).

I've been doing try builds and studying ConvolverNode code to try to narrow down the problem. I found one big problem: the Reverb constructor temporarily modifies the contents of a ThreadSharedFloatArrayBufferList, which is definitely wrong since those contents could be in use for some other purpose at the same time. But fixing that problem didn't fix the test failure. (I'll post a patch anyway.) Right now I suspect uninitialized memory use in some of the Reverb-related code. Still hunting.
Have you tried this under an ASan build?  That should easily help find the possible uninitialized memory reads...
Also, we need to land this on Aurora as well.
ASan doesn't find uses of uninitialized memory.  For that you need MSan or Valgrind.
(In reply to Jesse Ruderman from comment #27)
> ASan doesn't find uses of uninitialized memory.  For that you need MSan or
> Valgrind.

Ah, you're right.
Valgrind doesn't seem to work on Firefox on my system.
The problem is that ConvolverNode can keep producing output even after its input has gone away (while we drain the Reverb buffers), but the node isn't keeping itself alive during that time, so cycle collection can kick in and remove the node prematurely if we're unlucky enough to have it run during that drain time.

I wonder how many of our other nodes are susceptible to this bug.
We don't track audits or meta bugs, but will track sec-high/critical bugs that fall out of this audit.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #30)
> The problem is that ConvolverNode can keep producing output even after its
> input has gone away (while we drain the Reverb buffers), but the node isn't
> keeping itself alive during that time, so cycle collection can kick in and
> remove the node prematurely if we're unlucky enough to have it run during
> that drain time.

Ah, I should have thought about this!  The simple fix is to hold a SelfReference to yourself when the input goes away and drop it when the reverb buffers are completely emptied.

> I wonder how many of our other nodes are susceptible to this bug.

None of them, the only other ones with a similar behavior are AudioBufferSourceNode, DelayNode, ScriptProcessorNode and OscillatorNode, and they all use SelfReferences to fix this problem.  I just forgot to apply that fix to ConvolverNode as well.
Depends on: 890072
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #32)
> None of them, the only other ones with a similar behavior are
> AudioBufferSourceNode, DelayNode, ScriptProcessorNode and OscillatorNode,
> and they all use SelfReferences to fix this problem.  I just forgot to apply
> that fix to ConvolverNode as well.

Why do only AudioBufferSourceNodes and ScriptProcessorNodes get explicitly stopped in AudioContext::Shutdown, then?
And why doesn't BiquadFilterNode have an mPlayingRef? Its engine has state too.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #33)
> (In reply to :Ehsan Akhgari (needinfo? me!) from comment #32)
> > None of them, the only other ones with a similar behavior are
> > AudioBufferSourceNode, DelayNode, ScriptProcessorNode and OscillatorNode,
> > and they all use SelfReferences to fix this problem.  I just forgot to apply
> > that fix to ConvolverNode as well.
> 
> Why do only AudioBufferSourceNodes and ScriptProcessorNodes get explicitly
> stopped in AudioContext::Shutdown, then?

Because those two nodes can be source nodes, and they should live as long as they're not explicitly stopped, or as long as the page is alive.  We probably need to do something similar for OscillatorNode as well...

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #34)
> And why doesn't BiquadFilterNode have an mPlayingRef? Its engine has state
> too.

It does?  I don't see anything indicating that in WebCore::Biquad::process.  Also, in BiquadFilterNodeEngine::ProduceAudioBlock, we return null if the input is null.
There's the bit that says "// Filter memory". You can see that Biquad::process, even if sourceP is all zeroes, can return nonzero output in destP. But only for the first two samples, so I guess we get away with chopping those off sometimes.

I think this means that in BiquadFilterNodeEngine::ProduceAudioBlock, aInput being explicit zeroes might behave slightly differently from aInput.IsNull(), since in the former case we could produce nonzero output and in the latter case we will not.

These are not bad bugs, but they're disquieting.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #36)
> There's the bit that says "// Filter memory". You can see that
> Biquad::process, even if sourceP is all zeroes, can return nonzero output in
> destP. But only for the first two samples, so I guess we get away with
> chopping those off sometimes.
> 
> I think this means that in BiquadFilterNodeEngine::ProduceAudioBlock, aInput
> being explicit zeroes might behave slightly differently from
> aInput.IsNull(), since in the former case we could produce nonzero output
> and in the latter case we will not.
> 
> These are not bad bugs, but they're disquieting.

Hmm, yeah, ok, can you please file this?  We should fix them at some point, but it's not quite urgent.

Also, yay on comment 38!
Shouldn't you get approval for that?

Also, I'm not quite following the details here, but if you actually found and fixed some sec-high or sec-crit bugs, you should adjust the security rating here as appropriate.
(In reply to Andrew McCreight [:mccr8] from comment #41)
> Shouldn't you get approval for that?

I already have gotten approval for webaudio specific patches from release-drivers.

> Also, I'm not quite following the details here, but if you actually found
> and fixed some sec-high or sec-crit bugs, you should adjust the security
> rating here as appropriate.

I *think* the bug here is sec-high, as we don't (shouldn't?) make branching decisions based on the invalid data.  I'll let roc confirm my understanding.
Flags: needinfo?(roc)
Yep.

Try says we're good to go with the test. Hurrah.
Flags: needinfo?(roc)
Keywords: sec-auditsec-high
https://hg.mozilla.org/mozilla-central/rev/60307abd5f7f
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Group: media-core-security
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: