Crash in [@ arena_dalloc | MozJemalloc::free | rlbox::rlbox_wasm2c_sandbox::impl_create_sandbox] on poison value
Categories
(Core :: Security: RLBox, defect, P2)
Tracking
()
People
(Reporter: mccr8, Assigned: shravanrn)
References
(Regression)
Details
(4 keywords, Whiteboard: [adv-main125+r])
Crash Data
Attachments
(2 files)
|
48 bytes,
text/x-phabricator-request
|
tjr
:
sec-approval+
|
Details | Review |
|
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
Crash report: https://crash-stats.mozilla.org/report/index/e518df80-6de6-4794-bfa9-723710240313
Reason: SIGSEGV / SEGV_MAPERR
Top 10 frames of crashing thread:
0 libmozglue.so arena_dalloc memory/build/mozjemalloc.cpp:3790
1 libmozglue.so MozJemalloc::free memory/build/malloc_decls.h:54
2 liblgpllibs.so rlbox::rlbox_wasm2c_sandbox::impl_create_sandbox media/libsoundtouch/src/RLBoxSoundTouch.cpp
2 liblgpllibs.so rlbox::rlbox_sandbox<rlbox::rlbox_wasm2c_sandbox>::create_sandbox<bool> third_party/rlbox/include/rlbox_sandbox.hpp:402
2 liblgpllibs.so mozilla::RLBoxSoundTouch::RLBoxSoundTouch media/libsoundtouch/src/RLBoxSoundTouch.cpp:15
3 libxul.so mozilla::AudioDecoderInputTrack::EnsureTimeStretcher dom/media/mediasink/AudioDecoderInputTrack.cpp:623
3 libxul.so mozilla::AudioDecoderInputTrack::AppendTimeStretchedDataToSegment dom/media/mediasink/AudioDecoderInputTrack.cpp:424
3 libxul.so mozilla::AudioDecoderInputTrack::AppendBufferedDataToOutput dom/media/mediasink/AudioDecoderInputTrack.cpp:390
3 libxul.so mozilla::AudioDecoderInputTrack::ProcessInput dom/media/mediasink/AudioDecoderInputTrack.cpp:343
4 libxul.so mozilla::MediaTrackGraphImpl::ProduceDataForTracksBlockByBlock dom/media/MediaTrackGraph.cpp:1254
We're crashing on an address that is at least partially poison (0x00e5e5e5e5e00000 or 0xe5e5e5e5e5e00000) while trying to free something while running impl_create_sandbox. I'm not sure what line this is actually crashing on. Maybe we're hitting some error while creating the sandbox and freeing some partially allocated data structure? There are 16 crashes with these signatures on poison values in the last 6 months, which is a reasonably steady rate. These are all on Android.
| Reporter | ||
Comment 1•2 years ago
|
||
FWIW, these are all from RLBoxSoundTouch, though I'd guess that's just because that is the most common use of the RLBox on Android.
gsvelto, can you figure out from the crash dumps or whatever which code is actually running here when we crash? Thanks.
Comment 2•2 years ago
|
||
The address appears to be all poison and the reason why we crash on 0xe5e5e5e5e5e00000 is because we've masked out the last few bits (which were also the poison value). It's unclear on what line this is failing but there's a few things that hint at what might be going wrong:
- This particular sandbox is created with
infallibleset to false - Because of this there's several checks that could fail
- In case of failure
FALLIBLE_DYNAMIC_CHECK()callsimpl_destroy_sandbox() impl_destroy_sandbox()is indeed freeing some objects so chances are that thefree()call that is failing is there
| Assignee | ||
Comment 3•2 years ago
|
||
Ah gotcha. Thanks for putting this on my radar. I will investigate further.
| Assignee | ||
Updated•2 years ago
|
| Assignee | ||
Comment 4•2 years ago
•
|
||
Ok, just did a quick audit and here are the two changes we should make:
-
As a general rule, create_sandbox should either be called with
infallible = false /* abort if OOM or other failure */or the caller should check the returnedbooland fail gracefully if this is false. SoundTouch doesn't seem to follow either option (all other sandboxed libraries correctly follow this pattern). We should probably just be passing ininfallible=truein the first place, given that the caller doesn't seem to have any graceful exit code. -
There does seem to be a case here where if the OOM failure happens between certain allocation, the
impl_destroy_sandbox()can try to free an object that wasn't allocated. We should play it safe and modifyimpl_destroy_sandbox()so that every deallocation occurs only after checking that the allocation has gone through.
I'll work on both these patches now.
| Assignee | ||
Comment 5•2 years ago
|
||
| Assignee | ||
Comment 6•2 years ago
•
|
||
Patch has been submitted to phabricator.
@glandium: could you take a look?
Notes:
- The patch to rlbox_wasm2c_sandbox has been made directly on Firefox's in-tree version; so the in-tree version is now out of sync with the public version. Once this change has landed in Mozilla, I will push this back up to RLBox's public repos.
- We will continue to use the in-tree version of RLBox wasm2c even after this (we won't sync back to the latest upstream commit), as we are trying to migrate to rlbox v2 after this, and want to minimize any new addition of features to rlbox v1 as far as possible.
- Re tests: Ran a soundtouch test locally, and this worked fine. So I think this patch shouldn't need a try run given that the patch is simple.
| Reporter | ||
Comment 7•2 years ago
|
||
One thing we do in some of our data structures is have a fallible version with [[nodiscard]] to enforce that the return type is checked. The fallible version take a special dummy mozilla::fallible_t argument. Example: fallible nsTArray::AppendElement, infallible nsTArray::AppendElement.
| Assignee | ||
Comment 8•2 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #7)
One thing we do in some of our data structures is have a fallible version with
[[nodiscard]]to enforce that the return type is checked.
Oh this is a good note. I'll use this in the RLBox v2 design.
Separately, re the exploit-ability of this bug. Since this is basically a free(uninit) on a fresh heap allocation, per @glandium, this means uninit is either always null or poison due to the operation of the mozilla allocator (which poisons on free). Given this, this should not be exploitable.
Additionally, @tjr mentioned we are currently in a freeze for landing patches for a couple of days.
So it seems sensible to wait to land this, given that this is not exploitable.
| Assignee | ||
Comment 9•2 years ago
|
||
I have taken a guess at the priority and severity flags --- please adjust if this doesn't conform to the norms followed on these fields
| Assignee | ||
Comment 10•2 years ago
|
||
Comment on attachment 9391375 [details]
Bug 1885359 - Fix intermittent crash on destroy rlbox sandbox in soundtouch r=glandium,tjr
Security Approval Request
- How easily could an exploit be constructed based on the patch?: This does not seem to exploitable beyond causing a crash. This is because the crashing operation is a
free(uninit)on a new allocation, and new allocations are in Firefox are either zeroed or have a posion value. - Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
- Which branches (beta, release, and/or ESR) are affected by this flaw, and do the release status flags reflect this affected/unaffected state correctly?: Firefox 120 and newer
- If not all supported branches, which bug introduced the flaw?: Bug 1673285
- Do you have backports for the affected branches?: Yes
- If not, how different, hard to create, and risky will they be?:
- How likely is this patch to cause regressions; how much testing does it need?: Unlikely
- Is the patch ready to land after security approval is given?: Yes
- Is Android affected?: Yes
| Reporter | ||
Updated•2 years ago
|
| Reporter | ||
Comment 11•2 years ago
|
||
(In reply to Shravan Narayan from comment #8)
Separately, re the exploit-ability of this bug. Since this is basically a
free(uninit)on a fresh heap allocation, per @glandium, this meansuninitis either always null or poison due to the operation of the mozilla allocator (which poisons on free). Given this, this should not be exploitable.
I'm not sure how much that necessarily protects us. We don't poison very large allocations (though maybe that doesn't apply here) and we don't actually protect the poison address against allocation, so maybe you could do something like get an object allocated at the poison address, trigger this issue to free that address, and now your pointer to the poison address is pointing at something that was freed.
Comment 12•2 years ago
|
||
The poison has a high enough value that it should be impossible to map it (it should end up in kernel-reserved space). As for size, unless this struct is very very large https://searchfox.org/mozilla-central/rev/5f0a7ca8968ac5cef8846e1d970ef178b8b76dcc/media/libsoundtouch/src/RLBoxSoundTouch.h#46, there shouldn't be a concern.
Updated•2 years ago
|
Comment 13•2 years ago
|
||
Comment on attachment 9391375 [details]
Bug 1885359 - Fix intermittent crash on destroy rlbox sandbox in soundtouch r=glandium,tjr
sec-approvals were paused for a few days after merge, thanks for the patience. approved to land and uplift
| Assignee | ||
Comment 14•2 years ago
|
||
@glandium: could you help land this as I don't have permission to commit?
Comment 15•2 years ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D204720
Updated•2 years ago
|
Comment 16•2 years ago
|
||
Uplift Approval Request
- String changes made/needed: No
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: none
- Risk associated with taking this patch: low
- Fix verified in Nightly: no
- Explanation of risk level: "How likely is this patch to cause regressions; how much testing does it need?: Unlikely"
- Code covered by automated testing: yes
- Is Android affected?: yes
- User impact if declined: sec-high bug, but not believe to be exploitable
Updated•2 years ago
|
Updated•2 years ago
|
Comment 17•2 years ago
|
||
| uplift | ||
Comment 18•2 years ago
|
||
Comment 19•2 years ago
|
||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Updated•1 year ago
|
Description
•