Closed Bug 1869083 Opened 2 years ago Closed 2 years ago

Poison value crash in [@ objc_msgSend | HALC_ShellDevice::_GetPropertyData] from cubeb_coreaudio with Google Meet on MacOS 12

Categories

(Core :: Audio/Video: cubeb, defect, P2)

Tracking

RESOLVED FIXED
123 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox120 --- unaffected
firefox121 --- unaffected
firefox122 + fixed
firefox123 + fixed

People

(Reporter: mccr8, Assigned: pehrsons)

References

(Depends on 1 open bug, Blocks 1 open bug, Regression)

Details

(4 keywords)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/9611e6e9-e376-4643-9af0-4d4f40231207

Reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS

Top 10 frames of crashing thread:

0  libobjc.A.dylib  objc_msgSend  
1  CoreAudio  HALC_ShellDevice::_GetPropertyData const  
2  CoreAudio  invocation function for block in HALC_ShellObject::GetPropertyData const  
3  CoreAudio  HALB_CommandGate::ExecuteCommand const  
4  CoreAudio  HALC_ShellObject::GetPropertyData const  
5  CoreAudio  HAL_HardwarePlugIn_ObjectGetPropertyData  
6  CoreAudio  HALPlugIn::ObjectGetPropertyData const  
7  CoreAudio  HALObject::GetPropertyData const  
8  CoreAudio  AudioObjectGetPropertyData  
9  AudioDSP  AudioDSP@0x26f1b9  

These crashes all have the jemalloc poison value in the crash address, which suggests a use-after-free. The crashes all appear to be on MacOS 12, so this may be a platform bug that was fixed in later MacOS versions.
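As a rough illustration of what "poison value in the crash address" means, here is a minimal sketch of a triage check. It assumes that the allocator overwrites freed memory with the byte 0xe5, and that a pointer read from freed memory may have a small offset added before it is dereferenced. The constant and the function `looks_like_poison` are hypothetical names for illustration, not part of any crash-stats tooling.

```rust
/// Byte the allocator is assumed to write over freed memory (assumption: 0xe5).
const POISON_BYTE: u8 = 0xe5;

/// Returns true if the high-order bytes of a 64-bit crash address all equal
/// the poison byte. The lowest `slack_bytes` bytes are ignored, to tolerate a
/// small field offset having been added to a poisoned pointer.
fn looks_like_poison(addr: u64, slack_bytes: usize) -> bool {
    // Big-endian order puts the most significant byte first.
    let bytes = addr.to_be_bytes();
    let checked = bytes.len().saturating_sub(slack_bytes);
    checked > 0 && bytes[..checked].iter().all(|&b| b == POISON_BYTE)
}
```

For example, `looks_like_poison(0xe5e5_e5e5_e5e5_e5f5, 1)` accepts an address that is the poison pattern plus a 0x10 offset, while an ordinary heap address is rejected.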

Of the dozen crashes in the last month with this signature that have URLs in them, 9 are Google Meet and 2 are Google Docs.

We are seeing a similar issue in upstream tests in CI (GitHub Actions) as well. It has already been submitted to Apple Security as OE192508859543.

I'll change it to sec-vector, though if we can work around it somehow that would be good.

Keywords: sec-high → sec-vector

Taking this because I'm actively working on it.

I can reproduce some crashes in this area by checking out https://github.com/mozilla/cubeb-coreaudio-rs and running some tests. run_sanitizers.sh runs all tests twice, first with ASAN and then with TSAN, and would normally catch something on MacOS 12.

Note those tests are run in parallel and the CoreAudio APIs seem to not be thread safe to at least some extent, though the docs are very vague on this. In gecko all calls into those APIs are run on a single thread. But there's also a notification thread, managed by the platform (per our instructions). I want to test making that the same thread as where we make our calls, to see if that helps anything.
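The "all calls on a single thread" model described above can be sketched in plain Rust. This is only an illustration under stated assumptions: `SerialThread` and `run_sync` are hypothetical names, the closures stand in for calls into the platform framework, and nothing here is taken from the cubeb or gecko sources.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// A unit of work to run on the worker thread.
type Task = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical helper: owns one worker thread and runs every submitted
/// closure on that thread, in submission order.
struct SerialThread {
    sender: Option<Sender<Task>>,
    worker: Option<JoinHandle<()>>,
}

impl SerialThread {
    fn new() -> Self {
        let (sender, receiver) = channel::<Task>();
        let worker = thread::spawn(move || {
            // The loop ends once every Sender has been dropped.
            for task in receiver {
                task();
            }
        });
        SerialThread { sender: Some(sender), worker: Some(worker) }
    }

    /// Runs `f` on the worker thread and blocks until its result is ready.
    /// Must not be called from the worker thread itself (it would deadlock).
    fn run_sync<R, F>(&self, f: F) -> R
    where
        F: FnOnce() -> R + Send + 'static,
        R: Send + 'static,
    {
        let (result_tx, result_rx) = channel();
        let task: Task = Box::new(move || {
            let _ = result_tx.send(f());
        });
        self.sender
            .as_ref()
            .expect("worker is running")
            .send(task)
            .expect("worker is alive");
        result_rx.recv().expect("worker produced a result")
    }
}

impl Drop for SerialThread {
    fn drop(&mut self) {
        // Dropping the sender ends the worker loop; then wait for it to exit.
        self.sender.take();
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

With this shape, both API calls and notification handling could be funnelled through `run_sync`, so that no two of them run concurrently. It cannot serialize work done on threads that the platform spawns internally.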

Assignee: nobody → apehrson
Severity: -- → S2
Status: NEW → ASSIGNED
Priority: -- → P2

I came across another crash that looks similar [@ objc_release | CACFString::~CACFString ]
bp-21bceda2-afb4-4f69-90ce-6af9e0231205

0  libobjc.A.dylib  objc_release  
1  CoreAudio  CACFString::~CACFString  
2  CoreAudio  HALDevice::~HALDevice  
3  CoreAudio  HALDevice::~HALDevice  
4  CoreAudio  HALObjectMap::ReleaseObject  
5  CoreAudio  HALSystem::ObjectsPublishedAndDied  
6  CoreAudio  HALSystem::AudioObjectsPublishedAndDied  
7  CoreAudio  HALC_ShellPlugIn::ProxyObject_PropertiesChanged  
8  CoreAudio  HALC_ProxyNotifications::CallListener_f  
9  libdispatch.dylib  _dispatch_client_callout  
Crash Signature: [@ objc_msgSend | HALC_ShellDevice::_GetPropertyData] → [@ objc_msgSend | HALC_ShellDevice::_GetPropertyData] [@ objc_release | CACFString::~CACFString ]

Re comment 3, the report in the description is on the AudioIPC Server RPC thread that we make all calls into the framework on, and the report in comment 4 is on a notification thread managed by the platform. I suspect these are the two threads racing on a non-threadsafe API (though it also seems to vary somewhat by MacOS version).

I'm going to mark this as a regression, since it only shows up in Nightly. Alternatively, it may be using some feature that is Nightly-only.

:pehrsons, could you set the correct regressor on this?
You had a few bugs land during the Fx122 Nightly cycle; not sure if it's one of them.

Flags: needinfo?(apehrson)

I believe it's in theory a latent issue (with our CoreAudio threading model) but exacerbated by bug 1670633, though it's hard to know for sure since the platform code is closed source.

Flags: needinfo?(apehrson)
Regressed by: 1670633

Set release status flags based on info from the regressing bug 1670633

(In reply to Andreas Pehrson [:pehrsons] from comment #3)
> Taking this because I'm actively working on it.
>
> I can reproduce some crashes in the area by checking out https://github.com/mozilla/cubeb-coreaudio-rs and running some tests. run_sanitizers.sh will cover two passes of all tests, first with ASAN, then TSAN and would normally catch something when on MacOS 12.
>
> Note those tests are run in parallel and the CoreAudio APIs seem to not be thread safe to at least some extent, though the docs are very vague on this. In gecko all calls into those APIs are run on a single thread. But there's also a notification thread, managed by the platform (per our instructions). I want to test making that the same thread as where we make our calls, to see if that helps anything.

(In reply to Andreas Pehrson [:pehrsons] from comment #5)
> Re comment 3, the report in the description is on the AudioIPC Server RPC thread that we make all calls into the framework on, and the report in comment 4 is on a notification thread managed by the platform. I suspect these are the two threads racing on a non-threadsafe API (though it also seems to vary somewhat by MacOS version).

I've tested this, and making everything the same thread does not help. The threads involved in the suspected race do not appear to be the thread we call the APIs on and the platform's notification thread. Instead, they are the thread we call the APIs on and a separate platform-internal notification listener thread, which supposedly handles events from the platform's notification thread (it was spawned from there).

The only two ways forward I see are 1) live with this MacOS-12-specific platform crash, or 2) disable the fix for bug 1670633 on MacOS-12 only (possible with the help of bug 1869526).

I'll also note that for 1) sites would not be able to exploit this at will, because it requires user permission to trigger (getUserMedia for audio).
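Option 2) amounts to gating the behavior on the OS major version. A minimal sketch of that idea follows, assuming the OS version is available as a dotted string such as "12.6.1". The functions `major_version` and `fix_enabled` are hypothetical names for illustration and do not describe how bug 1869526 exposes the version. Leaving the fix enabled when the version cannot be parsed is also an assumption.

```rust
/// Extracts the major component from a dotted version string like "12.6.1".
/// Returns None if the first component is not a number.
fn major_version(version: &str) -> Option<u32> {
    version.split('.').next()?.trim().parse().ok()
}

/// Hypothetical gate: the fix stays enabled everywhere except on MacOS 12.
/// Assumption: an unparsable version string leaves the fix enabled.
fn fix_enabled(os_version: &str) -> bool {
    match major_version(os_version) {
        Some(12) => false,
        _ => true,
    }
}
```

Under these assumptions, `fix_enabled("12.6.1")` is false, while `fix_enabled("11.7.10")` and `fix_enabled("13.0")` are true.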

See Also: → 1871417

> The only two ways forward I see are 1) live with this MacOS-12-specific platform crash, or 2) disable the fix for bug 1670633 on MacOS-12 only (possible with the help of bug 1869526).

:pehrsons, any updates on this bug?
Next week is the final week of beta for Fx122, and this is triaged as S2.

Flags: needinfo?(apehrson)

Bug 1866595 will disable the fix for bug 1670633 on MacOS 12.

Flags: needinfo?(apehrson)

:pehrsons I set Fx122 and Fx123 as Fixed since Bug 1866595 landed and was uplifted to beta.
Just checking if there's anything left to track here?

We should be good.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(apehrson)
Resolution: --- → FIXED
Group: media-core-security → core-security-release
Depends on: 1866595
Target Milestone: --- → 123 Branch
QA Whiteboard: [post-critsmash-triage]

Making Firefox 122 security bugs public. [bugspam filter string: Pilgarlic-Towers]

Group: core-security-release