Open Bug 1793538 Opened 2 years ago Updated 2 years ago

Crashes in -[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:] [@ CFRunLoopAddSource ]

Categories

(Core :: WebRTC: Audio/Video, defect, P3)

Unspecified
macOS
defect

Tracking

()

People

(Reporter: smichaud, Unassigned)

References

Details

(Keywords: reproducible, Whiteboard: STR in comment #21)

Crash Data

Attachments

(1 file)

These aren't new. Typical crash stack:

bp-8e2e2151-d442-44ca-ae7e-c77b80220929

Crashing Thread (80)
Frame  Module  Signature  Source  Trust
0  CoreFoundation  CFRunLoopAddSource   context
1  CMIOBaseUnits  CMIOUnitDALInputEntry   cfi
2  CMIOBaseUnits  CMIOUnitVideoToolboxCompressorEntry   cfi
3  CMIOBaseUnits  CMIOUnitInputFromProcsEntry   cfi
4  CMIOBaseUnits  CMIOUnitDALInputEntry   cfi
5  CMIOBaseUnits  CMIOUnitDALInputEntry   cfi
6  CMIOBaseUnits  CMIOUnitFanOutEntry   cfi
7  CoreMediaIO  CMIOUnitNodeInfo::Initialize(CMIOGraph*)   cfi
8  CoreMediaIO  CMIOGraph::Initialize()   cfi
9  AVFCapture  -[AVCaptureSession_Tundra _buildAndRunGraph]   cfi
10  AVFCapture  -[AVCaptureSession_Tundra _setRunning:]   cfi
11  AVFCapture  -[AVCaptureSession_Tundra startRunning]   cfi
12  XUL  -[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:]  dom/media/systemservices/objc_video_capture/rtc_video_capture_objc.mm:200  cfi
13  libdispatch.dylib  _dispatch_call_block_and_release   cfi
14  libdispatch.dylib  _dispatch_client_callout   cfi
15  libdispatch.dylib  _dispatch_lane_serial_drain   cfi
16  libdispatch.dylib  _dispatch_lane_invoke   cfi
17  libdispatch.dylib  _dispatch_workloop_worker_thread   cfi
18  libsystem_pthread.dylib  _pthread_wqthread   cfi
19  libsystem_pthread.dylib  start_wqthread   cfi

https://crash-stats.mozilla.org/search/?signature=~CFRunLoopAddSource&platform=Mac%20OS%20X&date=%3E%3D2022-09-04T02%3A05%3A00.000Z&date=%3C2022-10-04T02%3A05%3A00.000Z&_facets=signature&_facets=platform_version&_facets=proto_signature&_facets=version&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-proto_signature

Crash Signature: [@ CFRunLoopAddSource ]

I strongly suspect the top 7 lines of the crash stack from comment #0 are corrupt.

Here are two demos that exercise -[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:]. I can't reproduce this stack above line 7 with either of them in lldb.

https://www.webrtc-experiment.com/RecordRTC/
https://whatwebcando.today/camera-microphone.html

Gabriele, any idea what's going on here (with respect to comment #2)?

Flags: needinfo?(gsvelto)

After setting breakpoints on all CMIOBaseUnits functions plus CFRunLoopAddSource, I only get the following stacks in lldb. (I limit the CFRunLoopAddSource breakpoint to the same thread as -[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:] by doing br modify -T current [N] after -[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:] has been hit.)

* thread #4, queue = 'org.webrtc.videocapture', stop reason = breakpoint 4.1
    frame #0: 0x00000001a407c580 CMIOBaseUnits`CMIOUnitDALInputEntry
CMIOBaseUnits`CMIOUnitDALInputEntry:
->  0x1a407c580 <+0>: pushq  %rbp
    0x1a407c581 <+1>: movq   %rsp, %rbp
    0x1a407c584 <+4>: popq   %rbp
    0x1a407c585 <+5>: jmp    0x1a407c58a               ; ___lldb_unnamed_symbol321$$CMIOBaseUnits
Target 0: (firefox) stopped.
(lldb) bt
* thread #4, queue = 'org.webrtc.videocapture', stop reason = breakpoint 4.1
  * frame #0: 0x00000001a407c580 CMIOBaseUnits`CMIOUnitDALInputEntry
    frame #1: 0x00007ff81fd35848 CoreMediaIO`CMIOUnitCreateFromDescription + 1087
    frame #2: 0x00007ff81fd123d3 CoreMediaIO`CMIOUnitNodeInfo::Open() + 59
    frame #3: 0x00007ff81fd02ae6 CoreMediaIO`CMIOGraph::CreateNode(unsigned int, unsigned int, OpaqueCMIOUnit*, CMIOUnitDescription const&, void const*, int&) + 170
    frame #4: 0x00007ff81fd02917 CoreMediaIO`CMIOGraphCreateNode + 314
    frame #5: 0x00007ff82be8ca06 AVFCapture`-[AVCaptureDeviceInput_Tundra addInputUnitsForInputPort:toGraph:ofCaptureSession:error:] + 496
    frame #6: 0x00007ff82be5f2b7 AVFCapture`-[AVCaptureSession_Tundra _buildGraphUnitsForInputPort:error:] + 743
    frame #7: 0x00007ff82be5cf0e AVFCapture`-[AVCaptureSession_Tundra _buildAndRunGraph] + 933
    frame #8: 0x00007ff82be5e913 AVFCapture`-[AVCaptureSession_Tundra _setRunning:] + 456
    frame #9: 0x00007ff82be5e460 AVFCapture`-[AVCaptureSession_Tundra startRunning] + 180
    frame #10: 0x0000000112ba7827 XUL`-[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:](self=0x0000000127903b80, _cmd=<unavailable>, currentOutput=<unavailable>) at rtc_video_capture_objc.mm:200:3 [opt]
    frame #11: 0x00007ff811e030cc libdispatch.dylib`_dispatch_call_block_and_release + 12
    frame #12: 0x00007ff811e04317 libdispatch.dylib`_dispatch_client_callout + 8
    frame #13: 0x00007ff811e0a317 libdispatch.dylib`_dispatch_lane_serial_drain + 672
    frame #14: 0x00007ff811e0adfd libdispatch.dylib`_dispatch_lane_invoke + 366
    frame #15: 0x00007ff811e14eee libdispatch.dylib`_dispatch_workloop_worker_thread + 753
    frame #16: 0x00007ff811fb7fd0 libsystem_pthread.dylib`_pthread_wqthread + 326
    frame #17: 0x00007ff811fb6f57 libsystem_pthread.dylib`start_wqthread + 15
(lldb) c
Process 931 resuming
Process 931 stopped
* thread #4, queue = 'org.webrtc.videocapture', stop reason = breakpoint 6.1
    frame #0: 0x00000001a40ea940 CMIOBaseUnits`CMIOUnitVideoToolboxCompressorEntry
CMIOBaseUnits`CMIOUnitVideoToolboxCompressorEntry:
->  0x1a40ea940 <+0>: pushq  %rbp
    0x1a40ea941 <+1>: movq   %rsp, %rbp
    0x1a40ea944 <+4>: popq   %rbp
    0x1a40ea945 <+5>: jmp    0x1a40ea94a               ; ___lldb_unnamed_symbol1257$$CMIOBaseUnits
Target 0: (firefox) stopped.
(lldb) bt
* thread #4, queue = 'org.webrtc.videocapture', stop reason = breakpoint 6.1
  * frame #0: 0x00000001a40ea940 CMIOBaseUnits`CMIOUnitVideoToolboxCompressorEntry
    frame #1: 0x00007ff81fd35848 CoreMediaIO`CMIOUnitCreateFromDescription + 1087
    frame #2: 0x00007ff81fd123d3 CoreMediaIO`CMIOUnitNodeInfo::Open() + 59
    frame #3: 0x00007ff81fd02ae6 CoreMediaIO`CMIOGraph::CreateNode(unsigned int, unsigned int, OpaqueCMIOUnit*, CMIOUnitDescription const&, void const*, int&) + 170
    frame #4: 0x00007ff81fd02917 CoreMediaIO`CMIOGraphCreateNode + 314
    frame #5: 0x00007ff82be68d64 AVFCapture`-[AVCaptureVideoDataOutput_Tundra addOutputUnitsForConnection:toGraph:ofCaptureSession:error:] + 572
    frame #6: 0x00007ff82be5f629 AVFCapture`-[AVCaptureSession_Tundra _buildGraphUnitsForConnection:error:] + 548
    frame #7: 0x00007ff82be5d19f AVFCapture`-[AVCaptureSession_Tundra _buildAndRunGraph] + 1590
    frame #8: 0x00007ff82be5e913 AVFCapture`-[AVCaptureSession_Tundra _setRunning:] + 456
    frame #9: 0x00007ff82be5e460 AVFCapture`-[AVCaptureSession_Tundra startRunning] + 180
    frame #10: 0x0000000112ba7827 XUL`-[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:](self=0x0000000127903b80, _cmd=<unavailable>, currentOutput=<unavailable>) at rtc_video_capture_objc.mm:200:3 [opt]
    frame #11: 0x00007ff811e030cc libdispatch.dylib`_dispatch_call_block_and_release + 12
    frame #12: 0x00007ff811e04317 libdispatch.dylib`_dispatch_client_callout + 8
    frame #13: 0x00007ff811e0a317 libdispatch.dylib`_dispatch_lane_serial_drain + 672
    frame #14: 0x00007ff811e0adfd libdispatch.dylib`_dispatch_lane_invoke + 366
    frame #15: 0x00007ff811e14eee libdispatch.dylib`_dispatch_workloop_worker_thread + 753
    frame #16: 0x00007ff811fb7fd0 libsystem_pthread.dylib`_pthread_wqthread + 326
    frame #17: 0x00007ff811fb6f57 libsystem_pthread.dylib`start_wqthread + 15
(lldb) c

CFRunLoopAddSource can be called from the CMIOBaseUnits bundle (from two different locations), but only from static functions with no name in the symbol table. In neither case is CMIOUnitDALInputEntry the nearest label. I strongly suspect it's not being called at all.

(Following up comment #5)

"I strongly suspect it's not being called at all."

Oops, this is wrong. br modify -T current [N] doesn't work. The following does work:

Process 1544 stopped
* thread #110, queue = 'org.webrtc.videocapture', stop reason = breakpoint 1.1
    frame #0: 0x0000000112ba772a XUL`-[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:](self=0x0000000132708ac0, _cmd=<unavailable>, currentOutput=<unavailable>) at rtc_video_capture_objc.mm:179:30 [opt]
   176 	}
   177 	
   178 	- (void)startCaptureInBackgroundWithOutput:(AVCaptureVideoDataOutput*)currentOutput {
-> 179 	  NSString* captureQuality = [NSString stringWithString:AVCaptureSessionPresetLow];
   180 	  if (_capability.width >= 1280 || _capability.height >= 720) {
   181 	    captureQuality = [NSString stringWithString:AVCaptureSessionPreset1280x720];
   182 	  } else if (_capability.width >= 640 || _capability.height >= 480) {
Target 0: (firefox) stopped.
(lldb) thread info
thread #110: tid = 0xd28a, 0x0000000112ba772a XUL`-[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:](self=0x0000000132708ac0, _cmd=<unavailable>, currentOutput=<unavailable>) at rtc_video_capture_objc.mm:179:30, queue = 'org.webrtc.videocapture', stop reason = breakpoint 1.1

(lldb) b CFRunLoopAddSource
Breakpoint 2: where = CoreFoundation`CFRunLoopAddSource, address = 0x00007ff81205c110
(lldb) br modify -t 0xd28a 2
(lldb) c
Process 1544 resuming
2022-10-04 00:14:16.165604-0500 firefox[1544:53435] [default] enable_updates_common timed out waiting for updates to reenable
Process 1544 stopped
* thread #110, queue = 'org.webrtc.videocapture', stop reason = breakpoint 2.1
    frame #0: 0x00007ff81205c110 CoreFoundation`CFRunLoopAddSource
CoreFoundation`CFRunLoopAddSource:
->  0x7ff81205c110 <+0>: pushq  %rbp
    0x7ff81205c111 <+1>: movq   %rsp, %rbp
    0x7ff81205c114 <+4>: pushq  %r15
    0x7ff81205c116 <+6>: pushq  %r14
Target 0: (firefox) stopped.
(lldb) bt
* thread #110, queue = 'org.webrtc.videocapture', stop reason = breakpoint 2.1
  * frame #0: 0x00007ff81205c110 CoreFoundation`CFRunLoopAddSource
    frame #1: 0x00000001a6bf5b4c CMIOBaseUnits`___lldb_unnamed_symbol389$$CMIOBaseUnits + 226
    frame #2: 0x00000001a6c6c5cf CMIOBaseUnits`___lldb_unnamed_symbol1346$$CMIOBaseUnits + 29
    frame #3: 0x00000001a6c23120 CMIOBaseUnits`___lldb_unnamed_symbol762$$CMIOBaseUnits + 58
    frame #4: 0x00000001a6befd1a CMIOBaseUnits`___lldb_unnamed_symbol330$$CMIOBaseUnits + 102
    frame #5: 0x00000001a6bf7ef1 CMIOBaseUnits`___lldb_unnamed_symbol421$$CMIOBaseUnits + 23
    frame #6: 0x00000001a6c04a16 CMIOBaseUnits`___lldb_unnamed_symbol551$$CMIOBaseUnits + 43
    frame #7: 0x00007ff81fd12505 CoreMediaIO`CMIOUnitNodeInfo::Initialize(CMIOGraph*) + 57
    frame #8: 0x00007ff81fd04e00 CoreMediaIO`CMIOGraph::Initialize() + 3072
    frame #9: 0x00007ff82be5d356 AVFCapture`-[AVCaptureSession_Tundra _buildAndRunGraph] + 2029
    frame #10: 0x00007ff82be5e913 AVFCapture`-[AVCaptureSession_Tundra _setRunning:] + 456
    frame #11: 0x00007ff82be5e460 AVFCapture`-[AVCaptureSession_Tundra startRunning] + 180
    frame #12: 0x0000000112ba7827 XUL`-[RTCVideoCaptureIosObjC startCaptureInBackgroundWithOutput:](self=0x0000000132708ac0, _cmd=<unavailable>, currentOutput=<unavailable>) at rtc_video_capture_objc.mm:200:3 [opt]
    frame #13: 0x00007ff811e030cc libdispatch.dylib`_dispatch_call_block_and_release + 12
    frame #14: 0x00007ff811e04317 libdispatch.dylib`_dispatch_client_callout + 8
    frame #15: 0x00007ff811e0a317 libdispatch.dylib`_dispatch_lane_serial_drain + 672
    frame #16: 0x00007ff811e0adfd libdispatch.dylib`_dispatch_lane_invoke + 366
    frame #17: 0x00007ff811e14eee libdispatch.dylib`_dispatch_workloop_worker_thread + 753
    frame #18: 0x00007ff811fb7fd0 libsystem_pthread.dylib`_pthread_wqthread + 326
    frame #19: 0x00007ff811fb6f57 libsystem_pthread.dylib`start_wqthread + 15

So here's the correct crash stack for this bug.

(In reply to Steven Michaud [:smichaud] (Retired) from comment #6)

frame #1: 0x00000001a6bf5b4c CMIOBaseUnits`___lldb_unnamed_symbol389$$CMIOBaseUnits + 226
frame #2: 0x00000001a6c6c5cf CMIOBaseUnits`___lldb_unnamed_symbol1346$$CMIOBaseUnits + 29
frame #3: 0x00000001a6c23120 CMIOBaseUnits`___lldb_unnamed_symbol762$$CMIOBaseUnits + 58
frame #4: 0x00000001a6befd1a CMIOBaseUnits`___lldb_unnamed_symbol330$$CMIOBaseUnits + 102
frame #5: 0x00000001a6bf7ef1 CMIOBaseUnits`___lldb_unnamed_symbol421$$CMIOBaseUnits + 23
frame #6: 0x00000001a6c04a16 CMIOBaseUnits`___lldb_unnamed_symbol551$$CMIOBaseUnits + 43

Ah, interesting, synthesized symbols. I don't know how Symbolic deals with them. It's possible that when dumping the symbols these aren't being generated correctly and those address ranges end up inside other ones.
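
A minimal sketch of how that kind of misattribution could happen, assuming the symbolicator resolves an address to the nearest preceding symbol (I have not checked that Symbolic actually works this way). The offsets and the two symbol tables below are hypothetical and chosen only for illustration. If the unnamed functions are missing from the dumped table, an address inside one of them falls back to the last named symbol before it.

```python
from bisect import bisect_right


def nearest_symbol(symbols, addr):
    """Return the name of the last symbol whose start address is <= addr."""
    starts = [start for start, _ in symbols]
    i = bisect_right(starts, addr)
    return symbols[i - 1][1] if i else None


# Hypothetical module-relative offsets, not taken from the real bundle.
named = [
    (0x1000, "CMIOUnitDALInputEntry"),
    (0x9000, "CMIOUnitFanOutEntry"),
]
unnamed = [
    (0x100A, "___lldb_unnamed_symbol321"),
    (0x2000, "___lldb_unnamed_symbol389"),
]

# Hypothetical return address: 226 bytes into the unnamed function at 0x2000.
addr = 0x2000 + 226

# Table without synthesized symbols: the address lands in the named entry point.
without_synth = nearest_symbol(sorted(named), addr)
# Table with synthesized symbols: the address resolves to the unnamed function.
with_synth = nearest_symbol(sorted(named + unnamed), addr)

print(without_synth)
print(with_synth)
```

With the named-only table the lookup reports CMIOUnitDALInputEntry, which is the shape of the suspicious frames in comment #0. With the synthesized symbols included it reports the unnamed function, as lldb does in comment #6.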

Going back to the crash itself, this might be a bug in Apple libraries. There doesn't seem to be a correlation with our build IDs: the earliest crash is in a release version, but there are also quite a few in newer betas, which is just not likely if the problem were on our side. This might have to do with specific macOS versions. For example, 10.15 is affected, but only its very last version; we have no crashes on file for the other ones.

Also plenty of crashes seem to involve Google Meet. It appears both in the URL and in the comments.

Flags: needinfo?(gsvelto)

I filed bug 1793562 to investigate the symbolication issue.

This is definitely an Apple bug -- presumably an intermittent one.

It happens when CFRunLoopAddSource(CFRunLoopRef rl, CFRunLoopSourceRef source, CFRunLoopMode mode) is called with a NULL source parameter. That happens when CFRunLoopSourceRef IONotificationPortGetRunLoopSource(IONotificationPortRef notify) returns NULL, which in turn happens when io_connect_t IORegisterForSystemPower(void *refcon, IONotificationPortRef *thePortRef, IOServiceInterestCallback callback, io_object_t *notifier) fails without returning an error.

The function in which these bad things (sometimes) occur is the first of the two static (and therefore nameless) functions that can call CFRunLoopAddSource() in the CMIOBaseUnits bundle (located at /System/Library/Frameworks/CoreMediaIO.framework/Versions/A/Resources/BaseUnits/CMIOBaseUnits.bundle/Contents/MacOS).

IORegisterForSystemPower()

Thinking more about this, it may be a wake/sleep issue.

"br modify -T current [N] doesn't work."

Neither does br modify -t current [N] -- you get the error message error: invalid thread id string 'current'. Which is weird, because it used to work (see bug 1780938 comment #27). And I've found a commit that supports it, which seems to be included in LLVM 14.0.6 (the version currently used by Mozilla's build toolkit).

https://reviews.llvm.org/D107015
https://reviews.llvm.org/rGf362b05d0dcd176348f19a63f8b7f4d4c0498bba

Edit: I've figured this out, I think. Mozilla's build toolkit uses Apple's lldb, which only supports -t current on macOS 13 :-(

I just looked at the module list for several of this bug's crash reports. They all contain an obs-mac-virtualcam module. This seems to be part of https://github.com/johnboiles/obs-mac-virtualcam. It may play a role here, though I'm still pretty sure this is an Apple bug.

obs-mac-virtualcam now seems to be part of OBS.

This module's UUID in the module lists (for the most recent crashes) is the same as the UUID for the same module in the latest OBS installer, available here. Version 28.0 of OBS was released on 2022-08-31, preceded by an RC2 on 2022-08-24. This matches pretty closely with when these crashes started showing up in Mozilla's crash stats. I think we may have found the trigger for these crashes.

You can find a module's UUID (which is also its "debug identifier") by doing otool -l [module] | grep uuid.
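
If you want to compare UUIDs across several modules, the same otool step can be scripted. This is only a rough sketch: the helper names (parse_uuids, module_uuids) are mine, the sample text uses a made-up UUID laid out like otool -l output for an LC_UUID load command, and module_uuids() assumes otool is on the PATH (so macOS with the developer tools installed).

```python
import re
import subprocess

# Matches the "uuid <value>" line that otool -l prints for an LC_UUID command.
UUID_RE = re.compile(r"^\s*uuid\s+([0-9A-Fa-f-]{36})\s*$", re.MULTILINE)


def parse_uuids(otool_output):
    """Return every UUID found in `otool -l` output, upper-cased for comparison."""
    return [u.upper() for u in UUID_RE.findall(otool_output)]


def module_uuids(path):
    """Run `otool -l` on a module and return its UUIDs (requires otool)."""
    result = subprocess.run(
        ["otool", "-l", path], check=True, capture_output=True, text=True
    )
    return parse_uuids(result.stdout)


# Made-up sample, for illustration only; not the real obs-mac-virtualcam UUID.
sample = """\
Load command 8
     cmd LC_UUID
 cmdsize 24
    uuid 01234567-89ab-cdef-0123-456789abcdef
"""

uuids = parse_uuids(sample)
print(uuids)
```

On a Mac, module_uuids() can be pointed at the module binary inside the installer, and the result compared by eye with the identifier shown in a crash report's module list. The crash report may format the identifier differently, so some normalization may be needed.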

Does anyone know if OBS is part of Google Meet? As best I can tell from Google Meet's list of requirements, the answer is "no".

I haven't a clue how to use OBS with Firefox, or any other web browser. What I need to know, of course, is how to get Firefox to run in such a way that the obs-mac-virtualcam module is also running inside the Firefox main process. What help I can find isn't written from that point of view.

Edit: Found the answer. 1) Open the OBS installer and copy its app (OBS Studio) to the /Applications directory, then run it. 2) Allow it to install the extra stuff it needs. 3) Leave it running. No other setup is needed. 4) (Re)start Firefox.

From this point, doing image list in lldb includes /Library/CoreMediaIO/Plug-Ins/DAL/obs-mac-virtualcam.plugin/Contents/MacOS/obs-mac-virtualcam. I suppose that gets loaded automatically with the CoreMediaIO framework. And the demo apps from comment #2 will find an "OBS Virtual Camera". You don't even need to have a real camera installed.

I've opened an issue at OBS.

This turns out to be a dup of another bug. This bug's crashes also happen with another "client" (one that uses the OBS Virtual Camera) -- Zoom. And even without the involvement of any third party software.

My hunch from comment #9 is probably wrong: These crashes probably aren't caused by IORegisterForSystemPower() failing to do an error return when it should. Instead they seem to have to do with "cached" Mach ports.

The OBS Project has reported these crashes to Apple. They're also thinking about using "native" virtual camera support -- available on macOS 12.3. This would hopefully take care of the problem on macOS 12.3 and up. Apple may also fix this bug -- but probably only on macOS 13 (and up).

Steven, should this still be considered a webrtc bug now that it has been determined to be a bug in OBS?

Flags: needinfo?(smichaud)

It's really an Apple bug. And for that reason I think it's best to keep this under webrtc. Most Apple bugs get dumped under Cocoa Widgets. But it's clearly webrtc code (and functionality) that's affected by this Apple bug.

If this were truly a bug in OBS, it'd probably be best to put it under "External Software Affecting Firefox" (like at bug 1670195). But it's not. Though OBS does have some leeway to choose how it might work around Apple's bug, it doesn't really have any control over the problem. So it's not really up to them to fix it.

Flags: needinfo?(smichaud)

Given that we're waiting for either a workaround from OBS or a fix from Apple, I'm going to mark this P3/S3.

Severity: -- → S3
Priority: -- → P3

Minimal STR. For this you need to have an actual camera installed.

  1. Make sure your camera is connected -- for example by plugging it in to a USB port.

  2. Run OBS Studio and choose "Start Virtual Camera". You don't need to have any "sources" defined. You can just make the virtual camera broadcast a blank screen.

  3. Start Firefox and visit https://webrtc.github.io/samples/src/content/devices/input-output/.

  4. Switch your "video source" back and forth between your camera and "OBS Virtual Camera". Doing this four or five times should trigger one of this bug's crashes.

Has STR: --- → yes
Keywords: reproducible
Whiteboard: STR in comment #21

When you trigger one of these crashes in a local build (vanilla settings, not running in lldb) you get an Apple crash report. Here's an example.

Unlike PatTheMav at OBS, I didn't see any errors about bad Mach ports.*

I tested on a MacPro (Intel) running macOS 12.6.

*Edit: But I did see the following error in the Console, when I ran it before crashing:

15:38:43.845326-0500	IOKit.framework:IONotificationPortGetRunLoopSource bad CFMachPort, <CFMachPort 0x139794ac0 [0x7ff85ccebd70]>{valid = Yes, port = 20f5f, source = 0x0, callout = __NSFireMachPort (0x7ff81b0eb788), context = <CFMachPort context 0x12c5518e0>}

My STR don't work in Chrome, though -- even though I also see the IOKit.framework:IONotificationPortGetRunLoopSource bad CFMachPort error there. I think the crucial difference is that Chrome uses the OS (macOS) to determine access to the camera (and microphone). Firefox has its own methods.

Edit: Oops, I may need to take this back. Chrome doesn't visibly crash, but the next line in Console is:

16:08:00.272170-0500	AMFI: Denying core dump for pid 4304 (Google Chrome He)

I've found a fix for these crashes at https://github.com/obsproject/obs-studio/issues/7287 -- or really a workaround for Apple's bug(s). There are still some kinks that need to be ironed out, but I think we'll soon be able to close this bug WORKSFORME.

This bug's crashes are fixed in OBS Studio 28.1, which is at the RC1 stage. So we should soon see their numbers start to go down.

See Also: → 1798503