Closed Bug 1720009 Opened 3 years ago Closed 2 years ago

Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, mostly on macOS 12.0.0, mostly on nb-no locale

Categories

(Core :: Widget: Cocoa, defect, P2)

Unspecified
macOS
defect

Tracking

RESOLVED WORKSFORME

People

(Reporter: smichaud, Unassigned)

References

(Blocks 2 open bugs)

Details

Crash Data

Blocks: 1711944
Crash Signature: [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ]

These have various mac_crash_info sections. Here's the most common:

    {
      "num_records": 1,
      "records": [
        {
          "message": "Assertion failed: (_iosConnectInitalize() unable to open IOSurface kernel service: e00002c7\n1020 existing clients:\n{\n}\n), function _iosConnectInitalize, file /System/Volumes/Data/SWE/macOS/BuildRoots/0ed32b12e4/Library/Caches/com.apple.xbs/Sources/IOSurface/IOSurface-302.7/IOSurfaceUser/IOSurfaceClient.m, line 407.\n",
          "module": "/usr/lib/system/libsystem_c.dylib"
        }
      ]
    }

On macOS 10.15.7 at least, this error message is written by _iosConnectInitalize.cold.3() in the IOSurface framework, on the failure of a call to IOServiceOpen() (https://developer.apple.com/documentation/iokit/1514515-ioserviceopen?language=objc), as that framework is being initialized.
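When triaging reports like this one, it can help to pull the error code and the client count out of the assertion message rather than reading each record by eye. The following is a minimal sketch, not part of any Mozilla tooling: the regular expression and the helper name `iosurface_open_failures` are assumptions made for illustration, and the pattern only targets the message shape quoted above.

```python
import re

# The mac_crash_info record quoted above, written as a Python dict.
# The message is split across string literals only for readability.
MAC_CRASH_INFO = {
    "num_records": 1,
    "records": [
        {
            "message": (
                "Assertion failed: (_iosConnectInitalize() unable to open "
                "IOSurface kernel service: e00002c7\n1020 existing clients:\n{\n}\n), "
                "function _iosConnectInitalize, file /System/Volumes/Data/SWE/macOS/"
                "BuildRoots/0ed32b12e4/Library/Caches/com.apple.xbs/Sources/IOSurface/"
                "IOSurface-302.7/IOSurfaceUser/IOSurfaceClient.m, line 407.\n"
            ),
            "module": "/usr/lib/system/libsystem_c.dylib",
        }
    ],
}

# Hypothetical pattern for this one message shape; other macOS builds
# may word the assertion differently.
IOSURFACE_FAILURE = re.compile(
    r"unable to open IOSurface kernel service: (?P<code>[0-9a-fA-F]+)\s+"
    r"(?P<clients>\d+) existing clients"
)


def iosurface_open_failures(crash_info):
    """Yield (error_code, client_count) for each record that matches."""
    for record in crash_info.get("records", []):
        match = IOSURFACE_FAILURE.search(record.get("message", ""))
        if match:
            yield int(match["code"], 16), int(match["clients"])


for code, clients in iosurface_open_failures(MAC_CRASH_INFO):
    print(f"IOServiceOpen failed with 0x{code:08x}; {clients} existing clients")
```

Records whose message does not match are skipped, which is useful here because some reports sharing these signatures carry unrelated mac_crash_info sections.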

There are no crashes with iosConnectInitalize.cold in the proto signature, so these crashes are definitely new with macOS 12.0.0 build 21A5268h.

It's conceivable, but not very likely, that this is a sandbox bug.

I'm CCing you, Haik, because I know you used to work on Mac sandbox things. Please pass the NI along if you're no longer doing this.

Flags: needinfo?(haftandilian)

The error number 0xe00002c7 is an IOKit error defined as kIOReturnUnsupported.
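For reference, the value breaks down along the standard Mach error layout (6 bits of system, 12 bits of subsystem, 14 bits of code), with IOKit using system 0x38 and the common subsystem 0. A minimal sketch of that decoding follows; the helper names are made up for this example, and the lookup table deliberately holds only the one constant mentioned in this bug.

```python
SYS_IOKIT = 0x38        # err_system(0x38) in IOKit's IOReturn.h
SUB_IOKIT_COMMON = 0x0  # err_sub(0)

# Only the constant discussed in this bug; extend from IOReturn.h as needed.
IOKIT_COMMON_NAMES = {0x2C7: "kIOReturnUnsupported"}


def decode_mach_error(err):
    """Split a 32-bit Mach error value into (system, subsystem, code)."""
    err &= 0xFFFFFFFF
    return (err >> 26) & 0x3F, (err >> 14) & 0xFFF, err & 0x3FFF


def name_iokit_error(err):
    """Return a known IOKit common error name, or None if not recognized."""
    system, subsystem, code = decode_mach_error(err)
    if system == SYS_IOKIT and subsystem == SUB_IOKIT_COMMON:
        return IOKIT_COMMON_NAMES.get(code)
    return None


system, subsystem, code = decode_mach_error(0xE00002C7)
print(hex(system), hex(subsystem), hex(code), name_iokit_error(0xE00002C7))
```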

Blocks: monterey
Flags: needinfo?(haftandilian)
Flags: needinfo?(haftandilian)

I can't reproduce this locally so far with Beta 21A5268h on a 6-core MacBook Pro (15-inch, 2018). I noticed that all of the crash reports come from a 16-CPU, 64 GB system. I think we just have to keep an eye on this for now. Given what you found regarding iosConnectInitalize.cold, this seems like a macOS bug. To debug further, we could try disassembling the IOService functions to look for an interaction with sandboxing.

Flags: needinfo?(haftandilian)
Crash Signature: [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] → [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ]
Crash Signature: [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ] → [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ] [@ __pthread_kill | abort | _iosConnectInitalize ]

I found one of this bug's crash reports on macOS 10.13.6 (bp-c87c7b64-c93c-4fff-b51e-68f140210713), so its crash stack is symbolicated:

    0  libsystem_kernel.dylib  __pthread_kill   context
    1  libsystem_c.dylib  abort   frame_pointer
    2  libsystem_c.dylib  __assert_rtn   frame_pointer
    3  IOSurface  _iosConnectInitalize   frame_pointer
    4  libsystem_pthread.dylib  __pthread_once_handler   frame_pointer
    5  libsystem_platform.dylib  _os_once   frame_pointer
    6  libsystem_pthread.dylib  pthread_once   frame_pointer
    7  IOSurface  IOSurfaceClientGetPropertyMaximum   frame_pointer
    8  CoreImage  __iosurface_limits_block_invoke   frame_pointer
    9  libdispatch.dylib  _dispatch_client_callout   frame_pointer
    10  libdispatch.dylib  dispatch_once_f   frame_pointer
    ...

Some of the crash reports with signatures [@ libsystem_kernel.dylib@0x71c2 ] and [@ libsystem_kernel.dylib@0x7db6 ] don't belong to this bug -- they have different mac_crash_info sections.

Summary: Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, on macOS 12.0.0 → Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, mostly on macOS 12.0.0
Crash Signature: [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ] [@ __pthread_kill | abort | _iosConnectInitalize ] → [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ] [@ __pthread_kill | abort | _iosConnectInitalize ] [@ abort | _iosConnectInitalize.cold.3 ]
Crash Signature: [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ] [@ __pthread_kill | abort | _iosConnectInitalize ] [@ abort | _iosConnectInitalize.cold.3 ] → [@ libsystem_kernel.dylib@0xb6a ] [@ libsystem_kernel.dylib@0x71c2 ] [@ libsystem_kernel.dylib@0x7db6 ] [@ libsystem_kernel.dylib@0x72a2 ] [@ libsystem_kernel.dylib@0xc4a ] [@ __pthread_kill | abort | _iosConnectInitalize ] [@ abort | _iosConnectIni…
Summary: Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, mostly on macOS 12.0.0 → Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, mostly on macOS 12.0.0, mostly in nb-no locale
Summary: Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, mostly on macOS 12.0.0, mostly in nb-no locale → Startup crashes with "unable to open IOSurface kernel service" in mac_crash_info, mostly on macOS 12.0.0, mostly on nb-no locale
Severity: -- → S2
Priority: -- → P2
Crash Signature: _iosConnectInitalize.cold.3 ] → _iosConnectInitalize.cold.3 ] [@ __pthread_kill | pthread_kill | abort | _iosConnectInitalize.cold.4 ]
Crash Signature: _iosConnectInitalize.cold.3 ] [@ __pthread_kill | pthread_kill | abort | _iosConnectInitalize.cold.4 ] → _iosConnectInitalize.cold.3 ] [@ __pthread_kill | pthread_kill | abort | _iosConnectInitalize.cold.4 ] [@ pthread_kill | abort | _iosConnectInitalize.cold.4 ]

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME

Crashes are actually still happening that match this bug's description. But their signatures are somewhat different, they no longer happen mostly on macOS 12, and they no longer happen on the nb-no locale. So it's likely now a different bug.

Edit: Also, these are now almost entirely content process crashes.

https://crash-stats.mozilla.org/search/?mac_crash_info=~unable%20to%20open%20IOSurface%20kernel&platform=Mac%20OS%20X&date=%3E%3D2022-01-06T03%3A58%3A00.000Z&date=%3C2022-04-06T03%3A58%3A00.000Z&_facets=signature&_facets=platform_version&_facets=useragent_locale&_facets=process_type&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-useragent_locale
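The query above is easier to read once it is split into its parts. The sketch below uses only Python's standard `urllib.parse` to separate the search filters from the display options (the parameters starting with an underscore); the helper name `split_search_params` is an assumption for this example, and the reading of the leading `~` as a "contains" match is my understanding of the search syntax rather than something stated in this bug.

```python
from urllib.parse import parse_qs, urlsplit

# The search URL from the comment above, split across literals for readability.
SEARCH_URL = (
    "https://crash-stats.mozilla.org/search/"
    "?mac_crash_info=~unable%20to%20open%20IOSurface%20kernel"
    "&platform=Mac%20OS%20X"
    "&date=%3E%3D2022-01-06T03%3A58%3A00.000Z"
    "&date=%3C2022-04-06T03%3A58%3A00.000Z"
    "&_facets=signature&_facets=platform_version"
    "&_facets=useragent_locale&_facets=process_type"
    "&_sort=-date"
    "&_columns=date&_columns=signature&_columns=product"
    "&_columns=version&_columns=build_id&_columns=platform"
    "#facet-useragent_locale"
)


def split_search_params(url):
    """Return (filters, options); options are the keys starting with '_'."""
    params = parse_qs(urlsplit(url).query)
    filters = {k: v for k, v in params.items() if not k.startswith("_")}
    options = {k: v for k, v in params.items() if k.startswith("_")}
    return filters, options


filters, options = split_search_params(SEARCH_URL)
for name, values in filters.items():
    print(f"filter  {name}: {values}")
for name, values in options.items():
    print(f"option  {name}: {values}")
```

Decoded this way, the query covers a three-month window and facets on signature, platform version, locale, and process type, which are the dimensions compared in the comment above.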
