Closed Bug 1644057 Opened 4 years ago Closed 4 years ago

Figure out how to avoid bug 1596826

Categories

(Core :: Widget: Cocoa, enhancement)

RESOLVED FIXED

People

(Reporter: jrmuizel, Assigned: jrmuizel)

Details

No description provided.
Assignee: nobody → jmuizelaar

So I was able to trigger this crash more often by expanding the customizeui test suite. It also looks like it's specifically https://bugzilla.mozilla.org/attachment.cgi?id=9105894 that causes the crashes to start happening.

I took a closer look at where the most common crash in [NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] is happening.

Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Crash address: 0xffffffffe5e5e5f8
Process uptime: 744 seconds

Thread 0 (crashed)
 0  libobjc.A.dylib!objc_msgSend + 0x1d
    rax = 0x00007fff429e6503   rdx = 0x0000000000000002
    rcx = 0x00000001265ad430   rbx = 0x00000001265ad430
    rsi = 0x00007fff40ac4929   rdi = 0x00000001265ad430
    rbp = 0x00007ffee90f4c40   rsp = 0x00007ffee90f4c28
     r8 = 0x000000012350faa0    r9 = 0x0000000106c00110
    r10 = 0x000065e5e5e5e5e0   r11 = 0x00007fff40ac4929
    r12 = 0x0000000000000003   r13 = 0x0000000000000004
    r14 = 0x0000000000000000   r15 = 0x00007fff40b1c77d
    rip = 0x00007fff6d00a69d
    Found by: given as instruction pointer in context
 1  AppKit!-[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] + 0xbed
    rbp = 0x00007ffee90f50d0   rsp = 0x00007ffee90f4c50
    rip = 0x00007fff4004963b
    Found by: previous frame's frame pointer
 2  AppKit!-[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] + 0x4c9
    rbp = 0x00007ffee90f5560   rsp = 0x00007ffee90f50e0
    rip = 0x00007fff40048f17
    Found by: previous frame's frame pointer
 3  AppKit!-[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] + 0x4c9
    rbp = 0x00007ffee90f59f0   rsp = 0x00007ffee90f5570
    rip = 0x00007fff40048f17
    Found by: previous frame's frame pointer
 4  AppKit!-[NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] + 0x4c9
    rbp = 0x00007ffee90f5e80   rsp = 0x00007ffee90f5a00
    rip = 0x00007fff40048f17

The crash happens in a call to _objc_enumerationMutation, which tail-calls the enumeration mutation handler, most likely ___NSFastEnumerationMutationHandler. That handler calls _objc_msgSend early on, but it's very weird that ___NSFastEnumerationMutationHandler itself does not appear on the stack.

Some more information on what we're enumerating:
We call "sublayers" and then call "countByEnumeratingWithState:objects:count:" on the result to get a count. If the count != 0, we seem to enumerate the result, and it is during that enumeration that we run into the mutation problem.
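
As a rough illustration of the mutation check described above, here is a minimal Python sketch. It is an analogy, not AppKit code: `CountedList`, `MutationError`, and `enumerate_checked` are hypothetical names, and the sketch only assumes that the enumeration snapshots a mutation counter up front and compares it before handing out each item, which is the point where compiled for-in code would call objc_enumerationMutation.

```python
class MutationError(RuntimeError):
    """Raised when a collection changes while it is being enumerated."""


class CountedList:
    """Hypothetical stand-in for a mutable array with a mutation counter."""

    def __init__(self, items=()):
        self._items = list(items)
        self.mutations = 0  # bumped on every structural change

    def append(self, item):
        self._items.append(item)
        self.mutations += 1

    def enumerate_checked(self):
        """Yield items, re-checking the mutation counter before each one."""
        start = self.mutations  # snapshot taken when enumeration begins
        index = 0
        while index < len(self._items):
            if self.mutations != start:
                # Analogous to the call into the enumeration mutation handler.
                raise MutationError("collection was mutated while being enumerated")
            yield self._items[index]
            index += 1


def demo():
    """Mutate the list mid-enumeration and report what happened."""
    layers = CountedList(["a", "b", "c"])
    seen = []
    try:
        for layer in layers.enumerate_checked():
            seen.append(layer)
            if layer == "a":
                layers.append("d")  # simulates the other thread's mutation
    except MutationError:
        return seen, True
    return seen, False
```

In this sketch the check turns the mutation into a clean error. In the real crash the handler appears to touch already-invalid memory instead, which is why no exception is seen.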

The other crash in [NSView buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:] comes from calling "count" on the return value of the recursive call to "buildLayerTreeWithOwnLayerRequirement:someAncestorWantsLayer:".

I'll try setting my own enumeration mutation handler and see if I can make the crash show up there instead.

Great find! __NSFastEnumerationMutationHandler should be throwing an exception, not crashing...

But on to the actual problem! We definitely mutate the sublayers array on the compositor thread. The crashing code runs on the main thread, and I don't see anything that synchronizes between the two. This is bound to cause problems.

There exists +[CATransaction lock/unlock], but it usually only gets called inside the CALayer property getters / setters. Once the property value (e.g. sublayers) has been returned, the caller uses it outside the lock.
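
To make the lock-scope problem concrete, here is a small Python sketch. It is not Core Animation code: the `Layer` class and its methods are hypothetical, and the sketch assumes the getter hands back a live, shared list rather than a copy. `sublayers_snapshot` shows one possible mitigation (copying under the lock); it is not necessarily what the patch in bug 1644940 does.

```python
import threading


class Layer:
    """Hypothetical stand-in for a layer whose accessors take a lock."""

    def __init__(self, sublayers=()):
        self._lock = threading.Lock()
        self._sublayers = list(sublayers)

    def sublayers(self):
        # The lock is held only while the reference is read. The caller
        # receives the live list and then uses it with no lock held.
        with self._lock:
            return self._sublayers

    def add_sublayer(self, layer):
        with self._lock:
            self._sublayers.append(layer)

    def sublayers_snapshot(self):
        # Copying under the lock gives the caller a private list that
        # later mutations cannot touch.
        with self._lock:
            return list(self._sublayers)


def demo():
    root = Layer(["background"])
    live = root.sublayers()               # main thread: live reference
    snapshot = root.sublayers_snapshot()  # main thread: private copy

    # A second thread plays the role of the compositor thread.
    compositor = threading.Thread(target=root.add_sublayer, args=("tile",))
    compositor.start()
    compositor.join()

    return live, snapshot
```

After `demo()` returns, `live` has changed underneath its holder while `snapshot` has not. Anything enumerating `live` while the second thread ran would have been exposed to the mutation, even though every accessor took the lock.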

So I'm not sure how this is supposed to work, at all.

Can you check whether the patch in bug 1644940 helps?

The patch from bug 1644940 seems to fix the crash.

🎉🎉🎉

It would still be nice to find out why exceptions aren't happening. Maybe due to the same phenomenon as bug 1392431?

Confirmed that it was mutation problems: https://firefoxci.taskcluster-artifacts.net/WELrvI2UQZab9GiX8Lfnrw/0/public/logs/live_backing.log

Still don't understand the callstack though.

This is figured out enough.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED