Open Bug 1765391 Opened 2 years ago Updated 9 days ago

Crash in [@ nsMenuBarX::Paint]

Categories

(Core :: Widget: Cocoa, defect, P3)

ASSIGNED

People

(Reporter: mccr8, Assigned: spohl)

Details

(Keywords: crash, regression, Whiteboard: [tbird crash])

Attachments

(2 files)

Crash report: https://crash-stats.mozilla.org/report/index/606ea0f7-5d04-41ee-bb7c-a43880220419

MOZ_CRASH Reason: MOZ_CRASH(Encountered unexpected Objective C exception)

Top 10 frames of crashing thread:

0 XUL nsMenuBarX::Paint widget/cocoa/nsMenuBarX.mm:431
1 XUL -[WindowDelegate windowDidResignMain:] widget/cocoa/nsCocoaWindow.mm:2886
2 CoreFoundation __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ 
3 CoreFoundation ___CFXRegistrationPost_block_invoke 
4 CoreFoundation _CFXRegistrationPost 
5 CoreFoundation _CFXNotificationPost 
6 Foundation -[NSNotificationCenter postNotificationName:object:userInfo:] 
7 AppKit -[NSWindow _changeKeyAndMainLimitedOK:] 
8 AppKit -[NSWindow _makeKeyRegardlessOfVisibility] 
9 AppKit -[NSWindow makeKeyAndOrderFront:] 

I hit this crash in a build that I think has the patch from bug 1699936. My browser was lagging a lot while I was trying to switch "tabs" in crash stats, so I decided to restart the browser. While it was in the middle of shutting down, it hit this crash.

signature ranks #35 for Thunderbird 100 beta
bp-d7530b60-9a3f-4395-84a7-185290220421

Whiteboard: [tbird crash]

This should have started to drop off now that bug 1699936 was backed out. Wayne, is this so?

Flags: needinfo?(vseerror)
Regressed by: 1699936

Will have results in about two weeks, with stats from beta 101

Has Regression Range: --- → yes

FWIW, a Firefox 101.0b2 beta crash (May 3 build): bp-8716a618-6155-4634-9c5a-806e90220506

(In reply to Stephen A Pohl [:spohl] from comment #2)

This should have started to drop off now that bug 1699936 was backed out. Wayne, is this so?

The backout was April 25. Would that change be picked up in beta 101?

Overall, for beta, average crash rate

Flags: needinfo?(vseerror) → needinfo?(spohl.mozilla.bugs)

Then this couldn't have been caused by bug 1699936...

Flags: needinfo?(spohl.mozilla.bugs)
No longer regressed by: 1699936
See Also: 1699936

The volume is pretty low, could we decrease the severity?

Flags: needinfo?(spohl.mozilla.bugs)
Severity: S2 → S3
Flags: needinfo?(spohl.mozilla.bugs)
Priority: -- → P3

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 5 desktop browser crashes on Mac on beta (startup)

:spohl, could you consider increasing the severity of this top-crash bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(spohl.mozilla.bugs)

(In reply to Marco Castelluccio [:marco] from comment #7)

The volume is pretty low, could we decrease the severity?

(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #8)

:spohl, could you consider increasing the severity of this top-crash bug?

🤷‍♂️

Flags: needinfo?(spohl.mozilla.bugs)

A couple of crashes per build doesn't really seem like cause for alarm (comment 8 says this is top 5 for beta Mac startup crashes). Maybe there should be a lower threshold for crash volume for one of these alerts?

Flags: needinfo?(smujahid)

The release reports are throttled so the actual count is like 10x higher? And this is on a minor platform, which means that the chance per user of hitting this crash is proportionally greater? I understand we like S2 bugs to be gone from everyone's radar but I do think this is the kind of thing we actually want to look into...

(In reply to Andrew McCreight [:mccr8] from comment #10)

A couple of crashes per build doesn't really seem like cause for alarm (comment 8 says this is top 5 for beta Mac startup crashes). Maybe there should be a lower threshold for crash volume for one of these alerts?

Autonag implements the top crash criteria which in this specific case have the following thresholds: "If there's less than 5 crashes per week on a signature, that bug probably still doesn't qualify - same for crashes happening to only 2 or 3 installations."

Flags: needinfo?(smujahid)
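
The thresholds quoted in the comment above can be read as a simple exclusion check. The following is only a sketch of that quoted rule, not auto_nag's actual code: the function name, the parameter names, and the reading of "only 2 or 3 installations" as "3 or fewer" are all assumptions made for illustration.

```python
def may_qualify_as_topcrash(crashes_per_week: int, installations: int) -> bool:
    """Apply the two quoted exclusion thresholds to a crash signature.

    Assumptions (not taken from auto_nag itself):
    - "less than 5 crashes per week" is read as crashes_per_week < 5
    - "only 2 or 3 installations" is read as installations <= 3
    """
    if crashes_per_week < 5:
        return False  # too few crashes per week to qualify
    if installations <= 3:
        return False  # crashes are concentrated on a handful of installations
    return True  # not excluded; ranking criteria would still have to be met
```

Passing this check would not by itself make a signature a topcrash; it only means the quoted low-volume exclusions do not apply.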

https://support.mozilla.org/en-US/questions/1396794#answer-1549032 cites Thunderbird crash bp-32bd076c-494d-4eed-bd73-e2cba0221114

0 XUL nsMenuBarX::Paint()
1 XUL +[WindowDelegate paintMenubarForWindow:]
2 XUL -[WindowDelegate windowDidBecomeMain:]
3 CoreFoundation __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__
4 CoreFoundation ___CFXRegistrationPost_block_invoke
5 CoreFoundation _CFXRegistrationPost
6 CoreFoundation _CFXNotificationPost
7 Foundation -[NSNotificationCenter postNotificationName:object:userInfo:]
8 AppKit -[NSWindow _changeKeyAndMainLimitedOK:]
9 AppKit -[NSMenuWindowManagerWindow _restorePreviousKeyWindowFromSavedProperties]
10 AppKit -[NSMenuWindowManagerWindow _setVisible:]
11 AppKit -[NSWindow _doWindowWillBecomeHidden]
12 AppKit -[NSWindow _reallyDoOrderWindowOutRelativeTo:findKey:forCounter:force:isModal:]

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

For more information, please visit auto_nag documentation.

See Also: → 1799247

The Firefox crash rate over the past month has tripled compared to April.
The percentage of crashes with less than 1 minute of uptime has also increased, to 67% for the past month, up from 56% for April.

Flags: needinfo?(spohl.mozilla.bugs)

I am reworking our menu bar in bug 1808223 and this crash will hopefully be fixed at the same time. Marking as blocked by bug 1808223 for now.

Depends on: 1808223
Flags: needinfo?(spohl.mozilla.bugs)

New report: Thunderbird Support Forum: https://support.mozilla.org/en-US/questions/1428216

MAC OS X version 10.15.7 Catalina
Crash ID: bp-290856b9-84d7-4075-aada-dfe180231020

Info says the user is on version 102.15.1.
I'm going to suggest:

  • switch off auto compact
  • try a manual compact on one folder at a time
  • upgrade to 115.3.3, just to see whether the problem continues, but that's going to be the user's choice

The fix that ultimately landed in bug 1808223 will probably not make a difference in the crash rate here.

No longer depends on: 1808223
Assignee: nobody → spohl.mozilla.bugs
Status: NEW → ASSIGNED

I will be landing the diagnostic patch next week. Adding leave-open keyword.

Keywords: leave-open
Pushed by spohl@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d5c404ba6047
Add more exception handling to help isolate when the OS is throwing an NSGenericException when painting menu bars on macOS. r=mstange

The leave-open keyword is there and there has been no activity for 6 months.
:spohl, maybe it's time to close this bug?
For more information, please visit BugBot documentation.

Flags: needinfo?(spohl.mozilla.bugs)

The improved exception handling shows that this crash occurs when we're attempting to set NSApp.mainMenu. The exception, "NSGenericException: *** Collection <__NSArrayM: 0x140c82f10> was mutated while being enumerated." indicates that macOS might be in the process of enumerating the menu items when we're attempting to set a new mainMenu. Let's see if we can fix this with a simple mutex lock.

Flags: needinfo?(spohl.mozilla.bugs)
Keywords: leave-open
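
As a language-neutral analogy for the exception described above, the sketch below reproduces the same class of error in Python (a collection changing while it is being iterated) and shows what serializing access with a mutex could look like. It is not the AppKit or Gecko code: `menu_items`, `mutate_while_enumerating`, and `GuardedMenu` are hypothetical names, and a Python `dict` only stands in for the menu item array.

```python
import threading


def mutate_while_enumerating(items: dict) -> str:
    """Add a key to a dict while iterating over it and return the error text."""
    try:
        for name in items:
            items[name + " copy"] = 0  # changes the dict's size mid-iteration
    except RuntimeError as exc:
        return str(exc)
    return "no error"


class GuardedMenu:
    """Hypothetical wrapper that serializes enumeration and replacement."""

    def __init__(self, items: dict):
        self._items = dict(items)
        self._lock = threading.Lock()

    def enumerate_names(self) -> list:
        # Hold the lock for the whole enumeration so no replacement can interleave.
        with self._lock:
            return list(self._items)

    def replace(self, new_items: dict) -> None:
        # Hold the same lock while swapping in the new contents.
        with self._lock:
            self._items = dict(new_items)


menu_items = {"File": 1, "Edit": 2, "View": 3}  # hypothetical stand-in data
print(mutate_while_enumerating(dict(menu_items)))
```

A lock like this only helps when the enumeration and the mutation run on different threads. If both happen on the same thread through re-entrancy, a non-reentrant lock would deadlock and a reentrant one would not prevent the mutation.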

(In reply to Stephen A Pohl [:spohl] from comment #25)

The improved exception handling shows that this crash occurs when we're attempting to set NSApp.mainMenu. The exception, "NSGenericException: *** Collection <__NSArrayM: 0x140c82f10> was mutated while being enumerated." indicates that macOS might be in the process of enumerating the menu items when we're attempting to set a new mainMenu. Let's see if we can fix this with a simple mutex lock.

Which crash report did you see this in?

Is another thread enumerating, or is the same thread enumerating? The stacks in the report should show us.

Flags: needinfo?(spohl.mozilla.bugs)

(In reply to Markus Stange [:mstange] from comment #27)

(In reply to Stephen A Pohl [:spohl] from comment #25)

The improved exception handling shows that this crash occurs when we're attempting to set NSApp.mainMenu. The exception, "NSGenericException: *** Collection <__NSArrayM: 0x140c82f10> was mutated while being enumerated." indicates that macOS might be in the process of enumerating the menu items when we're attempting to set a new mainMenu. Let's see if we can fix this with a simple mutex lock.

Which crash report did you see this in?

Is another thread enumerating, or is the same thread enumerating? The stacks in the report should show us.

For example in: https://crash-stats.mozilla.org/report/index/beaf09ca-957f-476c-a21a-47f840240618

The top-most frame points to the try/abort block that only has one statement, which is setting NSApp.mainMenu.

Flags: needinfo?(spohl.mozilla.bugs)
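
The diagnostic idea discussed in this bug, giving individual statements their own exception-handling block so that the failing statement can be identified, can be sketched generically as follows. This is an illustration in Python rather than the Objective-C++ patch that landed; `run_isolated`, the step labels, and the simulated failure are hypothetical.

```python
def run_isolated(steps):
    """Run (label, callable) pairs in order, each in its own try block.

    Returns (label, exception) for the first step that raises, or None if
    every step completes. Because each step is wrapped separately, the label
    identifies exactly which statement threw.
    """
    for label, step in steps:
        try:
            step()
        except Exception as exc:
            return label, exc
    return None


def set_main_menu():
    # Simulated failure for illustration only; not a real AppKit call.
    raise RuntimeError("collection was mutated while being enumerated")


result = run_isolated([
    ("build menu", lambda: None),
    ("set main menu", set_main_menu),
])
print(result[0])
```

With one statement per block, a report that points at a given block leaves no ambiguity about which call raised.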