Closed Bug 1523944 Opened 5 years ago Closed 5 years ago

nsTouchbar keeps XPConnect wrapper alive past CC shutdown

Categories

(Core :: Widget: Cocoa, defect)

Other Branch
x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox-esr60 --- unaffected
firefox65 --- unaffected
firefox66 + fixed
firefox67 --- fixed

People

(Reporter: kats, Assigned: spohl)

References

Details

(Keywords: regression, Whiteboard: [MemShrink])

Attachments

(1 file, 2 obsolete files)

macOS local debug build of recent m-c (sometime yesterday). Pretty much every shutdown of the browser produces this assertion failure:

Assertion failure: data, at /Users/kats/zspace/gecko-wr/xpcom/base/nsCycleCollector.cpp:3754
#01: nsXPCWrappedJS::Release()[/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0xdb6e0e]
#02: object_cxxDestructFromClass(objc_object*, objc_class*)[/usr/lib/libobjc.A.dylib +0x105d3]
#03: objc_destructInstance[/usr/lib/libobjc.A.dylib +0x92cc]
#04: object_dispose[/usr/lib/libobjc.A.dylib +0x925f]
#05: -[NSTouchBar dealloc][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0xade873]
#06: -[nsTouchBar dealloc][/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3c38b9f]
#07: _object_remove_assocations[/usr/lib/libobjc.A.dylib +0x9491]
#08: objc_destructInstance[/usr/lib/libobjc.A.dylib +0x92d9]
#09: object_dispose[/usr/lib/libobjc.A.dylib +0x925f]
#10: -[NSResponder dealloc][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x35bc3]
#11: -[NSWindow dealloc][/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit +0x28f83e]
#12: -[BaseWindow dealloc][/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x3c1325c]
#13: (anonymous namespace)::AutoreleasePoolPage::pop(void*)[/usr/lib/libobjc.A.dylib +0xa087]
#14: _CFAutoreleasePoolPop[/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation +0x44926]
#15: -[NSAutoreleasePool release][/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation +0x20aa6]
#16: ScopedXPCOMStartup::~ScopedXPCOMStartup()[/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x564a4bc]
#17: XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&)[/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x5654d15]
#18: XRE_main(int, char**, mozilla::BootstrapConfig const&)[/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/XUL +0x56554ed]
#19: main[/Users/kats/zspace/gecko-wr/obj-host-debug/dist/NightlyDebug.app/Contents/MacOS/firefox +0xfed]

Possibly related to the touchbar support added recently, based on the stack above.

Component: JavaScript: GC → XPCOM

This assertion means somebody is releasing a cycle collected object after we shut down the cycle collector.

We actually have some code already to deal with this (SuspectAfterShutdown) because it was a problem at some point, but I'm not sure how it is reachable. The code only deals with a partially shut down CC.

Peter's summary is more descriptive, so I'm copying it over here.

Component: XPCOM → Widget: Cocoa
Summary: Assertion failure in nsCycleCollector.cpp on shutdown → nsTouchbar keeps XPConnect wrapper alive past CC shutdown

I'm not sure how bad this could end up being in an opt build, but having stuff potentially poke at XPConnect that late in shutdown sounds questionable.

Flags: needinfo?(spohl.mozilla.bugs)
Keywords: regression

I'm not sure what the right way to address this is. Should we be listening to some kind of CC shutdown event and release the nsXPCWrappedJS (see bug 1523965 comment 0)? How is this handled elsewhere?

Flags: needinfo?(spohl.mozilla.bugs)
Blocks: 1313429
Flags: needinfo?(continuation)

Judging from the stack in the other bug, I think the problem is that MacAutoreleasePool is scoped to be destroyed after we've already shut down the CC. I think ideally it would get destroyed before we call nsCycleCollector_shutdown. Of course, it could be the case that things are being put into the pool after that point, but hopefully not.

Flags: needinfo?(continuation)

The MacAutoreleasePool in ~ScopedXPCOMStartup() is necessary to catch cocoa objects that are autoreleased during shutdown (see [1]). I think we will have to find a way to release nsTouchBar's mTouchBarHelper just before the cycle collector shuts down.

[1] https://hg.mozilla.org/mozilla-central/rev/a6d7a5677b4c#l4.31

Is using xpcom-shutdown notification good enough?

Or ClearOnShutdown?

Can we get a fix here? I ended up commenting out the setting of mTouchBarHelper just so I can debug some other leaks.

Flags: needinfo?(spohl.mozilla.bugs)
Attached patch Patch (obsolete) — Splinter Review

This seems to do the trick. My main concern is that we should apparently only use ClearOnShutdown on StaticRefPtr or StaticAutoPtr, while mTouchBarHelper is currently an nsCOMPtr [1].

[1] https://dxr.mozilla.org/mozilla-central/source/xpcom/base/ClearOnShutdown.h#31-36

Assignee: nobody → spohl.mozilla.bugs
Status: NEW → ASSIGNED
Flags: needinfo?(spohl.mozilla.bugs)
Attachment #9041824 - Flags: review?(mstange)
Comment on attachment 9041824 [details] [diff] [review]
Patch

Review of attachment 9041824 [details] [diff] [review]:
-----------------------------------------------------------------

It looks like every window (nsCocoaWindow) has its own nsTouchbar, which gets destroyed when the window is closed. But at that point, the ClearOnShutdown mechanism will still have a reference to that nsTouchbar's mTouchBarHelper. Won't that be a dangling pointer?

On the other hand, I don't really understand the leak - I assumed that all nsIWidgets (and thus nsCocoaWindows) have already been destroyed by the time CC shutdown happens.
Attachment #9041824 - Flags: review?(mstange) → review-

(In reply to Markus Stange [:mstange] from comment #15)

Comment on attachment 9041824 [details] [diff] [review]
Patch

Review of attachment 9041824 [details] [diff] [review]:

It looks like every window (nsCocoaWindow) has its own nsTouchbar, which
gets destroyed when the window is closed. But at that point, the
ClearOnShutdown mechanism will still have a reference to that nsTouchbar's
mTouchBarHelper. Won't that be a dangling pointer?

You're correct. Since we can't undo a call to ClearOnShutdown, this won't work.

On the other hand, I don't really understand the leak - I assumed that all
nsIWidgets (and thus nsCocoaWindows) have already been destroyed by the time
CC shutdown happens.

After tracing this, what appears to happen is that we release nsCocoaWindows, but they get dealloc'd asynchronously by the OS when autorelease pools are drained. It looks like the cycle collector is shut down before the OS calls nsCocoaWindow's dealloc, which is where we release mTouchBar and its mTouchBarHelper, which is an nsXPCWrappedJS object requiring the cycle collector to be in place.

I'm going to try and add an observer for xpcom-shutdown. I can't seem to find an example where we do this in Obj-C code at the moment.

Attached patch Patch (wip) (obsolete) — Splinter Review

I believe something like this could work, but right now we don't seem to ever observe the xpcom-shutdown notification. I was able to manually confirm that if we did, we wouldn't be leaking the mTouchBarHelper object anymore. Am I missing something obvious why we never hit the xpcom-shutdown case in MacTouchBar.js?

Attachment #9041824 - Attachment is obsolete: true
Attachment #9042352 - Flags: feedback?(mstange)
Attachment #9042352 - Flags: feedback?(bugs)

Because "quit-application" calls destructor and removes all the observers?
I guess the clean up could be done in destructor() ?

Attachment #9042352 - Flags: feedback?(bugs)
Attached patch PatchSplinter Review

(In reply to Olli Pettay [:smaug] (massive needinfo queue, ping on IRC on anything urgent) from comment #18)

Because "quit-application" calls destructor and removes all the observers?
I guess the clean up could be done in destructor() ?

You're right, I had the sequence of notifications wrong and thought "xpcom-shutdown" came before "quit-application". However, I still couldn't get this to work reliably; more than half the time, mTouchBarHelper would still leak.

I have changed this patch to release mTouchBarHelper at the time that we destroy the native window instead. This works reliably and keeps the size of this patch to a minimum, in case we want to uplift.

Attachment #9042352 - Attachment is obsolete: true
Attachment #9042352 - Flags: feedback?(mstange)
Attachment #9043364 - Flags: review?(mstange)
Comment on attachment 9043364 [details] [diff] [review]
Patch

Review of attachment 9043364 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. In the future it would be nice to try to make the ChildView destruction synchronous with the nsChildView destruction, so that we don't have to coordinate as much.
Attachment #9043364 - Flags: review?(mstange) → review+
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

[Tracking Requested - why for this release]:

Regression from new feature (TouchBar support, bug 1313429), shipping in Firefox 66.

Comment on attachment 9043364 [details] [diff] [review]
Patch

Beta/Release Uplift Approval Request

Feature/Bug causing the regression

Bug 1313429

User impact if declined

Shutdown leak.

Is this code covered by automated tests?

Yes

Has the fix been verified in Nightly?

Yes

Needs manual test from QE?

No

If yes, steps to reproduce

List of other uplifts needed

none

Risk to taking this patch

Low

Why is the change risky/not risky? (and alternatives if risky)

This addresses a shutdown leak in a safe manner.

String changes made/needed

none

Attachment #9043364 - Flags: approval-mozilla-beta?
Comment on attachment 9043364 [details] [diff] [review]
Patch

Fix for shutdown leak, feature shipping in 66, let's uplift for beta 8.
Attachment #9043364 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
See Also: → 1542944
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: