Open Bug 1541628 Opened 6 years ago Updated 1 year ago

[macOS] Forget button leaks the world

Categories

(Toolkit :: Data Sanitization, defect, P3)

Unspecified
macOS
defect

Tracking

()

Tracking Status
firefox68 - fix-optional

People

(Reporter: Gijs, Unassigned)

References

Details

(Keywords: memory-leak, Whiteboard: [MemShrink:P3])

Attachments

(1 file)

STR:

  1. open debug build on a clean profile
  2. open customize
  3. move forget button into toolbar
  4. exit customize mode
  5. forget the last 5 minutes using the toolbar
  6. dismiss the 'thanks' popup
  7. quit browser

ER:
no leaks

AR:
leaks all the things:

WARNING: YOU ARE LEAKING THE WORLD (at least one JSRuntime and everything alive inside it, that is) AT JS_ShutDown TIME. FIX THIS!

I expect something keeps a reference to the browser windows we're closing, but I'm really not sure.

(I tested on current nightly + macos, no idea if the platform makes a difference.)

That doesn't sound great, would be nice if someone could take a look at this. Adding it high on my (long) list of bugs for 68...

Priority: -- → P1

Andrew, any chance you could help here? I'm still lost on how to use CC/GC logs to investigate this type of thing and/or how to see the specific bits of tree we care about for the forest of edges (to, um, invert and abuse the metaphor...).

Flags: needinfo?(continuation)

What exactly is "all the things"? It'll show you if you run with XPCOM_MEM_BLOAT_LOG=1.

Do you also get a leak if you run a mochitest? Some people (including myself) get leaks all of the time on OSX.

The steps you can use to investigate leaks are in here:
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/DMD/Heap_Scan_Mode

You just can ignore everything that mentions DMD.

Flags: needinfo?(continuation)
Attached file Leaked things

A lot leaks. I expect it's just the entire window leaking and everything it holds references to...

(In reply to :Gijs (he/him) from comment #5)

Created attachment 9056094 [details]
Leaked things

A lot leaks. I expect it's just the entire window leaking and everything it holds references to...

Except the hidden window also shows up as leaking...

I looked at the docs you linked... in the 4th log (which I assume is the last one), looking for browser.xul, I get:

$ python ../heapgraph/find_roots.py cc*-4.log 0x12d137c00
Parsing cc-edges.2590-4.log. Done loading graph. 

0x12dc90d30 [FragmentOrElement (XUL) menu id='historyUndoMenu' chrome://browser/content/browser.xul]
    --[Preserved wrapper]--> 0xc94b0a1a700 [JS Object (XULMenuElement)]
    --[group_global]--> 0x1c401ef8b240 [JS Object (Window)]
    --[UnwrapDOMObject(obj)]--> 0x12d137c00 [nsGlobalWindowInner # 4 inner chrome://browser/content/browser.xul]

    Root 0x12dc90d30 is a ref counted object with 2 unknown edge(s).
    known edges:
       0x12dc90dc0 [FragmentOrElement (XUL) menupopup id='historyUndoPopup' chrome://browser/content/browser.xul] --[GetParent()]--> 0x12dc90d30
       0xc94b0a1a700 [JS Object (XULMenuElement)] --[UnwrapDOMObject(obj)]--> 0x12dc90d30
       0x12dc90ca0 [FragmentOrElement (XUL) menuitem id='hiddenTabsMenu' chrome://browser/content/browser.xul] --[mNextSibling]--> 0x12dc90d30

(For the hidden window, the last mention is in -2.log, and the both -3 and -4 don't mention either hiddenWindow.xul or its address, based on grep.)

Is this basically saying the history undo menupopup are holding a reference to the window? You told me to ignore the bits about DMD, so I'm a bit lost as to how to go from there - it's not immediately obvious to me why this would be the case... If I had to guess, some JSM is holding onto that menuitem, but I can't seem to figure out how to find out which jsm that'd be...

(In reply to Andrew McCreight [:mccr8] from comment #4)

Do you also get a leak if you run a mochitest? Some people (including myself) get leaks all of the time on OSX.

I've seen some leaks before. I don't normally think much of them if infra isn't seeing the same thing when running a test. In this case, I'm aware there are no automated tests for the forget button at all, because mochitest hates you closing the initial window (which the forget button will cause to happen), so back when I wrote the code for the forget button, there wasn't a way to test it. I guess these days perhaps we could write a marionette test...

Anyway, if I just ./mach run and close the browser after waiting a bit, I don't see leaks. Nor if I ./mach run, open another window, close the first window myself, and then quit the browser. So something the forget button is doing certainly seems "off"...

I don't normally run debug builds, so I don't normally see leaks - I was forced to to investigate some other work I was doing that was hitting debug-only assertions... and then I reviewed a patch for the forget button in bug 1495861 and here we are. :-)

Flags: needinfo?(continuation)

What the log means is that the C++ historyUndoMenu object is leaking, and it holds alive the browser.xul because that is its global. There are two C++ references to the object that the CC doesn't know about. It can't be anything directly through JS, because the cycle collector is able to see through JS.

Is there any C++ code that you can think of that might be involved here? I found nsMenuPopupFrame which could be having issues. I can try to reproduce it myself in a DMD build, and that should give more information about what is holding onto the historyUndoMenu object.

Flags: needinfo?(continuation)

(In reply to Andrew McCreight [:mccr8] from comment #7)

What the log means is that the C++ historyUndoMenu object is leaking, and it holds alive the browser.xul because that is its global. There are two C++ references to the object that the CC doesn't know about. It can't be anything directly through JS, because the cycle collector is able to see through JS.

Is there any C++ code that you can think of that might be involved here? I found nsMenuPopupFrame which could be having issues.

I think in the history menu in general there might be places code, but the 'undo close tab' and 'undo close window' (sub)menus I don't think involve C++ code for this. Marco, can you confirm?

I can try to reproduce it myself in a DMD build, and that should give more information about what is holding onto the historyUndoMenu object.

That would be great, thank you!

Flags: needinfo?(mak77)

The HistoryMenu object in browser-places.js has a reference to historyUndoMenu and references to an nsINavHistoryResult, that is in cpp. Results participate in cycle collection but I can't exclude bugs. That said HistoryMenu derives from PlacesMenu that on "unload" invokes PlacesViewBase uninit() that should basically detach the cpp result from the view.

The same HistoryMenu object has handles to the last closed Tabs/Windows menus, but at first glance they don't seem to use persistent cpp modules

Flags: needinfo?(mak77)

I looked at this a little. I can reproduce it on OSX, but unfortunately not on Linux.

See Also: → 1544002

One thing holding alive the window in my local run was a FragmentOrElement (XUL) menu id='view-menu' for chrome://browser/content/browser.xul. It appears that the two missing references are an nsMenuBarX and a nsMenuItemIconX. The latter is referred to by the former. The nsMenuBarX reference is at an offset of 8, which I think it means that it is nsMenuObjectX::mContent.

The comment on nsMenuBarX says "Once instantiated, this object lives until its DOM node or its parent window is destroyed." I don't see anything that clears mContent, so I'm not sure off hand what is supposed to happen that is not.

Stephen, do you know how these nsMenuBarX are supposed to go away or what might be going wrong here? Thanks.

Flags: needinfo?(spohl.mozilla.bugs)
Whiteboard: [MemShrink]

(In reply to Andrew McCreight [:mccr8] from comment #11)

Stephen, do you know how these nsMenuBarX are supposed to go away or what might be going wrong here? Thanks.

Unfortunately, I have no idea. I would like to help but it doesn't look like I'll be able to get to this anytime soon.

Flags: needinfo?(spohl.mozilla.bugs)

(In reply to Andrew McCreight [:mccr8] from comment #11)

The comment on nsMenuBarX says "Once instantiated, this object lives until its DOM node or its parent window is destroyed." I don't see anything that clears mContent, so I'm not sure off hand what is supposed to happen that is not.

Dumb question, but:

(In reply to :Gijs (he/him) from comment #6)

Anyway, if I just ./mach run and close the browser after waiting a bit, I don't see leaks. Nor if I ./mach run, open another window, close the first window myself, and then quit the browser. So something the forget button is doing certainly seems "off"...

Perhaps this can help figuring out how it "normally" gets cleared?

Whiteboard: [MemShrink] → [MemShrink:P3]

Sounds like we don't yet know when this regressed and it's not actively worked on, calling this fix-optional for 68.

OS: Unspecified → macOS
Priority: P1 → P3
Summary: Forget button leaks the world → [macOS] Forget button leaks the world
Severity: normal → S3

I hoped that our mac native menu work would have fixed this, but this still happens. Markus, with the recent work on native menus, do you think we have a better understanding of the lifetime of the menubar and its menus at this point that might help here?

I'm additionally also seeing:

Hit MOZ_CRASH(mozilla::LinkedList<nsSHistory>::~LinkedList() [T = nsSHistory] has a buggy user: it should have removed all this list's elements before the list's destruction) at /builds/worker/workspace/obj-build/dist/include/mozilla/LinkedList.h:444
#01: mozilla::LinkedList<nsSHistory>::~LinkedList()[objdir/dist/NightlyDebug.app/Contents/MacOS/XUL +0x5a21819]
#02: __cxa_finalize_ranges[/usr/lib/system/libsystem_c.dylib +0x5ad25]
#03: exit[/usr/lib/system/libsystem_c.dylib +0x5b010]

I actually don't really understand that, because AFAICT from https://searchfox.org/mozilla-central/search?q=gSHistoryList&redirect=false nothing ever clears that list out so I'm in the "how did this ever work" stage of looking at this... Niklas, can you comment?

Flags: needinfo?(ngogge)
Flags: needinfo?(mstange.moz)

I think i have seen seen this LinkedList<nsSHistory> error before when writing the test in D121872. For me it happened because I had forgotten to unregister the console listener at the end of the test. This is obviously not the same for this bug.

AFAICT elements of gSHistoryList can remove themselves from the list (See https://searchfox.org/mozilla-central/rev/e3a7a72713e1ba8696cb2af55e928059bc822572/mfbt/LinkedList.h#241, https://searchfox.org/mozilla-central/rev/e3a7a72713e1ba8696cb2af55e928059bc822572/mfbt/LinkedList.h#200). For an instance of nsSHistory that happens when it destructs. So if one sees this error it probably means something kept a reference to an instance of nsSHistory.

Flags: needinfo?(n.goeggi)

(In reply to :Gijs (he/him) from comment #15)

Markus, with the recent work on native menus, do you think we have a better understanding of the lifetime of the menubar and its menus at this point that might help here?

Sorry, I let this needinfo slide for so long that I've forgotten anything that I may have learned about the menubar's lifetime again...

Flags: needinfo?(mstange.moz)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: