Closed Bug 471126 Opened 17 years ago Closed 17 years ago

leak content nodes (and sometimes dom windows) after clicking on nytimes.com articles

Categories

(Core :: DOM: Core & HTML, defect, P1)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla1.9.1b3

People

(Reporter: dbaron, Assigned: peterv)

References

Details

(Keywords: fixed1.9.1)

Attachments

(1 file, 2 obsolete files)

I'm seeing a leak of content nodes when clicking on an article on nytimes.com. (Actually, in my own tree, I'm seeing DOM windows leaked too, but in the simple steps to reproduce that I just figured out, I'm seeing only content nodes.) Steps to reproduce: 1. NSPR_LOG_MODULES=DOMLeak:5,DocumentLeak:5,nsDocShellLeak:5,NodeInfoManagerLeak:5 NSPR_LOG_FILE=nspr.log ./firefox -profile testing-profile-directory -no-remote http://www.nytimes.com/2008/12/25/world/asia/25afghan.html?hp 2. click on the word "rehabilitation in the picture caption" 3. press Ctrl-W to close the window 4. run leak-gauge.pl over the resulting nspr.log Expected results: no leaks Actual results: leaked content nodes This regressed between Linux nightlies 2008-12-03-02-mozilla-central and 2008-12-04-02-mozilla-central. (And it's still present in 2008-12-25-02-mozilla-central.)
Flags: blocking1.9.1?
(The same steps to reproduce also lead to windows leaked in my DEBUG build with my own patches in it; I'm not sure why there's a difference, though.)
The regression range (using the changesets from the application.ini files of the builds) is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=150dbe7b6fd7&tochange=169e8eb751a7
0x13AF93D0 [nsGlobalWindow] 0x00C8B400 [nsDocument ([none]) http://www.nytimes.com/2008/12/25/world/asia/25afgh [2/3]] --[mCachedRootContent]-> 0x15F11E10 [nsGenericElement ([none]) html] --[mAttrsAndChildren[i]]-> 0x162099E0 [nsGenericElement ([none]) body] --[mAttrsAndChildren[i]]-> 0x0EA3E5A0 [nsGenericElement ([none]) script] --[[via hash] mListenerManager]-> 0x16F705B0 [nsEventListenerManager] --[mListeners[i] mListener]-> 0x0EA42E00 [nsJSEventListener] --[mContext]-> 0x13AF89E0 [nsJSContext] --[mGlobalWrapperRef]-> 0x13AF73B0 [XPCWrappedNative (Window)] --[mIdentity]-> 0x13AF93D0 [nsGlobalWindow] known edges: 0x00D3BE04 --[mTargetDocument]--> 0x00C8B400 0x15F16E00 --[mIdentity]--> 0x00C8B400 Missing an edge to the document (2/3 found).
(I tested that it's a regression from that changeset by reverting it locally.) That said, it seems possible that the problem could be with a range not getting traversed to rather than being an actual bug in that changeset.
Do we leak nsXMLFragmentContentSink.cpp? That doesn't seem to have proper unlink. I couldn't see any leaks using XPCOM_MEM_LEAK_LOG
Or do we leak any fragmentcontentsinks? nsHTMLDocument doesn't unlink those properly.
Seems like we do leak HTMLContentSink.
Assignee: nobody → Olli.Pettay
Er, HTMLFragmentContentSink
Attached patch v1 (obsolete) — Splinter Review
This fixes the leak for me. It's a bit scary because it adds a real cycle between nsTypedSelection and nsFrameSelection, it should be mostly ok though. We do need to hook up nsTypedSelection because content can get at it (window.getSelection), but it's hard to keep the current special refcounting *and* use the cycle collector.
Assignee: Olli.Pettay → peterv
Comment on attachment 354551 [details] [diff] [review] v1 This crashes sometimes while unlinking.
Attachment #354551 - Attachment is obsolete: true
Attached patch v1.1 (obsolete) — Splinter Review
The crashes were because DocumentViewerImpl::SetDOMDocument destroyed the presshell without removing mSelectionListener. The listener holds a raw (weak) pointer to the DocumentViewerImpl, and during unlinking the selection called that listener which had a dangling pointer. Maybe we should make the DocumentViewerImpl null that pointer when it gets destroyed at some point. I've added null checks to nsFrameSelection and nsTypedSelection to make unlinking safe (also removed some that looked redundant).
Attachment #356002 - Flags: superreview?(bzbarsky)
Attachment #356002 - Flags: review?(bent.mozilla)
Attachment #356002 - Flags: superreview?(bzbarsky) → superreview+
Attachment #356002 - Flags: review?(bent.mozilla) → review+
Attached patch v1.2Splinter Review
Also need the chunk in nsComposerCommandsUpdater.cpp (only change vs attachment 356002 [details] [diff] [review]). nsComposerCommandsUpdater holds a weak reference to the window in a raw pointer. We seem to be releasing a window sooner now, before the editor (which holds the nsComposerCommandsUpdater). As a result we have a stale pointer in nsComposerCommandsUpdater's mWindow, which we try to use when a timer fires. Switching mWindow to a real nsWeakPtr fixes it. I haven't been able to narrow down which exact test triggers this, but it gets triggered when running all Mochitests.
Attachment #356002 - Attachment is obsolete: true
Attachment #356724 - Flags: superreview?(bzbarsky)
Attachment #356724 - Flags: review?(bent.mozilla)
Attachment #356724 - Flags: superreview?(bzbarsky) → superreview+
Plusing, and making this a P1 since this has shown to be a somewhat risky change, so we need the beta3 testing on this one.
Flags: blocking1.9.1? → blocking1.9.1+
Priority: -- → P1
Attachment #356724 - Flags: review?(bent.mozilla) → review+
Comment on attachment 356724 [details] [diff] [review] v1.2 + if (!domWindow) return PR_TRUE; I know it's not your fault, but split that to two lines? r=me either way.
I checked this in but it turned the tree orange a bit randomly, crashing while running crashtest. I can somewhat reproduce, and I've narrowed it down to a specific crashtest (458637-1.html, which uses an iframe and XSLT). I can reproduce while running in gdb, but setting breakpoints make the crash go away, so debugging is painful. What seems to be happening is that we end up reentering DocumentViewerImpl::InitPresentationStuff. We end up in InitPresentationStuff because we report an XSLT error. InitPresentationStuff does its stuff, until it calls |mViewManager->EnableRefresh(NS_VMREFRESH_IMMEDIATE);|. That somehow ends up calling DocumentViewerImpl::Show which calls DocumentViewerImpl::InitPresentationStuff again. The end result is that we add the selection listener twice from InitPresentationStuff. When the document viewer is destroyed we only remove the selection listener once. One selection listener remains and it has a stale pointer to the now dead document viewer. I can fix this in the listener, but it somehow seems wrong to reenter InitPresentationStuff. Bz: should we protect against that reentering DocumentViewerImpl::InitPresentationStuff? What about ending up in DocumentViewerImpl::Show from DocumentViewerImpl::InitPresentationStuff, that seems wrong too, no?
Status: NEW → ASSIGNED
Depends on: 473773
I've moved the reentering InitPresentationStuff problem into bug 473773, it's a different problem but it does block the fix for this bug.
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.2a1
Target Milestone: mozilla1.9.2a1 → mozilla1.9.1b3
Keywords: fixed1.9.1
Comment on attachment 356724 [details] [diff] [review] v1.2 >+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsTypedSelection) >+ tmp->RemoveAllRanges(); >+ NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(mFrameSelection) Is this safe to do during GC? It notifies the selection listeners, which include JS listeners.
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: