Closed Bug 743178 Opened 12 years ago Closed 12 years ago

Some eventlistener keeps documents alive

Categories

(Core :: Networking, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
Tracking Status
firefox13 --- verified
firefox14 + verified

People

(Reporter: smaug, Assigned: smaug)

References

Details

(Keywords: regression, Whiteboard: [Snappy:P2][Memshrink])

Attachments

(3 files, 1 obsolete file)

I started to get this yesterday. Some WrappedJS (nsIDOMEventListener) keeps documents
occasionally alive and in CC graph.
I also had this.  It didn't show up right away.  For me, they were associate with various Facebook things.  I have disabled most of my addons to see if I still get it.  MemChaser released a new version in the last day or two, which I assume we both use, and I guess it is possible they are forgetting to remove a document listener somewhere.  Does that sound at all plausible, whimboo?
OS: Linux → All
Hardware: x86_64 → All
Version: unspecified → Trunk
For me, I had a zombie compartment for Facebook, as well as the zombie document.
this may account for the heavy session zombies that i get after a few days.
Whiteboard: [Snappy], [Memshrink] → [Snappy][Memshrink]
MemChaser 0.3 is not build against the SDK 1.6 where all the leaks have been fixed. It was mainly because of downward compatibility. Version 0.3.1 which we will release in the next one or two weeks will be build against 1.6 and shouldn't show the widget leak anymore. Not sure which other zombie compartments you are seeing here. So MemChaser could not be related.
I did get the problem also without Memchaser
Er, hmm, now the document is gone from the CC graph.
This is a bit strange.
I've seen a periodic zombie for the past month or two now.  It's long-lasting, but not immortal.  I've thought it was from ABP, but maybe it's not, and maybe we've been seeing the same issue.
if its what i've been seeing, the zombie may disappear after reopening the tab using that compartment and navigating to a new page in that tab.

of course this doesn't work when the compartment was for multiple tabs.....
So far I haven't managed to reproduce this using a build
http://hg.mozilla.org/mozilla-central/rev/9894cd999781
I got this again without MemChaser:

0x131573800 [rc=6] nsDocument normal (xhtml) https://www.facebook.com/
     Roots: 2
         0x1332df310 [rc=2] root nsXPCWrappedJS (nsIDOMEventListener)
         0x1173ecd10 [rc=2] root nsXPCWrappedJS (nsIDOMEventListener)
0x12e177000 [rc=7] nsDocument normal (xhtml) https://www.facebook.com/
     Roots: 2
         0x11a57d790 [rc=2] root nsXPCWrappedJS (nsIDOMEventListener)
         0x11a631d10 [rc=2] root nsXPCWrappedJS (nsIDOMEventListener)
It took a day or so for it to happen.
Whiteboard: [Snappy][Memshrink] → [Snappy:P2][Memshrink]
When eventlistenermanager.log is true and Components.classes["@mozilla.org/observer-service;1"].getService(Components.interfaces.nsIObserverService).notifyObservers(null, "eventlistenermanager-log", null); is executed, this creates a log
(similar to cc log) which tells the target of the ELM and lists all the listeners.
Hopefully that tells what is keeping the Eventlistener alive.
Olli, is the description you gave part of your patch or already build into Firefox? I kinda would like to add such a feature to MemChaser.
The patch hasn't even been reviewed, so no, it is not part of Firefox. And I'm not
sure I even want that. I created it mainly for debugging purposes.
anything that can trace zombies should be included, even if disabled by default.
(In reply to Danial Horton from comment #16)
> anything that can trace zombies should be included, even if disabled by
> default.

Please understand that this is not a general-purpose tool.  It is designed specifically to help diagnose the leak Olli is seeing.
I haven't experienced the problem again.  I'll try not restarting each day for the new Nightly build and see if that has any effect.
(In reply to Justin Lebar [:jlebar] from comment #17)
> (In reply to Danial Horton from comment #16)
> > anything that can trace zombies should be included, even if disabled by
> > default.
> 
> Please understand that this is not a general-purpose tool.  It is designed
> specifically to help diagnose the leak Olli is seeing.

general purpose is a relative description.
> general purpose is a relative description.

If you would like this feature to be included in Firefox, please file a separate bug.
I was having problems with fairly large CCs, but it cleared up eventually.

I do have a ghost window from a FB-related site: https://s-static.ak.fbcdn.net/connect/xd_proxy.php#cb=fc971be9d7d1e6&origin=https%3A%2F%2Fwww.scientificamerican.com%2Ff36e55e32bdd782&relation=parent.parent&transport=postmessage&type=resize&height=20&ackData[id]=1&width=90 [2]

This could just be a failure of the Ghost Window same-origin detection, I suppose.  The ghost window went away when I closed my Facebook tab.  Note that I think that the FB thing was for an article I closed a while ago, but I guess somehow was being kept alive?
> This could just be a failure of the Ghost Window same-origin detection, I suppose.  The 
> ghost window went away when I closed my Facebook tab.

If it went away when you closed the tab, it sure sounds like it.

I guess we could look at the window's |domain| property to improve the ghost window heuristic.  That might help...
I got the problem again. Interestingly the nsXPCWrappedJS (nsIDOMEventListener) object
wasn't kept alive by any ELM.

Kyle, is it possible that the recent changes to XHR onfoo handling could have caused
XHR to keep listener alive longer? I started to see this problem sometime after Paris Bindings landed.
Attached patch wipSplinter Review
I don't know if this helps, but testing.
Bug 744339 might be related
The patch does not help.
(In reply to Olli Pettay [:smaug] from comment #23)
> I got the problem again. Interestingly the nsXPCWrappedJS
> (nsIDOMEventListener) object
> wasn't kept alive by any ELM.
> 
> Kyle, is it possible that the recent changes to XHR onfoo handling could
> have caused
> XHR to keep listener alive longer? I started to see this problem sometime
> after Paris Bindings landed.

I suppose its possible.
Now testing something else...
https://tbpl.mozilla.org/?tree=Try&rev=e6f68f21a585

I need to figure out how to debug this :/
I got this again today.  Or I think I did.  The browser was too unresponsive to get about:cc to actually produce a list of documents.
Yeah, I'm still seeing this. I *think* it happens when there is some problem with network connection,
or right after resuming.
Interesting that you mention problems with the network connection. I have seen this a couple of times too when Firefox was not able to connect to a side and displayed 'Connecting...'. The CPU load also increases in those situations. I haven't thought about a problem with GC yet, so was I seeing the same problem or not? What do I have to do when I see it the next time? Btw. I have seen it with an Aurora version afair even with 12.
Ehsan is seeing this too:

ehsan: smaug: and good news, I only have one facebook doc listed, and I remember exactly what I did on that page last night
ehsan: smaug: I navigated to the page, and played an embedded youtube video in flash
ehsan: and then navigated away
Summary: Some eventlistener keeps documents alive → Some eventlistener keeps documents alive on Facebook
This happens (very rarely) with tbpl too.
Summary: Some eventlistener keeps documents alive on Facebook → Some eventlistener keeps documents alive
I tried to reproduce the bug by doing the same thing on Facebook after I restarted Firefox, but didn't manage to reproduce the bug.
Finally got this in a debug build (with which even gdb works)

#0  nsXPCWrappedJS::Release (this=0x7f5c6383ff00) at /home/smaug/mozilla/hg/mozilla/js/xpconnect/src/XPCWrappedJS.cpp:224
#1  0x00007f5c84005af9 in nsXPTCStubBase::Release (this=0x7f5c685d4f40) at /home/smaug/mozilla/hg/mozilla/xpcom/reflect/xptcall/src/xptcall.cpp:65
#2  0x00007f5c82b5f9bb in nsCOMPtr<nsIDOMEventListener>::~nsCOMPtr (this=0x7f5c685d4f78, __in_chrg=<optimized out>)
    at ../../dist/include/nsCOMPtr.h:521
#3  0x00007f5c8302370e in nsDOMEventListenerWrapper::~nsDOMEventListenerWrapper (this=0x7f5c685d4f60, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsDOMEventTargetHelper.h:53
#4  0x00007f5c830236bf in nsDOMEventListenerWrapper::Release (this=0x7f5c685d4f60)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsDOMEventTargetHelper.cpp:55
#5  0x00007f5c82fdbb45 in nsRefPtr<nsIDOMEventListener>::~nsRefPtr (this=0x7f5c63859538, __in_chrg=<optimized out>)
    at ../../../dist/include/nsAutoPtr.h:908
#6  0x00007f5c82fd8317 in nsListenerStruct::~nsListenerStruct (this=0x7f5c63859538, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsEventListenerManager.h:81
#7  0x00007f5c82fdceb8 in nsTArrayElementTraits<nsListenerStruct>::Destruct (e=0x7f5c63859538) at ../../../dist/include/nsTArray.h:380
#8  0x00007f5c82fdcd61 in nsTArray<nsListenerStruct, nsTArrayDefaultAllocator>::DestructRange (this=0x7f5c63859528, start=0, count=1)
    at ../../../dist/include/nsTArray.h:1243
#9  0x00007f5c82fdc9aa in nsTArray<nsListenerStruct, nsTArrayDefaultAllocator>::RemoveElementsAt (this=0x7f5c63859528, start=0, count=1)
    at ../../../dist/include/nsTArray.h:963
#10 0x00007f5c82fdc493 in nsTArray<nsListenerStruct, nsTArrayDefaultAllocator>::Clear (this=0x7f5c63859528) at ../../../dist/include/nsTArray.h:974
#11 0x00007f5c82fdbc32 in nsAutoTObserverArray<nsListenerStruct, 2u>::Clear (this=0x7f5c63859520) at ../../../dist/include/nsTObserverArray.h:242
#12 0x00007f5c82fd86d4 in nsEventListenerManager::RemoveAllListeners (this=0x7f5c63859500)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsEventListenerManager.cpp:173
#13 0x00007f5c82fdb12c in nsEventListenerManager::Disconnect (this=0x7f5c63859500)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsEventListenerManager.cpp:904
#14 0x00007f5c83024167 in nsDOMEventTargetHelper::~nsDOMEventTargetHelper (this=0x7f5c75764000, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsDOMEventTargetHelper.cpp:119
#15 0x00007f5c82f2214a in nsXHREventTarget::~nsXHREventTarget (this=0x7f5c75764000, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsXMLHttpRequest.h:101
#16 0x00007f5c82f26215 in nsXMLHttpRequest::~nsXMLHttpRequest (this=0x7f5c75764000, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsXMLHttpRequest.cpp:494
#17 0x00007f5c82f26250 in nsXMLHttpRequest::~nsXMLHttpRequest (this=0x7f5c75764000, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsXMLHttpRequest.cpp:508
#18 0x00007f5c83024096 in nsDOMEventTargetHelper::Release (this=0x7f5c75764000)
    at /home/smaug/mozilla/hg/mozilla/content/events/src/nsDOMEventTargetHelper.cpp:109
#19 0x00007f5c82f25174 in nsXHREventTarget::Release (this=0x7f5c75764000)
---Type <return> to continue, or q <return> to quit---
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsXMLHttpRequest.cpp:344
#20 0x00007f5c82f2785e in nsXMLHttpRequest::Release (this=0x7f5c75764000)
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsXMLHttpRequest.cpp:720
#21 0x00007f5c827d5799 in nsCOMPtr<nsIStreamListener>::~nsCOMPtr (this=0x7f5c63840030, __in_chrg=<optimized out>)
    at ../../../dist/include/nsCOMPtr.h:521
#22 0x00007f5c82e3506b in nsCORSListenerProxy::~nsCORSListenerProxy (this=0x7f5c63840000, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsCrossSiteListenerProxy.h:66
#23 0x00007f5c82e34f71 in nsCORSListenerProxy::Release (this=0x7f5c63840000)
    at /home/smaug/mozilla/hg/mozilla/content/base/src/nsCrossSiteListenerProxy.cpp:363
#24 0x00007f5c827d0757 in nsCOMPtr<nsIInterfaceRequestor>::~nsCOMPtr (this=0x7f5c5aa8d568, __in_chrg=<optimized out>)
    at ../../../dist/include/nsCOMPtr.h:521
#25 0x00007f5c83ff558a in nsInterfaceRequestorAgg::~nsInterfaceRequestorAgg (this=0x7f5c5aa8d550, __in_chrg=<optimized out>)
    at /home/smaug/mozilla/hg/mozilla/xpcom/base/nsInterfaceRequestorAgg.cpp:40
#26 0x00007f5c83ff562f in nsInterfaceRequestorAgg::Release (this=0x7f5c5aa8d550)
    at /home/smaug/mozilla/hg/mozilla/xpcom/base/nsInterfaceRequestorAgg.cpp:55
#27 0x00007f5c83f7a165 in nsProxyReleaseEvent::Run (this=0x7f5c4f107c40)
    at /home/smaug/mozilla/hg/mozilla/ff_build/xpcom/build/nsProxyRelease.cpp:52
#28 0x00007f5c83fe0299 in nsThread::ProcessNextEvent (this=0x7f5c87545300, mayWait=true, result=0x7fff85c9d75f)
    at /home/smaug/mozilla/hg/mozilla/xpcom/threads/nsThread.cpp:656
#29 0x00007f5c83f75267 in NS_ProcessNextEvent_P (thread=0x7f5c87545300, mayWait=true)
    at /home/smaug/mozilla/hg/mozilla/ff_build/xpcom/build/nsThreadUtils.cpp:245
#30 0x00007f5c83fdfc30 in nsThread::Shutdown (this=0x7f5c87545a50) at /home/smaug/mozilla/hg/mozilla/xpcom/threads/nsThread.cpp:498
#31 0x00007f5c8281ff43 in nsSocketTransportService::Shutdown (this=0x7f5c875610c0)
    at /home/smaug/mozilla/hg/mozilla/netwerk/base/src/nsSocketTransportService2.cpp:534
#32 0x00007f5c827ef8d5 in nsIOService::SetOffline (this=0x7f5c87522a90, offline=true)
    at /home/smaug/mozilla/hg/mozilla/netwerk/base/src/nsIOService.cpp:794
#33 0x00007f5c827f036d in nsIOService::Observe (this=0x7f5c87522a90, subject=0x7f5c4f106fc0, topic=0x7f5c848b4ffd "profile-change-net-teardown", 
    data=0x7f5c848b5260) at /home/smaug/mozilla/hg/mozilla/netwerk/base/src/nsIOService.cpp:983
#34 0x00007f5c83f97643 in nsObserverList::NotifyObservers (this=0x7f5c698fe758, aSubject=0x7f5c4f106fc0, aTopic=
    0x7f5c848b4ffd "profile-change-net-teardown", someData=0x7f5c848b5260) at /home/smaug/mozilla/hg/mozilla/xpcom/ds/nsObserverList.cpp:130
#35 0x00007f5c83f99063 in nsObserverService::NotifyObservers (this=0x7f5c7adb8600, aSubject=0x7f5c4f106fc0, aTopic=
    0x7f5c848b4ffd "profile-change-net-teardown", someData=0x7f5c848b5260) at /home/smaug/mozilla/hg/mozilla/xpcom/ds/nsObserverService.cpp:182
#36 0x00007f5c8279f16a in nsXREDirProvider::DoShutdown (this=0x7fff85c9dc10) at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsXREDirProvider.cpp:835
#37 0x00007f5c8278b396 in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x7f5c8756db58, __in_chrg=<optimized out>)
---Type <return> to continue, or q <return> to quit---
    at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsAppRunner.cpp:1125
#38 0x00007f5c827946bc in XREMain::XRE_main (this=0x7fff85c9dbd0, argc=3, argv=0x7fff85ca0028, aAppData=0x633060)
    at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsAppRunner.cpp:3867
#39 0x00007f5c8279487b in XRE_main (argc=3, argv=0x7fff85ca0028, aAppData=0x633060)
    at /home/smaug/mozilla/hg/mozilla/toolkit/xre/nsAppRunner.cpp:3921
#40 0x000000000040289a in do_main (argc=3, argv=0x7fff85ca0028) at /home/smaug/mozilla/hg/mozilla/browser/app/nsBrowserApp.cpp:190
#41 0x0000000000402a9a in main (argc=3, argv=0x7fff85ca0028) at /home/smaug/mozilla/hg/mozilla/browser/app/nsBrowserApp.cpp:277
Attached patch wip for cors (obsolete) — Splinter Review
Just something I'm testing. CORS should clear the owning pointers at some point
(that is what the patch does), but has something changed in Necko to keep stuff alive in some error cases?
There is still something leaking. My current guess is that this is from
Paris bindings or some necko changes.
...and  this is pretty horrible leak. when it happens, it can cause 1s+ CC times.
hmm, perhaps I need to wait a bit longer with the CORS changes.
I did saw a twitter related XHR to keep stuff alive for some time but it disappeared.
Component: General → Networking
Product: Firefox → Core
QA Contact: general → networking
Attached patch Patch for CORSSplinter Review
Try to handle the problem at least on content/ side, so that
huge CC times could be avoided. So, clear strong references explicitly and as early as 
possibly. I'm not sure if there is still something to fix in Necko.

Boris, since Jonas is on vacation, could you take a look.
I wish the ownership of various channels and listeners could be simpler.

https://tbpl.mozilla.org/?tree=Try&rev=0c3d0c6caef4
Assignee: nobody → bugs
Attachment #615060 - Attachment is obsolete: true
Attachment #615150 - Flags: feedback?(bzbarsky)
So far looking good. I've been using the patch about 2 days in two instances of FF and no documents 
leaked because of Necko/CORS keeping stuff alive.
Attachment #615150 - Flags: feedback?(bzbarsky) → review?(bzbarsky)
Comment on attachment 615150 [details] [diff] [review]
Patch for CORS

r=me, but it looks like necko is holding on to the callbacks passed to transaction->Init() in nsHttpChannel::SetupTransaction for too long or something.  If so, that's a pretty serious regression.  There are plenty of other places we use callbacks, and they all expect the channel to drop _all_ refs to them onStopRequest.

Could you please file a followup and cc me, :jduell, and :mcmanus?

r=me
Attachment #615150 - Flags: review?(bzbarsky) → review+
Comment on attachment 615150 [details] [diff] [review]
Patch for CORS

Fix at least the really bad part of a leak, which causes occasionally
huge cycle collection times.
Attachment #615150 - Flags: approval-mozilla-central?
This is tracking Firefox 14, so per Alex's post, you don't need to request approval: Bugs marked with blocking-fennec1.0+, sg:high, sg:crit, or tracking-firefoxN+ can land on m-i or m-c without explicit approval (include a=blocking-fennec, a=security, or a=tracking-firefox in your commit comment)."
Comment on attachment 615150 [details] [diff] [review]
Patch for CORS

Ah, thanks.
Attachment #615150 - Flags: approval-mozilla-central?
https://hg.mozilla.org/mozilla-central/rev/719a2fb28324
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment on attachment 615150 [details] [diff] [review]
Patch for CORS

[Approval Request Comment]
Regression caused by (bug #): NA
User impact if declined: Possibly higher cycle collection times
Testing completed (on m-c, etc.): landed m-c
Risk to taking this patch (and alternatives if risky): Should be low risk -
basically clearing strong pointers after connection is handled
String changes made by this patch: NA
Attachment #615150 - Flags: approval-mozilla-aurora?
Comment on attachment 615150 [details] [diff] [review]
Patch for CORS

[Triage Comment]
Approved for Aurora 13 since this is a low risk memshrink bug.
Attachment #615150 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Is this something QA can verify?
Whiteboard: [Snappy:P2][Memshrink] → [Snappy:P2][Memshrink][qa?]
Keep Facebook open in a tab for few days, reload it occasionally and check if
DOMEventListener keeps a document from facebook.com alive.
https://bugzilla.mozilla.org/show_bug.cgi?id=726346 has a tool to check if the document is there.

But, I haven't seen the leak after the patch landed.
(In reply to Olli Pettay [:smaug] from comment #54)
> But, I haven't seen the leak after the patch landed.

That's good enough for me. Marking this qa- but I'll circulate it to QA for some dogfooding in 13.0b3.
Whiteboard: [Snappy:P2][Memshrink][qa?] → [Snappy:P2][Memshrink]
Mozilla/5.0 (Windows NT 6.0; rv:13.0) Gecko/20100101 Firefox/13.0

After spending a while on Facebook using Firefox 13 beta 3, the following document remain alive: 

0xf34f800 [rc=3] nsDocument normal (XUL) chrome://browser/content/browser.xul
    Roots: 3
        0xc10c8e0 [rc=68] root nsNodeInfoManager
        0x8acb740 [rc=1] root XPCVariant
        0x1021a308 [rc=2] root nsXPCWrappedJS (nsIObserver)

To track this document I used the tool from Bug 726346.
That is not the facebook.com document.
Verified on multiple OSs on Firefox 13.0 beta 3: 20120509070325.

No Facebook documents were kept even after leaving Firefox with Facebook opened for 2 consecutive days. Facebook was reloaded and used periodically and we checked for documents after each reload/use.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.