Investigate odd cycle involving js objects causing leak of nsBookmarksService and InternetSearchService



Bookmarks & History
18 years ago
7 years ago


(Reporter: Alec Flett, Assigned: jag (Peter Annema))


({memory-leak, regression})

memory-leak, regression

Firefox Tracking Flags

(Not tracked)



(3 attachments)



18 years ago
When fixing the orange build on friday, I had to change the mTimer members of 
these classes to be member variables, since they are doing so, I 
somehow seem to have introduced a leak.

I've done some playing with the refcount balancer, but I have had no luck 
figuring out how to fix this.

it looks like there may be a number of objects holding references to the 
bookmarks service. the most obvious one is the "mInner" member, which is an 
in-memory datasource... we do some crazy stuff in nsBookmarksService::Release() 
to make sure to release the inner early, but it doesn't look like that's ever 
being hit.

I'm adding waterson to the CC because the output from lists 
nsBookmarkService as having one extra reference, but always has 
nsBookmarksService with 4 refs... I'm not really sure what's going on..
I wrote on n.p.m.builds this weekend:

I don't understand how changing the static nsCOMPtr to a member made
the leak stats go up.  Static nsCOMPtrs aren't released until after the
leak stats are printed, so I don't see how it would cause fewer
releases (before the leak stats).  What other change made this an
issue?  Or am I missing something?
Keywords: mlk, regression

Comment 2

18 years ago
bug 60754 was just filed. Do bookmarks work at all anymore?

Comment 3

18 years ago
that other bug sounds like some bad JS, which probably has nothing to do with 

dbaron: I didn't know that. Maybe it's just coincidence that the 2 leaks started 
that day? or maybe we didn't previously instantiate these services, but we are 

Comment 4

18 years ago
alecf: What was the purpose of your changes on Friday? What was the issue with
timers that required the changes?

FWIW: The previously static nsCOMPtr<> mTimer is cancelled/cleared in
nsBookmarkService's destructor.
The reason for the changes was that Linux went orange -- it was crashing on
shutdown.  (One possible explanation for such a crash could be that someone
started leaking the bookmark service so that the static nsCOMPtr wasn't cleared
by the dtor, but was instead cleared on the library unload after something it
pointed to was no longer valid.  But I'm just guessing...)

Comment 6

18 years ago
dbaron: That sounds like a fair guess... which means that the checkins right 
before the orange began should be examined (for things like XUL document leaks.)

Comment 7

18 years ago
I think this is what was happening before:
- the bookmarks service was never getting destroyed, because something was 
holding a reference to it.
- the static nsCOMPtr<> was getting destroyed, and thus releasing the timer, 
which might have caused the bookmarks service to get destroyed.

The only thing that's odd about that was that the bookmarks service should have 
been reported as a leak, anyway....:(

Comment 8

18 years ago
Hmmm... the timer doesn't hold an owning reference to the bookmarks service.
No, I don't think the timer was holding the bookmarks service, and there's no
need for the release of the time to trigger the release of the bookmarks service
to do something that shouldn't happen.  (What exactly was it that crashed?)

So what's causing this nsXULDocument not to be garbage collected?  Which leaked
JS root is the root of the problem?  (And when did the problem start?)

It's interetsing to note that the leak stats *were* working while tinderbox was
orange, since the crash was after they were printed.  (I wish it showed them in
spite of being orange!)  And the leak was occurring during the orange.  So
probably the checkin that caused the leak and the one that caused the orange
were the same.

Comment 11

18 years ago
Created attachment 19525 [details] [diff] [review]
[patch] Oops, introduced a cycle... Fix
[Checkin: Comment 19]

Comment 12

18 years ago
Spot the cycle :-)

Comment 14

18 years ago
And have a=alecf, will check in as soon as the tree opens

Comment 15

18 years ago
reassign to jag so he gets credit :)
Assignee: rjc → disttsc
Can someone diagram the cycle?


Comment 17

18 years ago
alecf: hey, some credit vs. most of the blame ;-)

I thought there was a cycle here, but now I'm not so sure anymore. It definitely
shows up when storing the |bar| element in the Browser object's |barEl|

So what I think was going on:
BrowserInstance has a reference to the XULBrowser object created at onload, this
object has a reference to the urlbar element and thus(?) the document, which
holds the global JS var appCore which is the BrowserInstance we started with.

But like I said, I'm not so sure anymore.
Here's a guess at part of the cycle:

The piece jag created is that the XULBrowser object has a pointer in JS to an
element of the XUL document (the document in which the script is executing),
which keeps the document's script object, and thus the document alive.  I'm not
sure of the rest, but I imagine something in the document is pointing to the
XULBrowser object.  Are there good ways to debug such things (to figure out if
it's something we could clean up in GlobalWindowImpl::CleanUp)?

Comment 19

18 years ago
Fix was checked in btw. Keeping this bug open pending open questions.
Netscape Nav Triage Team: marking fixed 
Last Resolved: 18 years ago
Resolution: --- → FIXED
Reopening, since there are unresolved questions.  Why is the Nav triage team
touching the bugs of non-Netscape employees, anyway?
Resolution: FIXED → ---
We should test if this still creates a leak if you insert the same code now
(after hyatt's XBL fix).

Comment 23

18 years ago
Yeah, it still leaks. I've reduced it to the following two attachments.
Removing the last rule from xul.css stops the leaking.

Replace chrome/toolkit/content/global/xul.css with this one and put nav.xul in
then start with mozilla -chrome chrome://navigator/content/nav.xul

Please make sure you're not working with a JAR build, or otherwise edit
installed-chrome.txt and remove all "jar:" and ".jar!".

Comment 24

18 years ago
Created attachment 22502 [details]
[testcase] nav.xul

Comment 25

18 years ago
Created attachment 22504 [details]
[testcase] xul.css

Comment 26

18 years ago
So if I take this nav.xul, and the original xul.css, and remove these two bits,
the leaking stops:

Line 27:

Line: 497:
    -moz-binding: url("chrome://global/content/xulBindings.xml#slider");

It doesn't stop the leaking for the complete navigator.xul + rest though.

Comment 27

18 years ago
How on earth did you decide to remove THOSE? I can't figure out how they would
be relevant :(

Comment 28

18 years ago
You just reduce your testcase to where it's so small that removing anything else
will stop the problem from appearing. That led me to these things.

Comment 29

18 years ago
Also see related bug 65377.

Comment 30

17 years ago
Mass move to
Assignee: jag → jaggernaut

Comment 31

17 years ago
Target Milestone: --- → mozilla1.0


17 years ago
Blocks: 92580


17 years ago
No longer blocks: 92580

Comment 32

17 years ago
Priority: P3 → P2
Target Milestone: mozilla1.0 → mozilla0.9.7

Comment 33

17 years ago
Since we're not leaking currently, and know what link was completing the cycle
(and thus how to avoid it), I'm moving this to future, since there is still a
theoretical interest (mine) to see if this cycle has been broken elsewhere in
the mean time.
Severity: critical → normal
Summary: Leaking nsBookmarksService and InternetSearchService → Investigate odd cycle involving js objects causing leak of nsBookmarksService and InternetSearchService
Target Milestone: mozilla0.9.7 → Future


17 years ago
Priority: P2 → P4

Comment 34

17 years ago
filter keyword: nothingnessless
Product: Browser → Seamonkey
Attachment #19525 - Attachment description: [patch] Oops, introduced a cycle... Fix → [patch] Oops, introduced a cycle... Fix [Checkin: Comment 19]
QA Contact: claudius → bookmarks
This bug is being marked EXPIRED as it has seen no activity in a very long time.

If you think that the issue reported might still be relevant, please test with a recent release of SeaMonkey and if the problem persists feel free to re-open the report. Thank you.
Last Resolved: 18 years ago10 years ago
Resolution: --- → EXPIRED
Bulk reopening incorrectly expired bugs - no activity does not constitute no bug - these need proper checking.
Resolution: EXPIRED → ---

Comment 37

7 years ago
I'm going to close this since the original issue was fixed. Any additional leaks should be filed as new bugs.
Last Resolved: 10 years ago7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.