typeaheadfind causes major memory leaks

RESOLVED FIXED in mozilla1.4final

Status

defect
--
critical
RESOLVED FIXED
16 years ago
8 years ago

People

(Reporter: darin.moz, Assigned: aaronlev)

Tracking

({memory-leak, topembed})

Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [adt2], URL)

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

16 years ago
In some cases, nsTypeAheadFind holds an owning reference to a DOM window for
longer than it should.  Since the DOM window indirectly owns image objects, this
can represent a lot of memory.  We believe it is causing a number of bugs, such
as bug 193865, bug 198806, and bug 204374.  There are other GDI bugs that are
very likely related to this.

We have tried some different things to eliminate the leak, but we haven't been
able to completely eliminate in all cases, and our changes seem to have
introduced some crashes.  Attached is a diff of some of the changes we've tried.

A simple way to repro is:

1) start mozilla with about:blank
2) using the URL bar, enter http://gs1.mcom.com/test/images.html
3) wait for the page to finish loading and then press any key (like 'a')
4) using the URL bar, enter about:cache
5) wait more than 2 seconds (delay for javascript garbage collection)
6) reload about:cache and notice that the cache reports that ~50Mb (more than
the cache limit) are in use.
6) now press a key and then reload about:cache and notice that the memory usage
goes back down to below the limit on the memory cache size.

it appears that typeaheadfind does not release the DOM window until it hooks
itself into a new DOM window.

btw: this happens with any trunk or 1.4 branch build.
(Reporter)

Updated

16 years ago
Severity: normal → critical
Flags: blocking1.4?
Keywords: nsbeta1, topembed
Target Milestone: --- → mozilla1.4final

Comment 2

16 years ago
Test URL http://gs1.mcom.com/test/images.html not resolving. Typo?
rr_mozilla: *.mcom.com urls are only available from inside the netscape firewall.
(Reporter)

Comment 4

16 years ago
I should have clarified that you can use any page with a large enough number of
images or sufficiently large images to repro this bug.  The sample URL is just
the one that I've been using, and yeah.. unfortunately it is only reachable from
within the Netscape/AOL firewall :-/
(Assignee)

Comment 5

16 years ago
Unfortunately I couldn't test this because I couldn't reach the test page. Will
you test it for me?

Also, if this doesn't work, I have another idea to use my fixed for bug 208803,
which will allow me to know when it is safe to remove all the event listeners
and references, because a doc has gone away.
(Assignee)

Comment 6

16 years ago
I'm unable to get to the test case, even when I'm using Sera.

I have tried a page with a lot of images, but am unable to replicate the problem.
(Reporter)

Comment 7

16 years ago
aaron:

ok, so gs1.mcom.com doesn't seem to exist afterall... but if you try this, it
should work:

http://gs1.nscp.aoltw.net/test/images.html

for whatever reason, gs1 isn't accessible from .mcom.com :-(

Comment 8

16 years ago
adt: nsbeta1+/adt2
Keywords: nsbeta1nsbeta1+
Whiteboard: [adt2]
(Reporter)

Comment 9

16 years ago
i tested the patch, and it unfortunately doesn't help.  i tested it using a
linux trunk build from today.
(Assignee)

Updated

16 years ago
Depends on: 208803
(Assignee)

Comment 10

16 years ago
Able to repro bug, working on fix now.
(Assignee)

Comment 12

16 years ago
Comment on attachment 125817 [details] [diff] [review]
I have verified that the patch works, but have not played with it a great deal more than that. Seems clean to me.

Seeking r=/sr=
I'm not sure if it's necessary to set the nsISelection or the
nsISelectionController member variables to null, but I think that's right.
Attachment #125817 - Flags: superreview?(alecf)
Attachment #125817 - Flags: review?(darin)
(Reporter)

Comment 13

16 years ago
the patch solves the problem as described in comment #0, but using a slight
variation on those steps, i'm still able to repro the leak.

to repro, follow the original steps 1-5, and then...

6) notice that with the patch, the size of the memory cache goes down below the
memory cache size limit.
7) now, press back
8) click on page and press a key like 'a'
9) press forward button, and notice that the cache size remains above cache limit.
(Reporter)

Comment 14

16 years ago
Comment on attachment 125817 [details] [diff] [review]
I have verified that the patch works, but have not played with it a great deal more than that. Seems clean to me.

this patch doesn't work.  also, you would need to have someone from the layout
team review the layout changes here.
Attachment #125817 - Flags: superreview?(alecf)
Attachment #125817 - Flags: review?(darin)
Attachment #125817 - Flags: review-
(Assignee)

Comment 15

16 years ago
If this patch didn't work for you, I have no idea why. Is the observer getting
called when the image-rich doc goes away?

I tested on my system, and it worked.
(Reporter)

Comment 16

16 years ago
I retested, and now I can no longer repro the leak.  It's strange... the memory
seems to only be freed when I move my mouse.  Probably because I dragged the
cursor over a chrome button, which caused a presshell to get destroyed.  I
wonder, since presshell's are created and destroyed frequently, might this patch
have a negative performance impact?
(Reporter)

Comment 17

16 years ago
Comment on attachment 125817 [details] [diff] [review]
I have verified that the patch works, but have not played with it a great deal more than that. Seems clean to me.

dbaron said he'd review this.
Attachment #125817 - Flags: review- → review?(dbaron)
Comment on attachment 125817 [details] [diff] [review]
I have verified that the patch works, but have not played with it a great deal more than that. Seems clean to me.

This seems fine to me, although the pres shell is really someone internal and
I'd rather not have more observer topics that look public, so could you put a
comment in nsIPresShell.h that the observer topic is a hook for typeaheadfind
only?

(In the long run, if we continue to want an interface for this type of thing,
I'd rather have it on the document viewer, but that would be a bigger patch at
this point...)
Attachment #125817 - Flags: review?(dbaron) → review+
(Reporter)

Comment 19

16 years ago
Comment on attachment 125817 [details] [diff] [review]
I have verified that the patch works, but have not played with it a great deal more than that. Seems clean to me.

sr=darin
Attachment #125817 - Flags: superreview+
(Assignee)

Comment 21

16 years ago
Comment on attachment 125888 [details] [diff] [review]
Alternate patch does not touch pres shell at all. Perhaps better?

Or if dbaron has time to look at this.

This patch doesn't need to touch pres shell -- it uses the DOM unload event. Is
that better?
Attachment #125888 - Flags: superreview?(alecf)
Attachment #125888 - Flags: review?(darin)

Comment 22

16 years ago
Comment on attachment 125888 [details] [diff] [review]
Alternate patch does not touch pres shell at all. Perhaps better?

oh! that works for me. sr=alecf
Attachment #125888 - Flags: superreview?(alecf) → superreview+
(Reporter)

Comment 23

16 years ago
Comment on attachment 125888 [details] [diff] [review]
Alternate patch does not touch pres shell at all. Perhaps better?

>+    NS_ENSURE_TRUE(focusedShell && doc, NS_ERROR_FAILURE);

this patch seems to work better; however, i see this "assertion"
getting hit often.  maybe it should be changed to this instead:

  if (!focusedShell || !doc)
      return NS_ERROR_FAILURE;

r=darin with that change.
Attachment #125888 - Flags: review?(darin) → review+
(Assignee)

Updated

16 years ago
Attachment #125888 - Flags: approval1.4?
Ah, glad to see that unload does work.
Comment on attachment 125888 [details] [diff] [review]
Alternate patch does not touch pres shell at all. Perhaps better?

a=asa (on behalf of drivers) for checkin to the 1.4 branch.
Attachment #125888 - Flags: approval1.4? → approval1.4+
(Assignee)

Comment 26

16 years ago
checked into trunk

Comment 27

16 years ago
a=adt to land this on the 1.4 branch by tonight (before 2003-06-19 5am and no
later please).  Please mark this with the fixed1.4 keyword once this lands. 
Thanks.  
(Reporter)

Comment 28

16 years ago
fixed1.4
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Keywords: fixed1.4
Resolution: --- → FIXED
(Assignee)

Comment 29

16 years ago
Darin, does it happen if you use the other patch instead? I wonder if the unload
isn't firing when you hit forward.
(Assignee)

Comment 30

16 years ago
Sorry Darin, I was just replying to an old comment of yours accidentally.
tested using 2003.06.19.0x commercial 1.4 branch builds on linux rh8.0, win2k
and mac 10.2.6. the following are results from the test in comment 0 on linux.

* at step 4, about:cache displays:

Memory cache device
-------------------
Number of entries: 	97
Maximum storage size: 	31744 k
Storage in use: 	53224 k
Inactive Storage: 	0 k

Disk cache device
-----------------
Number of entries: 	49
Maximum storage size: 	51200 k
Storage in use: 	4335 k

* at step 6, about:cache displays:

Memory cache device
-------------------
Number of entries: 	76
Maximum storage size: 	31744 k
Storage in use: 	22449 k
Inactive Storage: 	22281 k

Disk cache device
-----------------
Number of entries: 	49
Maximum storage size: 	51200 k
Storage in use: 	4335 k

* after step 7, about:cache displays:

Memory cache device
-------------------
Number of entries: 	78
Maximum storage size: 	31744 k
Storage in use: 	22579 k
Inactive Storage: 	22412 k

Disk cache device
-----------------
Number of entries: 	49
Maximum storage size: 	51200 k
Storage in use: 	4335 k


is this expected behavior now?
(Reporter)

Comment 32

16 years ago
sairuh: yes, i think so.
okay --thanks, darin! marking verified1.4.
Keywords: fixed1.4verified1.4

Comment 34

16 years ago
I noticed Darin's comment #23...

I see this assert all the time with testcase for bug 199443
http://homepages.inf.ed.ac.uk/s9902074/mozilla/bug.php

Did the line "NS_ENSURE_TRUE(focusedShell && doc, NS_ERROR_FAILURE);"
get checked in by mistake instead of Darin's suggestion of
  if (!focusedShell || !doc)
      return NS_ERROR_FAILURE;

(Assignee)

Comment 35

16 years ago
Heh - well Darin actually checked it in for me.

Anyway, it doens't actually matter on a release build. The NS_ENSURE_TRUE will
act the same way as the if - since NS_WARNING won't do anything there. It will
still return early.
(Reporter)

Comment 36

16 years ago
oops.. yup, i forgot to follow my own suggestion.  quickly checked in the patch
in the bug.  we should clean that up though... aaron: can you please do that
when you get the chance.  thanks!  (no need for r=/sr= on that.)
(Assignee)

Comment 37

16 years ago
Fixed the tweak in trunk.
mozilla1.4 shipped. unsetting blocking1.4 request.
Flags: blocking1.4?
Product: Core → SeaMonkey
You need to log in before you can comment on or make changes to this bug.