Closed Bug 287276 Opened 20 years ago Closed 20 years ago

Huge Memory Leaks w/XUL Widgets Generated from RDF Datasources/Templates, Probably Observer Interface Related

Categories

(Core :: XUL, defect)

Platform: x86 Linux
Severity: critical


RESOLVED WORKSFORME

People

(Reporter: rogata, Unassigned)

Details

(Keywords: testcase)

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

Although there are a couple of other reports that touch on this problem, we
wanted to start a new bug because we think we've narrowed down the problem
considerably.

In a nutshell, if you connect a XUL template to an RDF datasource and then
perform ANY action on the RDF datasource that will notify that datasource's
observers (i.e. beginUpdateBatch, endUpdateBatch, assert, unassert, change),
mozilla will leak memory.  Badly, in fact.  If your application depends on
updates to RDF generated XUL elements, your system can run out of memory pretty
quickly.

We have written a simple test case, attached.  The only drawback of the testcase
is that, because it instantiates an nsIRDFDataSource, it must be run as chrome.
 (Well, it probably doesn't have to be run as chrome, but I am not l33t enough
to figure out a better way.  8-) ).  Untar the testcase, add this line to
installed-chrome.txt:

content,install,url,file:///your/path/to/xul_leak_test/content/

delete the chrome.rdf, then run the testcase as
chrome://xul_leak_test/content/main.xul.

The testcase shows a simple label and progress bar (built by a template)
generated from an in-memory datasource with a single container containing a
single record.  There is a button which, when pressed, simply invokes
beginUpdateBatch() and endUpdateBatch() on the in-memory datasource 10,000 times
per press.  Try it and watch mozilla's memory usage grow without bound.
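A minimal sketch of what the "Grow" button handler does (chrome-privileged JavaScript; the XPCOM contract ID is the standard Mozilla one, but the function and variable names are illustrative, not the exact testcase code):

```javascript
// Runs only inside Mozilla/Firefox chrome, not in a plain web page.
// Create the in-memory RDF datasource via its standard contract ID.
var datasource = Components
  .classes["@mozilla.org/rdf/datasource;1?name=in-memory-datasource"]
  .createInstance(Components.interfaces.nsIRDFDataSource);

// "Grow" button handler: notify the datasource's observers 10,000 times
// without changing any data.  In affected builds each click leaked ~12M.
function grow() {
  for (var i = 0; i < 10000; i++) {
    datasource.beginUpdateBatch();
    datasource.endUpdateBatch();
  }
}
```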

We used beginUpdateBatch() and endUpdateBatch() to try to prove that it isn't a
leak in the RDF datasource itself.  Further proof of this is that if you sever
the link between the datasource and template (comment out the AddDataSource
call, or change the ref attribute), the leak does not occur.

Note that Assert, Unassert, and Change calls trigger the leak as well.  Just
getting values from the datasource while it's hooked to the template
(GetResource, GetTarget, etc.) does not trigger the leak.  Since
beginUpdateBatch, endUpdateBatch, Assert, Unassert, and Change are all of the
messages that notify RDF DataSource observers, there's obviously a connection there.
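For reference, these are the nsIRDFObserver callbacks a datasource fires; the template builder registers such an observer internally.  A logging observer (an illustrative, chrome-only sketch) makes the notification traffic visible:

```javascript
// Chrome-privileged sketch: attach a logging nsIRDFObserver to an
// in-memory datasource to watch the notifications described above.
var ds = Components
  .classes["@mozilla.org/rdf/datasource;1?name=in-memory-datasource"]
  .createInstance(Components.interfaces.nsIRDFDataSource);

var loggingObserver = {
  onAssert:           function (aDS, aSrc, aProp, aTarget) { dump("assert\n"); },
  onUnassert:         function (aDS, aSrc, aProp, aTarget) { dump("unassert\n"); },
  onChange:           function (aDS, aSrc, aProp, aOld, aNew) { dump("change\n"); },
  onMove:             function (aDS, aOldSrc, aNewSrc, aProp, aTarget) { dump("move\n"); },
  onBeginUpdateBatch: function (aDS) { dump("beginUpdateBatch\n"); },
  onEndUpdateBatch:   function (aDS) { dump("endUpdateBatch\n"); }
};
ds.AddObserver(loggingObserver);
```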

Another interesting note is that the RDF datasource must contain at least one
record that matches a template rule for the leak to occur.  Hooking the template
to a datasource with a valid-but-empty container pointed to by the "ref"
attribute will not generate a leak.
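The wiring the last two paragraphs describe might look like this in chrome script (the element ID and resource URI are hypothetical; `database.AddDataSource` and the `ref` attribute are the standard XUL template hooks):

```javascript
// Hypothetical hookup of a datasource to a XUL template element.
// Commenting out the AddDataSource line, or pointing "ref" at an empty
// container, makes the leak disappear, per the observations above.
var ds = Components
  .classes["@mozilla.org/rdf/datasource;1?name=in-memory-datasource"]
  .createInstance(Components.interfaces.nsIRDFDataSource);

var root = document.getElementById("leakTemplateRoot");  // illustrative ID
root.database.AddDataSource(ds);               // sever this link => no leak
root.setAttribute("ref", "urn:test:records");  // must name a container with
                                               // at least one matching record
root.builder.rebuild();
```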

We're trying to use some of the leak tools to isolate this further, but so far
the output has eluded our ability to interpret it.

Can anyone help with this?  This bug renders mozilla useless for writing rapidly
updating client applications.  8-(

Reproducible: Always

Steps to Reproduce:
1. Install the testcase, as per above
2. Click the "Grow" button
3. Watch mozilla grow.

Actual Results:  
Mozilla memory usage increases without bound.

Expected Results:  
No memory increase should have occurred at all, since no data was created and
nothing in the display changed.
Attached file Testcase for Leak
See installation instructions in summary.
Wanted to add that this happens in all versions of mozilla we tried, including
1.7.5 and our own builds from CVS, as well as in Firefox 1.0.1.
Attached file Testcase
The testcase is from the reporter, but I added netscape.enablePrivilege to it,
to be able to view it online.
Bummer, the testcase needs to be viewed locally, otherwise the enablePrivilege
code doesn't work.
Keywords: testcase
OK, we've tried tracing the binary through the testcase, and we've determined
that the leak is triggered by something in nsXULContentBuilder::RebuildAll(),
which can be seen here:

http://lxr.mozilla.org/seamonkey/source/content/xul/templates/src/nsXULContentBuilder.cpp#1856

If we short-circuit this method (by simply returning NS_OK at the beginning)
then the leak goes away.  Of course, the screen elements stop updating, too, so
this is only useful for trying to get closer to the bug.

Separately, we tried commenting out line 1895:

xulcontent->SetLazyState(nsXULElement::eChildrenMustBeRebuilt);

which seemed like a likely candidate to trigger memory allocation, but it had no
effect on the memory leak.

In our test case, every time we click the grow button we execute
"beginUpdateBatch" and "endUpdateBatch" 10,000 times, and mozilla grows by about
12M per click.  We therefore expect mozilla to leak over 1K per run through the
loop, but that doesn't appear to happen when we manually trace.  We haven't had
the patience to trace through 10,000 times yet ( 8-) ) so we don't yet know at
what intervals mozilla does grow.

Can any mozilla memory debugging gurus suggest any tools to help with tracking
this down?
Reporter, can you still see the bug with the latest nightly trunk build?
http://ftp.uni-erlangen.de/pub/mozilla.org/firefox/nightly/latest-trunk/
I don't think I see a lot of memory increase with the testcase in the latest
nightly trunk build.
Very nice!  I tried it with the nightlies from both Firefox and Mozilla and the
leak seems to be gone!  Out of curiosity, we'll run the reference count tools
again to see what changed, but this is really great!  Am leaving something
running all night tonight, and would like to do more tests tomorrow, but it
looks good.  Whoever fixed this deserves our thanks!
(In reply to comment #8)
> Whoever fixed this deserves our thanks!

That could be dbaron with bug 283129

So is this still an issue?
Oy, sorry for going away like that!  I was pulled onto another project.

The short answer is that this no longer seems to be an issue with the checked in
fix.  We are very grateful.

The only reason I don't give an unqualified yes is that our full app (as opposed
to our test case) still leaks a relatively small amount of memory over time, but
that could be happening anywhere.  The rapid leak that we outlined here does
appear to be either gone, or it grows slowly enough to be a relative non-issue.
I've been meaning to do more thorough testing of both the test case and the
full app, but I am short of time at the moment.
Marking worksforme per that comment.  Sounds like the remaining issue would be
separate bugs.
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME
Component: XP Toolkit/Widgets: XUL → XUL
QA Contact: xptoolkit.widgets
