Closed Bug 90394 Opened 23 years ago Closed 23 years ago

nsXPIDLCString is leaking!

Categories

(Core :: XPCOM, defect)

x86
Linux
defect
Not set
major

Tracking

()

RESOLVED FIXED

People

(Reporter: curt, Assigned: scc)

Details

(Keywords: memory-leak, top-memory-leak)

Attachments

(2 files)

On July 2nd's build the Leaks file generated by my daily leaks test leapt up in 
size to over 3 Gig, causing me to run out of build space and, therefore, 
breaking my tests.  

After several abortive efforts to isolate the problem I have generated an 
abbreviated test which generates a more manageably sized leak file.  It is 
available on the local system: 
sweetums:/builds/footprint/leaks/cm/dist/bin/Leaks072121.  The build it was 
generated against lives in that same directory.

Chris, I cannot get this to take dbaron@netscape.com as a legitimate user.  
What's up with that?
Steve generated some information from this file which I believe is located at 
/u/thesteve/curt/a.out.out.  It indicates that we have got a huge number of 
scattered, relatively small leaks, with some other very large leaks thrown in.  
Unfortunately Steve is not in a position to analyse the problem further.  I need 
help isolating where this problem is located in the code.
Over to curt since this is probably not an embedding widget problem.
Assignee: blizzard → curt
Curt,

I'll add instructions on how to hunt down bugs from my tool output. Could you,
for the record, comment on when this leak explosion started, so that when we
look at source code changes, we know the possible range of dates when some major
leaks were introduced?
I looked at 2 of the leaks and they're both cases where people used the pattern:

nsXPIDLCString foo;
blah->GetFoo(getter_Copies(foo));

so something in scc's string landing broke nsXPIDL[C]String so it's not freeing
the results in at least some cases.  Eek!
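For what it's worth, here is a minimal, self-contained sketch of the ownership
contract that pattern depends on.  The class and getter below are hypothetical
stand-ins written for illustration only; they are not the real nsXPIDLCString or
getter_Copies code, just the shape of what is supposed to happen (the callee
allocates a copy, the string object adopts it and frees it in its destructor):

// Hypothetical stand-in for nsXPIDLCString: adopts a heap-allocated C string
// handed back through an out parameter and frees it when it goes out of scope.
// If that adopt/free path is broken, every call through the getter leaks the
// callee's allocation.
#include <cstdio>
#include <cstdlib>
#include <cstring>

class XPIDLCStringLike {
public:
  XPIDLCStringLike() : mBuffer(0) {}
  ~XPIDLCStringLike() { std::free(mBuffer); }   // the free that seems to be missing

  // plays the role of getter_Copies(foo): hands the callee a place to store
  // its freshly allocated copy, releasing any buffer we already owned
  char** StartAssignment() {
    std::free(mBuffer);
    mBuffer = 0;
    return &mBuffer;
  }

  const char* get() const { return mBuffer; }

private:
  char* mBuffer;
};

// Hypothetical XPCOM-style getter: allocates a copy that the caller must own.
void GetFoo(char** aResult) {
  const char* value = "some string the component hands out";
  *aResult = static_cast<char*>(std::malloc(std::strlen(value) + 1));
  std::strcpy(*aResult, value);
}

int main() {
  XPIDLCStringLike foo;
  GetFoo(foo.StartAssignment());    // like blah->GetFoo(getter_Copies(foo));
  std::printf("%s\n", foo.get());
  return 0;                         // destructor frees the buffer; no leak
}

If the string landing changed which buffer the destructor actually frees (or
whether it frees at all), every one of those call sites would start leaking the
copy, which would match the pattern of many small leaks from the same stacks.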
Assignee: curt → scc
Component: GTK Embedding Widget → String
QA Contact: pavlov → scc
Keywords: mlk, topmlk
Summary: Generating huge leaks files → nsXPIDLCString is leaking!
Oh yeah.  As Steve points out I did forget to pinpoint when the bug was checked 
in.  It should be in code checked in between 1:00 a.m. Friday, 6/29 and 1:00 
a.m. Monday, 7/2.
This could be the same as bug 75310.  The buffer handles are refcounted...
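If that's right, the failure mode would look roughly like the sketch below.  The
names are made up for illustration and are not the actual string buffer code; the
point is just that with refcounted handles, one code path that forgets its matching
release keeps the count above zero forever, so the storage is never returned:

#include <cstdio>
#include <cstdlib>
#include <cstring>

// Illustrative refcounted buffer handle (not the real implementation).
struct SharedBuffer {
  int   mRefCount;
  char* mData;
};

SharedBuffer* NewSharedBuffer(const char* aText) {
  SharedBuffer* buf = static_cast<SharedBuffer*>(std::malloc(sizeof(SharedBuffer)));
  buf->mRefCount = 1;               // the creator holds the first reference
  buf->mData = static_cast<char*>(std::malloc(std::strlen(aText) + 1));
  std::strcpy(buf->mData, aText);
  return buf;
}

void AddRefBuffer(SharedBuffer* aBuf) { ++aBuf->mRefCount; }

void ReleaseBuffer(SharedBuffer* aBuf) {
  if (--aBuf->mRefCount == 0) {     // last handle frees the storage
    std::free(aBuf->mData);
    std::free(aBuf);
  }
}

int main() {
  SharedBuffer* buf = NewSharedBuffer("hello");
  AddRefBuffer(buf);      // a second handle takes a reference
  ReleaseBuffer(buf);     // ...and gives it back

  // The bug being illustrated: the original owner never makes its matching
  // ReleaseBuffer(buf) call, so mRefCount stays at 1 and both allocations
  // leak -- once per call site, exactly the "many small leaks from the same
  // stack" shape that shows up in the tool output.
  return 0;
}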
Instructions for hunting down leaks:

1) get access to the machine `sweetums` (either be on it directly, or rlogin)

2) make sure you have access to my output in  /u/thesteve/a.out.out.results

The format of this file is a series of entries like:
    7 instances (for a total of      28756 bytes) of:
    <the call stack, which I'll talk more about later>

"7 instances" means that in the raw leak output (which my tool processed), there
are 7 leaks which happened from this exact same call stack, which means the exact
same place in the code.

3) from here on, leak hunting becomes more of an art. You can take some
combination of top-down and bottom-up in your approach to finding the major
offenders. From the top down: look at the whole file and try to see where the
major offenders are, and what the distribution of the leak sites is.

You can find the major leak sites by doing:
`grep -n instance /u/thesteve/a.out.out.results`, which will return 1404 lines.
Since this is a lot, you can find the high runners by doing:

grep -n instance /u/thesteve/a.out.out.results  | awk '$2 >= 300 {print}'

for example, to find those sites with 300 (pick a number) or more instances.
Or, you can do the same thing on the $8 field, to filter by the total amount of
memory leaked.

Once you've identified the ones you're really interested in, start looking at the
call stacks of interest to see if you can identify the offending functions. This
is an art. Usually, if you peruse the top of the call stacks (in the sense of
furthest away from "main") you may see functions in common among the large sites
of interest. By the way, the format of each call stack element is:

function [library +hex_offset]

Usually, the very top of the call stack is a call to PR_Malloc, or some similar
function which does the actual allocation, and is seldom the problem itself. The
real problem is usually further down, but not too much further down, where someone
allocated memory but then forgot to free it (see the small sketch after these
instructions).

4) you can look at actual source, by using the `addr2line` tool on LINUX.
For instance, for the call stack element:
PL_HandleEvent[./libxpcom +0xCC484]

`addr2line -e ./libxpcom +0xCC484`
will return a source file and line number, which you can then look at.

You must be in the correct directory (the ...dist/bin directory of the build on
sweetums) for this to work.
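To make the earlier point about PR_Malloc concrete, here is a trivial sketch
(made-up function names, not code from the tree): the low-level allocator at the
top of the stack is just doing its job; the leak belongs to the caller a few
frames down that drops the pointer.

#include <cstdlib>

// Stand-in for PR_Malloc or any similar low-level allocator.  This is what
// shows up at the very top of the leak stacks, but it is rarely the bug.
static void* MyAlloc(std::size_t aSize) {
  return std::malloc(aSize);
}

// The real offender is usually a caller further down the stack that obtains
// memory and then loses track of it.
static void ProcessEvent() {
  void* scratch = MyAlloc(64);
  (void)scratch;            // BUG: no matching std::free(scratch) before returning
}

int main() {
  ProcessEvent();           // every call leaks 64 bytes from the same stack
  return 0;
}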
Can you attach the test code to this bug?  I am outside the firewall.
Status: NEW → ASSIGNED
I checked in what I think is the fix.  I'll wait for some verification from the
GC leak tests before I mark the bug fixed.
I turned off my build last night to preserve the environment for the data we'd 
collect--I'm working with a limited amount of disk space--so I don't have a 
build to test today.  I'll have test results first thing in the morning.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
I'm marking this bug fixed.  My tests do work today against the offending URL.  
However, I watched the URLs loading and noted that this one refreshed VERY much 
more slowly than the ones preceding and following it.  Do you suppose that is just 
because of the international character support, or should a different bug be 
opened on this account?  I'm hesitant to open the bug myself since this is an 
entirely subjective observation.
Component: String → XPCOM