Closed Bug 41172 Opened 25 years ago Closed 25 years ago

leak the world via an InterceptStreamListener

Categories

(Core :: Networking: Cache, defect, P4)

x86
Linux
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: dbaron, Assigned: dbaron)

References

()

Details

(Keywords: memory-leak, Whiteboard: [nsbeta2-] partial fix attached)

Attachments

(1 file)

DESCRIPTION: My homepage (and probably other pages with linked stylesheets) recently started leaking about 250K. The chain of leaks was: everything else nsParser (if you leak the parser you leak *everything*) SheetLoadData nsStreamLoader nsHTTPFinalListener InterceptStreamListener The leak occurred in nsCachedNetData::InterceptAsyncRead because interceptListener->Init returned null, and therefore the functioned returned without ever addrefing or releasing interceptListener, which caused interceptListener (an InterceptStreamListener) to leak. I will attach the fix in a minute. However, this raises some slightly more important questions: * why does Init() fail, causing the leak? * does the ownership model really need to make a small leak turn into something this big? [ cc:ing attinasi on this issue ] STEPS TO REPRODUCE: * setenv XPCOM_MEM_LEAK_LOG 1 * ./mozilla http://www.people.fas.harvard.edu/~dbaron/ ACTUAL RESULTS: * leak about 250K EXPECTED RESULTS: * leak about 1.5K ADDITIONAL INFORMATION: Nominating for nsbeta2 because this is a big leak (250K an occurrence) with an easy fix. I'm not sure how many pages this happens on, but it may be as many as any page with a linked stylesheet (although I'm not sure).
Whiteboard: fix attached
When I said "returned null" above, I meant "returned a failure result" (so NS_FAILED(rv) was true).
Severity: normal → major
[nsbeta2+] [6/15]
Whiteboard: fix attached → [nsbeta2+] [6/15] fix attached
I could check this in if I can get 1) a review, and 2) approval. Who should review this?
r=gordon
Oops... further testing revealed that this patch causes a strange bug: if I load certain pages (for example, the 3 of the first 5 items in the QA menu that end in .html (rather than /)), they only load once, and if I try to come back to them again (without clearing the cache), they won't load: Document: Done (0.265 secs) Error loading URL http://www.mozilla.org/quality/precheckin-tests.html This continues to happen with the patch removed, until I clear the cache, but the problem is only created when the patch is in. If someone has any idea why this is happening, I'd appreciate some help. I'm not much of an expert on the cache code.
Whiteboard: [nsbeta2+] [6/15] fix attached → [nsbeta2+] [6/15] partial fix attached
I applied the patch to my local tree and I do not see the problem. I can load http://www.mozilla.org/quality/precheckin-tests.html and the first 3 of the 3 items in the QA menu that end in .html again and again. Neeti
Also, in a new build from today( without the patch) , I do not see InterceptStreamListener leaking, when I do ./mozilla http://www.people.fas.harvard.edu/~dbaron/ 75 InterceptStreamListener 36 0 7 0 ( 0.93 +/- 0.73) 42 0 ( 1.93 +/- 0.92) Neeti
What probably happened is that init is not failing either: * for you * anymore. There are still two bugs here. First, we leak when Init fails. Second, the early release caused by Init()'s failure, possibly combined with whatever caused that failure, seriously broke the cache.
I don't see the leak anymore either. However, I'm still somewhat afraid to fix the underlying bug without knowing why the cache broke so badly when it was fixed and something else was failing, and didn't break as badly when it leaked. However, I think this should be fixed (but I can see not doing it for beta2).
*** Bug 40321 has been marked as a duplicate of this bug. ***
Cleaning the status whiteboard by marking this beta2 minus. (6/15 has passed) I'm confused by the comments going back and forth... is the leak still as bad? ...or only when the init fails? Could someone summarize severity and status of this bug? Thanks, Jim
Whiteboard: [nsbeta2+] [6/15] partial fix attached → [nsbeta2-] partial fix attached
The answer to the above is that this is very bad bug. Sure we only see it if the cache become corrupted, but once we get there the combination of the 'not checking return value' in the http channel code (described in my dup report) and the cache manager returning a substitute stream listener in a screwed up state on Init error can result in a crash. (I have a cache directory on my box where attempting to visit yahoo will crash the browser every time due to this bug.) The patch above is basicly the same as the patch in my dup report. The only diff is that I use a smart pointer. Also I provide a patch for the http channel code to fix the 'ignore the return code and just look for a non-null stream' which is the first step to a crash. --------- gagan, would you review my two patches on bug #40321? If you give the OK, I'll check them in.
I prefer my patch for nsCachedNetData.cpp, because XPCOM objects should never be used without being |AddRef|ed, i.e., with a refcount of 0 (and similarly they should not be |delete|d, only |Release|d). Were |Init| to call something that |AddRef|ed and |Release|d |interceptStreamListener|, it could be deleted immediately and the function could crash when then trying to |AddRef| a pointer to garbage. Do you know that your second patch fixes the other problems I described?
Actually, your second patch won't change anything (although it is a good patch), since *aResult was never filled in unless there was a successful return value, and nsCOMPtrs are initialized to null. So, do we want to check this in despite that it has been known to cause stranger problems?
hmm... yea, i buy your argument for nsCachedNetData.cpp. And now that I look at the channel code again it does look like my channel patch doesn't do change anything (other then I think it's bad form to not use the return value). Although, the strange behaviour I reported only happened with unpatched code. Now I'm not so sure why I crash with a screwed up cache. So how about we get a review (ok to check-in) for David's patch?
looks good to me. r=waterson
David's patch is checked in.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Reopening and...
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
... assigning to myself, since there are 2 other issues I wanted to investigate related to this bug.
Assignee: gordon → dbaron
Status: REOPENED → NEW
Keywords: nsbeta3
Since this is confusing bug triage, marking FIXED again. I'll try to investigate making the ownership model less risky without keeping this bug open as a reminder to do so.
Status: NEW → RESOLVED
Closed: 25 years ago25 years ago
Resolution: --- → FIXED
This evening I checked in a fix that I believe would have broken the chain of leaks between the SheetLoadData and the nsParser. We now release the nsParser from the SheetLoadData once it has been unblocked, rather than holding on until the SheetLoadData goes away. The cache has changed so much since this bug was filed that the original issue preventing me from checking in the fix may no longer be a problem. At least I hope so.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: