Closed
Bug 472773
Opened 16 years ago
Closed 16 years ago
[MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally ["runtests-leaks | leaked 68 bytes during test execution"]
Categories
(Core :: Widget: Cocoa, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.2a1
Tracking | Status | |
---|---|---|
status1.9.1 | --- | .4-fixed |
People
(Reporter: bent.mozilla, Assigned: smichaud)
References
()
Details
(Keywords: intermittent-failure, memory-leak, meta, Whiteboard: [qaw: check/dupe 'URL' bug list] )
Attachments
(2 files, 6 obsolete files)
2.57 KB,
text/plain
|
Details | |
7.97 KB,
patch
|
jaas
:
review+
beltzner
:
approval1.9.1.4+
|
Details | Diff | Splinter Review |
Often in my debugging I see a single nsBaseAppShell and nsRunnable instance leak. This seems to happen more frequently when I quit the browser soon after startup but I've seen it in many other cases as well. The nsRunnable is strongly owned by the nsBaseAppShell ('mDummyEvent'), so the nsBaseAppShell is the real leak. I finally got a leak log of the nsBaseAppShell leak and noticed this:
nsAppShell::ScheduleNativeEventCallback() 44
nsAppShell::ProcessGeckoEvents(void*) -43
The log indicates that there were 44 AddRefs to the app shell but only 43 Releases. The code in question assumes that there will be a 1:1 relationship between calls to ScheduleNativeEventCallback and ProcessGeckoEvents, but that appears to not be true in all cases.
Attached is a patch which fixes the leak for me (at least I haven't been able to reproduce it since I applied this patch) but I don't know if there is a better way or if this patch will hurt performance.
This leak itself is not a big deal, really. It does, however, cause random tinderbox orange and it also causes lots of duplicate bug reports to be filed in topsite leak analyses.
Attachment #356107 -
Flags: review?(smichaud)
Assignee | ||
Updated•16 years ago
|
Attachment #356107 -
Flags: review?(smichaud) → review+
Assignee | ||
Comment 1•16 years ago
|
||
Comment on attachment 356107 [details] [diff] [review]
Patch, v1
Your patch's logic seems impeccable, and I can't foresee it doing any
harm.
But I think we should let this bake on the trunk for a while before
landing it on any of the branches.
Reporter | ||
Comment 2•16 years ago
|
||
Comment on attachment 356107 [details] [diff] [review]
Patch, v1
Actually, I meant to obsolete this before.
I agree that the logic seems correct, but I have occasionally seen a double free with this patch on shutdown... Something is weird here.
Attachment #356107 -
Attachment is obsolete: true
Assignee | ||
Comment 3•16 years ago
|
||
I've always wondered if the OS could call
nsAppShell::ProcessGeckoEvents() without having been told to do so (by
the calls to CFRunLoopSourceSignal() and CFRunLoopWakeUp()) in
nsAppShell::ScheduleNativeEventCallback().
Maybe the problem is that it can.
Assignee | ||
Comment 4•16 years ago
|
||
Ben, here's a hunch:
There's code in the nsAppShell destructor that removes
mCFRunLoopSource from mCFRunLoop, then releases mCFRunLoopSource.
Try commenting that out and see what happens. I suspect that code
might trigger one last call to nsAppShell::ProcessGeckoEvents().
Assignee | ||
Comment 5•16 years ago
|
||
Or maybe you could call CFRunLoopSourceInvalidate() on
mCFRunLoopSource before removing it from mCFRunLoop.
Reporter | ||
Comment 6•16 years ago
|
||
Thanks! I'll give it a try.
Assignee | ||
Comment 7•16 years ago
|
||
> I suspect that code might trigger one last call to
> nsAppShell::ProcessGeckoEvents().
Or (more likely) maybe it can (sometimes) trigger all pending calls to
nsAppShell::ProcessGeckoEvents().
Comment 8•16 years ago
|
||
Wow, there is a bunch of duplicates or at least related bugs...
Updated•16 years ago
|
Comment 9•16 years ago
|
||
Assuming for now that there's only one cause to this leak:
{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238711840.1238715491.11899.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/04/02 15:37:20
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes
TinderboxPrint: mochitest-plain<br/>76138/0/1541 <em class="testfail">LEAK</em>
}
Reporter | ||
Comment 10•16 years ago
|
||
I think any leak on mac of a single nsBaseAppShell and a single nsRunnable should be duped to this bug and the failing test should not be fingered. This is easy enough to reproduce.
Updated•16 years ago
|
Flags: blocking1.9.1?
Updated•16 years ago
|
Flags: blocking1.9.1? → blocking1.9.1-
Updated•16 years ago
|
Flags: wanted1.9.1?
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240219838.1240230963.23529.gz
OS X 10.5.2 mozilla-central unit test on 2009/04/20 02:30:38
Comment 12•16 years ago
|
||
(In reply to comment #11)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240219838.1240230963.23529.gz
> OS X 10.5.2 mozilla-central unit test on 2009/04/20 02:30:38
This case was 'reftest'.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240781852.1240784557.2880.gz&fulltext=1
OS X 10.5.2 mozilla-central unit test on 2009/04/26 14:37:32
This is also reported a lot (at "random") on 'xpcshell-tests'.
Depends on: 469523
Comment 13•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242133704.1242138142.11650.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/12 06:08:24
Assignee | ||
Comment 14•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242401721.1242404593.30000.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/15 08:35:21
Comment 15•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242461254.1242466169.10039.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/16 01:07:34
Summary: nsBaseAppShell and nsRunnable leak occasionally on OS X → [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally
Comment 16•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1242630739.1242636130.8550.gz
OS X 10.4 comm-central unit test on 2009/05/18 00:12:19
Comment 17•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1242648001.1242653410.7920.gz
OS X 10.4 comm-central unit test on 2009/05/18 05:00:01
Comment 18•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1242634221.1242641434.17806.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/18 01:10:21
Comment 19•16 years ago
|
||
I ran into this bug while trying to get bug 454921 ready to land. I'm pretty convinced that the leak I was seeing there is in fact this bug. I'm attaching (and will immediately obsolete) the leak logs and balance tree for nsBaseAppShell I created when trying to work out what was going on, just in case any further analysis is needed.
Updated•16 years ago
|
Attachment #378226 -
Attachment is obsolete: true
Assignee | ||
Comment 20•16 years ago
|
||
> bug 454921
Is this the right bug number?
Assignee | ||
Comment 21•16 years ago
|
||
I've found a trivially easy way to make an nsAppShell/nsBaseAppShell
object leak: This happens every time FF relaunches as it starts up
(which in turn happens every time you run FF with a fresh profile, or
if the copy of FF you ran previously had a different major version).
Could *this* be the source of our "random" oranges?
Tell me it isn't so :-)
Assignee | ||
Comment 22•16 years ago
|
||
> I've found a trivially easy way to make an nsAppShell/nsBaseAppShell
> object leak: ...
And now I've stopped being able to reproduce it. Sigh :-(
I'll keep digging.
Comment 23•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242820340.1242825407.30878.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/20 04:52:20
Comment 24•16 years ago
|
||
5/20
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242832092.1242836949.27265.gz
Comment 25•16 years ago
|
||
I can see this leak consistently on some of the xpcshell tests when run in a debug build, but I haven't had time to investigate it. That would probably be a fruitful avenue for attack given that xpcshell tests are pretty minimal (of course no guarantees it'd be the same problem).
Assignee | ||
Comment 26•16 years ago
|
||
I'm making progress on this.
I find it pretty easy to reproduce ... though I don't have formal STR.
I hope to post a patch tomorrow.
Comment 27•16 years ago
|
||
Assignee | ||
Comment 28•16 years ago
|
||
Here's the patch I've been working on.
As best I can tell, it completely resolves the problem that Ben
reported in comment #0 (the leak of an nsBaseAppShell object that
occurs when a call is (occasionally) made to
ScheduleNativeEventCallback() that isn't matched by a call to
ProcessGeckoEvents()).
It also resolves (in cross-platform code in nsBaseAppShell::Observe())
an additional leak that often happens (at least on OS X) when the
browser is relaunched.
But, like Ben (in comment #2), I've seen two or three cases where a
double-free happens on shutdown. I'm not able to reproduce this
reliably -- so I haven't been able to track down the cause. But I
strongly suspect it isn't due to the code that (in Ben's patch and
mine) compensates for "extra" calls to ScheduleNativeEventCallback().
In other words I suspect it's a cross-platform problem -- one that
also occurs at least occasionally on other platforms.
So I think this patch needs a fair amount of testing before it can be
landed ... or at least landed permanently.
If nothing else works, I think we should land this on the trunk and
let it bake there a while, to see if our automated tests turn up a way
to reproduce the double-free problem.
But maybe one of the people cc-ed here can come up with a way to
torture test this patch short of actually landing it on the trunk.
Assignee | ||
Comment 29•16 years ago
|
||
All the double-frees I've seen happened on quitting the browser after
having run the Crash Tests (-reftest
testing/crashtest/crashtests.list).
There were always two of them -- for the nsRunnable object owned by an
nsBaseAppShell object, and then for the nsBaseAppShell object itself.
They always happened at the last point in the browser where an
nsBaseAppShell object is released (when the ScopedXPCOMStartup object
near the end of XRE_main() (in nsAppRunner.cpp) goes out of scope).
So the refcount for the nsBaseAppShell object was always exactly '1'
too small.
(I know all this because one of the double-frees happened while I was
running the crashtests in gdb.)
Oddly, the double-frees weren't actually called double-frees. Instead
malloc displayed a "non-aligned pointer" error message for each of
them:
firefox-bin(9821,0xa0098720) malloc: \
*** error for object 0x1e334af0: Non-aligned pointer being freed (2)
*** set a breakpoint in malloc_error_break to debug
firefox-bin(9821,0xa0098720) malloc: \
*** error for object 0x3d0dd30: Non-aligned pointer being freed (2)
*** set a breakpoint in malloc_error_break to debug
This message is supposed to indicate an attempt to free a memory block
that isn't aligned on a 16-byte boundary (and thereby to indicate a
corrupt or uninitialized pointer).
But note that both addresses *are* properly aligned (they both end in
'0'). And when I deliberately added an extra Release() to
nsAppShell::Exit(), I kept seeing the same messages (at the same
location in the browser).
So these messages really do indicate double-frees.
Side note: To break on malloc's "non-aligned pointer" messages (in
gdb), you need to break on 'malloc_printf' (*not* on
'malloc_error_break').
Assignee | ||
Comment 30•16 years ago
|
||
I "reconstructed" these stacks by deliberately adding an extra
Release() to nsAppShell::Exit(), then running the browser in gdb.
Assignee | ||
Updated•16 years ago
|
Attachment #378952 -
Flags: review?(joshmoz)
Assignee | ||
Updated•16 years ago
|
Attachment #378952 -
Flags: review?(bent.mozilla)
Comment 31•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242991708.1242994615.28544.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 04:28:28
Comment 32•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243005817.1243011072.370.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 08:23:37
Updated•16 years ago
|
Summary: [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally → [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally ["runtests-leaks | leaked 68 bytes during test execution"]
Comment 33•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243019296.1243022355.25114.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 12:08:16
Comment 34•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243030778.1243033851.16590.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 15:19:38
Comment 35•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1243211725.1243217093.17299.gz
OS X 10.4 comm-central unit test on 2009/05/24 17:35:25
Comment 36•16 years ago
|
||
Comment on attachment 378952 [details] [diff] [review]
Possible patch
A minor shutdown leak for a crash doesn't seem like a good trade, however annoying it might be on tbox. Seems like we shouldn't take a patch until we also have the crash figured out.
Assignee | ||
Comment 37•16 years ago
|
||
(In reply to comment #36)
Actually, the double-frees don't cause crashes -- not the ones I saw,
nor (I suspect) the ones Ben saw.
But I'll spend some more time trying to track them down.
Comment 38•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243378847.1243383129.27518.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/26 16:00:47
Comment 39•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243388041.1243391150.7914.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/26 18:34:01
Assignee | ||
Updated•16 years ago
|
Attachment #378952 -
Flags: review?(joshmoz)
Attachment #378952 -
Flags: review?(bent.mozilla)
Updated•16 years ago
|
Whiteboard: [orange] → [No need for more logs, atm] [orange]
Assignee | ||
Comment 40•16 years ago
|
||
Here's a revision of my patch from comment #28.
I've done several things to clean it up (for example we should use
PR_Atomic... on mNativeEventScheduledDepth, because it can be
set/changed from different threads).
I've also added code to deal with the possibility that the OS can call
ProcessGeckoEvents() "spontaneously" (without the browser first having
called ScheduleNativeEventCallback()). This *might* be the cause of
the "Non-aligned pointer" errors I saw -- and if so my change will fix
them.
I haven't seen these errors again, despite considerable effort.
And I've now looked at every single place in the tree where an
nsAppShell object can be AddRefed or Released. I didn't find one
that looked wrong.
So I think this patch is the best we can do ... at least for the time
being.
Once it's reviewed, I hope to land it on the trunk, and let it bake
there for a week or two. We can see what it turns up, and on that
basis decide what to do next.
(By the way, I did a tryserver build of my patch and had several
failures. But they all seem to have been spurious.)
Attachment #378952 -
Attachment is obsolete: true
Attachment #380274 -
Flags: review?(joshmoz)
Assignee | ||
Updated•16 years ago
|
Attachment #380274 -
Flags: review?(bent.mozilla)
Comment 41•16 years ago
|
||
Assignee | ||
Comment 42•16 years ago
|
||
For what it's worth, here's the list I compiled of places in the tree
where an nsAppShell object can be AddRefed or Released (not including
platform-specific code):
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsAppShellSingleton.h#l72
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l588
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsBaseAppShell.cpp#l77
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l50
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsBaseAppShell.cpp#l82
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1596
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1900
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1928
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1934
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111
Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1936
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111
AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l125
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995
Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l130
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995
Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1748
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995
Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsAppShellSingleton.h#l86
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995
Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.h#l75
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l998
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/content/base/src/nsContentSink.cpp#l1565
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/docshell/base/nsDocShell.cpp#l257
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l856
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l1967
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l3266
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l366
AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l495
Attachment #380274 -
Flags: review?(joshmoz) → review-
Comment 43•16 years ago
|
||
Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)
>--- a/widget/src/xpwidgets/nsBaseAppShell.cpp
>+++ b/widget/src/xpwidgets/nsBaseAppShell.cpp
>@@ -315,15 +315,30 @@ nsBaseAppShell::OnProcessNextEvent(nsITh
> NS_IMETHODIMP
> nsBaseAppShell::Observe(nsISupports *subject, const char *topic,
> const PRUnichar *data)
> {
> NS_ASSERTION(!strcmp(topic, NS_XPCOM_SHUTDOWN_OBSERVER_ID), "oops");
>+
>+ // The observer service and current thread both hold strong references
>+ // to us, which don't always get released unless we explicitly stop
>+ // observing (especially when the browser is relaunched).
I don't think we should make these changes to the base appshell. As far as I can tell there is no circumstance under which this should be happening and all the new code here will do is cover up leaks in other code.
I think you should remove this part of the patch and file a bug about whatever events/observations prompted you to add this code.
Comment 44•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1244465722.1244468642.29853.gz&fulltext=1
(I cleared the whiteboard request for no more logs - I apologize for the noise it causes, but pasting the logs not only gives us frequency data that may or may not be useful, it also keeps the bug live in the minds of people who can solve it. This is something that frequently puts the tree in a bad state - either committers respect the orange and don't check in, or they learn to ignore the orange - both bad outcomes.
The best way to cut down on link spam is to fix it. Which it looks like Josh and smichaud are making progress on, thanks guys!)
Whiteboard: [No need for more logs, atm] [orange] → [orange]
Reporter | ||
Comment 45•16 years ago
|
||
Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)
Steven, do you want to post a new patch? Or do you want my review on this one?
Assignee | ||
Comment 46•16 years ago
|
||
Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)
> Steven, do you want to post a new patch? Or do you want my review on
> this one?
I'll post a new one tomorrow, so don't bother reviewing this one.
I'm trying to deal with Josh's objection to the part of my patch that
changes nsBaseAppShell::Observe(). Making no changes there means that
the appshell will leak almost 100% of the time on restart.
For a while I thought I was close to finding the true source(s) of
that leak. Now I'm no longer so sure ... but I intend to keep trying
for little longer.
One way or another, though, I'll post something for review tomorrow.
Attachment #380274 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 47•16 years ago
|
||
> One way or another, though, I'll post something for review tomorrow.
Looks like I won't make it. Tomorrow for sure!
Comment 48•16 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1244722045.1244726163.8605.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/06/11 05:07:25
Assignee | ||
Comment 49•16 years ago
|
||
Attachment #380274 -
Attachment is obsolete: true
Attachment #382878 -
Flags: superreview?(bent.mozilla)
Attachment #382878 -
Flags: review?(joshmoz)
Assignee | ||
Comment 50•16 years ago
|
||
> I don't think we should make these changes to the base appshell. As
> far as I can tell there is no circumstance under which this should
> be happening and all the new code here will do is cover up leaks in
> other code.
If we don't change nsBaseAppShell::Observe(), the appShell will leak
almost 100% of the time on relaunch.
I've tracked down (as best I can) the true "cause" of this leak (see
bug 497745). But I don't it's currently practical to fix it "at the
source". So if we're going to fix this leak at all (the one on
relaunch), I think it's best to do it here.
I did find out, though, that I didn't need to explicitly remove the
appShell from the observer service.
Otherwise this patch is the same as v2.1.
We might be able to get away with *not* fixing the leak on relaunch --
I suspect it doesn't have any effect on tinderbox builds. But my fix
for it is very simple and effective, and I really don't see anything
wrong with it.
Assignee | ||
Updated•16 years ago
|
Attachment #382878 -
Flags: superreview?(bent.mozilla) → review?(bent.mozilla)
Assignee | ||
Comment 51•16 years ago
|
||
Comment on attachment 382878 [details] [diff] [review]
Patch v2.2
Oops, Ben. I just meant to ask you for a review.
Comment 52•16 years ago
|
||
Comment on attachment 382878 [details] [diff] [review]
Patch v2.2
I can't r+ the base appshell code unless we decide that it is the correct way to fix the leak. I don't want to cover that leak up, I'd rather have the leak and improve our chances of fixing it the right way in the future.
Attachment #382878 -
Flags: review?(joshmoz) → review-
Assignee | ||
Comment 53•16 years ago
|
||
> I don't want to cover that leak up
We wouldn't be covering it up -- I've already identified its source
(in bug 497745). And I suspect it will be at least a year before that
bug is fixed (fixed in a better way than attachment 382873 [details] [diff] [review]).
But, as I said above, we probably can get away with not fixing it. So
here's a patch with the offending code removed.
Attachment #382878 -
Attachment is obsolete: true
Attachment #382962 -
Flags: review?(joshmoz)
Attachment #382878 -
Flags: review?(bent.mozilla)
Assignee | ||
Updated•16 years ago
|
Attachment #382962 -
Flags: review?(bent.mozilla)
Reporter | ||
Comment 54•16 years ago
|
||
Actually, I think it might make sense for a thread to auto-release its observer once it's shutdown. That might solve the appshell case and others. I'd have to look more closely, but that sounds like a smart move to me.
Assignee | ||
Comment 55•16 years ago
|
||
Does XPCOM have an "autorelease"? :-)
Reporter | ||
Comment 56•16 years ago
|
||
I filed bug 498010 to get the thread observers released at shutdown.
Reporter | ||
Comment 57•16 years ago
|
||
Comment on attachment 382962 [details] [diff] [review]
Patch v2.3
>diff --git a/widget/src/cocoa/nsAppShell.h b/widget/src/cocoa/nsAppShell.h
> ...
>+ PRInt32 mNativeEventScheduledDepth;
Might want to indicate here with a comment that this int is only modified atomically
>+ // As best I can tell, every call to ProcessGeckoEvents() is triggered
I'm not a fan of using the first person in code, but that's my personal opinion.
>+ if (!self->mTerminated) {
As long as we're doing an else clause I'd prefer to reverse the !mTerminated test and then switch the if/else clauses around. Double-negatives are tough to read ;)
> NS_ADDREF_THIS();
>+ PR_AtomicIncrement(&mNativeEventScheduledDepth);
So, if a background thread makes it through the addref but then gets swapped out and the main thread runs through ProcessGeckoEvents we'll be left with an extra ref. It looks like we'll catch that in Exit(), but maybe put a comment here explaining that?
>@@ -759,24 +792,38 @@ nsAppShell::Exit(void)
> ...
>+ if (!mNativeEventCallbackDepth) {
I'd do this just to make it simpler:
if (!mNativeEventCallbackDepth && mNativeEventScheduledDepth) {
PRInt32 releaseCount = PR_AtomicSet(&mNativeEventScheduledDepth, 0);
while (releaseCount-- > 0)
NS_RELEASE_THIS();
}
Attachment #382962 -
Flags: review?(bent.mozilla) → review+
Assignee | ||
Comment 58•16 years ago
|
||
Attachment #382962 -
Attachment is obsolete: true
Attachment #383361 -
Flags: review?(joshmoz)
Attachment #382962 -
Flags: review?(joshmoz)
Assignee | ||
Comment 59•16 years ago
|
||
Here's a new revision that (mostly) follow's Ben's suggestions. It
also rewrites a bunch of comments, and hopefully makes them easier to
understand.
>> NS_ADDREF_THIS();
>>+ PR_AtomicIncrement(&mNativeEventScheduledDepth);
>
> So, if a background thread makes it through the addref but then gets
> swapped out and the main thread runs through ProcessGeckoEvents
> we'll be left with an extra ref. It looks like we'll catch that in
> Exit(), but maybe put a comment here explaining that?
I'm not sure what you're after here. But I think my comments in
ProcessGeckoEvents() and Exit() cover all the bases (or at least now
they do). If I'm right about that, saying more here wouldn't add
anything, and might actually be confusing.
Attachment #383361 -
Flags: review?(joshmoz) → review+
Assignee | ||
Comment 60•16 years ago
|
||
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]
Landed on trunk (mozilla-central):
http://hg.mozilla.org/mozilla-central/rev/e10f24426a52
This doesn't (of course) fix the leaks on relaunch. But the patch for
bug 498010 does work around them, and should be landed soon.
We should let this bake on the trunk for a while, then (if all goes
well) land it on the 1.9.1 branch.
Assignee | ||
Updated•16 years ago
|
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated•16 years ago
|
Flags: blocking1.9.2?
Keywords: qawanted
Whiteboard: [orange] → [qaw: check/dupe 'URL' bug list] [needs 1.9.1 landing: needs approval] [orange]
Target Milestone: --- → mozilla1.9.2a1
Assignee | ||
Updated•16 years ago
|
Attachment #383361 -
Flags: approval1.9.1?
Comment 62•16 years ago
|
||
FWIW, we've hit this bug on 5 out of the last 8 cycles on the OS X 10.5.2 mozilla-1.9.1 unit test tinderbox.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1246466277.1246469141.13765.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1246445380.1246448556.7797.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1246437476.1246443016.8652.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1246430276.1246436044.29238.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1246415879.1246418937.8307.gz
Updated•16 years ago
|
Attachment #383361 -
Flags: approval1.9.1? → approval1.9.1.1?
Updated•16 years ago
|
Flags: wanted1.9.1?
Comment 63•16 years ago
|
||
Comment 64•16 years ago
|
||
This is still failing more than 50% of the time on the Mac OS X Firefox3.5 tinderbox. (not pasting all the failure logs here to reduce noise)
Updated•16 years ago
|
Attachment #383361 -
Flags: approval1.9.1.1? → approval1.9.1.2?
Updated•15 years ago
|
Attachment #383361 -
Flags: approval1.9.1.2? → approval1.9.1.3?
Updated•15 years ago
|
status1.9.1:
--- → ?
Comment 65•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1248708363.1248714985.3480.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/07/27 08:26:03
Comment 66•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5-Unittest/1248718968.1248720070.30356.gz
OS X 10.5.2 mozilla-1.9.1 test everythingelse on 2009/07/27 11:22:48
Comment 67•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1249733048.1249736004.23356.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/08 05:04:08
Comment 68•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1249741029.1249747376.25347.gz
OS X 10.4 comm-central unit test on 2009/08/08 07:17:09
Comment 69•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1249928878.1249931813.16502.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/10 11:27:58
Comment 70•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1250497469.1250500428.24934.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/17 01:24:29
Comment 71•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5-Unittest/1250839644.1250841119.19907.gz
OS X 10.5.2 mozilla-1.9.1 test everythingelse on 2009/08/21 00:27:24
Comment 72•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1250874917.1250877894.2580.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/21 10:15:17
Please !
Comment 73•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1251062935.1251068432.3946.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/23 14:28:55
Comment 74•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1251082602.1251089180.27973.gz
OS X 10.4 comm-central unit test on 2009/08/23 19:56:42
Comment 75•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1251293945.1251297074.32573.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/26 06:39:05
Comment 76•15 years ago
|
||
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]
Please...
Attachment #383361 -
Flags: approval1.9.1.3? → approval1.9.1.4?
Comment 77•15 years ago
|
||
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1251384388.1251391134.29854.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/27 07:46:28
Comment 78•15 years ago
|
||
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]
a1914=beltzner
Attachment #383361 -
Flags: approval1.9.1.4? → approval1.9.1.4+
Comment 79•15 years ago
|
||
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/496510e1d755
Attachment #383361 -
Attachment description: Patch v2.4 (follow Ben's suggestions) → Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]
Updated•15 years ago
|
Whiteboard: [qaw: check/dupe 'URL' bug list] [needs 1.9.1 landing: needs approval] [orange] → [qaw: check/dupe 'URL' bug list] [orange]
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [qaw: check/dupe 'URL' bug list] [orange] → [qaw: check/dupe 'URL' bug list]
Comment 80•10 years ago
|
||
Issue is Resolved - removing QA-Wanted Keywords - QA-Wanted query clean-up task
Keywords: qawanted
You need to log in
before you can comment on or make changes to this bug.
Description
•