[MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally ["runtests-leaks | leaked 68 bytes during test execution"]

RESOLVED FIXED in mozilla1.9.2a1

Status

()

RESOLVED FIXED
10 years ago
4 years ago

People

(Reporter: bent.mozilla, Assigned: smichaud)

Tracking

({intermittent-failure, memory-leak, meta})

Trunk
mozilla1.9.2a1
x86
Mac OS X
intermittent-failure, memory-leak, meta
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9.1 -

Firefox Tracking Flags

(status1.9.1 .4-fixed)

Details

(Whiteboard: [qaw: check/dupe 'URL' bug list] , URL)

Attachments

(2 attachments, 6 obsolete attachments)

Created attachment 356107 [details] [diff] [review]
Patch, v1

Often in my debugging I see a single nsBaseAppShell and nsRunnable instance leak. This seems to happen more frequently when I quit the browser soon after startup but I've seen it in many other cases as well. The nsRunnable is strongly owned by the nsBaseAppShell ('mDummyEvent'), so the nsBaseAppShell is the real leak. I finally got a leak log of the nsBaseAppShell leak and noticed this:

  nsAppShell::ScheduleNativeEventCallback() 44
  nsAppShell::ProcessGeckoEvents(void*) -43

The log indicates that there were 44 AddRefs to the app shell but only 43 Releases. The code in question assumes that there will be a 1:1 relationship between calls to ScheduleNativeEventCallback and ProcessGeckoEvents, but that appears to not be true in all cases.

Attached is a patch which fixes the leak for me (at least I haven't been able to reproduce it since I applied this patch) but I don't know if there is a better way or if this patch will hurt performance.

This leak itself is not a big deal, really. It does, however, cause random tinderbox orange and it also causes lots of duplicate bug reports to be filed in topsite leak analyses.
Attachment #356107 - Flags: review?(smichaud)
(Assignee)

Updated

10 years ago
Attachment #356107 - Flags: review?(smichaud) → review+
(Assignee)

Comment 1

10 years ago
Comment on attachment 356107 [details] [diff] [review]
Patch, v1

Your patch's logic seems impeccable, and I can't foresee it doing any
harm.

But I think we should let this bake on the trunk for a while before
landing it on any of the branches.
Comment on attachment 356107 [details] [diff] [review]
Patch, v1

Actually, I meant to obsolete this before.

I agree that the logic seems correct, but I have occasionally seen a double free with this patch on shutdown... Something is weird here.
Attachment #356107 - Attachment is obsolete: true
(Assignee)

Comment 3

10 years ago
I've always wondered if the OS could call
nsAppShell::ProcessGeckoEvents() without having been told to do so (by
the calls to CFRunLoopSourceSignal() and CFRunLoopWakeUp()) in
nsAppShell::ScheduleNativeEventCallback().

Maybe the problem is that it can.
(Assignee)

Comment 4

10 years ago
Ben, here's a hunch:

There's code in the nsAppShell destructor that removes
mCFRunLoopSource from mCFRunLoop, then releases mCFRunLoopSource.

Try commenting that out and see what happens.  I suspect that code
might trigger one last call to nsAppShell::ProcessGeckoEvents().
(Assignee)

Comment 5

10 years ago
Or maybe you could call CFRunLoopSourceInvalidate() on
mCFRunLoopSource before removing it from mCFRunLoop.
Thanks! I'll give it a try.
(Assignee)

Comment 7

10 years ago
> I suspect that code might trigger one last call to
> nsAppShell::ProcessGeckoEvents().

Or (more likely) maybe it can (sometimes) trigger all pending calls to
nsAppShell::ProcessGeckoEvents().
Wow, there is a bunch of duplicates or at least related bugs...
Blocks: 486146, 482598, 438871
Flags: blocking1.9.2?
Keywords: meta, mlk
Whiteboard: [orange]
Assuming for now that there's only one cause to this leak:

{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238711840.1238715491.11899.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/04/02 15:37:20

TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes

TinderboxPrint: mochitest-plain<br/>76138/0/1541&nbsp;<em class="testfail">LEAK</em>
}
I think any leak on mac of a single nsBaseAppShell and a single nsRunnable should be duped to this bug and the failing test should not be fingered. This is easy enough to reproduce.
Flags: blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1-
Flags: wanted1.9.1?
(In reply to comment #11)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240219838.1240230963.23529.gz
> OS X 10.5.2 mozilla-central unit test on 2009/04/20 02:30:38

This case was 'reftest'.


http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240781852.1240784557.2880.gz&fulltext=1
OS X 10.5.2 mozilla-central unit test on 2009/04/26 14:37:32

This is also reported a lot (at "random") on 'xpcshell-tests'.
Depends on: 469523

Updated

10 years ago
Assignee: bent.mozilla → smichaud
(Assignee)

Comment 14

10 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242401721.1242404593.30000.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/15 08:35:21
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242461254.1242466169.10039.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/16 01:07:34
Summary: nsBaseAppShell and nsRunnable leak occasionally on OS X → [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally
Created attachment 378226 [details]
leak logs from bug 454921

I ran into this bug while trying to get bug 454921 ready to land.  I'm pretty convinced that the leak I was seeing there is in fact this bug.  I'm attaching (and will immediately obsolete) the leak logs and balance tree for nsBaseAppShell I created when trying to work out what was going on, just in case any further analysis is needed.
Attachment #378226 - Attachment is obsolete: true
(Assignee)

Comment 20

10 years ago
> bug 454921

Is this the right bug number?
(Assignee)

Comment 21

10 years ago
I've found a trivially easy way to make an nsAppShell/nsBaseAppShell
object leak:  This happens every time FF relaunches as it starts up
(which in turn happens every time you run FF with a fresh profile, or
if the copy of FF you ran previously had a different major version).

Could *this* be the source of our "random" oranges?

Tell me it isn't so :-)
(Assignee)

Comment 22

10 years ago
> I've found a trivially easy way to make an nsAppShell/nsBaseAppShell
> object leak: ...

And now I've stopped being able to reproduce it.  Sigh :-(

I'll keep digging.

Comment 24

10 years ago
5/20

TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242832092.1242836949.27265.gz

Comment 25

10 years ago
I can see this leak consistently on some of the xpcshell tests when run in a debug build, but I haven't had time to investigate it.  That would probably be a fruitful avenue for attack given that xpcshell tests are pretty minimal (of course no guarantees it'd be the same problem).
(Assignee)

Comment 26

10 years ago
I'm making progress on this.

I find it pretty easy to reproduce ... though I don't have formal STR.

I hope to post a patch tomorrow.
(Assignee)

Comment 28

10 years ago
Created attachment 378952 [details] [diff] [review]
Possible patch

Here's the patch I've been working on.

As best I can tell, it completely resolves the problem that Ben
reported in comment #0 (the leak of an nsBaseAppShell object that
occurs when a call is (occasionally) made to
ScheduleNativeEventCallback() that isn't matched by a call to
ProcessGeckoEvents()).

It also resolves (in cross-platform code in nsBaseAppShell::Observe())
an additional leak that often happens (at least on OS X) when the
browser is relaunched.

But, like Ben (in comment #2), I've seen two or three cases where a
double-free happens on shutdown.  I'm not able to reproduce this
reliably -- so I haven't been able to track down the cause.  But I
strongly suspect it isn't due to the code that (in Ben's patch and
mine) compensates for "extra" calls to ScheduleNativeEventCallback().
In other words I suspect it's a cross-platform problem -- one that
also occurs at least occasionally on other platforms.

So I think this patch needs a fair amount of testing before it can be
landed ... or at least landed permanently.

If nothing else works, I think we should land this on the trunk and
let it bake there a while, to see if our automated tests turn up a way
to reproduce the double-free problem.

But maybe one of the people cc-ed here can come up with a way to
torture test this patch short of actually landing it on the trunk.
(Assignee)

Comment 29

10 years ago
All the double-frees I've seen happened on quitting the browser after
having run the Crash Tests (-reftest
testing/crashtest/crashtests.list).

There were always two of them -- for the nsRunnable object owned by an
nsBaseAppShell object, and then for the nsBaseAppShell object itself.
They always happened at the last point in the browser where an
nsBaseAppShell object is released (when the ScopedXPCOMStartup object
near the end of XRE_main() (in nsAppRunner.cpp) goes out of scope).
So the refcount for the nsBaseAppShell object was always exactly '1'
too small.

(I know all this because one of the double-frees happened while I was
running the crashtests in gdb.)

Oddly, the double-frees weren't actually called double-frees.  Instead
malloc displayed a "non-aligned pointer" error message for each of
them:

firefox-bin(9821,0xa0098720) malloc: \
  *** error for object 0x1e334af0: Non-aligned pointer being freed (2)
*** set a breakpoint in malloc_error_break to debug
firefox-bin(9821,0xa0098720) malloc: \
  *** error for object 0x3d0dd30: Non-aligned pointer being freed (2)
*** set a breakpoint in malloc_error_break to debug

This message is supposed to indicate an attempt to free a memory block
that isn't aligned on a 16-byte boundary (and thereby to indicate a
corrupt or uninitialized pointer).

But note that both addresses *are* properly aligned (they both end in
'0').  And when I deliberately added an extra Release() to
nsAppShell::Exit(), I kept seeing the same messages (at the same
location in the browser).

So these messages really do indicate double-frees.

Side note:  To break on malloc's "non-aligned pointer" messages (in
gdb), you need to break on 'malloc_printf' (*not* on
'malloc_error_break').
(Assignee)

Comment 30

10 years ago
Created attachment 378961 [details]
Reconstructed stacks for the two double-frees

I "reconstructed" these stacks by deliberately adding an extra
Release() to nsAppShell::Exit(), then running the browser in gdb.
(Assignee)

Updated

10 years ago
Attachment #378952 - Flags: review?(joshmoz)
(Assignee)

Updated

10 years ago
Attachment #378952 - Flags: review?(bent.mozilla)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243005817.1243011072.370.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 08:23:37
Summary: [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally → [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally ["runtests-leaks | leaked 68 bytes during test execution"]

Comment 36

10 years ago
Comment on attachment 378952 [details] [diff] [review]
Possible patch

A minor shutdown leak for a crash doesn't seem like a good trade, however annoying it might be on tbox. Seems like we shouldn't take a patch until we also have the crash figured out.
(Assignee)

Comment 37

10 years ago
(In reply to comment #36)

Actually, the double-frees don't cause crashes -- not the ones I saw,
nor (I suspect) the ones Ben saw.

But I'll spend some more time trying to track them down.
(Assignee)

Updated

10 years ago
Attachment #378952 - Flags: review?(joshmoz)
Attachment #378952 - Flags: review?(bent.mozilla)
Whiteboard: [orange] → [No need for more logs, atm] [orange]
(Assignee)

Comment 40

10 years ago
Created attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

Here's a revision of my patch from comment #28.

I've done several things to clean it up (for example we should use
PR_Atomic... on mNativeEventScheduledDepth, because it can be
set/changed from different threads).

I've also added code to deal with the possibility that the OS can call
ProcessGeckoEvents() "spontaneously" (without the browser first having
called ScheduleNativeEventCallback()).  This *might* be the cause of
the "Non-aligned pointer" errors I saw -- and if so my change will fix
them.

I haven't seen these errors again, despite considerable effort.

And I've now looked at every single place in the tree where an
nsAppShell object can be AddRefed or Released.  I didn't find one
that looked wrong.

So I think this patch is the best we can do ... at least for the time
being.

Once it's reviewed, I hope to land it on the trunk, and let it bake
there for a week or two.  We can see what it turns up, and on that
basis decide what to do next.

(By the way, I did a tryserver build of my patch and had several
failures.  But they all seem to have been spurious.)
Attachment #378952 - Attachment is obsolete: true
Attachment #380274 - Flags: review?(joshmoz)
(Assignee)

Updated

10 years ago
Attachment #380274 - Flags: review?(bent.mozilla)
(Assignee)

Comment 42

10 years ago
For what it's worth, here's the list I compiled of places in the tree
where an nsAppShell object can be AddRefed or Released (not including
platform-specific code):

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsAppShellSingleton.h#l72

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l588
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsBaseAppShell.cpp#l77

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l50
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsBaseAppShell.cpp#l82

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1596
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1900
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1928
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1934
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1936
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l125
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l130
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1748
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsAppShellSingleton.h#l86
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.h#l75
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l998

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/content/base/src/nsContentSink.cpp#l1565

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/docshell/base/nsDocShell.cpp#l257

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l856

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l1967

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l3266

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l366

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l495

Updated

10 years ago
Attachment #380274 - Flags: review?(joshmoz) → review-

Comment 43

10 years ago
Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

>--- a/widget/src/xpwidgets/nsBaseAppShell.cpp
>+++ b/widget/src/xpwidgets/nsBaseAppShell.cpp
>@@ -315,15 +315,30 @@ nsBaseAppShell::OnProcessNextEvent(nsITh
> NS_IMETHODIMP
> nsBaseAppShell::Observe(nsISupports *subject, const char *topic,
>                         const PRUnichar *data)
> {
>   NS_ASSERTION(!strcmp(topic, NS_XPCOM_SHUTDOWN_OBSERVER_ID), "oops");
>+
>+  // The observer service and current thread both hold strong references
>+  // to us, which don't always get released unless we explicitly stop
>+  // observing (especially when the browser is relaunched).

I don't think we should make these changes to the base appshell. As far as I can tell there is no circumstance under which this should be happening and all the new code here will do is cover up leaks in other code.

I think you should remove this part of the patch and file a bug about whatever events/observations prompted you to add this code.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1244465722.1244468642.29853.gz&fulltext=1

(I cleared the whiteboard request for no more logs - I apologize for the noise it causes, but pasting the logs not only gives us frequency data that may or may not be useful, it also keeps the bug live in the minds of people who can solve it.  This is something that frequently puts the tree in a bad state - either committers respect the orange and don't check in, or they learn to ignore the orange - both bad outcomes.

The best way to cut down on link spam is to fix it.  Which it looks like Josh and smichaud are making progress on, thanks guys!)
Whiteboard: [No need for more logs, atm] [orange] → [orange]
Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

Steven, do you want to post a new patch? Or do you want my review on this one?
(Assignee)

Comment 46

10 years ago
Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

> Steven, do you want to post a new patch? Or do you want my review on
> this one?

I'll post a new one tomorrow, so don't bother reviewing this one.

I'm trying to deal with Josh's objection to the part of my patch that
changes nsBaseAppShell::Observe().  Making no changes there means that
the appshell will leak almost 100% of the time on restart.

For a while I thought I was close to finding the true source(s) of
that leak.  Now I'm no longer so sure ... but I intend to keep trying
for little longer.

One way or another, though, I'll post something for review tomorrow.
Attachment #380274 - Flags: review?(bent.mozilla)
(Assignee)

Comment 47

10 years ago
> One way or another, though, I'll post something for review tomorrow.

Looks like I won't make it.  Tomorrow for sure!
(Assignee)

Comment 49

10 years ago
Created attachment 382878 [details] [diff] [review]
Patch v2.2
Attachment #380274 - Attachment is obsolete: true
Attachment #382878 - Flags: superreview?(bent.mozilla)
Attachment #382878 - Flags: review?(joshmoz)
(Assignee)

Comment 50

10 years ago
> I don't think we should make these changes to the base appshell. As
> far as I can tell there is no circumstance under which this should
> be happening and all the new code here will do is cover up leaks in
> other code.

If we don't change nsBaseAppShell::Observe(), the appShell will leak
almost 100% of the time on relaunch.

I've tracked down (as best I can) the true "cause" of this leak (see
bug 497745).  But I don't it's currently practical to fix it "at the
source".  So if we're going to fix this leak at all (the one on
relaunch), I think it's best to do it here.

I did find out, though, that I didn't need to explicitly remove the
appShell from the observer service.

Otherwise this patch is the same as v2.1.

We might be able to get away with *not* fixing the leak on relaunch --
I suspect it doesn't have any effect on tinderbox builds.  But my fix
for it is very simple and effective, and I really don't see anything
wrong with it.
(Assignee)

Updated

10 years ago
Attachment #382878 - Flags: superreview?(bent.mozilla) → review?(bent.mozilla)
(Assignee)

Comment 51

10 years ago
Comment on attachment 382878 [details] [diff] [review]
Patch v2.2

Oops, Ben.  I just meant to ask you for a review.
Depends on: 497745

Comment 52

10 years ago
Comment on attachment 382878 [details] [diff] [review]
Patch v2.2

I can't r+ the base appshell code unless we decide that it is the correct way to fix the leak. I don't want to cover that leak up, I'd rather have the leak and improve our chances of fixing it the right way in the future.
Attachment #382878 - Flags: review?(joshmoz) → review-
(Assignee)

Comment 53

10 years ago
Created attachment 382962 [details] [diff] [review]
Patch v2.3

> I don't want to cover that leak up

We wouldn't be covering it up -- I've already identified its source
(in bug 497745).  And I suspect it will be at least a year before that
bug is fixed (fixed in a better way than attachment 382873 [details] [diff] [review]).

But, as I said above, we probably can get away with not fixing it.  So
here's a patch with the offending code removed.
Attachment #382878 - Attachment is obsolete: true
Attachment #382962 - Flags: review?(joshmoz)
Attachment #382878 - Flags: review?(bent.mozilla)
(Assignee)

Updated

10 years ago
Attachment #382962 - Flags: review?(bent.mozilla)
Actually, I think it might make sense for a thread to auto-release its observer once it's shutdown. That might solve the appshell case and others. I'd have to look more closely, but that sounds like a smart move to me.
(Assignee)

Comment 55

10 years ago
Does XPCOM have an "autorelease"? :-)
I filed bug 498010 to get the thread observers released at shutdown.
Comment on attachment 382962 [details] [diff] [review]
Patch v2.3

>diff --git a/widget/src/cocoa/nsAppShell.h b/widget/src/cocoa/nsAppShell.h
> ...
>+  PRInt32            mNativeEventScheduledDepth;

Might want to indicate here with a comment that this int is only modified atomically

>+  // As best I can tell, every call to ProcessGeckoEvents() is triggered

I'm not a fan of using the first person in code, but that's my personal opinion.

>+  if (!self->mTerminated) {

As long as we're doing an else clause I'd prefer to reverse the !mTerminated test and then switch the if/else clauses around. Double-negatives are tough to read ;)

>   NS_ADDREF_THIS();
>+  PR_AtomicIncrement(&mNativeEventScheduledDepth);

So, if a background thread makes it through the addref but then gets swapped out and the main thread runs through ProcessGeckoEvents we'll be left with an extra ref. It looks like we'll catch that in Exit(), but maybe put a comment here explaining that?

>@@ -759,24 +792,38 @@ nsAppShell::Exit(void)
> ...
>+  if (!mNativeEventCallbackDepth) {

I'd do this just to make it simpler:

  if (!mNativeEventCallbackDepth && mNativeEventScheduledDepth) {
    PRInt32 releaseCount = PR_AtomicSet(&mNativeEventScheduledDepth, 0);
    while (releaseCount-- > 0)
      NS_RELEASE_THIS();
  }
Attachment #382962 - Flags: review?(bent.mozilla) → review+
(Assignee)

Comment 58

10 years ago
Created attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]
Attachment #382962 - Attachment is obsolete: true
Attachment #383361 - Flags: review?(joshmoz)
Attachment #382962 - Flags: review?(joshmoz)
(Assignee)

Comment 59

10 years ago
Here's a new revision that (mostly) follow's Ben's suggestions.  It
also rewrites a bunch of comments, and hopefully makes them easier to
understand.

>>   NS_ADDREF_THIS();
>>+  PR_AtomicIncrement(&mNativeEventScheduledDepth);
>
> So, if a background thread makes it through the addref but then gets
> swapped out and the main thread runs through ProcessGeckoEvents
> we'll be left with an extra ref. It looks like we'll catch that in
> Exit(), but maybe put a comment here explaining that?

I'm not sure what you're after here.  But I think my comments in
ProcessGeckoEvents() and Exit() cover all the bases (or at least now
they do).  If I'm right about that, saying more here wouldn't add
anything, and might actually be confusing.

Updated

10 years ago
Attachment #383361 - Flags: review?(joshmoz) → review+
(Assignee)

Comment 60

10 years ago
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]

Landed on trunk (mozilla-central):
http://hg.mozilla.org/mozilla-central/rev/e10f24426a52

This doesn't (of course) fix the leaks on relaunch.  But the patch for
bug 498010 does work around them, and should be landed soon.

We should let this bake on the trunk for a while, then (if all goes
well) land it on the 1.9.1 branch.
(Assignee)

Updated

10 years ago
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
No longer blocks: 485648
Duplicate of this bug: 485648
Flags: blocking1.9.2?
Keywords: qawanted
Whiteboard: [orange] → [qaw: check/dupe 'URL' bug list] [needs 1.9.1 landing: needs approval] [orange]
Target Milestone: --- → mozilla1.9.2a1
(Assignee)

Updated

9 years ago
Attachment #383361 - Flags: approval1.9.1?
Attachment #383361 - Flags: approval1.9.1? → approval1.9.1.1?
Flags: wanted1.9.1?
This is still failing more than 50% of the time on the Mac OS X Firefox3.5 tinderbox.  (not pasting all the failure logs here to reduce noise)
Attachment #383361 - Flags: approval1.9.1.1? → approval1.9.1.2?
Attachment #383361 - Flags: approval1.9.1.2? → approval1.9.1.3?
status1.9.1: --- → ?
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5-Unittest/1248718968.1248720070.30356.gz
OS X 10.5.2 mozilla-1.9.1 test everythingelse on 2009/07/27 11:22:48
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1250874917.1250877894.2580.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/21 10:15:17

Please !
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]

Please...
Attachment #383361 - Flags: approval1.9.1.3? → approval1.9.1.4?
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]

a1914=beltzner
Attachment #383361 - Flags: approval1.9.1.4? → approval1.9.1.4+
Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/496510e1d755
Attachment #383361 - Attachment description: Patch v2.4 (follow Ben's suggestions) → Patch v2.4 (follow Ben's suggestions) [Checkin: Comment 60 & 79]
status1.9.1: ? → .4-fixed
Whiteboard: [qaw: check/dupe 'URL' bug list] [needs 1.9.1 landing: needs approval] [orange] → [qaw: check/dupe 'URL' bug list] [orange]
Keywords: intermittent-failure
Whiteboard: [qaw: check/dupe 'URL' bug list] [orange] → [qaw: check/dupe 'URL' bug list]
Issue is Resolved - removing QA-Wanted Keywords - QA-Wanted query clean-up task
Keywords: qawanted
You need to log in before you can comment on or make changes to this bug.