472773 - [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally ["runtests-leaks | leaked 68 bytes during test execution"]

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Description

•

16 years ago

Attached patch Patch, v1 (obsolete) — Details — Splinter Review

Often in my debugging I see a single nsBaseAppShell and nsRunnable instance leak. This seems to happen more frequently when I quit the browser soon after startup but I've seen it in many other cases as well. The nsRunnable is strongly owned by the nsBaseAppShell ('mDummyEvent'), so the nsBaseAppShell is the real leak. I finally got a leak log of the nsBaseAppShell leak and noticed this:

  nsAppShell::ScheduleNativeEventCallback() 44
  nsAppShell::ProcessGeckoEvents(void*) -43

The log indicates that there were 44 AddRefs to the app shell but only 43 Releases. The code in question assumes that there will be a 1:1 relationship between calls to ScheduleNativeEventCallback and ProcessGeckoEvents, but that appears to not be true in all cases.

Attached is a patch which fixes the leak for me (at least I haven't been able to reproduce it since I applied this patch) but I don't know if there is a better way or if this patch will hurt performance.

This leak itself is not a big deal, really. It does, however, cause random tinderbox orange and it also causes lots of duplicate bug reports to be filed in topsite leak analyses.

Attachment #356107 - Flags: review?(smichaud)

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

16 years ago

Attachment #356107 - Flags: review?(smichaud) → review+

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 1

•

16 years ago

Comment on attachment 356107 [details] [diff] [review]
Patch, v1

Your patch's logic seems impeccable, and I can't foresee it doing any
harm.

But I think we should let this bake on the trunk for a while before
landing it on any of the branches.

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 2

•

16 years ago

Comment on attachment 356107 [details] [diff] [review]
Patch, v1

Actually, I meant to obsolete this before.

I agree that the logic seems correct, but I have occasionally seen a double free with this patch on shutdown... Something is weird here.

Attachment #356107 - Attachment is obsolete: true

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 3

•

16 years ago

I've always wondered if the OS could call
nsAppShell::ProcessGeckoEvents() without having been told to do so (by
the calls to CFRunLoopSourceSignal() and CFRunLoopWakeUp()) in
nsAppShell::ScheduleNativeEventCallback().

Maybe the problem is that it can.

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 4

•

16 years ago

Ben, here's a hunch:

There's code in the nsAppShell destructor that removes
mCFRunLoopSource from mCFRunLoop, then releases mCFRunLoopSource.

Try commenting that out and see what happens.  I suspect that code
might trigger one last call to nsAppShell::ProcessGeckoEvents().

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 5

•

16 years ago

Or maybe you could call CFRunLoopSourceInvalidate() on
mCFRunLoopSource before removing it from mCFRunLoop.

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 6

•

16 years ago

Thanks! I'll give it a try.

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 7

•

16 years ago

> I suspect that code might trigger one last call to
> nsAppShell::ProcessGeckoEvents().

Or (more likely) maybe it can (sometimes) trigger all pending calls to
nsAppShell::ProcessGeckoEvents().

Serge Gautherie (:sgautherie)

Comment 8

•

15 years ago

Wow, there is a bunch of duplicates or at least related bugs...

Blocks: 486146, 482598, 438871

URL: &op_sys=Mac+OS+X&emailassigned_to1=1&...

Flags: blocking1.9.2?

Keywords: meta, mlk

Whiteboard: [orange]

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Blocks: mlkTests, 485648

Serge Gautherie (:sgautherie)

Comment 9

•

15 years ago

Assuming for now that there's only one cause to this leak:

{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1238711840.1238715491.11899.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/04/02 15:37:20

TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes

TinderboxPrint: mochitest-plain<br/>76138/0/1541&nbsp;<em class="testfail">LEAK</em>
}

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 10

•

15 years ago

I think any leak on mac of a single nsBaseAppShell and a single nsRunnable should be duped to this bug and the failing test should not be fingered. This is easy enough to reproduce.

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Flags: blocking1.9.1?

Mike Beltzner [:beltzner, not reading bugmail]

Updated

•

15 years ago

Flags: blocking1.9.1? → blocking1.9.1-

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Flags: wanted1.9.1?

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 11

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240219838.1240230963.23529.gz
OS X 10.5.2 mozilla-central unit test on 2009/04/20 02:30:38

Serge Gautherie (:sgautherie)

Comment 12

•

15 years ago

(In reply to comment #11)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240219838.1240230963.23529.gz
> OS X 10.5.2 mozilla-central unit test on 2009/04/20 02:30:38

This case was 'reftest'.


http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1240781852.1240784557.2880.gz&fulltext=1
OS X 10.5.2 mozilla-central unit test on 2009/04/26 14:37:32

This is also reported a lot (at "random") on 'xpcshell-tests'.

Depends on: 469523

Josh Aas

Updated

•

15 years ago

Assignee: bent.mozilla → smichaud

Serge Gautherie (:sgautherie)

Comment 13

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242133704.1242138142.11650.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/12 06:08:24

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 14

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242401721.1242404593.30000.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/15 08:35:21

Serge Gautherie (:sgautherie)

Comment 15

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242461254.1242466169.10039.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/16 01:07:34

Summary: nsBaseAppShell and nsRunnable leak occasionally on OS X → [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally

Serge Gautherie (:sgautherie)

Comment 16

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1242630739.1242636130.8550.gz
OS X 10.4 comm-central unit test on 2009/05/18 00:12:19

Serge Gautherie (:sgautherie)

Comment 17

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1242648001.1242653410.7920.gz
OS X 10.4 comm-central unit test on 2009/05/18 05:00:01

Serge Gautherie (:sgautherie)

Comment 18

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1242634221.1242641434.17806.gz
OS X 10.5 comm-1.9.1 unit test on 2009/05/18 01:10:21

Matthew Gregan [:kinetik]

Comment 19

•

15 years ago

Attached file leak logs from bug 454921 (obsolete) — Details

I ran into this bug while trying to get bug 454921 ready to land.  I'm pretty convinced that the leak I was seeing there is in fact this bug.  I'm attaching (and will immediately obsolete) the leak logs and balance tree for nsBaseAppShell I created when trying to work out what was going on, just in case any further analysis is needed.

Matthew Gregan [:kinetik]

Updated

•

15 years ago

Attachment #378226 - Attachment is obsolete: true

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 20

•

15 years ago

> bug 454921

Is this the right bug number?

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 21

•

15 years ago

I've found a trivially easy way to make an nsAppShell/nsBaseAppShell
object leak:  This happens every time FF relaunches as it starts up
(which in turn happens every time you run FF with a fresh profile, or
if the copy of FF you ran previously had a different major version).

Could *this* be the source of our "random" oranges?

Tell me it isn't so :-)

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 22

•

15 years ago

> I've found a trivially easy way to make an nsAppShell/nsBaseAppShell
> object leak: ...

And now I've stopped being able to reproduce it.  Sigh :-(

I'll keep digging.

Serge Gautherie (:sgautherie)

Comment 23

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242820340.1242825407.30878.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/20 04:52:20

Jim Mathies [:jimm]

Comment 24

•

15 years ago

5/20

TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 68 bytes during test execution
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsBaseAppShell with size 60 bytes
TEST-UNEXPECTED-FAIL | runtests-leaks | leaked 1 instance of nsRunnable with size 8 bytes

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242832092.1242836949.27265.gz

Jeff Walden [:Waldo]

Comment 25

•

15 years ago

I can see this leak consistently on some of the xpcshell tests when run in a debug build, but I haven't had time to investigate it.  That would probably be a fruitful avenue for attack given that xpcshell tests are pretty minimal (of course no guarantees it'd be the same problem).

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 26

•

15 years ago

I'm making progress on this.

I find it pretty easy to reproduce ... though I don't have formal STR.

I hope to post a patch tomorrow.

John Daggett (:jtd)

Comment 27

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242876939.1242882141.15514.gz

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 28

•

15 years ago

Attached patch Possible patch (obsolete) — Details — Splinter Review

Here's the patch I've been working on.

As best I can tell, it completely resolves the problem that Ben
reported in comment #0 (the leak of an nsBaseAppShell object that
occurs when a call is (occasionally) made to
ScheduleNativeEventCallback() that isn't matched by a call to
ProcessGeckoEvents()).

It also resolves (in cross-platform code in nsBaseAppShell::Observe())
an additional leak that often happens (at least on OS X) when the
browser is relaunched.

But, like Ben (in comment #2), I've seen two or three cases where a
double-free happens on shutdown.  I'm not able to reproduce this
reliably -- so I haven't been able to track down the cause.  But I
strongly suspect it isn't due to the code that (in Ben's patch and
mine) compensates for "extra" calls to ScheduleNativeEventCallback().
In other words I suspect it's a cross-platform problem -- one that
also occurs at least occasionally on other platforms.

So I think this patch needs a fair amount of testing before it can be
landed ... or at least landed permanently.

If nothing else works, I think we should land this on the trunk and
let it bake there a while, to see if our automated tests turn up a way
to reproduce the double-free problem.

But maybe one of the people cc-ed here can come up with a way to
torture test this patch short of actually landing it on the trunk.

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 29

•

15 years ago

All the double-frees I've seen happened on quitting the browser after
having run the Crash Tests (-reftest
testing/crashtest/crashtests.list).

There were always two of them -- for the nsRunnable object owned by an
nsBaseAppShell object, and then for the nsBaseAppShell object itself.
They always happened at the last point in the browser where an
nsBaseAppShell object is released (when the ScopedXPCOMStartup object
near the end of XRE_main() (in nsAppRunner.cpp) goes out of scope).
So the refcount for the nsBaseAppShell object was always exactly '1'
too small.

(I know all this because one of the double-frees happened while I was
running the crashtests in gdb.)

Oddly, the double-frees weren't actually called double-frees.  Instead
malloc displayed a "non-aligned pointer" error message for each of
them:

firefox-bin(9821,0xa0098720) malloc: \
  *** error for object 0x1e334af0: Non-aligned pointer being freed (2)
*** set a breakpoint in malloc_error_break to debug
firefox-bin(9821,0xa0098720) malloc: \
  *** error for object 0x3d0dd30: Non-aligned pointer being freed (2)
*** set a breakpoint in malloc_error_break to debug

This message is supposed to indicate an attempt to free a memory block
that isn't aligned on a 16-byte boundary (and thereby to indicate a
corrupt or uninitialized pointer).

But note that both addresses *are* properly aligned (they both end in
'0').  And when I deliberately added an extra Release() to
nsAppShell::Exit(), I kept seeing the same messages (at the same
location in the browser).

So these messages really do indicate double-frees.

Side note:  To break on malloc's "non-aligned pointer" messages (in
gdb), you need to break on 'malloc_printf' (*not* on
'malloc_error_break').

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 30

•

15 years ago

Attached file Reconstructed stacks for the two double-frees — Details

I "reconstructed" these stacks by deliberately adding an extra
Release() to nsAppShell::Exit(), then running the browser in gdb.

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Attachment #378952 - Flags: review?(joshmoz)

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Attachment #378952 - Flags: review?(bent.mozilla)

Serge Gautherie (:sgautherie)

Comment 31

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1242991708.1242994615.28544.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 04:28:28

Daniel Holbert [:dholbert]

Comment 32

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243005817.1243011072.370.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 08:23:37

Daniel Holbert [:dholbert]

Updated

•

15 years ago

Summary: [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally → [MacOSX] all test suites: nsBaseAppShell and nsRunnable leak occasionally ["runtests-leaks | leaked 68 bytes during test execution"]

Daniel Holbert [:dholbert]

Comment 33

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243019296.1243022355.25114.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 12:08:16

Daniel Holbert [:dholbert]

Comment 34

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243030778.1243033851.16590.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/22 15:19:38

Serge Gautherie (:sgautherie)

Comment 35

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1243211725.1243217093.17299.gz
OS X 10.4 comm-central unit test on 2009/05/24 17:35:25

Josh Aas

Comment 36

•

15 years ago

Comment on attachment 378952 [details] [diff] [review]
Possible patch

A minor shutdown leak for a crash doesn't seem like a good trade, however annoying it might be on tbox. Seems like we shouldn't take a patch until we also have the crash figured out.

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 37

•

15 years ago

(In reply to comment #36)

Actually, the double-frees don't cause crashes -- not the ones I saw,
nor (I suspect) the ones Ben saw.

But I'll spend some more time trying to track them down.

Serge Gautherie (:sgautherie)

Comment 38

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243378847.1243383129.27518.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/26 16:00:47

Serge Gautherie (:sgautherie)

Comment 39

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1243388041.1243391150.7914.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/05/26 18:34:01

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Attachment #378952 - Flags: review?(joshmoz)

Attachment #378952 - Flags: review?(bent.mozilla)

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Whiteboard: [orange] → [No need for more logs, atm] [orange]

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 40

•

15 years ago

Attached patch Patch v2.1 (revised and cleaned up) (obsolete) — Details — Splinter Review

Here's a revision of my patch from comment #28.

I've done several things to clean it up (for example we should use
PR_Atomic... on mNativeEventScheduledDepth, because it can be
set/changed from different threads).

I've also added code to deal with the possibility that the OS can call
ProcessGeckoEvents() "spontaneously" (without the browser first having
called ScheduleNativeEventCallback()).  This *might* be the cause of
the "Non-aligned pointer" errors I saw -- and if so my change will fix
them.

I haven't seen these errors again, despite considerable effort.

And I've now looked at every single place in the tree where an
nsAppShell object can be AddRefed or Released.  I didn't find one
that looked wrong.

So I think this patch is the best we can do ... at least for the time
being.

Once it's reviewed, I hope to land it on the trunk, and let it bake
there for a week or two.  We can see what it turns up, and on that
basis decide what to do next.

(By the way, I did a tryserver build of my patch and had several
failures.  But they all seem to have been spurious.)

Attachment #378952 - Attachment is obsolete: true

Attachment #380274 - Flags: review?(joshmoz)

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Attachment #380274 - Flags: review?(bent.mozilla)

cajbir (:cajbir)

Comment 41

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1243545799.1243549760.17178.gz

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 42

•

15 years ago

For what it's worth, here's the list I compiled of places in the tree
where an nsAppShell object can be AddRefed or Released (not including
platform-specific code):

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsAppShellSingleton.h#l72

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l588
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsBaseAppShell.cpp#l77

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l50
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsBaseAppShell.cpp#l82

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1596
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1900
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1928
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1934
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1936
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.cpp#l111

AddRef
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l125
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/ds/nsObserverList.cpp#l130
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/components/nsComponentManager.cpp#l1748
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/widget/src/xpwidgets/nsAppShellSingleton.h#l86
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l995

Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/components/startup/src/nsAppStartup.h#l75
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/toolkit/xre/nsAppRunner.cpp#l998

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/content/base/src/nsContentSink.cpp#l1565

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/docshell/base/nsDocShell.cpp#l257

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l856

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l1967

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/layout/generic/nsObjectFrame.cpp#l3266

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l366

AddRef and Release
http://hg.mozilla.org/mozilla-central/file/bfb383af1903/xpcom/threads/nsThread.cpp#l495

Josh Aas

Updated

•

15 years ago

Attachment #380274 - Flags: review?(joshmoz) → review-

Josh Aas

Comment 43

•

15 years ago

Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

>--- a/widget/src/xpwidgets/nsBaseAppShell.cpp
>+++ b/widget/src/xpwidgets/nsBaseAppShell.cpp
>@@ -315,15 +315,30 @@ nsBaseAppShell::OnProcessNextEvent(nsITh
> NS_IMETHODIMP
> nsBaseAppShell::Observe(nsISupports *subject, const char *topic,
>                         const PRUnichar *data)
> {
>   NS_ASSERTION(!strcmp(topic, NS_XPCOM_SHUTDOWN_OBSERVER_ID), "oops");
>+
>+  // The observer service and current thread both hold strong references
>+  // to us, which don't always get released unless we explicitly stop
>+  // observing (especially when the browser is relaunched).

I don't think we should make these changes to the base appshell. As far as I can tell there is no circumstance under which this should be happening and all the new code here will do is cover up leaks in other code.

I think you should remove this part of the patch and file a bug about whatever events/observations prompted you to add this code.

Johnathan Nightingale [:johnath]

Comment 44

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1244465722.1244468642.29853.gz&fulltext=1

(I cleared the whiteboard request for no more logs - I apologize for the noise it causes, but pasting the logs not only gives us frequency data that may or may not be useful, it also keeps the bug live in the minds of people who can solve it.  This is something that frequently puts the tree in a bad state - either committers respect the orange and don't check in, or they learn to ignore the orange - both bad outcomes.

The best way to cut down on link spam is to fix it.  Which it looks like Josh and smichaud are making progress on, thanks guys!)

Whiteboard: [No need for more logs, atm] [orange] → [orange]

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 45

•

15 years ago

Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

Steven, do you want to post a new patch? Or do you want my review on this one?

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 46

•

15 years ago

Comment on attachment 380274 [details] [diff] [review]
Patch v2.1 (revised and cleaned up)

> Steven, do you want to post a new patch? Or do you want my review on
> this one?

I'll post a new one tomorrow, so don't bother reviewing this one.

I'm trying to deal with Josh's objection to the part of my patch that
changes nsBaseAppShell::Observe().  Making no changes there means that
the appshell will leak almost 100% of the time on restart.

For a while I thought I was close to finding the true source(s) of
that leak.  Now I'm no longer so sure ... but I intend to keep trying
for little longer.

One way or another, though, I'll post something for review tomorrow.

Attachment #380274 - Flags: review?(bent.mozilla)

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 47

•

15 years ago

> One way or another, though, I'll post something for review tomorrow.

Looks like I won't make it.  Tomorrow for sure!

Serge Gautherie (:sgautherie)

Comment 48

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1244722045.1244726163.8605.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/06/11 05:07:25

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 49

•

15 years ago

Attached patch Patch v2.2 (obsolete) — Details — Splinter Review

Attachment #380274 - Attachment is obsolete: true

Attachment #382878 - Flags: superreview?(bent.mozilla)

Attachment #382878 - Flags: review?(joshmoz)

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 50

•

15 years ago

> I don't think we should make these changes to the base appshell. As
> far as I can tell there is no circumstance under which this should
> be happening and all the new code here will do is cover up leaks in
> other code.

If we don't change nsBaseAppShell::Observe(), the appShell will leak
almost 100% of the time on relaunch.

I've tracked down (as best I can) the true "cause" of this leak (see
bug 497745).  But I don't it's currently practical to fix it "at the
source".  So if we're going to fix this leak at all (the one on
relaunch), I think it's best to do it here.

I did find out, though, that I didn't need to explicitly remove the
appShell from the observer service.

Otherwise this patch is the same as v2.1.

We might be able to get away with *not* fixing the leak on relaunch --
I suspect it doesn't have any effect on tinderbox builds.  But my fix
for it is very simple and effective, and I really don't see anything
wrong with it.

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Attachment #382878 - Flags: superreview?(bent.mozilla) → review?(bent.mozilla)

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 51

•

15 years ago

Comment on attachment 382878 [details] [diff] [review]
Patch v2.2

Oops, Ben.  I just meant to ask you for a review.

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Depends on: 497745

Josh Aas

Comment 52

•

15 years ago

Comment on attachment 382878 [details] [diff] [review]
Patch v2.2

I can't r+ the base appshell code unless we decide that it is the correct way to fix the leak. I don't want to cover that leak up, I'd rather have the leak and improve our chances of fixing it the right way in the future.

Attachment #382878 - Flags: review?(joshmoz) → review-

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 53

•

15 years ago

Attached patch Patch v2.3 (obsolete) — Details — Splinter Review

> I don't want to cover that leak up

We wouldn't be covering it up -- I've already identified its source
(in bug 497745).  And I suspect it will be at least a year before that
bug is fixed (fixed in a better way than attachment 382873 [details] [diff] [review]).

But, as I said above, we probably can get away with not fixing it.  So
here's a patch with the offending code removed.

Attachment #382878 - Attachment is obsolete: true

Attachment #382962 - Flags: review?(joshmoz)

Attachment #382878 - Flags: review?(bent.mozilla)

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Attachment #382962 - Flags: review?(bent.mozilla)

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 54

•

15 years ago

Actually, I think it might make sense for a thread to auto-release its observer once it's shutdown. That might solve the appshell case and others. I'd have to look more closely, but that sounds like a smart move to me.

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 55

•

15 years ago

Does XPCOM have an "autorelease"? :-)

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 56

•

15 years ago

I filed bug 498010 to get the thread observers released at shutdown.

Ben Turner (not reading bugmail, use the needinfo flag!)

Reporter

Comment 57

•

15 years ago

Comment on attachment 382962 [details] [diff] [review]
Patch v2.3

>diff --git a/widget/src/cocoa/nsAppShell.h b/widget/src/cocoa/nsAppShell.h
> ...
>+  PRInt32            mNativeEventScheduledDepth;

Might want to indicate here with a comment that this int is only modified atomically

>+  // As best I can tell, every call to ProcessGeckoEvents() is triggered

I'm not a fan of using the first person in code, but that's my personal opinion.

>+  if (!self->mTerminated) {

As long as we're doing an else clause I'd prefer to reverse the !mTerminated test and then switch the if/else clauses around. Double-negatives are tough to read ;)

>   NS_ADDREF_THIS();
>+  PR_AtomicIncrement(&mNativeEventScheduledDepth);

So, if a background thread makes it through the addref but then gets swapped out and the main thread runs through ProcessGeckoEvents we'll be left with an extra ref. It looks like we'll catch that in Exit(), but maybe put a comment here explaining that?

>@@ -759,24 +792,38 @@ nsAppShell::Exit(void)
> ...
>+  if (!mNativeEventCallbackDepth) {

I'd do this just to make it simpler:

  if (!mNativeEventCallbackDepth && mNativeEventScheduledDepth) {
    PRInt32 releaseCount = PR_AtomicSet(&mNativeEventScheduledDepth, 0);
    while (releaseCount-- > 0)
      NS_RELEASE_THIS();
  }

Attachment #382962 - Flags: review?(bent.mozilla) → review+

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 58

•

15 years ago

Attached patch Patch v2.4 (follow Ben's suggestions) [Checkin: Comment 60 & 79] — Details — Splinter Review

Attachment #382962 - Attachment is obsolete: true

Attachment #383361 - Flags: review?(joshmoz)

Attachment #382962 - Flags: review?(joshmoz)

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 59

•

15 years ago

Here's a new revision that (mostly) follow's Ben's suggestions.  It
also rewrites a bunch of comments, and hopefully makes them easier to
understand.

>>   NS_ADDREF_THIS();
>>+  PR_AtomicIncrement(&mNativeEventScheduledDepth);
>
> So, if a background thread makes it through the addref but then gets
> swapped out and the main thread runs through ProcessGeckoEvents
> we'll be left with an extra ref. It looks like we'll catch that in
> Exit(), but maybe put a comment here explaining that?

I'm not sure what you're after here.  But I think my comments in
ProcessGeckoEvents() and Exit() cover all the bases (or at least now
they do).  If I'm right about that, saying more here wouldn't add
anything, and might actually be confusing.

Josh Aas

Updated

•

15 years ago

Attachment #383361 - Flags: review?(joshmoz) → review+

Steven Michaud [:smichaud] (Retired)

Assignee

Comment 60

•

15 years ago

Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]

Landed on trunk (mozilla-central):
http://hg.mozilla.org/mozilla-central/rev/e10f24426a52

This doesn't (of course) fix the leaks on relaunch.  But the patch for
bug 498010 does work around them, and should be landed soon.

We should let this bake on the trunk for a while, then (if all goes
well) land it on the 1.9.1 branch.

Steven Michaud [:smichaud] (Retired)

Assignee

Updated

•

15 years ago

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

No longer blocks: 485648

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Flags: blocking1.9.2?

Keywords: qawanted

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1248708363.1248714985.3480.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/07/27 08:26:03

Daniel Holbert [:dholbert]

Comment 66

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5-Unittest/1248718968.1248720070.30356.gz
OS X 10.5.2 mozilla-1.9.1 test everythingelse on 2009/07/27 11:22:48

Serge Gautherie (:sgautherie)

Comment 67

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1249733048.1249736004.23356.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/08 05:04:08

Serge Gautherie (:sgautherie)

Comment 68

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1249741029.1249747376.25347.gz
OS X 10.4 comm-central unit test on 2009/08/08 07:17:09

Serge Gautherie (:sgautherie)

Comment 69

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1249928878.1249931813.16502.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/10 11:27:58

Serge Gautherie (:sgautherie)

Comment 70

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1250497469.1250500428.24934.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/17 01:24:29

Serge Gautherie (:sgautherie)

Comment 71

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5-Unittest/1250839644.1250841119.19907.gz
OS X 10.5.2 mozilla-1.9.1 test everythingelse on 2009/08/21 00:27:24

Serge Gautherie (:sgautherie)

Comment 72

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1250874917.1250877894.2580.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/21 10:15:17

Please !

Serge Gautherie (:sgautherie)

Comment 73

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1251062935.1251068432.3946.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/23 14:28:55

Serge Gautherie (:sgautherie)

Comment 74

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1251082602.1251089180.27973.gz
OS X 10.4 comm-central unit test on 2009/08/23 19:56:42

Serge Gautherie (:sgautherie)

Comment 75

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1251293945.1251297074.32573.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/26 06:39:05

Serge Gautherie (:sgautherie)

Comment 76

•

15 years ago

Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]

Please...

Attachment #383361 - Flags: approval1.9.1.3? → approval1.9.1.4?

Serge Gautherie (:sgautherie)

Comment 77

•

15 years ago

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.5/1251384388.1251391134.29854.gz
OS X 10.5.2 mozilla-1.9.1 unit test on 2009/08/27 07:46:28

Mike Beltzner [:beltzner, not reading bugmail]

Comment 78

•

15 years ago

Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]

a1914=beltzner

Attachment #383361 - Flags: approval1.9.1.4? → approval1.9.1.4+

Serge Gautherie (:sgautherie)

Comment 79

•

15 years ago

Comment on attachment 383361 [details] [diff] [review]
Patch v2.4 (follow Ben's suggestions)
[Checkin: Comment 60 & 79]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/496510e1d755

Attachment #383361 - Attachment description: Patch v2.4 (follow Ben's suggestions) → Patch v2.4 (follow Ben's suggestions) [Checkin: Comment 60 & 79]

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

status1.9.1: ? → .4-fixed

Whiteboard: [qaw: check/dupe 'URL' bug list] [needs 1.9.1 landing: needs approval] [orange] → [qaw: check/dupe 'URL' bug list] [orange]

Mark Banner (:standard8)

Updated

•

15 years ago

Blocks: 499644

Nobody; OK to take it and work on it

Updated

•

12 years ago

Keywords: intermittent-failure

Nobody; OK to take it and work on it

Updated

•

12 years ago

Whiteboard: [qaw: check/dupe 'URL' bug list] [orange] → [qaw: check/dupe 'URL' bug list]

Joshua Mitchell (Inactive)

Comment 80

•

10 years ago

Issue is Resolved - removing QA-Wanted Keywords - QA-Wanted query clean-up task

Keywords: qawanted

Patch, v1 16 years ago Ben Turner (not reading bugmail, use the needinfo flag!) 3.10 KB, patch	smichaud : review+	Details \| Diff \| Splinter Review
leak logs from bug 454921 15 years ago Matthew Gregan [:kinetik] 286.14 KB, application/octet-stream		Details
Possible patch 15 years ago Steven Michaud [:smichaud] (Retired) 8.05 KB, patch		Details \| Diff \| Splinter Review
Reconstructed stacks for the two double-frees 15 years ago Steven Michaud [:smichaud] (Retired) 2.57 KB, text/plain		Details
Patch v2.1 (revised and cleaned up) 15 years ago Steven Michaud [:smichaud] (Retired) 8.70 KB, patch	jaas : review-	Details \| Diff \| Splinter Review
Patch v2.2 15 years ago Steven Michaud [:smichaud] (Retired) 8.63 KB, patch	jaas : review-	Details \| Diff \| Splinter Review
Patch v2.3 15 years ago Steven Michaud [:smichaud] (Retired) 7.54 KB, patch	bent.mozilla : review+	Details \| Diff \| Splinter Review
Patch v2.4 (follow Ben's suggestions) [Checkin: Comment 60 & 79] 15 years ago Steven Michaud [:smichaud] (Retired) 7.97 KB, patch	jaas : review+ beltzner : approval1.9.1.4+	Details \| Diff \| Splinter Review