Last Comment Bug 754588 - Intermittent mochitest-plain-5 leak of about 35KB or 67KB (7 AtomImpl, 1 BackstagePass, 1 Connection, 1 MemoryReporter_StorageSQLite, 8 Mutex)
: Intermittent mochitest-plain-5 leak of about 35KB or 67KB (7 AtomImpl, 1 Back...
Status: RESOLVED FIXED
: intermittent-failure, mlk
Product: Core
Classification: Components
Component: General (show other bugs)
: 15 Branch
: All All
: -- normal with 1 vote (vote)
: mozilla15
Assigned To: Bill McCloskey (:billm)
:
Mentors:
Depends on:
Blocks: 438871 748240
  Show dependency treegraph
 
Reported: 2012-05-12 09:21 PDT by Matt Brubeck (:mbrubeck)
Modified: 2012-11-25 19:31 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (6.35 KB, patch)
2012-05-21 11:22 PDT, Bill McCloskey (:billm)
continuation: review+
Details | Diff | Review

Description Matt Brubeck (:mbrubeck) 2012-05-12 09:21:53 PDT
Possibly a regression from increment GC (bug 735099).  I retriggered some test runs to narrow the regression range.  First seen on this Try push which includes bug 735099:

https://tbpl.mozilla.org/php/getParsedLog.php?id=11702009&tree=Try
Rev3 Fedora 12 try debug test mochitests-5/5 on 2012-05-12 04:33:22 PDT for push f65fe01ce1d2
slave: talos-r3-fed-029

== BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 2148

     |<----------------Class--------------->|<-----Bytes------>|<----------------Objects---------------->|<--------------References-------------->|
                                              Per-Inst   Leaked    Total      Rem      Mean       StdDev     Total      Rem      Mean       StdDev
   0 TOTAL                                          12    35004 88763971     1007 (17129.82 +/- 23565.65) 71227393     1024 (16316.99 +/- 40403.18)
  11 AtomImpl                                       24      168    15028        7 ( 5891.17 +/-  2710.09)   597893        7 (19660.55 +/-  3577.79)
  13 BackstagePass                                  24       24        1        1 (    1.00 +/-     0.00)   732884       10 (  271.75 +/-    29.82)
  51 Connection                                    104      104       27        1 (    9.38 +/-     3.79)     9281        2 (  122.72 +/-    32.84)
 138 MemoryReporter_StorageSQLite                   12       12        1        1 (    1.00 +/-     0.00)        3        1 (    1.40 +/-     0.55)
 145 Mutex                                          12       96    14946        8 (  137.30 +/-    25.59)        0        0 (    0.00 +/-     0.00)
 167 Preferences                                    32       32        1        1 (    1.00 +/-     0.00)   150096        1 (   18.25 +/-    12.55)
 182 ReentrantMonitor                               16       48     4617        3 (   68.41 +/-    35.16)        0        0 (    0.00 +/-     0.00)
 193 Service                                        68       68        1        1 (    1.00 +/-     0.00)      141        1 (   15.58 +/-     4.94)
 195 SharedScriptableHelperForJSIID                 12       12        1        1 (    1.00 +/-     0.00)     9138        5 (   90.52 +/-    38.05)
 205 StorageSQLiteMultiReporter                     52       52        1        1 (    1.00 +/-     0.00)        3        1 (    1.40 +/-     0.55)
 226 ValueObserver                                  36     2520      151       70 (   87.87 +/-    41.20)    49717       70 (  353.24 +/-    49.02)
 243 XPCNativeScriptableInfo                         8       80    31705       10 ( 3568.90 +/-  1709.84)        0        0 (    0.00 +/-     0.00)
 244 XPCNativeScriptableShared                     232     2088    32028        9 (   91.00 +/-    21.51)        0        0 (    0.00 +/-     0.00)
 247 XPCWrappedNative                               56    15008   162064      268 ( 8384.30 +/-  3160.18)  1844396      268 ( 8270.80 +/-  2998.03)
 248 XPCWrappedNativeProto                          32     1088    36690       34 ( 2901.93 +/-  1762.97)        0        0 (    0.00 +/-     0.00)
 567 nsFrameMessageManager                          60       60       86        1 (    6.97 +/-     1.46)     2598        2 (   16.83 +/-     2.48)
 725 nsJSCID                                        52     2860      808       55 (  246.58 +/-    93.97)     6465       55 (  518.58 +/-   195.70)
 729 nsJSID                                         36      144    13411        4 (  645.69 +/-   473.51)   105138        4 ( 2393.36 +/-  2050.06)
 730 nsJSIID                                        24     4176     2374      174 (  763.54 +/-   294.20)    44093      174 ( 1616.05 +/-   453.26)
 750 nsLocalFile                                   124      372     9819        3 (  195.72 +/-    71.28)    41845        4 (  274.13 +/-   201.96)
 808 nsObserverService                              48       48        1        1 (    1.00 +/-     0.00)    79541        1 (   31.77 +/-    19.22)
 836 nsPrefBranch                                   76      152      216        2 (   22.97 +/-     7.69)     1844        2 (   35.88 +/-    17.72)
 987 nsStringBuffer                                  8      720  1068179       90 (60995.19 +/- 12570.89)  2481999      160 (75512.54 +/- 15437.47)
1032 nsSystemPrincipal                              16       16        1        1 (    1.00 +/-     0.00)   372583        1 (  358.35 +/-    59.16)
1033 nsTArray_base                                   4      304 27477770       76 (50125.08 +/-  6319.17)        0        0 (    0.00 +/-     0.00)
1043 nsThread                                      112      224       66        2 (   23.33 +/-     9.73)   179888        2 (  135.00 +/-    70.47)
1069 nsUUIDGenerator                               160      160        1        1 (    1.00 +/-     0.00)     8901        1 (    2.49 +/-     1.24)
1087 nsVoidArray                                     4        8   343120        2 (10378.72 +/-  1974.78)        0        0 (    0.00 +/-     0.00)
1138 nsXPCComponents                                64      320     2043        5 (  254.35 +/-   127.65)    35143       15 ( 1018.57 +/-   526.48)
1139 nsXPCComponents_Classes                        20      100      283        5 (   97.96 +/-    43.31)     5841       10 (  394.65 +/-   140.11)
1143 nsXPCComponents_Interfaces                     28      140      386        5 (  107.59 +/-    41.03)    62548       10 (  393.19 +/-    97.09)
1144 nsXPCComponents_Results                        20       80      149        4 (   69.46 +/-    37.35)     4950        8 (  210.02 +/-    97.67)
1145 nsXPCComponents_Utils                          24       48      249        2 (   94.11 +/-    44.88)     6140        2 (  209.63 +/-    76.81)
1150 nsXPCWrappedJS                                 64      384     7917        6 ( 1504.40 +/-   488.03)   457037       16 ( 3476.33 +/-   903.22)
1151 nsXPCWrappedJSClass                            44      132      859        3 (   37.89 +/-     5.88)   170079        6 ( 1440.23 +/-   470.73)
1189 xptiInterfaceInfo                              20     2940     6935      147 (  329.58 +/-   110.92)   468289      180 ( 1014.98 +/-   192.24)
1190 xptiInterfaceInfoManager                      128      128        1        1 (    1.00 +/-     0.00)   140954        5 (   92.99 +/-    24.43)
1191 xptiWorkingSet                                 88       88        1        1 (    1.00 +/-     0.00)        0        0 (    0.00 +/-     0.00)

nsTraceRefcntImpl::DumpStatistics: 1191 entries
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 35004 bytes during test execution
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 7 instances of AtomImpl with size 24 bytes each (168 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of BackstagePass with size 24 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of Connection with size 104 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of MemoryReporter_StorageSQLite with size 12 bytes
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 8 instances of Mutex with size 12 bytes each (96 bytes total)
Comment 1 Matt Brubeck (:mbrubeck) 2012-05-12 09:29:53 PDT
This also happened on inbound several times starting with this changeset:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d61928d439b4

https://tbpl.mozilla.org/php/getParsedLog.php?id=11705292&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=11704382&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=11704405&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=11704987&tree=Mozilla-Inbound

So far it has not occurred on any earlier changeset -- including the ones included in the Try push in comment 0 -- despite several retriggers.  Maybe it was triggered by a networking glitch (like some other known leaks) rather than by any particular code change?

This leak looks vaguely similar to bug 753452.
Comment 5 Matt Brubeck (:mbrubeck) 2012-05-13 17:31:54 PDT
https://tbpl.mozilla.org/php/getParsedLog.php?id=11716933&tree=Mozilla-Inbound
Rev3 MacOSX Leopard 10.5.8 mozilla-inbound debug test mochitests-5/5 on 2012-05-13 07:26:34 PDT for push c5023518db2f

I am now blaming this leak on bug 748240.  The earliest leak is now on that patch, and the original leak on Try was based on top of that patch too.  And I retriggered the test 48 times on the parent changeset and it was 100% green.
Comment 6 Bill McCloskey (:billm) 2012-05-21 11:22:15 PDT
Created attachment 625701 [details] [diff] [review]
patch

This was triggered by the incremental GC landing, although it's not really related to incremental GC.

During the GCIfNeeded(true) call run during cycle collector shutdown, the GC was causing an XPCWrappedJS to be Released. When XPCWrappedJS::Release is called, it can unroot a JS object. If we had done more GC at that point, we would have collected the JS object in the next cycle. Instead, we did a cycle collection (which wasn't able to collect anything) and shut down the cycle collector for good. If we had done the extra GC, then that would have allowed more C++ objects to be cycle collected, preventing the leak.

The way we've handled these unrooting issues in the past is to set a special flag (rt->gcPoke), when unrooting stuff, telling the GC that more GC is needed. That wasn't happening here because the unrooting happens through XPConnect rather than in the JS engine. Consequently, I added a friend API to set the gcPoke flag from xpconnect.

I also made a few other changes to make absolutely sure that we free everything we can during the GCIfNeeded shutdown CCs.
Comment 7 Andrew McCreight [:mccr8] 2012-05-21 14:28:58 PDT
Comment on attachment 625701 [details] [diff] [review]
patch

Review of attachment 625701 [details] [diff] [review]:
-----------------------------------------------------------------

Nice find.  It looks good to me, modulo my concerns about the patch not recomputing gcShouldCleanUpEverything during the loop.

The general problem here is that JS can keep alive an XPCWrappedJS which can keep alive other JS, but the GC doesn't really see those edges, but just treats the wrapped JS as a root during tracing?

::: js/src/jsfriendapi.h
@@ +739,5 @@
>  extern JS_FRIEND_API(void)
>  IncrementalValueBarrier(const Value &v);
>  
> +extern JS_FRIEND_API(void)
> +PokeGC(JSRuntime *rt);

There's already a PokeGC in nsJSEnvironment, but I suppose this is okay, because they mean the same thing, and they are used in different contexts (browserland vs. XPC+JS).

::: js/src/jsgc.cpp
@@ +3653,5 @@
>  
> +static bool
> +ShouldCleanUpEverything(JSRuntime *rt, gcreason::Reason reason)
> +{
> +    return !rt->hasContexts() || reason == gcreason::CC_FORCED;

This is going to cause SuperGCs when we get a GC due to UnmarkGray overflows. I guess that's no big deal, as we're still in a mindset where they are so rare we don't care.

Does the hasContexts() condition happen other than at shutdown?  Can it change in value during a GC?  Prior to your patch, it would be checked every time around the loop, but now it will only be checked once.

This will also change behavior of ShouldPreserveJITCode, but I guess it lives in jsgc.cpp, so it is unlikely people will call it except when gcShouldCleanUpEverything is set up properly.  So it should be okay as long as hasContexts() doesn't change during marking (which seems unlikely).
Comment 8 Bill McCloskey (:billm) 2012-05-21 16:21:08 PDT
(In reply to Andrew McCreight [:mccr8] from comment #7)
> The general problem here is that JS can keep alive an XPCWrappedJS which can
> keep alive other JS, but the GC doesn't really see those edges, but just
> treats the wrapped JS as a root during tracing?

Yeah, pretty much.

> ::: js/src/jsgc.cpp
> @@ +3653,5 @@
> >  
> > +static bool
> > +ShouldCleanUpEverything(JSRuntime *rt, gcreason::Reason reason)
> > +{
> > +    return !rt->hasContexts() || reason == gcreason::CC_FORCED;
> 
> This is going to cause SuperGCs when we get a GC due to UnmarkGray
> overflows. I guess that's no big deal, as we're still in a mindset where
> they are so rare we don't care.

Yeah, we might want to add a special GC reason just for GC-caused-by-shutdown-CC. Maybe I'll do that, if you're okay with it.

> Does the hasContexts() condition happen other than at shutdown?  Can it
> change in value during a GC?  Prior to your patch, it would be checked every
> time around the loop, but now it will only be checked once.

I don't think hasContexts() can change during a GC. If it does, it's very unlikely.
Comment 9 Andrew McCreight [:mccr8] 2012-05-21 16:24:41 PDT
Thanks for the explanations.

(In reply to Bill McCloskey (:billm) from comment #8)
> Yeah, we might want to add a special GC reason just for
> GC-caused-by-shutdown-CC. Maybe I'll do that, if you're okay with it.

I was considering suggesting that anyways, so it is fine with me, but it doesn't seem like a huge deal.
Comment 11 Ed Morley [:emorley] 2012-05-23 05:05:28 PDT
https://hg.mozilla.org/mozilla-central/rev/e3cac83d3a14

Note You need to log in before you can comment on or make changes to this bug.