Last Comment Bug 654399 - Prevent rapid memory growth on pages with ShareThis: cache the string buffer for location.hash
: Prevent rapid memory growth on pages with ShareThis: cache the string buffer ...
Status: RESOLVED FIXED
fixed-in-cedar
: footprint
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: x86 Windows 7
: P1 normal (vote)
: mozilla6
Assigned To: Boris Zbarsky [:bz]
:
Mentors:
http://www.anandtech.com/show/4309/th...
: 631536 (view as bug list)
Depends on:
Blocks: mlk-fx4
  Show dependency treegraph
 
Reported: 2011-05-03 02:29 PDT by Ferdinand
Modified: 2012-05-03 11:34 PDT (History)
18 users (show)
bzbarsky: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
part 1. Update JS malloc stats when handing out effectively owning references to string data to JS. (2.21 KB, patch)
2011-05-09 10:16 PDT, Boris Zbarsky [:bz]
bzbarsky: review-
Details | Diff | Review
part 2. Try to make sure we always hand out the same exact string buffer each time location.hash is gotten. (1.72 KB, patch)
2011-05-09 10:18 PDT, Boris Zbarsky [:bz]
roc: review+
Details | Diff | Review
Try to make sure we always hand out the same exact string buffer each time location.hash is gotten. (1.72 KB, patch)
2011-05-18 21:11 PDT, Boris Zbarsky [:bz]
no flags Details | Diff | Review

Description Ferdinand 2011-05-03 02:29:30 PDT
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110502 Firefox/6.0a1
Build Identifier: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0a1) Gecko/20110502 Firefox/6.0a1

When I open that page memory usage of Firefox grows from 50MB to over 300MB(it seems to plateau). Chrome doesn't do this and IE9 doesn't do this. I have tried this on a clean profile without extensions or plugins and it still happens. Firefox 3.6, Firefox 4.0 and Firefox nightly have this behavior and the newer the Firefox the more it leaks.

Reproducible: Always




I have saved the full page in a zip file if that is needed.
Comment 1 Nicholas Nethercote [:njn] 2011-05-03 22:30:50 PDT
I tried this page under Massif on a Linux64 build.  I ran for a bit over 1,000 seconds and got this graph for heap consumption:

    MB 
225.6^                                     ##                                 
     |                                   @@#                                ::
     |                                  @@@#                              ::::
     |                                :@@@@#                            @:::::
     |                              :::@@@@#                          ::@:::::
     |                            :::::@@@@#                        ::::@:::::
     |                          @::::::@@@@#                      @:::::@:::::
     |                        :@@::::::@@@@#                    ::@:::::@:::::
     |                      :::@@::::::@@@@#                  ::::@:::::@:::::
     |                    @@:::@@::::::@@@@#                @:::::@:::::@:::::
     |                  @@@@:::@@::::::@@@@#              ::@:::::@:::::@:::::
     |                ::@@@@:::@@::::::@@@@#            ::::@:::::@:::::@:::::
     |              ::::@@@@:::@@::::::@@@@#          ::::::@:::::@:::::@:::::
     |            ::::::@@@@:::@@::::::@@@@#        ::::::::@:::::@:::::@:::::
     |          @:::::::@@@@:::@@::::::@@@@#       :: ::::::@:::::@:::::@:::::
     |        @@@:::::::@@@@:::@@::::::@@@@#    ::::: ::::::@:::::@:::::@:::::
     |      :@@ @:::::::@@@@:::@@::::::@@@@#  :::: :: ::::::@:::::@:::::@:::::
     |    @@:@@ @:::::::@@@@:::@@::::::@@@@# :: :: :: ::::::@:::::@:::::@:::::
     |  @:@@:@@ @:::::::@@@@:::@@::::::@@@@# :: :: :: ::::::@:::::@:::::@:::::
     |  @:@@:@@ @:::::::@@@@:::@@::::::@@@@# :: :: :: ::::::@:::::@:::::@:::::
   0 +----------------------------------------------------------------------->s
     0                                                                   1065.3

Definitely looks like a GC happened in the middle.

The snapshot taken at the peak usage point was dominated by a single
allocation point:

97.77% (231,262,896B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
o1o> 84.59% (200,084,363B): nsStringBuffer::Alloc() (nsSubstring.cpp:209)
| o2o> 84.37% (199,574,102B): nsAString_internal::MutatePrep() (nsTSubstring.cpp:162)
| | o3o> 84.35% (199,537,048B): nsAString_internal::ReplacePrepInternal() (nsTSubstring.cpp:198)
| | | o4o> 84.29% (199,381,282B): nsAString_internal::Replace() (nsTSubstring.h:684)
| | | | o5o> 84.28% (199,364,478B): nsLocation::GetHash() (nsTSubstring.h:378)
| | | | | o6o> 84.28% (199,364,478B): NS_InvokeByIndex_P ()
| | | | |   o7o> 84.28% (199,364,478B): XPCWrappedNative::CallMethod() (xpcwrappednative.cpp:3141)
| | | | |     o8o> 84.28% (199,364,478B): XPC_WN_GetterSetter() (xpcprivate.h:2621)
| | | | |       o9o> 84.28% (199,364,478B): js::Invoke() (jscntxtinlines.h:277)
| | | | |         o10> 84.28% (199,364,478B): js::ExternalInvoke()
 (jsinterp.cpp:806)
| | | | |           o11> 84.28% (199,364,478B): js::ExternalGetOrSet() (jsinterp.cpp:846)
| | | | |             o12> 84.28% (199,364,478B): js::JSProxyHandler::get() (jsproxy.cpp:133)
| | | | |               o13> 84.28% (199,364,478B): js::JSProxy::get() (jsproxy.cpp:806)
| | | | |                 o14> 84.28% (199,364,478B): js::proxy_GetProperty() (jsproxy.cpp:923)
| | | | |                   o15> 84.27% (199,325,984B): InlineGetProp() (jsobj.h:1224)

roc, bz:  any idea what that represents?
Comment 2 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-05-03 22:48:18 PDT
Something calling document.location.hash a lot, which allocates strings, and GC not running very frequently?
Comment 3 Nicholas Nethercote [:njn] 2011-05-03 22:56:24 PDT
I instrumented by hand nsAString_internal::MutatePrep() to see how big the allocated string was each time it was called.  I saw this sequence of sizes:

sz = 505
sz = 1132
sz = 2018
sz = 1009

repeat over and over again.  So clearly something is happening repeatedly to cause the memory growth.
Comment 4 Nicholas Nethercote [:njn] 2011-05-03 22:57:04 PDT
The 505, 1009, 2018 sequence looks like a "double the hash table" kind of thing.
Comment 5 Boris Zbarsky [:bz] 2011-05-06 21:52:58 PDT
So what GetHash does is it takes the given nsAString (aHash) and:

405       aHash.Assign(PRUnichar('#'));
406       aHash.Append(unicodeRef);

I see unicodeRef being 579 PRUnichars every time; the actual value looks like this:

  "init/fpc=5568770-12fc8ba4ec9-156bff66-2/sessionID=1304742795995.65072/publisher=ffc93a4f-833d-4566-8...."

The caller is, surprise!, the sharethis script at http://w.sharethis.com/share4x/js/st.c58d43164c1cfdb83ea49cd898a81f54.js and after prettifying the relevant part is:

  fragmentPump = new function () {
    ...
    this.initialize = function () {
        this.fragTimer = setInterval("fragmentPump.checkFragment()", 5)
    };

    this.checkFragment = function () {
        var hash = document.location.hash.substring(1);
        // Do various crud here with that hash
    ...
    };
    this.initialize()
  }

So yes, every 5ms they get the hash (which in this case is huge for some reason) and then do stuff with it.  This generates about 1200 bytes bytes of garbage (not counting the JS string overhead) per 5ms, which is 240KB of garbage per second.  Or about 12MB/minute.  Of course these are "external" strings so the JS GC doesn't realize it's holding on to that much data.  Then eventually there's enough pure-JS garbage and there's a GC and all this stuff goes away.

For 1000s, by the way, I'd expect about 240MB of garbage per the numbers above.  Matches the comment 1 graph pretty well, esp. if we assume that we can't quite maintain a 5ms timer while running under Massif.

On 3.6, the "leak" would be 2x slower at least because the timeout clamp is 10ms.  But I think 3.6 also GCs more aggressively than 4 does.

On any browser with a generational GC that GCs often (e.g. Chrome; dunno about IE) there would be no problem, I'd think, because it would just GC all the strings often.

For what it's worth, I've reported this exact issue (the fact that they peg the CPU with checkFragment for no good reason) to ShareThis in the past; last time in March of 2010.  Clearly no action yet!

I think on our side short of going to a GGC the useful thing would be to teach JSeng to treat external strings as part of its heap... or something (in some cases they really are shared with the browser core, so this could have undesirable effects too).
Comment 6 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-05-09 03:43:05 PDT
Would it be worth having nsLocation::GetHash cache the last URI/stringbuffer pair and return that stringbuffer when the URI hasn't changed?
Comment 7 Boris Zbarsky [:bz] 2011-05-09 07:10:12 PDT
Hmm.  That would fix this one particular case, but the problem would remain for other cases where a new XPCOM stringbuffer is created and handed out to JS.

I guess we should spot-fix window.location for now to work around the ShareThis inanity, and then file a followup bug on an overall better solution....
Comment 8 Boris Zbarsky [:bz] 2011-05-09 10:07:23 PDT
For what it's worth, I tried updating the gc stats with the new memory and it helps some: we start GCing after our heap reahes 170MB instead of 300MB...

I guess I'll do that _and_ roc's suggestion from comment 6.  :(
Comment 9 Boris Zbarsky [:bz] 2011-05-09 10:15:41 PDT
And if I do both, memory stabilizes at 70MB.
Comment 10 Boris Zbarsky [:bz] 2011-05-09 10:16:10 PDT
Created attachment 531076 [details] [diff] [review]
part 1.  Update JS malloc stats when handing out effectively owning references to string data to JS.
Comment 11 Boris Zbarsky [:bz] 2011-05-09 10:18:08 PDT
Created attachment 531077 [details] [diff] [review]
part 2.  Try to make sure we always hand out the same exact string buffer each time location.hash is gotten.
Comment 12 Boris Zbarsky [:bz] 2011-05-09 10:21:52 PDT
This will still suck for other string-getting use cases, but we really do need GGC for that.  Or seriously revamping how strings work, or both.
Comment 13 Gregor Wagner [:gwagner] 2011-05-09 11:40:59 PDT
Jason said I should take a look.

We already update the mallocCounter for "length" within JSExternalString::new_.
Is this additional memory you want to account for?
Comment 14 Boris Zbarsky [:bz] 2011-05-09 12:32:03 PDT
Hrm.  So we do.  I'd missed that.

In this case the stringbuffer capacity happens to be a good bit bigger than the length (almost 2x as big), so that's undercounting in some cases and overcounting in others (the shared buffer case), but I guess that means we don't want the part 1 from this bug....
Comment 15 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-05-09 17:49:59 PDT
Comment on attachment 531077 [details] [diff] [review]
part 2.  Try to make sure we always hand out the same exact string buffer each time location.hash is gotten.

Technically I'm not a peer in this code, but the patch looks good.
Comment 16 Boris Zbarsky [:bz] 2011-05-18 21:11:37 PDT
Created attachment 533527 [details] [diff] [review]
Try to make sure we always hand out the same exact string buffer each time location.hash is gotten.
Comment 17 Boris Zbarsky [:bz] 2011-05-20 19:47:02 PDT
http://hg.mozilla.org/projects/cedar/rev/f68807492b2b
Comment 18 Daniel Holbert [:dholbert] 2011-05-21 17:35:43 PDT
http://hg.mozilla.org/mozilla-central/rev/f68807492b2b
Comment 19 Jesse Ruderman 2011-06-21 12:27:44 PDT
*** Bug 631536 has been marked as a duplicate of this bug. ***
Comment 20 Boaz Dodin 2012-05-02 23:46:20 PDT
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0

I just experienced huge memory growth on this page with sharethis code:
http://pages.ebay.com/sellerinformation/news/insiderapr2012.html

In about:memory it was more than 1GB of memory related to sharethis, in several minutes!

Do this bug should be reopened or should I fill a new bug?
Comment 21 Boris Zbarsky [:bz] 2012-05-03 10:43:01 PDT
Please file a new bug with clear steps to reproduce and cc me?
Comment 22 Boaz Dodin 2012-05-03 11:34:42 PDT
(In reply to Boris Zbarsky (:bz) from comment #21)
> Please file a new bug with clear steps to reproduce and cc me?

Hi Boris,
https://bugzilla.mozilla.org/show_bug.cgi?id=751656
just filled and you are cc.

Note You need to log in before you can comment on or make changes to this bug.