Closed
Bug 506125
Opened 15 years ago
Closed 13 years ago
Experiment with memory-pressure-based GC scheduler
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: dmandelin, Assigned: gwagner)
References
Details
Attachments
(1 file, 14 obsolete files)
40.50 KB,
patch
|
Details | Diff | Splinter Review |
The starting point here is that we know that the current JS GC scheduler is not particularly performant, and we'd like to try some new ideas. I'll start with one idea to narrow this down to a sane level. First, consider a copying generational GC with a nursery (which is of course not what we do). This design is used when a high fraction of objects created are short-lived. They can then be collected early, and at low cost, by doing a 'minor' collection on the nursery. In the basic design of this kind of GC, a 'minor' collection is done exactly when the nursery is full. Thus, if the nursery size is S, a minor collection is done when S bytes have been allocated since the last GC, and the main scheduling parameter is the size of the nursery. Now, let's follow the noble tradition of pretending a solution to one problem will work on a related but different problem. For SpiderMonkey, this would mean doing a collection once S bytes have been allocated since the previous collection. Here, bytes allocated are ideally computed thus: - For NewGCThing(), the number of bytes allocated (8). - For JS_malloc(), if the pointer gets stored in a transitively GC'd object, the size otherwise 0 - For JS_realloc(), if the pointer gets stored in a transitively GC'd object, (newsize-oldsize) otherwise 0 Trying this out requires creating some new allocation APIs and changing old uses: JS_malloc -- for stuff that won't go under GC control JS_mallocGC -- for stuff that will go under GC control JS_realloc JS_reallocDouble -- stuff under GC control, when caller knows he is doubling the size JS_reallocFromTo -- caller gives old bytes as explicit argument JS_reallocWeirdness -- we might need some more ones to cover other cases The experiment is to set up this memory accounting and GC scheduling design, and then run some tests. The parameter S will vary from test to test. For each test, max heap size, GC pause time distribution, and overall GC overhead (as a fraction of total JS time) should be measured. The same measurements should be taken for the current scheduler. Picking a workload is hard but we can start with SunSpider, v8, and bug 504640.
Comment 1•15 years ago
|
||
"Double" is confusing in light of jsdouble, and realloc is the wrong API (too prone to leak on OOM, overflow of size_t, etc.) But I do like "Weirdness" ;-) Main point here is to suggest different name scheme, probably using Grow and Shrink to separate direction by method. /be
Comment 2•15 years ago
|
||
But I'm not bikeshedding -- the usability of realloc is poor, whatever its name. We want different grow&shrink APIs, also split as Dave says by growth policy. The names are secondary but of course if it no longer looks like realloc, it should not be named anything that matches *realloc*. /be
Updated•15 years ago
|
Assignee: general → gal
Comment 3•15 years ago
|
||
The attached patch implements an experimental actual-RSS size-based GC heuristics. We sample the RSS size after every GC, and trigger a new GC once we allocate 20% more pages than after the last GC. Delayed freeing during GC is accounted for. Allocating JSGCThings is no longer tracked, and is no longer part of the heuristics. As far as I can tell triggering a GC based on JSGCThings was mostly a kludge to trigger a GC based on the DOM memory contribution per JSGCThing which we can't measure direct. Since we measure total process memory use now, thats no longer necessary. A JS program is welcome to use the entire (fairly limited) 32MB JS heap and we GC last ditch if we have to, if overall (malloc) memory pressure didn't trigger a GC already. The patch results in about 10% better average memory use on the very memory happy http://people.mozilla.org/~jmuizelaar/world-map-fn.html benchmark.
Comment 4•15 years ago
|
||
Cleaned up the patch a bit. malloc_size() support for win32 and linux. Getting the RSS for win and linux still missing. I am not sure whether we currently inhibit GC during startup the same way we did before. If not, I would like to add a simple N s delay using js_IntervalNow().
Attachment #390688 -
Attachment is obsolete: true
Comment 5•15 years ago
|
||
Attachment #390692 -
Attachment is obsolete: true
Comment 6•15 years ago
|
||
Could some windows and linux fanboys help out with the patch? The windows code is in place but doesn't compile. For linux I think we have to sscanf /proc/self/stat, which is borderline funny. Maybe we can mmap(/proc/self/stat), and do a memory sscanf?
I think you want to just seek and read; proc files are pretty quick to access like that.
Comment 8•15 years ago
|
||
js_GetRSS() sits in MaybeGC which the browser calls a bunch during startup, so I want to make sure its fast Otoh we probably want to return a fake number for js_GetRSS() anyway to avoid GC during startup. I guess we will figure it out by measuring once the patch compiles everywhere.
Reporter | ||
Comment 9•15 years ago
|
||
I can fix the Windows part for you but the latest (3rd) patch doesn't apply to tip. Please rebase or tell me your parent rev.
Comment 11•15 years ago
|
||
Attachment #390896 -
Attachment is obsolete: true
Reporter | ||
Comment 12•15 years ago
|
||
Reporter | ||
Comment 13•15 years ago
|
||
Previous version didn't incorporate the latest linux additions.
Attachment #390908 -
Attachment is obsolete: true
Reporter | ||
Updated•15 years ago
|
Attachment #390902 -
Attachment is obsolete: true
Reporter | ||
Comment 14•15 years ago
|
||
Updated patch seems to work well on Windows. It did fine on the image evolution page, which the old GC (before it was specifically fixed for that problem) failed on. The more I think about it, the more I like the idea of *not* doing extra memory accounting outside of the allocator.
Reporter | ||
Comment 15•15 years ago
|
||
Updating title to reflect patch content. We can refile if we end up wanting to revive the original idea.
Summary: Experiment with total-bytes-based GC scheduler → Experiment with memory-pressure-based GC scheduler
Comment 16•15 years ago
|
||
This patch combined with the parallel sweeping is really improving our behavior on the box2d examples (pretty dramatic improvement in comparison to 3.5.1). http://box2d-js.sourceforge.net/index2.html
Attachment #390909 -
Attachment is obsolete: true
Updated•15 years ago
|
Attachment #391041 -
Flags: review?(igor)
Comment 17•15 years ago
|
||
Comment on attachment 391041 [details] [diff] [review] latest and greatest Still have to sort out the exact balancing of the heuristics and WinCE support. I will have igor take a look in the meantime. igor: any suggestions what to do with trigger_factor? do we want to try to somehow calculate that value backwards into something useful in the world?
Comment 18•15 years ago
|
||
... in the new world?
Comment 19•15 years ago
|
||
We pass talos, startup and tp_rss look good. mochitests seem to die on Linux and WIN32: Linux 86709 INFO TEST-PASS | /tests/toolkit/content/tests/widgets/test_panelfrommenu.xul | focus blurred on panel from toolbarbutton open 86710 INFO TEST-PASS | /tests/toolkit/content/tests/widgets/test_panelfrommenu.xul | focus not modified on cursor down from toolbarbutton 86712 INFO Running /tests/toolkit/content/tests/widgets/test_popup_attribute.xul... command timed out: 300 seconds without output, killing pid 24364 process killed by signal 9 program finished with exit code -1 elapsedTime=2159.091762 MacOSX had an error but that looks spurious: TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/places/tests/perf/browser_ui_history_sidebar.js | Timed out Win32 had an error too: 1900 INFO Running chrome://mochikit/content/chrome/dom/tests/mochitest/chrome/test_focus.xul... 1901 INFO TEST-PASS | chrome://mochikit/content/chrome/dom/tests/mochitest/chrome/test_focus.xul | activeWindow 1902 INFO TEST-PASS | chrome://mochikit/content/chrome/dom/tests/mochitest/chrome/test_focus.xul | child document hasFocus 1903 ERROR TEST-UNEXPECTED-FAIL | chrome://mochikit/content/chrome/dom/tests/mochitest/chrome/test_focus.xul | [SimpleTest/SimpleTest.js, window.onerror] An error occurred: [ fm.focusedElement is null ] command timed out: 300 seconds without output program finished with exit code 1 elapsedTime=364.031000
Comment 20•15 years ago
|
||
(In reply to comment #17) > igor: any suggestions what to do with trigger_factor? do we want to try to > somehow calculate that value backwards into something useful in the world? I suggest to remove this and any other no-longer used consts from the API, fine-tune the patch for best performance with the browser and add new API like setting the growth factor for RSS or similar. Embeddings as far as I remember rarely do any tuning of GC parameters besides calling MaybeGC. If by default MaybeGC and friends will do the right job for the browser, then the embedders should not have any troubles with code-porting.
Comment 21•15 years ago
|
||
(In reply to comment #14) > The more I think about it, the more I like the idea of *not* doing extra memory > accounting outside of the allocator. IMO ideally we should have universal GC-allocator for things of arbitrary sizes and do accounting based on big chunks of mmap-allocated memory. But with the current mixture of malloc and GC-based allocations some extra accounting could be necessary. Otherwise it is hard to avoid situations when the GC thinks that its memory is over-used while having mostly-freed malloced memory or vise-verse.
Comment 22•15 years ago
|
||
(In reply to comment #16) > Created an attachment (id=391041) [details] > latest and greatest Does it passes tests in single-threaded builds?
Comment 23•15 years ago
|
||
#22: I have a mochitest failure. Debugging that.
Comment 24•15 years ago
|
||
Attachment #391041 -
Attachment is obsolete: true
Attachment #391041 -
Flags: review?(igor)
Comment 25•15 years ago
|
||
For the life of me I can't reproduce the mochitest failure tinderbox is giving me. I tried mac and linux (in a vm). Will try windows in a vm now.
Comment 26•15 years ago
|
||
Ok, I try-servered this again: linux talos try01 passed. linux talos try02 crashed. wtf? http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1249105825.37.1249108480.14413.gz http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1249105825.37.1249107101.32056.gz OSX unit tests has a bunch of leaks but I am going to claim thats some unrelated **** in TM tip. http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1249102976.1249112136.21358.gz Patch passed talos on 2 WINNT columns. On try02 we got a ts_shutdown of 584.95. On try04 ts_shutdown was 296.75. Aroo? WINNT unit tests also show a leak: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1249102976.1249111240.11614.gz Vista talos passed too. I think I definitively slow down ts with this patch. I will submit one that avoids GC in MaybeGC until startup is over.
Comment 27•15 years ago
|
||
This patch adopts the same cheesy startup 10s timer idea we use in the DOM elsewhere to suppress forced GCs (via JS_GC) during startup. MaybeGC is supposed to be called when the engine is idle. We really shouldn't be called it during startup (we call it some 30 times right now). I will defer to Johnny to oversee those calls to be removed. Until then this patch works around it.
Attachment #392041 -
Attachment is obsolete: true
Comment 28•15 years ago
|
||
This seems to have fixed the startup time issue. I am still seeing all sorts of random test failures, but nothing that could reasonably be related to my patch.
Comment 29•15 years ago
|
||
Attachment #392064 -
Attachment is obsolete: true
Updated•15 years ago
|
Attachment #392157 -
Flags: review?(igor)
Updated•15 years ago
|
Attachment #392157 -
Flags: review?(pavlov)
Comment 30•15 years ago
|
||
With the patch the monster 3d demo peaks at about 10mb less memory use than 3.5.2 (160mb vs 170mb for 3.5.2). An empty google page started up is 60mb vs 59mb (3.5.2). All numbers for x86 MACOSX. http://deanm.github.com/pre3d/monster.html Mobile might want to look at more aggressive GC heuristics. Now that we have an idea of the working set size, on mobile we could constantly GC really hard if we get near the device limit (beats running out of memory).
Comment 31•15 years ago
|
||
Comment on attachment 392157 [details] [diff] [review] ready for review >@@ -2553,41 +2488,39 @@ JS_IsAboutToBeFinalized(JSContext *cx, v > JS_PUBLIC_API(void) > JS_SetGCParameter(JSRuntime *rt, JSGCParamKey key, uint32 value) > { > switch (key) { > case JSGC_MAX_BYTES: > rt->gcMaxBytes = value; > break; > case JSGC_MAX_MALLOC_BYTES: >- rt->gcMaxMallocBytes = value; > break; > case JSGC_STACKPOOL_LIFESPAN: > rt->gcEmptyArenaPoolLifespan = value; > break; > default: > JS_ASSERT(key == JSGC_TRIGGER_FACTOR); > JS_ASSERT(value >= 100); >- rt->setGCTriggerFactor(value); > return; We should remove JSGC_MAX_MALLOC_BYTES and JSGC_TRIGGER_FACTOR. The latter is not present in any public release of SM and If some embedders would complain about JSGC_MAX_MALLOC_BYTES, we can consider restoring this particular parameter. >diff --git a/js/src/jsgc.cpp b/js/src/jsgc.cpp >- doGC = IsGCThresholdReached(rt); >+ doGC = rt->gcIsNeeded; > for (;;) { > if (doGC > #ifdef JS_TRACER > && !JS_ON_TRACE(cx) && !JS_TRACE_MONITOR(cx).useReservedObjects > #endif > ) { > /* > * Keep rt->gcLock across the call into js_GC so we don't starve >@@ -1856,23 +1799,21 @@ NewGCThing(JSContext *cx, uintN flags) > if (thing) { > arenaList->freeList = thing->next; > flagp = thing->flagp; > JS_ASSERT(*flagp & GCF_FINAL); > > #ifdef JS_THREADSAFE > /* > * Refill the local free list by taking several things from the >- * global free list unless the free list is already populated or >- * we are still at rt->gcMaxMallocBytes barrier. The former is >+ * global free list unless the free list is already populated. >+ * we are still at rt->gcMaxMallocBytes barrier. This is > * caused via allocating new things in gcCallback(cx, The comments changes looks semi-finished.
Attachment #392157 -
Flags: review?(igor) → review+
Comment 34•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/9fbf8217bd47
Whiteboard: fixed-in-tracemonkey
Comment 35•15 years ago
|
||
Attachment #392592 -
Flags: review?(igor)
Updated•15 years ago
|
Attachment #392592 -
Flags: review?(igor) → review+
Comment 37•15 years ago
|
||
Increased ceiling to 125% (from 118.5%), trying to identify tjss regression cause.
Comment 39•15 years ago
|
||
We are hitting a bunch of OOM crashes on talos nochrome. MaybeGC is not called or not called sufficiently I assume. Brendan suggest calling MaybeGC from the engine as well instead of relying on the embedding.
Comment 40•15 years ago
|
||
Backed out. I will look at the tjss issue first. http://hg.mozilla.org/tracemonkey/rev/809e7d5f5b78
Comment 41•15 years ago
|
||
had to back this out again, unit tests failing.
Updated•15 years ago
|
Whiteboard: fixed-in-tracemonkey
Comment 42•15 years ago
|
||
Attachment #392157 -
Attachment is obsolete: true
Attachment #392592 -
Attachment is obsolete: true
Attachment #392157 -
Flags: review?(pavlov)
Updated•15 years ago
|
Attachment #394957 -
Flags: review?(jwalden+bmo)
Updated•15 years ago
|
Attachment #394957 -
Flags: review?(jwalden+bmo) → review+
Comment 43•15 years ago
|
||
Comment on attachment 394957 [details] [diff] [review] make xpcshell call MaybeGC once a second via the operation callback >+ // Trigger a GC if the working set grew by more than 32MB and at least 25% >+ if (rss > lastRSS + 32*1024*1024 && rss > (lastRSS + lastRSS/4)) >+ JS_GC(cx); I think we've generally stuck to C-style comments outside of jstracer.cpp; please do that here. Please put spaces around the *, and nix the parens around |lastRSS + lastRSS / 4|. >+static const char* >+skipFields(const char* p, unsigned n) >+{ >+ while (*++p) >+ if ((*p == ' ') && (--n == 0)) >+ return p+1; >+ return NULL; >+} Brace the body of the loop. >+static int statfd = 0; Is there any difference if this is placed inside |js_GetRSS|? I'd prefer it there if there isn't. >+size_t >+js_GetRSS() >+{ >+ char buf[1024]; >+ if (!statfd) { >+ sprintf(buf, "/proc/%d/stat", getpid()); Paranoia, but use snprintf here with some generous value like 100, and overall size like 128 (avoid page and cache faults more often, 1024 seems more likely to occasionally unlucky). >+ if (statfd == -1) >+ return 0; < 0, please, equality looks like it could go haywire if you got unlucky in the right ways. >+ lseek(statfd, 0, SEEK_SET); >+ int n = read(statfd, buf, sizeof(buf)-1); |if (n < 0) return 0|, looks like you'd smash the stack if you don't do this! >+ buf[n] = 0; >+ const char* p = strrchr(buf, ')') + 2; >+ if (!p) >+ return 0; Um...I don't think !p is ever going to be the case if you add 2 here! Few enough changes to r+, but only with the above fixed! :-)
Comment 45•15 years ago
|
||
static int foo inside a function is different from outside according to luke. Fixed everything else.
Comment 46•15 years ago
|
||
How is POD static in a function different from out? It ain't in C. /be
Comment 47•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/6585e85bdbcd http://hg.mozilla.org/tracemonkey/rev/0398f87956e0
Updated•15 years ago
|
Whiteboard: fixed-in-tracemonkey
Comment 50•15 years ago
|
||
Looks like try-server is back amongst the living so doing another run of that ...
Comment 51•15 years ago
|
||
I think I have the GC experts in this bug. So, excuse me putting this post here. Is it possible that the GC is not triggered if you have a Javascript and you don't touch FF? We have clear proof that Google pages eat memory when you keep FF idle. See bug 510854. However, when you touch the UI, all the memory is released. But without it can take as much memory as you want (although I did not wait longer after 30Mb of eaten memory). If you want to try by yourself, open the google book page in the bug, in 10 tabs. You can also do it with the Google main page, but then you have to wait for 4 minutes. So, it looks that this GC related. Has anyone suggestion? It doesn't reproduce when I copy the page locally. Lucas
Comment 52•15 years ago
|
||
With the patch in this bug we GC every time the overall process memory use grows by 25% (and at least 32MB). After the GC we sample the heap size and then use 125% of that as new water mark. So what you describe shouldn't happen with the attached patch. Without it, anything is possible. The old heuristics is rather strange and hard to predict.
Comment 53•15 years ago
|
||
So, then bug 510854 is a dupe of this one? Can you test the google book page with your patch? I think I gave you a free test case :-)
Comment 54•15 years ago
|
||
I am pretty busy landing and debugging patches for a while. I don't think I will get around to testing this particular behavior but feel free to give it a spin.
Comment 55•15 years ago
|
||
This is known to regress tjss. Landing so we can get some data. http://hg.mozilla.org/tracemonkey/rev/9b6b17a275ec
Whiteboard: fixed-in-tracemonkey
Comment 56•15 years ago
|
||
malloc_usable_size is not a portable function. It's not available on Solaris. Should I re-implement js_*alloc to get around? e.g. for js_malloc(), do { size_t* iaddr = malloc(bytes+sizeof(size_t)); if (iaddr == nsnull) return nsnull; iaddr[0] = bytes; return (&iaddr[1]); } Is there another way to get around?
Comment 57•15 years ago
|
||
Thats pretty unfortunate. We should double check that we really funnel all allocations through js_malloc and friends.
Comment 58•15 years ago
|
||
I think it may waste a lot of memory, if js_malloc is commonly used for small allocations. Is it possible to disable building memory-pressure-based GC?
Comment 59•15 years ago
|
||
We will back out the patch in the morning due to failures (it was meant to cycle overnight to collect some data). We should discuss our options in the bug. The problem isn't so much the memory pressure measurement. Its accounting for the background deallocation. If we disable that selectively, malloc_size is not needed.
Comment 60•15 years ago
|
||
We should autoconf-test for malloc_usable_size, not use a rat's nest of OS-specific ifdefs. /be
Updated•15 years ago
|
Whiteboard: fixed-in-tracemonkey
Comment 61•15 years ago
|
||
How's this going? How soon are we to getting this something which will fix the excessive memory usage of xpcshell on trunk? We're seeing a lot of random mochitests timeout on Tinderbox, and I think this is a major cause.
Comment 62•14 years ago
|
||
Refreshed and fixed compiling errors for OS X 10.5.2
Attachment #394957 -
Attachment is obsolete: true
Comment 63•14 years ago
|
||
gregor++. Thanks for the refresh.
Comment 64•14 years ago
|
||
Another fix for linux build. Tryserver shows that CreateTimerQueueTimer is not supported by all windows versions (WinCE, WinMo)
Attachment #400932 -
Attachment is obsolete: true
Updated•14 years ago
|
Assignee: gal → anygregor
![]() |
||
Comment 65•13 years ago
|
||
Is this bug still relevant? Can I WONTFIX it?
![]() |
||
Comment 66•13 years ago
|
||
Bug 664291 is also related.
Reporter | ||
Comment 67•13 years ago
|
||
(In reply to comment #65) > Is this bug still relevant? Can I WONTFIX it? We do still want to work on the general concept, but it looks like this particular bug hasn't been moving it a while. So let's WONTFIX it, and create a new bug if we decide to try again, or reopen if someone does want to revive this patch.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
Comment 68•13 years ago
|
||
Bug 655455 is also similar, in terms of making the JS GC better about responding to memory pressure, though I think the focus there is going to be more on keeping the heap size under control.
You need to log in
before you can comment on or make changes to this bug.
Description
•