Created attachment 470518 [details]
I looked at the testcase from bug 590379. The memory usage for the browser increases up to 4GB instead of 1GB in a build from 2 days ago.
The attached file shows that we keep allocation without performing GC.
Created attachment 470520 [details]
The patch for bug 592001 restores some memory pressure updating. Can you give it a try and see whether it helps? Thanks,
Created attachment 470531 [details]
stats with patch from bug 592001
I tried with the patch from bug 592001 but I still see the same behavior.
Reducing JSGC_MAX_MALLOC_BYTES to 32MB again also doesn't work.
(Tested with patch from 592001 applied)
Gregor, wow, how about to less than 16MB, say 8? Binary search for value with same effects as before?
In the browser JSGC_MAX_BYTES is 2^32. We should try to reduce this to some sane value as now, without malloced scopes, we have much less malloc pressure.
Gregor has kindly agreed to take this. It's clear we were relying on malloc throughput to page GC, which happened to work (better, or more often) when we had scopes bloating up the malloc heap for objects with own properties. Now that we do not, measuing malloc bytes is not going to save us from GC-heap exhaustion.
A possible quick fix could be make cx->realloc to update the malloc pressure counter on each realloc. Then at least slot arrays would contribute the pressure counter.
(In reply to comment #7)
> Gregor has kindly agreed to take this. It's clear we were relying on malloc
> throughput to page GC,
I meant "pace GC", but "page GC" as in "paging Dr. GC", works too.
Really do want more precise memory pressure accounting, or perhaps instead of us crudding up our code and data structures trying to keep track, plan B: ask the OS and avoid running up against its walls.
Changing realloc is problematic. It caused our string behavior to be wonky before.
There is a patch for #9 floating around.
Created attachment 470663 [details] [diff] [review]
This is a simple approach where I count all arenas that are alive after a GC and when the allocation of new arenas reaches a certain threshold (relative to the live arenas after the last GC), a new GC is triggered. This heuristic should only apply if we have a high memory throughput.
The current patch only triggers GCs once we have allocated more than 40 Chunks.
We have about 250 Arenas per Chunk and for a browser startup we allocate about 2000 arenas.
The remaining question is the tuning.
The activity monitor tells me that "real memory usage" is down to about 750 MB again.
Created attachment 470673 [details] [diff] [review]
changed to use gcBytes
What's the 40? And is 1.7 really necessary? 2 won't work?
Created attachment 470681 [details] [diff] [review]
The problem is that this approach shouldn't trigger the gc for a small working set. High throughput benchmarks show clearly that they need a few seconds until they get a stable working set at the beginning if the trigger value is too small. 40 means that we don't trigger a GC if we have less than 40 chunks allocated.
I found that small changes influence the behavior of the whole browser so we have to find the right setup.
Created attachment 470695 [details]
Some statistics for this patch:
We have about 500 Chunks allocated instead of 260 from before the patch.
The "real memory" usage (as the activity monitor on mac shows) is not increased.
The GC pause is a little bit longer because of the bigger heap.
The good thing is that the pause time does not increase linear with the heap size and is only slightly longer.
The remaining GC calls (once we have a stable working set) are coming from the browser side with the GC invocation callback.
You can't just sparkle the code with 3 instances of the magic value "25". At least put in some constant and re-use that. Its still not clear to me why we need to check against 2 trigger values, maxbytes and triggerbytes. Shouldn't one threshold be enough?
As mentioned in comment 6 JSGC_MAX_BYTES is 2^32. I don't want to change this value since this may be set via the API and embedders might use it for a hard upper limit. The triggerbytes should prevent a heap growth to the limit without a GC.
The "magic" 25 comes from the "old" GC heuristics where we triggered a GC for every 50 Chunks we allocated in a high throughput environment.
If I use a growth factor of 2 instead of 1.5 we end up with 800 Chunks instead of 500 and the pause time gets worse.
So the whole GC heuristic is pretty unstable right now and it needs more than one afternoon to stabilize it again. This patch prevents the heap explosion but it is definitely not the best solution.
At least add a couple clean constants to the code and then lets land this. We should pick up the memory pressure patch after this one. The current code is total clownshoes. Lets do it after your stuff lands though.
Created attachment 470722 [details] [diff] [review]
Lets hope that not too many testcases and benchmarks relied on the old GC heuristics...