Created attachment 530814 [details]
I am testing with my new netbook: single-core Atom at 1.66GHz, 1GB of RAM.
We have a 7-second and a 4-second GC pause in there. Looks like they happen during the RegExp benchmark.
Is this Windows or Linux? I am dying to see a profile.
It's Win 7 Professional. I already see a 50-second pause during the realtime raytracer. I guess it's time to make allocation take memory pressure into account.
Gregor just told me we should get a lot more people netbooks so that we see this kind of stuff earlier (it's probably also a good simulation of cell phone use).
I have 1GB of RAM and it seems like the process gets about 300-400MB of it. After that it starts paging and GC performance gets exponentially worse.
Is there any way to specify the max heap size for Firefox? It would be useful for performance testing like this, and for bugs that only show up under OOM conditions.
We have a patch lying around that I made two years ago which measures the total process heap size and tries to regulate it. Even better would be to measure paging and GC aggressively as soon as we get near that limit.
A question: how does the timescale for doing that compare to the timescale for doing generational GC (GGC) and just GCing more often, period? Or would we still want changes like this with a GGC?
Discussed in triage today - we don't think this is firefox 6 specific, but rather "ASAP" - would love to approve a safe patch!
dmandelin, can you find an assignee for this?
Gregor, want to own this?
This doesn't seem to be a well-defined bug yet. What do we want here? To avoid the long GC pause on a 1GB netbook? Don't we need a 1GB netbook in order to test this?
Also, do we know the max heap size required by the application? I.e., is the problem that our GC is allocating too much memory before it does a GC, or is the problem that the workload just uses too much memory?
Gregor bought a 1GB netbook for testing purposes. The problem is that we exceed the available physical RAM and we have to page to/from disk during GC, which is insanely slow. We need better working set size management.
(In reply to comment #12)
> Gregor bought a 1GB netbook for testing purposes.
+1 to Gregor. If he wants to work on this, that's great, but otherwise, we'd need another one.
> The problem is that we
> exceed the available physical RAM and we have to page to/from disk during
> GC, which is insanely slow. We need better working set size management.
Are you saying that it is because we are not GC'ing soon enough? I know we've talked about detecting and responding to memory pressure before, and that you've been a long-time advocate of it. Do you know of any papers on that subject? The review papers and textbooks from the 90s don't seem to have much to say about it. So it sounds like it's going to need to use OS-specific APIs, and also like it's a research project--i.e., don't expect a quick fix. I fully agree that we should do that research at some point, though.
Or is there a quick fix, something like reading out how much memory the machine has, and if < 2GB, reset some tuning parameters?
All of the above sounds reasonable. I am all for a quick and dirty hack to make GCs more aggressive depending on physical memory size, and we should work on a more thorough approach that measures the working set and tries to balance it. I don't think academia talks much about this; it's very application-dependent.
MLton attempts to adapt its GC based on the size of the working set vs system memory ( http://mlton.org/GarbageCollection ) but I'm not sure if they do anything beyond switching from a Cheney collector to a Mark-Compact collector when space runs low.
I was already suggesting that many more people should get a netbook. I was testing heap growth parameters on it recently and changes that don't show any regression on my super MBP were a major slowdown on the netbook.
It also works the other way around where smaller memory footprint results in a 2x speedup on the netbook and a 1% regression on the MBP.
This work is very time-consuming because compiling the browser on this device doesn't work very well and it's easier to compile on the try-server.
I am pretty busy right now but we should come up with a strategy first and then I can decide if I have enough time to implement it.
That's what you get for commenting on bugs with smart comments ... ;)
Andrew, want to ask IT for a netbook? If they don't have one, just pick one up from Best Buy and expense it. I know that cycle collection (CC) can be ridiculously slow on machines with limited RAM, too, so I guess we can check out both issues here.
Well, that's basically the entirety of my knowledge of resource-constrained GC. But I can look around to see what people do. How does Firefox Mobile deal with this? The Nexus S only has half a gig of RAM. But it looks like its secondary storage is flash RAM, so maybe thrashing doesn't hit it quite as hard?
dougt can explain to us what mobile does, but I think basically on mobile you don't look at as many tabs at the same time, and the OS kills you when you start paging. They could certainly use better working set management too.
I ended up just parsing /proc/self/stat to look at RSS for this sort of thing.
gjs is in a tricky place though, because we're not in a position where we can easily tell SpiderMonkey how much native malloc() allocation we do.
We are using a custom malloc (jemalloc), we might be able to tell easily what the total count is.
Any ideas regarding when to monitor the memory pressure?
In FF our GC also uses jemalloc to allocate its chunks. If jemalloc is extended with a hook that is invoked whenever jemalloc calls mmap to allocate another 1MB chunk, then we can do the monitoring from the hook and schedule the GC accordingly.
(In reply to comment #23)
> We are using a custom malloc (jemalloc), we might be able to tell easily
> what the total count is.
We can, see GetHeapUsed() and GetHeapUnused() in xpcom/base/nsMemoryReporterManager.cpp.
Gregor pointed out an interesting ISMM'11 paper on adapting GC triggers based on memory pressure. They approximate memory pressure using major page faults since the last GC and resident set sizes. I guess another thing to consider would be page faults that happen during the GC. I'll look into that.
(In reply to comment #26)
> Gregor pointed out an interesting ISMM'11 paper on adapting GC triggers
> based on memory pressure. They approximate memory pressure using major page
> faults since the last GC and resident set sizes. I guess another thing to
> consider would be page faults that happen during the GC. I'll look into that.
jlebar is working on this in bug 664291.
Gregor, does bug 656120 ameliorate or fix this bug?
In that bug he said "Our memory footprint is a big problem and I filed this bug because FF was/is very painful to use or even unusable (see bug 655455) on my new lowest-end netbook. This patch tries to keep the memory footprint small and therefore is a big win on such devices. Finally I can use FF on my new machine."
The heap sizing mechanism is still a bit funny, so we should still consider tweaking it.
(In reply to comment #28)
> Gregor, does bug 656120 ameliorate or fix this bug?
Bug 656120 makes it possible to leave the browser open over night with some allocating workload or run it in background with another application in parallel. Yeah :)
This bug is (most likely) caused by a swapping problem during the GC, and we haven't fixed it. Bug 664291 seems to be the right solution. Or maybe we also need a hard upper JS-heap limit for memory-constrained devices. Right now we completely ignore this information and let the heap grow without limit. Bug 592907 wanted to go in this direction.
We are going in the right direction, but we are not close to a point where I would say that it is a pleasure to use FF on my netbook.
Gregor, judging from comment 30, the situation has improved since you filed this bug, and we have other bugs filed for all the remaining ideas we have to improve it further. So I'll close this one, please reopen if you disagree.