Bug 664291 - Generate better low-memory events on Linux and Mac (and Android?) by tracking hard page faults
Status: NEW
[MemShrink:P2]
Product: Core
Classification: Components
Component: General
Version: Other Branch
Platform: All All
Importance: -- normal with 17 votes
Target Milestone: ---
Assigned To: Justin Lebar (not reading bugmail)
Mentors:
Depends on: 664486 664758 670967
Blocks: image-suck 655455 660577
Reported: 2011-06-14 14:27 PDT by Justin Lebar (not reading bugmail)
Modified: 2013-11-07 03:54 PST
CC List: 39 users


Attachments
WIP v1, Linux/Mac only. (13.43 KB, patch)
2011-06-20 18:27 PDT, Justin Lebar (not reading bugmail)
no flags
WIP v2 (29.91 KB, patch)
2011-06-23 09:21 PDT, Justin Lebar (not reading bugmail)
no flags
Patch v1 (28.03 KB, patch)
2011-06-29 13:56 PDT, Justin Lebar (not reading bugmail)
n.nethercote: feedback+
Patch v2 (29.12 KB, patch)
2011-06-30 09:03 PDT, Justin Lebar (not reading bugmail)
taras.mozilla: review-
Rev 2, part 3 v1: Pass env var when running PGO-instrumented build (2.08 KB, patch)
2011-12-12 22:01 PST, Justin Lebar (not reading bugmail)
no flags

Description Justin Lebar (not reading bugmail) 2011-06-14 14:27:32 PDT
Right now we don't have a good way of notifying code when we're running low on memory.  We try to sound warning bells when an allocation fails, but by then it's probably way too late.

A better approach would be to notice when the browser is being paged out by the OS.  If we catch it early enough, we may be able to reduce our footprint by running a GC/CC or by dropping caches, etc.  This was suggested in bug 661304 comment 32 and at [1].

A paper [2, from 1] suggests using the simple heuristic of "since the last time I checked, have we had 10 or more hard page faults, or has our resident size decreased?"  (In the paper's context, the RSS never decreases except due to paging.)

I presume we can tune the heuristic without much difficulty, although we'll need to figure out how to determine whether an RSS decrease is due to paging.  The harder problem, I think, will be determining the hard page fault count on Windows [3].  (Unix has getrusage, which makes it easy.)  It looks like we may be able to use ETW on Windows, although it's not clear whether that carries a performance penalty.  Chrome does something with ETW and page faults [4], although I don't know if they use it in production or only for debugging.

[1] https://groups.google.com/forum/#!topic/mozilla.dev.platform/DMqQ_HKEMp4
[2] http://www.cs.rochester.edu/~xiaoming/publications/ismm11-b.pdf
[3] http://glandium.org/blog/?p=1963
[4] http://code.google.com/p/sawbuck/source/browse/trunk/sawbuck/py/etw/etw/descriptors/pagefault.py
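
A minimal sketch of the heuristic described above, in Python and Unix only. It assumes the hard page fault count comes from getrusage (the ru_majflt field) and that the caller supplies the RSS samples from some other source; the function names and the way samples are passed in are illustrative, not taken from any patch.

```python
import resource

FAULT_THRESHOLD = 10  # the paper's suggested value; would need tuning


def hard_page_faults():
    """Cumulative hard (major) page faults for the current process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_majflt


def has_memory_pressure(prev_faults, cur_faults, prev_rss, cur_rss):
    """True if enough new hard faults occurred since the last check, or RSS shrank.

    Assumption: as in the paper's context, an RSS decrease is treated as
    evidence of paging. That may not hold for a browser that frees memory.
    """
    new_faults = cur_faults - prev_faults
    return new_faults >= FAULT_THRESHOLD or cur_rss < prev_rss
```

A caller would sample hard_page_faults() and the RSS on each check and pass the previous and current values to has_memory_pressure().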
Comment 1 Nicholas Nethercote [:njn] 2011-06-14 16:58:48 PDT
(In reply to comment #0)
> 
> A paper [2, from 1] suggests using the simple heuristic of "since the last
> time I checked, have we had 10 or more hard page faults, or has our resident
> size decreased?"  (In the paper's context, the RSS never decreases except
> due to paging.)

The other part of this heuristic is how often it's run.  You start off doing it every N events (where "events" needs to be chosen, it could even be time-based).  Every time you don't have high memory pressure, you increase N by 1.  Every time you do have high memory pressure, you decrease N by a factor of 10.  That gives fast response when memory becomes tight, but slowly decreasing overhead when memory is plentiful.

If this is done well it might be enough to mitigate any overhead of using ETW on Windows?
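
As a rough illustration of that schedule (a sketch only; the function name and the floor of 1 are assumptions, not something the paper or any patch specifies):

```python
def next_interval(n, under_pressure, minimum=1):
    """Return the next check interval N given the latest check result.

    No pressure: back off slowly by adding 1.
    Pressure: tighten quickly by dividing by 10, never going below `minimum`.
    """
    if under_pressure:
        return max(minimum, n // 10)
    return n + 1
```

Starting from N = 100, one check under pressure drops N to 10, and it then takes 90 quiet checks to climb back to 100.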
Comment 2 Justin Lebar (not reading bugmail) 2011-06-15 16:06:22 PDT
I blogged about running Firefox with a reduced max-RSS so as to force it to page [1].  But since this is more persistent than my blog, here's the short version.  On Ubuntu 11.04:

* Install the cgroup-bin package.

* Edit /etc/cgconfig.conf and create a group with limited memory.  For
  instance, I added:

      group limited {
        memory {
          memory.limit_in_bytes = 50M;
        }
      }

* Run

    # restart cgconfig
    # chown jlebar /sys/fs/cgroup/memory/limited
    # chown jlebar /sys/fs/cgroup/memory/limited/*
    $ cgexec -g memory:limited dist/bin/firefox

And have at it.  I observed FF holding to a 93M RSS when I asked it to have a
50M limit, but that's no problem for me.  It did page, spectacularly.

cgclassify theoretically lets you attach restrictions to a running process, but
it didn't appear to do anything when used in conjunction with the RSS limit.

[1] http://jlebar.com/2011/6/15/Limiting_the_amount_of_RAM_a_program_can_use.html
Comment 3 Justin Lebar (not reading bugmail) 2011-06-20 13:28:56 PDT
On Windows, it appears we can get a systemwide hard page fault rate with performance counters [1].  That might be good enough for this bug -- If the hard page fault rate is high, either because FF is faulting pages in or some other program is, dump our caches.

Anyone have thoughts on this?

[1] http://msdn.microsoft.com/en-us/library/aa373083%28v=vs.85%29.aspx
Comment 4 Justin Lebar (not reading bugmail) 2011-06-20 18:27:00 PDT
Created attachment 540647
WIP v1, Linux/Mac only.
Comment 5 Nicholas Nethercote [:njn] 2011-06-20 18:52:27 PDT
Comment on attachment 540647
WIP v1, Linux/Mac only.

Review of attachment 540647:
-----------------------------------------------------------------

A couple of drive-by comments from a quick skim...

::: xpcom/base/LowMemoryDetector.cpp
@@ +48,5 @@
> +static PRLogModuleInfo* gLogModule = PR_LOG_DEFINE("LowMemoryDetector");
> +#define DEBUG(format) PR_LOG(gLogModule, PR_LOG_DEBUG, format)
> +#define INFO(format) PR_LOG(gLogModule, PR_LOG_WARNING, format)
> +
> +class LowMemoryNotificationReporter : public nsIMemoryReporter

I think you can use the NS_MEMORY_REPORTER_IMPLEMENT macro to avoid some boilerplate code here.  See js/src/xpconnect/src/xpcjsruntime.cpp for some examples of its use.

@@ +70,5 @@
> +
> +  nsresult GetKind(PRInt32 *aKind)
> +  {
> +    NS_ENSURE_ARG_POINTER(aKind);
> +    *aKind = MR_COUNT;

Does this patch compile?  MR_COUNT doesn't exist.  aboutMemory.js will also need modification in order to print the appropriate units -- it currently assumes all measurements are in bytes and prints "B" or "MB" as the unit.
Comment 6 Justin Lebar (not reading bugmail) 2011-06-20 18:55:16 PDT
This patch depends on the patch in bug 664486, which adds page faults to about:memory for Linux/Mac and implements MR_COUNT (with the appropriate changes to aboutMemory.js).
Comment 7 Justin Lebar (not reading bugmail) 2011-06-21 14:54:54 PDT
(In reply to comment #1)
> The other part of this heuristic is how often it's run.  You start off doing
> it every N events (where "events" needs to be chosen, it could even be
> time-based).  Every time you don't have high memory pressure, you increase N
> by 1.  Every time you do have high memory pressure, you decrease N by a
> factor of 10.  That gives fast response when memory becomes tight, but
> slowly decreasing overhead when memory is plentiful

I don't think this heuristic makes much sense in our context.  We lose unless we react quickly to low memory.  If we don't check for low memory often while we think everything is fine, then it's likely that by the time we notice that memory is low, the OS has already paged us out (that's what we were trying to avoid in the first place!).

Similarly, it doesn't make much sense to check for low memory more often right after we notice that memory is low.  That might make sense if we were able to dramatically decrease our footprint in response to memory pressure -- if we didn't release enough stuff on the first low memory event, maybe the second one will do it.  But in our case, there's relatively little I expect us to be able to release, so I don't expect that firing the notification many times in succession will do much good.

If we want any kind of backoff, I think we might want to back off when we hit a low memory notification (i.e. the opposite of what the paper suggests).  If we keep observing low memory, then clearly our notification isn't doing much good.  If a low-memory notification triggers, say, a GC, we wouldn't want to do that over and over while we're paging.
Comment 8 Justin Lebar (not reading bugmail) 2011-06-21 15:10:31 PDT
> We lose unless we react quickly to low memory.

On the other hand, I'm also kind of worried about checking all the time when it's not necessary.

On Windows, I can detect when the system as a whole starts paging.  When the system starts paging, we want to drop caches even if we haven't been doing anything lately, both to be a good citizen and to keep more important parts of FF from being paged out.  But to detect that, we have to check even while we're idle.

It remains to be seen how much CPU checking once a second will use.
Comment 9 Justin Lebar (not reading bugmail) 2011-06-22 12:58:48 PDT
bz or anyone else: Do you have an opinion on whether this should run in a separate thread or in the main event loop?  Right now it's only checking once a second.
Comment 10 Justin Lebar (not reading bugmail) 2011-06-22 13:03:03 PDT
Doug, it looks like you did something like this for Android and then backed it out.  Do you have any comments about the approach here?  Do you think we should turn it off for Android, since AIUI Android doesn't have swap?
Comment 11 Boris Zbarsky [:bz] (TPAC) 2011-06-22 13:06:07 PDT
From what I understand, once-per-second wakeups would be a serious problem for mobile.  Can we do whatever the JS folks did for periodic GC that doesn't run if nothing has happened here?

What's the practical difference between separate thread and main event loop?  Aren't you proxying to the main thread anyway?
Comment 12 Justin Lebar (not reading bugmail) 2011-06-22 13:17:11 PDT
(In reply to comment #11)
> From what I understand, once-per-second wakeups would be a serious problem
> for mobile.  Can we do whatever the JS folks did for periodic GC that
> doesn't run if nothing has happened here?

Sure, although it's not clear whether it should even be on for mobile in its current form.  Right now, on *nix, it warns about low memory when it sees that we're paging in, but AIUI there's no paging on Android at all.

We might want to do the same thing as the periodic GC to avoid wakeups on desktop, too.  I'll look into it.

Maybe we can watch something else on Android.  There's [1], but that seems to have been ill-fated (it was backed out [2]).

> What's the practical difference between separate thread and main event loop?
> Aren't you proxying to the main thread anyway?

It only proxies to the main thread if it needs to fire a low-memory event.  It's not clear how fast the syscall to get the memory info runs on Windows...

[1] http://hg.mozilla.org/mozilla-central/rev/37e4ab3abc44
[2] http://hg.mozilla.org/mozilla-central/rev/99af9d6485bb
Comment 13 Boris Zbarsky [:bz] (TPAC) 2011-06-22 13:22:18 PDT
If you _can_ stay off the main thread, that seems clearly superior to me.
Comment 14 Doug Turner (:dougt) 2011-06-22 13:25:46 PDT
justin - right, no swapping on most devices.  the n900/maemo had an option to enable swap to sdcard, but that is generally the exception.

regarding the android patches to detect low memory -- there was a system release for the Nexus S in which, when low memory was hit, the kernel would panic.  This was fixed with a point release.  in order to work around this, we attempted to watch for low memory (MemFree, etc).  however, we found this to be not a good indicator of actual memory available.  on linux, free memory typically gets used for fs caches and, when needed, is given back to user space processes.

Android does send a 'system is low on memory' notification, but typically we get this event very late.  Most other processes at the point of this notification have already been killed.  However, this might be a good enough signal for what you are trying to do.
Comment 15 Andrew McCreight [:mccr8] 2011-06-22 14:03:04 PDT
I think the JS GC trigger waits until allocations, which won't work as well for general memory watching, as there are all sorts of allocations all over the place, and I don't know if you'd be able to instrument them all.
Comment 16 Justin Lebar (not reading bugmail) 2011-06-23 09:21:46 PDT
Created attachment 541401
WIP v2

This is almost ready for review.  The main thing that's left to do is tune how often we check for memory pressure (currently every second when there's no memory pressure, backing off to once every 20s when there is pressure) and what page fault / page out rate we declare is indicative of pressure (right now it's 10 faults or page outs in one second).
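
The check schedule described above could be sketched like this in Python. The constants mirror the numbers in this comment; the function and its return shape are hypothetical, and the patch itself is C++ and may be structured quite differently.

```python
NORMAL_INTERVAL_S = 1             # check every second while memory looks fine
BACKOFF_INTERVAL_S = 20           # check every 20s once pressure is detected
EVENTS_PER_SECOND_THRESHOLD = 10  # faults or page-outs per second


def check(events_since_last_check, elapsed_s):
    """Return (under_pressure, seconds_until_next_check)."""
    rate = events_since_last_check / elapsed_s
    under_pressure = rate >= EVENTS_PER_SECOND_THRESHOLD
    interval = BACKOFF_INTERVAL_S if under_pressure else NORMAL_INTERVAL_S
    return under_pressure, interval
```

Raising EVENTS_PER_SECOND_THRESHOLD trades fewer false alarms for a slower reaction.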

I'm tempted to change the 10 to 1000 or something, so we can have some confidence that we're not firing these notifications when there's plenty of available memory.  But that might not let us react quickly enough to memory pressure.  Suggestions on this point are welcome!

I also need to check that writing to an mmap'ed file on Windows doesn't cause the page out counter to increase.
Comment 17 Justin Lebar (not reading bugmail) 2011-06-23 11:48:40 PDT
> I also need to check that writing to an mmap'ed file on Windows doesn't cause 
> the page out counter to increase.

Verified this with a Python script which writes 1M to an mmap'ed file on a Windows 7 VM.  I'll test on Windows XP once I finish downloading the ISO (nobody uses Vista, right?).

The pages output / sec counter stays firmly pinned at 0 as I use my VM to load programs and browse around, but fires away once I start approaching the machine's memory limit.  That's great to see.
Comment 18 Justin Lebar (not reading bugmail) 2011-06-23 12:09:10 PDT
I'm seeing periodic spikes up to 32 pages output / sec on Win XP, so we should set the threshold above that.

If you have Windows and want to see what your pages output / sec looks like, run the performance monitor.  It's in the start menu on Win 7 (I think with that name), and on WinXP, you can start it through start, run, perfmon.msc.  Then add the "pages output / sec" counter under "memory".
Comment 19 Justin Lebar (not reading bugmail) 2011-06-23 13:10:15 PDT
With some hammering on my WinXP VM, I can get the pages output / sec number to creep up to 100 or so before there's actually memory pressure.  So maybe this is the wrong metric to use.

Windows does provide an "Available MBytes" metric which looks pretty good.  When it hits zero, we start paging like crazy.  Maybe we should just watch that.  I need to test it on Win7 now...

I think we should also be watching our virtual address space, on all platforms.  If we're almost out of address space, we should run out of the room crying for our mother.  Or at least fire a low-memory notification.
Comment 20 Emanuel Hoogeveen [:ehoogeveen] 2011-06-23 19:28:14 PDT
Would it be going too far to keep a running mean and variance for the number of page counts per second? Then you could impose a dynamic limit like 'more than 5 standard deviations outside the normal range for two measurements in a row' (5 standard deviations being a 1 in ~2 million occurrence for a normal distribution).

A running mean and variance are cheap to compute ('mean = sum / n' and 'variance = mean_of_squares - square_of_mean') as long as you can do it in integers, but it does require keeping some memory around to evict the oldest entry for every new measurement.
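
A sketch of that idea in Python, assuming a fixed-size sliding window. The class and method names are made up for illustration, and this version flags a single sample rather than requiring two outliers in a row.

```python
from collections import deque
import math


class RunningStats:
    """Mean and variance over the last `window` samples."""

    def __init__(self, window):
        self.window = window
        self.samples = deque()
        self.total = 0      # sum of samples in the window
        self.total_sq = 0   # sum of squared samples in the window

    def add(self, x):
        self.samples.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.samples) > self.window:
            old = self.samples.popleft()  # evict the oldest measurement
            self.total -= old
            self.total_sq -= old * old

    def mean(self):
        return self.total / len(self.samples)

    def variance(self):
        # variance = mean_of_squares - square_of_mean
        m = self.mean()
        return self.total_sq / len(self.samples) - m * m

    def is_outlier(self, x, sigmas=5):
        """True if x lies more than `sigmas` standard deviations above the mean."""
        std = math.sqrt(max(self.variance(), 0.0))
        return x > self.mean() + sigmas * std
```

The deque is the extra memory mentioned above: one stored sample per slot in the window.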
Comment 21 Justin Lebar (not reading bugmail) 2011-06-24 07:16:17 PDT
> Would it be going too far to keep a running mean and variance for the number of 
> page counts per second?

I don't think that would be infeasible or overkill, but I'm also not sure it's better than the alternatives.

On Windows, there are large spikes in the page fault count when programs load, so we just don't want to use that metric, at least not by itself.  And on *nix, at least in my testing, we see very few page faults until we run out of memory, so I think the mean/variance measurement would be itself pretty noisy.

We'll hopefully have page fault counting in about:memory soon so others can have a look on *nix.
Comment 22 Robert Lickenbrock [:rclick] 2011-06-24 20:25:14 PDT
(In reply to comment #18)
> I'm seeing periodic spikes up to 32 pages output / sec on Win XP, so we
> should set the threshold above that.

I see something similar to this on Win7 except that it happens in bursts of 256 instead of 32. I'm not positive, but I think what we're seeing is Windows preemptively writing infrequently accessed pages to the pagefile so the memory can be reclaimed quickly.

(In reply to comment #19)
> With some hammering on my WinXP VM, I can get the pages output / sec number
> to creep up to 100 or so before there's actually memory pressure.  So maybe
> this is the wrong metric to use.

What sort of numbers do you see when there actually is memory pressure?

On my Win7 box, pages output / sec usually jumps straight from zero to thousands, although the spikes are occasionally in the 700-900 range. I wonder if using a threshold of something like 500 would be reasonable...
Comment 23 Justin Lebar (not reading bugmail) 2011-06-28 12:41:12 PDT
For the running-out-of-virtual-space case, we need to figure out how much of the virtual address space is available to userspace programs.

If you want to follow along at home, I've attached a Python script at the bottom of this comment which tries to mmap until it can't anymore.

Windows's limit is 2GB, unless you boot with some special switch.  I think I'm just going to ignore this and say that the VM limit is 2G, unless anyone objects.

The Linux-32 box I tested on apparently has a 3GB limit.  I can't find anything authoritative on this, but I think this is hardcoded.

Still looking for a mac-32 box...

    from itertools import count
    from mmap import mmap
     
    maps = []
    for i in count(1):
        maps.append(mmap(-1, 1024 * 1024))
        print('mapped %dmb' % i)
Comment 24 Nicholas Nethercote [:njn] 2011-06-28 17:07:54 PDT
(In reply to comment #23)
> 
> The Linux-32 box I tested on apparently has a 3GB limit.  I can't find
> anything authoritative on this, but I think this is hardcoded.

Nope.  IIRC you can have various set-ups.  3GB is the most common, but you can have 1GB, 2GB, even 4GB somehow.  So I'm uncomfortable about using a hard-coded value.  I wonder if you can get the number from /proc.
Comment 25 Justin Lebar (not reading bugmail) 2011-06-28 18:29:38 PDT
(In reply to comment #24)
> Nope.  IIRC you can have various set-ups.  3GB is the most common, but you
> can have 1GB, 2GB, even 4GB somehow.  So I'm uncomfortable about using a
> hard-coded value.  I wonder if you can get the number from /proc.

Hm.  How much of a hack would it be to launch a separate process which does a search to find out how much data it can map?  We'd only have to run it once on that machine, and it's not like mmap'ing 4gb is expensive, if you're not going to use it.
Comment 26 Johnny Stenback (:jst, jst@mozilla.com) 2011-06-28 18:36:58 PDT
That *seems* a bit much for me, and I don't know that we can cache that, at least not in a foolproof way. What if the profile is moved from one computer to another, or what about university type installs where the home directory is on NFS and you run firefox from various computers? Not sure how much we care, but it seems caching could cause problems...
Comment 27 Nicholas Nethercote [:njn] 2011-06-28 18:50:29 PDT
Yeah, just sounds like it's asking for trouble :(
Comment 28 Justin Lebar (not reading bugmail) 2011-06-28 18:52:20 PDT
Figuring out how much virtual address space you have to play with should *not* be this difficult.
Comment 29 Justin Lebar (not reading bugmail) 2011-06-28 20:24:52 PDT
It would be nice to say that we usually or always run out of physical memory before we run out of virtual address space, but I don't actually know that's true.  My vmem seems to be about 2 * RSS (on Linux, anyway; we don't report vmem on Windows?), so on a Windows box with 3G of RAM and a 2G per-process vmem limit, it seems that we could hit the vmem limit first.
Comment 30 Justin Lebar (not reading bugmail) 2011-06-28 20:34:14 PDT
Digging through the syscalls, it looks like maybe we can't get the vsize on Windows?  (Seriously?)  If so, I guess we can scrap that for now.
Comment 31 Justin Lebar (not reading bugmail) 2011-06-28 20:37:00 PDT
...well, process explorer somehow gets something it calls "virtual size", so maybe all hope is not lost.
Comment 32 Justin Lebar (not reading bugmail) 2011-06-28 20:42:27 PDT
Aha.  GlobalMemoryStatusEx.ullAvailVirtual tells you how much virtual memory is available in the current process.

http://msdn.microsoft.com/en-us/library/aa366589%28v=vs.85%29.aspx

(How many different ways are there of querying the memory management system in Windows?  It's nuts...)
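
For following along at home, here is a hedged Python sketch that calls GlobalMemoryStatusEx through ctypes. The structure layout follows the MEMORYSTATUSEX documentation linked above; the available_memory wrapper is hypothetical, and it simply returns None on non-Windows platforms.

```python
import ctypes
import sys


class MEMORYSTATUSEX(ctypes.Structure):
    _fields_ = [
        ("dwLength", ctypes.c_uint32),
        ("dwMemoryLoad", ctypes.c_uint32),
        ("ullTotalPhys", ctypes.c_uint64),
        ("ullAvailPhys", ctypes.c_uint64),
        ("ullTotalPageFile", ctypes.c_uint64),
        ("ullAvailPageFile", ctypes.c_uint64),
        ("ullTotalVirtual", ctypes.c_uint64),
        ("ullAvailVirtual", ctypes.c_uint64),
        ("ullAvailExtendedVirtual", ctypes.c_uint64),
    ]


def available_memory():
    """Return (avail_virtual_bytes, avail_physical_bytes), or None if unavailable."""
    if sys.platform != "win32":
        return None
    status = MEMORYSTATUSEX()
    # The API requires dwLength to be set before the call.
    status.dwLength = ctypes.sizeof(MEMORYSTATUSEX)
    if not ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status)):
        return None
    return status.ullAvailVirtual, status.ullAvailPhys
```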
Comment 33 Mike Hommey [:glandium] 2011-06-28 22:17:48 PDT
For Linux, doesn't /proc/meminfo contain what you want? (probably VmallocTotal, but my kernel and system are 64-bit, so the value here is just huge)
Comment 34 Justin Lebar (not reading bugmail) 2011-06-29 07:16:25 PDT
I just booted up a 32-bit Linux VM -- I think it's CommitLimit in /proc/meminfo.

Two down, one to go!
Comment 35 Mike Hommey [:glandium] 2011-06-29 07:34:59 PDT
(In reply to comment #34)
> I just booted up a 32-bit Linux VM -- I think it's CommitLimit in
> /proc/meminfo.
> 
> Two down, one to go!

On my x64 system with 16GB RAM, it reads:
CommitLimit:    10182392 kB

which is much less than the available RAM.

I could happily map 65470mb of virtual address space with your python script.
Comment 36 Justin Lebar (not reading bugmail) 2011-06-29 07:44:23 PDT
Hm.

On my 32-bit VM, CommitLimit is the only value near 3G (which is where the Python script bails).

MemTotal:        2060396 kB
MemFree:          434900 kB
Buffers:          155676 kB
Cached:          1239436 kB
SwapCached:            0 kB
Active:           566296 kB
Inactive:         950720 kB
Active(anon):     122656 kB
Inactive(anon):     2304 kB
Active(file):     443640 kB
Inactive(file):   948416 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       1187784 kB
HighFree:          32724 kB
LowTotal:         872612 kB
LowFree:          402176 kB
SwapTotal:       2095100 kB
SwapFree:        2095100 kB
Dirty:               116 kB
Writeback:             0 kB
AnonPages:        121824 kB
Mapped:            43744 kB
Shmem:              3052 kB
Slab:              90240 kB
SReclaimable:      80508 kB
SUnreclaim:         9732 kB
KernelStack:        2200 kB
PageTables:         4320 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     3125296 kB
Committed_AS:    1063988 kB
VmallocTotal:     122880 kB
VmallocUsed:       32560 kB
VmallocChunk:      85492 kB
HardwareCorrupted:     0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       4096 kB
DirectMap4k:       12280 kB
DirectMap4M:      897024 kB
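
For comparing values like these across machines, a small Python sketch that parses /proc/meminfo text into a dictionary. The function name and the embedded sample are illustrative. Note that the kernel documentation describes CommitLimit as derived from RAM, swap and the overcommit ratio, so its closeness to the 3G mapping limit here may be a coincidence rather than a measure of address space.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(parts[0])
    return fields


# A few lines copied from the dump above.
SAMPLE = """\
MemTotal:        2060396 kB
SwapTotal:       2095100 kB
CommitLimit:     3125296 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(info["CommitLimit"])
```

On a live Linux system the text would come from open("/proc/meminfo").read().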
Comment 37 Mike Hommey [:glandium] 2011-06-29 07:51:33 PDT
On my 64-bit system, there is no value near 65GB.
Comment 38 Mike Hommey [:glandium] 2011-06-29 07:54:36 PDT
Actually, I had just hit another kind of limit: the number of different mappings we can have in a process. If I change your script to allocate 1GB at a time instead of 1MB, I go up to 65TB mapped...
Comment 39 Justin Lebar (not reading bugmail) 2011-06-29 07:57:16 PDT
I just tested on office.mozilla.org, a 32-bit system with CommitLimit 2,553MB, and I could map 3068MB.
Comment 40 Justin Lebar (not reading bugmail) 2011-06-29 08:20:55 PDT
On the 32-bit system with a measured mapping limit of ~3G, there don't appear to be any values immediately in /proc which are 3 followed by at least 6 digits, aside from meminfo:CommitLimit.

I come up similarly empty-handed when I look in /proc/1.

  find /proc -maxdepth 1 | xargs egrep '\<3[0-9]{6}' 2>/dev/null
Comment 41 Justin Lebar (not reading bugmail) 2011-06-29 13:12:21 PDT
FWIW I was able to get vsize to 1600MB with explicit/private/rss of ~900MB on a Windows 7 x86-32 VM.  Since Windows has a (default) max vsize of 2G, I think it's worth monitoring both explicit and vsize here.
Comment 42 Justin Lebar (not reading bugmail) 2011-06-29 13:16:37 PDT
Removing the dependency on bug 668137 (add vsize on Windows) -- as I have the patch written, that bug isn't a strict dependency for firing low vmem events.  I think we want it anyway.
Comment 43 Justin Lebar (not reading bugmail) 2011-06-29 13:56:57 PDT
Created attachment 542934
Patch v1
Comment 44 Justin Lebar (not reading bugmail) 2011-06-29 14:06:11 PDT
I don't think we want to check this in without first auditing all the low-memory listeners and making sure that either:

 * it's OK to run them once a second when there's memory pressure, or
 * they back off when they're called too often.

I considered making the LowMemoryDetector back off and not fire lots of events when there's memory pressure, but I think there's an end-to-end argument in favor of pushing the decision about how often to do things like GC down to the code which understands the GC.
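
To make the second option concrete, here is a Python sketch of a listener that backs off on its own. The class, its method name, and the injectable clock are all hypothetical; real listeners are C++ observers and would look different.

```python
import time


class ThrottledListener:
    """Runs an expensive action at most once per min_interval_s."""

    def __init__(self, action, min_interval_s, clock=time.monotonic):
        self._action = action
        self._min_interval_s = min_interval_s
        self._clock = clock      # injectable so the behaviour can be tested
        self._last_run = None

    def on_memory_pressure(self):
        """Handle one notification; return True if the action actually ran."""
        now = self._clock()
        if (self._last_run is not None
                and now - self._last_run < self._min_interval_s):
            return False  # called too recently: ignore this notification
        self._last_run = now
        self._action()
        return True
```

With this shape, the detector can keep firing once a second while each listener decides for itself how often an expensive response such as a GC is worthwhile.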

I'm going to do this audit next, and I'll report back.
Comment 45 Justin Lebar (not reading bugmail) 2011-06-29 14:26:55 PDT
There are actually more places that listen to memory-pressure than I thought!

The one I worry about most is gc/cc, especially since if we're swapping, gc/cc might be more expensive than usual.

It seems like the right thing to do is trigger a gc/cc immediately when we see the first notification in some period of time, but from then on only gc/cc a bit more aggressively, until memory pressure subsides.

njn, do you have any thoughts on how we should do this?
Comment 46 Justin Lebar (not reading bugmail) 2011-06-29 21:03:25 PDT
Testing done:

* On Windows, I tested that vsize notifications are fired when virtual memory usage gets too high.  I just realized I haven't tested the low physical memory notifications in this latest iteration of the patch, and I'll do that in the morning.

* On Linux, I tested that memory notifications are fired when RAM is constrained, using the process at [1].

My mac has 8G of RAM, and the process at [1] doesn't work on Mac, so I haven't tested there.  But the code is exactly the same as on Linux.

I haven't done any testing to ensure that the memory notifications actually do something useful.  I think we might as well do that in separate bugs.  But of course we don't want to land this until we know that it's at least not deleterious to fire so many low-memory events.

[1] http://jlebar.com/2011/6/15/Limiting_the_amount_of_RAM_a_program_can_use.html
Comment 47 Nicholas Nethercote [:njn] 2011-06-29 21:29:38 PDT
Comment on attachment 542934
Patch v1

Review of attachment 542934:
-----------------------------------------------------------------

Let me double check how this works (forgive me if I'm duplicating anything you've written).  The default behaviour:

- On Linux/Mac
  - If the number of hard page faults since the last check (1 second ago) exceeds 100, fire a PHYSICAL notification.

- On Windows:
  - If the available virtual memory is less than 256MB, fire a VIRTUAL notification.
  - If the available physical memory is less than 32MB, fire a PHYSICAL notification.

It'd be good to have comments like that in HasMemoryPressure(), even though the code isn't complicated, to make things super clear.
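
Restating that summary as a Python sketch, using the thresholds listed above. The function names are invented, and the choice to check virtual memory before physical memory and return a single value is an assumption; the patch may report the two independently.

```python
from enum import Enum


class MemoryPressure(Enum):
    NONE = 0
    PHYSICAL = 1
    VIRTUAL = 2


def unix_pressure(hard_faults_since_last_check, fault_threshold=100):
    """Linux/Mac: PHYSICAL if hard page faults since the last check exceed the threshold."""
    if hard_faults_since_last_check > fault_threshold:
        return MemoryPressure.PHYSICAL
    return MemoryPressure.NONE


def windows_pressure(avail_virtual_mb, avail_physical_mb,
                     virtual_threshold_mb=256, physical_threshold_mb=32):
    """Windows: compare available virtual and physical memory to the thresholds."""
    if avail_virtual_mb < virtual_threshold_mb:   # assumed to be checked first
        return MemoryPressure.VIRTUAL
    if avail_physical_mb < physical_threshold_mb:
        return MemoryPressure.PHYSICAL
    return MemoryPressure.NONE
```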

So the code looks pretty good, ie. it does what you intend it to.  However, I'm a Gecko newbie, so I strongly recommend you get someone else to review this (and then someone else to super-review).  I looked closely at the memory pressure stuff, but I don't know that much about observers and mutexes and initializing children and all that stuff.  So I've given feedback+.

The big questions are all on the heuristic side:  are these the right measurements to be taking, are they taken often enough... i.e. does it actually work?  As you say, that'll require checking individual listeners, and it shouldn't land until that has happened.  And then, this will need some time to bake; I suggest landing it early in the release cycle if possible.

Another question: does checking every 1 second cause problems with mobile devices?

Also, it's a concern that there are no tests, but it's hard to know what tests for this code would look like.  Maybe the prefs could be changed so that the notifications are fired very frequently?  Not sure.

::: toolkit/components/telemetry/TelemetryHistograms.h
@@ +50,5 @@
>  HISTOGRAM(MEMORY_JS_GC_HEAP, 1024, 512 * 1024, 10, EXPONENTIAL, "Memory used by the garbage-collected JavaScript heap (KB)")
>  HISTOGRAM(MEMORY_RESIDENT, 32 * 1024, 1024 * 1024, 10, EXPONENTIAL, "Resident memory size (KB)")
>  HISTOGRAM(MEMORY_LAYOUT_ALL, 1024, 64 * 1024, 10, EXPONENTIAL, "Memory used by layout (KB)")
> +HISTOGRAM(LOW_MEMORY_EVENTS, 1, 256, 4, EXPONENTIAL, "Number of low-memory events fired by LowMemoryDetector (since last telemetry ping)")
> +HISTOGRAM(LOW_VMEMORY_EVENTS, 1, 256, 4, EXPONENTIAL, "Number of low virtual memory events fired by LowMemoryDetector (since last telemetry ping)")

Nit: Inconsistent hyphenation, "low-memory" vs. "low virtual memory".

More substantial point:  you have an enum with
MEMORY_PRESSURE_PHYSICAL, MEMORY_PRESSURE_VIRTUAL.  So you should talk about "low physical memory" and "low virtual memory" and avoid phrases like "low memory"... I'd like you to scrupulously distinguish physical and virtual memory throughout the whole patch, it makes things much clearer.

::: xpcom/base/LowMemoryDetector.cpp
@@ +81,5 @@
> + *    check_interval_ms do we need to observe to declare that memory is low?
> + *
> + * - low_memory_threshold_mb (megabytes, Windows only)
> + *    If we're detecting memory pressure by monitoring the amount of available
> + *    memory, what amount of available memory do we consider to be "low"?

Similar: virtual or physical memory?  Presumably the former.

@@ +111,5 @@
> +  nsnull
> +};
> +
> +// Mark these as volatile since they may be read and written on different
> +// threads.  Volatile keeps the compiler from transforming this

This scares me.  Is it a standard Mozilla idiom?

@@ +132,5 @@
> +
> +void
> +ReloadPrefs()
> +{
> +  // You should probably keep these defaults sync'ed with all.js.

It kinda sucks that the defaults are specified twice, is that unavoidable?

@@ +169,5 @@
> +  UNITS_COUNT,
> +  GetNumLowMemoryEvents,
> +  "Number of times the process detected that the system was low on available "
> +  "physical memory and tried to reduce its footprint.  On Linux and Mac, we "
> +  "detect low memory by observing page hard page faults in the Firefox "

"page hard page"

Also, other reporters have avoided mentioning "Firefox", using "the application" or similar.  Seems a good idea in case this code ends up in some other product.  "The application" could replace "we", too.  (Oh, I see you used "the process" for low-vmemory-events, that's fine too.)

@@ +402,5 @@
> +#include <inttypes.h>
> +
> +// Initialize mLastNumPageFaults to -1 so we don't fire a low-memory
> +// notification the first time we read this value -- cold startup produces many
> +// hard page faults.

Won't the "don't send any notifications for the first N seconds" feature protect against that?  Then we could initialize it to 0, and then HasMemoryPressure() wouldn't need to check for -1.

@@ +506,5 @@
> +    NS_WARNING("GlobalMemoryStatusEx call failed.");
> +    return MEMORY_PRESSURE_NONE;
> +  }
> +
> +  // It's no extra work to check for vmemory presssure on 64-bit processes, so

presssure!

::: xpcom/base/LowMemoryDetector.h
@@ +116,5 @@
> +  /**
> +   * Increment our count of the number of memory pressure events we've fired
> +   * because we're running out of available physical memory.
> +   */
> +  void IncrementNumLowMemoryEvents();

As above:  it would be clearer if |Memory| was changed to |PMemory| or |PhysicalMemory|.
Comment 48 Justin Lebar (not reading bugmail) 2011-06-30 09:03:51 PDT
Created attachment 543150 [details] [diff] [review]
Patch v2

bsmedberg, are you interested in taking a look at this?
Comment 49 Justin Lebar (not reading bugmail) 2011-06-30 09:06:23 PDT
(In reply to comment 47)
> Let me double check how this works (forgive me if I'm duplicating anything you've written).  The default behaviour:
> 
> - On Linux/Mac
>   - If the number of hard page faults since the last check (1 second ago) exceeds 100, fire a PHYSICAL notification.
> 
> - On Windows:
>   - If the available virtual memory is less than 256MB, fire a VIRTUAL notification.
>   - If the available physical memory is less than 32MB, fire a PHYSICAL notification.

Yes, that's right.  I've added comments to this effect.
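
To make the defaults above concrete, here is a minimal sketch of the two checks as pure functions. The constants are the numbers quoted above; the function and struct names are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Assumed defaults, copied from the numbers quoted above.
const int64_t kHardFaultThreshold = 100;   // hard faults per check interval
const uint64_t kLowVirtualMemoryMB = 256;  // Windows only
const uint64_t kLowPhysicalMemoryMB = 32;  // Windows only

// Linux/Mac: report physical memory pressure if the number of hard page
// faults since the last check exceeds the threshold.
bool
UnixHasPhysicalMemoryPressure(int64_t faultsNow, int64_t faultsAtLastCheck)
{
  return faultsNow - faultsAtLastCheck > kHardFaultThreshold;
}

// Windows: virtual and physical pressure are checked independently, so
// both flags can be set by the same check.
struct WindowsMemoryPressure {
  bool lowVirtual;
  bool lowPhysical;
};

WindowsMemoryPressure
WindowsCheckMemoryPressure(uint64_t availVirtualMB, uint64_t availPhysicalMB)
{
  WindowsMemoryPressure result;
  result.lowVirtual = availVirtualMB < kLowVirtualMemoryMB;
  result.lowPhysical = availPhysicalMB < kLowPhysicalMemoryMB;
  return result;
}
```

Keeping the comparisons free of any system calls would also make the thresholds themselves testable, even if the system events that feed them are not.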
> 
> The big questions are all on the heuristic side:  are these are the right
> measurements to be taking, are they taken often enough... ie. does it
> actually work?  As you say, that'll require checking individual listeners,
> and it shouldn't land until that has happened.  And then, this will need some
> time to bake, I suggest landing it early in the release cycle if possible.

I agree.  To be clear, there are two (mostly) separate "does it work?" questions:

 * Do the memory pressure notifications get fired at the right time?  It has
   to be before we're totally out of memory, since things like the CC allocate
   memory, and since walking memory in a GC is going to be bad if we're paged
   out.  But it shouldn't be too much before we're out of memory, because we
   don't want to drop caches and whatnot unnecessarily.

 * Do the memory-pressure observers react well to the memory pressure
   notifications?  If it's expensive to run the observer (e.g. the CC), does it
   back off appropriately upon getting one memory-pressure report a second?

> Another question: does checking every 1 second cause problems with mobile devices?

It's currently disabled on mobile:

  // The UnixLowMemoryDetector shouldn't break on Android/Maemo, but Android
  // has no swap and swap is optional on Maemo.  The Unix detector works by
  // noticing when we swap, so is unlikely to be useful on these platforms.

> Also, it's a concern that there are no tests, but it's hard to know what tests for this code would look like.  Maybe the prefs could be changed so that the notifications are fired very frequently?  Not sure.

I have no idea how to test this.  The problem isn't the frequency of the notifications so much as the fact that we're observing system events.

I guess I could create some JS objects which use a lot of memory and then check that the notifications are fired.  But I'm not sure I could do this in a way which consistently doesn't cause us to run out of virtual address space...

> @@ +111,5 @@
> > +  nsnull
> > +};
> > +
> > +// Mark these as volatile since they may be read and written on different
> > +// threads.  Volatile keeps the compiler from transforming this
> 
> This scares me.  Is it a standard Mozilla idiom?

I could guard with a lock -- is that what you were asking? -- but it felt
unnecessary.  I wrote this before there were any locking operations in the
loop; since then I've had to add a mutex for the sleep call.

I think it's safe as it is, because aiui, if atomic operation A happens-before
atomic operation B, all writes which occur before A are visible after B.  So
the changes to the prefs are visible as soon as we, for instance, take a lock
on each thread, which surely will happen within one iteration of the detector
loop.

Whether we should be relying on this behavior is another question entirely.
Let's see what the next reviewer thinks.
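
For comparison, here is a sketch of what the shared prefs could look like with explicit atomics instead of volatile. This is not what the patch does; it assumes C++11 std::atomic is available. The variable and function names are hypothetical, and the initial values are just the defaults quoted earlier in this comment.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared pref values: written by the thread that reloads prefs,
// read by the detector thread.
static std::atomic<uint32_t> sCheckIntervalMs(1000);
static std::atomic<uint32_t> sLowMemoryThresholdMb(32);

// Called on the thread that observes pref changes.
void
StorePrefs(uint32_t checkIntervalMs, uint32_t lowMemoryThresholdMb)
{
  sCheckIntervalMs.store(checkIntervalMs, std::memory_order_release);
  sLowMemoryThresholdMb.store(lowMemoryThresholdMb, std::memory_order_release);
}

// Called on the detector thread, once per loop iteration.
uint32_t
GetCheckIntervalMs()
{
  return sCheckIntervalMs.load(std::memory_order_acquire);
}

uint32_t
GetLowMemoryThresholdMb()
{
  return sLowMemoryThresholdMb.load(std::memory_order_acquire);
}
```

With this shape, cross-thread visibility no longer depends on some unrelated lock being taken within one iteration of the detector loop.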

> > +void
> > +ReloadPrefs()
> > +{
> > +  // You should probably keep these defaults sync'ed with all.js.
> 
> It kinda sucks that the defaults are specified twice, is that unavoidable?

It is as far as I know.

> @@ +402,5 @@
> > +#include <inttypes.h>
> > +
> > +// Initialize mLastNumPageFaults to -1 so we don't fire a low-memory
> > +// notification the first time we read this value -- cold startup produces many
> > +// hard page faults.
> 
> Won't the "don't send any notifications for the first N seconds" feature protect against that?  Then we could initialize it to 0, and then HasMemoryPressure() wouldn't need to check for -1.

Suppose cold startup causes 100 page faults.  We initialize mLastNumPageFaults
to 0 and wait N seconds before checking.  When we check, we discover that there
are 100 more page faults than mLastNumPageFaults, so we fire a memory-pressure
notification.
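
A minimal sketch of the -1 sentinel, assuming the detector is handed the current cumulative fault count on each check. The class name and the threshold parameter are hypothetical; mLastNumPageFaults and HasMemoryPressure are the names used in the patch.

```cpp
#include <cstdint>

class PageFaultTracker {
public:
  PageFaultTracker() : mLastNumPageFaults(-1) {}

  // Returns true if more than |threshold| hard page faults occurred since
  // the previous call.  The first call only records a baseline, so faults
  // accumulated during cold startup never count as memory pressure.
  bool HasMemoryPressure(int64_t currentFaults, int64_t threshold)
  {
    if (mLastNumPageFaults == -1) {
      mLastNumPageFaults = currentFaults;
      return false;
    }
    int64_t delta = currentFaults - mLastNumPageFaults;
    mLastNumPageFaults = currentFaults;
    return delta > threshold;
  }

private:
  int64_t mLastNumPageFaults;
};
```

If the counter started at 0 instead, the first check would compare the startup faults against the threshold and could fire spuriously, which is the case described above.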
Comment 50 Justin Lebar (not reading bugmail) 2011-06-30 09:06:54 PDT
Comment on attachment 543150 [details] [diff] [review]
Patch v2

Taras, can you take a look at the Telemetry changes here?
Comment 51 (dormant account) 2011-06-30 15:22:36 PDT
Comment on attachment 543150 [details] [diff] [review]
Patch v2

It doesn't make sense to me to poll for OOM on a timer. I think a better way would be to devise an active timer: ie the opposite of nsIIdleService.

Stick some time tracking into the event loop, i.e. every time an event passes and the elapsed time exceeds some interval, fire some callback.

Telemetry would benefit from this too.
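
A rough sketch of such an "active timer", assuming the event loop can call a hook after each event it processes. The class and method names are hypothetical, and the time is passed in explicitly so the behaviour does not depend on a real clock.

```cpp
#include <chrono>
#include <functional>
#include <utility>

class ActiveTimer {
public:
  typedef std::chrono::steady_clock Clock;

  ActiveTimer(Clock::duration interval, std::function<void()> callback,
              Clock::time_point start)
    : mInterval(interval), mCallback(std::move(callback)), mLastFired(start) {}

  // Called by the event loop after each event it processes.  Fires the
  // callback only if more than |mInterval| has elapsed since the last
  // firing, so an idle loop never wakes up just to run it.
  void OnEventProcessed(Clock::time_point now)
  {
    if (now - mLastFired > mInterval) {
      mLastFired = now;
      mCallback();
    }
  }

private:
  Clock::duration mInterval;
  std::function<void()> mCallback;
  Clock::time_point mLastFired;
};
```

The callback runs at most once per interval while events are flowing and not at all while the loop is idle, which is the opposite of an idle-service timer.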
Comment 52 (dormant account) 2011-06-30 15:24:43 PDT
In this specific case piggyback onto the cycle collector would work too.
Comment 53 Justin Lebar (not reading bugmail) 2011-06-30 16:25:25 PDT
(In reply to comment #51)
> It doesn't make sense to me to poll for OOM on a timer. I think a better way
> would be to devise an active timer: ie the opposite of nsIIdleService.

We can become low on physical memory even if we're idle.  Suppose the user minimizes Firefox and loads up Photoshop.  The hope is that we can catch this early and free up a whole bunch of memory, potentially keeping the system from swapping FF out and making us load back up more quickly.

The *nix code won't catch this case, since it's looking for page faults in the FF process, but the Windows code will, since it looks at overall available memory on the system.  I'm looking into doing something similar on *nix, if only since mobile can't swap.
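
For reference, a per-process hard page fault count can be read on Linux and Mac roughly as follows. This sketch assumes getrusage(); the patch may obtain the counter some other way, and the function name is hypothetical.

```cpp
#include <sys/resource.h>
#include <cstdint>

// Returns the cumulative number of hard (major) page faults taken by this
// process, or -1 if the count can't be read.
int64_t
GetHardPageFaultCount()
{
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) {
    return -1;
  }
  return static_cast<int64_t>(usage.ru_majflt);
}
```

Because the counter only covers faults in the calling process, it says nothing about pressure caused by other applications until this process is itself faulted back in.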

That said, it would probably make sense that if we're idle and have fired one low memory notification, we stop checking until we're no longer idle.  Do you think this is reasonable, Taras?

Of course, this whole bug doesn't even make a lot of sense unless we can drop some serious RAM when we notify.  Dropping all of bfcache might help (it currently does this on memory-pressure), but maybe there's more we can do.  (Now that images are discarded on a 10s timer, they're kind of out of the picture.)
Comment 54 (dormant account) 2011-06-30 16:31:34 PDT
(In reply to comment #53)
> (In reply to comment #51)
> > It doesn't make sense to me to poll for OOM on a timer. I think a better way
> > would be to devise an active timer: ie the opposite of nsIIdleService.
> 
> We can become low on physical memory even if we're idle.  Suppose the user
> minimizes Firefox and loads up Photoshop.  The hope is that we can catch
> this early and free up a whole bunch of memory, potentially keeping the
> system from swapping FF out and making us load back up more quickly.
> 

Aside: personally I would drop memory when the user minimizes the browser anyway. Android needs to do it (not sure if it does yet), dunno about desktop.

I seriously doubt that Firefox would sit without spinning the event loop for too long. I object to adding yet another reason to spin it.
Comment 55 (dormant account) 2011-06-30 16:32:36 PDT
Note Linux will fault on Android, since memory-mapped files (aka libraries) act as swap.
Comment 56 Justin Lebar (not reading bugmail) 2011-06-30 16:35:26 PDT
(In reply to comment #54)
> I seriously doubt that Firefox would sit without spinning the event loop for
> too long. I object to adding yet another reason to spin it.

Could you rephrase this?
Comment 57 Justin Lebar (not reading bugmail) 2011-06-30 16:42:43 PDT
(In reply to comment #56)
> Could you rephrase this?

<jlebar> taras, You're saying that the "idle" state is rare in practice.
<taras> yes
<jlebar> taras, But that it's important that when we're in this state, we don't wake up and check to see whether we're out of memory.
<taras> yes
<jlebar> taras, Well, why is it so important if that state doesn't happen so often?
<taras> it is important to not wake up too much to save power, unless you are asking something else
<taras> so while we wake up too often as is, we should move towards waking up less often, not more
<jlebar> taras, Okay, I buy that.  :)

(In reply to comment #55)
> Note Linux will fault on Android, since memory mapped files(aka libraries)
> act as swap.

This contradicts dougt in comment 14:

> justin - right, no swapping on most devices.  the n900/maemo had an option
> to enable swap to sdcard, but that is generally the exception.

Can you guys duke it out?
Comment 58 (dormant account) 2011-06-30 16:51:09 PDT
(In reply to comment #57)

> > justin - right, no swapping on most devices.  the n900/maemo had an option
> > to enable swap to sdcard, but that is generally the exception.
> 
> Can you guys duke it out?

Doug said no swapping, not no paging :)
Comment 59 Dave Garrett 2011-06-30 17:16:03 PDT
Minor suggestion: The threshold prefs are OS-specific so it might be a good idea to #ifdef them out of all.js when not applicable so people don't see non-functioning prefs in about:config if they try to tweak these settings. (the Preferences::GetInt() calls have defaults so the rest of the usages don't really need cluttering up with #ifdefs, though)
Comment 60 Justin Lebar (not reading bugmail) 2011-07-01 11:19:43 PDT
Comment on attachment 543150 [details] [diff] [review]
Patch v2

I'll see if I can make something which wakes us up less often.
Comment 61 Justin Lebar (not reading bugmail) 2011-07-02 08:54:47 PDT
Jesse pointed out:

> The Mac "Activity Monitor" shows a system-wide value called "page
> outs". That might be useful to record along with the number of hard
> page faults Firefox encounters. For example, if "page outs" is 0 (like
> it is for me right now), then we know that none of the hard page
> faults encountered by Firefox are the result of swapping.

It's something to look into, although at least on my computer, we don't page much after startup except when there's memory pressure.

I'm starting to think that the current Windows approach (look at available bytes) makes more sense than the current *nix approach (look at page faults), since on *nix if the user

 1. minimizes FF,
 2. loads Photoshop, which eats all RAM and causes FF to be paged out,
 3. closes Photoshop, freeing a bunch of RAM,
 4. reopens FF

we'll only notice memory pressure during the last step, when we get faulted back in.  But that's not the time to free up memory!
Comment 62 Justin Lebar (not reading bugmail) 2011-07-03 22:18:56 PDT
See bug 669120 for a simpler approach which doesn't involve timer threads or heuristics.  I'm beginning to think that bug may be a better place to start, at least in terms of FF being a good citizen and not obstructing other applications' use of memory.
Comment 63 Justin Lebar (not reading bugmail) 2011-07-05 08:51:50 PDT
I think the way forward here is:

 * Bug 669120 - Fire a memory-pressure event when we lose focus for X seconds.  This way, we don't have to check for low memory when we're idle.

 * Check for low memory using the same or a similar heuristic to the new periodic GC code.

The assumption here is that it's likely that if some other program starts using lots of memory, it'll be while we're not focused.  And this strategy doesn't involve any new timer threads.
Comment 64 Justin Lebar (not reading bugmail) 2011-07-06 06:45:50 PDT
The GC timer is bug 656120.
Comment 65 Mike Hommey [:glandium] 2011-07-10 00:10:21 PDT
Note that, now that I think of it, there are ways, at least on Linux, to know if the system is swapping or not, e.g. the taskstats API, which we're probably going to use to get a more accurate process startup time (and is quite painful to set up, so that part could be shared).
Comment 66 Justin Lebar (not reading bugmail) 2011-07-12 10:08:12 PDT
I filed a new bug just for tracking Windows vmem: bug 670967.  It seems to me that running out of address space on Windows is the worst manifestation of this issue, and I think I may be able to address that without adding a timer thread.
Comment 67 Justin Lebar (not reading bugmail) 2011-07-18 14:22:44 PDT
For those following along at home, I've morphed bug 670967 into tracking both virtual and physical memory on Windows by wrapping VirtualAlloc and the other virtual allocation syscalls.
Comment 68 Ziga Seilnacht 2011-08-17 12:44:06 PDT
For Windows, did you try using the CreateMemoryResourceNotification function for the LowMemoryDetector? From the MSDN description it looks to be exactly what you are looking for here. Its advantage over currently used solution is that it avoids time based polling, you can just include the returned handle in the main event loop and wait for it to become signaled.
Comment 69 Justin Lebar (not reading bugmail) 2011-08-22 11:02:14 PDT
Ah, the hidden gems of the WinAPI.  CreateMemoryResourceNotification looks like it does exactly what I'd want; thanks for pointing it out!

My current approach in bug 670967 is to watch only for low virtual memory, for reasons explained in the bug.  It doesn't look like CreateMemoryResourceNotification helps here.  I'm currently watching the amount of available virtual memory by wrapping calls to VirtualAlloc and a few other functions.
Comment 70 Justin Lebar (not reading bugmail) 2011-12-12 22:01:59 PST
Created attachment 581179 [details] [diff] [review]
Rev 2, part 3 v1: Pass env var when running PGO-instrumented build
Comment 71 Justin Lebar (not reading bugmail) 2011-12-12 22:07:58 PST
Comment on attachment 581179 [details] [diff] [review]
Rev 2, part 3 v1: Pass env var when running PGO-instrumented build

Oops; wrong bug.
Comment 72 Nicholas Nethercote [:njn] 2011-12-16 12:03:26 PST
jlebar, can you remind me how this bug relates to bug 670967?  Is this bug now only for Mac and Linux?
Comment 73 Justin Lebar (not reading bugmail) 2011-12-16 12:08:53 PST
> Is this bug now only for Mac and Linux?

Right.
Comment 74 Nicholas Nethercote [:njn] 2012-01-29 12:07:08 PST
http://stackoverflow.com/questions/3019748/how-to-reliably-measure-available-memory-in-linux/ has some interesting info about detecting low memory on Linux.
