Listen for memory pressure notifications from lowmemkiller driver

RESOLVED FIXED in Firefox 18

Status

()

defect
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: cjones, Assigned: justin.lebar+bug)

Tracking

Trunk
mozilla19
ARM
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking-basecamp:+, firefox18 fixed)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(2 attachments, 6 obsolete attachments)

Alhad patches the linux driver to fire off a notification when the number of free pages falls below a certain threshold, which we'll set up to be just before processes start being gunned down.  We can use this notification in gecko to drop inessential caches / do GCs / etc. in the hope of keeping more processes alive.

If we don't free enough memory, that's OK, lowmemkiller will clean up for us, more invasively.

Any cleanup we do off this notification should *not* affect user experience.  For example, we should *not* drop WebGL contexts.
Posted patch MemPressurePoller (obsolete) — Splinter Review
MemPressurePoller implementation.  Surely doesn't fully confirm to Mozilla coding standards, please be brutal with review comments so we learn :)

This patch is incomplete due to licensing issues.  Imagine nsAppShell calling 
mozilla::hal_impl::EnableMemoryPressureNotifications() in its ctor and mozilla::hal_impl::DisableMemoryPressureNotifications() in its dtor, and a change to hal/Hal.h declaring those two symbols. 

Configuring the LMK, and the LMK kernel mod itself is also not a part of this patch.
Assignee: nobody → mvines
Attachment #640020 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Comment on attachment 644722 [details] [diff] [review]
Gonk low memory polling

Review of attachment 644722 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/gonk/GonkMemPressure.cpp
@@ +51,5 @@
> +namespace {
> +
> +class MemPressureRunnable : public nsRunnable
> +{
> +	public:

DO you mind indenting with 2 spaces only? Otherwise files get really right-side heavy. Don't indent public: at all. 2 spaces for the members. Look at other files nearby for examples.

@@ +79,5 @@
> +			if (fds[0].fd < 0) {
> +				LOG("Error, could not open sysfs node\n");
> +				return -1;
> +			}
> +			(void)read(fds[0].fd, buf, 2);

is it wise to ignore the return value here? Can't this be EINTR?

@@ +83,5 @@
> +			(void)read(fds[0].fd, buf, 2);
> +			fds[0].events = POLLERR | POLLPRI;
> +			err = poll(fds, 1, -1);
> +			if (err > 0) {
> +				(void)read(fds[0].fd, buf, 2);

dito

@@ +99,5 @@
> +		}
> +
> +		SysFsPoller(const char *node) :
> +			sysfs_node(node) {
> +	   }

Watch out for tabs. They are evil.

::: widget/gonk/GonkMemPressure.h
@@ +29,5 @@
> +
> +#ifndef GONKMEMPRESSURE_H
> +#define GONKMEMPRESSURE_H
> +
> +void EnableMemoryPressureNotifications();

This should be namespaced.
In testing this patch a design flaw has been uncovered.  The /sys/kernel/mm/lowmemkiller/notify_trigger_active node that the patch poll()s to notified of memory pressure is level-triggered and not edge-triggered.  Worse, it only triggers when there is memory pressure.  The attached patch as written will cause Gecko to GC itself into the ground during memory pressure.  

Modifying the kernel to additionally notify when memory pressure is no longer present appears non-trivial and harmful to general system performance, so it seems like we're stuck with the current kernel interface and need to adapt in user space.  

The current plan to resolve this, which BTW was derived in absence of a solid understanding of what Gecko does when it receives a memory pressure event so comments more than welcome, is to run an N second timer when the kernel notifies of memory pressure before re-issuing a Gecko memory pressure event (provided we're still under memory pressure).  Free memory is wasted memory, so I want to be careful about using a value for N that's too small.  I'm thinking of roughly 15s as an opening bid and maybe we tune that later during user trials.  

Is there a better way?
That's a very reasonable approach.  I don't have a good intuitive guess at N.  I sort of suspect it doesn't much matter.
Attachment #644722 - Attachment is obsolete: true
Posted patch v3 (obsolete) — Splinter Review
Patch intended to addresses all gal's comments, and adds the cool down timer.  

This code is dormant until we add something like this to /init.b2g.rc:

  on boot
    # 4096 = 16MB
    write /sys/module/lowmemorykiller/parameters/notify_trigger 4096
blocking-basecamp: --- → ?
This sounds like a really-nice-to-have but not a blocker.
blocking-basecamp: ? → -
How does this work cross process? We probably mostly care about delivering these to content processes which use a puppet widget. cjones?
Looks like "memory-pressure" notifications are propagated into all content processes already: http://mxr.mozilla.org/mozilla-central/search?string=FlushMemory
FWIW, bug 749053 is another instance of 3-clause BSD source potentially sneaking into the tree.
Comment on attachment 654339 [details] [diff] [review]
Addresses v3 patch rot, no functional changes

Review of attachment 654339 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/gonk/GonkMemPressure.cpp
@@ +69,5 @@
> +};
> +
> +
> +static nsCOMPtr<nsIThread> sMemPressureThread;
> +static nsCOMPtr<nsIRunnable> sRunnable;

We should be able to do a nsRefPtr to the concrete class for this case.

@@ +70,5 @@
> +
> +
> +static nsCOMPtr<nsIThread> sMemPressureThread;
> +static nsCOMPtr<nsIRunnable> sRunnable;
> +static int sFd;

We have ScopedClose to automatically close FDs.

@@ +72,5 @@
> +static nsCOMPtr<nsIThread> sMemPressureThread;
> +static nsCOMPtr<nsIRunnable> sRunnable;
> +static int sFd;
> +
> +class MemPressurePoller {

I think it would work a bit better to roll the static global vars and the enable/disable functions into here and make them all non-static. Then nsAppShell would make a MemPressurePoller, hold on to it with a nsRefPtr, and call its enable/disable functions.

@@ +106,5 @@
> +
> +void
> +MemPressurePoller::Poll()
> +{
> +  if (sFd <= 0) {

0 is a valid FD. (though unlikely unless we have buggy code that closes FD 0)

@@ +126,5 @@
> +
> +  if (2 == numbytes) {
> +    // Emit a "memory-pressure" event if beneath the threshold
> +    if ('1' == buf[0]) {
> +      NS_DispatchToMainThread(new MemPressureRunnable());

Hm, maybe preallocate a MemPressureRunnable so we don't have to allocate an object when we're low on memory?

@@ +152,5 @@
> +
> +namespace mozilla {
> +
> +void
> +GonkMemPresureEnable()

Pressure is spelled wrong.

@@ +157,5 @@
> +{
> +  sFd = open("/sys/kernel/mm/lowmemkiller/notify_trigger_active",
> +    O_RDONLY | O_CLOEXEC);
> +  if (sFd > 0) {
> +    if (sMemPressureThread == nullptr) {

I think !sMemPressureThread is the usual way to check for null. I also would like to have these two conditions combined.
Comment on attachment 654339 [details] [diff] [review]
Addresses v3 patch rot, no functional changes

Clearing review while waiting for a new patch.
Attachment #654339 - Flags: review?(mwu)
Duplicate of this bug: 802665
It appears we now need this notification to help with blocker-level mem usage issues.
Blocks: 798002
blocking-basecamp: - → +
Driving by...

>+// Minimum time between emitting successive memory-pressure events
>+#define COOLDOWN_TIME_MS (15*1000) // FIXME: 15s was mostly chosen at random...

You can make this a pref with mozilla::Preferences::AddIntVarCache.

>+class MemPressureRunnable : public nsRunnable
>+{
>+public:
>+  NS_IMETHOD Run()
>+  {
>+    LOG("Dispatching memory-pressure event");
>+    nsCOMPtr<nsIObserverService> obsServ =
>+      mozilla::services::GetObserverService();

Nit: don't need mozilla::

>+    NS_NAMED_LITERAL_STRING(minimize, "heap-minimize");
>+    obsServ->NotifyObservers(nullptr, "memory-pressure", minimize.get());

You should check that obsServ is non-null, because it looks like the memory
pressure observer doesn't get killed until after XPCOM shutdown, and services
start returning null at shutdown.

Also, perhaps more idiomatically,

  obsServ->NotifyObserver(nullptr, "memory-pressure",
                          NS_LITERAL_STRING("heap-minimize").get());

>+static nsCOMPtr<nsIThread> sMemPressureThread;
>+static nsCOMPtr<nsIRunnable> sRunnable;
>+static int sFd;

static nsCOMPtr creates a static initializer, which is ungood.  You can do
|static StaticRefPtr|, but only if you have an nsRefPtr.  Otherwise what mwu
suggests about creating an object to hold these would be good.

(You could either hold the ref inside the AppShell or hold the ref to
MemPressurePoller in |static StaticRefPtr sSingleton| here.)

>+};

Nit: We put "// anonymous namespace" at the end of namespaces.

(I've also never seen us put a semicolon there, but w/e.)

>+void
>+GonkMemPresureDisable()
>+{
>+  if (sFd > 0) {
>+    close(sFd);
>+    sFd = 0;
>+  }
>+  sRunnable = nullptr;
>+  sMemPressureThread = nullptr;
>+}

Cancel the timer here, please.  Otherwise we can crash if the timer fires after
this.
mwu, our friends are no longer able to participate in this bug.  Can you update the patch get review from me or jlebar?
Help is on the way!
Ping on this bug?  I'm not sure what's happening in comment 20.
Flags: needinfo?(jones.chris.g)
I'm not sure what help was on the way in comment 20, but I'm just going to fix this up, unless anyone objects.
Assignee: mvines → justin.lebar+bug
Blocks: slim-fast
Flags: needinfo?(jones.chris.g)
It's all yours :).
Posted patch New patch, v1 (obsolete) — Splinter Review
I understood that we didn't want to take BSD code for this if we didn't have to (to avoid license proliferation and to avoid having to add extra advertising), so I re-implemented this code from scratch.  This patch is basically reviewable, except it has a bit more logging than I'd leave in for checkin.

Unfortunately I can't get a low-memory events to fire with the attached testcase, using either m1's patch or this patch.

I'll coordinate with mvines over e-mail to understand what's going on here.  If you'd like to be added to this e-mail chain, just drop me a line.
Attachment #654339 - Attachment is obsolete: true
# cat /sys/module/lowmemorykiller/parameters/notify_trigger     
0

Oh, well that probably explains something.
If I set notify_trigger to 32768, I get low-memory notifications in the testcase.  Hooray!

Now the question is, what should we set it to?  The value is in pages.

Comment 2 suggests that notify_trigger should be larger than the largest value in minfree.  minfree is currently set to 1536,2048,4096,16384.  I'll look into what these numbers mean, but if these numbers are in pages, that largest value is 64MB, which is probably excessive under any definition...
It's in pages.  There's no way we'll be able to derive the "right" value from first principles; will need testing.  Let's start with (/me rubs crystal ball) a value of 5000.

Note, we don't use that 64MB OOM class in gecko.
Getting somewhere in the e-mail chain:

http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/drivers/staging/android/lowmemorykiller.txt

/sys/module/lowmemorykiller/parameters # cat adj
0,1,6,12
/sys/module/lowmemorykiller/parameters # cat minfree
1536,2048,4096,16384

This means we kill oom_adj less than 0 when there's less than 1536
pages, oom_adj less than 1 when there's less than 2048 pages, and so
on.

This is all well and good, except dhylands says [1] we switched to
oom_score_adj, which isn't the same thing, right?

It sounds like we need to switch back to oom_adj and set those values
and these lowmemorykiller parameters properly.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=780437#c8
> Note, we don't use that 64MB OOM class in gecko.

How do I figure out what classes we use?  We're using oom_score_adj, which does not correspond to the 0,1,6,12 above.
Yeah I think we've rather shot ourselves in the foot here, unless those oom_score_adj's get magically translated to oom_adj in the kernel :(.
You can use either oom_score_adj or oom_adj in the current kernel.

oom_adj is deprecated and will disappear sometime soon, but its still there.

If you enter oom_adj or oom_score_adj, they automagically get translated to the other.
And in fact you can do

echo 1 > /proc/<pid>/oom_adj

and see the corresponding

cat /proc/<pid>/oom_score_adj

or vice-versa
Phew!  Thanks, I had quick scare there ;).
Given that we must specify the scores in units of oom_adj in /sys/module/lowmemorykiller/parameters/adj, I don't understand what we're gaining by using oom_score_adj at all; we're just subjecting ourselves to rounding error.
Revert bug 785171 to go back to oom_adj
(In reply to Dave Hylands [:dhylands] from comment #36)
> Revert bug 785171 to go back to oom_adj

I guess another option is to set the values for parameters/{adj,minfree} when b2g starts up.  Then we can specify the values in terms of oom_score_adj in b2g's prefs and convert them appropriately.

As an added bonus, nobody has to mess around with .rc files and keep them in sync with our existing b2g prefs.

I'm happy with that if you are.
Posted patch Patch, v1 (obsolete) — Splinter Review
Comment on attachment 676488 [details] [diff] [review]
Patch, v1

$ git diff --stat=80 HEAD^
 hal/gonk/GonkHal.cpp                         |   90 +++++++++++-

dhylands, you last touched this code; does this look good to you?

 widget/gonk/GonkMemoryPressureMonitoring.cpp |  193 ++++++++++++++++++++++++++
 widget/gonk/GonkMemoryPressureMonitoring.h   |   14 ++
 widget/gonk/Makefile.in                      |    1 +
 widget/gonk/nsAppShell.cpp                   |    3 +

mwu, you looked at m1's patch for the same functionality here; can you look at these files?

 b2g/app/b2g.js                               |    9 ++

You both should probably check the changes here.

I've tested this with the testcase in this patch and we in fact get memory pressure events in all processes, as expected.  (Well, sometimes we don't get the event in the browser process, but that's not unexpected given how quickly it's allocating memory.)
Attachment #676488 - Attachment description: Bug 771195 - Fire memory pressure events on Gonk. → Patch, v1
Attachment #676488 - Flags: review?(mwu)
Attachment #676488 - Flags: review?(dhylands)
Whiteboard: [MemShrink]
Attachment #676454 - Attachment is obsolete: true
We'll want to keep having the lmk parameters set from the gonk level, since they're highly device specific and will depend on whatever QA scenarios are posed for a given product.  I would prefer to keep gecko ignorant of those things as much as possible to reduce potential forkage and the maintenance headaches those will incur.
What I wanted to avoid was the situation where we define the oom_value_adj values for master/fg/bg/etc in B2G and then we define the lmk parameters for these somewhere else.  Then whenever we add a new priority class or tweak an existing one we have to remember to change the lmk parameters, which aren't even the same repository and so can't easily be sync'ed to the b2g parameters' changes.

Can we have device-specific prefs in Gecko?  Or do we have a way to set these prefs as part of the build process for a specific device?  There are other such prefs we might want to change; for example, we have a pref that specifies how much decoded image memory we willingly keep around, so maybe a general solution, or the promise of one in the future, would be helpful here.
Well, that implies doing device-specific gecko builds, which is something I very much want to avoid.

We have device-specific "prefs" already through the propertydb.  But that also lives in gonk so we still have to update multiple places when we tweak high-level params like set of lmk classes.  Note, this is already the case for params like static allocation of pmem, which can't be twiddled by gecko.  For this type of solution, I don't have a strong opinion on whether propdb or init.rc settings are easier to maintain, but we can phone a friend.

Another option is "feature testing" these values by looking at MemTotal and then choosing values based on that.  But this doesn't handle tweaking per product well, since MemTotal doesn't capture product configuration from a QA/testcase standpoint.

Maybe a good compromise is
 - have a set of sensible default values "hard-coded" into gecko, perhaps decided by MemTotal
 - allow overriding those defaults through propdb "prefs"

wdyt?
Note that we might want to have device-specific settings for the oom_score_adj values; if we wanted that, we'd have to communicate those settings to Gecko somehow.  So I think figuring something out so we can modify Gecko prefs on a per-device basis would be good.

> Another option is "feature testing" these values by looking at MemTotal and then choosing 
> values based on that.  But this doesn't handle tweaking per product well, since MemTotal 
> doesn't capture product configuration from a QA/testcase standpoint.

Yeah, I agree.  We want to be able to tweak them arbitrarily.

I don't know what propertydb is, so I guess I don't have an opinion on that (yet :).  If we can read that in similarly to how we read in a Gecko pref, I guess that would be fine.

But would you consider it to be a "device-specific-build" if we had a otoro.js prefs file which overrides b2g.js (just as b2g.js overrides all.js)?  I don't know how exactly otoro.js file would need to be integrated into the build, but perhaps it wouldn't even need to be part of m-c.
Comment on attachment 676488 [details] [diff] [review]
Patch, v1

Review of attachment 676488 [details] [diff] [review]:
-----------------------------------------------------------------

::: hal/gonk/GonkHal.cpp
@@ +59,5 @@
>  #define LOG(args...)  __android_log_print(ANDROID_LOG_INFO, "Gonk", args)
>  #define NsecPerMsec  1000000LL
>  #define NsecPerSec   1000000000
>  
> +const size_t PAGE_SIZE_KB = 4;

This is available as PAGE_SIZE in <asm/page.h> (although its set to 4096 there)

I'd rather see us get this from the system header than use a hard-coded value.

::: widget/gonk/GonkMemoryPressureMonitoring.cpp
@@ +110,5 @@
> +
> +    // sysfs nodes need to be read before they can be successfully polled, so do
> +    // that here.
> +    char buf[2];
> +    read(lowMemFd, buf, 2);

nit: use sizeof(buf)

What if memory pressure is already present (i.e. read fills buf  with "1\n")

@@ +117,5 @@
> +      // Wait for lowMemFd to change or for data to be written to
> +      // mShutdownPipeWrite.
> +      struct pollfd pollfds[2];
> +      pollfds[0].fd = lowMemFd;
> +      pollfds[0].events = POLLERR | POLLPRI;

The documentation says that POLLERR is only valid in revents
http://linux.die.net/man/3/poll

Shouldn't we also have POLLIN?

@@ +120,5 @@
> +      pollfds[0].fd = lowMemFd;
> +      pollfds[0].events = POLLERR | POLLPRI;
> +      pollfds[1].fd = mShutdownPipeRead;
> +      pollfds[1].events = POLLERR | POLLIN;
> +      poll(pollfds, NS_ARRAY_LENGTH(pollfds), /* timeout */ -1);

What if a signal is received? poll will return -1 with errno set to EINTR

@@ +134,5 @@
> +
> +      lseek(lowMemFd, 0, SEEK_SET);
> +      // The notify_trigger_active sysfs node should contain either "0\n" or
> +      // "1\n".  The latter indicates memory pressure.
> +      char buf[2];

This buf shadows another one of the same name on line 113

@@ +135,5 @@
> +      lseek(lowMemFd, 0, SEEK_SET);
> +      // The notify_trigger_active sysfs node should contain either "0\n" or
> +      // "1\n".  The latter indicates memory pressure.
> +      char buf[2];
> +      int nread = read(lowMemFd, buf, 2);

nit: Use sizeof(buf)

@@ +136,5 @@
> +      // The notify_trigger_active sysfs node should contain either "0\n" or
> +      // "1\n".  The latter indicates memory pressure.
> +      char buf[2];
> +      int nread = read(lowMemFd, buf, 2);
> +      NS_ENSURE_STATE(nread == 2);

What if a signal is received?

That would cause nread to be -1, which will cause this runnable to exit prematurely.

@@ +138,5 @@
> +      char buf[2];
> +      int nread = read(lowMemFd, buf, 2);
> +      NS_ENSURE_STATE(nread == 2);
> +
> +      if (buf[0] == '1' && buf[1] == '\n') {

Due to comment 2, my interpretation is that the poll will be satisified while there is memory pressure. This implies that we'll continually send out memory-pressure events every mPollCooldownMs. Is that what's intended? Or should we only send out one event when this changes from 0 to 1?

@@ +145,5 @@
> +        NS_DispatchToMainThread(memoryPressureRunnable);
> +      }
> +
> +      {
> +        MonitorAutoLock lock(mMonitor);

Just reading the code, it isn't obvious why we're using this mMonitor and why the mPollCooldownMS is being used. We should document why we're doing this.

@@ +165,5 @@
> +
> +private:
> +  Monitor mMonitor;
> +  uint32_t mPollCooldownMS;
> +  bool mShuttingDown;

Intuition tells me that this should be volatile, since it's shared between threads

@@ +183,5 @@
> +{
> +  // memoryPressureWatcher is held alive by the observer service.
> +  nsRefPtr<MemoryPressureWatcher> memoryPressureWatcher =
> +    new MemoryPressureWatcher();
> +  NS_ENSURE_SUCCESS(memoryPressureWatcher->Init(),);

This looks odd (no argument after the comma). Wouldn't it be better to use NS_ENSURE_SUCCESS_VOID instead?
Attachment #676488 - Flags: review?(dhylands) → review-
> What if memory pressure is already present (i.e. read fills buf  with "1\n")

We'll see it when we check inside the loop, right?

> Intuition tells me that this should be volatile, since it's shared between threads

You don't need volatile when you have locks, right?  volatile is just to avoid the compiler doing optimizations like hoisting the variable read/write outside the loop, but locks (and syscalls and even non-inline method calls in the absence of LTO) should prevent that.

> This implies that we'll continually send out memory-pressure events every 
> mPollCooldownMs. Is that what's intended? Or should we only send out one event when 
> this changes from 0 to 1?

Hm.  I'm not sure.  It's working as intended, but I think your suggestion of sending out an event only when we change from 0 to 1 would probably be better.

Is there a way to detect this change?
(In reply to Justin Lebar [:jlebar] from comment #45)
> > What if memory pressure is already present (i.e. read fills buf  with "1\n")
> 
> We'll see it when we check inside the loop, right?

With the current behaviour yes. (where it's level sensitive rather than edge sensitive). If that ever changed though, we'd only see the transition from 1 to 0

> > Intuition tells me that this should be volatile, since it's shared between threads
> 
> You don't need volatile when you have locks, right?  volatile is just to
> avoid the compiler doing optimizations like hoisting the variable read/write
> outside the loop, but locks (and syscalls and even non-inline method calls
> in the absence of LTO) should prevent that.

Yeah - that's right.

> > This implies that we'll continually send out memory-pressure events every 
> > mPollCooldownMs. Is that what's intended? Or should we only send out one event when 
> > this changes from 0 to 1?
> 
> Hm.  I'm not sure.  It's working as intended, but I think your suggestion of
> sending out an event only when we change from 0 to 1 would probably be
> better.
> 
> Is there a way to detect this change?

Due to the way it works right now (level versus edge), we'd have to store the previous value ourselves, in order to detect when it changes
> Due to the way it works right now (level versus edge), we'd have to store the previous 
> value ourselves, in order to detect when it changes

I guess I mean, is there a way to poll() for the edge rather than the level here?  Otherwise we have to effectively manually check the notify fd every N seconds to see if it's 0, right?
(In reply to Justin Lebar [:jlebar] from comment #47)
> > Due to the way it works right now (level versus edge), we'd have to store the previous 
> > value ourselves, in order to detect when it changes
> 
> I guess I mean, is there a way to poll() for the edge rather than the level
> here?  Otherwise we have to effectively manually check the notify fd every N
> seconds to see if it's 0, right?

poll normally polls for edge (i.e. change)

My interpretation of things is that with this particular file when it's set to 1 it always returns (hence the cooldown delay).

The way poll works, the semantics of when it returns and when it doesn't is actually determined by the kernel side code.

So normally, when you read a 1, you could assume that the previous time you were called it was non-1. Here, we can't tell.

Actually, I'm not even sure that poll is the right API to be using here. I'll take a look and see if using uevents is possible/more appropriate
I've satisfied myself that poll is the correct API.
(In reply to Dave Hylands [:dhylands] from comment #49)
> I've satisfied myself that poll is the correct API.

How should we tell when the node switches from 1 to 0?
The only way I can see is if we keep the previous state.

Essentially, the poll will return each time that sysfs_notify is called:
https://github.com/mozilla-b2g/codeaurora_kernel_msm/blob/master/drivers/staging/android/lowmemorykiller.c#L255

The value it reads is determined by the lowmem_notify_trigger_active_show function a few lines later.

Looking at when lowmem_notify_killzone_approach is called and since the conditions when lowmem_notify_killzone_approach is called and the conditions for the show function to return 1 are the same, this means that when poll returns you'll essentially always read a 1.

So the only way to read a zero is to read it sometime when memory pressure is not being exerted.

We could do this by having the timeout on the poll be -1 if the previous value read was zero (which would be most of the time). Once we read a 1, we could change the timeout to check once every second or few seconds until we see a zero, at which point we can change the timeout back to -1.



So the only way to detect "changes" is to keep the old value and compare it.
(In reply to Justin Lebar [:jlebar] from comment #43)
> Note that we might want to have device-specific settings for the
> oom_score_adj values; if we wanted that, we'd have to communicate those
> settings to Gecko somehow.  So I think figuring something out so we can
> modify Gecko prefs on a per-device basis would be good.
> 

Why?

> > Another option is "feature testing" these values by looking at MemTotal and then choosing 
> > values based on that.  But this doesn't handle tweaking per product well, since MemTotal 
> > doesn't capture product configuration from a QA/testcase standpoint.
> 
> Yeah, I agree.  We want to be able to tweak them arbitrarily.
> 
> I don't know what propertydb is, so I guess I don't have an opinion on that
> (yet :).  If we can read that in similarly to how we read in a Gecko pref, I
> guess that would be fine.
> 

Yes.  http://mxr.mozilla.org/mozilla-central/source/widget/gonk/nsWindow.cpp#187 for example.

> But would you consider it to be a "device-specific-build" if we had a
> otoro.js prefs file which overrides b2g.js (just as b2g.js overrides
> all.js)?  I don't know how exactly otoro.js file would need to be integrated
> into the build, but perhaps it wouldn't even need to be part of m-c.

I want to minimize the device- and product-specific configuration that goes into gecko.  otoro.js et al is moving in the wrong direction.  I would prefer to expose features that gonk-level configurations can tweak.

What you're describing is basically what I'm proposing with propdb, with $b2g/device/otoro/[config].mk instead of otoro.js in gecko.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #52)
> (In reply to Justin Lebar [:jlebar] from comment #43)
> > Note that we might want to have device-specific settings for the
> > oom_score_adj values; if we wanted that, we'd have to communicate those
> > settings to Gecko somehow.  So I think figuring something out so we can
> > modify Gecko prefs on a per-device basis would be good.
> > 
> 
> Why?

There were two assertions here; I'm not sure which one you question.

The oom_score_adj values have to be in Gecko because Gecko sets them on its processes.

The assertion that we might want to have different oom_score_adj values on different devices is based on the fact that AIUI the kernel kills based on oom_score, not oom_score_adj, and we may need to tweak the oom_score_adjs per device to get the behavior we want, depending on how the oom scores are calculated.

But this isn't an important point.  The more important point is that we have plenty of other Gecko prefs which we might want to tune on a per-device basis.  For example, we have tons of GC heuristics baked into b2g.js.  And we have that "decoded images memory" pref.  Like you said earlier, we want to be able to tweak these arbitrarily for a given device.

> > But would you consider it to be a "device-specific-build" if we had a
> > otoro.js prefs file which overrides b2g.js (just as b2g.js overrides
> > all.js)?  I don't know how exactly otoro.js file would need to be integrated
> > into the build, but perhaps it wouldn't even need to be part of m-c.
> 
> I want to minimize the device- and product-specific configuration that goes
> into gecko.  otoro.js et al is moving in the wrong direction.  I would
> prefer to expose features that gonk-level configurations can tweak.

How does propdb affect the amount of device-specific config that goes into Gecko, as compared to otoro.js?  Is the idea that we'll get less device-specific config using propdb because it's more of a pain to use from Gecko than oroto.js?  That seems like a bad reason...

Or maybe I misunderstand what you mean by "goes into Gecko".

By "goes into gecko", do you mean "goes into m-c"?  Would you be happy with an otoro.js solution where otoro.js did not live in m-c?

> What you're describing is basically what I'm proposing with propdb, with
> $b2g/device/otoro/[config].mk instead of otoro.js in gecko.

The main difference is that you can't override a Gecko pref with propdb.  So now every Gecko pref that we want to override from propdb we have to add code to read twice.

If we modified the prefs code so it checked propdb first for an identically-named pref (maybe prefixed with "gecko" or something), I guess that would work.  I don't see how that would be better than otoro.js, though.
If you're proposing something like a read-only extra-prefs.js that originates at the gonk level (not checked into m-c), then that would be approximately equivalent and satisfy my requirements.  Build support would have to be built for that.
Whiteboard: [MemShrink] → [MemShrink:P2]
> We could do this by having the timeout on the poll be -1 if the previous value read was zero (which 
> would be most of the time). Once we read a 1, we could change the timeout to check once every second 
> or few seconds until we see a zero, at which point we can change the timeout back to -1.

I'm worried about the case when we sit at Xmb of memory free, where X is under the low-mem-notify threshold but not so low that we kill any processes.

In that case, with this approach, we'll check the node every Y seconds, which should have a negative effect on our battery life.  (It's just as bad with the approach mvines and I took in our original patches, where we'd fire a low-memory event every Z seconds.)

If we could block until we switch from low-memory to not-low-memory, that would be an improvement here.  I'd still worry about thrashing, but if upon receiving a low-memory event we /raised/ the low-memory threshold by a bit and then waited for free memory to exceed that higher threshold, then we'd be less likely to thrash.

Anyone have a bright idea here, short of modifying the kernel so it doesn't suck?
Posted patch Patch, v2Splinter Review
I just punted on the problem of looping forever checking the low-memory-notify node.  Maybe it won't be an issue.

Chris, are you OK keeping these prefs in b2g.js for now until we have an otoro.js solution?  (We'd need defaults in b2g.js anyway, so this doesn't seem like a step backwards to me.)
Attachment #676488 - Attachment is obsolete: true
Attachment #676488 - Flags: review?(mwu)
Attachment #676914 - Flags: review?(dhylands)
We're stuck with the kernel interface.

The semantic question is whether we should try to emulate an edge-triggered interface, or treat the low-mem poll as a kind of persistent warning while we're level triggering.  In comment 5, what I meant was I'm guessing three scenarios will be important, in relative of frequency
 1. We dip down below the notification threshold on the way to dipping below the background-threshold.  (Think launching a new app when a bunch are already open.)  We'll either GC our way to victory or murder a background process and pop back up above that threshold.  By far most common case
 2. A fat process keeps us under the notification threshold "forever".
 3. Runaway allocations take us down to the foreground threshold and userspace essentially resets.

There's nothing we can do about (3) so we can toss that out.

There's not a whole lot we can do about (2); the best we can do is keep it above the foreground threshold.  Is it useful to poke fat processes once in a while to ask them to GC harder?  I dunno, but probably not.  What might be interesting though is temporarily making GC heuristics favor memory savings over performance.  To implement that, we'd need edge triggering.

(1) is most important.  The best we can hope for is to save a background process or two from death.  We only get one chance to do that.  After that, all the still-alive background processes have dropped what they could, so emergency belt-tightening is unlikely to be useful.  I don't think we'd ever want to consider affecting the performance of a foreground app when there's a background one to kill.  So the notify-and-sleep-awhile implementation suits that pretty well.

Unless there's a strong argument to be made for changing global GC heuristics based on edge-triggering for case (2), I think we should implement the dirt-simple notify-and-sleep-awhile.
(In reply to Justin Lebar [:jlebar] from comment #56)
> Created attachment 676914 [details] [diff] [review]
> Patch, v2
> 
> I just punted on the problem of looping forever checking the
> low-memory-notify node.  Maybe it won't be an issue.
> 
> Chris, are you OK keeping these prefs in b2g.js for now until we have an
> otoro.js solution?  (We'd need defaults in b2g.js anyway, so this doesn't
> seem like a step backwards to me.)

I don't want an otoro.js solution.  But maybe we're using different terms for the same thing ;).

Yes, this seems fine to me.
> Unless there's a strong argument to be made for changing global GC heuristics based on 
> edge-triggering for case (2), I think we should implement the dirt-simple notify-
> and-sleep-awhile.

I was with you until the conclusion here.

I agree that (1) is the most important situation to optimize for.  So doesn't that suggest we should go with the approach in the patch I just attached, which notifies and then waits until it observes no-memory-pressure before notifying again?

Both approaches are bad for battery life, but the one in this patch doesn't cause all processes to GC like crazy while we're low on memory, which should have a positive impact on performance (and not be anywhere near as hard on the battery).
Scenario (1) is that we dip below notification threshold, tighten our belts and hopefully save a process or two (or not), and then soon go back above the threshold.  Simulating edge triggering vs. a timeout doesn't matter.  We only get one shot.  Trying to simulate edge triggering will cause more wakeups, but I'm not too concerned about that.

I don't understand the "all processes GC'ing like crazy while we're low on memory" concern.
> I don't understand the "all processes GC'ing like crazy while we're low on memory" 
> concern.

Suppose we're low on memory, but nobody is dying, and there's no spare memory left to free by running a low-memory notification.  Suppose, perhaps, that the phone's screen is off.  So we're mostly just sitting there, in whatever state we happen to be in.

In this situation, we have two options.

Option (a) is "notify-and-sleep-awhile": Every X seconds, each process fires a low-memory notification, which causes a global GC/CC.

Option (b) is what's in the attached patch: Every X seconds, we wake up, see that we're still under memory pressure, and go back to sleep.

I understand and agree that this probably won't be a common scenario.  But given that this scenario is the only case when the behavior of option (a) is different from option (b), given that (b) seems obviously safer from a battery-life and perf perspective, and given that we have a patch for (b) in hand, it's not clear to me why we should go with (a).
These notifications won't wake the phone out of suspend.  So once the screen is off (except in unusual conditions when someone holds a wake lock), it doesn't matter what this code does, to a first approximation.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #62)
> These notifications won't wake the phone out of suspend.

(More precisely, the CPU has to be running to generate these notifications.)
> These notifications won't wake the phone out of suspend.  So once the screen is off 
> (except in unusual conditions when someone holds a wake lock), it doesn't matter what 
> this code does, to a first approximation.

I don't feel like the screen's being off is particularly important to this argument.  Like you say, someone might be holding a wake lock.  Or perhaps the user is interacting with an app which doesn't churn much memory.  We were already talking about an uncommon situation.

Are you OK with us going with (b), given that it's basically the same as (a) in the common case but is perhaps marginally safer and is also what I have in hand?
Yes, if X is high that's fine by me.
In both (a) and (b) X affects our ability to respond quickly to memory pressure, so we don't want it to be too large.

We can get away with smaller X in case (b) than in case (a) because the cost of the operation we're doing is smaller (one read() versus a GC).  But the patch sets X = 5s, which I'd guess you wouldn't consider large.

We're going to have to tune all of these numbers and probably the algorithm in the patch anyway, so just name a value that you're happy with and I'll change the patch.
Comment on attachment 676914 [details] [diff] [review]
Patch, v2

Review of attachment 676914 [details] [diff] [review]:
-----------------------------------------------------------------

r=me if you fix the call to poll to check for EINTR

::: widget/gonk/GonkMemoryPressureMonitoring.cpp
@@ +51,5 @@
> + * that we're no longer under memory pressure, and only then start firing events
> + * again.
> + *
> + * (This is a bit problematic because we can't poll() to detect when we're no
> + * longer under memory pressure; instead we have to periodically read tye sysfs

nit: typo: tye s/b the

@@ +145,5 @@
> +      pollfds[0].fd = lowMemFd;
> +      pollfds[0].events = POLLPRI;
> +      pollfds[1].fd = mShutdownPipeRead;
> +      pollfds[1].events = POLLIN;
> +      poll(pollfds, NS_ARRAY_LENGTH(pollfds), /* timeout */ -1);

if poll returns due to a signal, then revents will be uninitialized.

You probably just want to reexecute the poll if it returns -1 with errno == EINTR
Attachment #676914 - Flags: review?(dhylands)
(In reply to Justin Lebar [:jlebar] from comment #66)
> In both (a) and (b) X affects our ability to respond quickly to memory
> pressure, so we don't want it to be too large.
> 

Do you mind elaborating on this a bit more?  What scenarios do you have in mind?

> We can get away with smaller X in case (b) than in case (a) because the cost
> of the operation we're doing is smaller (one read() versus a GC).  But the
> patch sets X = 5s, which I'd guess you wouldn't consider large.

I'm fine with 5s to begin with.  As long as we're not trying to simulate edge triggering, I still stand by comment 5.
> r=me if you fix the call to poll to check for EINTR

Thanks.

Did you mean to set r+, or do you want to see the next version?
Comment on attachment 676914 [details] [diff] [review]
Patch, v2

I thought the r=me was sufficient.
Attachment #676914 - Flags: review+
> In both (a) and (b) X affects our ability to respond quickly to memory
> pressure, so we don't want it to be too large.
> 
> Do you mind elaborating on this a bit more?  What scenarios do you have in mind?

In option (a) we never fire a low-memory event more than once every X seconds.

In option (b), we also never fire a low-memory event more than once every X seconds.  But it's actually worse than this, because if memory transitions from low --> not low --> low while we're sleep(X)'ing, we also will not fire a low-memory event, whereas we would with option (a).

So in both cases, X controls how quickly we can respond to duplicate memory pressure events.  For example, I parse ical, which generates a lot of JS garbage which we don't collect effectively for some reason (bug 805366).  That generates a memory pressure event, causes us to GC and saves some process's life.  Great.  Now I add another calendar and parse that.  If I do so within X seconds since the first memory-pressure event, we won't get a second one this time.
If you want to wait for the memory-pressure off without using a sleep, then one way you could do it would be to install a hook into the event loop.

Once you've detected memory pressure, you install the hook. The hook checks to see if your timeout has expired and if it has then it removes itself and schedules a runnable.

The advantage of this approach is that if the phone suspends (due to no user events) then we don't wake up the phone at all. Since the user isn't interacting with the phone, we probably don't care that the memory pressure has come off. As soon as the user wakes the phone up, we'll detect that our timeout has expired and we'll detect that memory pressure has gone.
I think in that case it's mostly up to gecko's GC heuristics.  The first lowmem notification is going to save all the memory we can in the background processes (to a first approximation), so from then on we're relying on the lowmem notification to bail out bad GC params in the foreground process and keep alive our poor shriveled background processes.

When X is on the order of 5s, anyway.  Significantly less than that and we're talking about edge-triggering simulation.
> If you want to wait for the memory-pressure off without using a sleep, then one way you could do it 
> would be to install a hook into the event loop.

I think this is a good idea, but it's tricky enough to do right that I'd prefer to save it for a follow-up if it's needed.

We can already do basically what you suggest here without too much trouble; the code for this hook already exists in the event loop.  The way that code works is you call a thread-safe method and we'll fire a low-mem notification immediately before the next turn of the event loop.  So if the event loop isn't running, we won't fire the event.

So we could notice memory pressure, call this function, and then sleep until we see that the low-mem notification fired.  That would let the background processes quiesce completely when we're under memory pressure.

But the trick is tuning heuristics, as usual: We don't want to do this "request memory-pressure event eventually" thing in a background process if we actually have memory to free.  We'll already release memory when the process goes into the background, but it's not in any sense "frozen" in the background, so it can still allocate memory and so on.

So it's not clear to me when to do one of these "request eventual low-mem event" things and when to actually fire a proper event.  :-/
https://hg.mozilla.org/mozilla-central/rev/378c38303325
https://hg.mozilla.org/mozilla-central/rev/05f887e6dd93
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
Depends on: 808756
You need to log in before you can comment on or make changes to this bug.