Last Comment Bug 805855 - Reduce the memory consumption caused by dirty but unused pages kept by jemalloc
: Reduce the memory consumption caused by dirty but unused pages kept by jemalloc
Status: RESOLVED FIXED
[MemShrink:P1][soft-blocker]
:
Product: Firefox OS
Classification: Client Software
Component: General (show other bugs)
: unspecified
: All Gonk (Firefox OS)
: -- normal (vote)
: ---
Assigned To: Gabriele Svelto [:gsvelto]
:
:
Mentors:
Depends on: 814517
Blocks: slim-fast replace-malloc 807998
  Show dependency treegraph
 
Reported: 2012-10-26 10:18 PDT by Gabriele Svelto [:gsvelto]
Modified: 2013-06-22 11:48 PDT (History)
14 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
+
fixed
fixed


Attachments
Memory report of six running applications plus the preloaded one (81.38 KB, application/x-gzip)
2012-10-26 10:18 PDT, Gabriele Svelto [:gsvelto]
no flags Details
Add an extra function to jemalloc to free dirty unused pages (5.32 KB, patch)
2012-10-29 09:46 PDT, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Free dirty unused pages when minimizing memory (1.01 KB, patch)
2012-10-29 09:46 PDT, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Limit dirty unused pages held by jemalloc to 1MiB at most (695 bytes, patch)
2012-10-29 09:47 PDT, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Add an extra function to jemalloc to free dirty unused pages and limit them to 1MiB per arena at most (6.00 KB, patch)
2012-10-29 11:36 PDT, Gabriele Svelto [:gsvelto]
justin.lebar+bug: feedback+
Details | Diff | Splinter Review
Free dirty unused pages when minimizing memory (1.07 KB, patch)
2012-10-29 11:39 PDT, Gabriele Svelto [:gsvelto]
justin.lebar+bug: feedback-
Details | Diff | Splinter Review
Add an extra function to jemalloc to free dirty unused pages (5.46 KB, patch)
2012-11-01 08:15 PDT, Gabriele Svelto [:gsvelto]
justin.lebar+bug: review+
Details | Diff | Splinter Review
Free dirty unused pages when minimizing memory (3.62 KB, patch)
2012-11-01 08:21 PDT, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Free dirty unused pages when minimizing memory b2g-only (4.36 KB, patch)
2012-11-02 02:03 PDT, Gabriele Svelto [:gsvelto]
justin.lebar+bug: review-
Details | Diff | Splinter Review
Limit the maximum amount of dirty unused pages to 1MiB under B2G (856 bytes, patch)
2012-11-02 02:19 PDT, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Add an extra function to jemalloc to free dirty unused pages (6.02 KB, patch)
2012-11-06 12:05 PST, Gabriele Svelto [:gsvelto]
justin.lebar+bug: review+
mh+mozilla: review+
Details | Diff | Splinter Review
Free dirty pages in response to memory-pressure (low-memory) messages (6.01 KB, patch)
2012-11-06 12:14 PST, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Part 2: Free dirty pages in response to all memory-pressure messages (5.94 KB, patch)
2012-11-07 11:25 PST, Gabriele Svelto [:gsvelto]
justin.lebar+bug: review-
Details | Diff | Splinter Review
Part 2: Free dirty pages in response to all memory-pressure messages (5.77 KB, patch)
2012-11-07 18:02 PST, Gabriele Svelto [:gsvelto]
justin.lebar+bug: review+
Details | Diff | Splinter Review
Part 1: Add an extra function to jemalloc to free dirty unused pages (6.12 KB, patch)
2012-11-08 13:31 PST, Gabriele Svelto [:gsvelto]
mh+mozilla: review+
Details | Diff | Splinter Review
Part 2: Free dirty pages in response to all memory-pressure messages (5.83 KB, patch)
2012-11-08 13:33 PST, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review
Part 2: Free dirty pages in response to all memory-pressure messages (11.96 KB, patch)
2012-11-08 23:15 PST, Gabriele Svelto [:gsvelto]
justin.lebar+bug: feedback+
Details | Diff | Splinter Review
Part 2: Free dirty pages in response to all memory-pressure messages (10.18 KB, patch)
2012-11-09 11:22 PST, Gabriele Svelto [:gsvelto]
justin.lebar+bug: review+
Details | Diff | Splinter Review
Part 2: Free dirty pages in response to all memory-pressure messages (9.87 KB, patch)
2012-11-12 08:48 PST, Gabriele Svelto [:gsvelto]
no flags Details | Diff | Splinter Review

Description Gabriele Svelto [:gsvelto] 2012-10-26 10:18:02 PDT
Created attachment 675609 [details]
Memory report of six running applications plus the preloaded one

When freeing memory jemalloc keeps a bunch of empty pages around in order to speed up future allocations. Having been used these pages are dirty and thus will show up as part of the resident set of a process even if in practice they are not being used.

The attached memory report highlights the problem, with 6 apps plus the preloaded app open we have almost 15MiB of free but dirty pages out of a total of 90MB committed by jemalloc. Depending on the application this accounts for 10-20% of the committed heap. Here's the breakdown of every application:

Main Process

 32.41 MB ── heap-committed
  3.58 MB ── heap-dirty

Settings

 16.78 MB ── heap-committed
  3.55 MB ── heap-dirty

Communications

 11.73 MB ── heap-committed
  2.39 MB ── heap-dirty

Homescreen

 10.29 MB ── heap-committed
  1.74 MB ── heap-dirty

Clock

  9.57 MB ── heap-committed
  2.30 MB ── heap-dirty

(App)

  4.52 MB ── heap-committed
  0.64 MB ── heap-dirty

Browser

  5.13 MB ── heap-committed
  0.65 MB ── heap-dirty

There are two ways to recover some of this memory: the first is to instruct jemalloc to not keep around so many dirty-but-unused pages which can be done via the 'dirty_max' option. This solution has a significant downside in that it can raise the number of syscalls used by jemalloc as well as forcing the kernel to create and tear down mappings more often than during regular operation.

Another method would be to provide a function that forces jemalloc to purge all the dirty-but-unused pages and call it when the application is backgrounded to minimize its performance impact. This could be done by leveraging the functionality provided by the fix for bug 800166.

Both cases should be evaluated by tests to quantify their impact, potentially both approaches could be used together to keep the amount of heap-dirty memory in check both for running and backgrounded applications.
Comment 1 Mike Hommey [:glandium] 2012-10-26 11:03:24 PDT
FWIW, jemalloc 3 has controls allowing that through mallctl.

In mozjemalloc, we have a purge function iirc.
Comment 2 Justin Lebar (not reading bugmail) 2012-10-26 11:08:53 PDT
(In reply to Mike Hommey [:glandium] from comment #1)
> FWIW, jemalloc 3 has controls allowing that through mallctl.
> 
> In mozjemalloc, we have a purge function iirc.

The purge function we wrote is different -- it clears madvised pages on mac because they count against RSS (even though they get dropped on system memory pressure).

Unless we have another purge function...
Comment 3 Gabriele Svelto [:gsvelto] 2012-10-26 11:17:16 PDT
(In reply to Justin Lebar [:jlebar] from comment #2)
> The purge function we wrote is different -- it clears madvised pages on mac
> because they count against RSS (even though they get dropped on system
> memory pressure).
> 
> Unless we have another purge function...

Unfortunately not, so we'll either have to write another one for this task or import it from upstream. From a quick look it seems to me that upstream has deviated a lot so I think the safer choice would be to write our own function in mozjemalloc.
Comment 4 Mike Hommey [:glandium] 2012-10-26 11:24:53 PDT
(In reply to Gabriele Svelto [:gsvelto] from comment #3)
> Unfortunately not, so we'll either have to write another one for this task
> or import it from upstream. From a quick look it seems to me that upstream
> has deviated a lot so I think the safer choice would be to write our own
> function in mozjemalloc.

We have jemalloc 3 in the tree and we're working on making it the default. Not sure when that will happen, but depending how long you're okay to wait...
Comment 5 Justin Lebar (not reading bugmail) 2012-10-26 11:27:31 PDT
b2g v1 is shipping FF18 (currently Aurora) and we're certainly not going to turn on jemalloc 3 there.
Comment 6 Mike Hommey [:glandium] 2012-10-26 11:29:24 PDT
Then the mac purge function can certainly be reused/augmented to do what you want.
Comment 7 Gabriele Svelto [:gsvelto] 2012-10-26 11:33:04 PDT
(In reply to Mike Hommey [:glandium] from comment #6)
> Then the mac purge function can certainly be reused/augmented to do what you
> want.

OK, I'll try to modify it for our purposes then, thanks for the tip
Comment 8 Justin Lebar (not reading bugmail) 2012-10-26 11:37:15 PDT
(In reply to Mike Hommey [:glandium] from comment #6)
> Then the mac purge function can certainly be reused/augmented to do what you
> want.

Actually, I think it makes more sense for them to be separate functions, because they're functionally quite different:

* The mac purge function is there only to make our memory accounting correct.  It has no effect on the amount of memory Firefox uses, and should only be called right before we read RSS.

* This new purge function will slow down the app after you call it, so we should only call it when we expect that we're going to be quiescent for a while.  We definitely /shouldn't/ call it right before we read RSS.
Comment 9 Gabriele Svelto [:gsvelto] 2012-10-26 11:45:53 PDT
(In reply to Justin Lebar [:jlebar] from comment #8)
> Actually, I think it makes more sense for them to be separate functions,
> because they're functionally quite different:
> 
> * The mac purge function is there only to make our memory accounting
> correct.  It has no effect on the amount of memory Firefox uses, and should
> only be called right before we read RSS.
> 
> * This new purge function will slow down the app after you call it, so we
> should only call it when we expect that we're going to be quiescent for a
> while.  We definitely /shouldn't/ call it right before we read RSS.

OK, I see your point, I did not consider that the existing function is already in use for other stuff and modifying its semantics would affect other users too. I'll go for a separate one and when jemalloc 3 becomes the standard allocator we'll drop it in favor of mallctl() if it provides the same functionality as Mike suggested.
Comment 10 Mike Hommey [:glandium] 2012-10-26 11:49:43 PDT
(In reply to Gabriele Svelto [:gsvelto] from comment #9)
> I'll go for a separate one and when jemalloc 3 becomes the
> standard allocator we'll drop it in favor of mallctl() if it provides the
> same functionality as Mike suggested.

You might as well add the call for mallctl now, and #ifdef MOZ_JEMALLOC.
Comment 11 Gabriele Svelto [:gsvelto] 2012-10-29 09:46:04 PDT
Created attachment 676199 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages
Comment 12 Gabriele Svelto [:gsvelto] 2012-10-29 09:46:55 PDT
Created attachment 676200 [details] [diff] [review]
Free dirty unused pages when minimizing memory
Comment 13 Gabriele Svelto [:gsvelto] 2012-10-29 09:47:46 PDT
Created attachment 676201 [details] [diff] [review]
Limit dirty unused pages held by jemalloc to 1MiB at most
Comment 14 Gabriele Svelto [:gsvelto] 2012-10-29 10:02:38 PDT
I've attached three patches as work-in-progress to tackle this issue, attachment 676199 [details] [diff] [review] and 676200 apply to mozilla-central whilst attachment 676201 [details] [diff] [review] applies to the gonk-misc repository.

The patches make the following changes:

- The first one introduces a function that forces jemalloc to relinquish all dirty unused pages

- The second one adds an invocation of this function to the MinimizeMemoryUsageRunnable. The function is called after the final iteration of the memory minimization loop when the amount of dirty unused pages should maximal

- The third patch on the other hand applies to gonk-misc and sets the MALLOC_OPTIONS in b2g.sh as suggested by :sinker in bug 802446, comment 17. This forces jemalloc to have no more than 1 MiB worth of dirty pages (256 4KiB pages) per app.

Unfortunately the first two patches do not seem to do what they're supposed to do as my applications are getting killed when they're sent in the background, I still need to properly debug them.

In the meantime applying the third patch yields the following results, heap-dirty before applying the patch across five applications:

Main Process

 38.06 MB ── heap-committed
  2.75 MB ── heap-dirty

Settings

 14.41 MB ── heap-committed
  2.95 MB ── heap-dirty

Communications

 12.84 MB ── heap-committed
  2.81 MB ── heap-dirty

Homescreen

 12.54 MB ── heap-committed
  2.77 MB ── heap-dirty

Clock

 10.82 MB ── heap-committed
  3.14 MB ── heap-dirty

(App)

  4.43 MB ── heap-committed
  0.10 MB ── heap-dirty

And after applying the patch:

Main Process

 38.36 MB ── heap-committed
  0.53 MB ── heap-dirty

Settings

 12.38 MB ── heap-committed
  0.69 MB ── heap-dirty

Communications

 10.95 MB ── heap-committed
  0.77 MB ── heap-dirty

Homescreen

 10.27 MB ── heap-committed
  0.61 MB ── heap-dirty

Clock

  8.54 MB ── heap-committed
  0.77 MB ── heap-dirty

(App)

  4.43 MB ── heap-committed
  0.62 MB ── heap-committed-unused

The numbers cannot be compared with those of comment 1 because I had taken that snapshot with the patches for bug 798491 applied.
Comment 15 Gabriele Svelto [:gsvelto] 2012-10-29 10:07:47 PDT
Comment on attachment 676199 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

In the first patch I've used the MOZ_GONK_MEMORY definition for enabling the new function, this means it won't be included in desktop builds, would it be better to use MOZ_MEMORY instead? Also I did not port the new mallctl() function over as suggested by Mike because the jemalloc mechanics have changed a lot in upstream and it wouldn't work with our code without significant modifications.
Comment 16 Gabriele Svelto [:gsvelto] 2012-10-29 10:10:39 PDT
Comment on attachment 676200 [details] [diff] [review]
Free dirty unused pages when minimizing memory

In the second patch I'm forcing freeing up unused pages as part of the MinimizeMemoryUsageRunnable loop, after all the iterations. This looks like the best place for the call as all the memory-pressure listeners should already have fred as much memory as possible. The call is also guarded by MOZ_MEMORY_GONK which again doesn't feel right.
Comment 17 Mike Hommey [:glandium] 2012-10-29 10:21:06 PDT
> Also I did not port the new mallctl() function over as suggested by Mike

I wasn't actually suggesting that. I was suggesting that in the calling code you do something like
#ifdef MOZ_JEMALLOC
   mallctl("arena.purge");
#else
   your_function()
#endif
Comment 18 Gabriele Svelto [:gsvelto] 2012-10-29 10:25:22 PDT
(In reply to Mike Hommey [:glandium] from comment #17)
> I wasn't actually suggesting that. I was suggesting that in the calling code
> you do something like
> #ifdef MOZ_JEMALLOC
>    mallctl("arena.purge");
> #else
>    your_function()
> #endif

Right, it makes a lot of sense, sorry for the misunderstanding.
Comment 19 Justin Lebar (not reading bugmail) 2012-10-29 10:44:05 PDT
> - The third patch on the other hand applies to gonk-misc and sets the MALLOC_OPTIONS in b2g.sh as 
> suggested by :sinker in bug 802446, comment 17. This forces jemalloc to have no more than 1 MiB 
> worth of dirty pages (256 4KiB pages) per app.

I'd probably prefer to set this within Firefox itself somehow, because people are measuring memory usage on B2G desktop as a proxy for B2G on the device, and these changes will only affect B2G on the device.
Comment 20 Gabriele Svelto [:gsvelto] 2012-10-29 11:36:06 PDT
Created attachment 676243 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages and limit them to 1MiB per arena at most

Folded patches 1 and 3 in the same patch, removed the changes to b2g.sh, now the dirty pages limit is enforced through the _malloc_options global in jemalloc. Fixed an assertion that would trigger when collecting dirty pages and made sure that we properly grab the arena's spinlock before purging it.
Comment 21 Gabriele Svelto [:gsvelto] 2012-10-29 11:39:25 PDT
Created attachment 676245 [details] [diff] [review]
Free dirty unused pages when minimizing memory

Modified to use mallctl() instead of the new functionality when MOZ_JEMALLOC is used as per comment 17. Also removed the #ifdef MOZ_MEMORY_GONK which really didn't belong there.
Comment 22 Andreas Gal :gal 2012-10-29 11:41:14 PDT
Why the 1MB limit? As long we free dirty unused pages when we background that should be enough right? No point in having free memory we don't need right now.
Comment 23 Justin Lebar (not reading bugmail) 2012-10-29 11:54:08 PDT
(In reply to Andreas Gal :gal from comment #22)
> Why the 1MB limit? As long we free dirty unused pages when we background
> that should be enough right? No point in having free memory we don't need
> right now.

No point if we can reliably free that extra memory upon memory pressure before we oom.  But those patches haven't landed (bug 771195).  And even once they do, memory pressure notifications are tricky to do right, and I'm not sure we should rely on them like this.

I kind of doubt we'll see a perf penalty with the dirty page cache at 1mb on these devices, but we should do some experiments.
Comment 24 Justin Lebar (not reading bugmail) 2012-10-29 12:01:38 PDT
Comment on attachment 676245 [details] [diff] [review]
Free dirty unused pages when minimizing memory

I think we should do this off a regular memory-pressure observer, rather than off the nsMemoryReporter::MinimizeMemoryUsage code.  If we do the former, the lowmem notifications from bug 771195 will Just Work.

If we don't want to run this three times when we minimize memory usage (seems pretty unnecessary), we can modify the minimize-memory-usage code to run two GC/CCs and then fire a heap-minimize notification (which will cause a third round of collections).  See attachment 671637 [details] [diff] [review] for the code to poke the gc/cc.

The jemalloc_free_dirty_pages() call should be guarded by MOZ_MEMORY.  Also, I think we should guard the call by a gecko pref that's not enabled except in b2g.  We have code on Windows that tries to dump memory on memory pressure, and dumping these ~3mb of heap-dirty is unlikely to be helpful there.  (That is, we should minimize risk to desktop.)

Do you know how to do the Gecko prefs?
Comment 25 Justin Lebar (not reading bugmail) 2012-10-29 12:10:08 PDT
Comment on attachment 676243 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages and limit them to 1MiB per arena at most

The jemalloc code looks safe to me.  Or at least, as safe as any change here can be.

>+#if defined(MOZ_MEMORY_GONK)
>+void    jemalloc_free_dirty_pages();
>+#endif

Maybe we should provide this function everywhere and simply not call it if we're not in b2g.  (We can use a gecko pref.)

But at the very least, we need to line up the ifdefs here and in the code which calls this function; the caller is not guarding on MOZ_MEMORY_GONK!

>+#ifndef MOZ_MEMORY_GONK
> const char	*_malloc_options;
>+#else
>+const char	*_malloc_options = "ff";
>+#endif

We discussed on IRC, but for everyone else's benefit: This would probably be better guarded with MOZ_MEMORY_SMALL_DIRTY_PAGE_CACHE or something, and then we can turn that flag on by default for B2G.
Comment 26 Gabriele Svelto [:gsvelto] 2012-10-29 12:25:29 PDT
(In reply to Justin Lebar [:jlebar] from comment #25)
> >+#if defined(MOZ_MEMORY_GONK)
> >+void    jemalloc_free_dirty_pages();
> >+#endif
> 
> Maybe we should provide this function everywhere and simply not call it if
> we're not in b2g.  (We can use a gecko pref.)
> 
> But at the very least, we need to line up the ifdefs here and in the code
> which calls this function; the caller is not guarding on MOZ_MEMORY_GONK!

Yeah, I forgot to add one bit, it should have looked like this:

+#if defined(MOZ_MEMORY_GONK)
+void    jemalloc_free_dirty_pages();
+#else
+static inline void jemalloc_free_dirty_pages() { }
+#endif

> We discussed on IRC, but for everyone else's benefit: This would probably be
> better guarded with MOZ_MEMORY_SMALL_DIRTY_PAGE_CACHE or something, and then
> we can turn that flag on by default for B2G.

The idea was to add something like this in configure.in to enable turning it on depending when --enable-application=b2g is set.
Comment 27 Gabriele Svelto [:gsvelto] 2012-10-29 12:31:20 PDT
(In reply to Justin Lebar [:jlebar] from comment #24)
> I think we should do this off a regular memory-pressure observer, rather
> than off the nsMemoryReporter::MinimizeMemoryUsage code.  If we do the
> former, the lowmem notifications from bug 771195 will Just Work.

I haven't looked into that bug yet, if there's a better way to ensure what would be the god time to trigger it then it's definitely the way to go. My rationale for doing it there was that it should probably be triggered after everybody else reclaimed their memory (and thus potentially created more dirty unused pages).

> If we don't want to run this three times when we minimize memory usage
> (seems pretty unnecessary), we can modify the minimize-memory-usage code to
> run two GC/CCs and then fire a heap-minimize notification (which will cause
> a third round of collections).  See attachment 671637 [details] [diff] [review]
> [review] for the code to poke the gc/cc.

I'll check it out, I was also surprised of seeing those three iterations but I didn't want to make this patch more intrusive than it already is.

> The jemalloc_free_dirty_pages() call should be guarded by MOZ_MEMORY.  Also,
> I think we should guard the call by a gecko pref that's not enabled except
> in b2g.  We have code on Windows that tries to dump memory on memory
> pressure, and dumping these ~3mb of heap-dirty is unlikely to be helpful
> there.  (That is, we should minimize risk to desktop.)

You mean that this might be useful for Firefox/Windows too? My rationale for limiting it to B2G was that it wouldn't have been useful in other places and I wanted to minimize the risk associated with what is an intrusive change to jemalloc.

> Do you know how to do the Gecko prefs?

I've seen related code in other patches, I think I could make it dependent on that.
Comment 28 Gabriele Svelto [:gsvelto] 2012-10-29 12:34:43 PDT
Here's the result with both patches applied and the Homescreen in the foreground after having launched all the other applications:

Main Process

 33.59 MB ── heap-committed
  0.59 MB ── heap-dirty

Settings

 11.57 MB ── heap-committed
  0.05 MB ── heap-dirty

Communications

 10.30 MB ── heap-committed
  0.02 MB ── heap-dirty

Homescreen

  9.54 MB ── heap-committed
  0.00 MB ── heap-dirty

Clock

  7.73 MB ── heap-committed
  0.00 MB ── heap-dirty
Comment 29 Justin Lebar (not reading bugmail) 2012-10-29 12:47:13 PDT
> My rationale for doing it there was that it should probably be triggered after everybody 
> else reclaimed their memory (and thus potentially created more dirty unused pages).

Ah, I see...

Hm; that's a bit tricky.  Maybe it would be sufficient to spin the event loop.  (That is, put the call into jemalloc in an nsRunnable and do NS_DispatchToMainThread() on it.)

> You mean that this might be useful for Firefox/Windows too? My rationale for limiting 
> it to B2G was that it wouldn't have been useful in other places and I wanted to 
> minimize the risk associated with what is an intrusive change to jemalloc.

No, I just didn't understand that you meant to make jemalloc_free_dirty_pages() do nothing on non-B2G platforms until comment 26.  I'd probably prefer disabling it on Windows via a Gecko pref (which causes us to avoid calling the method) than via a compile-time pref, just because Gecko prefs are easier to set, but whatever makes sense is fine.
Comment 30 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-10-30 17:04:48 PDT
I'm not sure what the interpretation of memshrink flags is, but this looks like the biggest memory savings left for b2g beyond moving SMS OOP.  I would love to see this fixed before any of the memory-shrinking slim-fast work in the pipeline.
Comment 31 Justin Lebar (not reading bugmail) 2012-10-30 17:07:31 PDT
(In reply to Chris Jones [:cjones] [:warhammer] from comment #30)
> I'm not sure what the interpretation of memshrink flags is, but this looks
> like the biggest memory savings left for b2g beyond moving SMS OOP.  I would
> love to see this fixed before any of the memory-shrinking slim-fast work in
> the pipeline.

I don't think the MemShrink priority is going to affect how quickly Gabriele fixes this, but we were on the fence wrt P1/P2, so I don't think anyone will mind if I bump this up.
Comment 32 Gabriele Svelto [:gsvelto] 2012-10-31 11:09:49 PDT
(In reply to Justin Lebar [:jlebar] from comment #31)
> I don't think the MemShrink priority is going to affect how quickly Gabriele
> fixes this, but we were on the fence wrt P1/P2, so I don't think anyone will
> mind if I bump this up.

I can confirm that I'm working on this as my top priority so I'll deliver the updated patch soon. To sum up what we've decided to do so that I know I'm going in the right direction I'm preparing three patches:

- The first patch changes the way memory is minimized. Instead of the current behavior of sending minimize-memory events thrice I'll invoke the GC/CC twice, send just one minimize-memory event and then spin the event-loop

- The second patch adds the function to cleanup the dirty pages and adds it to the memory minimization process, this however happens depending on an option

- The third patch changes the jemalloc options to reduce the maximum amount of dirty pages to be kept around, I'm wondering if it wouldn't be worth making this one depending on an option too
Comment 33 Gabriele Svelto [:gsvelto] 2012-11-01 04:36:23 PDT
After reading the comments in bug 610166 comment 12 I don't think anymore it's a good idea to alter the current behavior of MinimizeMemoryUsageRunnable, it seems that doing so could introduce regressions and those would be hard to spot too. For now I'll go ahead and work only on the rest of the issue and leave that riskier change for a later time.
Comment 34 Gabriele Svelto [:gsvelto] 2012-11-01 08:15:54 PDT
Created attachment 677426 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

Compared to the previous patch this one makes compiling jemalloc_free_dirty_pages() unconditional and moves the function description to jemalloc.h where it makes more sense for it to be. Functionally it's identical to the old patch.
Comment 35 Gabriele Svelto [:gsvelto] 2012-11-01 08:21:02 PDT
Created attachment 677430 [details] [diff] [review]
Free dirty unused pages when minimizing memory

This patch makes freeing the unused pages dependent on the 'memory.free_dirty_pages' options which I have set to true by default. We probably want to alter this and enable it only for B2G but I'm unsure how to do it in the prefs declaration (ifdef ANDROID?). Also the logic from which the clearing function is selected was changed: if native jemalloc is enabled use mallctl(), if we're using mozjemalloc use jemalloc_free_dirty_pages() and if neither is used no call will be done at all.
Comment 36 Justin Lebar (not reading bugmail) 2012-11-01 08:27:07 PDT
> We probably want to alter this and enable it only for B2G but I'm unsure how to do it in the prefs 
> declaration (ifdef ANDROID?).

Set the pref in b2g.js, which will override the all.js setting.
Comment 37 Mike Hommey [:glandium] 2012-11-01 12:07:08 PDT
Comment on attachment 677426 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

Please don't touch jemalloc.h. I'm actually going to remove it in bug 804303.
Comment 38 Gabriele Svelto [:gsvelto] 2012-11-01 23:57:52 PDT
(In reply to Mike Hommey [:glandium] from comment #37)
> Please don't touch jemalloc.h. I'm actually going to remove it in bug 804303.

I've read the comments on bug 804303 but they didn't give me many clues about where to put the prototype instead.
Comment 39 Mike Hommey [:glandium] 2012-11-02 00:50:15 PDT
Ah, sorry, you're adding a prototype... go ahead in jemalloc.h then, i'll move it in bug 804303.
Comment 40 Gabriele Svelto [:gsvelto] 2012-11-02 01:35:17 PDT
(In reply to Mike Hommey [:glandium] from comment #39)
> Ah, sorry, you're adding a prototype... go ahead in jemalloc.h then, i'll
> move it in bug 804303.

OK, thanks, I really hope to finish the changes and land them soon.
Comment 41 Gabriele Svelto [:gsvelto] 2012-11-02 02:02:12 PDT
Comment on attachment 677426 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

I would say this one is good to go without further changes so I'm asking for review.
Comment 42 Gabriele Svelto [:gsvelto] 2012-11-02 02:03:34 PDT
Created attachment 677689 [details] [diff] [review]
Free dirty unused pages when minimizing memory b2g-only

Revised patch which enables this functionality in b2g.js but leaves it disabled by default.
Comment 43 Gabriele Svelto [:gsvelto] 2012-11-02 02:19:54 PDT
Created attachment 677691 [details] [diff] [review]
Limit the maximum amount of dirty unused pages to 1MiB under B2G

This is the final patch, I hope to make this setting dependent on a preference but I fear it's impossible as the initialization of jemalloc is done very early in the startup process. Since I wanted to make sure that the patch is enabled for B2G desktop as well as for the devices I made the change conditional on MOZ_B2G_VERSION which should be defined for both. It's a bit clunky, I know but we don't seem to have any other globally defined macro for B2G, I thought this was better than adding yet another redundant macro specifically for this.

I will soon post the about:memory profiles without the patches applied, with only the first two patches applied and with the three patches applied to be able to gauge their individual impact.
Comment 44 Justin Lebar (not reading bugmail) 2012-11-02 10:46:37 PDT
Comment on attachment 677691 [details] [diff] [review]
Limit the maximum amount of dirty unused pages to 1MiB under B2G

I think cjones would probably appreciate something like MOZ_LIMIT_MALLOC_TEMP_MEMORY, but I don't have a strong opinion.

Have you experimented with different sizes here in order to quantify the perf trade-off?  I'd like to see some measurements before r+'ing this.
Comment 45 Justin Lebar (not reading bugmail) 2012-11-02 10:53:51 PDT
Comment on attachment 677689 [details] [diff] [review]
Free dirty unused pages when minimizing memory b2g-only

I don't think we should purge dirty pages only when we run minimize-memory-usage; we should do it off heap-minimize notifications.  Minimize-memory-usage is a hack mostly for the sake of about:memory.  We use it when putting a process into the background because when a process is backgrounded, we can afford to run an expensive routine.

But when the system is under memory pressure, we run heap-minimize, not minimize-memory-usage, in all processes.  I think we should drop jemalloc dirty memory on memory pressure.

If you don't want to modify the minimize-memory-usage code as I suggested (caution is fine with me), can you run jemalloc_free_dirty_pages() after spinning the event loop (comment 29)?  That would give the other heap-minimize listeners a chance to free their memory.

You'd end up purging the arenas three times instead of just once on minimize-memory-usage, but that doesn't really bother me.  We run three full GCs on MMU, which should be much more expensive.
Comment 46 Justin Lebar (not reading bugmail) 2012-11-02 11:04:10 PDT
(Low memory notifications are in bug 771195, if you're curious.)
Comment 47 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-11-02 11:08:40 PDT
Comment on attachment 677691 [details] [diff] [review]
Limit the maximum amount of dirty unused pages to 1MiB under B2G

Whups, wrong cjones.  ":warhammer" will always find me.

I don't really know what's going on here, but yes please make a feature macro that's enabled for b2g.
Comment 48 Gabriele Svelto [:gsvelto] 2012-11-02 12:35:28 PDT
(In reply to Justin Lebar [:jlebar] from comment #45)
> I don't think we should purge dirty pages only when we run
> minimize-memory-usage; we should do it off heap-minimize notifications. 
> Minimize-memory-usage is a hack mostly for the sake of about:memory.  We use
> it when putting a process into the background because when a process is
> backgrounded, we can afford to run an expensive routine.
> 
> But when the system is under memory pressure, we run heap-minimize, not
> minimize-memory-usage, in all processes.  I think we should drop jemalloc
> dirty memory on memory pressure.

Right, now I see what you meant in the previous comment. I hadn't realized that we had two paths to deal with memory pressure conditions. I assumed that either way they would go through nsMemoryReporterManager (which sounded weird but without knowing the codebase better looked like the only option to me).

> If you don't want to modify the minimize-memory-usage code as I suggested
> (caution is fine with me), can you run jemalloc_free_dirty_pages() after
> spinning the event loop (comment 29)?  That would give the other
> heap-minimize listeners a chance to free their memory.

No, I'm fine trying to modify minimize-memory-usage, I prefer to get this done right rather than rolling a quick solution and then having to come back later.

I'll also add another feature macro as suggested in comment 44 and post a comparison when using 2MiB and 1MiB (the default is currently 4MiB, I don't think it's worth going down to 512KiB). Sorry for not posting the tests sooner but I don't have my Otoro right now to test, I'll do them tomorrow though.
Comment 49 Gabriele Svelto [:gsvelto] 2012-11-03 14:37:36 PDT
I've compiled some stats gathered when using different limits for the maximum amount of dirty pages. The results shows that lowering their maximum amount tends to lower overall memory consumption though the impact is proportional to the size of the application. The performance impact however is hard to gauge, one thing that is evident is that even with less dirty pages available most application still have plenty of free memory in partially filled pages (heap-committed-unused row), usually more than what is available as dirty pages.

Here are the results (note that counter-intuitively the main process memory consumption grew between runs, something I couldn't explain).

Dirty pages limit         4.00 MB     2.00 MB     1.00 MB

Main process
 heap-committed          31.22 MB    32.00 MB    33.33 MB
 heap-committed-unused    5.28 MB     3.50 MB     2.81 MB
 heap-dirty               3.39 MB     1.61 MB     0.82 MB

Settings
 heap-committed          14.57 MB    12.65 MB    11.69 MB
 heap-committed-unused    4.84 MB     3.06 MB     2.11 MB
 heap-dirty               3.39 MB     1.71 MB     0.71 MB

Communications
 heap-committed          12.16 MB    10.32 MB    10.27 MB
 heap-committed-unused    5.12 MB     1.78 MB     2.62 MB
 heap-dirty               3.43 MB     0.97 MB     0.86 MB

Homescreen
 heap-committed          11.04 MB    10.42 MB     9.82 MB
 heap-committed-unused    3.34 MB     2.62 MB     2.08 MB
 heap-dirty               1.86 MB     1.42 MB     0.72 MB

Clock
 heap-committed           9.66 MB     8.57 MB     8.00 MB
 heap-committed-unused    3.29 MB     2.27 MB     1.70 MB
 heap-dirty               2.45 MB     1.40 MB     0.86 MB

Browser
 heap-committed           5.21 MB     5.16 MB     5.16 MB
 heap-committed-unused    1.16 MB     1.15 MB     1.15 MB
 heap-dirty               0.63 MB     0.62 MB     0.62 MB

Preload app
 heap-committed           3.62 MB     3.62 MB     3.62 MB
 heap-committed-unused    0.60 MB     0.60 MB     0.60 MB
 heap-dirty               0.09 MB     0.09 MB     0.09 MB

Regarding dropping the dirty pages entirely only when under memory pressure I've experimented with a per-app observer listening to memory-pressure/low-memory with mixed results. Certain applications would drop the pages correctly, others wouldn't as if they had been fred after jemalloc_free_dirty_pages() had been called. I need to do further tests to figure out where the problem lies and how to fix it.
Comment 50 Justin Lebar (not reading bugmail) 2012-11-03 15:37:48 PDT
Thanks for running some tests!

> The performance impact however is hard to gauge

Could you run a benchmark such as V8 or Kraken inside the browser app?

> one thing that is evident is that even with less dirty pages available most application still have 
> plenty of free memory in partially filled pages (heap-committed-unused row), usually more than what 
> is available as dirty pages.

For one thing, you're measuring steady-state memory usage, not high-water-mark usage, so it's not entirely unexpected that you should see some unused memory sticking around.

dirty-pages counts only pages which the allocator could free -- that is, pages which are entirely empty.  On the other hand, heap-committed-unused (which may or may not also count dirty pages; I forget) counts pages which may have stuff on them, so are not decommitable.

Also note that jemalloc can't use an arbitrary spot from a heap-committed-unused page to fulfill any request, even if the spot is big enough to fit the request, because certain pages can only hold allocations of certain sizes.

If we take these results at face value, it seems that we should reduce the dirty pages limit for subprocesses but not for the main process.  I think this would be feasible; we can set an environment variable when launching the child processes.

But maybe the higher heap-committed you saw in the main process is just noise.  That seems like the simplest explanation here.

> Regarding dropping the dirty pages entirely only when under memory pressure
> I've experimented with a per-app observer listening to
> memory-pressure/low-memory with mixed results. Certain applications would
> drop the pages correctly, others wouldn't as if they had been fred after
> jemalloc_free_dirty_pages() had been called. I need to do further tests to
> figure out where the problem lies and how to fix it.

What are you doing, exactly?
Comment 51 Gabriele Svelto [:gsvelto] 2012-11-05 05:37:37 PST
(In reply to Justin Lebar [:jlebar] from comment #50)
> Could you run a benchmark such as V8 or Kraken inside the browser app?

Yes, I'll give them a try.

> If we take these results at face value, it seems that we should reduce the
> dirty pages limit for subprocesses but not for the main process.  I think
> this would be feasible; we can set an environment variable when launching
> the child processes.

Yes that is a possibility, however I agree with your previous idea that if we can just drop all dirty pages on memory pressure then limiting them beforehand has reduced value. I'm aware of how jemalloc segregates small allocations and without more detailed information on pages' status it's hard to tell if the heap-committed-unused figure is pure fragmentation or available free space. I also think that dirty is accounted in that number.

One important thing to take into account though is that if you substract heap-dirty from heap-committed not in all cases a smaller number of dirty pages leads to a lower used footprint, this means that reducing the number of dirty pages might actually generate more fragmentation in some cases which is not something we'd want.

> But maybe the higher heap-committed you saw in the main process is just
> noise.  That seems like the simplest explanation here.

Possibly, I'd really need to set some time aside to run the same test over and over a bunch of times and see how much variation there is.

> What are you doing, exactly?

I've attached an observer for memory-pressure to each ContentChild which I use to drop dirty pages. The flow of events looks like this:

- MemoryPressureWatcher spots  a low-memory condition and sends a memory-pressure notification
- ContentParent grabs it and sends a FlushMemory IPC message to all ContentChilds
- ContentChild receives it and issues a memory-pressure notification
- An observer grabs it and purges dirty pages

The problem is that I think my observer is being called before others have run, which would explain why dirty pages are not dropped. I haven't been able to prove this but one thing I wanted to try was to put the code into a runnable which would spin the event loop once before doing the freeing, in the hope that the other observers would run first and release memory.
Comment 52 Justin Lebar (not reading bugmail) 2012-11-05 08:20:54 PST
> Possibly, I'd really need to set some time aside to run the same test over and over a 
> bunch of times and see how much variation there is.

If it's too much time to verify that this reducing the dirty size helps like we want, we could just do the low-mem notifications in this bug, since we think that's safe.

But I think we probably want to decrease the dirty cache size, if we can show that it's a win.  Low-memory notifications are a last-ditch attempt to save ourselves, and I think they'll prove to be less-reliable than the alternative of not using that memory.  In particular, if a process tries to allocate X mb and we have less than X mb of free memory, some process is toast; low memory notifications can't help us here, because we can't react quickly enough.

> I've attached an observer for memory-pressure to each ContentChild which I use to drop 
> dirty pages.

If I haven't royally screwed up the low-memory notifications, this should be unnecessary.  Each process watches the low-memory-notify fd itself.

> The problem is that I think my observer is being called before others have run, which 
> would explain why dirty pages are not dropped.

Did you try purging dirty pages off a runnable to get around this problem?
Comment 53 Gabriele Svelto [:gsvelto] 2012-11-05 08:44:47 PST
(In reply to Justin Lebar [:jlebar] from comment #52)
> If it's too much time to verify that this reducing the dirty size helps like
> we want, we could just do the low-mem notifications in this bug, since we
> think that's safe.

It shouldn't take much time, it's just that have been so busy with a zillion other things. Maybe today I'll be able to do it and get a definitive answer (I'm in SF for the B2G work week).

> But I think we probably want to decrease the dirty cache size, if we can
> show that it's a win.  Low-memory notifications are a last-ditch attempt to
> save ourselves, and I think they'll prove to be less-reliable than the
> alternative of not using that memory.  In particular, if a process tries to
> allocate X mb and we have less than X mb of free memory, some process is
> toast; low memory notifications can't help us here, because we can't react
> quickly enough.

Right, so we'll also need the data from the browser benchmarks.

> If I haven't royally screwed up the low-memory notifications, this should be
> unnecessary.  Each process watches the low-memory-notify fd itself.

I noticed that the widget is initialized in nsAppShell::Init() which I couldn't relate to the rest of the application startup so I did a few tests. What I noticed was that a single ContentParent in b2g would get the first notification and then it would send it down to all ContentChilds which in turn had the PIDs of all the other processes. From your description it seems that this is not what you were expecting (or I got something wrong while testing it).

> Did you try purging dirty pages off a runnable to get around this problem?

Not yet, but that's what I want to try today.
Comment 54 Justin Lebar (not reading bugmail) 2012-11-05 10:38:44 PST
> From your description it seems that this is not what you were expecting

Ah, indeed.  That's not what I thought was going on, but it looks like you're right.

My intent was to have each process watch for memory pressure itself, in the hopes that would allow us to react more quickly.  But maybe that's not necessary.
Comment 55 Gabriele Svelto [:gsvelto] 2012-11-05 13:34:14 PST
(In reply to Justin Lebar [:jlebar] from comment #54)
> Ah, indeed.  That's not what I thought was going on, but it looks like
> you're right.
> 
> My intent was to have each process watch for memory pressure itself, in the
> hopes that would allow us to react more quickly.  But maybe that's not
> necessary.

Turns out we were both right, I've filed bug 808756 because we're getting duplicated memory-pressure events. One generated in every process by the watcher and one that is forwarded from the main b2g process to all its children in response to the original memory-pressure event.
Comment 56 Gabriele Svelto [:gsvelto] 2012-11-06 12:05:05 PST
Created attachment 678832 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

Modified the patch to build correctly on Linux and possibly Windows too
Comment 57 Gabriele Svelto [:gsvelto] 2012-11-06 12:14:31 PST
Created attachment 678837 [details] [diff] [review]
Free dirty pages in response to memory-pressure (low-memory) messages

This patch changes the behavior of the previous patch to listen to memory-pressure (low-memory) events and hooks only one observer per process (including the main b2g process). Compared to the previous patch this one will also not trigger when a memory-pressure (heap-minimize) (i.e. when application go into the background or when minimization is called explicitly).
Comment 58 Mozilla RelEng Bot 2012-11-06 15:30:43 PST
Try run for 96ffd330068d is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=96ffd330068d
Results (out of 8 total builds):
    success: 6
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gsvelto@mozilla.com-96ffd330068d
Comment 59 Gabriele Svelto [:gsvelto] 2012-11-07 11:25:20 PST
Created attachment 679285 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

Final version of the patch, the only difference with the previous one is that this one responds to all memory-pressure events and not only to the low-memory ones.
Comment 60 Gabriele Svelto [:gsvelto] 2012-11-07 11:29:33 PST
Comment on attachment 678832 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

Both patches have been tested and build successfully on Windows and Linux too, this one should be applied before the one in attachment 679285 [details] [diff] [review].
Comment 61 Gabriele Svelto [:gsvelto] 2012-11-07 13:47:12 PST
To add a data-point and a short summary to this bug with both patches applied I can open the Gallery, Settings, Clock, Music, Feedback, Calculator and Browser apps without the Homescreen getting killed, this is due to a ~15MiB memory footprint reduction across all applications. What the patches do now it to free the dirty pages both when an application goes into the background and when a low-memory condition is triggered.
Comment 62 Justin Lebar (not reading bugmail) 2012-11-07 16:56:39 PST
This is one of our top b2g MemShrink bugs, and we'll land this with minimal risk to desktop.  soft-basecamp-blocking+.
Comment 63 Justin Lebar (not reading bugmail) 2012-11-07 17:00:49 PST
Comment on attachment 678832 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

This looks good to me, but I don't understand the build glue well enough to feel comfortable vouching for those changes.  Can you have a look at that part, glandium?
Comment 64 Justin Lebar (not reading bugmail) 2012-11-07 17:14:55 PST
Comment on attachment 679285 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

>diff --git a/b2g/app/b2g.js b/b2g/app/b2g.js
>--- a/b2g/app/b2g.js
>+++ b/b2g/app/b2g.js
>@@ -582,8 +582,12 @@
> // a restart is required to enable a new value.
> pref("network.activity.blipIntervalMilliseconds", 250);
> 
> // Send some sites a custom user-agent.
> pref("general.useragent.override.facebook.com", "\(Mobile#(Android; Mobile");
> pref("general.useragent.override.youtube.com", "\(Mobile#(Android; Mobile");
> 
> pref("jsloader.reuseGlobal", true);
>+
>+// Enable freeing dirty pages when minimizing memory, this reduces memory
>+// consumption when applications are sent to the background.
>+pref("memory.free_dirty_pages", true);

Nit: Comma splice.  Use a semicolon or start a new sentence.

>diff --git a/xpcom/base/nsMemoryReporterManager.cpp b/xpcom/base/nsMemoryReporterManager.cpp
>--- a/xpcom/base/nsMemoryReporterManager.cpp
>+++ b/xpcom/base/nsMemoryReporterManager.cpp
>+    nsCOMPtr<nsIObserverService> os = services::GetObserverService();
>+
>+    if (os) {
>+      os->AddObserver(this, "memory-pressure", false);
>+    }

Every reviewer has their own personal nits they ask people to do (sorry, this annoys me too).  One of mine is that we should comment the meaning of opaque boolean args, like this "false".  You'd do

>+      os->AddObserver(this, "memory-pressure", /* ownsWeak */ false);
   
>+namespace {
>+
>+/**
>+ * This runnable executed in response to a memory-pressure event, it currently
>+ * waits for all other memory-pressure observers before cleaning the dirty
>+ * unused pages held by jemalloc.
>+ */

Nit: Comma splice.  Also, you're missing a verb in the first clause (probably
"/is/ executed").

>+class MemoryPressureRunnable : public nsRunnable
>+{
>+public:
>+  MemoryPressureRunnable()
>+    : mFreeDirtyPages(false)
>+  {
>+    nsresult rv = NS_OK;
>+    nsCOMPtr<nsIPrefBranch> prefs = do_GetService(NS_PREFSERVICE_CONTRACTID, &rv);
>+    if (NS_SUCCEEDED(rv)) {

You can save a line of code by dropping the |rv| and doing |if (prefs)|.  But you can save even more code by doing |if (Preferences::GetBool("memory.free_dirty_pages"))| down below.

>+      prefs->GetBoolPref("memory.free_dirty_pages", &mFreeDirtyPages);
>+    }
>+  }

>+  NS_IMETHOD Run()
>+  {
>+    MOZ_ASSERT(NS_IsMainThread());
>+
>+    if (mFreeDirtyPages) {
>+#if defined(MOZ_JEMALLOC)
>+      mallctl("arenas.purge", nullptr, 0, nullptr, 0);
>+#elif defined(MOZ_MEMORY)
>+      jemalloc_free_dirty_pages();
>+#endif
>+    }
>+
>+    return NS_OK;
>+  }
>+
>+private:
>+  bool mFreeDirtyPages;
>+};
>+
>+} // anonymous namespace
>+
>+NS_IMETHODIMP
>+nsMemoryReporterManager::Observe(nsISupports *subject, const char *topic,
>+                                 const PRUnichar *data)
>+{
>+  MOZ_ASSERT(!strcmp(topic, "memory-pressure"), "Unknown topic");
>+
>+  nsRefPtr<MemoryPressureRunnable> memoryPressureRunnable =
>+    new MemoryPressureRunnable();
>+  NS_DispatchToMainThread(memoryPressureRunnable);

Rather than check the pref in the runnable, I'd rather check it before we even dispatch the runnable.  That way we don't dispatch useless runnables when the pref is false.

This is totally the right idea, but I'd like to see another patch, if you don't mind.
Comment 65 Gabriele Svelto [:gsvelto] 2012-11-07 18:02:25 PST
Created attachment 679497 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

I should have fixed all the issues you highlighted in the previous patch, I also changed indentation to 4 spaces, some parts of nsMemoryReporterManager.cpp like the MinimizeMemoryUsageRunnable use 2 spaces so that got me confused.
Comment 66 Justin Lebar (not reading bugmail) 2012-11-07 18:11:51 PST
Comment on attachment 679497 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

r=me, with one nit I should have caught in the first review.

> + * This runnable is executed in response to a memory-pressure event; it
> + * currently waits for all other memory-pressure observers before cleaning the
> + * dirty unused pages held by jemalloc.

This is kind of misleading; the runnable doesn't wait for all other observers to run.  It's more like we spin the event loop before running jemalloc_free_dirty_pages() in the hopes that other observers will synchronously free() their memory, which we can then purge.
Comment 67 Mike Hommey [:glandium] 2012-11-07 22:09:09 PST
Comment on attachment 678832 [details] [diff] [review]
Add an extra function to jemalloc to free dirty unused pages

Review of attachment 678832 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozjemalloc/jemalloc.h
@@ +122,5 @@
> + * memory needs to be reclaimed at all costs (see bug 805855). This function
> + * provides functionality similar to mallctl("arenas.purge") in jemalloc 3.
> + */
> +
> +#if defined(MOZ_NATIVE_JEMALLOC) \

You don't want the function to be there at all for MOZ_NATIVE_JEMALLOC.
Comment 68 Gabriele Svelto [:gsvelto] 2012-11-08 10:33:33 PST
(In reply to Mike Hommey [:glandium] from comment #67)
> ::: memory/mozjemalloc/jemalloc.h
> @@ +122,5 @@
> > + * memory needs to be reclaimed at all costs (see bug 805855). This function
> > + * provides functionality similar to mallctl("arenas.purge") in jemalloc 3.
> > + */
> > +
> > +#if defined(MOZ_NATIVE_JEMALLOC) \
> 
> You don't want the function to be there at all for MOZ_NATIVE_JEMALLOC.

Right, I had fought for a while with the Linux build because I hadn't realized the symbol needed to have weak linking so in the end I had copied the the #ifdefs from jemalloc_stats:

http://mxr.mozilla.org/mozilla-central/source/memory/mozjemalloc/jemalloc.h#50

I'll revise the patch and push it to try first before re-requesting the review, I want to make sure it build properly on all platforms.
Comment 69 Justin Lebar (not reading bugmail) 2012-11-08 10:58:57 PST
> Attachment #678832 [details] [diff] - Flags: review?(mh+mozilla@glandium.org) → review+

glandium gave you r+, so you don't need to re-request review (unless you want feedback, of course).
Comment 70 Gabriele Svelto [:gsvelto] 2012-11-08 13:31:50 PST
Created attachment 679808 [details] [diff] [review]
Part 1: Add an extra function to jemalloc to free dirty unused pages

Updated patch with the prototype disabled when MOZ_NATIVE_JEMALLOC is defined, the rest is unchanged. I've pushed it to try here https://tbpl.mozilla.org/?tree=Try&rev=ec976f6df8a6 and I'm requesting review again because since this interacts with a lot of parts I'm not familiar with I fear breaking the build for platforms I'm unable to test it on.
Comment 71 Gabriele Svelto [:gsvelto] 2012-11-08 13:33:58 PST
Created attachment 679812 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

Modified the description of the MemoryPressureRunnable as per comment 66
Comment 72 Gabriele Svelto [:gsvelto] 2012-11-08 15:41:45 PST
I pushed to try again those patches and I got the same weird error on four platforms:

https://tbpl.mozilla.org/?tree=Try&rev=98127c00918f

The issue happens in a packaging phase apparently and doesn't seem to have anything to do with my patch; I tried reproducing it on my machine but without success. Any clues as to why this is happening? Can we check in even with this issue present?
Comment 73 Mozilla RelEng Bot 2012-11-08 15:45:33 PST
Try run for 98127c00918f is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=98127c00918f
Results (out of 7 total builds):
    success: 2
    failure: 5
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gsvelto@mozilla.com-98127c00918f
Comment 74 Mike Hommey [:glandium] 2012-11-08 15:49:27 PST
(In reply to Gabriele Svelto [:gsvelto] from comment #72)
> I pushed to try again those patches and I got the same weird error on four
> platforms:
> 
> https://tbpl.mozilla.org/?tree=Try&rev=98127c00918f
> 
> The issue happens in a packaging phase apparently and doesn't seem to have
> anything to do with my patch; I tried reproducing it on my machine but
> without success. Any clues as to why this is happening?

"ASSERTION: Recursive GetService!" sounds pretty explicit. Assuming it's not broken if you deapply your patches, the likely problem is https://hg.mozilla.org/try/rev/756d0cc54529#l3.40

> Can we check in even with this issue present?

Not without making sure this is not due to your patches.
Comment 75 Nicholas Nethercote [:njn] 2012-11-08 15:53:27 PST
I asked on #developers, the sheriffs hadn't seen this problem before, so as glandium says it's very likely your fault :/
Comment 76 Gabriele Svelto [:gsvelto] 2012-11-08 16:11:31 PST
(In reply to Mike Hommey [:glandium] from comment #74)
> "ASSERTION: Recursive GetService!" sounds pretty explicit. Assuming it's not
> broken if you deapply your patches, the likely problem is
> https://hg.mozilla.org/try/rev/756d0cc54529#l3.40

Yes, it's definitely my fault, I hadn't realized you can't call getService() from within a getService() call. What is happening is that the nsMemoryReporterManager class is being instanced in a do_GetService("@mozilla.org/memory-reporter-manager;1") call, when Init() is invoked and GetObserverService() is called it triggers the assertion. I'll revise the patch and push to try again.
Comment 77 Nicholas Nethercote [:njn] 2012-11-08 16:18:01 PST
An interesting question:  why didn't you see this locally?
Comment 78 Gabriele Svelto [:gsvelto] 2012-11-08 16:46:49 PST
(In reply to Nicholas Nethercote [:njn] from comment #77)
> An interesting question:  why didn't you see this locally?

I've re-run the desktop B2G and checked through the huge log it spits out, I get it locally too, but since it didn't stop anything from working I hadn't noticed it. In fact I can see a couple of other (apparently) unrelated assertions being triggered in that log.
Comment 79 Gabriele Svelto [:gsvelto] 2012-11-08 23:15:23 PST
Created attachment 679989 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

Revised patch that does not trigger the recursive GetService() assertion anymore; the memory pressure watcher now lives in the available memory tracker sources as discussed and is started together with other singletons in nsLayoutStatics.

I've tested it locally and also pushed it to try to speed up landing it if we're satisfied with this new iteration:

https://tbpl.mozilla.org/?tree=Try&rev=6467bfe532b7
Comment 80 Justin Lebar (not reading bugmail) 2012-11-09 09:52:27 PST
Comment on attachment 679989 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

>diff --git a/xpcom/base/AvailableMemoryTracker.h b/xpcom/base/AvailableMemoryTracker.h
>--- a/xpcom/base/AvailableMemoryTracker.h
>+++ b/xpcom/base/AvailableMemoryTracker.h
>-#if defined(XP_WIN)
> void Init();
> void Activate();
>-#else
>-void Init() {}
>-void Activate() {}
>-#endif

If you're changing this so that Init() and Activate() are defined on all
platforms, then maybe Activate() should initialize nsMemoryPressureWatcher()?
You'd need to update the comment.  Otherwise I don't see why you're making this
change.

>+
>+/**
>+ * The memory pressure watcher is a per-process singleton used for listening
>+ * to memory-pressure events and reacting upon them
>+ */
>+
>+class nsMemoryPressureWatcher MOZ_FINAL : public nsIObserver

nsFoo should live inside the empty namespace.  If it's insize mozilla::, we
call it Foo.

If you move this observer's initialization to AvailableMemoryTracker::Init(),
then you don't need to define it inside the header, which is nice.  But either
way...

>+{
>+public:
>+    NS_DECL_ISUPPORTS
>+    NS_DECL_NSIOBSERVER
>+
>+    static void Init();
>+    static void Shutdown();
>+
>+private:
>+    nsMemoryPressureWatcher();
>+    virtual ~nsMemoryPressureWatcher();
>+
>+    static nsMemoryPressureWatcher *sInstance;
>+
>+    bool mFreeDirtyPages;

Can you simply avoid creating the object if the pref is not set?  Then you
don't need this member variable.

>diff --git a/xpcom/base/AvailableMemoryTracker.cpp b/xpcom/base/AvailableMemoryTracker.cpp

>+/** Memory pressure watcher singleton */
>+nsMemoryPressureWatcher *nsMemoryPressureWatcher::sInstance = nullptr;

Please use StaticRefPtr instead; that way, you don't need to manually
addref/release.  But I don't think you need the singleton at all; see below.

>+NS_IMPL_ISUPPORTS1(nsMemoryPressureWatcher, nsIObserver)
>+
>+/**
>+ * Initialize the nsMemoryPressureWatcher singleton, the newly created object
>+ * will start listening to memory pressure events.
>+ */
>+
>+void
>+nsMemoryPressureWatcher::Init()
>+{
>+    sInstance = new nsMemoryPressureWatcher();
>+    NS_IF_ADDREF(sInstance);

Can we just not create the observer if the pref is not set?

>+}
>+
>+/**
>+ * Tear down the memory pressure watcher singleton.
>+ */
>+
>+void
>+nsMemoryPressureWatcher::Shutdown()
>+{
>+    NS_IF_RELEASE(sInstance);
>+}
>+
>+/**
>+ * Create a memory pressure watcher object and makes it listen to
>+ * memory-pressure events. Currently memory-pressure events are subscribed to
>+ * only if clearing dirty pages is enabled as that is the only action we
>+ * take in response to those events.
>+ */
>+
>+nsMemoryPressureWatcher::nsMemoryPressureWatcher() :
>+    mFreeDirtyPages(Preferences::GetBool("memory.free_dirty_pages", false))
>+{
>+    if (mFreeDirtyPages) {
>+        nsCOMPtr<nsIObserverService> os = services::GetObserverService();
>+
>+        if (os) {
>+            os->AddObserver(this, "memory-pressure", /* ownsWeak */ false);

Notice that you're passing false for the third param here.  That means that the
observer service holds a strong reference to this.  Which means that the
observer service holds this object alive; you don't need to hold your own ref.
It also means that...

>+        }
>+    }
>+}

>+nsMemoryPressureWatcher::~nsMemoryPressureWatcher()
>+{
>+    if (mFreeDirtyPages) {
>+        nsCOMPtr<nsIObserverService> os = services::GetObserverService();
>+
>+        if (os) {
>+            os->RemoveObserver(this, "memory-pressure");
>+        }

...this code doesn't work as you expect.  We won't be destroyed until the
observer service releases its ref to this.  At which point RemoveObserver(this)
won't do anything.

You can just rely on the observer service to keep the observer alive; that way
you don't need to keep a strong ref to it.
Comment 81 Gabriele Svelto [:gsvelto] 2012-11-09 11:22:54 PST
Created attachment 680163 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

Revised patch taking into account the suggestions in comment 80:
- Removed the singleton instance, the watcher is kept alive by the observer service
- Renamed classes to reflect the proper naming convention (MemoryPressureWatcher now lives under mozilla::AvailableMemoryTracker, I saw no reason to put it in mozilla since it's used only by the AvailableMemoryTracker now)
- The watcher object is created only if memory.free_dirty_pages is true
- Used 2 spaces indentation because the rest of the file uses it
- Reordered the includes as quite a few of them were changed
- Updated the comment in AvailableMemoryTracker.h to reflect that we're now using it on all platforms but only on Windows we have actual available memory tracking
Comment 82 Justin Lebar (not reading bugmail) 2012-11-11 18:01:31 PST
Comment on attachment 680163 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages

>diff --git a/xpcom/base/AvailableMemoryTracker.cpp b/xpcom/base/AvailableMemoryTracker.cpp
>--- a/xpcom/base/AvailableMemoryTracker.cpp
>+++ b/xpcom/base/AvailableMemoryTracker.cpp
>@@ -1,31 +1,44 @@
> /* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
> /* vim:set ts=2 sw=2 sts=2 ci et: */
> /* This Source Code Form is subject to the terms of the Mozilla Public
>  * License, v. 2.0. If a copy of the MPL was not distributed with this
>  * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> 
>-#include "mozilla/AvailableMemoryTracker.h"
>-#include "nsThread.h"
>-#include "nsIObserverService.h"
>-#include "mozilla/Services.h"
>-#include "mozilla/Preferences.h"
>-#include "nsWindowsDllInterceptor.h"
> #include "prinrval.h"
> #include "pratom.h"
> #include "prenv.h"
>+
> #include "nsIMemoryReporter.h"
>+#include "nsIObserver.h"
>+#include "nsIObserverService.h"
>+#include "nsISupports.h"
> #include "nsPrintfCString.h"
>-#include <windows.h>
>+#include "nsThread.h"
>+
>+#include "mozilla/AvailableMemoryTracker.h"

Please feel free to order the includes here however you like, subject to one
condition: A cpp's "primary" header should come first.  This insures that the
primary header includes everything it needs to compile.  That is, we don't want
AvailableMemoryTracker.h to rely on always being included after one of the
headers above.

>+class nsMemoryPressureRunnable : public nsRunnable

Maybe call this JemallocFreeDirtyPagesRunnable(), since that's what it actually
does?  Alternatively, you could make your observer implement nsIRunnable.

>+/**
>+ * The memory pressure watcher is used for listening to memory-pressure events
>+ * and reacting upon them. We use one instance per process currently only for
>+ * cleaning up dirty unused pages held by jemalloc.
>+ */
>+
>+class MemoryPressureWatcher MOZ_FINAL : public nsIObserver

Can you move this into the anonymous namespace?

>+/**
>+ * Empty constructor.
>+ * @see mozilla::AvailableMemoryTracker::MemoryPressureWatcher::Init()
>+ */
>+
>+MemoryPressureWatcher::MemoryPressureWatcher()
>+{
>+}
>+
>+/**
>+ * Empty destructor, we don't unsubscribe from the memory-pressure events as
>+ * this will be invoked only when the observer services is shuttind down.
>+ */
>+
>+MemoryPressureWatcher::~MemoryPressureWatcher()
>+{
>+}

Do you need to define these at all?

>+/**
>+ * Initialize and subscribe to the memory-pressure events. We subscribe to the
>+ * observer service in this method and not in the constructor because we need
>+ * to hold a strong reference to @c this before calling the observer service.
>+ */
>+
>+void
>+MemoryPressureWatcher::Init()

Nit: We don't usually put newlines between comments and function definitions,
here and elsewhere.

Also, I'm sure @c is some javadoc convention, but we don't seem to use it, so you probably shouldn't either.  (When in doubt, |git grep '\<@c'|.)

r=me with these nits addressed.
Comment 83 Gabriele Svelto [:gsvelto] 2012-11-12 08:48:26 PST
Created attachment 680657 [details] [diff] [review]
Part 2: Free dirty pages in response to all memory-pressure messages
Comment 84 Gabriele Svelto [:gsvelto] 2012-11-12 08:51:13 PST
(In reply to Justin Lebar [:jlebar] from comment #82)
> Please feel free to order the includes here however you like, subject to one
> condition: A cpp's "primary" header should come first.  This insures that the
> primary header includes everything it needs to compile.  That is, we don't
> want AvailableMemoryTracker.h to rely on always being included after one of the
> headers above.

I see, it makes sense, I've adjusted it.

> Maybe call this JemallocFreeDirtyPagesRunnable(), since that's what it
> actually does? Alternatively, you could make your observer implement nsIRunnable.

I've created a separate runnable in the patch but yes, if we want to shrink it further I can just fold it MemoryPressureWatcher.

> Can you move this into the anonymous namespace?

Yup, done.

> Do you need to define these at all?

No, actually not, I've removed them.

> Also, I'm sure @c is some javadoc convention, but we don't seem to use it,
> so you probably shouldn't either.  (When in doubt, |git grep '\<@c'|.)

It's the <code></code> tag in javadoc but I've taken it out just in case. If the patch looks good to you I'll push it to try and when the results are in we can commit it.
Comment 85 Justin Lebar (not reading bugmail) 2012-11-12 08:56:05 PST
> It's the <code></code> tag in javadoc

FWIW, javadoc in Gecko is a stylistic convention -- I don't know of anyone who actually uses the javadoc tool to generate documentation.

Looks good to me.
Comment 86 Gabriele Svelto [:gsvelto] 2012-11-12 13:40:00 PST
(In reply to Justin Lebar [:jlebar] from comment #85)
> FWIW, javadoc in Gecko is a stylistic convention -- I don't know of anyone
> who actually uses the javadoc tool to generate documentation.

I see... well there are tools like doxygen that extract Javadoc-like documentation from C++ sources so I thought it might be something like that.

> Looks good to me.

Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=b7ecd4cd7710

I'll set the checkin-needed flag once the results are in.
Comment 87 Mozilla RelEng Bot 2012-11-12 17:45:37 PST
Try run for b7ecd4cd7710 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b7ecd4cd7710
Results (out of 18 total builds):
    success: 16
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/gsvelto@mozilla.com-b7ecd4cd7710
Comment 88 Gabriele Svelto [:gsvelto] 2012-11-13 08:49:32 PST
The results are all green except for the unagi optimized build which seem to have failed for unrelated reasons (some rsync-related issue apparently) so I would say we're good to check it in.

Note that I've obsoleted the patch that limited the amount of dirty unused pages because it has not much more reason to be now that we're cleaning up those pages both when applications are sent to the background and when memory is running low.
Comment 89 Justin Lebar (not reading bugmail) 2012-11-13 08:53:17 PST
> Note that I've obsoleted the patch that limited the amount of dirty unused pages because it has not 
> much more reason to be now that we're cleaning up those pages both when applications are sent to the 
> background and when memory is running low.

Well...at any time we have our current app and the main process in the "foreground".  So together they're using up to 8mb  of dirty pages.  If we could reduce that to ~2mb, that could be a significant win.

But we can handle this in a follow-up.
Comment 92 Justin Lebar (not reading bugmail) 2012-11-14 07:33:20 PST
Gabriele, can you file a bug on reducing the jemalloc dirty size, so we don't lose track of it?
Comment 93 Gabriele Svelto [:gsvelto] 2012-11-14 08:14:08 PST
(In reply to Justin Lebar [:jlebar] from comment #92)
> Gabriele, can you file a bug on reducing the jemalloc dirty size, so we
> don't lose track of it?

Done, see bug 811740, I'll add a more recent about:memory report to that bug that shows the impact of various sizes with the patches for this bug in place.
Comment 95 Jesse Ruderman 2012-11-15 13:10:26 PST
What is the effect of enabling memory.free_dirty_pages in Firefox for Desktop?  Should I randomize that pref when fuzz-testing Firefox for Desktop, e.g. in order to catch bugs that might affect B2G?
Comment 96 Gabriele Svelto [:gsvelto] 2012-11-15 13:16:16 PST
(In reply to Jesse Ruderman from comment #95)
> What is the effect of enabling memory.free_dirty_pages in Firefox for
> Desktop?  Should I randomize that pref when fuzz-testing Firefox for
> Desktop, e.g. in order to catch bugs that might affect B2G?

In theory even if the pref is enabled the relevant code is unlikely to be executed on the desktop version. The code is triggered by a memory-pressure event which is either caused by a low memory situation (very unlikely on the desktop, and I think available only on Windows at the moment) or when the about:memory page is invoked and the "Minimize memory usage" button is explicitly clicked.
Comment 97 Nicholas Nethercote [:njn] 2012-11-15 13:58:47 PST
> or when the
> about:memory page is invoked and the "Minimize memory usage" button is
> explicitly clicked.

I bet you $5 Jesse's fuzzers already send memory pressure events.  (And if they don't, they should! :)
Comment 98 Jesse Ruderman 2012-11-15 15:49:26 PST
Yes, my DOM fuzzer randomly sends memory pressure notifications :)
Comment 99 Gabriele Svelto [:gsvelto] 2012-11-15 16:53:16 PST
(In reply to Jesse Ruderman from comment #98)
> Yes, my DOM fuzzer randomly sends memory pressure notifications :)

Pardon my previous skepticism, that's some serious code coverage! :) OK, then you want to turn it on, it should have exactly the same effect on any desktop version using mozjemalloc as on B2G.
Comment 100 Kartikaya Gupta (email:kats@mozilla.com) 2012-11-22 11:37:45 PST
:gsvelto, have you verified this is working as expected? The reason I ask is that on Android when I turned on the pref I noticed that it wasn't having any effect. After some debugging I think what's happening is that the memory.free_dirty_pages pref is getting read before the prefs are loaded, and so the pref is always read as false.

See backtraces at http://www.pastebin.mozilla.org/1951695
Comment 101 Kartikaya Gupta (email:kats@mozilla.com) 2012-11-22 11:39:02 PST
Oh.. it probably works on B2G because the default value of the pref is true (I think). Which means on B2G you'll have the opposite problem: you'll never be able to turn it off via setting the pref in about:config.
Comment 102 Kartikaya Gupta (email:kats@mozilla.com) 2012-11-22 14:21:10 PST
Filed bug 814517 for reading the pref before it is fully set. Turns out there are a lot of other prefs that are in the same boat, so it seems like it should be addressed at a lower layer.
Comment 103 Gabriele Svelto [:gsvelto] 2012-11-23 01:37:33 PST
(In reply to Kartikaya Gupta (:kats) from comment #102)
> Filed bug 814517 for reading the pref before it is fully set. Turns out
> there are a lot of other prefs that are in the same boat, so it seems like
> it should be addressed at a lower layer.

Thanks for the catch, since the default worked properly on B2G I didn't realize the preference wasn't working properly. I've CC'd the bug and I'll see what can be done on this patch to fix it.

Note You need to log in before you can comment on or make changes to this bug.