Closed Bug 670492 Opened 9 years ago Closed 8 years ago

Talos Regression :( Tp5 (RSS) increase 19.4% on MacOSX 10.5.8 Firefox

Categories

(Core :: Memory Allocator, defect)

x86
macOS
defect
Not set

RESOLVED WORKSFORME

People

(Reporter: paul.biggar, Unassigned)

Why does it get worse on 10.5 but no change on 10.6?
Paul, can you clarify this from bug 414946 comment 101?  
(We can take this into bug 414946 if you'd prefer.)

> - The OS doesn't take memory back immediately, so the "closed tab" metric might 
> be used, even though the OS will take back that memory if it's needed.

In what way does the OS take back memory if it's needed?  If the allocator has mapped memory, and if the application has written to it, causing it to be committed, the OS can't free that memory until the allocator unmaps it.  If on the other hand the memory was not committed, then the OS never allocated physical pages to the mapping, and there's therefore nothing to release.
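
A minimal sketch of the mapped-versus-committed distinction described above, assuming a POSIX system that provides mincore(). The helper names (map_pages, touch_pages, resident_pages) and the 256-page limit are made up for illustration and are not jemalloc code. It maps anonymous pages, lets you write to them, and asks the kernel how many are resident.

```c
#define _DEFAULT_SOURCE          /* glibc: expose mincore() and MAP_ANON */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

enum { MAX_PAGES = 256 };        /* arbitrary limit for this sketch */

static size_t page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/* Map npages of anonymous memory. Nothing has been written yet, so the
 * kernel normally has not assigned physical pages to the mapping. */
static void *map_pages(size_t npages)
{
    void *p = mmap(NULL, npages * page_size(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte into every page so that each page gets committed. */
static void touch_pages(void *addr, size_t npages)
{
    volatile unsigned char *p = addr;
    size_t ps = page_size();
    for (size_t i = 0; i < npages; i++)
        p[i * ps] = 1;
}

/* Count the pages the kernel reports as resident; -1 on error. */
static long resident_pages(void *addr, size_t npages)
{
    unsigned char vec[MAX_PAGES];
    if (npages > MAX_PAGES)
        return -1;
    /* The vector type differs between Linux and Mac, hence the cast. */
    if (mincore(addr, npages * page_size(), (void *)vec) != 0)
        return -1;
    long count = 0;
    for (size_t i = 0; i < npages; i++)
        if (vec[i] & 1)
            count++;
    return count;
}
```

Calling resident_pages() before and after touch_pages() on the same mapping should show the count rising to the full page count once every page has been written. The count before any write depends on the kernel, so treat it as informational only.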

As far as I know, jemalloc does not listen to any OS-level notifications indicating that it should unmap this memory -- in fact, I don't believe anything like that exists in Linux, Mac, or Windows.  (See bug 664291, where we're trying to hack such a notification into existence.)
(In reply to comment #2)
> Why does it get worse on 10.5 but no change on 10.6?

Is it possibly a 32- versus 64-bit issue?  I recall that the things we call "10.5 builds" are the 32-bit version of the universal binary.
(In reply to comment #3)
> Paul, can you clarify this from bug 414946 comment 101?  
> (We can take this into bug 414946 if you'd prefer.)
> 
> > - The OS doesn't take memory back immediately, so the "closed tab" metric might 
> > be used, even though the OS will take back that memory if it's needed.
> 
> In what way does the OS take back memory if it's needed?  If the allocator
> has mapped memory, and if the application has written to it, causing it to
> be committed, the OS can't free that memory until the allocator unmaps it. 
> If on the other hand the memory was not committed, then the OS never
> allocated physical pages to the mapping, and there's therefore nothing to
> release.

I'm talking about madvise, see experiments at https://bugzilla.mozilla.org/show_bug.cgi?id=414946#c84.

Actually, this might also explain why we have a 10.5 regression. I noted in that comment that madvise didn't work in 10.5 (according to other sources; I didn't test it myself).
(In reply to comment #5)
> I'm talking about madvise, see experiments at
> https://bugzilla.mozilla.org/show_bug.cgi?id=414946#c84.

To clarify: after Firefox frees memory, jemalloc calls madvise() to tell the OS that the memory is no longer needed. As I recall (I don't have this in my notes, so I may have to repeat this part of the experiment), between jemalloc calling madvise() and another process requesting memory, the madvised memory still counts towards Firefox's RSS.

Note that I'm by no means an expert in the eccentricities of particular OSes (hence experiments) so please correct me if this makes no sense.
Ah, I see.  In arena_purge, it does:


  #ifdef MALLOC_DECOMMIT
    pages_decommit((void *)((uintptr_t)
                   chunk + (i << pagesize_2pow)),
                   (npages << pagesize_2pow));
  #else
    madvise((void *)((uintptr_t)chunk + (i << pagesize_2pow)),
             (npages << pagesize_2pow),
             MADV_FREE);
  #endif

MALLOC_DECOMMIT is not defined on Mac, so we're making the madvise call.
pages_decommit does

    mmap(addr, size, PROT_NONE, MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0).

PROT_NONE means "you can't access these pages", and because MAP_FIXED replaces
the existing mapping with a fresh anonymous one, the old physical pages are
dropped while the address range stays reserved.
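
A minimal sketch of that decommit step, plus a matching recommit, assuming a POSIX system. Only the mmap flags in decommit_range come from the call quoted above. The function names and the recommit half are assumptions for illustration, not the jemalloc source.

```c
#define _DEFAULT_SOURCE          /* glibc: expose MAP_ANON */
#include <stddef.h>
#include <sys/mman.h>

/* Replace the range with a fresh, inaccessible anonymous mapping.
 * The old pages are discarded; the addresses stay reserved. */
static int decommit_range(void *addr, size_t size)
{
    void *r = mmap(addr, size, PROT_NONE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);
    return r == addr ? 0 : -1;
}

/* Make the range usable again. The pages come back zero-filled,
 * so nothing written before the decommit survives. */
static int recommit_range(void *addr, size_t size)
{
    void *r = mmap(addr, size, PROT_READ | PROT_WRITE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);
    return r == addr ? 0 : -1;
}
```

Both addr and size are expected to be page-aligned. Touching the range between the two calls faults, which is the point: the allocator has to recommit explicitly before handing the memory out again.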

Here's what my Mac says about MADV_FREE:

  Indicates that the application will not need the information contained in
  this address range, so the pages may be reused right away.  The address range
  will remain valid.

I'm not really sure what that means.  But the FreeBSD man page is apparently
much more helpful [1]:

  MADV_FREE Gives the VM system the freedom to free pages, and tells
  the system that information in the specified page range
  is no longer important. This is an efficient way of
  allowing malloc(3) to free pages anywhere in the address
  space, while keeping the address space valid. The next
  time that the page is referenced, the page might be
  demand zeroed, or might contain the data that was there
  before the MADV_FREE call. References made to that
  address space range will not make the VM system page the
  information back in from backing store until the page is
  modified again.

This sounds like exactly what you described, Paul.
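
A minimal sketch of what the FreeBSD text implies for a caller, assuming a POSIX system. The helper names are made up, and the MADV_DONTNEED fallback for systems without MADV_FREE is an assumption of this sketch, not something jemalloc does.

```c
#define _DEFAULT_SOURCE          /* glibc: expose madvise() and MAP_ANON */
#include <stddef.h>
#include <sys/mman.h>

/* Hint that the contents of the range are disposable. The mapping
 * itself stays valid; the kernel may reclaim the pages when it likes. */
static int advise_free(void *addr, size_t size)
{
#ifdef MADV_FREE
    return madvise(addr, size, MADV_FREE);
#else
    /* Assumption: fall back where MADV_FREE is not defined. */
    return madvise(addr, size, MADV_DONTNEED);
#endif
}

/* After the hint, a read must not fault, and the byte is either the
 * old value (page not reclaimed yet) or zero (page was reclaimed). */
static int byte_is_old_or_zero(const unsigned char *p, unsigned char old)
{
    unsigned char now = *(const volatile unsigned char *)p;
    return now == old || now == 0;
}
```

Until the kernel actually reclaims the pages, they can still be counted in the process's RSS, which would fit the behavior Paul describes.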

You could try building with MALLOC_DECOMMIT defined.  If there's no RSS
increase on the benchmark, then it's likely that this increase is not
significant.  If we could verify that the RSS eventually goes down, as soon as
X or Y happens, then I think we could leave this as-is.

[1] http://lkml.indiana.edu/hypermail/linux/kernel/0705.0/0087.html
As a simple, short-term fix to get jemalloc landed, I could turn jemalloc off on 32-bit.
I have a test program which helps figure out how to decommit, to test if we can in fact decommit on 10.5: https://bug414946.bugzilla.mozilla.org/attachment.cgi?id=478270

Anyone have a 10.5 machine?
I have two 10.5 machines in the QA lab on Floor 2.

(In reply to comment #9)
> I have a test program which helps figure out how to decommit, to test if we
> can in fact decommit on 10.5:
> https://bug414946.bugzilla.mozilla.org/attachment.cgi?id=478270
> 
> Anyone have a 10.5 machine?
OK, so we have a better solution for decommitting on Mac. billm suggested using mprotect(addr, size, PROT_NONE), and that works on 10.5, and also works better on 10.6 (in that the memory is immediately removed from RSS).
Actually, mprotect just lets the memory be swapped straight to disk, which lowers RSS but leaves private memory unchanged and causes more disk activity. I found that mmap works well. Since this is the same code that Linux/Android use for decommit, I turned on MALLOC_DECOMMIT on Darwin too.

This solves the RSS problem. The talos regression (http://groups.google.com/group/mozilla.dev.tree-management/browse_thread/thread/b922dda089a842e0/ec98f523ac79e454) said:

Previous: avg 176938400.000 
New     : avg 211186200.000
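
For reference, a small sketch of the arithmetic that relates these two averages to the percentage in the bug title. The function name is made up for illustration.

```c
/* Relative change from previous to current, in percent. */
static double percent_increase(double previous, double current)
{
    return (current - previous) / previous * 100.0;
}
```

percent_increase(176938400.0, 211186200.0) comes out at roughly 19.4, which matches the 19.4% increase in the title.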

TBPL on the latest changeset (http://tbpl.mozilla.org/?tree=Try&pusher=pbiggar@mozilla.com&rev=933061c83fd2) says:

tp5_rss: 165.9MB (details)

A manual test (opening tons of TechCrunch pages, then closing them) showed a significant drop in RSS (as reported by the OS X Activity Monitor), but I don't have the numbers to hand, so I'll report back tomorrow.
Glandium, it strikes me as possible that this isn't a true regression, but rather that 10.5 is less aggressive about removing madvise'd pages than 10.6.

I don't see any results in this bug, but the first test would be to run the code from comment 10 on a 10.5 machine and see whether any of the decommit strategies releases memory under memory pressure.  If so, maybe we have to change how we measure RSS on 10.5.
Closing this bug, since the regression is gone.  Work to get jemalloc working on 10.5 is in bug 693404.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME