Closed Bug 947072 Opened 8 years ago Closed 7 years ago

Distinguish anonymous mappings in SystemMemoryReporter

Categories

(Core :: XPCOM, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Keywords: perf, Whiteboard: [c=memory p= s=2014.01.17 u=] [MemShrink:P2])

Attachments

(1 file, 3 obsolete files)

(This bug was spun off from bug 945973.)

Anonymous mappings aren't distinguished by SystemMemoryReporter.  With judicious use of sparse file mappings, we can put them in some coarse buckets.
Whiteboard: [c=memory p= s= u=] [MemShrink] → [c=memory p= s= u=] [MemShrink:P2]
Comment on attachment 8345680 [details] [diff] [review]
On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

Review of attachment 8345680 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/Poison.cpp
@@ +115,4 @@
>  static void *
>  ReserveRegion(uintptr_t region, uintptr_t size)
>  {
> +  return MozTaggedAnonymousMmap(reinterpret_cast<void*>(region), size, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0);

You're not tagging, here, which, with the implementation, means you're crashing here.

::: mfbt/TaggedAnonymousMmap.cpp
@@ +13,5 @@
> +// On platforms other than Linux, just do a vanilla anonymous mmap.
> +int MozTaggedAnonymousMmap(void* addr, size_t length, int prot,
> +                           int flags, int tag, off_t offset);
> +{
> +  return mmap(addr, length, prot, flags, -1, offset);

s/-1/tag/

@@ +68,5 @@
> +        if (dir && stat(dir, &statbuf) != -1 && S_ISDIR(statbuf.st_mode))
> +          return dir;
> +      }
> +
> +      dir = ".";

it seems to me it would be better to fail, here.

@@ +95,5 @@
> +        hasFailed_ = true;
> +        return;
> +      }
> +
> +      unlink(path);

Note that 3.11 has O_TMPFILE, now, for future improvements. http://kernelnewbies.org/Linux_3.11#head-8be09d59438b31c2a724547838f234cb33c40357

@@ +123,5 @@
> +  if (tag < 0 || tag >= MOZ_TAG_LIMIT)
> +    MOZ_CRASH("bogus tag");
> +
> +  void* m;
> +  {

Why the brackets?
Attachment #8345680 - Flags: review?(mh+mozilla) → review+
> Note that 3.11 has O_TMPFILE, now, for future improvements.
> http://kernelnewbies.org/Linux_3.11#head-
> 8be09d59438b31c2a724547838f234cb33c40357

Nice!  But in the future, as you say -- e.g. my Keon is running 3.0.8-GP+.

> > +  void* m;
> > +  {
> 
> Why the brackets?

I initially tried using some kind of RAII mutex but that didn't work, and I forgot to remove the braces.  I will do so.
r?ing Waldo to check the MFBT changes for appropriate levels of MFBT-ness.
Attachment #8348500 - Flags: review?(jwalden+bmo)
Attachment #8345680 - Attachment is obsolete: true
This version fixes some Mac and Windows problems.

Waldo:  you only need to review the MFBT changes;  glandium's already given a
global r+.
Attachment #8349571 - Flags: review?(jwalden+bmo)
Attachment #8348500 - Attachment is obsolete: true
Attachment #8348500 - Flags: review?(jwalden+bmo)
Comment on attachment 8349571 [details] [diff] [review]
On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

Review of attachment 8349571 [details] [diff] [review]:
-----------------------------------------------------------------

It seems to me like it'd be good to produce an interface usable anywhere, not just on non-Windows, non-OS X systems.  But this is a step in the right direction, toward one thing to abstract away a whole mess of ifdefs everywhere, at least.

Documentation concerns aside (which I do think are important), the big thing here is that I think you're adhering a bit too much

::: js/src/assembler/jit/ExecutableAllocatorPosix.cpp
@@ +53,5 @@
> +        int fd = VM_TAG_FOR_EXECUTABLEALLOCATOR_MEMORY;
> +#else
> +        int fd = MOZ_TAG_FOR_JS_JIT_CODE;
> +#endif
> +        allocation = MozTaggedAnonymousMmap(NULL, n, INITIAL_PROTECTION_FLAGS, MAP_PRIVATE | MAP_ANON, fd, 0);

So, this is maybe convenient, to assume vmmap tags can be passed into MozTaggedAnonymousMap, but it seems like it clutter/makes the interface for this a bit more complex.  All things considered, it seems slightly better to me to have the platform-darwin case use mmap directly, and have the other case use MozTaggedAnonymousMmap.  Further refinements to the interface being added here should be able to reunify the two cases, so that custom vmmap tags can be handled in the MozTaggedAnonymousMmap code as well.  (Perhaps by throwing the vm tags into the higher-order tag macro I ask for later, then having an OS X version of the method that calls mmap using the associated VM tag number.  Or something.  But this seems followup-land, and this one instance should just fork into two for now.

::: mfbt/TaggedAnonymousMmap.cpp
@@ +15,5 @@
> +#ifndef __linux__
> +
> +// On platforms other than Linux, just do a vanilla anonymous mmap.
> +void* MozTaggedAnonymousMmap(void* addr, size_t length, int prot,
> +                             int flags, int fd, off_t offset)

void*
MozTaggedAnonymousMmap(...)

for methods defined at top level.

@@ +34,5 @@
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#define TAG_FILENAME_PREFIX   "moz-tagged-mmap-file-"
> +#define TAG_FILENAME_SUFFIX   ".XXXXXX"

static const char blah[] = ...;

@@ +41,5 @@
> +// This means zero-initializing all its fields must result in a valid state.
> +class Tag
> +{
> +  bool   hasRun_;
> +  bool   hasFailed_;

Slight preference for initialized_ and maybe hadError_ as the names, but it doesn't really matter.

@@ +43,5 @@
> +{
> +  bool   hasRun_;
> +  bool   hasFailed_;
> +  int    fd_;
> +  size_t maxSize_;

Order these size_t, then int, then bool, then bool to minimize overall size.  Not that it matters here, but good habits by default pay off in places where it does matter.

@@ +46,5 @@
> +  int    fd_;
> +  size_t maxSize_;
> +
> +public:
> +  int flags() {

Indentation of class bodies is as SpiderMonkey does it, with the function bodies two-space indented.  So:

class Tag
{
    bool hasRun_;

  public:
    int flags() {
      MOZ_ASSERT(hasRun_);
      ...
    }

@@ +91,5 @@
> +      static const size_t pathlen = 512;
> +      char path[pathlen];
> +      snprintf(path, pathlen,
> +               "%s/" TAG_FILENAME_PREFIX "%d" TAG_FILENAME_SUFFIX,
> +               getDir(), tag);

For safety, seems like you should do

  int len = snprintf(...);
  if (len < 0 || len >= pathlen) {
    hasFailed_ = true;
    return;
  }

to properly handle large paths.

@@ +115,5 @@
> +};
> +
> +static Tag tags[MOZ_TAG_LIMIT];
> +
> +void* MozTaggedAnonymousMmap(void* addr, size_t length, int prot,

void*
MozTaggedBlahBlah(...)

as before.

@@ +121,5 @@
> +{
> +  static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
> +
> +  if (flags != (MAP_PRIVATE | MAP_ANONYMOUS))
> +    MOZ_CRASH("this is not a vanilla anonymous mmap");

This is a precondition for any call to be correct.  Thus it's better written as

  MOZ_ASSERT(flags == (MAP_PRIVATE | MAP_ANONYMOUS),
             "must be a vanilla anonymous mmap");

Same for the tag-in-range check below.

Or, why even request |flags| as an argument, if it always has to be one exact value?  Seems better to just kill it off entirely so nobody can get it wrong.  There doesn't seem that much value in emulating mmap's interface to me, not when that interface is a Swiss army knife in the worst Unix tradition capable of way too much stuff.

@@ +128,5 @@
> +    MOZ_CRASH("bogus tag");
> +
> +  pthread_mutex_lock(&mutex);
> +
> +  tags[tag].update(length + offset, tag);

Maybe I'm just misreading mmap docs.  But isn't |length| the size of the produced mapping?  |offset| has to be page-aligned, but it doesn't have any effect on the size of the mapping, as far as I can tell.

But wait.  |offset| is really only meaningful if you want to view some subset of pages in a file.  But here, we're really effectively not reading a file at all.  So every user more or less should be passing 0.  And indeed they all are, at a quick skim.  So why have |offset| in the interface at all?  That makes two arguments that should disappear from this method, then.

@@ +166,5 @@
> +    "js-yarr",
> +    "asm-js-code",
> +    "asm-js-reserved",
> +    "poison",
> +    "sps-prof-entrys"

entrys and not entries?

::: mfbt/TaggedAnonymousMmap.h
@@ +4,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +// On Linux, anonymous mappings are truly anonymous, which stymies OS-level
> +// memory profiling.  This module implements a scheme for coarsely tagging

"anonymous" I had to look up, but that's not too surprising, as I've mostly gained mmap knowledge by watching patchwork at a distance.  But "truly anonymous" just doesn't mean anything to me.  :-)  So how about something slightly more expository, that doesn't descend too deeply into platform-specific terminology, but does explain what's done.  Something like this, maybe?

"""
OS-level memory profilers such as /proc/<pid>/mmaps on Linux and vmmaps on OS X expose a how a process maps its address space, at the level of individual address range mappings.  Such profilers expose the name of the backing file, for file mappings.  But they can't always expose similar information for anonymous mappings: OS X permits them to be categorized, while Linux does not.

This module provides an interface for non-Windows use that works around this inconsistency on Linux, creating anonymous mappings assigned to one of a predefined set of different uses.  (Non-Linux simply falls back to plain old POSIX anonymous mmaps -- no identification, but equivalent mapping functionality.)  To achieve this...[rest of your comments continue as in the patch]
"""

@@ +17,5 @@
> +// immediately after creation, so that they will disappear as soon as they are
> +// no longer needed.  And they are extended as necessary with ftruncate().
> +// Because they are mapped privately, they are never written back to disk,
> +// which means they are sparse and do not take up any actual disk space for
> +// their contents.

Maybe add " (at least, if the OS is being smart)" as this seems unlikely to be part of the interface guarantee, just an optimization we're assuming will usually occur.

@@ +44,5 @@
> +#else
> +#define T(n)  -1
> +#endif
> +
> +// Keep this in sync with mozilla::TagNameFor()!

Rather than this, please use

#define MOZ_FOR_ANONYMOUS_MMAP_TAGS(macro) \
  macro(MOZ_TAG_FOR_JEMALLOC_HEAP, 0, "jemalloc-heap") \
  macro(MOZ_TAG_FOR_JS_GC_HEAP, 1, "js-gc-heap") \
  ...

and then use

enum {
#ifdef __linux__
#  define MOZ_TAG(n) n
#else
#  define MOZ_TAG(n) -1
#endif
#define MOZ_DEFINE_ENUM_VALUE(init, num, name) init = MOZ_TAG(n),
  MOZ_FOR_ANONYMOUS_MMAP_TAGS(MOZ_DEFINE_ENUM_VALUE)
#undef MOZ_DEFINE_ENUM_VALUE
#undef MOZ_TAG

  MOZ_TAG_LIMIT
};

here, and then use

static const char tagNames[] = {
#define TAGNAME(init, num, name) name,
  MOZ_FOR_ANONYMOUS_MMAP_TAGS(TAGNAME)
#undef TAGNAME
};

in the .cpp file.

@@ +65,5 @@
> +extern "C"
> +#endif
> +MFBT_API void*
> +MozTaggedAnonymousMmap(void* addr, size_t length, int prot,
> +                       int flags, int tag, off_t offset);

I don't think it's valuable for this to have exactly mmap's interface.  Not when the flags have to be exactly one value, not when the offset has no valid reason not to always be 0, not when the tag is not the file descriptor it is in the mmap interface.  Please add a /** stuff */ comment by this that spells out exactly what the interface does and the exact meaning of every parameter (mentioning all of the PROT_* constants for the |prot| argument with some sort of brief description of what they do, so that the user doesn't have to refer to mmap documentation to figure that out).  As it is now you'd have to dig all this information out of the overview comment at the top, then you'd have to track down mmap documentation to figure out what to pass, and it all gets way harder than just reading clear, concise, coder-oriented non-high-level descriptions.

@@ +77,5 @@
> +// tag value.  Otherwise, returns MOZ_TAG_LIMIT.
> +MFBT_API int
> +TagOfAnonymousMmap(const char* filename);
> +
> +MFBT_API const char*

Might be the default, but let's smack an |extern| before both of these to keep things simpler for readers.
Attachment #8349571 - Flags: review?(jwalden+bmo) → feedback+
Oh, and regarding "better to fail" for the "." case, that was about what I thought on first read, too, then I forgot and/or decided it probably wouldn't matter, or something.  If that can be a failure case, that sounds better to me.
> Indentation of class bodies is as SpiderMonkey does it, with the function
> bodies two-space indented.  So:
> 
> class Tag
> {
>     bool hasRun_;
> 
>   public:
>     int flags() {
>       MOZ_ASSERT(hasRun_);
>       ...
>     }

You wouldn't believe how much I wish you'd just gone with standard Gecko style
for MFBT.  You wouldn't have to spend time writing mfbt/STYLE.  And we
wouldn't have to have this discussion.
I've been getting crashes on try for Android, but not locally. I just pushed a version that disables the tagging for JIT code, and it fixes the problem. I'm guessing it's some kind of execution permissions problem.
glandium suggested a lack of insufficient icache flushing might be to blame, and mjrosenb mentioned a kernel bug that causes some flushings to not occur when files get mapped next to each other. I'm trying a diagnosis patch on try right now.
> mjrosenb mentioned a kernel bug that causes some flushings to not occur
> when files get mapped next to each other.

That appears to be the issue! mjrosenb's patch which works around this bug caused these to go green. So I think I'll disable the tagging for JIT code on Android, and add a comment.
> > +  tags[tag].update(length + offset, tag);
> 
> Maybe I'm just misreading mmap docs.  But isn't |length| the size of the
> produced mapping?  |offset| has to be page-aligned, but it doesn't have any
> effect on the size of the mapping, as far as I can tell.

True! Good catch.

> > +    "sps-prof-entrys"
> 
> entrys and not entries?

Yeah... the corresponding type is called |ProfileEntrys| :/
 
> > +MFBT_API void*
> > +MozTaggedAnonymousMmap(void* addr, size_t length, int prot,
> > +                       int flags, int tag, off_t offset);
> 
> I don't think it's valuable for this to have exactly mmap's interface.

There's an additional constraint, which is that we want in the future to be able to drop this into jemalloc3, replacing mmap() calls, via a macro. So it needs to have the same interface as mmap(), and if the args aren't appropriate (e.g. |flags| is not ANONYMOUS|PRIVATE, it should fall back to vanilla mmap(). I'll make this change and document it more clearly.
Updated patch, incorporates changes based on Waldo's feedback.

glandium, can you check over the following changes? Thanks.

- In jemalloc.c, I've defined the |mmap| macro to always call
  MozTaggedAnonymousMmap().

- In MozTaggedAnonymousMmap(), I now call the mmap() syscall directly rather
  than the mmap() libc function. The code is copied from mozjemalloc.

- In MozTaggedAnonymousMmap(), if the |flags| or |offset| isn't appropriate, I
  fall back to vanilla mmap(). Previously it aborted.
Attachment #8357611 - Flags: review?(mh+mozilla)
Attachment #8349571 - Attachment is obsolete: true
Comment on attachment 8357611 [details] [diff] [review]
On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

jld, would you mind trying this patch on B2G? If it's working, we should see "jemalloc-heap" and "js-gc-heap" entries within the "mem" and "processes" trees in the "System" section of about:memory, and not much in the way of "untagged". If you could attach the memory report file, that'd be great. Thanks!
Attachment #8357611 - Flags: feedback?(jld)
Comment on attachment 8357611 [details] [diff] [review]
On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

Review of attachment 8357611 [details] [diff] [review]:
-----------------------------------------------------------------

Program received signal SIGSYS, Bad system call.
ftruncate () at bionic/libc/arch-arm/syscalls/ftruncate.S:10

(gdb) bt
#0  ftruncate () at bionic/libc/arch-arm/syscalls/ftruncate.S:10
#1  0x401043d8 in Tag::update (addr=0x0, length=4096, prot=3, flags=<value optimized out>, tag=3, 
    offset=0) at /home/jld/src/B2G/gecko/mfbt/TaggedAnonymousMmap.cpp:150
#2  MozTaggedAnonymousMmap (addr=0x0, length=4096, prot=3, flags=<value optimized out>, tag=3, 
    offset=0) at /home/jld/src/B2G/gecko/mfbt/TaggedAnonymousMmap.cpp:176
#3  0x416a771e in WTF::OSAllocator::reserveAndCommit (bytes=4096, usage=<value optimized out>, 
    writable=<value optimized out>, executable=<value optimized out>)
    at /home/jld/src/B2G/gecko/js/src/yarr/OSAllocatorPosix.cpp:91

Would it be feasible to just create/open these files once, during startup, before the sandbox is enabled?
Attachment #8357611 - Flags: feedback?(jld) → feedback-
> Program received signal SIGSYS, Bad system call.
> ftruncate () at bionic/libc/arch-arm/syscalls/ftruncate.S:10
> 
> Would it be feasible to just create/open these files once, during startup,
> before the sandbox is enabled?

There are (currently) up to eight files, one for each tag (js-gc-heap, jemalloc-heap, js-jit-code, etc). Each of these files is created as needed. In practice, this is typcially very close to the start.

But the ftruncate calls happen at unpredictable times -- any time we call mmap() with a particular tag, and its length is larger than any previous map() call with that tag. I guess we could instead just create a really huge file (e.g. 1GB) immediately for all the tags, and hope that we never need anything bigger than that. Flaky, but probably doable... but jld pointed out on IRC that ftruncate() doesn't allow anything that write() and seek(), so it could potentially be whitelisted.

But that still leaves the open() calls. I can imagine ways to get the parent process to do that, but not easily. And yesterday I was already wondering if this bug's complexity outweighed its benefit -- it is quite invasive...
Attachment #8357611 - Flags: review?(mh+mozilla)
I'm giving up on this.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Whiteboard: [c=memory p= s= u=] [MemShrink:P2] → [c=memory p= s=2014.01.17 u=] [MemShrink:P2]
Possibly of interest: https://lkml.org/lkml/2013/10/14/789

The Android people had the same problem, and they made a kernel patch to allow naming truly anonymous memory.  It looks like the patch isn't in Linux upstream (and since it's been the better part of a year, may never be), but it is present in at least some Android kernels (e.g., Nexus 4 KitKat) and we could use it on B2G.
> Possibly of interest: https://lkml.org/lkml/2013/10/14/789

Definitely! I just tried it but it didn't seem to work. (I had to define PR_SET_VMA and PR_SET_VMA_ANON_NAME by copying the values from the patch.)

Is the current b2g kernel supposed to support this, or would we have to get a different one? On my Buri I have this:

> 127|root@android:/ # cat /proc/version
> Linux version 3.0.21-perf-gc4bd9d9-00403-g6110374-dirty (tclxa@aclgcl-ubnt) (gcc version 4.4.3 (GCC) ) #3 SMP PREEMPT Fri Jan 3 17:14:48 CST 2014
(In reply to Nicholas Nethercote [:njn] from comment #19)
> Is the current b2g kernel supposed to support this, or would we have to get
> a different one? On my Buri I have this:
> 
> > 127|root@android:/ # cat /proc/version
> > Linux version 3.0.21-perf-gc4bd9d9-00403-g6110374-dirty (tclxa@aclgcl-ubnt) (gcc version 4.4.3 (GCC) ) #3 SMP PREEMPT Fri Jan 3 17:14:48 CST 2014

On hamachi/buri you're out of luck, unless someone manages to reverse-engineer how to rebuild the kernel and boot image such that it won't be rejected by the bootloader on images newer than FxOS 1.1 or so.

Devices that already have the patch: nexus-4-kk, nexus-5, dolphin.

Devices where I know we can rebuild the kernels and the merge conflicts don't look too bad: emulator-kk/emulator-x86-kk, flame (pending bug 1004195), keon, peak.
I filed bug 1011350 for the prctl() approach.
You need to log in before you can comment on or make changes to this bug.