Tool to capture and reproduce Firefox's memory allocations

RESOLVED FIXED in mozilla36

Status

()

defect
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: glandium, Assigned: glandium)

Tracking

(Depends on 1 bug)

unspecified
mozilla36
All
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [MemShrink])

Attachments

(2 attachments, 5 obsolete attachments)

Quicker than running AWSY and checking how memory usage evolves with different allocators with different parameters is to use a replace-malloc library to capture allocations during one AWSY run, and to use that data with a program that can replay it against any allocator.

I happen to have written something like this at the time of bug 762448. I'll refresh it and put it in tree. Even if the code I have is essentially Linux-only, it's still valuable.
Whiteboard: [MemShrink]
Depends on: 1084210
It appears to be an unnecessary optimization, since the compiler is still inlining
the functions when they're not marked static. OTOH, following patches will require
the _impl functions not to be static.
Attachment #8509352 - Flags: review?(n.nethercote)
njn for the code, mshal for build system.
Attachment #8509353 - Flags: review?(n.nethercote)
Attachment #8509353 - Flags: review?(mshal)
Attachment #8509356 - Flags: review?(n.nethercote)
Attachment #8509356 - Flags: review?(mshal)
Attachment #8509357 - Flags: review?(n.nethercote)
Attachment #8509358 - Flags: review?(n.nethercote)
Posted patch part 5 - Some documentation (obsolete) — Splinter Review
Attachment #8509359 - Flags: review?(n.nethercote)
Note this actually currently doesn't work on Windows because of a subtle conflict with the PoisonIOInterposer, which redirects WriteFile, which write() uses, and in the replacement WriteFile, it uses PR_NewLock, which, guess what... calls calloc. Once past that problem, there's also a shutdown crash related to the PoisonIOInterposer.
Depends on: 1087245
Comment on attachment 8509352 [details] [diff] [review]
part 0 - Don't set MOZ_MEMORY_API to static for OSX in replace_malloc.c

Review of attachment 8509352 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me
Attachment #8509352 - Flags: review?(n.nethercote) → review+
Attachment #8509353 - Flags: review?(mshal) → review+
Comment on attachment 8509356 [details] [diff] [review]
part 2 - Tool to replay munged allocations logs

>+# Keep this in sync with mozglue/build/Makefile.in
>+ifeq (Darwin_1,$(OS_TARGET)_$(MOZ_REPLACE_MALLOC))
>+OS_LDFLAGS += \
>+  -Wl,-U,_replace_init \
>+  -Wl,-U,_replace_malloc \
>+  -Wl,-U,_replace_posix_memalign \
>+  -Wl,-U,_replace_aligned_alloc \
>+  -Wl,-U,_replace_calloc \
>+  -Wl,-U,_replace_realloc \
>+  -Wl,-U,_replace_free \
>+  -Wl,-U,_replace_memalign \
>+  -Wl,-U,_replace_valloc \
>+  -Wl,-U,_replace_malloc_usable_size \
>+  -Wl,-U,_replace_malloc_good_size \
>+  -Wl,-U,_replace_jemalloc_stats \
>+  -Wl,-U,_replace_jemalloc_purge_freed_pages \
>+  -Wl,-U,_replace_jemalloc_free_dirty_pages \
>+  $(NULL)
>+
>+ifneq ($(MOZ_REPLACE_MALLOC_LINKAGE),compiler support)
>+OS_LDFLAGS += -flat_namespace
>+endif
>+ifeq ($(MOZ_REPLACE_MALLOC_LINKAGE),dummy library)
>+OS_LDFLAGS += -Wl,-weak_library,$(DEPTH)/memory/replace/dummy/$(DLL_PREFIX)replace_malloc$(DLL_SUFFIX)
>+endif
>+endif

Can we just pull this chunk out into a helper file that gets included by both Makefile.ins?
Attachment #8509356 - Flags: review?(mshal) → review+
Comment on attachment 8509359 [details] [diff] [review]
part 5 - Some documentation

Review of attachment 8509359 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/replace/logalloc/README
@@ +1,4 @@
> +Logalloc is a replace-malloc library for Firefox (see
> +memory/build/replace_malloc.h) that dumps a log of memory allocations to a
> +given file descriptor or file name. That log can then be replayed against
> +Firefox's default memory allocator independently or, through another

Nit: Remove ',' after 'or'.

@@ +35,5 @@
> +  MALLOC_LOG_FD=3 firefox 3>&1 1>&2 | gzip -c > log.gz
> +
> +(3>&1 copies the `| gzip` pipe file descriptor to file descriptor #3, 1>&2
> +then copies stderr to stdout. This leads to: fd1 and fd2 sending to stderr
> +of the parent process (the shell), and fd3 sending to gzip)

Nit: '.' after final 'gzip'.

@@ +46,5 @@
> +takes the raw log on its stdin, and outputs the preprocessed log on its
> +stdout.
> +
> +The logalloc-replay tool then takes the preprocessed log on its stdin and
> +replays the allocations printed there.

I'm reviewing this patch first, so the answer may become clear later, but: could the munging step be done automatically by the logalloc-replay tool?
Attachment #8509359 - Flags: review?(n.nethercote) → review+
Comment on attachment 8509358 [details] [diff] [review]
part 4 - Add a small testcase for alloc log/replay

Review of attachment 8509358 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: memory/replace/logalloc/replay/Makefile.in
@@ +45,5 @@
> +LOGALLOC = LD_PRELOAD=$(CURDIR)/../$(DLL_PREFIX)logalloc$(DLL_SUFFIX)
> +endif
> +endif
> +check:: $(srcdir)/replay.log
> +	MALLOC_LOG_FD=1 $(LOGALLOC) ./$(PROGRAM) < $< | $(PYTHON) $(srcdir)/logalloc_munge.py | diff -w - $<

Do you really want to add new make targets at this point?
Attachment #8509358 - Flags: review?(n.nethercote) → review+
Comment on attachment 8509357 [details] [diff] [review]
part 3 - Script to munge allocations log for use with replay

Review of attachment 8509357 [details] [diff] [review]:
-----------------------------------------------------------------

I'm giving f+ because I'd like to see this again with the loop at the bottom rewritten as I've suggested.

::: memory/replace/logalloc/replay/logalloc_munge.py
@@ +3,5 @@
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +"""
> +This script takes a log from the replace-malloc logalloc library on stdin
> +and munges it so that it can be used with the logalloc-replay tool.

Very brief example input and output would be very helpful here.

@@ +14,5 @@
> +
> +class IdMapping(object):
> +    """Class to map values to ids.
> +
> +    Each value is associated to an increasing id, starting from 1.

Whitespace at end of line?

@@ +48,5 @@
> +def ignored(full_cmd, reason=''):
> +    if reason:
> +        print('Ignored %s: %s' % (' '.join(full_cmd), reason), file=sys.stderr)
> +    else:
> +        print('Ignored %s' % (' '.join(full_cmd)), file=sys.stderr)

The |reason| parameter is always non-empty, so this can be simplified.

@@ +63,5 @@
> +        cmd = l[1]
> +        if cmd == 'jemalloc_stats':
> +            # jemalloc_stats is the only function without any args
> +            pass
> +        else:

I'd like you to rewrite this so that every |cmd| had its own case, without any code sharing between cases. There will be some code duplication, but...
- The resulting code will be *much* easier to read.
- And more like the equivalent code in Replay.cc.
- And the case ordering will be more logical.
- And it would allow you to put the pointer at the end.
Attachment #8509357 - Flags: review?(n.nethercote) → feedback+
Comment on attachment 8509353 [details] [diff] [review]
part 1 - replace-malloc library to log Firefox allocations

Review of attachment 8509353 [details] [diff] [review]:
-----------------------------------------------------------------

Lots of nits, but basically good. r=me.

I take it you are only of those people who prefer /* */ comments over // comments? Oh well...

::: memory/replace/logalloc/FdPrintf.cpp
@@ +1,3 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

Please add mode lines to all new files, as per https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Mode_Line

@@ +9,5 @@
> +#else
> +#include <unistd.h>
> +#endif
> +
> +void FdPrintf(int aFd, const char *aFormat, ...)

Return type on a separate line.

@@ +15,5 @@
> +  if (aFd == -1) {
> +    return;
> +  }
> +  char buf[256];
> +  char *d = buf;

Nit: I found |d| an odd name for this. I would use |p|. Or maybe |b| to mirror |f|.

@@ +19,5 @@
> +  char *d = buf;
> +  const char *f;
> +  va_list ap;
> +  va_start(ap, aFormat);
> +  f = aFormat;

Nit: assign this in the declaration?

@@ +29,5 @@
> +      f++;
> +      switch (*f) {
> +      case 'd': {
> +        size_t i = va_arg(ap, size_t);
> +        size_t x = 10;

Here I would write a comment like "Compute the number of digits"...

@@ +35,5 @@
> +        while (x <= i) {
> +          x *= 10;
> +          n++;
> +        }
> +        d += n;

... and then here I would write something like "Write the digits into the buffer".

@@ +48,5 @@
> +        intptr_t ptr = va_arg(ap, intptr_t);
> +        *(d++) = '0';
> +        *(d++) = 'x';
> +        int x = sizeof(intptr_t) * 8;
> +        int n = 0;

|n| could be a boolean.

@@ +52,5 @@
> +        int n = 0;
> +        do {
> +          x -= 4;
> +          if ((ptr >> x & 0xf) || n) {
> +            *(d++) = "0123456789abcdef"[ptr >> x & 0xf];

Factor out |ptr >> x & 0xf|.

@@ +57,5 @@
> +            n++;
> +          }
> +        } while (x > 0);
> +        if (n == 0)
> +          *(d++) = '0';

Brace single-line if blocks.

@@ +60,5 @@
> +        if (n == 0)
> +          *(d++) = '0';
> +        break;
> +      }
> +      }

As annoying as it is, Moz coding style requires |case| statements indented from the parent |switch|.

@@ +63,5 @@
> +      }
> +      }
> +      break;
> +    default:
> +      *(d++) = *f;

Missing |break|. Doesn't affect the behaviour, but please add it for consistency.

::: memory/replace/logalloc/FdPrintf.h
@@ +7,5 @@
> +
> +/* We can't use libc's (f)printf because it would reenter in replace_malloc,
> + * So use a custom and simplified version.
> + * Only %p and %d are supported. %d assumes input data being size_t.
> + * /!\ This function used a fixed-size internal buffer with no overflow check

Nit: missing '.' at end of sentence.

And please make this call MOZ_CRASH() if the buffer overflows. I can only see two places in FdPrintf.cpp that would need checks (the two |f++| statements).

::: memory/replace/logalloc/LogAlloc.cpp
@@ +19,5 @@
> +#include "mozilla/NullPtr.h"
> +
> +/* Avoid Lock code depending on mozilla::Logger */
> +#undef DEBUG
> +#include "base/lock.h"

This comment is too terse. I looked in base/lock.h and I don't understand how DEBUG and Logger relate to it.

@@ +26,5 @@
> +
> +const malloc_table_t *sFuncs = nullptr;
> +int sFd = -1;
> +
> +Lock sLock;

Can all three of these be |static|? If not, a comment explaining why would be helpful.

@@ +28,5 @@
> +int sFd = -1;
> +
> +Lock sLock;
> +
> +void prefork() {

Return type on its own line. Opening brace on its own line. Here and all the functions below.

@@ +40,5 @@
> +#ifdef _WIN32
> +DWORD getpid() {
> +  return GetCurrentProcessId();
> +}
> +#endif

I've also seen this done like so:

> #ifdef XP_WIN
> #include <process.h>
> #define getpid _getpid
> #endif

I don't have opinions on which is better, but the _getpid one does appear to be common.

@@ +60,5 @@
> +  sFuncs = aTable;
> +
> +#ifndef _WIN32
> +  pthread_atfork(prefork, postfork, postfork);
> +#endif

I admit I don't understand this pthread_atfork call at all -- why it's necessary, and why it's not used on Windows. Please add a comment!

@@ +86,5 @@
> +}
> +
> +/* Do a simple, text-form, log of all calls to replace-malloc functions.
> + * Use locking to guarantee that an allocation that did happen is logged
> + * before any other allocation/free happens.

Please add a brief specification of the format to the README file.

@@ +96,5 @@
> +void *replace_malloc(size_t aSize)
> +{
> +  AutoLock lock(sLock);
> +  void *ptr = sFuncs->malloc(aSize);
> +  FdPrintf(sFd, "%d malloc %p %d\n", size_t(getpid()), ptr, aSize, 0);

What's the 0 for? Looks like it will be ignored. (Can we put the special annotation on FdPrintf so it will get format string warnings?)

For all of these I would put |ptr| last, since it conceptually comes after the arguments, but I'm not too fussed either way. (Oh, I see that having it first makes the munging script simpler. So that would be a good thing to mention in the format description added to the README file... but if you change the structure of that script like I suggest, having it at the end would be no problem.)

@@ +105,5 @@
> +{
> +  AutoLock lock(sLock);
> +  int ret = sFuncs->posix_memalign(aPtr, aAlignment, aSize);
> +  FdPrintf(sFd, "%d posix_memalign %p %d %d\n", size_t(getpid()), *aPtr,
> +           aAlignment, aSize);

Is posix_memalign guaranteed to fill in *aPtr? I looked at the man page but it was unclear.

::: memory/replace/moz.build
@@ +13,5 @@
>  
>  if CONFIG['MOZ_DMD']:
>      DIRS += ['dmd']
> +
> +DIRS += ['logalloc']

Is it better to put the non-conditional DIRS additions before the conditional ones?
Attachment #8509353 - Flags: review?(n.nethercote)
Attachment #8509353 - Flags: review?(mshal)
Attachment #8509353 - Flags: review+
(In reply to Nicholas Nethercote [:njn] from comment #10)
> I'm reviewing this patch first, so the answer may become clear later, but:
> could the munging step be done automatically by the logalloc-replay tool?

logalloc-replay tries to avoid doing any allocation. Doing the munging step there would require a lot of them. The munging is also time consuming (although that could probably be optimized) and is better done once than every time you run the same log.

(In reply to Nicholas Nethercote [:njn] from comment #11)
> Comment on attachment 8509358 [details] [diff] [review]
> part 4 - Add a small testcase for alloc log/replay
> 
> Review of attachment 8509358 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me
> 
> ::: memory/replace/logalloc/replay/Makefile.in
> @@ +45,5 @@
> > +LOGALLOC = LD_PRELOAD=$(CURDIR)/../$(DLL_PREFIX)logalloc$(DLL_SUFFIX)
> > +endif
> > +endif
> > +check:: $(srcdir)/replay.log
> > +	MALLOC_LOG_FD=1 $(LOGALLOC) ./$(PROGRAM) < $< | $(PYTHON) $(srcdir)/logalloc_munge.py | diff -w - $<
> 
> Do you really want to add new make targets at this point?

I'm not sure what you went through with DMD is really what we should strive for for those "simple" tests, and I'd rather address this when dealing with the rest of make check than now. Tell me if you disagree.
Note that in the DMD case, you did start with the requirement of running gecko, and slowly evolved into a separate independent test program, which you now drive from an xpcshell test, so the story is different.
(In reply to Nicholas Nethercote [:njn] from comment #13)
> And please make this call MOZ_CRASH() if the buffer overflows. I can only
> see two places in FdPrintf.cpp that would need checks (the two |f++|
> statements).

You mean the places incrementing d. Which there are more than two.

> @@ +26,5 @@
> > +
> > +const malloc_table_t *sFuncs = nullptr;
> > +int sFd = -1;
> > +
> > +Lock sLock;
> 
> Can all three of these be |static|? If not, a comment explaining why would
> be helpful.

Anonymous namespace makes them static. Does idiomatic C++ need commenting?

> @@ +96,5 @@
> > +void *replace_malloc(size_t aSize)
> > +{
> > +  AutoLock lock(sLock);
> > +  void *ptr = sFuncs->malloc(aSize);
> > +  FdPrintf(sFd, "%d malloc %p %d\n", size_t(getpid()), ptr, aSize, 0);
> 
> What's the 0 for?

No idea where it comes from.

> Looks like it will be ignored. (Can we put the special
> annotation on FdPrintf so it will get format string warnings?)

I was about to say that means having to change all the format strings to use %d or %ld depending on the size of size_t... but I realized that could be changed to %z... hoping that all compilers that handle those annotations now about %z...
> logalloc-replay tries to avoid doing any allocation. Doing the munging step
> there would require a lot of them. The munging is also time consuming
> (although that could probably be optimized) and is better done once than
> every time you run the same log.

Fair enough. This would be good to mention in the README.
 
> > Do you really want to add new make targets at this point?
> 
> I'd rather address this when dealing with the rest of make check than now.

Ok.
> You mean the places incrementing d. Which there are more than two.

Which will make it a more interesting challenge, then :)

> Anonymous namespace makes them static. Does idiomatic C++ need commenting?

Bleh. I would remove the anonymous namespace and just put static on everything because it's more explicit, but I know opinions differ on this topic...

> > What's the 0 for?
> 
> No idea where it comes from.

Probably best to remove it then.

> I was about to say that means having to change all the format strings to use
> %d or %ld depending on the size of size_t... but I realized that could be
> changed to %z... hoping that all compilers that handle those annotations now
> about %z...

Or import mozilla/IntegerPrintfMacros.h and use |PRIdPTR|.
(In reply to Nicholas Nethercote [:njn] from comment #17)
> Or import mozilla/IntegerPrintfMacros.h and use |PRIdPTR|.

Would make the printf function more complex. %z is probably fine, though.
Comment on attachment 8509356 [details] [diff] [review]
part 2 - Tool to replay munged allocations logs

Review of attachment 8509356 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/replace/logalloc/replay/Replay.cpp
@@ +1,3 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

Modelines again, please.

@@ +14,5 @@
> + * See memory/build/replace_malloc.c */
> +#define MALLOC_DECL(name, return_type, ...) \
> +  return_type name ## _impl(__VA_ARGS__);
> +#define MALLOC_FUNCS MALLOC_FUNCS_MALLOC
> +#include "malloc_decls.h"

malloc_decls.h undefs MALLOC_DECL? Ugh... I had to look inside it to understand how you weren't getting complaints about redefined macros here.

@@ +31,5 @@
> +#ifdef ANDROID
> +/* mozjemalloc uses MozTagAnonymousMemory, which doesn't have an inline
> + * implementation on Android */
> +void
> +MozTagAnonymousMemory(const void* aPtr, size_t aLength, const char* aTag) {}

This is unfortunate. Can it be fixed in TaggedAnonymousMemory.h instead?

@@ +37,5 @@
> +/* mozjemalloc and jemalloc use pthread_atfork, which Android doesn't have.
> + * While gecko has one in libmozglue, the replay program can't use that.
> + * Since we're not going to fork anyways, make it a dummy function. */
> +int
> +pthread_atfork(void (*prepare)(void), void (*parent)(void), void (*child)(void))

Many functions in this file don't have |aFoo| arguments. They should.

@@ +49,5 @@
> +
> +/* NUWA builds have jemalloc built with
> + * -Dpthread_mutex_lock=__real_pthread_mutex_lock */
> +int
> +__real_pthread_mutex_lock(pthread_mutex_t *mutex)

I just realized that all your patches use |T *| and |T &|. Mozilla style is to use |T*| and |T&|. Please fix.

@@ +60,5 @@
> +
> +size_t parseNumber(Buffer buf)
> +{
> +  if (!buf)
> +    die("Malformed input");

Braces. I won't mention it again, but there are more cases below.

@@ +77,5 @@
> +  mOps++;
> +  size_t size = parseNumber(args);
> +  slot.mPtr = ::malloc_impl(size);
> +  slot.mSize = size;
> +  /* Force allocated memory to take RSS. */

Took me a minute to parse "take RSS". Change to "be committed", or "be in physical memory"?

And rather than putting the same comment in every function, perhaps have a function above Replay::malloc explaining that they all do the memset() for the same reason?

@@ +118,5 @@
> +  size_t size = parseNumber(args);
> +  void *old_ptr = slot.mPtr;
> +  slot.mPtr = nullptr;
> +  slot.mSize = 0;
> +  MemSlot &new_slot = operator[](slot_id);

This surprised me. Use |this[slot_id]| instead? (Or whatever the syntax is.) Or just |mSlots[slot_id]|?

@@ +162,5 @@
> +  /* TODO: Add more data, like actual RSS as measured by OS, but compensated
> +   * for the replay internal data. */
> +}
> +
> +int main() {

Return type on its own line.

@@ +186,5 @@
> +    if (!first_pid)
> +      first_pid = pid;
> +
> +    /* The log may contain data for several processes, only entries for the
> +     * very first that appears are treated. */

This is a weird restriction. What if you want to replay for a child process? Allow the PID of interest to be specified via the command line?

@@ +198,5 @@
> +      continue;
> +    }
> +
> +    /* The logs are expected to be preprocessed so that allocations are
> +     * attributed a tracking slot. The input is trusted not to have crazy

Ah, now I get it: you want to use continuous integers instead of addresses so you can use a simple data structure in this program rather than some kind of hash table, because allocations in this program are problematic. I don't think this was explained anywhere; it would be good to mention somewhere, maybe the README.

::: memory/replace/logalloc/replay/Replay.h
@@ +2,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef __Replay_h__
> +#define __Replay_h__

Is this file needed? Looks like its contents can be moved into Replay.cpp.

@@ +15,5 @@
> +
> +/* We don't want to be using malloc() to allocate our internal tracking
> + * data, because that would change the parameters of what is being measured,
> + * so we want to use data types that directly use mmap/VirtualAlloc. */
> +template <typename T, size_t L>

Took me a minute to work out what |L| meant. |Len| or |Length| would be clearer.

@@ +20,5 @@
> +class MappedArray {
> +public:
> +  MappedArray(): mPtr(nullptr) {}
> +
> +  ~MappedArray() {

Nit: brace on its own line.

@@ +33,5 @@
> +
> +  T &operator[] (size_t aIndex) const {
> +    if (mPtr) {
> +      return mPtr[aIndex];
> +    }

You could make this |if (!mPtr)| and avoid repeating the return statement. But I can see getting the simple case out of the way is also reasonable. Either way.

@@ +56,5 @@
> +  mutable T *mPtr;
> +};
> +
> +/* Type for records of allocations. */
> +struct MemSlot {

Brace on its own line.

@@ +80,5 @@
> +
> +public:
> +  MemSlot &operator[] (size_t aIndex) const {
> +    if (aIndex < kGroupSize * kGroups)
> +      return mSlots[aIndex / kGroupSize][aIndex % kGroupSize];

Nit: Needs braces.

::: memory/replace/logalloc/replay/Utils.h
@@ +2,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef __Utils_h__
> +#define __Utils_h__

This file also could be moved into Replay.cpp.

@@ +30,5 @@
> +public:
> +  Buffer(): mBuf(nullptr), mLength(0) {}
> +
> +  Buffer(const void *aBuf, size_t aLength)
> +  : mBuf(reinterpret_cast<const char *>(aBuf)), mLength(aLength)

Nit: ctor lists should be indented two more space than that.

@@ +34,5 @@
> +  : mBuf(reinterpret_cast<const char *>(aBuf)), mLength(aLength)
> +  {}
> +
> +  /* Constructor for string literals. */
> +  template <size_t SIZE>

|Size| rather than |SIZE| seems to be standard.

@@ +36,5 @@
> +
> +  /* Constructor for string literals. */
> +  template <size_t SIZE>
> +  Buffer(const char (&aStr)[SIZE])
> +  : mBuf(aStr), mLength(SIZE - 1)

Ditto.

@@ +59,5 @@
> +    }
> +  }
> +
> +  /* Returns a sub-buffer of at most aLength characters. The "parent" buffer is
> +   * amputed of those aLength characters. If the "parent" buffer is smaller than

amputed?

@@ +87,5 @@
> +
> +  /* Returns whether the buffer is empty. */
> +  operator bool()
> +  {
> +    return mLength;

|!!mLength;| ?

@@ +122,5 @@
> +public:
> +  FdReader(int aFd)
> +  : mFd(aFd)
> +  , mData(&mRawBuf, 0)
> +  , mBuf(&mRawBuf, sizeof(mRawBuf))

Indentation, again.

@@ +128,5 @@
> +
> +  /* Read a line from the file descriptor and returns it as a Buffer instance */
> +  Buffer ReadLine()
> +  {
> +    while (true) {

You used |while (1)| in one of your other patches. I prefer |while (true)|.

@@ +134,5 @@
> +
> +      /* There are essentially three different cases here:
> +       * - '\n' was found "early". In this case, the end of the result buffer
> +       *   is before the beginning of the mData buffer (since SplitChar
> +       *   amputated it)

Nit: missing '.' at end of sentence.

@@ +145,5 @@
> +       * Only in the latter case do both result and mData's end match, and it's
> +       * the only case where we need to refill the buffer.
> +       */
> +      if (result.GetEnd() != mData.GetEnd())
> +        return result;

Nit: braces.

@@ +156,5 @@
> +
> +      FillBuffer();
> +
> +      if (!mData)
> +        return Buffer();

Ditto.

@@ +171,5 @@
> +    ssize_t len = 1;
> +    while (remainder && len > 0) {
> +      len = ::read(mFd, const_cast<char *>(remainder.get()), size);
> +      if (len < 0)
> +        die("Read error");

Braces.
Attachment #8509356 - Flags: review?(n.nethercote) → review+
> Would make the printf function more complex. %z is probably fine, though.

If it's too difficult, omitting the annotation isn't the end of the world.
> This is a weird restriction. What if you want to replay for
> a child process? Allow the PID of interest to be specified via
> the command line?

As mentioned in the README, you can filter things from the log with grep, or, as in the given example, with awk.
Comment on attachment 8509353 [details] [diff] [review]
part 1 - replace-malloc library to log Firefox allocations

Somehow mshal's r+ got eaten.
Attachment #8509353 - Flags: review?(mshal) → review+
Attachment #8509353 - Attachment is obsolete: true
Attachment #8509356 - Attachment is obsolete: true
Attachment #8509357 - Attachment is obsolete: true
Attachment #8509358 - Attachment is obsolete: true
Attachment #8509359 - Attachment is obsolete: true
Comment on attachment 8510231 [details] [diff] [review]
Tool to capture and reproduce Firefox's memory allocations

Review of attachment 8510231 [details] [diff] [review]:
-----------------------------------------------------------------

I have a bunch of small nits but this is looking very nice. r=me.

::: memory/replace/logalloc/FdPrintf.cpp
@@ +15,5 @@
> +#include "mozilla/Assertions.h"
> +
> +/* Template class allowing a limited number of increments on a value */
> +template <typename T>
> +class CheckedIncrement

Thanks for adding this checking!

@@ +52,5 @@
> +    return;
> +  }
> +  char buf[256];
> +  CheckedIncrement<char*> b(buf, sizeof(buf));
> +  CheckedIncrement<const char*> f(aFormat, strlen(aFormat));

|f| doesn't need the checking, though I guess it doesn't hurt.

@@ +60,5 @@
> +    switch (*f) {
> +      case '\0':
> +        goto out;
> +      case '%':
> +        switch (*++f) {

Blank line between cases? Easier to read that way.

::: memory/replace/logalloc/LogAlloc.cpp
@@ +30,5 @@
> +
> +void
> +prefork() {
> +  sLock.Acquire();
> +}

Make prefork and postfork static as well?

@@ +54,5 @@
> +  sFuncs = aTable;
> +
> +#ifndef _WIN32
> +  /* When another thread has acquired a lock before forking, the child
> +   * process will inherit the lock state but the thread, being inexistant

"inexistant" is French. "inexistent" is English, but unusual. "non-existent" or "nonexistent" is better.

@@ +59,5 @@
> +   * in the child process, will never release it, leading to a dead-lock
> +   * whenever the child process gets the lock. We thus need to ensure no
> +   * other thread is holding the lock before forking, by acquiring it
> +   * ourselves, and releasing it after forking, both in the parent and child
> +   * processes.

Nice explanation.

@@ +71,5 @@
> +  char* fd_num = getenv("MALLOC_LOG_FD");
> +  if (fd_num) {
> +    sFd = 0;
> +    while (*fd_num) {
> +      /* Reject non digits, and values >= 10000) */

Extraneous ')'. And it should be 1000, not 10000.

@@ +91,5 @@
> + * Use locking to guarantee that an allocation that did happen is logged
> + * before any other allocation/free happens.
> + * TODO: Add a thread id to the log so that a multi-threaded load can be
> + * replayed as well, since it can have an impact on how jemalloc handles
> + * the allocations.

I don't understand the bit about the "multi-threaded load".

::: memory/replace/logalloc/README
@@ +1,5 @@
> +Logalloc is a replace-malloc library for Firefox (see
> +memory/build/replace_malloc.h) that dumps a log of memory allocations to a
> +given file descriptor or file name. That log can then be replayed against
> +Firefox's default memory allocator independently or through another
> +replace-malloc library, allowing to test other allocators under the exact

s/allowing to test/allowing the testing of/

@@ +20,5 @@
> +
> +- on all platforms:
> +  MALLOC_LOG=/path/to/log-file
> +  or
> +  MALLOC_LOG_FD=number

A couple of suggestions, neither of which I will insist on -- you choose:
- The XPCOM_MEM_* environment variables combine both of these into a single thing -- if the value is 1 or 2, then it's stdin or stderr. Otherwise it's a file name. I know you allow file descriptors higher than 2 so this situation isn't quite the same... but maybe worth considering.
- If not, then MALLOC_LOG_NAME or MALLOC_LOG_FILENAME is arguably a better name than MALLOC_LOG.

@@ +38,5 @@
> +then copies stderr to stdout. This leads to: fd1 and fd2 sending to stderr
> +of the parent process (the shell), and fd3 sending to gzip.)
> +
> +Each line of the allocations log is formatted as follows:
> +  <pid> <cmd>([<args>])[=<result>]

Is command/<cmd> the right term? I would use function/<fn>.

@@ +55,5 @@
> +memory/replace/logalloc/replay. However, as the goal of that tool is to
> +reproduce the recorded memory allocations, it needs to avoid as much as
> +possible doing its own allocations for bookkeeping. Reading the logs as
> +they are would require data structures and memory allocations. As a
> +consequence, the logs need to be preprocessed beforehand.

Nice explanation.

::: memory/replace/logalloc/moz.build
@@ +16,5 @@
> +DEFINES['MOZ_NO_MOZALLOC'] = True
> +# Avoid Lock_impl code depending on mozilla::Logger
> +DEFINES['NDEBUG'] = True
> +
> +# Use locking code from the chromium stack

You have something against ending a sentence with a full stop? :P

::: memory/replace/logalloc/replay/Makefile.in
@@ +45,5 @@
> +LOGALLOC = LD_PRELOAD=$(CURDIR)/../$(DLL_PREFIX)logalloc$(DLL_SUFFIX)
> +endif
> +endif
> +check:: $(srcdir)/replay.log
> +	MALLOC_LOG_FD=1 $(LOGALLOC) ./$(PROGRAM) < $< | $(PYTHON) $(srcdir)/logalloc_munge.py | diff -w - $<

A comment about the idempotence here would be nice, for those of us who can't remember what $< means :)

::: memory/replace/logalloc/replay/Replay.cpp
@@ +26,5 @@
> +#include "mozilla/NullPtr.h"
> +#include "FdPrintf.h"
> +
> +void
> +die(const char* message)

|static|?

@@ +118,5 @@
> +/* Helper class for memory buffers */
> +class Buffer
> +{
> +public:
> +  Buffer(): mBuf(nullptr), mLength(0) {}

It's standard to put a space before the ':' in a case like this.

@@ +146,5 @@
> +      Split(1);
> +      return result;
> +    } else {
> +      return Split(mLength);
> +    }

No else after return. It might make sense to change the condition to |if (!c)| to get the shorter case out of the way first.

@@ +157,5 @@
> +  {
> +    Buffer result(mBuf, std::min(aLength, mLength));
> +    mLength -= result.mLength;
> +    mBuf += result.mLength;
> +    return result;

Have you run this under Valgrind? There's enough pointer and string manipulation that it would be comforting to know you have.

@@ +467,5 @@
> +};
> +
> +
> +int
> +main() {

Brace on its own line.

@@ +478,5 @@
> +#endif
> +
> +  /* Read log from stdin and dispatch commands to the Replay instance.
> +   * The log format is essentially:
> +   *   <pid> <command>([<args>])[=<result>]

Again, "function" is better than "command".

::: memory/replace/logalloc/replay/logalloc_munge.py
@@ +38,5 @@
> +        self._recycle = deque()
> +
> +    def __getitem__(self, addr):
> +        if addr == 0:
> +            return 0

I don't understand the special handling of 0 in this class. It seems to contradict the comment "Each value is associated to an increasing id, starting from 1." A comment about 0 would likely help.

@@ +39,5 @@
> +
> +    def __getitem__(self, addr):
> +        if addr == 0:
> +            return 0
> +        if addr not in self._values:

In this class's code and comments, you use a mixture of addr/addrs and value/values. Given that this class is used both for pids and pointers, using value/values throughout would be best.

@@ +65,5 @@
> +    try:
> +        pid, full_command = line.split(' ', 1)
> +        # full_command format is <cmd>([<args>])[=<result>]
> +        call, result = full_command.split(')')
> +        cmd, args = call.split('(')

As I commented in the README, "fn" is probably better than "cmd" throughout this file. And maybe "line_parts" instead of "full_command"?

::: memory/replace/logalloc/replay/replay.log
@@ +1,5 @@
> +1 malloc(42)=1
> +1 malloc(24)=2
> +1 free(1)
> +1 posix_memalign(4096,1024)=1
> +1 calloc(4,42)=3

It would be nice to test multiple pids, to check that the filtering is occurring. That would require having a separate file for expected output.

@@ +6,5 @@
> +1 free(2)
> +1 realloc(3,84)=2
> +1 aligned_alloc(512,1024)=3
> +1 memalign(512,1024)=4
> +1 valloc(1024)=5

Munged files are much harder to read than unmunged files because small integers are used for both pids and block IDs. One idea is to prefix pids with 'p' and block IDs with 'b', e.g.:

> p1 free(b2)
> p1 realloc(3,84)=b2
> p1 aligned_alloc(512,1024)=b3
> p1 memalign(512,1024)=b4
> p1 valloc(1024)=b5

I'll let you decide if this is worth it.
Attachment #8510231 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #24)
> > +    while (*fd_num) {
> > +      /* Reject non digits, and values >= 10000) */
> 
> Extraneous ')'. And it should be 1000, not 10000.

It actually *is* 10000.
(In reply to Nicholas Nethercote [:njn] from comment #24)
> @@ +52,5 @@
> > +    return;
> > +  }
> > +  char buf[256];
> > +  CheckedIncrement<char*> b(buf, sizeof(buf));
> > +  CheckedIncrement<const char*> f(aFormat, strlen(aFormat));
> 
> |f| doesn't need the checking, though I guess it doesn't hurt.

Turns out it does, because the final null character is excluded. And an incomplete format string, like one ending with a % triggers the MOZ_CRASH where no check would actually handle the case gracefully. The check could be kept with strlen+1 instead, though.
https://hg.mozilla.org/mozilla-central/rev/3c91b6abb29b
https://hg.mozilla.org/mozilla-central/rev/ef34805b3a70
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Depends on: 1089614
Depends on: 1090114
You need to log in before you can comment on or make changes to this bug.