Closed Bug 945973 Opened 11 years ago Closed 11 years ago

A system-wide memory reporter for B2G

Categories

(Core :: XPCOM, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29
blocking-b2g 1.3+
Tracking Status
firefox27 --- fixed
firefox28 --- fixed
firefox29 --- fixed
b2g-v1.3T --- fixed

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Keywords: perf, Whiteboard: [c=memory p= s= u=] [MemShrink:P2])

Attachments

(11 files, 7 obsolete files)

4.68 KB, patch
mccr8
: review+
Details | Diff | Splinter Review
7.38 KB, patch
mccr8
: review+
Details | Diff | Splinter Review
2.44 KB, patch
mccr8
: review+
Details | Diff | Splinter Review
212.83 KB, application/x-gzip
Details
3.07 KB, text/plain
Details
37.80 KB, patch
Details | Diff | Splinter Review
28.21 KB, patch
glandium
: review+
Details | Diff | Splinter Review
39.60 KB, patch
Details | Diff | Splinter Review
5.03 MB, application/x-gzip
Details
3.31 KB, patch
n.nethercote
: review+
Details | Diff | Splinter Review
3.33 KB, patch
n.nethercote
: review+
Details | Diff | Splinter Review
about:memory (obtained on B2G with get_about_memory.py) gives detailed
measurements within Mozilla processes, but it excludes all non-Mozilla
processes, and it overestimates because it doesn't account for sharing (which
will become more significant when Nuwa is enabled).

It would be good to have a tool that gives a good idea of the total memory
consumption of the whole device;  one that gives more detail and is less ad hoc
than top, procrank, /proc/meminfo, and similar things.

My basic idea is to write a memory reporter that uses /proc/meminfo and
/proc/<pid>/smaps to give a breakdown of the overall memory usage across the
entire device.  I have a prototype working and here's some sample output from
my Linux desktop machine.

31,862.34 MB (100.0%) -- mem
├──21,610.71 MB (67.83%) ── free
├───9,002.75 MB (28.26%) ── other
└───1,248.88 MB (03.92%) -- processes
    ├────735.05 MB (02.31%) ++ (76 tiny)
    │    ├──296.51 MB (00.93%) ++ process(cd64/dist/bin/firefox, pid=15151)
    │    ├──113.04 MB (00.35%) ++ process(/usr/bin/python3, pid=2299)
    │    ├───50.29 MB (00.16%) ++ process(compiz, pid=1744)
    ...
    ... <71 lines omitted for brevity>
    ...
    │    ├────0.19 MB (00.00%) ++ process(/bin/sh, pid=1988)
    │    └────0.13 MB (00.00%) ++ process(/bin/cat, pid=2207)
    └────513.83 MB (01.61%) -- process(co64dmd/dist/bin/firefox, pid=14978)
         ├──456.41 MB (01.43%) -- anonymous/anonymous, outside brk()
         │  ├──453.48 MB (01.42%) ── [rw-p] [103]
         │  └────2.93 MB (00.01%) ── [rwxp] [21]
         └───57.42 MB (00.18%) -- (3 tiny)
             ├──55.07 MB (00.17%) -- shared-libraries
             │  ├──46.35 MB (00.15%) -- mozilla
             │  │  ├──43.02 MB (00.14%) ++ libxul.so
             │  │  ├───0.91 MB (00.00%) ++ libnss3.so
             │  │  ├───0.57 MB (00.00%) ++ libmozsqlite3.so
             │  │  ├───0.53 MB (00.00%) ++ libnssckbi.so
             │  │  ├───0.31 MB (00.00%) ++ libfreebl3.so
             │  │  ├───0.21 MB (00.00%) ++ libsoftokn3.so
             │  │  ├───0.20 MB (00.00%) ++ libnspr4.so
             │  │  ├───0.20 MB (00.00%) ++ libssl3.so
             │  │  ├───0.15 MB (00.00%) ++ libnssutil3.so
             │  │  ├───0.12 MB (00.00%) ++ libnssdbm3.so
             │  │  ├───0.08 MB (00.00%) ++ libsmime3.so
             │  │  ├───0.02 MB (00.00%) ++ libplc4.so
             │  │  ├───0.02 MB (00.00%) ++ libplds4.so
             │  │  └───0.02 MB (00.00%) ++ libmozalloc.so
             │  └───8.72 MB (00.03%) ++ non-mozilla
             ├───2.16 MB (00.01%) ++ other-files
             └───0.19 MB (00.00%) ── main thread's stack/[rw-p]

- This output shows most of the details hidden, though the Firefox process has
  its sub-tree partly expanded to show more detail.

- The root is the MemTotal value from /proc/meminfo, which is "physical RAM
  minus a few reserved bits and the kernel binary code".  So you can see that
  this machine has 32GB of RAM.  (I'm not aware of any way on Linux to get the
  actual amount of installed physical memory on Linux.)  

- For each user-space process, the PSS measurements from /proc/<pid>/smaps are
  used.  PSS is the only measurement that is suitable for summing when we're
  trying to account for all processes without any overlap.  
  
- This gives lots of insight into code and static data for each process, but
  very little into dynamic heap/mmap usage, which falls into the "anonymous"
  bucket.  E.g. the running Firefox in the example has 456MB there.  I can't
  see how to combine the data from Firefox's own memory reports with the
  OS-level info in reasonable way, because we have no info on what is shared at
  the memory reporter level.
  
  However, I think we can get at least some insight into the anonymous mappings
  by using private mmaps that are backed with distinctively-named
  large-but-sparse files.  That would at least let us label the jemalloc heap,
  SpiderMonkey's GC heap, and generated JIT code.

  (Beyond that, the normal memory reports give per-process insight, as long as
  you remember that they overestimate due to sharing.)

- "free" is MemFree from /proc/meminfo.

- "other" is |MemTotal - MemFree - sum(each process's PSS)|.  I'm not exactly
  sure what this covers, but it does include the page cache, which is memory
  the kernel is using but can reallocated for other purposes at any point, so
  it's not exactly "free" but in practice it is.

My prototype uses the old Linux maps reporter (removed in bug 912165) as a
starting point.

Hooking into the existing memory reporter infrastructure is good -- we already
have scripts to pull that from the device, and about:memory can be used to view
the data, with its collapsible/expandable sub-trees.

This will obviously only work on Linux, and it will only really be useful on
B2G (and maybe Fennec).  So my plan is for the code to all be |#ifdef
__linux__|, and also behind a pref that is on by default for B2G but off by
default on desktop.  (It's useful having the pref so that we can test on
desktop.)
BTW, the goal here is to have all the measurements obtainable from the OS in a single place, obtained via a single tool.  So if you know of any other interesting measurements that are gettable from the OS I'd be happy to consider them for inclusion.  They don't have to fit in with the "mem" tree shown in comment 0 -- it's possible to have separate sub-trees, or even just single measurements.  E.g. the lowmem killer parameters could be shown.  These additional measurements don't even have to be measured in bytes;  about:memory allows for counts and percentages, too.
Keywords: perf
Whiteboard: [MemShrink] → [c=memory p= s= u=] [MemShrink]
This requirement has been more trouble than it's worth, and some of the
descriptions used by the new reporter will violate it.
Attachment #8342133 - Flags: review?(continuation)
Attachment #8342133 - Flags: review?(continuation) → review+
Whoops, forgot to update one comment.
Attachment #8342135 - Flags: review?(continuation)
Attachment #8342133 - Attachment is obsolete: true
Attachment #8342135 - Flags: review?(continuation) → review+
This is needed for the system-wide reporter, because it doesn't have any
"explicit" reports.

More generally, I've been removing special cases from the memory reporting
infrastructure, and this can be viewed as another step along that path.
Attachment #8342139 - Flags: review?(continuation)
The main event.  Most of the action is in SystemMemoryReporter.cpp.  As
mentioned earlier, it's a fairly-heavily modified version of the old
MapsMemoryReporter.cpp file which was removed in bug 912165.

I added a pref for this reporter, which is checked every time it's invoked.
The pref is off by default for desktop, and on by default for B2G.

I've only run this code on desktop Linux because I don't have a B2G device
suitable for development.  But what could possibly go wrong?  O_O
Attachment #8342149 - Flags: review?(mh+mozilla)
Attached file Sample output
Here's a memory dump demonstrating the output.  If you want to view this, load it in about:memory using the "Load..." button;  the results are in the "System" section at the bottom.

Note that this file is gzipped, and if you download it through Bugzilla it'll get gzipped *again* (due to bug 879764).  I recommend using |wget| to download it instead.
Here's a draft patch that gives greater insight into the anonymous mmaps.  It
creates dummy files for each of jemalloc, the JS GC heap, and JS jit code.
This means that mappings done for each of those components are distinguished.

Sample output:

    │    ├──174.22 MB (00.55%) -- process(cd64/dist/bin/firefox, pid=17802)
    │    │  ├──115.06 MB (00.36%) -- other-files
    │    │  │  ├───73.04 MB (00.23%) ── malloc-heap.co6Aez/[rw-p] [75]
    │    │  │  ├───40.99 MB (00.13%) ── js-gc-heap.cGLDrS/[rw-p] [41]
    │    │  │  ├────0.44 MB (00.00%) ── js-jit-code.5sJJsd/[rwxp] [11]
    ...
    │    │  ├────1.33 MB (00.00%) ── anonymous/outside-brk/[rw-p] [64]

The patch is really rough at the moment.  I will clean it up if you think it's
worth pursuing further.

Things of note.

- It seems to run fine, though I've only done some basic browsing with it
  enabled.

- Currently the entries show up other "other-files".  I'll put them in their
  own sub-tree with a nice name later.

- Currently the entries have the suffix like ".co6Aez".  I will implement
  something to strip that off later.

- The code is in MFBT, because it's used in both SpiderMonkey and jemalloc.

- The main part of the code is a macro.  Initially I had a class, but jemalloc
  is C code so that didn't work.  Then I tried doing most of it in a function
  (and put it in mfbt/AnonymousMappingMarkings.c) but I had linking troubles
  that I didn't understand.  So I ended up using something that would
  definitely work!

- The code should be made Linux-only in some fashion.  Later.

- There are lots of debugging printfs, "njn" comments indicating todo items,
  insufficient comments, unnecessary #includes, etc.  Please ignore this stuff
  for now.

Is this all too gross for words, or is it reasonable?
Comment on attachment 8342352 [details] [diff] [review]
(part 5) - On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

> Is this all too gross for words, or is it reasonable?

Another thing:  the temp file is currently put in the CWD, but $TMPDIR or /tmp would be better if they are available.
Attachment #8342352 - Flags: feedback?(mh+mozilla)
I think it is worth to write a kernel module just for figure out what pages are shared among processes.  How do you think?
Comment on attachment 8342139 [details] [diff] [review]
(part 2) - Don't complain if there aren't any "explicit" reports fora process.

Review of attachment 8342139 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/aboutmemory/tests/test_aboutmemory3.xul
@@ -229,5 @@
>  \n\
>  End of Main Process (pid NNN)\n\
>  Other-only process\n\
> -\n\
> -WARNING: the 'heap-allocated' memory reporter does not work for this platform and/or configuration. This means that 'heap-unclassified' is not shown and the 'explicit' tree shows less memory than it should.\n\

I guess this doesn't need to be removed from test_aboutmemory.xul, too?
Attachment #8342139 - Flags: review?(continuation) → review+
Attachment #8342140 - Flags: review?(continuation) → review+
(In reply to Nicholas Nethercote [:njn] from comment #6)
> ...
> I've only run this code on desktop Linux because I don't have a B2G device
> suitable for development.  But what could possibly go wrong?  O_O

Nicholas,

You should request FxOS devices for you and your team. The upcoming tarako device will be limited to 128MB and this memory reporter tool could be useful to help enable that.

You can submit requests here: http://bit.ly/1cbxfmW

Mike
> > -WARNING: the 'heap-allocated' memory reporter does not work for this platform and/or configuration. This means that 'heap-unclassified' is not shown and the 'explicit' tree shows less memory than it should.\n\
> 
> I guess this doesn't need to be removed from test_aboutmemory.xul, too?

With my patch applied that warning now only gets displayed if (a) there are some "explicit" reports, and (b) "heap-allocated" isn't reported.  The cases in test_aboutmemory.xul satisfy those conditions, so the warning is still shown.
Comment on attachment 8342149 [details] [diff] [review]
(part 4) - Add SystemMemoryReporter, which presents Linux-specific, system-wide memory measurements taken from the OS.

Review of attachment 8342149 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/SystemMemoryReporter.cpp
@@ +36,5 @@
> +namespace mozilla {
> +namespace SystemMemoryReporter {
> +
> +#if !defined(XP_LINUX)
> +#error "This doesn't have a prayer of working if we're not on Linux."

s/prayer/chance/

@@ +44,5 @@
> +// let's make sure that the native types have the sizes we expect.
> +static_assert(sizeof(long long) == sizeof(int64_t),
> +              "size of |long long| is expected to match |int64_t|");
> +static_assert(sizeof(int) == sizeof(int32_t),
> +              "size of |int| is expected to match |int32_t|");

You should use inttypes.h types and SCN* macros with fscanf instead of relying on that.

@@ +65,5 @@
> +  "libplds4.so",
> +  "libsmime3.so",
> +  "libsoftokn3.so",
> +  "libssl3.so",
> +  "libxul.so"

That leaves out binary components. That also doesn't account for libraries that are not mozilla libraries when built with system nspr/nss (which linux distros do). Why not rely on whether the path is in a subdirectory of the gre path and the app path (both of which may or may not be the parent of the other). If fact, don't we have code that does that distinction already for the "normal" memory reporter? I'd expect a lot of the code here to repeat what the normal memory reporter does, btw, reading /proc/self/maps.

@@ +200,5 @@
> +    }
> +
> +    unsigned long long memTotal, memFree;
> +    int n1 = fscanf(f, "MemTotal: %llu kB\n", &memTotal);
> +    int n2 = fscanf(f, "MemFree: %llu kB\n", &memFree);

I /think/ i've seen some kernels around that use KiB, you might as well leave out the unit. But maybe that's not in the /proc files.

@@ +235,5 @@
> +      const char* pidStr = ent->d_name;
> +      lstat(pidStr, &statbuf);
> +      if (S_ISDIR(statbuf.st_mode) && IsNumeric(pidStr)) {
> +        // Make sure we have enough space for the "/proc" and the filename.
> +        static const size_t len = 128;

Probably not true on b2g, but on my desktop linux, several cmdlines are longer than 128 characters.

@@ +242,5 @@
> +
> +        // Get the command name from cmdline.  If that fails, the pid is still
> +        // shown.
> +        nsCString processName("process(");
> +        sprintf(filename, "/proc/%s/cmdline", pidStr);

Why not just do that with the nsString API?

@@ +387,5 @@
> +      nsAutoCString dirname;
> +      GetDirname(absPath, dirname);
> +
> +      // Hack: A file is a shared library if the basename contains ".so" and
> +      // its dirname contains "/lib", or if the basename ends with ".so".

Why not correlate with dl_iterate_phdr? Come to think of it, it might not be in bionic...

@@ +420,5 @@
> +    aName.Append("]");
> +
> +    // Append explanation of the permissions.
> +    aDesc.Append(" (");
> +    if (strstr(aPerms, "rw")) {

You should be able to just check aPerms[0] for r or -, aPerms[1] for w or -, etc, as that's not going to be shifted.

@@ +482,5 @@
> +
> +    // Don't report nodes with size 0, and only report "Pss" values.
> +    if (size == 0 || strcmp(desc, "Pss") != 0) {
> +      return NS_OK;
> +    }

That won't report Pss values of size 0, is that expected?
Attachment #8342149 - Flags: review?(mh+mozilla) → feedback+
Comment on attachment 8342352 [details] [diff] [review]
(part 5) - On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

Review of attachment 8342352 [details] [diff] [review]:
-----------------------------------------------------------------

I think it would be better to just expose a mmap wrapper function that take the markings. It would also simplify code in the callers.
Attachment #8342352 - Flags: feedback?(mh+mozilla) → feedback-
(BTW, wrappinp mmag and VirtualAlloc is something i wanted to do in a replace-malloc at some point, which is not really directly related to this but, i was also thinking an interesting thing we could get out of such wrapping is detect from what shared library/program the call comes from)
> I think it would be better to just expose a mmap wrapper function that take
> the markings. It would also simplify code in the callers.

It can't just be a wrapper, because there must be associated static state.
(In reply to Mike Hommey [:glandium] from comment #16)
> (BTW, wrappinp mmag and VirtualAlloc is something i wanted to do in a
> replace-malloc at some point, which is not really directly related to this
> but, i was also thinking an interesting thing we could get out of such
> wrapping is detect from what shared library/program the call comes from)

dmalloc and other memory debugger tools could handle that.  Some one at Taipei is evaluting tools of this type.
I guess cervantes could provide detail info of our survey.
> > (BTW, wrappinp mmag and VirtualAlloc is something i wanted to do in
> 
> dmalloc and other memory debugger tools could handle that.

AFAICT, dmalloc is a malloc replacement.  glandium is talking about mmap/VirtualAlloc, which is a level lower than malloc.
> That leaves out binary components. That also doesn't account for libraries
> that are not mozilla libraries when built with system nspr/nss (which linux
> distros do). Why not rely on whether the path is in a subdirectory of the
> gre path and the app path (both of which may or may not be the parent of the
> other).

How do I get the gre path and/or app path?


> If fact, don't we have code that does that distinction already for
> the "normal" memory reporter? I'd expect a lot of the code here to repeat
> what the normal memory reporter does, btw, reading /proc/self/maps.

As discussed on IRC, the "normal" reporter was removed in bug 912165, and this file is a resurrected and heavily modified version of it.

But, more importantly, there was some code that tried to do that but it actually didn't work at all.  What it did was assume that any lib in the same dir as libxul.so was a mozilla lib.  But the real copy of libxul.so (not symlinks) is in toolkit/library/, and none of the other libs we build are there.


> I /think/ i've seen some kernels around that use KiB, you might as well
> leave out the unit. But maybe that's not in the /proc files.

I'll leave it unchanged until we have evidence to the contrary.


> > +        nsCString processName("process(");
> > +        sprintf(filename, "/proc/%s/cmdline", pidStr);
> 
> Why not just do that with the nsString API?

sprintf() is easier in this case.


> > +      // Hack: A file is a shared library if the basename contains ".so" and
> > +      // its dirname contains "/lib", or if the basename ends with ".so".
> 
> Why not correlate with dl_iterate_phdr? Come to think of it, it might not be
> in bionic...

Because I didn't know that function existed :)  But if it's not in bionic, we shouldn't use it.  Also
I carried this code over from MapsMemoryReporter.cpp.


> > +    // Don't report nodes with size 0, and only report "Pss" values.
> > +    if (size == 0 || strcmp(desc, "Pss") != 0) {
> > +      return NS_OK;
> > +    }
> 
> That won't report Pss values of size 0, is that expected?

Yes.
> > Why not just do that with the nsString API?
> 
> sprintf() is easier in this case.

Actually, nsCString is better.  I'll do that.
(In reply to Nicholas Nethercote [:njn] from comment #20)
> > That leaves out binary components. That also doesn't account for libraries
> > that are not mozilla libraries when built with system nspr/nss (which linux
> > distros do). Why not rely on whether the path is in a subdirectory of the
> > gre path and the app path (both of which may or may not be the parent of the
> > other).
> 
> How do I get the gre path and/or app path?

respectively
NS_GetSpecialDirectory(NS_GRE_DIR, getter_AddRefs(greDir));
NS_GetSpecialDirectory(NS_XPCOM_CURRENT_PROCESS_DIR, getter_AddRefs(greDir));

(yeah, really not the most intuitive name for the app dir, I'm all for adding #define NS_APP_DIR NS_XPCOM_CURRENT_PROCESS_DIR somewhere)

> > That won't report Pss values of size 0, is that expected?
> 
> Yes.

Maybe make the comment more explicit about that?
(In reply to Mike Hommey [:glandium] from comment #14)
> Why not correlate with dl_iterate_phdr? Come to think of it, it might not be in bionic...

It's not.  See also bug 827846.
(In reply to Nicholas Nethercote [:njn] from comment #20)
> But the real copy of libxul.so
> (not symlinks) is in toolkit/library/, and none of the other libs we build
> are there.

That only applies when you run off dist/bin on linux or mac. An actual build doesn't have this property.
This version addresses most of the previous comments.  A couple of new things:

- test_aboutmemory5.xul turns the reporter on, so it at least gets run.

- I removed the mozilla/non-mozilla distinction.  Partly because it's a pain to
  get right, and partly because on B2G it's not a particularly meaningful
  distinction -- *all* the code and data is our concern, whether it's "ours" or
  not.
Attachment #8343528 - Flags: review?(mh+mozilla)
Attachment #8342149 - Attachment is obsolete: true
Comment on attachment 8343528 [details] [diff] [review]
(part 4) - Add SystemMemoryReporter, which presents Linux-specific, system-wide memory measurements taken from the OS.

jld, would you be able to try out the first four patches on a real device?  I don't have access to a device suitable for development yet, and this definitely needs testing on-device before landing.  If you could upload the results of a get_about_memory.py run that'd be great.  Thanks!
Attachment #8343528 - Flags: feedback?
Let me make this easier:  here's a single patch that contains parts 1--4.
Attachment #8343533 - Flags: feedback?(jld)
Attachment #8343528 - Flags: feedback?
Blocks: 947072
Comment on attachment 8342352 [details] [diff] [review]
(part 5) - On Linux, distinguish anonymous mappings by backing them with sparse, dummy files.

I've spun off bug 947072 for part 5, because it's risky, and parts 1--4 will still be useful without it.
Attachment #8342352 - Attachment is obsolete: true
Comment on attachment 8343533 [details] [diff] [review]
Omnibus patch containing parts 1--4

Review of attachment 8343533 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/SystemMemoryReporter.cpp
@@ +424,5 @@
> +
> +    char desc[1025];
> +    int64_t size;
> +    int n = fscanf(aFile, "%1024[a-zA-Z_]: %" SCNd64 " kB\n", desc, &size);
> +    if (n == 0) {

n can also be EOF, if end-of-file was reached...

@@ +427,5 @@
> +    int n = fscanf(aFile, "%1024[a-zA-Z_]: %" SCNd64 " kB\n", desc, &size);
> +    if (n == 0) {
> +      return NS_ERROR_FAILURE;
> +    } else if (n == 1 && strcmp(desc, "VmFlags") == 0) {
> +      // This is the "VmFlags:" line.  Chew up the rest of it.

...because we don't have a VmFlags line in the older version of Linux we're using.  So this all winds up silently failing and returning NS_OK and then ParseMapping goes into an infinite loop.
Attachment #8343533 - Flags: feedback?(jld) → feedback-
Attached file 1.smaps
This is /proc/1/smaps from my Geeksphone keon, for reference.  The current patch, as mentioned, went into an infinite loop with the EOF flag set on the file handle while trying to parse this file.
> ParseMapping goes into an infinite loop.

And that would explain why the TBPL machines all had failures.  Thanks!  I'll post an updated patch on Monday.
Comment on attachment 8343528 [details] [diff] [review]
(part 4) - Add SystemMemoryReporter, which presents Linux-specific, system-wide memory measurements taken from the OS.

I'll have a new version of part 4 up soon.
Attachment #8343528 - Attachment is obsolete: true
Attachment #8343528 - Flags: review?(mh+mozilla)
This version of the patch fixes the infinite loop found by jld.  It also adds a
second reports tree that shows the memory used by all the user-space processes
grouped by category, instead of by process, viz:

1,488.94 MB (100.0%) -- processes
├──1,202.04 MB (80.73%) -- anonymous
│  ├────792.52 MB (53.23%) ── outside-brk
│  └────409.52 MB (27.50%) ── brk-heap
├────211.43 MB (14.20%) -- shared-libraries
│    ├──148.62 MB (09.98%) ── [r-xp]
│    ├───39.04 MB (02.62%) ── [r--p]
│    ├───23.78 MB (01.60%) ── [rw-p]
│    └────0.00 MB (00.00%) ── other
├─────72.35 MB (04.86%) ── other-files
└──────3.11 MB (00.21%) -- (2 tiny)
       ├──3.11 MB (00.21%) ── main-thread-stacks
       └──0.00 MB (00.00%) ── vdso
Attachment #8344438 - Flags: review?(mh+mozilla)
jld, can you please test this new omnibus patch?  Thanks.
Attachment #8344443 - Flags: feedback?(jld)
Attachment #8343533 - Attachment is obsolete: true
Comment on attachment 8344443 [details] [diff] [review]
New omnibus patch

Review of attachment 8344443 [details] [diff] [review]:
-----------------------------------------------------------------

Otherwise, seems to be working.  Here's the tree from the aforementioned keon (512 MB in theory; I haven't investigated where the discrepancy is yet), showing the lock screen:

397.24 MB (100.0%) -- mem
├──215.41 MB (54.23%) ── free
├──105.72 MB (26.61%) -- processes
│  ├───57.59 MB (14.50%) -- process(/system/b2g/b2g, pid=709)
│  │   ├──42.29 MB (10.65%) -- anonymous
│  │   │  ├──42.25 MB (10.63%) -- outside-brk
│  │   │  │  ├──41.94 MB (10.56%) ── [rw-p] [69]
│  │   │  │  └───0.31 MB (00.08%) ++ (2 tiny)
│  │   │  └───0.05 MB (00.01%) ── brk-heap/[rw-p]
│  │   ├──13.03 MB (03.28%) -- shared-libraries
│  │   │  ├───8.39 MB (02.11%) -- libxul.so
│  │   │  │   ├──6.05 MB (01.52%) ── [r-xp]
│  │   │  │   └──2.34 MB (00.59%) ── [rw-p]
│  │   │  └───4.64 MB (01.17%) ++ (69 tiny)
│  │   └───2.27 MB (00.57%) ++ (2 tiny)
│  ├───21.73 MB (05.47%) -- process(/system/b2g/plugin-container, pid=756)
│  │   ├──12.49 MB (03.14%) -- anonymous
│  │   │  ├──12.48 MB (03.14%) -- outside-brk
│  │   │  │  ├──12.41 MB (03.12%) ── [rw-p] [30]
│  │   │  │  └───0.07 MB (00.02%) ++ (2 tiny)
│  │   │  └───0.02 MB (00.00%) ── brk-heap/[rw-p]
│  │   ├───8.88 MB (02.23%) -- shared-libraries
│  │   │   ├──7.33 MB (01.85%) -- libxul.so
│  │   │   │  ├──4.99 MB (01.26%) ── [r-xp]
│  │   │   │  └──2.34 MB (00.59%) ── [rw-p]
│  │   │   └──1.54 MB (00.39%) ++ (50 tiny)
│  │   └───0.36 MB (00.09%) ++ (2 tiny)
│  ├───14.08 MB (03.54%) -- process(/system/b2g/plugin-container, pid=836)
│  │   ├───7.53 MB (01.89%) -- shared-libraries
│  │   │   ├──6.02 MB (01.52%) ++ libxul.so
│  │   │   └──1.51 MB (00.38%) ++ (47 tiny)
│  │   ├───6.24 MB (01.57%) -- anonymous
│  │   │   ├──6.23 MB (01.57%) -- outside-brk
│  │   │   │  ├──6.23 MB (01.57%) ── [rw-p] [22]
│  │   │   │  └──0.00 MB (00.00%) ── [r--p]
│  │   │   └──0.01 MB (00.00%) ── brk-heap/[rw-p]
│  │   └───0.31 MB (00.08%) ++ (2 tiny)
│  └───12.32 MB (03.10%) ++ (23 tiny)
└───76.11 MB (19.16%) ── other

::: xpcom/base/SystemMemoryReporter.cpp
@@ +3,5 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#include "mozilla/Util.h"

This file no longer exists (bug 713082); deleting the #include works.
Attachment #8344443 - Flags: feedback?(jld) → feedback+
(In reply to Jed Davis [:jld] from comment #35)
> Otherwise, seems to be working.  Here's the tree from the aforementioned
> keon (512 MB in theory; I haven't investigated where the discrepancy is
> yet), showing the lock screen:
> 
> 397.24 MB (100.0%) -- mem

pmem appears to be responsible for some of it (52.36 MB total; see /sys/kernel/pmem/*/size), but there's still a lot (62.4 MB) unaccounted for.
Patch updated for the removal of "mozilla/Util.h".
Attachment #8345070 - Flags: review?(mh+mozilla)
Attachment #8344438 - Attachment is obsolete: true
Attachment #8344438 - Flags: review?(mh+mozilla)
Whiteboard: [c=memory p= s= u=] [MemShrink] → [c=memory p= s= u=] [MemShrink:P2]
I cleaned up the way the process/ tree is measured.  Much nicer now.  (And
there shouldn't be any more updates to this patch.)
Attachment #8345668 - Flags: review?(mh+mozilla)
Attachment #8345070 - Attachment is obsolete: true
Attachment #8345070 - Flags: review?(mh+mozilla)
Comment on attachment 8345668 [details] [diff] [review]
(part 4) - Add SystemMemoryReporter, which presents Linux-specific, system-wide memory measurements taken from the OS.

Review of attachment 8345668 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/SystemMemoryReporter.cpp
@@ +410,5 @@
> +          *aProcessSizeKind = SharedLibrariesRWP;
> +        } else if (strcmp(aPerms, "r--p") == 0) {
> +          *aProcessSizeKind = SharedLibrariesRP;
> +        } else {
> +          *aProcessSizeKind = SharedLibrariesOther;

Note that on android the gecko share library mappings are shared instead of private.
Attachment #8345668 - Flags: review?(mh+mozilla) → review+
> Note that on android the gecko share library mappings are shared instead of private.

Ok.  I'll special-case "rw-s" segments as well as "rw-p" ones.  Thanks.
https://hg.mozilla.org/integration/mozilla-inbound/rev/c9e9f2940aef
https://hg.mozilla.org/integration/mozilla-inbound/rev/76dbeac45227
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a276a083a26
https://hg.mozilla.org/integration/mozilla-inbound/rev/37aa7ce35dc6

Here was the green try run I did before landing:
https://tbpl.mozilla.org/?tree=Try&rev=74d47deb1eda

And I ended up entirely ignoring the p/s entry for the "processes" tree:

1,380.95 MB (100.0%) -- processes
├──1,123.42 MB (81.35%) -- anonymous
│  ├────813.71 MB (58.92%) ── untagged
│  ├────209.35 MB (15.16%) ── brk-heap
│  ├─────65.95 MB (04.78%) ── jemalloc-heap
│  ├─────34.14 MB (02.47%) ── js-gc-heap
│  └──────0.27 MB (00.02%) ++ (2 tiny)
├────228.14 MB (16.52%) -- shared-libraries
│    ├──161.34 MB (11.68%) ── read-executable
│    ├───41.82 MB (03.03%) ── read-only
│    └───24.98 MB (01.81%) ── read-write
├─────26.20 MB (01.90%) ── other-files
└──────3.19 MB (00.23%) ── main-thread-stack
No longer depends on: 951567
we are working on v1.3 for memory shrink.
I think v1.3 is needed.
njn, how hard/risky does that look to land on 1.3?
Flags: needinfo?(n.nethercote)
> njn, how hard/risky does that look to land on 1.3?

Should be easy.

The risk is pretty low;  the main risk is that the format of the relevant files might change between kernel versions.  E.g. an early version of the patch worked ok on a later kernel (3.18?) but caused an infinite loop on an earlier kernel (3.3?) because of this.  But I wouldn't expect the file formats to change often, and it'll be obvious if there are any problems on whatever kernel a particular device is using.
Flags: needinfo?(n.nethercote)
Comment on attachment 8342135 [details] [diff] [review]
(part 1) - Remove about:memory's is-a-sentence description check.

[Approval Request Comment]
We need these patches on 1.3 to investigate system wide memory usage on low memory devices. njn is confident that this is low risk:
https://bugzilla.mozilla.org/show_bug.cgi?id=945973#c45
Attachment #8342135 - Flags: approval-mozilla-aurora?
Comment on attachment 8342139 [details] [diff] [review]
(part 2) - Don't complain if there aren't any "explicit" reports fora process.

[Approval Request Comment]
We need these patches on 1.3 to investigate system wide memory usage on low memory devices. njn is confident that this is low risk:
https://bugzilla.mozilla.org/show_bug.cgi?id=945973#c45
Attachment #8342139 - Flags: approval-mozilla-aurora?
Comment on attachment 8342140 [details] [diff] [review]
(part 3) - Fix some trivial reporter/report confusions in aboutMemory.js.

[Approval Request Comment]
We need these patches on 1.3 to investigate system wide memory usage on low memory devices. njn is confident that this is low risk:
https://bugzilla.mozilla.org/show_bug.cgi?id=945973#c45
Attachment #8342140 - Flags: approval-mozilla-aurora?
Comment on attachment 8345668 [details] [diff] [review]
(part 4) - Add SystemMemoryReporter, which presents Linux-specific, system-wide memory measurements taken from the OS.

[Approval Request Comment]
We need these patches on 1.3 to investigate system wide memory usage on low memory devices. njn is confident that this is low risk:
https://bugzilla.mozilla.org/show_bug.cgi?id=945973#c45
Attachment #8345668 - Flags: approval-mozilla-aurora?
(In reply to Nicholas Nethercote [:njn] (PTO until Jan 2, 2014) from comment #45)
> > njn, how hard/risky does that look to land on 1.3?
> 
> Should be easy.
> 
> The risk is pretty low;  the main risk is that the format of the relevant
> files might change between kernel versions.  E.g. an early version of the
> patch worked ok on a later kernel (3.18?) but caused an infinite loop on an
> earlier kernel (3.3?) because of this.  But I wouldn't expect the file
> formats to change often, and it'll be obvious if there are any problems on
> whatever kernel a particular device is using.
Hi Njn and Fabrice, our low memory device kernel version is 3.08.
> Hi Njn and Fabrice, our low memory device kernel version is 3.08.

The patch should work with that version of the kernel.  If somebody could test to confirm (just get memory reports as usual and look for the "System" section), that would be helpful.
(In reply to Nicholas Nethercote [:njn] from comment #51)
> > Hi Njn and Fabrice, our low memory device kernel version is 3.08.
> 
> The patch should work with that version of the kernel.  If somebody could
> test to confirm (just get memory reports as usual and look for the "System"
> section), that would be helpful.

If these patches can be applied on v1.3 branch, we can test it and confirm.
Cheng, please help njn to do it.
Flags: needinfo?(n.nethercote)
Flags: needinfo?(cheng.luo)
I don't have a B2G development device suitable for testing the patch.  I got jld to test the patch on a B2G device before I landed it on mozilla-central.
Flags: needinfo?(n.nethercote)
I built on 1.3 and it works, but my device is not a tarako device and has a different kernel version.
(In reply to Fabrice Desré [:fabrice] from comment #54)
> I built on 1.3 and it works, but my device is not a tarako device and has a
> different kernel version.

Could I get you patch for 1.3.(the patch is best to git format)
I could help you to verify.
Flags: needinfo?(cheng.luo)
Attached patch memreport.patchSplinter Review
Here's a patch that applies on a 1.3 tree.
Attached file about-memory-0.tar.gz
After apply the patch on v1.3, this is the reslut of get_about_memory.py on fugu 128M device.But the Firefox Nightly(29.0a1 2013-12-15) could not paser memory-reports correctly.
Error Output:
“Invalid memory report(s): non-sentence other description: mem/processes/process(\init, pid=1)/other-files/init/[r-xp], /init [r-xp]”

Maybe I will try patch in Bug 917646?
A more recent (e.g. Nightly) build of Firefox will read that file without problem.  In fact, it's patch 1 of this bug that removes that check.  I will nominate that patch for backporting to Aurora and Beta.
(In reply to Nicholas Nethercote [:njn] from comment #58)
yes,I could read the parsed result with the recent version
thanks
For what it's worth, the geeksphone keon that I tested on also has a 3.0.8-based kernel (and I think all the ICS devices are based on the 3.0 branch, except the emulator which is older) — but all of these kernels have been patched to some extent by the hardware manufacturers in question, so in general the base version number may not be a clear indication.
Attachment #8355346 - Attachment is obsolete: true
Comment on attachment 8355346 [details] [diff] [review]
part 1, for backporting to Aurora

[Approval Request Comment]

Bug caused by (feature/regressing bug #): this bug.

User impact if declined: memory reports created in FF29 and B2G 1.3 won't load in FF28 and FF27.

Testing completed (on m-c, etc.): patch has been in m-c for weeks.  I also tested locally on mozilla-aurora.

Risk to taking this patch (and alternatives if risky): miniscule; it just removes a check.

String or IDL/UUID changes made by this patch: none.
Attachment #8355346 - Attachment is obsolete: false
Attachment #8355346 - Flags: review+
Attachment #8355346 - Flags: approval-mozilla-aurora?
Comment on attachment 8355348 [details] [diff] [review]
part 1, for backporting to Beta

[Approval Request Comment]
See comment 63. The rationale is the same as for Aurora, though the patch is distinct from the Aurora one due to trivial differences in code surrounding the changes.
Attachment #8355348 - Flags: review+
Attachment #8355348 - Flags: approval-mozilla-beta?
Attachment #8355348 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8355346 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8345668 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8342140 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8342139 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8342135 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8342135 [details] [diff] [review]
(part 1) - Remove about:memory's is-a-sentence description check.

Four patches were marked approval-mozilla-aurora+ that should not have been.  I'll remove the flags.
Attachment #8342135 - Flags: approval-mozilla-aurora+
Attachment #8342139 - Flags: approval-mozilla-aurora+
Attachment #8342140 - Flags: approval-mozilla-aurora+
Attachment #8345668 - Flags: approval-mozilla-aurora+
Is this patch land on v1.3? We need it for tarako.
Flags: needinfo?(styang)
Whiteboard: [c=memory p= s= u=] [MemShrink:P2] → [c=memory p= s= u=] [MemShrink:P2][tarako]
(In reply to James Zhang from comment #67)
> Is this patch land on v1.3? We need it for tarako.

Yes, they landed on aurora which is current 1.3
Flags: needinfo?(styang)
Hi Steven,

   memreport.patch hasn't landed on v1.3, so we can't get about memory on tarako now.
Flags: needinfo?(styang)
blocking-b2g: --- → 1.3+
Flags: needinfo?(styang)
(In reply to cheng.luo from comment #57)
> Created attachment 8355135 [details]
> about-memory-0.tar.gz
> 
The patch was OK on my build version 20140101。
But I try it now,it does not work well.

the error output:
Got 2/3 files.We've waited 120s but the only relevant files we see are
  /data/local/tmp/memory-reports/memory-report-1325376304-549.json.gz
  /data/local/tmp/memory-reports/memory-report-1325376304-85.json.gz
We expected 3 but see only 2 files.  Giving up...
Traceback (most recent call last):
  File "./get_about_memory.py", line 289, in <module>
    get_and_show_info(args)
  File "./get_about_memory.py", line 182, in get_and_show_info
    (out_dir, merged_reports_path, dmd_files) = get_dumps(args)
  File "./get_about_memory.py", line 179, in get_dumps
    return utils.run_and_delete_dir_on_exception(do_work, out_dir)
  File "/home/cheng/Mozilla/6821a/B2G/tools/include/device_utils.py", line 147, in run_and_delete_dir_on_exception
    return fun()
  File "./get_about_memory.py", line 164, in do_work
    optional_outfiles_prefixes=['dmd-'])
  File "/home/cheng/Mozilla/6821a/B2G/tools/include/device_utils.py", line 218, in notify_and_pull_files
    _wait_for_remote_files(outfiles_prefixes, num_expected_files, old_files)
  File "/home/cheng/Mozilla/6821a/B2G/tools/include/device_utils.py", line 300, in _wait_for_remote_files
    raise Exception("Unable to pull some files.")
Exception: Unable to pull some files.
> We expected 3 but see only 2 files.  Giving up...

Sounds like bug 948774.
status-b2g-v1.3T: fixed in tarako branch, remove [tarako] in whiteboard
Whiteboard: [c=memory p= s= u=] [MemShrink:P2][tarako] → [c=memory p= s= u=] [MemShrink:P2]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: