Open Bug 662814 Opened 13 years ago Updated 2 years ago

Add a low overhead, always-on, system-wide event log

Categories

(Core :: General, defect)

defect

Tracking

()

ASSIGNED

People

(Reporter: jseward, Assigned: jseward)

References

Details

Attachments

(1 file, 4 obsolete files)

Add a simple logging facility to record events which are known to have
a significant impact on memory usage.  Nothing heavyweight, just one
line per event, timestamped, plus brief relevant stats.  For example:

    GC started / ended, total size N, reclaimed M bytes
    GC mapped in new heap / unmapped heap
    CC started / ended, total size N, reclaimed M bytes
    image discard ran
    jemalloc mmap’d more pages / munmapped some pages
    nanojit / mjit mapped / unmapped code pages

Perhaps with some indirectly relevant events such as

    downloaded another chunk of the phishing database
    no user input seen for 17 minutes
    new tab created; now there are 23 of them
    user requested about:memory (+ what it produced)
    extension Xyzzy loaded/initialised

That way, we’d at least have some information about the space history
leading up to a situation where a user says "urk!  massive memory
leak!".

The log to be put into a small in-ram circular buffer, which is shown
on an about:memlog or some such page.  No file I/O involved.

A bit more background is at
http://blog.mozilla.com/jseward/2011/05/26/a-low-overhead-always-on-system-wide-memory-event-log
I think this is great, but I don't think it should be restricted to memory events.  Although memory events will be a big part of it, let's not box ourselves into a corner by calling it "about:memorylog" or somesuch;  keep it to a more neutral name like "about:log".
Summary: Add a low overhead, always-on, system-wide memory event log → Add a low overhead, always-on, system-wide event log
Whiteboard: [MemShrink:P1]
Attached patch Initial implementation (WIP) (obsolete) — Splinter Review
Initial implementation, probably Linux only (OSX might work).
Currently receives events only from jemalloc mmap/munmap and from
ScopedGfxFeatureReporter::WriteAppNote.  Sample output below.  Output
appears in about:memlog, using a 16KB circular buffer.

Suggestions for further events to hook up happily received.  Next is
to hook up the non-malloc code allocation events done by the various
JITs.  Also improve the timestamping (first column below), which is
currently the number of wallclock seconds since the first line was
output.



0 :JEM: mmap:   3145728 size, 76 MB on loan
0 :JEM: munmap: 3145728 size, 73 MB on loan
0 :JEM: mmap:   4194304 size, 77 MB on loan
0 :JEM: munmap: 4194304 size, 73 MB on loan
0 :JEM: mmap:   3145728 size, 76 MB on loan
0 :JEM: mmap:   3145728 size, 79 MB on loan
0 :JEM: mmap:   1048576 size, 80 MB on loan
0 :JEM: mmap:   1048576 size, 81 MB on loan
0 :JEM: mmap:   1048576 size, 82 MB on loan
0 :JEM: mmap:   1048576 size, 83 MB on loan
0 :JEM: mmap:   2097152 size, 85 MB on loan
0 :JEM: mmap:   3145728 size, 89 MB on loan
0 :GFX: WriteAppNote: GL Context? 
0 :GFX: WriteAppNote: GL Context+
0 :GFX: WriteAppNote: WebGL+
1 :JEM: mmap:   1048576 size, 90 MB on loan
1 :JEM: mmap:   1048576 size, 91 MB on loan
1 :JEM: mmap:   1048576 size, 92 MB on loan
Comment on attachment 538483 [details] [diff] [review]
Initial implementation (WIP)

Review of attachment 538483 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozalloc/memlog.c
@@ +1094,5 @@
> +}
> +
> +
> +// END standalone, non-malloc-using printf facility copied
> +// from nsprpub/pr/src/io/prprf.c

Can you just #include that file directly?

::: netwerk/protocol/about/nsAboutMemlog.cpp
@@ +35,5 @@
> + * the terms of any one of the MPL, the GPL or the LGPL.
> + *
> + * ***** END LICENSE BLOCK ***** */
> +
> +#include "nsAboutMemlog.h"

You should really make an xpcom interface for adding entries to the log so that JS code can add entries.  That would also allow you to rewrite this module in JS, which would make it a lot simpler :)
OS: Linux → All
Hardware: x86_64 → All
Attached patch Improved version (WIP) (obsolete) — Splinter Review
* rebase to today

* move logging code into its own tier_base module, so it is
  independent of jemalloc.  It is built into a DSO that depends on
  nothing but libc and then not even much of that.

* log now has human readable timestamps ("2011 Mon Jul 04 19:59:09")

* improved formatting and cumulative numbers for jemalloc-mapped,
  mjit-code and tjit-code entries

* next to-do: get decent information for GC and CC activities instead
  of placeholder entries
Attachment #538483 - Attachment is obsolete: true
Comment on attachment 543809 [details] [diff] [review]
Improved version (WIP)

Review of attachment 543809 [details] [diff] [review]:
-----------------------------------------------------------------

Some more drive-by comments.  I stand by my unaddressed previous comments, too :)

::: js/src/assembler/jit/ExecutableAllocator.cpp
@@ +31,5 @@
>  
>  size_t ExecutableAllocator::pageSize = 0;
>  size_t ExecutableAllocator::largeAllocSize = 0;
>  
> +// This should really be word-sized.

So why isn't it?

@@ +32,5 @@
>  size_t ExecutableAllocator::pageSize = 0;
>  size_t ExecutableAllocator::largeAllocSize = 0;
>  
> +// This should really be word-sized.
> +PRUint32 mjit_code_bytes_on_loan = 0;

camelCaps is preferred style.

::: js/src/assembler/jit/ExecutableAllocator.h
@@ +213,5 @@
>      }
>  
>      void releasePoolPages(ExecutablePool *pool) {
>          JS_ASSERT(pool->m_allocation.pages);
> +        JS_ATOMIC_ADD(&mjit_code_bytes_on_loan, -(PRInt32)pool->m_allocation.size);

Hmm, with this we have dual implementations -- memlog uses a counter to track mjit code size, and about:memory uses iteration to get mjit code size.

I prefer the iteration style because the counter style is prone to bitrotting, but if the mjit code size is really important for memlog, we should get rid of the iteration and change the memory reporter to use this counter.

::: js/src/jstracer.cpp
@@ +231,5 @@
> +
> +/* Implement embedder-specific nanojit members for code allocation. */
> +
> +// This should really be word-sized.
> +static PRUint32 tjit_code_bytes_on_loan = 0;

camelCaps again.

@@ +243,5 @@
> +                             nbytes,
> +                             MEM_COMMIT | MEM_RESERVE,
> +                             PAGE_EXECUTE_READWRITE);
> +    JSRuntime* rt = vmca->mRt;
> +    if (rt && rt->memlogCallback) {

Do you need to test |rt|?  If there's no JSRuntime we have big problems.  Likewise below in a number of places.

@@ +249,5 @@
> +            rt->memlogCallback(
> +                "(%lu MB on loan)  acq %lu  by nanojit::CodeAlloc::allocCodeChunk (FAILED)\n",
> +                (size_t)tjit_code_bytes_on_loan / 1000000, nbytes);
> +        } else {
> +            JS_ATOMIC_ADD(&code_bytes_on_loan, nbytes);

Same story with the tracing vs. counting style of accounting here.

::: memlog/memlog.c
@@ +1,1 @@
> +

You obviously need the legal boilerplate stuff here.

Is it worth a comment explaining why this file is C, not C++?

@@ +120,5 @@
> +** WARNING: This code may *NOT* call PR_LOG (because PR_LOG calls it)
> +*/
> +
> +/*
> +** XXX This needs to be internationalized!

Does it?  Maybe if URLs are embedded in the memlog?

::: memlog/memlog.h
@@ +19,5 @@
> +
> +/* Add a message to the memory log. */
> +extern
> +__attribute__ ((visibility ("default")))
> +void memlog(MemlogCaller, const char* format, ...);

The names of all these functions will undoubtedly need NS prefixes of some kind, and camelCaps too.
If you want to add cycle collector information, probably the best place to place a logging call is in nsCycleCollector::CleanupAfterCollection().  There's a number of things that we've added for Telemetry that may or may be more detail than you want.

interval: how long the CC took
mVisitedRefCounted: the number of DOM objects the CC visited
mVisitedGCed: the number of JS objects the CC visited
mWhiteNodeCount: the number of DOM nodes the CC freed

mVisitedRefCounted+mVisitedGCed is the total number of objects the CC visited, and gives a rough idea of how much work the CC did, in a way that does vary depending on how fast or slow the computer is.

In your blog, you said you would like to log how much memory the CC freed, but unfortunately that is not possible right now.  The white node count gives a rough idea, but there's no way to tell if a freed object contains a huge chunk of data or not.  Mounir is doing some work to improve that for about:memory in Bug 663271, but we still probably won't be able to figure out how much memory is being freed by the CC in a zero-overhead way.
(In reply to comment #3)

> > +// END standalone, non-malloc-using printf facility copied
> > +// from nsprpub/pr/src/io/prprf.c
> 
> Can you just #include that file directly?

Not in any obvious way.  memlog.c (the main log function) needs to be
callable from the depths of jemalloc, whilst it holds a lock.  Any
dynamic memory allocation memlog.c then does leads back into jemalloc,
and deadlocks trying to acquire the lock for a second time.  I tried
it to see what would happen.

That means memlog.c can't do dynamic memory allocation, which is a
severely annoying limitation.  The sprintf implementation in prprf.c
does however do that, and the copied version is a cut down one which
removes such paths, and has limited functionality as a result,
primarily no printing of floats.


> > +#include "nsAboutMemlog.h"
> 
> You should really make an xpcom interface for adding entries to the log so
> that JS code can add entries.

Yes.  I'll get to that.


> That would also allow you to rewrite this
> module in JS, which would make it a lot simpler :)

Wouldn't that lose the guaranteed-won't-deadlock property?
(In reply to comment #6)
> If you want to add cycle collector information, [...]

Thanks for the info.  I added printing which makes CC logging 
appear like so:

2011 Wed Jul 06 12:55:03  :CC :  nsCycleCollectorRunner::Collect
2011 Wed Jul 06 12:55:03  :JEM:  (166 MB on loan)  acq 2097152  by mmap
2011 Wed Jul 06 12:55:03  :JEM:  (167 MB on loan)  acq 1048576  by mmap
2011 Wed Jul 06 12:55:03  :JEM:  (166 MB on loan)  rel 1048576  by munmap
2011 Wed Jul 06 12:55:03  :JEM:  (167 MB on loan)  acq 1048576  by mmap
2011 Wed Jul 06 12:55:03  :JEM:  (166 MB on loan)  rel 1048576  by munmap
2011 Wed Jul 06 12:55:03  :JEM:  (168 MB on loan)  acq 2097152  by mmap
2011 Wed Jul 06 12:55:03  :JEM:  (166 MB on loan)  rel 2097152  by munmap
2011 Wed Jul 06 12:55:03  :JEM:  (167 MB on loan)  acq 1048576  by mmap
2011 Wed Jul 06 12:55:03  :JEM:  (166 MB on loan)  rel 1048576  by munmap
2011 Wed Jul 06 12:55:03  :JEM:  (164 MB on loan)  rel 2097152  by munmap
2011 Wed Jul 06 12:55:03  :CC :  nsCycleCollector::CleanupAfterCollection
2011 Wed Jul 06 12:55:03  :CC :    32 milliseconds total time
2011 Wed Jul 06 12:55:03  :CC :    11926 DOM objects visited
2011 Wed Jul 06 12:55:03  :CC :    766 DOM nodes freed
2011 Wed Jul 06 12:55:03  :CC :    19792 JS objects visited

We see it start, then jemalloc both unmaps and maps space, the latter
presumably to hold CC's administrative structures during the
collection (right?)
(In reply to comment #5)
Thanks for the reviewing, btw.  Here's some comments.  I'll fix the
stylistic things in the next patch iteration.


> >  size_t ExecutableAllocator::pageSize = 0;
> >  size_t ExecutableAllocator::largeAllocSize = 0;
> >  
> > +// This should really be word-sized.
> 
> So why isn't it?

Because I'm using JS_ATOMIC_ADD to change the counters without having
to worry about locking.  Problem is it is a wrapper around
PR_AtomicAdd and that only operates on 32 bit values.  Options are:

(1) add a word sized version of PR_AtomicAdd
(2) accept non-atomic updating of the counters
(3) live with 32-bit counters
(4) as per discussion below, get this information by traversal instead


> >      void releasePoolPages(ExecutablePool *pool) {
> >          JS_ASSERT(pool->m_allocation.pages);
> > +        JS_ATOMIC_ADD(&mjit_code_bytes_on_loan, -(PRInt32)pool->m_allocation.size);
> 
> Hmm, with this we have dual implementations -- memlog uses a counter to
> track mjit code size, and about:memory uses iteration to get mjit code size.
> 
> I prefer the iteration style because the counter style is prone to
> bitrotting,

I agree with both objections, but I can't see how to make it work
without ending up reading freed memory in at least some cases.

eg.  In xpcjsruntime.cpp I added a function MemlogCallback which is
exported into the JSRuntime, so SM can call it for logging.  When SM
unmaps mjit code space, it calls MemlogCallback to log that fact.  In
order to show how much code space now remains, the MemlogCallback
could call GetJSMjitCode, but if it does, it will wind up reading freed
memory.

In short, some of these freeing operations (mjit code, tjit code)
involve both unmapping space and updating administrative structures.
Traversing those structures to collect stats in the midst of updating
them, leads to bad effects.

I guess it might be possible to defer the logging calls when (eg)
freeing mjit code long enough to allow the administrative updating to
complete.


> Is it worth a comment explaining why this file is C, not C++?

Originally it was part of libmozalloc, and also it contains a bunch
of code copied from C files in nsprpub.  Now it's standalone, it could
be made into C++, although I don't think there's much point.
(In reply to comment #8)
> We see it start, then jemalloc both unmaps and maps space, the latter
> presumably to hold CC's administrative structures during the
> collection (right?)
Yes, the CC creates an abstract copy of a portion of the heap, along with a few other things.
Attached patch wip, next version (obsolete) — Splinter Review
* buildable on Windows as well as Linux
* added VirtualAlloc/Free logging equivalent to existing mmap/munmap
  logging
* addresses many review comments above
Attachment #543809 - Attachment is obsolete: true
Attached patch patch for review (obsolete) — Splinter Review
What this provides:

* a low level in-memory logging facility, callable from anywhere,
  intended to be used to log events which have a significant effect on
  memory consumption.

* the log can be viewed as about:eventlog.

* logging calls for: jemalloc map/unmap, mjit and tjit code
  allocation, GC and CC runs, and from
  ScopedGfxFeatureReporter::WriteAppNote.


Implementation summary:

* new tier_base module, "eventlog/", containing low level routines --
  a simple circular buffer, a routine to sprintf into it, and a
  routine to copy it out, so it can be displayed.

  Is tier_base in order that it can be called from anywhere, in
  particular from inside jemalloc, without causing new inter-module
  dependencies (which would be circular).

  The fact that it must be callable from jemalloc complicates what
  would otherwise be something trivial.  It means this module can't do
  any dynamic memory allocation, since that would lead back into
  jemalloc and causes a deadlock (I tried it).

  That in turn means it can't use the system's sprintf since that
  could do dynamic memory allocation, nor can it use ctime to generate
  timestamps.  Hence it carries its own, cut-down, guaranteed-safe
  versions of those, copied from NSPR.

* new files
  toolkit/components/eventlog/content/eventlog.{js,css,xhtml}
  to make up the page "about:eventlog"

* xpcom/base/nsEventLog.{h,cpp} and xpcom/base/nsIEventLog.idl to
  provide support in JS-land, both for fetching the contents of the
  log buffer and for adding entries to it.

* a bit of refactoring of Nanojit code allocation, moving code out
  of js/src/nanojit/avmplus.cpp into jstracer.cpp, where it more
  logically belongs.


Limitations/queries re current implementation, some of which could be
addressed in followup patches:

* VirtualAlloc/Free hookups in jemalloc not working on Windows (I
  didn't manage to do a Windows build with jemalloc enabled).  I added
  relevant bits of code but I'm sure they didn't get compiled or
  tested.

* 32 bit only counters for tjit-code/mjit-code/jemalloc-mmaped running
  totals.  Need portable word-sized atomics to do this right, but NSPR
  doesn't offer those.  I could add them.

* Some of the logged lines are quite long, so they stick out over the
  RH side of the white box created by global/skin/about.css.  Not sure
  what to do about that.

* Timestamps are GMT only, since finding the timezone offset requires
  doing libc calls that could do dynamic allocation (see comments
  above).  This could be fixed; I have a plan.

* xpcom/base/nsEventLog.{h,cpp} has 
    nsresult
    nsEventLogConstructor(nsISupports* aOuter,
                          const nsIID& aIID,
                          void* *aInstancePtr)
  which is copied from nsCycleCollector.{h,cpp} (w/ renaming of
  course).  I have no idea if it is correct/appropriate, but it
  wouldn't build/link without it.  This needs to be checked by
  someone who understands XPCOM world.
  (is referred to by xpcom/build/XPCOMModule.inc)

* Buffer size is 16k, which is probably too small.  64k would be
  better.

* Is built as a DSO.  This guarantees that there is only one instance
  of it in the process and so the circular buffer it holds, which is a
  global variable, exists only once in the process.  (That's a correct
  understanding, is it?)

* Copyright notices on new files OK ?
Attachment #544751 - Attachment is obsolete: true
I'll look at this in detail tomorrow, but one thing that leapt out at me...

> * VirtualAlloc/Free hookups in jemalloc not working on Windows (I
>   didn't manage to do a Windows build with jemalloc enabled).  I added
>   relevant bits of code but I'm sure they didn't get compiled or
>   tested.

Do a Windows opt try server run and then download the compiled binary to test it.
(In reply to comment #12)
> * Copyright notices on new files OK ?

The notices should have the year 2011, not 2009.  The line listing you as a contributor should indented 2 spaces, not 1. (see http://www.mozilla.org/MPL/boilerplate-1.1/ )
I tried to build with the patch applied on Linux64 and got a link error:

/usr/bin/ld.gold.real: ../../dist/lib/libjemalloc.a(jemalloc.o): in function my_mmap:/home/njn/moz/mi4/memory/jemalloc/jemalloc.c:376: error: undefined reference to 'eventLog'
/usr/bin/ld.gold.real: ../../dist/lib/libjemalloc.a(jemalloc.o): in function my_munmap:/home/njn/moz/mi4/memory/jemalloc/jemalloc.c:388: error: undefined reference to 'eventLog'
collect2: ld returned 1 exit status
Comment on attachment 547372 [details] [diff] [review]
patch for review

Review of attachment 547372 [details] [diff] [review]:
-----------------------------------------------------------------

I have no idea how this is going to work with e10s.  Already in Fennec no info from the content process will be shown in about:eventlog, AFAICT.  That can be a follow-up, though.

::: eventlog/eventlog.h
@@ +64,5 @@
> +    enum { EL_Jemalloc,  /* jemalloc */
> +           EL_JS,        /* JS engine */
> +           EL_CC,        /* cycle collector */
> +           EL_Gfx,       /* gfx/ */
> +           EL_Misc       /* misc */

I'm guessing these should have names like NS_EL_JEMALLOC, NS_EL_JS, etc.  That seems to be the standard Gecko style.  And maybe NS_EVENTLOG_JEMALLOC would be better.

(But see at the very bottom about moving these into nsIEventLog.)

@@ +66,5 @@
> +           EL_CC,        /* cycle collector */
> +           EL_Gfx,       /* gfx/ */
> +           EL_Misc       /* misc */
> +    }
> +    EventLogCaller;

NS_EventLogCaller?  Not sure.

@@ +70,5 @@
> +    EventLogCaller;
> +
> +/* Add a message to the event log. */
> +EVENTLOG_EXPORT
> +void eventLog(EventLogCaller, const char* format, ...);

I suspect this should be named NS_EventLog() or similar.  Likewise for the other functions in this header.

::: js/src/assembler/jit/ExecutableAllocator.h
@@ +282,5 @@
> +
> +        JS_ATOMIC_ADD(&mjitCodeBytesOnLoan, allocSize);
> +        if (eventLog)
> +            eventLog("(%lu KB on loan)  acq %lu  by mjit systemAlloc\n",
> +                     (size_t)mjitCodeBytesOnLoan / 1000, (size_t)allocSize);

Dividing by 1000 when writing "KB" is correct, but not Mozilla policy :(  See bug 106618 for far more than you want to know about this.

However, I'd be inclined to show the number of bytes instead, like about:memory?verbose does.  (I personally think in bytes and MB, not KB, but MB would be too coarse here.)  Does eventLog allow you to print numbers with commas separating each group of 3 digits?  I sure hope so, byte count will be impossible to read otherwise.  (This obviously applies everywhere.)

::: js/src/nanojit/avmplus.cpp
@@ -78,5 @@
> -nanojit::CodeAlloc::allocCodeChunk(size_t nbytes) {
> -    return VirtualAlloc(NULL,
> -                        nbytes,
> -                        MEM_COMMIT | MEM_RESERVE,
> -                        PAGE_EXECUTE_READWRITE);

The nanojit parts will have to be landed separately to nanojit-central and then merged.  I can do that for you when it's ready.

::: memory/jemalloc/jemalloc.c
@@ +374,5 @@
> +		mappedOnLoan += size;
> +	}
> +	eventLog(EL_Jemalloc,
> +		 "(%lu MB on loan)  acq %lu  by %s%s\n",
> +		 mappedOnLoan / 1000000, size,

Bytes vs. MB and 1000 vs. 1024 here.  Again, I'd just go with bytes, so long as the numbers are comma'd.

::: toolkit/components/eventlog/content/eventLog.css
@@ +34,5 @@
> + * the terms of any one of the MPL, the GPL or the LGPL.
> + *
> + * ***** END LICENSE BLOCK ***** */
> +
> +.mrValue {

Looks like you've cut and pasted aboutMemory.css completely here.  Please remove the rules you aren't using!

You mentioned that the log can poke past the right-hand side of the enclosing box.  I wonder if about:eventlog should just be pure plaintext?

The eventLog.{css,js,xhtml} files should probably be called aboutEventLog.{css,js,xhtml} for consistency with other about: pages.

::: toolkit/components/eventlog/content/eventLog.js
@@ +63,5 @@
> +  // First, clear the page contents.  Necessary because update() might be
> +  // called more than once due to ChildMemoryListener.
> +  var content = $("content");
> +  content.parentNode.replaceChild(content.cloneNode(false), content);
> +  content = $("content");

This can be removed -- you've cut and pasted it from aboutMemory.js, but you don't have ChildMemoryListener.

::: toolkit/components/eventlog/content/eventLog.xhtml
@@ +45,5 @@
> +    <script type="text/javascript" src="chrome://global/content/eventLog.js"/>
> +  </head>
> +
> +  <!-- No newline before the div element!  This avoids extraneous spaces when
> +       pasting the entire output after selecting it with Ctrl-a. -->

Is that true here?

::: xpcom/base/nsEventLog.cpp
@@ +68,5 @@
> +    // regard for line boundaries.  Dumping it verbatim gives a
> +    // partial first line.  Hence try to find the start of the
> +    // next line.
> +    size_t ix_linestart = 0;
> +    while (1) {

|while (true)| is nicer.

::: xpcom/base/nsIEventLog.idl
@@ +47,5 @@
> +    /* Add an entry to the log.  It would be nice to have the caller
> +       supply an EventLogCaller value too, but there doesn't seem to
> +       be a clean way to convey C++ enum types between the two worlds.
> +       Hence this routine uses the value EL_Misc for all strings
> +       passed here. */

You're right about the enums, so the standard workaround is to define const integers here.  Eg. see the UNITS_* constants in nsIMemoryReporter.  Both JS and C++ code can see constants defined this way.
(In reply to comment #16)
> Comment on attachment 547372 [details] [diff] [review] [review]
> patch for review
> 
> Review of attachment 547372 [details] [diff] [review] [review]:
> -----------------------------------------------------------------
> 
> I have no idea how this is going to work with e10s.  Already in Fennec no
> info from the content process will be shown in about:eventlog, AFAICT.  That
> can be a follow-up, though.
> 
> ::: eventlog/eventlog.h
> @@ +64,5 @@
> > +    enum { EL_Jemalloc,  /* jemalloc */
> > +           EL_JS,        /* JS engine */
> > +           EL_CC,        /* cycle collector */
> > +           EL_Gfx,       /* gfx/ */
> > +           EL_Misc       /* misc */
> 
> I'm guessing these should have names like NS_EL_JEMALLOC, NS_EL_JS, etc. 
> That seems to be the standard Gecko style.  And maybe NS_EVENTLOG_JEMALLOC
> would be better.
> 
> (But see at the very bottom about moving these into nsIEventLog.)
> 
> @@ +66,5 @@
> > +           EL_CC,        /* cycle collector */
> > +           EL_Gfx,       /* gfx/ */
> > +           EL_Misc       /* misc */
> > +    }
> > +    EventLogCaller;
> 
> NS_EventLogCaller?  Not sure.

Please don't perpetuate the whole NS nonsense. The new hotness is to wrap things in a mozilla namespace, possibly even mozilla::EventLog.
(In reply to comment #15)
> I tried to build with the patch applied on Linux64 and got a link error:
> 
> /usr/bin/ld.gold.real: ../../dist/lib/libjemalloc.a(jemalloc.o): in function
> my_mmap:/home/njn/moz/mi4/memory/jemalloc/jemalloc.c:376: error: undefined
> reference to 'eventLog'
> /usr/bin/ld.gold.real: ../../dist/lib/libjemalloc.a(jemalloc.o): in function
> my_munmap:/home/njn/moz/mi4/memory/jemalloc/jemalloc.c:388: error: undefined
> reference to 'eventLog'
> collect2: ld returned 1 exit status

That's really not surprising: jemalloc is linked in the browser binary (firefox), and the eventLog is in a separate library, if I am to believe comment 12. The problem is that the firefox binary simply can't be linked against a non system DSO... (or at least, i'd advise against it, it requires an rpath or a wrapper script that we finally got rid of)
(In reply to comment #18)
> (In reply to comment #15)
> > I tried to build with the patch applied on Linux64 and got a link error:
> > 
> > /usr/bin/ld.gold.real: ../../dist/lib/libjemalloc.a(jemalloc.o): in function
> > my_mmap:/home/njn/moz/mi4/memory/jemalloc/jemalloc.c:376: error: undefined
> > reference to 'eventLog'

Urr, I am surprised and disappointed to hear that.  I was using
ld.traditional on 64-bit Ubuntu 10.04 and never saw any such failure,
and it worked, and it also worked with MSVC 2008 on WinXP.

> That's really not surprising: jemalloc is linked in the browser binary
> (firefox), and the eventLog is in a separate library, if I am to believe
> comment 12.

Yes, that's true.

> The problem is that the firefox binary simply can't be linked
> against a non system DSO... (or at least, i'd advise against it, it requires
> an rpath or a wrapper script that we finally got rid of)

So then the thing to do is to link the eventLog into the browser
binary, in the same way as jemalloc is?

These linking constraints are non-obvious and not "standard", eg on my
linux box, /usr/bin/kmail is linked against a huge stack of non system
DSOs.  Are the constraints documented anywhere?

Can we get this straightened out definitively before I embark on
another round of build-system mortal-kombat?  I did it as a DSO
because I was advised that this was a good way to make the
functionality available everywhere, without inducing new cyclic
inter-module dependencies.
(In reply to comment #19)
> So then the thing to do is to link the eventLog into the browser
> binary, in the same way as jemalloc is?

That very much depend how tied it is to the parts that are not in jemalloc (I haven't looked in detail)
 
> These linking constraints are non-obvious and not "standard", eg on my
> linux box, /usr/bin/kmail is linked against a huge stack of non system
> DSOs.  Are the constraints documented anywhere?

/usr/bin/kmail is linked against a huge stack of things the linker can find.
$somewhere/firefox must be linked against things what the linker can find, and the linker won't find stuff in $somewhere/ without a wrapper script or an rpath.
We used to have a wrapper script, but got rid of it at the same time we added preloading of our libraries (which is a non negligible startup perf improvement) - dynamically linking goes against preloading.

> Can we get this straightened out definitively before I embark on
> another round of build-system mortal-kombat?  I did it as a DSO
> because I was advised that this was a good way to make the
> functionality available everywhere, without inducing new cyclic
> inter-module dependencies.

The main problem for you is that different platforms have different ways to handle the situation. See how jemalloc is statically linked to the binary on linux, but provided as a DSO on OSX and windows (which effectively makes it non preloaded, btw). The core problem is that the library search path doesn't include the firefox directory by default on linux, while aiui, it does on OSX and windows. So all in all, maybe what we really want is to actually add an rpath such that the library search path *does* include the firefox directory by default on linux (hoping that the other unix systems that care do support $ORIGIN as an rpath).
Though, considering jemalloc is a central piece, why not bundle with jemalloc instead of having a separate lib ?
> Though, considering jemalloc is a central piece, why not bundle with
> jemalloc instead of having a separate lib ?

Because the logging facility then wouldn't be available on non-jemalloc
builds.
Another thing -- you should probably have at least one test for about:eventlog.  Not sure what it would do, maybe just check that it actually contains some sane-looking output.  You can use toolkit/components/aboutmemory/tests/chrome/test_aboutmemory.xul as a starting point.
Here's a revised version of the patch which gets around the link
failure reported in comment #15.  It also adds a $$ORIGIN rpath entry
to firefox-bin so it remains runnable without having to set
LD_LIBRARY_PATH.

I'm beginning to think this approach of building the logging
module as its own standalone tier_base module is flawed, and
needs revisiting.  A previous version of this patch (and maybe
this version) worked on Windows.  But a jemalloc enabled build
for Windows fails.  It tries to build the logger (libeventlog)
before libmozalloc, and the link fails when the build system
links libeventlog statically against libmozalloc.  This fails
because the CRT replacement involved in libmozalloc has not been
created yet.

This might be get-roundable with more build system magic, but it seems
to me to be adding fragility into a part of the system (the
build-system/jemalloc intersection) which is already complex.

Perhaps it would be better to scrap the tier_base module and put
the low level log support code in libxul.  This would have the
following advantages

(1) no build system hacks required

(2) the logger could use libc safely (see "Implementation
    summary" in comment #12) and so we could discard the cut-down
    copies of printf and ctime copies from NSPR

and the following disadvantage

(3) it wouldn't be callable from jemalloc: that would give xpcom
    and jemalloc a circular dependency, and lead to deadlocking
    as per comment #12.

(3) is serious, because we really want to have mmaps /
VirtualAllocs done by jemalloc to be shown in the log.

The only solution I can think of is to have jemalloc accumulate
mmap/VirtualAlloc log entries in its own circular buffer, and
periodically pull those out into the main log buffer.  This
effectively means polling jemalloc from xul.  Which is ungood,
would require driving it from a timer or the top level event
loop, and would have ramifications w.r.t. sleepingness on mobile
devices.

Suggestions, comments?  I'm kinda stuck with this now.
Attachment #547372 - Attachment is obsolete: true
Did you try, as I mentioned on irc, to fold your log library with libjemalloc ?
So, with the constraints from this API, our build system and our various ways of linking things for the various platforms we target, I think the least worse thing to do would be to have a directory where we'd build a libmozutils library that:
 - includes jemalloc when jemalloc is enabled
 - includes the event log infrastructure from this bug
 - includes what currently is libmozutils on android when building for android
 - is a DSO on win/osx and a static lib on linux (i.e., is whatever jemalloc currently is)

I think doing so as a separate patch (i.e. without the event log infrastructure) first would be better.
An advantage of this approach is that whenever we'll need to add something else that has the same kind of constraints, it will be much easier to handle.
(In reply to comment #23)
> Another thing -- you should probably have at least one test for
> about:eventlog.  Not sure what it would do, maybe just check that it
> actually contains some sane-looking output.  You can use
> toolkit/components/aboutmemory/tests/chrome/test_aboutmemory.xul as a
> starting point.

Is it possible to use js-ctypes to open the event log library, call a function to add some output to the log, then load about:eventlog to check that it was added?
Depends on: 677501
I think this looks great! I do have one concern though, what would be the privacy implications of this?  Perhaps a "clear event log" could be added to the clear history menu?

I would hate it for a less experienced user to go into their history, delete various websites and feel secure while all along his/her activity has been diligently recorded in the log.

This seems like it could cause some trouble. Perhaps an automatic deletion of entries more then a few days old?  What do you guys think?
Julian, now that bug 677501 has landed do you think you'll be able to look at this again?  It sounds like it might solve your linking/dependency woes.
We agreed today that this wouldn't help that much with the memory problems we've had lately, so we're downgrading to a MemShrink:P2.
Whiteboard: [MemShrink:P1] → [MemShrink:P2]
Whiteboard: [MemShrink:P2]
This would be cool to have, but it doesn't seem to be blocking any particular MemShrinky thing, so I'm removing the tag.

Sorry, there was a problem with the detection of inactive users. I'm reverting the change.

Assignee: nobody → jseward
Status: NEW → ASSIGNED
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: