Open Bug 820133 Opened 12 years ago Updated 2 years ago

NewDMD reports that GL context created by gl::GLLibraryEGL::fCreateContext, called by CompositorParent::AllocPLayers doesn't report its memory

Categories

(Core :: Graphics, defect)

ARM
Gonk (Firefox OS)
defect

Tracking

()

People

(Reporter: justin.lebar+bug, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2])

In the main B2G process with the gallery app open (see attachment 690543 [details]), I see:

> Unreported: 1 block in block group 5 of 533
>  667,648 bytes (667,648 requested / 0 slop)
>  2.77% of the heap (24.81% cumulative);  5.45% of unreported (48.85% cumulative)
>  Allocated at
>        malloc /Users/jlebar/code/moz/ff-git/src/memory/build/replace_malloc.c:152 (0x4020b2ae libmozglue.so+0x42ae)
>        os_malloc_ext (0x42ba6134 libgsl.so+0x4134) (no addr2line)
>        rb_alloc_primitive_lists (0x432ec98c libGLESv2_adreno200.so+0x3a98c) (no addr2line)
>        rb_context_create (0x432eab12 libGLESv2_adreno200.so+0x38b12) (no addr2line)
>        gl2_context_create (0x432d45fa libGLESv2_adreno200.so+0x225fa) (no addr2line)
>        oglCreateContext (0x432d6944 libGLESv2_adreno200.so+0x24944) (no addr2line)
>        eglCreateClientApiContext (0x42a84a50 libEGL_adreno200.so+0x17a50) (no addr2line)
>        qeglDrvAPI_eglCreateContext (0x42a7e7ac libEGL_adreno200.so+0x117ac) (no addr2line)
>        eglCreateContext (0x42a72560 libEGL_adreno200.so+0x5560) (no addr2line)
>        eglCreateContext /Users/jlebar/code/moz/B2G/frameworks/base/opengl/libs/EGL/eglApi.cpp:504 (0x4016868c libEGL.so+0xc68c)
>        mozilla::gl::GLLibraryEGL::fCreateContext(void*, void*, void*, int const*) /Users/jlebar/code/moz/ff-git/src/gfx/gl/GLLibraryEGL.h:194 (0x419f42da libxul.so+0xacd2da)
>        mozilla::layers::LayerManagerOGL::CreateContext() /Users/jlebar/code/moz/ff-git/src/gfx/layers/opengl/LayerManagerOGL.cpp:470 (0x419d9d1c libxul.so+0xab2d1c)
>        nsRefPtr<mozilla::gl::GLContext> /Users/jlebar/code/moz/B2G/objdir-gecko/dist/include/nsAutoPtr.h:905 (0x419db126 libxul.so+0xab4126)
>        mozilla::layers::CompositorParent::AllocPLayers(mozilla::layers::LayersBackend const&, unsigned long long const&, mozilla::layers::LayersBackend*, int*) /Users/jlebar/code/moz/ff-git/src/gfx/layers/ipc/CompositorParent.cpp:1023 (0x419e5a5c libxul.so+0xabea5c)
>        mozilla::layers::PCompositorParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) /Users/jlebar/code/moz/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:590 (0x41893d90 libxul.so+0x96cd90)
>        mozilla::ipc::SyncChannel::OnDispatchMessage(IPC::Message const&) /Users/jlebar/code/moz/ff-git/src/ipc/glue/SyncChannel.cpp:147 (0x41868f70 libxul.so+0x941f70)
>        mozilla::ipc::RPCChannel::OnMaybeDequeueOne() /Users/jlebar/code/moz/ff-git/src/ipc/glue/RPCChannel.cpp:400 (0x41867b7c libxul.so+0x940b7c)
>        RunnableMethod<mozilla::layers::ImageContainerChild, void (mozilla::layers::ImageContainerChild::*)(), Tuple0>::Run() /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/task.h:308 (0x40b4854a libxul.so+0x92554a)
>        mozilla::ipc::RPCChannel::DequeueTask::Run() /Users/jlebar/code/moz/B2G/objdir-gecko/dist/include/mozilla/ipc/RPCChannel.h:448 (0x40b6253c libxul.so+0x93f53c)
>        MessageLoop::RunTask(Task*) /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:334 (0x40c85d90 libxul.so+0xa62d90)
>        MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:344 (0x40c86be6 libxul.so+0xa63be6)
>        MessageLoop::DoWork() /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:441 (0x40c877c4 libxul.so+0xa647c4)
>        base::MessagePumpDefault::Run(base::MessagePump::Delegate*) /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_pump_default.cc:24 (0x4198ba44 libxul.so+0xa64a44)
>        MessageLoop::RunInternal() /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:216 (0x40c85d40 libxul.so+0xa62d40)
Also

> Unreported: ~6 blocks in block group 13 of 533
>  ~118,784 bytes (~115,200 requested / ~3,584 slop)
>  0.49% of the heap (34.39% cumulative);  0.97% of unreported (67.69% cumulative)
>  Allocated at
>        calloc /Users/jlebar/code/moz/ff-git/src/memory/build/replace_malloc.c:182 (0x4020b208 libmozglue.so+0x4208)
>        os_calloc_ext (0x42ba60e4 libgsl.so+0x40e4) (no addr2line)
>        rb_mempool_dynamic_init (0x432ee388 libGLESv2_adreno200.so+0x3c388) (no addr2line)
>        rb_context_create (0x432eae22 libGLESv2_adreno200.so+0x38e22) (no addr2line)
>        gl2_context_create (0x432d45fa libGLESv2_adreno200.so+0x225fa) (no addr2line)
>        oglCreateContext (0x432d6944 libGLESv2_adreno200.so+0x24944) (no addr2line)
>        eglCreateClientApiContext (0x42a84a50 libEGL_adreno200.so+0x17a50) (no addr2line)
>        qeglDrvAPI_eglCreateContext (0x42a7e7ac libEGL_adreno200.so+0x117ac) (no addr2line)
>        eglCreateContext (0x42a72560 libEGL_adreno200.so+0x5560) (no addr2line)
>        eglCreateContext /Users/jlebar/code/moz/B2G/frameworks/base/opengl/libs/EGL/eglApi.cpp:504 (0x4016868c libEGL.so+0xc68c)
>        mozilla::gl::GLLibraryEGL::fCreateContext(void*, void*, void*, int const*) /Users/jlebar/code/moz/ff-git/src/gfx/gl/GLLibraryEGL.h:194 (0x419f42da libxul.so+0xacd2da)
>        mozilla::layers::LayerManagerOGL::CreateContext() /Users/jlebar/code/moz/ff-git/src/gfx/layers/opengl/LayerManagerOGL.cpp:470 (0x419d9d1c libxul.so+0xab2d1c)
>        nsRefPtr<mozilla::gl::GLContext> /Users/jlebar/code/moz/B2G/objdir-gecko/dist/include/nsAutoPtr.h:905 (0x419db126 libxul.so+0xab4126)
>        mozilla::layers::CompositorParent::AllocPLayers(mozilla::layers::LayersBackend const&, unsigned long long const&, mozilla::layers::LayersBackend*, int*) /Users/jlebar/code/moz/ff-git/src/gfx/layers/ipc/CompositorParent.cpp:1023 (0x419e5a5c libxul.so+0xabea5c)
>        mozilla::layers::PCompositorParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) /Users/jlebar/code/moz/B2G/objdir-gecko/ipc/ipdl/PCompositorParent.cpp:590 (0x41893d90 libxul.so+0x96cd90)
>        mozilla::ipc::SyncChannel::OnDispatchMessage(IPC::Message const&) /Users/jlebar/code/moz/ff-git/src/ipc/glue/SyncChannel.cpp:147 (0x41868f70 libxul.so+0x941f70)
>        mozilla::ipc::RPCChannel::OnMaybeDequeueOne() /Users/jlebar/code/moz/ff-git/src/ipc/glue/RPCChannel.cpp:400 (0x41867b7c libxul.so+0x940b7c)
>        RunnableMethod<mozilla::layers::ImageContainerChild, void (mozilla::layers::ImageContainerChild::*)(), Tuple0>::Run() /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/task.h:308 (0x40b4854a libxul.so+0x92554a)
>        mozilla::ipc::RPCChannel::DequeueTask::Run() /Users/jlebar/code/moz/B2G/objdir-gecko/dist/include/mozilla/ipc/RPCChannel.h:448 (0x40b6253c libxul.so+0x93f53c)
>        MessageLoop::RunTask(Task*) /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:334 (0x40c85d90 libxul.so+0xa62d90)
>        MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:344 (0x40c86be6 libxul.so+0xa63be6)
>        MessageLoop::DoWork() /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:441 (0x40c877c4 libxul.so+0xa647c4)
>        base::MessagePumpDefault::Run(base::MessagePump::Delegate*) /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_pump_default.cc:24 (0x4198ba44 libxul.so+0xa64a44)
>        MessageLoop::RunInternal() /Users/jlebar/code/moz/ff-git/src/ipc/chromium/src/base/message_loop.cc:216 (0x40c85d40 libxul.so+0xa62d40)
And a bunch more smaller blocks (1 and 2 pages in size).
This is all in the underlying driver; we have no visibility on size/etc. of these allocations.  Do we have a way to flag any allocations underneath a certain gecko call stack as something?  Or is this malloc replacing just part of debugging/getting this data?
DMD uses malloc-replace to instrument all allocations.

If this code isn't under our control, probably the only option is to wrap the allocator used for it.  If the library is present on the system only as a binary, this is difficult.  See bug 813843 for discussion of a similar case.
(In reply to Nicholas Nethercote [:njn] from comment #4)
> If the library is present on the system only as
> a binary, this is difficult.

Why? I suppose that my question is: under what circumstances would a binary library escape our replace-malloc?
...anyway, on the particular system that you ran this on, it didn't escape our replace-malloc (otherwise NewDMD wouldn't have caught it), so a solution based on using replace-malloc would correctly report the amount of memory allocated by that library.
Yep, though how would that work per-library?  We'd want to account for all allocations in libGLESv2_adreno200.so and libEGL_adreno200.so and count them as "GLES" or "EGL".  Note that these names are hardware specific, too; we never load these directly, nor can we.  The only way that I can think of to do that is to have a replacement malloc at all times (do we?), and then add some RIAA-type stack helpers, e.g.:

{
    MallocMemoryOwner mallocOwner("OpenGL subsystem");
    ... eglCreateContext(...);
}

but we don't have such a system in place now, do we?  And do we plan on having release builds run with replacement malloc?  Doing this would add some overhead to the underlying malloc impl, but maybe we can simplify it by not using strings and using a fixed set of (integer) tags that are defined by the replacement malloc library itself (that we update)... should make the overhead negligible.
> > If the library is present on the system only as
> > a binary, [wrapping its allocator] is difficult.
>
> Why? I suppose that my question is: under what circumstances would a binary library 
> escape our replace-malloc?

I think njn was referring to how we "wrap" sqlite and hunspell's allocators.  These libraries allow us to pass in a custom allocator, and the one we pass in does some memory accounting and then calls malloc().  Obviously that doesn't work if the library doesn't let us pass in a custom allocator.

I like vlad's idea, although it would take some infrastructure to get right (since we'd want nested malloc wrappers for something like DMD).
> > Why? I suppose that my question is: under what circumstances would a binary library 
> > escape our replace-malloc?
> 
> I think njn was referring to how we "wrap" sqlite and hunspell's allocators.

No... what I meant is that I consider enabling replace-malloc by default "difficult", and am assuming it won't happen for some time, if it ever does.  

I could be wrong, of course, but two issues that come to mind:

- Are there security issues?

- If replace-malloc is on by default, how does this affect tools like DMD that want to enable it?  Can two replace-malloc libraries co-exist happily?
I don't think there are any security issues, but I would foresee a nightmare of potential mismatched allocator errors :(  It might not be realistically feasible for the general case, but maybe we could do it for B2G where we control the entire OS... like actually patch the system malloc and add some optionally-enabled smarts to it for tracking like this.
(In reply to Nicholas Nethercote [:njn] from comment #9)
> I could be wrong, of course, but two issues that come to mind:
> 
> - Are there security issues?
> 
> - If replace-malloc is on by default, how does this affect tools like DMD
> that want to enable it?  Can two replace-malloc libraries co-exist happily?

What we could do, to solve both these issues, is:
 * add some built-in instrumentation in replace_malloc.c, build and use that *unconditionally*;
 * but put behind a #ifdef --enable-replace-malloc the part where we allow providing custom replace_foo functions. So the code in replace_malloc.c could look like (in this example all the built-in implementation does is count bytes allocated):

void*
malloc_impl(size_t size)
{
  global_counter_bytes_allocated += size;

#ifdef ENABLE_REPLACE_MALLOC

  // this is the code we currently have in the tree:

  if (MOZ_UNLIKELY(!replace_malloc_initialized))
    init();
  if (MOZ_LIKELY(!replace_malloc))
    return je_malloc(size);
  return replace_malloc(size);

#endif
}



(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #10)
> I don't think there are any security issues, but I would foresee a nightmare
> of potential mismatched allocator errors :(

This is no more tricky than what we are already doing, overriding the system memory allocator by jemalloc, etc. The new replace-malloc stuff (bug 804303) is a convenient new way to put instrumentation in a single place without increasing our surface of bugginess in this area.
Note, the dummy code in my previous comment is of course wrong, it should read

void*
malloc_impl(size_t size)
{
  global_counter_bytes_allocated += size;

#ifdef ENABLE_REPLACE_MALLOC

  // this is the code we currently have in the tree:

  if (MOZ_UNLIKELY(!replace_malloc_initialized))
    init();
  if (MOZ_UNLIKELY(replace_malloc))
    return replace_malloc(size);

#endif

  return je_malloc(size);
}
(In reply to Justin Lebar [:jlebar] from comment #8)
> I like vlad's idea, although it would take some infrastructure to get right
> (since we'd want nested malloc wrappers for something like DMD).

With the approach of having some built-in instrumentation as in comment 12, no need for nested malloc wrappers.
> I don't think there are any security issues, but I would foresee a nightmare
> of potential mismatched allocator errors :(

Yeah; we already replace the system allocator on all platforms, so a mismatched malloc/free is already catastrophic, so I'm not worried about this.

> With the approach of having some built-in instrumentation as in comment 12, no need for 
> nested malloc wrappers.

I'm fine with this approach if glandium is.
What worries me is that if you want to make that work properly, you need some thread local storage. If you need thread local storage, you need to handle reentrancy (tls calls malloc, at least on some platforms), and you need to do so without thread local storage (obviously).
We handle this in DMD by doing pthread_key_create before we've enabled malloc tracking.  Then we hope that pthread_setspecific doesn't malloc, which it apparently doesn't do.  

This code is in DMD.cpp::Init and DMD.cpp::DMDThread::Fetch.
Whiteboard: [MemShrink]
Whiteboard: [MemShrink] → [MemShrink:P2]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.