Closed Bug 1300900 Opened 8 years ago Closed 7 years ago

Add gdb helper to get information about memory allocations

Categories

(Core :: Memory Allocator, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: glandium, Assigned: glandium)

Details

Attachments

(2 files)

While digging for a memory leak, at some point, I was backtracking object references. The problem is that references can either be on the stack or in member fields of some object. In the latter case, they're pretty much never at the beginning of an object. What I started doing, then, was scan around and find vtable pointers, which indicate where the beginning of the object might be. But multiple-inherited classes come with multiple vtable pointers, so this makes the exercise possibly error prone.

So I started poking at the jemalloc structures to find information about the allocations, and figured that I might as well write a gdb python helper to do so.

The result allows things like:

(gdb) print this
$1 = (nsExternalHelperAppService * const) 0x7fffd67fd500
(gdb) print $jemalloc_info(this)
$2 = "96 bytes allocated small buffer"
(gdb) print $jemalloc_info(&mProxy)
$3 = "48 bytes inside a 96 bytes allocated small buffer at 0x7fffd67fd500"
This is derived from the isalloc function from mozjemalloc, with the additional property of handling pointers in the middle of an allocation.

Tom, can you check whether there aren't better ways to do some of this? (I'm particularly disappointed I had to do the to_int thing)
Attachment #8788649 - Flags: review?(n.nethercote)
Attachment #8788649 - Flags: feedback?(ttromey)
Comment on attachment 8788649 [details] [diff] [review]
Add a $jemalloc_info gdb helper

Review of attachment 8788649 [details] [diff] [review]:
-----------------------------------------------------------------

This is all mozjemalloc specific but the names used throughout don't make this clear. I would be happier if they did, even if you plan to extend support to jemalloc4 in the future. (At that future point, the names could be changed to be more general.)

::: python/gdbpp/gdbpp/jemalloc.py
@@ +1,3 @@
> +import gdb
> +
> +unsigned = gdb.lookup_type('unsigned')

You should have a top-level comment explaining what command this file defines, how to invoke it, and what kind of output it produces. And that it's based on mozjemalloc's isalloc function.

@@ +58,5 @@
> +        # However, if the pointer is in the middle of a huge allocation, the
> +        # chunk header won't have a valid arena pointer.
> +        try:
> +            arena = chunk['arena']
> +            if arena['magic'] != 0x947d3d24:

Give this value a name, like you did with the CHUNK_MAP_* values.

@@ +71,5 @@
> +        if bits & self.CHUNK_MAP_ALLOCATED == 0:
> +            info = {
> +                'type': 'free',
> +                'addr': addr & ~(self.page_size - 1),
> +            }

I found the lack of a 'size' field here surprising, and it took some working out. Would using size=None make it clearer? Or maybe a comment about it.

More generally, a comment explaining this entire |if| block would be helpful.

@@ +105,5 @@
> +                'addr': alloc,
> +                'size': size,
> +            }
> +
> +        # run = (arena_run_t *)(bits & ~pagesize_mask);

A comment here explaining we are in the "small" case would be helpful.

@@ +107,5 @@
> +            }
> +
> +        # run = (arena_run_t *)(bits & ~pagesize_mask);
> +        run = gdb.Value(bits & ~(self.page_size - 1)).cast(self.run_type)
> +        # Check the run has the right magic number

Nit: end comment with '.'. (Likewise for numerous comments through this file.)

@@ +108,5 @@
> +
> +        # run = (arena_run_t *)(bits & ~pagesize_mask);
> +        run = gdb.Value(bits & ~(self.page_size - 1)).cast(self.run_type)
> +        # Check the run has the right magic number
> +        if run['magic'] != 0x384adf93:

Give this value a name, too.
Attachment #8788649 - Flags: review?(n.nethercote) → review+
Comment on attachment 8788649 [details] [diff] [review]
Add a $jemalloc_info gdb helper

Review of attachment 8788649 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know jemalloc, so I just skimmed those bits.
Other comments inline.

I'm curious to know more about the problems leading to the type-casting.
Perhaps we can file some upstream bugs.

If you want to get really fancy, this project did all kinds
of heap dissection for glibc:
https://fedorahosted.org/gdb-heap/
Perhaps it could be ported to jemalloc.

::: python/gdbpp/gdbpp/jemalloc.py
@@ +1,4 @@
> +import gdb
> +
> +unsigned = gdb.lookup_type('unsigned')
> +unsigned_long_long = gdb.lookup_type('unsigned long long')

Normally it's best to look up types as lazily as possible, because
the type found when the file is loaded may not be the type used by
the inferior.

For this specific spot I think using Python |long| (see below) would
probably remove the need for the type lookups anyway.

@@ +9,5 @@
> +}
> +def to_int(value):
> +    if isinstance(value, gdb.Value):
> +        return int(value.cast(int_types[value.type.sizeof]))
> +    return int(value)

I don't know what problem led to this approach, but yeah, sometimes goofy
stuff like this is needed, I think.
Normally what I do is work solely in |long| (setting `long = int` in Python 3).
This avoids any problems with |int| being too small in Python 2.
E.g., https://dxr.mozilla.org/mozilla-central/source/js/src/gdb/mozilla/unwind.py#11

@@ +11,5 @@
> +    if isinstance(value, gdb.Value):
> +        return int(value.cast(int_types[value.type.sizeof]))
> +    return int(value)
> +
> +class JemallocInfo(gdb.Function):

If you add class documentation here, it will show up in "help function".
You can see what a good example looks like with "help function _any_caller_is" (in a new-enough gdb).
Basically, first line is the summary, the rest is the full docs -- but I also like
that the gdb style includes Usage and argument descriptions.

@@ +38,5 @@
> +            self.page_size = to_int(gdb.lookup_symbol('pagesize')[0].value())
> +        except:
> +            self.page_size = 4 * 1024
> +
> +        self.chunk_type = gdb.lookup_type('arena_chunk_t').pointer()

Caching the types causes problems if the inferior changes (as in, recompile and restart).
It may not be worth worrying about this in the FF context, though
SpiderMonkey has some type cache helper to deal with this.

@@ +156,5 @@
> +        self.ensure_init()
> +        addr = to_int(addr)
> +        info = self.info(addr)
> +        if not info:
> +            return 'Not allocated with jemalloc'

Given the return values here, I'm not sure you really want to be defining a 
convenience function.  It seems to me that a new gdb command would be better.
It could just print the output directly rather than return it.

The doc string comment still applies in this case, since that's where the
gdb help command looks.
Attachment #8788649 - Flags: feedback?(ttromey)
(In reply to Tom Tromey :tromey from comment #3)
> @@ +9,5 @@
> > +}
> > +def to_int(value):
> > +    if isinstance(value, gdb.Value):
> > +        return int(value.cast(int_types[value.type.sizeof]))
> > +    return int(value)
> 
> I don't know what problem led to this approach, but yeah, sometimes goofy
> stuff like this is needed, I think.
> Normally what I do is work solely in |long| (setting `long = int` in Python
> 3).
> This avoids any problems with |int| being too small in Python 2.
> E.g.,
> https://dxr.mozilla.org/mozilla-central/source/js/src/gdb/mozilla/unwind.
> py#11

So, what led me to do that is:
- While gdb.Value instances can be used as is integer values most of the time, formatting operations don't work on them.
- I don't remember the exact details, but at some point, I had to convert a gdb.Value to a string before being able to convert it to a python int (as in int(str(foo), 16)).
- In one case, int(foo, 16) didn't work because the value was something like "0xpointer <symbol+offset>". This is where the whole "cast to an integer type the same size as the pointer" comes from.

I don't know how much of those are by (defective) design or should be considered bugs (If you ask me, the whole gdb python api is beyond saving)

> @@ +156,5 @@
> > +        self.ensure_init()
> > +        addr = to_int(addr)
> > +        info = self.info(addr)
> > +        if not info:
> > +            return 'Not allocated with jemalloc'
> 
> Given the return values here, I'm not sure you really want to be defining a 
> convenience function.  It seems to me that a new gdb command would be better.
> It could just print the output directly rather than return it.

Yeah, originally, I wanted to return some structured data, but it looks like it's not possible to create fake types to do that without doing something as awful as spawning a compiler to compile a C source with a struct in it and load the resulting object file.
(In reply to Mike Hommey [:glandium] from comment #4)

> - While gdb.Value instances can be used as is integer values most of the
> time, formatting operations don't work on them.

Maybe this could be done, but I don't know how.
Perhaps gdb.Value could define a __format__ method.

> - In one case, int(foo, 16) didn't work because the value was something like
> "0xpointer <symbol+offset>". This is where the whole "cast to an integer
> type the same size as the pointer" comes from.

Conversion of a Value to a string invokes the gdb printing machinery.

> I don't know how much of those are by (defective) design or should be
> considered bugs (If you ask me, the whole gdb python api is beyond saving)

I am super unhappy to hear that.

> Yeah, originally, I wanted to return some structured data, but it looks like
> it's not possible to create fake types to do that without doing something as
> awful as spawning a compiler to compile a C source with a struct in it and
> load the resulting object file.

Yes,  nobody ever implemented that.
I think the main stumbling block is that gdb doesn't have ABI knowledge
about struct layout; which is normally fine, but there are some exceptions
that have to be coded in.
Priority: -- → P3
Comment on attachment 8921761 [details]
Bug 1300900 - Add a helper around jemalloc_ptr_info for debuggers.

https://reviewboard.mozilla.org/r/192776/#review198050

::: memory/build/mozjemalloc.cpp:3587
(Diff revision 1)
>  }
>  
> +namespace Debug
> +{
> +  /* Helper for debuggers
> +   * we don't want it to be inlined and optimized out. */

This comment is formatted oddly. And should it be two sentences?

::: memory/build/mozjemalloc.cpp:4423
(Diff revision 1)
>      return true;
>    }
>  
>    malloc_initialized = true;
>  
> +  // Dummy call so that the function is not removed by dead-code elimination

Huh. It feels like there should be a nicer way to ensure it's not optimized away, via some kind of declaration on the function itself?
Attachment #8921761 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #7)
> Huh. It feels like there should be a nicer way to ensure it's not optimized
> away, via some kind of declaration on the function itself?

Not without making the symbol public, unfortunately.
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/b379388eaa92
Add a helper around jemalloc_ptr_info for debuggers. r=njn
https://hg.mozilla.org/mozilla-central/rev/b379388eaa92
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: