Open Bug 642320 Opened 13 years ago Updated 3 months ago

Generate gdb debug info for JIT code

Categories

(Core :: JavaScript Engine, enhancement, P5)

x86_64
Linux
enhancement

Tracking

()

People

(Reporter: sfink, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 6 obsolete files)

At least for methodjitted code, we should emit call frame unwinding and debug info and use gdb's JIT registration API to load it in.

This is one of a family of related bugs dealing with JIT code reporting. It started as bug 599499, but I want a general JIT code registration API that has multiple separately enabled backends: gdb, breakpad, whatever Windows uses, valgrind, oprofile, etc.

But this bug is for gdb support specifically.
I have a proof of concept working. It only handles x86_64, it can only do a single function, and due to a libdwarf bug the line number information isn't quite right so the 'next' command in gdb isn't working. But it's a start. It's nice to finally see a backtrace through a JITted function.
Assignee: general → sphink
JIT code on x86 always has a frame pointer, right? Is that also true of x86-64? This is mainly about getting better information from a debugging session in a DEBUG build?
(In reply to comment #2)
> JIT code on x86 always has a frame pointer, right? Is that also true of x86-64?
> This is mainly about getting better information from a debugging session in a
> DEBUG build?

Yes, both x86 and x86_64 preserve the frame pointer. Unfortunately, on x86_64 it doesn't do any good -- you can't get a backtrace with a JITted method on the stack, apparently because the x86_64 ABI says that you can't depend on having the frame pointer and so must use unwind info for this instead. I don't understand why gdb can't detect that no unwind information is available and fall back to using the frame pointer (before giving up entirely), and I suspect there's a worthwhile patch to gdb to fix that. jimb might know more.

The unwind info can either be stored in a .eh_frame section or a .debug_frame section. The two are almost identical in format, with just enough of a difference that you can't rename one to the other. The big difference is that .eh_frame is marked ALLOC, so it is part of the runtime image. (It's _e_xception _h_andler frame info.) In the JITted case, gdb is loading the symbol file, so the ALLOC bit is irrelevant.

So at least for now, this is mainly about getting better information from a debugging session, yes. As for DEBUG, I'm hoping we don't have to restrict it to that. I'd like opt builds to have a runtime option to turn it on. (Where "it" probably means more than just gdb-usable stuff. Breakpad, maybe.)

It'd be kind of funny getting better symbol info for JIT code than for the rest, but so be it...

Oh, and this'll give more than stack unwinding if you ask for it. I'm also generating line number -> PC mapping info, so you can step through Javascript code (which'll need to be written to disk for gdb to find it, unfortunately.) I'd like to expose more, but that's for later.
(In reply to comment #2)
> JIT code on x86 always has a frame pointer, right? Is that also true of x86-64?
> This is mainly about getting better information from a debugging session in a
> DEBUG build?

Well, in the long run, once SpiderMonkey has generated this DWARF, I think it should tell the crash handler where it is, so the crash handler can include the data in the minidump. Then the processor could say, "Oh, here's an interesting stream; let's add its contents to the symbol files I've read already" and we could get JavaScript backtraces for jitted code in our crash reports.

(!!!)
except the minidump writer doesn't store frame unwinding information, nor symbols.
(In reply to comment #5)
> except the minidump writer doesn't store frame unwinding information, nor
> symbols.

Steve's patch causes the method JIT to generate DWARF describing the machine code it produces in memory. The minidump writer has an interface via which you can request that other areas of memory be included in the minidump (I think; but even if it doesn't, we could add them). What I'm saying is that we should use that interface to include the run-time-generated DWARF in the minidump file, and we should extend the dump processor to parse that. The processor can reuse the DWARF parsing code we already have in the tree for the dump_syms tool.
For now, apply this patch on top of the two patches in bug 599499.

This is a half-baked work in progress. It generates an in-memory ELF object file containing DWARF debug info for Jägermonkey-generated JIT code, then calls the magic GDB dummy function to register that debug info with GDB.

It currently supports the main body and the JaegerTrampoline invocation code only. (The latter should really have its debug info included in its object file, but I haven't looked into how to do that yet.) It does not cover any ICs nor JaegerTrampolineReturn. It is x86_64 and probably Linux only. It leaks memory. It has bad breath and bites babies. Only use it if you're curious and want to try it out. Currently, it gives you:

- x86_64 can do stack traces through JM-generated code
- the innermost JS function will show up on the stack
- if you break or step into the JIT code, you can single-step through the Javascript

Right now, it generates the object file in /dev/shm/elf.o, which obviously fails if you don't have tmpfs or similar mounted there. It reuses the same filename for all JIT code objects. (I'm not sure if that's a problem.) It has some bizarre behavior sometimes when you rerun the JS-containing binary multiple times (possibly reloading with code changes between each?) A common failure mode is for the inferior to somehow have its PC get set to an invalid value in the middle of an instruction, leading to SIGILL or SIGSEGV.

It will error out if you don't have my patch to libdwarf to fix code alignment on x86_64. You can "fix" that uncommenting the section starting with "Temporary horrible hack" in jsdwarf.cpp, but then stepping through JS will be erratic (it'll probably skip some lines.) David Anderson (not ours), the libdwarf author, is working on a newer version and will include the fix in it. I'll need to do a configure check in the meantime.

I intend to change many things about this patch, including its interface with the rest of the JS engine.
Somewhat more "real" patch. Add unregistration, fix misuse of the registration API. Stop leaking memory like crazy. Support multiple scripts by setting ownership correctly. Rename jsjitutil -> jsjitwatcher. (The names are still pretty bad overall in this patch.)
Attachment #521039 - Attachment is obsolete: true
do we support JIT for DOMWorkers and other thread inhabitants?
Depends on: 645887
Configure changes for pulling in the support library needed for registering JIT-generated code with GDB. I am 80% certain that I'm doing this incorrectly, but I've tried a few approaches and none of them feel quite right. Either I can't get it to handle the error cases how I'd like, or I write too much configure code by hand (which usually means I'm doing it wrong.)

So I'll spell out what's going on:

- libdwarf is required for constructing the in-memory DWARF info

- it is possible, even likely, that users will need to point to a non-system version of libdwarf (the currently released version doesn't work very well on x86_64)

- libelf.so from elfutils is required for formatting and writing the ELF object file.

- libelf is much more standard, and is installed on most systems already.

- users may or may not want the gdb/JIT integration

AC_CHECK_LIB(foo,...) seems to add -lfoo to LIBS by default, whether it's actually needed for the options given or not. I mimicked this behavior.

Should add a --with-libelf option? Then it seems like I'd need to do manual checking for --with-libdwarf to be dependent on whether libelf was found.

configure hurts my poor little brain...
Attachment #522531 - Flags: review?(khuey)
Comment on attachment 522531 [details] [diff] [review]
configure changes for gdb/jit code registration

I forgot to mention: the patches in this bug depend on the patches in bug 645887, although the configure patch dependency is an artificial one (the context requires it, not the configure script code itself.)
If we want to ship with this stuff enabled (and I think we do), our usual practice is just to slurp the libraries into the tree.

But, is the DWARF we're emitting really so complex? Couldn't we just emit it ourselves? You can fake up an ELF file in a few pages:

http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/dwarf2reader_cfi_unittest.cc#2316
(To be clear, I think using an external libdwarf and libelf is fine for now.)
Depends on patch from bug 645887.

I'm sure I'm doing all kinds of things wrong in this patch, but I've been staring at it for too long. I know this code isn't pretty.

http://sourceware.org/gdb/current/onlinedocs/gdb/JIT-Interface.html is the gdb documentation of the JIT code registration API.
Attachment #521351 - Attachment is obsolete: true
Attachment #522547 - Flags: review?(jimb)
Do we really need the top-level configure.in changes? There's no requirement that configure.in and js/src/configure.in remain "in sync". configure scripts tolerate unrecognized flags, and the top-level configure script should just propagate everything it gets down to js/src/configure.

I'm not seeing an AC_DEFINE in the second patch, and I'm not sure what the AC_SUBST is used by.
As a local variable, maybe, but I don't think we can keep "snowWhite" as a class member name. :)
Rather than putting jsdwarf.cpp in a big #ifdef, wouldn't it better simply to put the conditional in the Makefile, and just skip that source file and object file altogether?

Are the declarations at the head of class DwarfGenerator meant to be public, or private?

I wasn't able to find uses of add_JaegerTrampoline_FDE; where should I look?

DwarfGenerator's data members need comments. And now is the time to choose meaningful names. 'die' is not a very good name, for example.

Should DwarfGenerator's destructor close fd if it's open? I know one shouldn't do serious work in a destructor...

Rather than using thread-specific storage, could we store the DwarfGenerator pointer in the context or compartment? That would require propagating JSContext pointers around more, but I think that's preferable...

I think you should use one of the standard temporary filename generation functions. I think mkstemp is probably best; it's available on Linux, and Mac, too.

Rather than including the code in the ELF file's .text segment, can't you create a SHT_NOBITS .text section? That's what separate-debug files use; GDB likes those.

What is the effect of the 'asm("")' in the registration function? Does the compiler really omit this function's code if that's not there?

The definition of init_bytes (again, not a great name) is conditional on __x86_64, but its use is not. It's fine to be x86-64-only for now, but we should be sure to error out helpfully (preferably in the configure script) if someone tries to use it on other platforms.

In general, this code should be written with an eye towards supporting all our major architectures. At present, there's no organized segregation of processor-specific and processor-independent code.

I think the above comments are going to require enough changes that I'd like to re-review after they've been addressed.
(In reply to comment #12)
> If we want to ship with this stuff enabled (and I think we do), our usual
> practice is just to slurp the libraries into the tree.
> 
> But, is the DWARF we're emitting really so complex? Couldn't we just emit it
> ourselves? You can fake up an ELF file in a few pages:
> 
> http://code.google.com/p/google-breakpad/source/browse/trunk/src/common/dwarf/dwarf2reader_cfi_unittest.cc#2316

Hey, I didn't know about that stuff! That API seems much nicer than libelf+libdwarf. It looks like it does less handholding, but it would be nice to get rid of the contortions induced by the libelf/libdwarf APIs.

When I got into this, I knew nothing about elf or dwarf, so I assumed for portability and sanity I'd need to use a generating library. Now I'm not so sure -- even with them, you end up writing architecture-dependent code, so what's the point?
(In reply to comment #15)
> Do we really need the top-level configure.in changes? There's no requirement
> that configure.in and js/src/configure.in remain "in sync". configure scripts
> tolerate unrecognized flags, and the top-level configure script should just
> propagate everything it gets down to js/src/configure.

I've gotten comments before that from people who would really like them to be in sync as possible. Although maybe that's only if they use the same stuff, even if they use it differently?

> I'm not seeing an AC_DEFINE in the second patch, and I'm not sure what the
> AC_SUBST is used by.

Yes, looking at it now, it looks like I botched them anyway. I did a bunch of patch splitting and refactoring, and apparently "does it still compile?" isn't really the right test...
Attachment #522531 - Flags: review?(khuey)
Ok, this repairs the configure support to actually work again.

khuey, is it ok to leave out all of the changes to the toplevel configure.in if it isn't used outside of js/src?

Also, note that this is the version using libelf and libdwarf. I may end up switching over to jimb's stuff, in which case this would mostly all go away. But I'm guessing I'll want to do that in a followup bug. We'll see.
Attachment #522531 - Attachment is obsolete: true
Attachment #522731 - Flags: review?(khuey)
Comment on attachment 522731 [details] [diff] [review]
configure changes for gdb/jit code registration

Doh! Never mind for now. I'll need to do more configure changes to make the whole compilation conditional.
Attachment #522731 - Flags: review?(khuey)
(In reply to comment #17)
> Rather than putting jsdwarf.cpp in a big #ifdef, wouldn't it better simply to
> put the conditional in the Makefile, and just skip that source file and object
> file altogether?

Ok, done.

> Are the declarations at the head of class DwarfGenerator meant to be public, or
> private?

Oh, right. I made them public because of the add_dwarf_section callback. Fixed.

> I wasn't able to find uses of add_JaegerTrampoline_FDE; where should I look?

In my personal patch queue history. Sorry, this was a leftover remnant after bug 645111 obsoleted it. Deleted.

> DwarfGenerator's data members need comments. And now is the time to choose
> meaningful names. 'die' is not a very good name, for example.

Commented. I also renamed 'die' and 'cu_die' to 'functionDIE' and 'compilationUnitDIE'. Is your objection to the use of the DWARF jargon 'DIE'? I thought it would help readers connect it with the DWARF spec.

> Should DwarfGenerator's destructor close fd if it's open? I know one shouldn't
> do serious work in a destructor...

Well, *something* probably ought to close it. Whoops! Ok, done. (In both registerJITCode() after it's no longer needed, and the destructor in case registerJITCode() got skipped somehow.)

> Rather than using thread-specific storage, could we store the DwarfGenerator
> pointer in the context or compartment? That would require propagating JSContext
> pointers around more, but I think that's preferable...

I can't. The whole reason I need tls is because libdwarf takes a callback argument in dwarf_producer_init_b that it uses to generate the ELF sections, and there's no way to pass any custom data to it. So I'd still need tls to pass in a JSContext, JSCompartment, or JSRuntime.

I've asked the libdwarf author (David Anderson) to add this, and he seems willing, but it doesn't sound like the next release is going to happen very soon. I may just stop using libelf and libdwarf instead to resolve this and other issues.

> I think you should use one of the standard temporary filename generation
> functions. I think mkstemp is probably best; it's available on Linux, and Mac,
> too.

Ok. But it doesn't address another issue, which is what directory to put things in. I was using /dev/shm because it's an easy way to avoid hitting the disk when using an API (libelf) that insists on having a file descriptor. But it's not guaranteed to exist.

> Rather than including the code in the ELF file's .text segment, can't you
> create a SHT_NOBITS .text section? That's what separate-debug files use; GDB
> likes those.

Great idea! I know next to nothing about ELF. I don't really understand the interaction between SHT_NOBITS and SHF_ALLOC, but it seems fine with the combination. (And that's what a sample separate-debug file I just checked does.)

Argh! Except libelf doesn't seem to handle it correctly. It appears to allocate space in the object file for SHT_NOBITS sections anyway. OpenSUSE has a patch for it: https://build.opensuse.org/package/view_file?file=libelf-ignore-NOBITS-sh_offset.patch&package=elfutils&project=openSUSE%3AFactory&srcmd5=965189fd9e18ff33f0ec437b77031378

I'll use SHT_NOBITS. Things should magically get smaller if we link with a fixed libelf.

> What is the effect of the 'asm("")' in the registration function? Does the
> compiler really omit this function's code if that's not there?

I don't know if it really does omit it, but apparently it would be legal. From the gcc info:

`noinline'
     This function attribute prevents a function from being considered
     for inlining.  If the function does not have side-effects, there
     are optimizations other than inlining that causes function calls
     to be optimized away, although the function call is live.  To keep
     such calls from being optimized away, put
          asm ("");
     (*note Extended Asm::) in the called function, to serve as a
     special side-effect.

> The definition of init_bytes (again, not a great name) is conditional on
> __x86_64, but its use is not. It's fine to be x86-64-only for now, but we
> should be sure to error out helpfully (preferably in the configure script) if
> someone tries to use it on other platforms.

I was using the name of the parameter in the libdwarf API call. But now that I'm down to a single FDE, I think I'll ditch the CIE instructions entirely, making the issue moot.

> In general, this code should be written with an eye towards supporting all our
> major architectures. At present, there's no organized segregation of
> processor-specific and processor-independent code.

I was hoping by using "standard" libraries that I'd be able to hide much of this, even though the generated code is necessarily architecture-specific. But yes, the remaining bits should be organized better. How's this latest patch? It's somewhat better.

> I think the above comments are going to require enough changes that I'd like to
> re-review after they've been addressed.

That's the outcome I was hoping for at this point. Thanks for the laundry list! Gimme another.
Attachment #522547 - Attachment is obsolete: true
Attachment #522547 - Flags: review?(jimb)
Attachment #522942 - Flags: review?(jimb)
Oh, right. One thing that I keep forgetting to write down somewhere -- this is still missing a way to get the source text for stepping through. Right now, I'm just recording script->filename, which works fine for my only test case, which is running a script from the shell. But in the browser, that'll be a URL, which is useless to gdb.

So to really step through JS code, we'll need a mechanism for writing out a plain copy of all mjitted source scripts somewhere. Or something.
(In reply to comment #22)
> > DwarfGenerator's data members need comments. And now is the time to choose
> > meaningful names. 'die' is not a very good name, for example.
> 
> Commented. I also renamed 'die' and 'cu_die' to 'functionDIE' and
> 'compilationUnitDIE'. Is your objection to the use of the DWARF jargon 'DIE'? I
> thought it would help readers connect it with the DWARF spec.

Oh, using DWARF terminology is good, for exactly the reason you say. It's just that there are so many different dies flying around in DWARF that 'die' by itself is too vague. 'functionDIE' is perfect.

> > Should DwarfGenerator's destructor close fd if it's open? I know one shouldn't
> > do serious work in a destructor...
> 
> Well, *something* probably ought to close it. Whoops! Ok, done. (In both
> registerJITCode() after it's no longer needed, and the destructor in case
> registerJITCode() got skipped somehow.)

Sounds perfect.

> > Rather than using thread-specific storage, could we store the DwarfGenerator
> > pointer in the context or compartment? That would require propagating JSContext
> > pointers around more, but I think that's preferable...
> 
> I can't. The whole reason I need tls is because libdwarf takes a callback
> argument in dwarf_producer_init_b that it uses to generate the ELF sections,
> and there's no way to pass any custom data to it. So I'd still need tls to pass
> in a JSContext, JSCompartment, or JSRuntime.

Wow, what a pain. If you don't switch away from libdwarf altogether, could you perhaps enclose the TSS usage in a little class that takes care of allocating and freeing the index, and is a template to provide a strictly-typed getters and setters?

> > I think you should use one of the standard temporary filename generation
> > functions. I think mkstemp is probably best; it's available on Linux, and Mac,
> > too.
> 
> Ok. But it doesn't address another issue, which is what directory to put things
> in. I was using /dev/shm because it's an easy way to avoid hitting the disk
> when using an API (libelf) that insists on having a file descriptor. But it's
> not guaranteed to exist.

mkstemp will respect the user's TMPDIR environment variable, and otherwise generally behave like everything else that makes temporary files. That seems fine to me.

> > Rather than including the code in the ELF file's .text segment, can't you
> > create a SHT_NOBITS .text section? That's what separate-debug files use; GDB
> > likes those.
> 
> Great idea! I know next to nothing about ELF. I don't really understand the
> interaction between SHT_NOBITS and SHF_ALLOC, but it seems fine with the
> combination. (And that's what a sample separate-debug file I just checked
> does.)

Yeah, you probably want to imitate separate-debug files as much as possible.

> Argh! Except libelf doesn't seem to handle it correctly. It appears to allocate
> space in the object file for SHT_NOBITS sections anyway. OpenSUSE has a patch
> for it:
> https://build.opensuse.org/package/view_file?file=libelf-ignore-NOBITS-sh_offset.patch&package=elfutils&project=openSUSE%3AFactory&srcmd5=965189fd9e18ff33f0ec437b77031378
> 
> I'll use SHT_NOBITS. Things should magically get smaller if we link with a
> fixed libelf.

You can tell you're in well-traveled territory when the tools are this good. :)

> > What is the effect of the 'asm("")' in the registration function? Does the
> > compiler really omit this function's code if that's not there?
> 
> I don't know if it really does omit it, but apparently it would be legal. From
> the gcc info:
> 
> `noinline'
>      This function attribute prevents a function from being considered
>      for inlining.  If the function does not have side-effects, there
>      are optimizations other than inlining that causes function calls
>      to be optimized away, although the function call is live.  To keep
>      such calls from being optimized away, put
>           asm ("");
>      (*note Extended Asm::) in the called function, to serve as a
>      special side-effect.

Could we try leaving out the asm and the noinline attribute, and see if we actually have any troubles? That function has external linkage, so I don't think the compiler can really tell that it's "unused" anyway. Perhaps the linker would throw it away. But I'd like to wait for the need to arise before we put in magic.

> > In general, this code should be written with an eye towards supporting all our
> > major architectures. At present, there's no organized segregation of
> > processor-specific and processor-independent code.
> 
> I was hoping by using "standard" libraries that I'd be able to hide much of
> this, even though the generated code is necessarily architecture-specific. But
> yes, the remaining bits should be organized better. How's this latest patch?
> It's somewhat better.

I'll take a look.

> That's the outcome I was hoping for at this point. Thanks for the laundry list!
> Gimme another.

On its way! :)
snowWhite still has to go. Sorry. :) The joke will get old.

> I can't. The whole reason I need tls is because libdwarf takes a callback
> argument in dwarf_producer_init_b that it uses to generate the ELF sections,
> and there's no way to pass any custom data to it. So I'd still need tls to pass
> in a JSContext, JSCompartment, or JSRuntime.

By the way, this is the kind of explanation that should be in the comment for the TSS index.

Looking at the JITWatcher member function implementations, this definitely seems like a case where JITWatcher would be an abstract base class, with implementations for GDB, oprofile, and so on. Having members and function bodies chosen by #ifdef is just using the preprocessor to get the same effect. I know there are admonitions in the SpiderMonkey C++ style guide against virtual functions, but this is a classic case.

More in a bit.
What's the status of this? Is it still planned to go ahead? I'd really like this on ARM. We have no mandatory frame pointers in the ABI and not even an easy way to identify the caller, as the return address is stored in a frame-specific location.
(In reply to comment #26)
> What's the status of this?

Ignored but not abandoned.

> Is it still planned to go ahead? 

Yes. Though I can't give an ETA. I have a little more work to do on an unrelated bug, then I'll be working on redoing the JIT registration API (since I'm now using it for yet another purpose, bug 642054). After that, I have to decide whether to work on this consumer or another (eg the profiler). This one may be close enough that it'd be worth finishing.

Also, the newer version of libdwarf is out, which will remove the need for some of the uglier bits of this.

> I'd really like
> this on ARM. We have no mandatory frame pointers in the ABI and not even an
> easy way to identify the caller, as the return address is stored in a
> frame-specific location.

When I get there, I may need some help with ARM support.
Status: NEW → ASSIGNED
Updated for newer JIT registration API from bug 645887. Still no ARM, still haven't imported the newer libdwarf, so not ready for review yet. Mostly works here on Linux x86-64, though I haven't figured out how to convince gdb to 'next' over what I want it to next over; it currently steps too much and too little, depending, even though it has full line <-> PC info.
Attachment #522942 - Attachment is obsolete: true
Attachment #522942 - Flags: review?(jimb)
Updated again. I think the difference from the last version is some memory management bugfixes (I was freeing some memory too early), and an attempt to handle inlined call frames properly. Well, mostly properly -- I give the write file:lineno for inlined frames, but do not generated the DWARF info giving the inlined call info, so gdb won't be able to synthesize stack frames for inline calls.
Attachment #563611 - Attachment is obsolete: true
Apparently a simpler interface landed in GDB trunk:
http://playingwithpointers.com/archives/633
I think now it is simpler to write an unwinder in Python; plus a frame filter
to display interesting stuff about the newly discovered frames.

I have started this project here:
https://github.com/tromey/spidermonkey-unwinder
... but really it ought to be moved into the tree.

This requires gdb 7.10 (not released yet); a further gdb patch (see the README);
and at least a partial fix for bug 757969 (which is in the bug, but old; I have
a rebased version here).

The current code has some limitations; and I've only ported it to x86-64.
But, it's better than the status quo and doesn't require the JITs to emit
any sort of debug info.
See bug 1232712 for the unwinder.
Assignee: sphink → nobody
Status: ASSIGNED → NEW
Severity: normal → S3
Severity: S3 → N/A
Type: defect → enhancement
Priority: -- → P5
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: