Closed Bug 673753 Opened 13 years ago Closed 13 years ago

install systemtap headers on linux build slaves

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sfink, Assigned: bhearsum)

Attachments

(2 files, 1 obsolete file)

I'd like to install the headers for systemtap onto the linux build slaves. Specifically, I need to be able to #include <sys/sdt.h>.

On Fedora, it is provided by the systemtap-sdt-devel RPM. I don't know what it is for other Linux flavors.

This plus my patch from bug 574403 will allow compiling with --enable-dtrace on Linux. I'd like to do this for a nightly "performance analysis" build; see bug 592518. (And eventually, I'd like to have a subset of probes on for release builds.)

Not sure if this is the right component.
Is this a runtime requirement, or only a build-time one?
Assignee: nobody → bhearsum
(sorry; should've changed my irc nick. Was on PTO today.)

Build-time only. I'm adding static markers to the JS engine for various things, and I need the headers to compile in the magic NOPs or whatever it does.

Oh, wait -- it also runs dtrace -G to generate an object file with a .probes section pointing to the probe points. So it needs the dtrace binary in addition to the headers. On Fedora at least, it comes in the same RPM (systemtap-sdt-devel).
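
For anyone checking a build machine by hand, the two build-time prerequisites described above (the sys/sdt.h header and a dtrace executable) can be tested with something like the following. This is only a sketch: the function names are made up for illustration, and the include directories to search are an assumption that depends on the toolchain in use.

```shell
#!/bin/sh
# Sketch: check the two build-time prerequisites named above.
# Function names are illustrative; pass the include directories to search.

# Succeeds if sys/sdt.h exists under any of the given include directories.
have_sdt_header() {
    for dir in "$@"; do
        if [ -f "$dir/sys/sdt.h" ]; then
            return 0
        fi
    done
    return 1
}

# Succeeds if a dtrace executable is on PATH (the systemtap shim counts).
have_dtrace() {
    command -v dtrace >/dev/null 2>&1
}

# Prints one status line per prerequisite; returns non-zero if any is missing.
check_sdt_prereqs() {
    rc=0
    if have_sdt_header "$@"; then
        echo "sys/sdt.h: found"
    else
        echo "sys/sdt.h: missing"
        rc=1
    fi
    if have_dtrace; then
        echo "dtrace: found"
    else
        echo "dtrace: missing"
        rc=1
    fi
    return $rc
}

# Example: check_sdt_prereqs /usr/include /usr/local/include
```

A file-existence check says nothing about whether the header is new enough; it only catches the case where the package was never installed.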
The CentOS 5.6 systemtap packages (version 1.3) seem to work fine on our VMs. Once bug 674601 is fixed, sfink is going to have a look on the machine I installed them on and see if they'll work.

If not, we can build our own.
Depends on: 674601
Attached patch: add systemtap to Linux machines (obsolete)
On the assumption that sfink's tests pass, this will install systemtap and the dtrace shim on our Linux build machines.
Attachment #549186 - Flags: review?(catlee)
Comment on attachment 549186
add systemtap to Linux machines
Attachment #549186 - Flags: review?(catlee) → review+
Status update, since I think I'm still pinning that machine down:

I tried it out, but systemtap had problems doing stack unwinding when I copied a build over. This is not critical, but definitely suboptimal. On #systemtap, fche said this might be an old compiler, and sure enough, I was using the system gcc 4.1.1. Unfortunately, switching to gcc 4.5 (/tools/gcc-4.5) and setting the PATH I found in a build log did not resolve the problem.

I compiled the latest systemtap on the build machine, but got pulled away and have not managed to try it out yet. I will do that first thing tomorrow.
Sorry for hogging the build machine for so long.

I have an SRPM that should work (and I've tested the JS shell built with the systemtap-sdt-devel resulting from that SRPM). For some definition of "work", anyway -- it turns out that you can't actually run the systemtap-1.6 binary on CentOS 5.0, or even the 1.3 that you had installed. Something changed in the kernel utrace API that matters. But I can't bring myself to care too much about that. The binaries resulting from the systemtap-sdt-devel this produces seem to work fine on CentOS 5.6, so that's good enough for me.

I put it together with some dependencies at http://people.mozilla.org/~sfink/data/systemtap-stuff.tar.bz2

To get the build dependencies, unpack the above archive and then:

cd systemtap-stuff
yum-builddep systemtap-1.6-1moz1.src.rpm
yum install latex2html xmlto
rpm -Uvh rpms/*.rpm # 32-bit only; for 64-bit get them from ftp://mirror.stanford.edu/pub/mirrors/centos/5.6/os/x86_64/CentOS/

Hopefully that works, anyway.

You can build the RPM with:

cd rpmbuild/SPECS
rpmbuild -ba --define "with_bundled_elfutils 1" --nodeps --define "with_crash 0" --define "elfutils_version 0.148" --define "with_grapher 0" systemtap.spec

or presumably with

rpmbuild --rebuild --define "with_bundled_elfutils 1" --nodeps --define "with_crash 0" --define "elfutils_version 0.148" --define "with_grapher 0" systemtap-1.6-1moz1.src.rpm

though I haven't tried that. That may look messy, but it's much cleaned up from what I started with, at the cost of requiring some upgraded *-devel RPMS.
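
Since the two rpmbuild invocations above differ only in the mode flag and the target, the repeated --define options can be assembled from a list. The following is a rough sketch, not something that was run on the build machines: the function name is hypothetical, and it only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Sketch: build the long rpmbuild invocation from a list of macro settings.
# The macro names and values are the ones quoted in the comment above;
# the function name and the print-only behaviour are illustrative.

# Usage: rpmbuild_cmd MODE TARGET "name value" ...
# MODE is an rpmbuild mode flag such as -ba or --rebuild.
# Prints the command instead of running it.
rpmbuild_cmd() {
    mode=$1
    target=$2
    shift 2
    cmd="rpmbuild $mode --nodeps"
    for macro in "$@"; do
        cmd="$cmd --define \"$macro\""
    done
    printf '%s\n' "$cmd $target"
}

rpmbuild_cmd -ba systemtap.spec \
    "with_bundled_elfutils 1" \
    "with_crash 0" \
    "elfutils_version 0.148" \
    "with_grapher 0"
```

Swapping `-ba systemtap.spec` for `--rebuild systemtap-1.6-1moz1.src.rpm` gives the second form with the same macro list.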

In case that's not enough to get the dependencies, here's the full set I installed (some of which turned out to be useless because they're too old, so I left those features disabled):

Jul 29 15:41:59 mv-moz2-linux-ix-slave01 Installed: nspr-devel.i386 4.6.5-1.el5
Jul 29 15:42:00 mv-moz2-linux-ix-slave01 Installed: nss-devel.i386 3.11.5-1.el5
Jul 29 15:42:07 mv-moz2-linux-ix-slave01 Installed: gettext-devel.i386 0.14.6-4.el5
Jul 29 15:42:07 mv-moz2-linux-ix-slave01 Installed: avahi-devel.i386 0.6.16-1.el5
Jul 29 15:43:26 mv-moz2-linux-ix-slave01 Installed: libsigc++20.i386 2.0.18-1.el5
Jul 29 15:43:26 mv-moz2-linux-ix-slave01 Installed: glibmm24.i386 2.12.10-1.el5
Jul 29 15:43:30 mv-moz2-linux-ix-slave01 Installed: tetex-fonts.i386 3.0-32.fc6
Jul 29 15:43:31 mv-moz2-linux-ix-slave01 Installed: tetex-dvips.i386 3.0-32.fc6
Jul 29 15:43:31 mv-moz2-linux-ix-slave01 Installed: cairomm.i386 1.2.4-1.el5
Jul 29 15:43:34 mv-moz2-linux-ix-slave01 Installed: gtkmm24.i386 2.10.10-1.el5
Jul 29 15:43:35 mv-moz2-linux-ix-slave01 Installed: cairomm-devel.i386 1.2.4-1.el5
Jul 29 15:43:37 mv-moz2-linux-ix-slave01 Installed: libsigc++20-devel.i386 2.0.18-1.el5
Jul 29 15:43:38 mv-moz2-linux-ix-slave01 Installed: netpbm-progs.i386 10.35-6.fc6
Jul 29 15:43:38 mv-moz2-linux-ix-slave01 Installed: perl-HTML-Tagset.noarch 3.10-2.1.1
Jul 29 15:43:38 mv-moz2-linux-ix-slave01 Installed: perl-HTML-Parser.i386 3.55-1.fc6
Jul 29 15:43:39 mv-moz2-linux-ix-slave01 Installed: dialog.i386 1.0.20051107-1.2.2
Jul 29 15:43:42 mv-moz2-linux-ix-slave01 Installed: tetex.i386 3.0-32.fc6
Jul 29 15:43:51 mv-moz2-linux-ix-slave01 Installed: tetex-latex.i386 3.0-32.fc6
Jul 29 15:43:51 mv-moz2-linux-ix-slave01 Installed: xmltex.noarch 20020625-8
Jul 29 15:43:52 mv-moz2-linux-ix-slave01 Installed: passivetex.noarch 1.25-5.1.1
Jul 29 15:43:52 mv-moz2-linux-ix-slave01 Installed: perl-Compress-Zlib.i386 1.42-1.fc6
Jul 29 15:43:53 mv-moz2-linux-ix-slave01 Installed: perl-libwww-perl.noarch 5.805-1.1.1
Jul 29 15:43:53 mv-moz2-linux-ix-slave01 Installed: perl-XML-Parser.i386 2.34-6.1.2.2.1
Jul 29 15:43:54 mv-moz2-linux-ix-slave01 Installed: glibmm24-devel.i386 2.12.10-1.el5
Jul 29 15:43:54 mv-moz2-linux-ix-slave01 Installed: w3m.i386 0.5.1-15.el5
Jul 29 15:43:55 mv-moz2-linux-ix-slave01 Installed: gtkmm24-devel.i386 2.10.10-1.el5
Jul 29 15:43:56 mv-moz2-linux-ix-slave01 Installed: latex2html.noarch 2002.2.1-6
Jul 29 15:43:56 mv-moz2-linux-ix-slave01 Installed: xmlto.i386 0.0.18-13.1
Jul 29 15:44:20 mv-moz2-linux-ix-slave01 Installed: libglademm24.i386 2.6.3-3.el5
Jul 29 15:44:20 mv-moz2-linux-ix-slave01 Installed: libglademm24-devel.i386 2.6.3-3.el5
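
To turn a log like the one above into a plain package list, something along these lines should work. It is a sketch that assumes the package (name.arch) is always the field right after "Installed:"; the function name and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: pull package names out of install-log lines like the ones above.
# Assumes the package (name.arch) is the field right after "Installed:".

installed_packages() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "Installed:") {
                name = $(i + 1)
                sub(/\.[^.]+$/, "", name)   # drop the .i386 / .noarch suffix
                print name
                next
            }
        }
    }'
}

# Example (install.log is a hypothetical file holding the lines above):
#   installed_packages < install.log | sort -u
```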

If you just want the RPMs, I can generate the 64-bit ones if you give me access to a 64-bit geriatric CentOS 5.0 machine too. (Not saying I want to, just that I could...)

I don't think any of these other RPMs are necessary to just install systemtap-sdt-devel; as far as I know, they're only needed for the build. But I could be wrong. I haven't tried removing and downgrading everything to a virgin state.
Wow, thanks for going so far with this, it's super helpful! I tried out your RPM on a clean machine, and it seems to work fine:
[root@moz2-linux-slave04 cltbld]# rpm -i systemtap-sdt-devel-1.6-1moz1.i386.rpm 
 
[root@moz2-linux-slave04 cltbld]# 
[root@moz2-linux-slave04 cltbld]# dtrace
Usage /usr/bin/dtrace [--help] [-h | -G] [-C [-I<Path>]] -s File.d [-o <File>]

Is there anything else I should do to confirm that?
That's probably good enough, but here's my full testing procedure:

1. re-configure with --enable-dtrace
2. build the whole tree (I actually just build js/src, since it's much faster and it's all that you need)
3. cd $OBJDIR/js/src; readelf --sections jsgc.o | fgrep stap

This should display a couple of sections (.note.stapsdt, .rel.note.stapsdt, .stapsdt.base). Older systemtaps won't have these, though they may still work (they use a different mechanism).

4. readelf -x.probes js

This should show "Hex dump of section '.probes'". The newer version will probably just show a bunch of zeroes. Older versions will show a bunch of stuff. If it didn't work, it'll show nothing.

The next steps need to be done on a more recent Linux machine, one that is compatible with systemtap 1.6 (or at least 1.4; I think that's where the changeover was.) I know it works on CentOS 5.6 and Fedora 14.

5. Copy over the $OBJDIR/dist directory to a newer machine.

6. (as root) cd $OBJDIR/dist/bin; stap -L 'process("./js").mark("*")'

This should show something like:

process("./js").mark("execute__done") $arg1:long $arg2:long
process("./js").mark("execute__start") $arg1:long $arg2:long
process("./js").mark("function__entry") $arg1:long $arg2:long $arg3:long
process("./js").mark("function__return") $arg1:long $arg2:long $arg3:long
process("./js").mark("object__create") $arg1:long $arg2:long
process("./js").mark("object__finalize") $arg1:long $arg2:long $arg3:long

To be honest, on my machine it dies with

stap: /usr/lib/libdw.so.1: version `ELFUTILS_0.149' not found (required by stap)

because I managed to mangle things so that, even though I'm on 64-bit, it prefers /usr/lib/libdw.so.1 over /usr/lib64/libdw.so.1. I haven't figured that out. Deleting the /usr/lib one fixes it. But that's unlikely to be a problem for anyone else; I'm juggling multiple ELF and DWARF handling libraries for a feature I've been working on.

If it shows nothing, then it isn't working.

7. Put this into a file /tmp/probe.stp:

probe process("./js").mark("function__entry") {
  printf("file=%s class=%s func=%s\n",
         user_string($arg1),
         user_string($arg2),
         user_string($arg3));
  print_ustack(ubacktrace());
}

8. run: stap /tmp/probe.stp -c "./js -e 'function f() { print(3); }; function g() { f(); f(); }; g()'"

This should print out several dozen lines of stack traces.

If it prints out just 4 lines, starting with "WARNING: unwind Bad ..." then there's a problem (but that seems to be a symptom of running a 32-bit build on a 64-bit machine.)

It may say something about missing unwind information and suggest re-running with -d /lib64/something. That's ok.
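
The pass/fail criteria in steps 3, 6 and 8 can be turned into small filters that read the tool output on stdin. This is only a sketch of the checks as described in the steps; the function names are made up, and the strings matched are the ones quoted above, not an exhaustive list of what readelf or stap may print.

```shell
#!/bin/sh
# Sketch: small filters for the checks in steps 3, 6 and 8 above.
# Function names are illustrative; each reads tool output on stdin.

# Step 3: succeeds if `readelf --sections` output lists a .note.stapsdt
# section. Absence is not proof of failure: older systemtap versions use a
# different mechanism (see step 4).
has_stapsdt_note() {
    grep -q '\.note\.stapsdt'
}

# Step 6: prints how many markers `stap -L` listed (0 means not working).
count_marks() {
    grep -c '\.mark("' || true
}

# Step 8: succeeds if the run produced the unwind warning.
unwind_failed() {
    grep -q '^WARNING: unwind'
}

# Example use, from the directories named in the steps:
#   readelf --sections jsgc.o | has_stapsdt_note && echo "stapsdt note present"
#   stap -L 'process("./js").mark("*")' | count_marks
#   <stap command from step 8> 2>&1 | unwind_failed && echo "unwind problem"
```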
(In reply to comment #7)
> rpmbuild --rebuild --define "with_bundled_elfutils 1" --nodeps --define
> "with_crash 0" --define "elfutils_version 0.148" --define "with_grapher 0"
> systemtap-1.6-1moz1.src.rpm

I added --define "with_docs 0" (to avoid needing to install documentation dependencies) and tried building on a 64-bit machine and got:
RPM build errors:
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/libexec/systemtap/stap-authorize-cert
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/bin/stap-server
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/libexec/systemtap/stap-serverd
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/libexec/systemtap/stap-start-server
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/libexec/systemtap/stap-stop-server
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/libexec/systemtap/stap-gen-cert
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/libexec/systemtap/stap-sign-module
    File not found by glob: /var/tmp/systemtap-1.6-1moz1-root-cltbld/usr/share/man/man8/stap-server.8*
    File not found: /var/tmp/systemtap-1.6-1moz1-root-cltbld/var/run/stap-server


(Doesn't seem related to missing dependencies, though...)
Sorry, I should have kept around the command line and spec file I used to do something similar. You can build without the doc, nss, and nspr dependencies successfully if you remove the server subpackage. If you want to go down that road, you can comment out the whole '%files server' section in the spec file. (It's ok to leave in the '%package server' part; it won't do anything without a %files section.)

I think the doc stuff just needs xmlto and latex2html. Many of the missing files above look more like nss/nspr to me.
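
Commenting out the '%files server' section by hand is easy enough, but it can also be scripted. The sketch below assumes the section starts at a line reading `%files server` and runs until the next section header, and that no %if/%endif pair straddles that boundary (if one does, the result needs fixing by hand). The function name is hypothetical, and since rpm may still expand macros on commented lines, the output is worth reviewing before building.

```shell
#!/bin/sh
# Sketch: comment out one "%files <subpackage>" section of a spec file.
# Reads the spec on stdin, writes the modified spec on stdout.
# Assumes the section ends at the next section header line.

comment_out_files_section() {
    awk -v pkg="$1" '
        # Any section header ends the previous section; it starts the one
        # we want only if it is "%files <pkg>".
        /^%(files|changelog|package|description|prep|build|install|check|clean|pre|post|preun|postun)([ \t]|$)/ {
            in_section = ($1 == "%files" && $2 == pkg)
        }
        {
            prefix = in_section ? "#" : ""
            print prefix $0
        }
    '
}

# Example:
#   comment_out_files_section server < systemtap.spec > systemtap-noserver.spec
```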
The elfutils patch is exactly what is in Steve's SRPM. The spec file is the same, except that the files from comment #10 are commented out so that the build works. (We don't use the packages those files are from, so it doesn't matter if those packages end up broken.)

I'm going to back up the other parts of the SRPM (systemtap & elfutils source packages) somewhere; I haven't figured out where yet.
Attachment #550463 - Flags: review?(rail)
Steve, here's an objdir from a 64-bit build to test: http://people.mozilla.com/~bhearsum/obj64.tar.bz2
Comment on attachment 550463
system spec file + elfutils patch used to build

Looks good.
Attachment #550463 - Flags: review?(rail) → review+
(In reply to comment #13)
> Steve, here's an objdir from a 64-bit build to test:
> http://people.mozilla.com/~bhearsum/obj64.tar.bz2

Works perfectly! Thank you very much.
Attachment #549186 - Attachment is obsolete: true
Attachment #550505 - Flags: review?(rail)
Attachment #550505 - Flags: review?(rail) → review+
Planning to roll this out to the build machines tomorrow morning.
Comment on attachment 550505
updated puppet manifests

This is getting rolled out now. Should be on all of the machines within a day or two.
Attachment #550505 - Flags: checked-in+
Comment on attachment 550463
system spec file + elfutils patch used to build

Still need to figure out where to put systemtap-stuff.tar.bz2
Attachment #550463 - Flags: checked-in+
I ended up just stashing the SRPM in /N/production/centos5-i686/build/RPMs/, since that's the only part of the tarball we need.

Steve, all the machines should have this tomorrow, so you should be good to push things that depend on it starting then.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Great, thank you so much for this! Sorry about requesting such a hairball. But maybe you can record the time spent as motivation for the project to upgrade from CentOS 5.0... :)
Product: mozilla.org → Release Engineering