On the trunk, I can no longer run in gdb. I see:

> Note: verifyreflow is disabled
> Note: styleverifytree is disabled
> Note: frameverifytree is disabled
> [New Thread 3076 (runnable)]
>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> [Switching to Thread 2051 (runnable)]
> 0x0 in ?? ()

and I can't continue past that point.

Dbaron says:

> Reverting xpcom/threads/nsThread.cpp to revision 1.26 and
> xpcom/threads/nsThread.h to revision 1.15 (i.e., reverting dougt's
> checkin at 2000-09-30 22:35) fixes this problem for me. (I figured
> this out by doing a binary search of release builds, which also show
> the problem.)

I'm running gdb-5.0-0 and glibc-2.1.3-15 on a fairly stock (but the gdb isn't stock) RH 6.2 system. Marking critical because not being able to debug at all is a very serious regression.
Been seeing this too lately. The startup looks like this with gdb 4.18:

(gdb) run
Starting program: /usr/local/mozilla/mozilla-bin
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...[New Thread 20406 (manager thread)]
[New Thread 20404 (initial thread)]
[New Thread 20407]
I am inside the initialize
Hey : You are in QFA Startup
(QFA)Talkback loaded Ok.
[New Thread 20408]
Cannot access memory at address 0x2e362e6f
(gdb)

It's always the same address.
bryner and I also see this. dougt said he didn't. Here are some machine details I compiled via email and IRC:

dbaron & bryner:
 * RH 7.0 plus all errata (dbaron tried with and without glibc errata)
 * gdb-5.0-7
 * glibc-2.1.92-14 and glibc-2.1.94-3
 * binutils-22.214.171.124-1

dbaron:
 * kernel 2.4.test10.pre2
 * SMP (dual P-733)

bryner:
 * kernel 2.2
 * not SMP

akkana & dougt:
 * gdb 5
 * glibc-2.1.3-15
 * binutils-126.96.36.199.22-6

akkana:
 * compat-binutils-5.2-188.8.131.52.23.1
R.K.Aa's comments make me think that this is about debugging optimized builds. akk, are you trying to debug an optimized build?
For reference, the output I saw from gdb was:

Note: verifyreflow is disabled
Note: styleverifytree is disabled
Note: frameverifytree is disabled
[New Thread 3076 (LWP 30351)]
ptrace: No such process.
(gdb)
I'm not debugging an optimized build. However, the problem *also* showed up in release builds for me too -- that's how I quickly narrowed it down to a half-day period and figured out it was your checkin.
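The binary search over dated builds mentioned above can be sketched in shell. This is only an illustration, not the procedure dbaron actually used: `bisect_builds`, the build-list file and the check command are hypothetical names, and the sketch assumes the oldest build in the list works under gdb and the newest one does not.

```shell
#!/bin/sh
# Find the first bad build in a list by bisection.
#   $1: file with one build ID per line, oldest first
#   $2: command that exits 0 if the given build is good, non-zero if bad
# Assumption: the first listed build is good and the last one is bad.
# All names here are hypothetical, for illustration only.
bisect_builds() {
    list=$1
    check=$2
    lo=1                                  # line number of a known-good build
    hi=$(( $(wc -l < "$list") ))          # line number of a known-bad build
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        build=$(sed -n "${mid}p" "$list")
        if "$check" "$build"; then
            lo=$mid                       # still good: regression is later
        else
            hi=$mid                       # already bad: regression is here or earlier
        fi
    done
    sed -n "${hi}p" "$list"               # first bad build
}

# Usage (check_under_gdb would be your own script that starts the build
# under gdb and reports whether it got past startup):
#   bisect_builds builds.txt check_under_gdb
```

Each round halves the range, so a list of a few dozen nightly or half-day builds needs only five or six manual test runs.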
No, I'm seeing this with my own debug builds.
this is a gdb bug. There's nothing we can do about it in mozilla. I'm going to mark this INVALID in 3...2...1...
what is the bug? is there a gdb bug number?
It may not be a gdb bug, since some of us who see/don't see the problem have the same versions of gdb. It could be a glibc bug, or something else. But we do need to figure out which...
Reopening. We need to be able to debug mozilla, and this used to work (and still does work on the branch). If we could find a version of gdb that does work to debug mozilla, that would be a perfectly adequate resolution; but just saying "Oh, well, try to do your debugging with printf" isn't enough.
bruce pointed me at something earlier today: http://sources.redhat.com/ml/gdb/2000-10/msg00016.html basically gdb is really flakey when it comes to mixing threads and DLLs loaded at runtime.
That's my message to the gdb mailing list. This message is about using a GUI to run gdb. Threads were an issue, but only in the sense that the gdb code needed to service the event loop when the app was threaded. I was working on getting Insight (a GUI for gdb) to work on mozilla. The problem was that gdb went to sleep waiting for the app (mozilla in my case) to stop. To stop mozilla I wanted to click the "Stop" button, but the UI was frozen. After I figured out how to provide appropriate info this was fixed.

I just updated my tree and it is not crashing. Is there something one does to make the crash happen?
err.. i run from console, simply a "mozilla -g -d gdb". That approach has worked just fine till quite recently.
I run from rxvt, usually as "gdb mozilla-bin", but I just tried mozilla -g and mozilla -g -d gdb, and saw the same problem both times.
I built the trunk and it doesn't happen for me. So I'm copying Akkana's tree to my system and trying that.
When I run my copy of akkana's code on akkana's system, it crashes. When I run it on my system, it does not crash (it is NFS mounted on both systems).
I checked that the loaded libraries are the same (static lib must be the same since I'm running the same program)

[guitar]$ LD_LIBRARY_PATH=. ldd mozilla-bin
libgkgfx.so => ./libgkgfx.so (0x40015000)
libxpcom.so => ./libxpcom.so (0x4006d000)
libmozjs.so => ./libmozjs.so (0x401a9000)
libjsj.so => ./libjsj.so (0x40257000)
libplds4.so => ./libplds4.so (0x40278000)
libplc4.so => ./libplc4.so (0x4027c000)
libnspr4.so => ./libnspr4.so (0x40281000)
libpthread.so.0 => /lib/libpthread.so.0 (0x402cd000)
libjprof.so => ./libjprof.so (0x402e0000)
libnsl.so.1 => /lib/libnsl.so.1 (0x402e4000)
libutil.so.1 => /lib/libutil.so.1 (0x402fa000)
libresolv.so.2 => /lib/libresolv.so.2 (0x402fd000)
libdl.so.2 => /lib/libdl.so.2 (0x4030c000)
libstdc++-libc6.1-1.so.2 => /usr/lib/libstdc++-libc6.1-1.so.2 (0x40310000)
libm.so.6 => /lib/libm.so.6 (0x40352000)
libc.so.6 => /lib/libc.so.6 (0x40370000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
[guitar]$ sum -r /lib/libpthread.so.0 /lib/libnsl.so.1 /lib/libutil.so.1 /lib/libresolv.so.2 /lib/libdl.so.2 /usr/lib/libstdc++-libc6.1-1.so.2 /lib/libm.so.6 /lib/libc.so.6 /lib/ld-linux.so.2
25259 284 /lib/libpthread.so.0
11717 362 /lib/libnsl.so.1
16688 46 /lib/libutil.so.1
64731 166 /lib/libresolv.so.2
38159 74 /lib/libdl.so.2
62678 1118 /usr/lib/libstdc++-libc6.1-1.so.2
12635 516 /lib/libm.so.6
15111 4006 /lib/libc.so.6
24886 333 /lib/ld-linux.so.2
[guitar]$

[accipiter]$ LD_LIBRARY_PATH=. ldd mozilla-bin
libgkgfx.so => ./libgkgfx.so (0x40015000)
libxpcom.so => ./libxpcom.so (0x4006d000)
libmozjs.so => ./libmozjs.so (0x401a9000)
libjsj.so => ./libjsj.so (0x40257000)
libplds4.so => ./libplds4.so (0x40278000)
libplc4.so => ./libplc4.so (0x4027c000)
libnspr4.so => ./libnspr4.so (0x40281000)
libpthread.so.0 => /lib/libpthread.so.0 (0x402cd000)
libjprof.so => ./libjprof.so (0x402e0000)
libnsl.so.1 => /lib/libnsl.so.1 (0x402e4000)
libutil.so.1 => /lib/libutil.so.1 (0x402fa000)
libresolv.so.2 => /lib/libresolv.so.2 (0x402fd000)
libdl.so.2 => /lib/libdl.so.2 (0x4030c000)
libstdc++-libc6.1-1.so.2 => /usr/lib/libstdc++-libc6.1-1.so.2 (0x40310000)
libm.so.6 => /lib/libm.so.6 (0x40352000)
libc.so.6 => /lib/libc.so.6 (0x40370000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
[accipiter]$ sum -r /lib/libpthread.so.0 /lib/libnsl.so.1 /lib/libutil.so.1 /lib/libresolv.so.2 /lib/libdl.so.2 /usr/lib/libstdc++-libc6.1-1.so.2 /lib/libm.so.6 /lib/libc.so.6 /lib/ld-linux.so.2
25259 284 /lib/libpthread.so.0
11717 362 /lib/libnsl.so.1
16688 46 /lib/libutil.so.1
64731 166 /lib/libresolv.so.2
38159 74 /lib/libdl.so.2
62678 1118 /usr/lib/libstdc++-libc6.1-1.so.2
12635 516 /lib/libm.so.6
15111 4006 /lib/libc.so.6
24886 333 /lib/ld-linux.so.2
[accipiter]$
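Comparing two long checksum listings by eye is error-prone, so the same check can be scripted. A minimal sketch, assuming each host's `sum -r` output has been saved to a file; `diff_lib_sums` and the `sums.*` file names are hypothetical.

```shell
#!/bin/sh
# Compare two saved checksum listings (one per host) and print only the
# lines that differ. Exit status is 0 when the listings are identical.
# diff_lib_sums is a hypothetical helper name.
diff_lib_sums() {
    # Sort both listings so the order in which files were given to
    # "sum -r" does not matter.
    sort "$1" > "$1.sorted"
    sort "$2" > "$2.sorted"
    diff "$1.sorted" "$2.sorted"
}

# On each host, from the directory containing mozilla-bin, record the
# checksum of every library that ldd resolves to an absolute path
# (write to sums.accipiter on the other host):
#   LD_LIBRARY_PATH=. ldd mozilla-bin | awk '$3 ~ /^\// {print $3}' \
#       | xargs sum -r > sums.guitar
# Then, on either host:
#   diff_lib_sums sums.guitar sums.accipiter
```

No output means the system libraries match, which is what the listings above show for guitar and accipiter.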
uname -a
Linux guitar.mcom.com 2.2.14-5.0smp #1 SMP Tue Mar 7 21:01:40 EST 2000 i686 unknown
Linux accipiter.mcom.com 2.2.14-5.0smp #1 SMP Tue Mar 7 21:01:40 EST 2000 i686 unknown
[guitar]$ gtk-config --version
1.2.6
[guitar]$

[accipiter]$ gtk-config --version
1.2.6
[accipiter]$
[guitar]$ gdb -v
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux".
[guitar]$

[accipiter]$ gdb -v
GNU gdb 20001004
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
[accipiter]$
[guitar]$ gcc --version
egcs-2.91.66
[guitar]$

[accipiter]$ gcc --version
egcs-2.91.66
[accipiter]$
blizzard, can you look at this?
I've built a recent (10/4) insight/gdb. When I run it (uninstalled) on Akkana's machine guitar it does not crash but gdb still has a problem.
People are free to install from my build.
Note: as a side effect, it will provide a GUI front end to gdb.
Note: it will install in /usr/local/bin/gdb, so the copy in /usr/bin/gdb will still be there. You will need to have the new version first on your path.
To install: cd to /u/bstell/downloads/insight/insight+dejagnu-20001004 and, as root, do a make install.
If you did something in mozilla that breaks gdb it's a gdb bug. Period. You shouldn't be able to crash the debugger. For what it's worth I don't have any problems. You can also try breaking at main, shutting off loading libraries and let it finish running and loading the libraries later. It makes it hard to debug certain kinds of bugs but it does help some.
I've tried various combinations of loading / not loading shared libraries, and I see the same symptoms. Whatever the gdb bug is (or bug in something gdb uses), it only shows up on some people's machines. If we're going to report a bug against gdb, it might be good to have some idea of what the conditions are, although I guess I could create an account on my machine for someone to debug on...
I built with the "--prefix" options so I could run it without installing it. However this means it won't install. I'm rebuilding without the "--prefix".
There's a misperception here. gdb doesn't crash; it starts up mozilla, then mozilla dies with either a bad memory reference or a SIGTRAP (neither of which happens when mozilla is run outside of gdb). It's probably a race condition of some sort, which would explain why it happens on some machines and not others, and why it happens in slightly different ways on different machines. I run with prun (from mozilla's debugging-hints.html), so I already am breaking at main and delaying loading libraries, because that's the only way I've ever managed to get mozilla to run under gdb5 and RH6.2, even on my huge-memory machine. (I used to be able to run directly under gdb4 and RH6.0.)
It looks like one cannot install off an NFS partition. I would guess that the install creates a temp file, but root (one needs to be root to install) cannot create that file on NFS, so the install fails.
Akkana upgraded her gdb (with insight/gdb) and mozilla on the trunk appears not to crash.
get insight here: http://sources.redhat.com/insight/
get gdb here: http://sources.redhat.com/gdb/
Life is good. Insight is pretty cool, too. Check it out! We should add this info to the Mozilla debugging faq (I can do that, unless bliz would rather do it).
sea linux 2000102021: Can again run "mozilla -g -d gdb" An additional "run" now starts mozilla just fine, like it used to, before this weirdness. (still using RH6.2 gdb-4.18-11)
just fine, apart from a line during startup:

warning: find_solib: Can't read pathname for load map: Input/output error

Didn't notice this earlier, perhaps it doesn't always display.
I encountered the same problem with gdb-5.0 from ftp.gnu.org compiled on RH6.2. After I grabbed the cvs tree from :pserver:firstname.lastname@example.org:/cvs/src:

cvs -z9 -d :pserver:email@example.com:/cvs/src co gdb dejagnu

everything is OK (at least with the 22 Oct 2000 cvs). So I think it's a gdb bug.
Can someone with a working gdb put his/her binary (or a source tarball) on the web, please? I can't believe that everyone is supposed to pull the gdb sources from CVS, it's so sloooowwwww :(
Ok, sorry, I yelled at the wrong people. The problem is that the German ftp mirror is not up-to-date. Adding a working URL to the gdb snapshot tarball.
People probably want to talk to the gdb folks (or redhat folks, etc.) to figure out how to get pre-built gdb.
get insight here: http://sources.redhat.com/insight/
get gdb here: http://sources.redhat.com/gdb/
Since upgrading gdb seems to solve this issue should we close this bug?
Since it isn't even a problem with OLD gdb anymore, i think it's safe to close this bug yes.
It isn't? I still can't run under the gdb 5.0 that comes with RH7:

Note: verifyreflow is disabled
Style Data Sharing is Enabled :)
Note: styleverifytree is disabled
Note: frameverifytree is disabled
[New Thread 3076 (LWP 21385)]
[New Thread 4101 (LWP 21386)]
[New Thread 5126 (LWP 21387)]
ptrace: No such process.
(gdb) c
Continuing.
ptrace: No such process.
[Switching to Thread 3076 (unknown thread_db state 1)]
0x0 in ?? ()

Subsequent continues just repeat the same messages, except for the "[switching to thread" message, which isn't repeated. Sure would be nice if there were a binary install of gdb available somewhere which could debug mozilla.
I am using RH7 too, with its version of gdb. I am not having any problems. How are you starting mozilla? I basically do this in emacs:

gdb ~/mozilla-bin
br main
cont
shar thread
shar xpcom
cont
Aha, you're right, it does work if I set auto-solib-add 0 and then do prun. It's just a regular run that doesn't work now (even on a huge-memory machine), i.e.

gdb mozilla-bin
(gdb) run
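The delayed-library startup described in the last two comments can be captured in a gdb command file so it does not have to be retyped. A minimal sketch, using only the gdb commands quoted in this bug (`set auto-solib-add 0`, `break main`, `sharedlibrary`); the helper name `write_delayed_solib_script` and the file name `moz-delayed.gdb` are hypothetical, and this is not the `prun` macro from debugging-hints.html.

```shell
#!/bin/sh
# Write a gdb command file that stops at main with automatic
# shared-library symbol loading turned off, then loads symbols only
# for the libraries named on the command line.
# Function and file names are illustrative, not part of mozilla.
write_delayed_solib_script() {
    out=$1
    shift
    {
        echo "set auto-solib-add 0"       # do not auto-load shlib symbols
        echo "break main"
        echo "run"
        for lib in "$@"; do
            echo "sharedlibrary $lib"     # load symbols for matching libs
        done
        echo "continue"
    } > "$out"
}

# Example: load symbols for the thread library and xpcom only.
write_delayed_solib_script moz-delayed.gdb pthread xpcom

# Then, from the directory containing mozilla-bin:
#   LD_LIBRARY_PATH=. gdb -x moz-delayed.gdb ./mozilla-bin
```

Whether this avoids the "ptrace: No such process" failure depends on the gdb build, as the comments here show; the sketch only automates the workaround that was reported to help.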
gdb in redhat rawhide seems to have the necessary fixes: ftp://rpmfind.net/linux/rawhide/1.0/i386/RedHat/RPMS/
Indeed, the rawhide RPM for gdb 5.0 does let me run, even without delayed shlib loading. However, once it runs it's not very useful: it randomly skips lines when single-stepping with "next" (usually the lines I most want to step into), and it randomly loses track of where it is in my source file and starts printing line numbers from nsCOMPtr.cpp or somewhere else in xpcom even when it's still stepping through source lines that live in some other file. The latest gdb built from cvs has the same problem.
akkana, are you still on an RH 6.x box running that rawhide rpm?
No, this is RH 7.0.
UGH. This bit me this morning. I am not sure what gives. Upgrading to the gdb at the URL that bryner suggests does not help :-( This is what I did:

(gdb) prun
Breakpoint 1 at 0x8054de7: file nsAppRunner.cpp, line 1216.
Warning: MOZILLA_FIVE_HOME not set.
main (argc=2, argv=0xbffffab4) at nsAppRunner.cpp:1216
[New Thread 1024 (LWP 2022)]
(gdb) shar pthr
Symbols already loaded for /lib/libpthread.so.0
(gdb) c
Continuing.
Warning: MOZILLA_FIVE_HOME not set.
Type Manifest File: /home/builds/cmonkey/mozilla/dist/bin/components/xpti.dat
nsNativeComponentLoader: autoregistering begins.
nsNativeComponentLoader: autoregistering succeeded
nNCL: registering deferred (0)
--- > Buffered registry read fs hits (31)
[New Thread 2049 (LWP 2025)]
[New Thread 1026 (LWP 2026)]
--- > Buffered registry read fs hits (32)
GFX: dpi=96 t2p=0.0666667 p2t=15 depth=16
WEBSHELL+ = 1
[New Thread 2051 (LWP 2027)]
********** Got plugins path: /home/builds/cmonkey/mozilla/dist/bin/plugins
Note: verifyreflow is disabled
[New Thread 3076 (LWP 2028)]
Note: styleverifytree is disabled
ptrace: No such process.
(gdb)

Any ideas?
dougt: try building from gdb's CVS tip. the URL that bstell pasted in earlier on can get you to instructions on how to do that.
I tried build 5.0-11 which works better, but it still gets stupid during thread switches.
With gdb 20010102, I get continuable SIG32 breaks:

(gdb) run
Starting program: /project/omega/mozilla/mtrunk-0105/./mozilla-bin
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
Gdk-WARNING **: locale not supported by C library
(no debugging symbols found)...
Program received signal SIG32, Real-time event 32.
0x4024bb6e in __sigsuspend (set=0xbfffdc4c) at ../sysdeps/unix/sysv/linux/sigsuspend.c:48
48 ../sysdeps/unix/sysv/linux/sigsuspend.c: No such file or directory.
in ../sysdeps/unix/sysv/linux/sigsuspend.c
(gdb) cont
Continuing.
afranke: see <http://www.mozilla.org/unix/debugging-faq.html#sig32>
dmose: Thanks, I had indeed forgotten about that, but I was talking about nightlies here, and there a "b main" gives:

Function "main" not defined.

So I just start them with "(gdb) run", and that causes two or four SIG32 breaks before a mozilla window (profile manager) comes up, namely between

Gdk-WARNING **: locale not supported by C library

and

Registering plugin 0 for: "*","All types",".*"

On exit, there's the same problem, and sometimes during the run, too. But these may all be signs of subtle differences between Linux versions (I'm running on SuSE 6.2/6.4 here). Debugging a self-made debug build doesn't show this problem.
I just started seeing this problem for the first time with a CVS debug build with build date 2001-03-01 15:58:53 PST. In my case, the following steps make it happen reproducibly:

1. Start mozilla using either
     mozilla -g (I have ddd)
   or
     export LD_LIBRARY_PATH=.
     gdb mozilla-bin
   The results are the same either way.
2. b main
3. r
4. set auto-solib-add 0
5. c

I see:

Program received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 3076 (runnable)]
0x0 in ?? ()
(gdb)

I was able to debug using exactly this methodology without problem until recently. I have gdb 5.0 and gcc 2.95.2. I mention gcc because I only started to see this problem after the upgrade of gcc from 2.91 to 2.95.2 and a full recompile of mozilla using it, but maybe that is just a coincidence. I get the problem with gdb itself compiled with gcc 2.91 or 2.95.2. The gcc business is probably just a bad tree I've been barking up.

This is a RH 6.2 system (but with kernel 2.4.2). Still the original glibc-2.1.3-15.

This has me absolutely at a standstill since I can't debug at all now. Based on reading all the prior comments, I guess I have to find a newer-than-5.0 gdb.
I have made some progress by getting a recent snapshot gdb (last weekly). However, this is not entirely satisfactory. First, there is some sort of incompatibility with ddd. When I attempt to do a 'set env', I get 2 alerts: "GDB is busy", and "GDB terminated abnormally". The latter one has a "Restart GDB" button. If I click that, I am able to go ahead and type for example 'set env XPCOM_BREAK_ON_LOAD msgnews', this time without error (!?). But on most (but not all) attempts, after it has stopped (or sometimes before) at the load of libmsgnews.so and I have successfully set my breakpoint, everything grinds to a halt with:

Cannot find thread 33: invalid thread handle.

This last error occurs less often if I just run gdb from the command line, but it still happens sometimes even then. Once it does, I'm dead. Is there a specific gdb version that actually works reliably on mozilla?
Seems like this is a fine collection of several gdb problems. I do get the "ptrace: No such process" message. Perhaps it is helpful for you to know that this ONLY happens with a self-compiled version taken from CVS, while the precompiled versions from mozilla.org work fine. I do a configure with the following options:

./configure --enable-strip-libs --enable-optimize --disable-debug --disable-pedantic

Via CXXFLAGS and friends, "-O2" is set. The compiler is gcc 2.95.2, gdb is version 5.0 and I am using a Suse 7.1 Linux distribution. Please note that this bug prevents me from getting good stack traces, which are quite useful for bugzilla.
rumstich: I suspect --enable-strip-libs might be what's causing your problem. try removing that and see if it makes any difference.
You may also want to try a post-5.0 gdb. I'm running a 04/01/2001 gdb. "info threads" still comes up empty, but otherwise it seems to work (SuSE 6.2). As a side note (probably unrelated): if your gdb stops on SIG32 real time events, adding a "handle SIG32 nostop" line to ~/.gdbinit may help.
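The `handle SIG32 nostop` setting suggested above can be added to an init file without duplicating it on repeated runs. A minimal sketch: `add_gdbinit_line` is a hypothetical helper, and it writes to a scratch file `./gdbinit.sample` rather than `~/.gdbinit`, so the result can be inspected before it is copied into place.

```shell
#!/bin/sh
# Append a line to a gdb init file only if it is not already there.
# add_gdbinit_line is a hypothetical helper name.
add_gdbinit_line() {
    file=$1
    line=$2
    touch "$file"
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# The handle line is the one quoted in the comment above. The target is a
# scratch file; copy or append it to ~/.gdbinit once you are happy with it.
add_gdbinit_line ./gdbinit.sample "handle SIG32 nostop"
```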
<aol>me too</aol> seriously, this is blocking me from getting any stack traces out of my own debug builds. i should note that there's some vaguely useful info out there if you search google for "ptrace no such process." upping to blocker, per trudelle. adding cc's.
dan: I just found out that I am stupid :-/ I just tested one of the latest binary downloads and it did not work either; it also showed the ptrace problem. It seems that since I installed the new Suse a few months ago I never downloaded a binary Mozilla (with a slow modem it takes quite a while). Ok, the debugging works on Suse 6.4, but not on Suse 7.1. I guess that one of the newer things broke this feature. After reading this bug I think it is gdb 5.0; Suse 6.4 still has gdb 4.18. But I don't know anything about it...
FWIW, the CVS tree for GDB is scheduled to get some big whacks related to its handling of threads in the next few weeks. See <http://sources.redhat.com/ml/gdb-patches/2001-04/msg00240.html> and subsequent messages for more details.
Now that mozilla has changed to gcc 2.95.3, I can't use gdb on my good old Suse 6.4 system anymore; it worked fine here before. I get the following message:

Thread 1721 (manager thread)]
[New Thread 1720 (initial thread)]
[New Thread 1722]
[New Thread 1723]
[New Thread 1724]
Cannot access memory at address 0x46203a6c.

It looks like the behaviour under gdb is also connected to the libs dynamically linked to Mozilla. Without gdb, mozilla works fine. Suse 6.4 ships with gcc 2.95.2, so I am quite surprised that I get problems now!
Works for Me on Red Hat 7.1
blizzard: it doesn't work on a LOT of distributions, for example several Suse releases, all those which ship with gdb 5.0. As described in this bug, this is caused by a bug in gdb, and it should work fine with an upgraded gdb. Not sure if you can expect an ordinary user to upgrade to a new (unofficial?) gdb just to produce a stack trace...
Jens-Uwe: what other solution do you propose?
Moving all threading bugs to XPCOM. See bug 160356.