Overview Description: If built with --disable-debug, Mozilla crashes while registering its components.

Steps to Reproduce: Build with --disable-debug, run.

Actual Results: A segmentation fault. GDB output follows; here TestXPC is run, but the same applies to the browser itself:

(gdb) run
Starting program: /home/build/mozilla-build/dist/bin/./TestXPC
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x40008714 in _dl_relocate_object (l=0x80ee4d8, scope=0x80ee6dc, lazy=1, consider_profiling=0) at ../sysdeps/i386/dl-machine.h:326
326 ../sysdeps/i386/dl-machine.h: No such file or directory.
(gdb) bt
#0 0x40008714 in _dl_relocate_object (l=0x80ee4d8, scope=0x80ee6dc, lazy=1, consider_profiling=0) at ../sysdeps/i386/dl-machine.h:326
#1 0x402a8df4 in dl_open_worker (a=0xbffff0bc) at dl-open.c:182
#2 0x40009bde in _dl_catch_error (errstring=0xbffff0b8, operate=0x402a8b84 <dl_open_worker>, args=0xbffff0bc) at dl-error.c:141
#3 0x402a8f55 in _dl_open (file=0x80c1a18 "/home/build/mozilla-build/dist/bin/components/libnspng.so", mode=1, caller=0x4011f62b) at dl-open.c:232
#4 0x40178ffd in dlopen_doit (a=0xbffff20c) at dlopen.c:41
#5 0x40009bde in _dl_catch_error (errstring=0x804ec28, operate=0x40178fd0 <dlopen_doit>, args=0xbffff20c) at dl-error.c:141
#6 0x40179642 in _dlerror_run (operate=0x40178fd0 <dlopen_doit>, args=0xbffff20c) at dlerror.c:125
#7 0x4017903e in __dlopen_check (file=0x80c1a18 "/home/build/mozilla-build/dist/bin/components/libnspng.so", mode=1) at dlopen.c:53
#8 0x4011f62b in pr_LoadLibraryByPathname () from /home/build/mozilla-build/dist/bin/libnspr4.so
#9 0x4011f550 in PR_LoadLibraryWithFlags () from /home/build/mozilla-build/dist/bin/libnspr4.so
#10 0x4011f589 in PR_LoadLibrary () from /home/build/mozilla-build/dist/bin/libnspr4.so
#11 0x400c8668 in nsLocalFile::Load () from /home/build/mozilla-build/dist/bin/libxpcom.so
#12 0x400e2a32 in nsDll::Load () from /home/build/mozilla-build/dist/bin/libxpcom.so
#13 0x400dcb75 in nsNativeComponentLoader::SelfRegisterDll () from /home/build/mozilla-build/dist/bin/libxpcom.so
#14 0x400dd2f8 in nsNativeComponentLoader::AutoRegisterComponent () from /home/build/mozilla-build/dist/bin/libxpcom.so
#15 0x400dc97e in nsNativeComponentLoader::RegisterComponentsInDir () from /home/build/mozilla-build/dist/bin/libxpcom.so
#16 0x400dc889 in nsNativeComponentLoader::AutoRegisterComponents () from /home/build/mozilla-build/dist/bin/libxpcom.so
#17 0x400daf9a in nsComponentManagerImpl::AutoRegister () from /home/build/mozilla-build/dist/bin/libxpcom.so
#18 0x400dffe2 in nsComponentManager::AutoRegister () from /home/build/mozilla-build/dist/bin/libxpcom.so
#19 0x8049702 in JS_PushArguments ()
#20 0x804aea1 in JS_PushArguments ()
#21 0x401f7711 in __libc_start_main (main=0x804ae68 <JS_PushArguments+6260>, argc=1, argv=0xbffffb94, init=0x80492d0 <_init>, fini=0x804b594 <_fini>, rtld_fini=0x40009df4 <_dl_fini>, stack_end=0xbffffb8c) at ../sysdeps/generic/libc-start.c:90
(gdb)

Expected results: Successful component registering and startup.

Build Date & Platform: I first encountered the bug in November and it has followed me all the way. The latest I've tried is
-rw-rw-r-- 1 22 21770161 Feb 24 09:51 mozilla-source.tar.gz
I'm running on an i686-pc-linux-gnu, kernel 2.2.14, glibc 2.1.2, gcc 2.95.2, a self-built system with no problems whatsoever with other programs.

Additional information: Mozilla works just fine if built without --disable-debug. I successfully built the same sources with --enable-optimize --enable-strip-libs --enable-x11-shm and I'm writing this bug report with the results. However, if I use --disable-debug it just won't work, and it never has worked here. Don't hesitate to ask if you've got questions.
Now that glibc-2.1.3 is officially out I compiled both GCC (for building libstdc++) and Mozilla against it. However, the bug remains. The backtrace is a little different now, it won't die in _dl_relocate_object but in _dl_lookup_symbol instead:

(gdb) run
Starting program: /home/build/mozilla-build/dist/bin/TestXPC
Program received signal SIGSEGV, Segmentation fault.
0x40007538 in _dl_lookup_symbol (undef_name=0x7da5d989 <Address 0x7da5d989 out of bounds>, ref=0xbfffee9c, symbol_scope=0x80f03b4, reference_name=0x8076940 "./components/libnspng.so", reloc_type=18) at ../sysdeps/i386/i686/dl-hash.h:76
76 ../sysdeps/i386/i686/dl-hash.h: No such file or directory.
(gdb) bt
#0 0x40007538 in _dl_lookup_symbol (undef_name=0x7da5d989 <Address 0x7da5d989 out of bounds>, ref=0xbfffee9c, symbol_scope=0x80f03b4, reference_name=0x8076940 "./components/libnspng.so", reloc_type=18) at ../sysdeps/i386/i686/dl-hash.h:76
#1 0x400092a3 in _dl_relocate_object (l=0x80f01b0, scope=0x80f03b4, lazy=1, consider_profiling=0) at ../sysdeps/i386/dl-machine.h:326
#2 0x402fae64 in dl_open_worker (a=0xbffff03c) at dl-open.c:182

-Vesa
The loader on older Linux systems wasn't thread safe and we had to use. We got it fixed in the Red Hat 6.0 distribution. This looks very much like the same bug: a crash when loading DLLs. Shaver, does this look like the same one? Any suggestions on what needs to be installed?
Assignee: dp → shaver
marking new and cc'ing blizzard. vesuri, are you still seeing the problem?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Yes, the problem is still there. I just downloaded the latest sources (Apr 23 19:03) and built with --disable-debug. Backtrace goes through pr_LoadLibraryByPathname() -> __dlopen_check -> _dlerror_run -> _dl_catch_error -> dlopen_doit -> _dl_open -> _dl_catch_error -> dl_open_worker -> _dl_relocate_object and dies there:

#0 0x40009234 in _dl_relocate_object (l=0x808cad8, scope=0x808ccdc, lazy=1, consider_profiling=0) at ../sysdeps/i386/dl-machine.h:326
326 ../sysdeps/i386/dl-machine.h: No such file or directory.

And again, this is GCC 2.95.2 with binutils 220.127.116.11.35 and glibc-2.1.3. Hmm, reloc fails. This is weird, indeed.
Adding crash keyword.
reassigning to the (hopefully) correct instance of shaver
Assignee: shaver → shaver
--disable-debug shouldn't matter, if it's the same bug, and I don't think TestXPC uses more than one thread. It's always libnspng.so, though, which is interesting. If you remove that library, what happens? Does your build use the system libpng, or build the one out of the Mozilla tree?
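One way to answer the "is it always libnspng.so?" question without moving files around is to load each component library in a throwaway child process and see which ones take the child down. What follows is only a rough sketch, not anything from the Mozilla tree: the helper names (classify, probe_library) are made up for illustration, and it uses Python's ctypes as a stand-in for PR_LoadLibrary. Note also that ctypes may bind symbols eagerly, whereas the backtraces above show lazy=1, so a clean result here would not rule the library out.

```python
import subprocess
import sys

# Child program: dlopen() a single library through ctypes, exit 0 on success.
_CHILD = "import ctypes, sys; ctypes.CDLL(sys.argv[1])"


def classify(returncode):
    """Map a child process exit status to a short verdict (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N (-11 is SIGSEGV).
        return "killed by signal %d" % -returncode
    return "load error"


def probe_library(path):
    """Load `path` in a separate process so a loader crash cannot kill the caller."""
    result = subprocess.run(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return classify(result.returncode)
```

Running probe_library over every *.so file under dist/bin/components and printing the verdicts would show whether the crash is specific to one library or hits several.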
Status: NEW → ASSIGNED
Assignee: shaver → vesuri
Status: ASSIGNED → NEW
I need answers to those questions to fix this, so I'm reassigning to Vesa until I can get them.
This bug is the same as 41414.
And SuSE's going to ship 7.0 with a debugging version because otherwise it wouldn't work for hundreds of thousands of people using SuSE Linux. Maybe I could do you a favour and send you a SuSE Linux distribution so you can reproduce it yourself? BTW: It's not a SuSE problem, because I'm having the same problem on my homebrewed distribution, which is completely different. The only thing the two systems have in common is the latest glibc 2.1.3.
Didn't the reporter of this bug (Vesa Halttunen, email@example.com) say he was using glibc 2.1.2?
A few lines above he stated that the bug is still there for him with glibc2.1.3. What are the people using who do NOT see the bug?
I did use glibc 2.1.2 when I first reported the bug, but if you read my comments you'll notice the bug is also present when using glibc 2.1.3. I haven't had time to look into this and I'm sorry for that. To me it seems as if it might be a bug in either glibc or in the dynamic loader (ld.so). Another thing popped into my mind as well: is it possible that this comes up if Mozilla gets linked against the libpng distributed with Mozilla but the shared library in the system is a different version? I did have some strange crashing problems when I updated my libpng a few weeks ago. This is pure speculation since I'm currently not able to test it (being in the US, not home), but I think I will do that when I get back. As for Daniel Egger's comment, I'd like to emphasize that I have a homebrewed distribution as well: all binaries on my system have been built by me, and all of them work just fine. If someone else is willing to look into this please go ahead, but I do have this thing in my mind, on a lower priority =)
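The libpng mismatch theory could be checked by looking at which libpng the dynamic loader actually resolves for libnspng.so, for instance from the output of ldd. A small sketch of how that output might be filtered follows; the function name and the sample lines in the usage note are hypothetical, and the sample paths and addresses are invented, not taken from any system in this bug.

```python
import re

# Matches lines such as "libfoo.so.1 => /usr/lib/libfoo.so.1 (0x40000000)"
# and "libfoo.so.1 => not found".
_LDD_LINE = re.compile(
    r"^\s*(\S+)\s+=>\s+(not found|\S+)(?:\s+\((0x[0-9a-fA-F]+)\))?\s*$"
)


def resolved_libs(ldd_output, name_prefix):
    """Return {soname: resolved path} for sonames starting with name_prefix."""
    found = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1).startswith(name_prefix):
            found[match.group(1)] = match.group(2)
    return found
```

Feeding it the text printed by ldd for dist/bin/components/libnspng.so and asking for the "libpng" prefix would show whether the loader picks up a system copy; a path outside the Mozilla build directory, or "not found", would be worth reporting here.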
I reported bug 41414. I'm using Red Hat 6.2 + all the errata/security updates + some of the Rawhide (Red Hat beta) rpms. But no one else on Red Hat 6.2 has been able to reproduce it. I dunno why. I'm using glibc 2.1.3 and I compile without any options regarding libpng in my .mozconfig. I see this bug in the nightlies also, but I don't know what options regarding libpng are used there. If someone wants, I can send them an "rpm -qa".
I'm on Red Hat 6.2 with glibc 2.1.3. Pretty vanilla system.
*** Bug 41414 has been marked as a duplicate of this bug. ***
Last night I downloaded a talkback build for the first time and generated around 42 talkback reports for this bug. How do I go about getting these connected to this bug? Thanks.
*** Bug 47046 has been marked as a duplicate of this bug. ***
Shaver, please see my talkback reports in bug 47046.
Assignee: vesuri → shaver
Severity: normal → critical
Status: ASSIGNED → NEW
Keywords: dogfood, nsbeta3, relnote, relnote2
GetNewOrUsedProxy? Smells like something jband would know more about. It's not clear to me why this bug is assigned to me, but I'm pretty sure I'm not going to have time to fix it.
Putting on [dogfood-] radar. Not critical to everyday use.
GetNewOrUsedProxy would be dougt's world.
Boy, this bug has morphed. The first stack looks like it is the JS component loader. Then Vesa Halttunen adds another stack that looks different (maybe a build problem), then states it was from source that was pulled on Apr 23! Lastly, Daniel Egger and Trudelle believe that this is a dup of 41414 and 47046. Ugh. I am going to reopen 47046 and take a look at it.
I'm seeing this bug on a glibc 2.1.3 box. Added myself to the cc list.
I don't follow, dougt: how does the first stack look like the JS component loader? I see the native component loader on the stack, trying to load our PNG plugin -- and possibly barfing on the entrained libpng.so, though I never really got an answer to that question -- but there's no JS component loader anywhere. I don't know what to do with this bug: I don't have enough information to fix it, it doesn't happen for me, and I'm not motivated to go install SuSE 7.x -- or duplicate Vesa's frankensystem. I'm pretty tempted to mark it WONTFIX, since I won't fix it. (It would be interesting to see if just turning off symbolic debugging information, versus turning off all the -DDEBUG stuff, makes a difference. Also, what happens if you rebuild just libnspng.so without debugging?)
Ok, new info: After a recompile of glibc it finally works on my system without a debugging build of glibc though it still does not on a standard SuSE 7.0 system. Mike: No, recompiling libns* with debugging doesn't help.
mike - I don't know what I was smoking.
Marking nsbeta3- per pdt review.
Whiteboard: [dogfood-] → [dogfood-][nsbeta3-]
Daniel: Let me make sure I get this straight: you recompiled the same version of glibc, and now it works? What options do you use with glibc? SuSE?
It seems unclear to me whether this bug requires either of a "developer" or "user" release note for Netscape 6 RTM. If anyone feels it does, can they please draft one and then nominate with the relnote-user or relnote-devel strings in the Status Whiteboard. Thanks :-) Gerv
Options? I haven't used any special options for glibc ... and I use --disable-debug --enable-optimize for mozilla normally.
firstname.lastname@example.org is no longer a valid email. reassigning qa contact to the component's default.
QA Contact: leger → rayw
Can anyone please verify this bug or close it? I haven't seen it for ages now so I guess it has been fixed in the meantime.
Status: NEW → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → WORKSFORME
I'm going to mark this as WORKSFORME since shaver is MIA right now and I am his stunt double.