Closed Bug 136144 Opened 22 years ago Closed 15 years ago

sparc-sun-solaris-7_8_9 builds crash on startup

Categories

(SeaMonkey :: Build Config, defect)

Sun
Solaris
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: Matt.Behrens, Assigned: mozilla)

Attachments

(2 files)

I've been unable to run the Solaris 7/8/9 nightlies on a Solaris 8 machine that 
I've been using Solaris 2.6 nightlies on for some time.

8_Recommended patchset applied last week.  GTK+ and friends obtained from the 
Solaris GNOME 1.4 distribution.

100% reproducible.  Moving ~/.mozilla out of the way doesn't help.

[nye mozilla]$ ./run-mozilla.sh -g -d gdb 
MOZILLA_FIVE_HOME=.
  LD_LIBRARY_PATH=.:./plugins:/opt/kde/lib:/opt/kde/kde2/lib:/opt/kde/qt-2.3.0/lib:/opt/gnome-1.4/lib:/opt/sfw/lib:/opt/fltk/lib
FONTCONFIG_PATH=/etc/fonts:./res/Xft
DYLD_LIBRARY_PATH=.
     LIBRARY_PATH=.:./components
       SHLIB_PATH=.
          LIBPATH=.
       ADDON_PATH=.
      MOZ_PROGRAM=./mozilla-bin
      MOZ_TOOLKIT=
        moz_debug=1
     moz_debugger=gdb
/opt/sfw/bin/gdb ./mozilla-bin -x /tmp/mozargs511
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.8"...
(no debugging symbols found)...
(gdb) run
Starting program: /opt/mozilla/./mozilla-bin 
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...warning: Lowest section in /usr/lib/libw.so.1 
is .hash at 00000074
(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...[New LWP 1]
[New LWP 2]
[New LWP 3]
(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0xff137370 in __1cWnsComponentManagerImplMFreeServices6M_I_ ()
   from /opt/mozilla/./libxpcom.so
(gdb) bt
#0  0xff137370 in __1cWnsComponentManagerImplMFreeServices6M_I_ ()
   from /opt/mozilla/./libxpcom.so
#1  0xff0e1220 in NS_ShutdownXPCOM () from /opt/mozilla/./libxpcom.so
#2  0x1976c in main ()
I can confirm that 7_8_9 nightlies don't run on Sparc Solaris 7 for me either.

This is possibly a duplicate of bug 130309, which has a little discussion of the
7_8_9 builds, but which is mostly about problems with the 0.9.9 build.

This should probably be assigned to whoever is in charge of the 7_8_9 build, but I
can't tell who that is.
Status: UNCONFIRMED → NEW
Ever confirmed: true
dup

*** This bug has been marked as a duplicate of 130309 ***
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Reopening due to comments in bug 130309 that problems with the 7_8_9 builds
should be a separate bug from problems with the 0.9.9 milestone.
Status: RESOLVED → REOPENED
Component: XPCOM → Build Config
Resolution: DUPLICATE → ---
Re-assigning to default owner.
Assignee: dougt → seawood
Status: REOPENED → NEW
I don't even know where the 7_8_9 nightlies come from, much less how they are
built.  I suspect this belongs to dcran.
Assignee: seawood → mozilla
*** Bug 141415 has been marked as a duplicate of this bug. ***

truss -f ./mozilla output from my crash is doc ID 8140 (http://bugzilla.mozilla.org/showattachment.cgi?attach_id=81840) (from dup Bug 141415)
FWIW tried building on solaris 9 (b58 shwpl3) without success
(same result - sigsegv on startup),
using GCC, gnome-1.4, /usr/sfw, etc.

Suggest raising severity to "blocker" - can't test if we can't run...

(PS: the reference to truss output above should be to attachment (id=81840).)

I can confirm a 100% crash on Solaris 7 (latest 7_Recommended.zip installed;
gtk+-1.2.10, glib-1.2.10 from sunfreeware.com).

Actually, I can't run any recent (>0.9.2) mozilla on Solaris 7; all of them crash
immediately on startup (both the solaris2.6 and the solaris-7-8-9 tgz).

Any workaround/ideas?
This is the gdb 'bt' output (with a breakpoint on 'abort'):
(gdb) bt
#0  0xfe8394c0 in abort () from /usr/lib/libc.so.1
#1  0xfeb5468c in __1cH__CimplMex_terminate6F_v_ () from /usr/lib/libCrun.so.1
#2  0xfd98cd84 in _init () from /tmp/m/mozilla/components/libnkcache.so
#3  0xff3bc174 in ?? ()
#4  0xff3c0a8c in ?? ()
#5  0xff3c0ba8 in ?? ()
#6  0xff043fe8 in PR_LoadLibrary () from /tmp/m/mozilla/./libnspr4.so
#7  0xff043f14 in PR_LoadLibrary () from /tmp/m/mozilla/./libnspr4.so
#8  0xff1319e0 in __1cLnsLocalFileELoad6MppnJPRLibrary__I_ () from
/tmp/m/mozilla/./libxpcom.so
#9  0xff146ba8 in __1cFnsDllELoad6M_i_ () from /tmp/m/mozilla/./libxpcom.so
#10 0xff140210 in __1cXnsNativeComponentLoaderPSelfRegisterDll6MpnFnsDll_pkci_I_
() from /tmp/m/mozilla/./libxpcom.so
#11 0xff141520 in
__1cXnsNativeComponentLoaderVAutoRegisterComponent6MipnHnsIFile_pi_I_ () from
/tmp/m/mozilla/./libxpcom.so
#12 0xff13fef0 in
__1cXnsNativeComponentLoaderXRegisterComponentsInDir6MipnHnsIFile__I_ () from
/tmp/m/mozilla/./libxpcom.so
#13 0xff13bbf8 in __1cWnsComponentManagerImplQAutoRegisterImpl6MipnHnsIFile_i_I_
() from /tmp/m/mozilla/./libxpcom.so
#14 0xff13b92c in __1cWnsComponentManagerImplMAutoRegister6MipnHnsIFile__I_ ()
from /tmp/m/mozilla/./libxpcom.so
#15 0xff13f0f0 in __1cSnsComponentManagerMAutoRegister6FipnHnsIFile__I_ () from
/tmp/m/mozilla/./libxpcom.so
#16 0x18684 in __1cKgetCountry6FrknJnsAString_r0_I_ ()
#17 0x1974c in main ()
(gdb) 
There does seem to be a problem with very recent solaris 2.6 nightlies on
Solaris for me as well, but older ones work fine.  The 2.6 build from
http://ftp.mozilla.org/pub/mozilla/nightly/2002-06-03-21-trunk/ works for  me on
Solaris 7.  The problems with the 2.6 builds should probably be a different bug,
since they had been working earlier.

Also, Norbert Kiesel, the back trace that you show goes with which build?
How recently did the Solaris 2.6 trunk builds start breaking?  Considering that
our trunk solaris tinderboxes are still orange, I'm suspecting darin's checkin
for bug 147333.
Yep, it looks like the patch to bug 147333 is the culprit for the nightlies.  The
first line of the backtrace points to iconv. Also, the last nightly before that
check-in works (2002020922), while the first one after it is broken
(2002021122?). I'll post details in bug 147333.
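The "last working nightly / first broken nightly" reasoning above can be made mechanical once each build has been tried. The following is only a sketch: the function name regression_window and the "BUILD_ID STATUS" input format are made up for illustration, and it assumes build IDs sort numerically in date order.

```shell
#!/bin/sh
# regression_window (hypothetical helper, not part of any Mozilla tooling)
# Reads "BUILD_ID STATUS" pairs on stdin, where STATUS is "good" or "bad",
# and prints the last good and first bad build around the first failure.
regression_window() {
    sort -n | awk '
        # Remember the most recent good build seen before any bad one.
        $2 == "good" && !first_bad { last_good = $1 }
        # Remember only the earliest bad build.
        $2 == "bad"  && !first_bad { first_bad = $1 }
        END {
            if (!first_bad) exit 1   # no failing build in the input
            print "last good: " (last_good ? last_good : "none")
            print "first bad: " first_bad
        }
    '
}
```

Fed the two build IDs quoted above (one marked good, one marked bad), it reports them as the window in which the breaking check-in landed.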
This happens to me too (on Solaris 8) with the ftp build of 1.2a, but only after
the second time it was run.
Reinout van Schouwen wrote:
> This happens to me too (on Solaris 8) with the ftp build of 1.2a, but only 
> after the second time it was run.

Just wondering:
Did you apply the libCstd patches from the "Recommended&Security" patch cluster
yet (or better: The whole patch cluster :) ?
That may fix the issue...
Attached file Output of showrev -p
I still observe this crash on a machine with some recent patches; I'm not sure
these are the patches you talk about so I've attached the output of 'showrev
-p' as pointed out in a private email.
The problem hasn't gone away in 1.2b. Roland: could you verify if the right
patches are listed in my attachment?
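Checking a saved 'showrev -p' listing for a particular patch can be scripted. This is a minimal sketch, assuming each line of the listing begins with "Patch: <id>-<rev>"; the helper name patch_rev is hypothetical, and the patch ID to look for has to come from the relevant patch cluster README (none is named in this bug).

```shell
#!/bin/sh
# patch_rev PATCH_ID (hypothetical helper)
# Reads saved `showrev -p` output on stdin and prints the highest listed
# revision of PATCH_ID as "<id>-<rev>". Fails if the patch is not listed.
# Assumes a POSIX awk and lines of the form "Patch: <id>-<rev> ...".
patch_rev() {
    awk -v id="$1" '
        BEGIN { best = -1 }
        $1 == "Patch:" {
            split($2, parts, "-")
            if (parts[1] == id && parts[2] + 0 > best) best = parts[2] + 0
        }
        END {
            if (best < 0) exit 1
            printf "%s-%02d\n", id, best
        }
    '
}
```

Typical use would be something like `patch_rev 108528 < showrev.out`, where showrev.out is the saved listing and 108528 is the kernel patch ID that appears in the uname output later in this bug.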
Summary: sparc-sun-solaris-7_8_9 nightlies crash on startup → sparc-sun-solaris-7_8_9 builds crash on startup
Reinout van Schouwen wrote:
> The problem hasn't gone away in 1.2b.

Just curious:
Did the stack trace from the crash change ?
Yes it has, the new stack trace is:

Program received signal SIGSEGV, Segmentation fault.
0xff2421b4 in realfree () from /usr/lib/libc.so.1
(gdb) bt
#0  0xff2421b4 in realfree () from /usr/lib/libc.so.1
#1  0xff242a28 in cleanfree () from /usr/lib/libc.so.1
#2  0xff241b5c in _malloc_unlocked () from /usr/lib/libc.so.1
#3  0xff241a50 in malloc () from /usr/lib/libc.so.1
#4  0xfe9b20e8 in _XIMVaToNestedList () from /usr/openwin/lib/libX11.so.4
#5  0xfe9dded0 in XSetICValues () from /usr/openwin/lib/libX11.so.4
#6  0xfe9500ec in gdk_ic_real_set_attr (ic=0x7a2488, attr=0x786d58,
mask=GDK_IC_FOCUS_WINDOW) at gdkim.c:762
#7  0xfe950574 in gdk_ic_set_attr (ic=0x7a2488, attr=0xffbed778,
mask=GDK_IC_FOCUS_WINDOW) at gdkim.c:1209
#8  0xfe94e93c in gdk_im_begin (ic=0x7a2488, window=0x1efc80) at gdkim.c:185
#9  0xfdb95c20 in NSGetModule () from
/net/public2/reinout/mozilla-1.2b/components/libwidget_gtk.so
#10 0xfdb8ff90 in NSGetModule () from
/net/public2/reinout/mozilla-1.2b/components/libwidget_gtk.so
#11 0xfeb424d8 in gtk_marshal_BOOL__POINTER (object=0x1785f8, func=0xfdb8fe04
<NSGetModule+25936>, func_data=0x314d20, args=0xffbedb38) at gtkmarshal.c:28
#12 0xfeb85390 in gtk_handlers_run (handlers=0x12d490, signal=0xffbeda98,
object=0x1785f8, params=0xffbedb38, after=0) at gtksignal.c:1912
#13 0xfeb8410c in gtk_signal_real_emit (object=0x1785f8, signal_id=31,
params=0xffbedb38) at gtksignal.c:1477
#14 0xfeb8138c in gtk_signal_emit (object=0x1785f8, signal_id=31) at gtksignal.c:552
#15 0xfebcb8b4 in gtk_widget_event (widget=0x1785f8, event=0xffbedf30) at
gtkwidget.c:2864
#16 0xfebd8254 in gtk_window_focus_in_event (widget=0x1f48d0, event=0x1a5d78) at
gtkwindow.c:1416
#17 0xfeb424d8 in gtk_marshal_BOOL__POINTER (object=0x1f48d0, func=0xfebd80e8
<gtk_window_focus_in_event>, func_data=0x0, args=0xffbee0d0) at gtkmarshal.c:28
#18 0xfeb84160 in gtk_signal_real_emit (object=0x1f48d0, signal_id=31,
params=0xffbee0d0) at gtksignal.c:1492
#19 0xfeb8138c in gtk_signal_emit (object=0x1f48d0, signal_id=31) at gtksignal.c:552
#20 0xfebcb8b4 in gtk_widget_event (widget=0x1f48d0, event=0x1a5d78) at
gtkwidget.c:2864
#21 0xfeb4134c in gtk_main_do_event (event=0x1a5d78) at gtkmain.c:837
#22 0xfdb7d44c in object.2 () from
/net/public2/reinout/mozilla-1.2b/components/libwidget_gtk.so
#23 0xfe94b258 in gdk_event_dispatch (source_data=0x1a5d78,
current_time=0xffbee778, user_data=0x0) at gdkevents.c:2139
#24 0xfed75cf4 in g_main_dispatch (dispatch_time=0xffbee778) at gmain.c:656
#25 0xfed764f0 in g_main_iterate (block=2764, dispatch=1) at gmain.c:877
#26 0xfed76830 in g_main_run (loop=0x2cf630) at gmain.c:935
#27 0xfeb40908 in gtk_main () at gtkmain.c:524
#28 0xfdb71e54 in object.2 () from
/net/public2/reinout/mozilla-1.2b/components/libwidget_gtk.so
#29 0x1aef8 in __1cKgetCountry6FrknJnsAString_r0_I_ ()
#30 0x1bb18 in main ()

More interestingly I have found that the gtk build used makes a difference. When
I use another installed version of gtk on the system (I've checked that both
versions are 1.2.10) then Mozilla doesn't crash. I'm not sure but it's possible
that the crashing one is built with gcc and the non-crashing one with Forte... 
Reinout,

When using mozilla, the gtk library must be built with the same compiler as mozilla.
That means when mozilla is built with gcc, we must use a gtk library built
with gcc; when mozilla is built with forte, we must use a gtk library built
with forte.

Henry
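One rough way to tell which compiler produced a given GTK+ library is to search it for embedded compiler identification strings. The sketch below assumes gcc-built objects carry a "GCC: " stamp and that Sun-built ones carry strings such as "WorkShop Compilers" or "Forte Developer" (the latter two are taken from compiler banners quoted elsewhere in this bug, so treat them as assumptions). The function name compiler_hint is hypothetical.

```shell
#!/bin/sh
# compiler_hint FILE (hypothetical helper)
# Prints "gcc", "sun", or "unknown" depending on which compiler
# identification string, if any, is embedded in FILE.
compiler_hint() {
    # Turn every non-printable byte into a newline so the embedded
    # text can be searched with plain grep.
    text=$(LC_ALL=C tr -c '[:print:]' '\n' < "$1")
    if printf '%s\n' "$text" | grep 'GCC: ' >/dev/null; then
        echo gcc
    elif printf '%s\n' "$text" |
            grep -E 'WorkShop Compilers|Forte Developer' >/dev/null; then
        echo sun
    else
        echo unknown
    fi
}
```

Running it on both the GTK+ library and on a Mozilla component such as components/libwidget_gtk.so, and comparing the two answers, gives a quick hint whether the compilers match. A result of "unknown" only means no known stamp was found, not that the library is safe to use.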
Henry: maybe so, but in Mozilla <= 1.1 it just worked with the gtk version that
apparently causes 1.2b to crash. Perhaps the Solaris build contributor has
changed his compiler?
Reinout van Schouwen wrote:
> Yes it has, the new stack trace is:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xff2421b4 in realfree () from /usr/lib/libc.so.1
> (gdb) bt
> #0  0xff2421b4 in realfree () from /usr/lib/libc.so.1
> #1  0xff242a28 in cleanfree () from /usr/lib/libc.so.1
> #2  0xff241b5c in _malloc_unlocked () from /usr/lib/libc.so.1
> #3  0xff241a50 in malloc () from /usr/lib/libc.so.1
> #4  0xfe9b20e8 in _XIMVaToNestedList () from /usr/openwin/lib/libX11.so.4
> #5  0xfe9dded0 in XSetICValues () from /usr/openwin/lib/libX11.so.4
> #6  0xfe9500ec in gdk_ic_real_set_attr (ic=0x7a2488, attr=0x786d58,
> mask=GDK_IC_FOCUS_WINDOW) at gdkim.c:762
> #7  0xfe950574 in gdk_ic_set_attr (ic=0x7a2488, attr=0xffbed778,

This issue is caused by Mozilla's new GTK+ theme support... ;-(
>Henry: maybe so, but in Mozilla <= 1.1 it just worked with the gtk 
>version that apparently causes 1.2b to crash. Perhaps the Solaris build
>contributor has changed his compiler?

The solaris build contributor hasn't changed the compiler. I know this because
I've been the contributor for solaris since mozilla 1.2b. :-) I've
answered several of these questions these days.
FYI: the libgtk version that works comes from the Netscape 6.2.3 for Solaris
package. If this problem isn't going to be fixed, it should be mentioned in the
release notes.
Keywords: relnote
We have put this relnote in the mozilla 1.2b build. It's the README file when
you decompress this build. :-)
Right, I missed that. All the more reason to put it in the official release notes. 
Crashes also happen here with WorkShop-compiled GTK+ libraries on Solaris 8.

I had compiled GTK+ 1.2.0 on Solaris 2.5.1 using
	WorkShop Compilers 5.0 99/12/04 C 5.0 patch 107289-05
as shown by grepping my own libgtk-1.2.so.0.9.1 library.

Changing LD_LIBRARY_PATH to include <netscape-7.0>/dist/lib instead of my own
GTK+ libraries fixes the problem.

It's however strange that the WorkShop-compiled libraries don't work. Solaris
and Sun compilers are supposed to be at least forward-compatible, aren't they?

I'll build GTK+ again, this time on Solaris 8 using Sun ONE Studio 7 compilers.
Hopefully these libraries will be compatible with the ones used by the person
who compiled the Solaris Mozilla binaries.

But please, please... do add this issue to the release notes and remove the
sentence about libstdc++ which is not needed anymore.

Anyway, what a relief! I had been unable to run Mozilla on Solaris since
mozilla-1.0rc3. Only the Netscape releases from Sun would run on Solaris without
crashing. Thanks for fixing this Solaris showstopper!
OK, I've rebuilt glib-1.2.10 and gtk+-1.2.10 with Sun ONE Studio 7 (latest
patches) on Solaris 8:

$ cc -V
cc: Forte Developer 7 C 5.4 2002/03/09
usage: cc [ options] files.  Use 'cc -flags' for details
$ uname -a
SunOS nestor 5.8 Generic_108528-16 sun4u sparc SUNW,Sun-Blade-1000
$ 

It now works without using the libraries bundled with Netscape.

So it seems there's some binary compatibility bug, either in Solaris 8 vs.
Solaris 2.5.1, or in Forte Developer 6 vs. WorkShop 5.

I suggest adding this information to the release notes under Solaris.
OK, I now have consistent behaviour whether I use my own libraries built with
Sun ONE Studio 7 on Solaris 8 or the GTK+ libraries bundled with Netscape 7.0.

Mozilla crashes with X errors:

$ setenv MOZILLA_HOME /usr/local/mozilla-1.2b
$ setenv LD_LIBRARY_PATH /usr/local/netscape-7.0/dist/lib:$LD_LIBRARY_PATH
$ cd $MOZILLA_HOME
$ ./mozilla
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  70 (X_PolyFillRectangle)
  Serial number of failed request:  2145
  Current serial number in output stream:  2164
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  70 (X_PolyFillRectangle)
  Serial number of failed request:  2147
  Current serial number in output stream:  2165
$ 

Any clue? The debugger doesn't work. This hangs forever:

$ mozilla -g -d dbx
Dimitri, what's the environment when you encounter this issue? Or what did you
do after you succeeded in running mozilla (comment 28)?

This issue seems to be related to the xprint lbx proxy which does some graphics
re-encoding including X_PolyFillRectangle.

Pete, any ideas?
Dimitri, could your problem be related to bug 157424 ?
Henry Jia wrote:
> This issue seems to be related to the xprint lbx proxy which does some 
> graphics re-encoding including X_PolyFillRectangle.

Huh?
WTF has this to do with a) "Xprint" b) "LBX"/"lbxproxy" ?! Both are IMHO
completely _unrelated_ here (please correct me if I am wrong). Xprint is only
being used if you print. And LBX is only being used if someone runs X
applications explicitly through "lbxproxy" - which doesn't seem to be the case
here.
*** Bug 186534 has been marked as a duplicate of this bug. ***
I have this very same problem on Solaris 7; my last successful build was 2003090723.
I should have added that I have tried gcc-2.95.3, gcc-2.2.3 & gcc 3.3.1, with no
effect on the bug. I have used no Sun compilers with any of the libraries (GTK, etc.).
I had a manifestation of this bug (or something closely related)
when trying to run 1.5 on Solaris 8. It would just hang in the midst
of bringing up the browser window at startup. I finally traced
it to the gtk libraries. I DID do as suggested in the README and
get the libraries from Sun's Netscape 7 build, and I added them to my 
LD_LIBRARY_PATH. However, it was still using (older?) copies on my
system because they appeared earlier in the path. The problem went away
when I put the NS7 directory (i.e. the one containing the libraries)
first in the path. I hope this helps some people.
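The ordering problem described above can be checked before launching Mozilla by walking the search path in order and reporting the first directory that contains the library. This is a minimal sketch for a Bourne-style shell; the function name first_provider is hypothetical, and the file name libgtk-1.2.so.0 used in the example is an assumption about how the GTK+ 1.2 library is named on the system.

```shell
#!/bin/sh
# first_provider LIBNAME [PATHLIST] (hypothetical helper)
# Prints the first directory in a colon-separated search path that
# contains LIBNAME. PATHLIST defaults to $LD_LIBRARY_PATH.
first_provider() {
    lib=$1
    path=${2-$LD_LIBRARY_PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $path; do
        # Skip empty entries such as those produced by "a::b".
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}
```

For example, `first_provider libgtk-1.2.so.0` prints the directory that comes first in the current LD_LIBRARY_PATH. If that is not the Netscape 7 library directory, moving that directory to the front of the path is the fix described in this comment. This only inspects LD_LIBRARY_PATH; system default library directories are not considered.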
Product: Browser → Seamonkey
Ginn, should this bug be closed? It's kind of obsolete.
Whiteboard: CLOSEME
Status: NEW → RESOLVED
Closed: 22 years ago → 15 years ago
Resolution: --- → WONTFIX
Whiteboard: CLOSEME