Closed Bug 116444 Opened 23 years ago Closed 21 years ago

undefined symbol: __vt_17nsGetServiceByCID in all JREs v1.3.1 and 1.4.0beta tried (Sun, IBM)

Categories

(Core Graveyard :: Java: OJI, defect)

x86
Linux
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: asmith, Assigned: joshua.xia)

Details

(Whiteboard: close)

Attachments

(1 file)

Using a nightly build, Red Hat source RPM (mozilla-2001122015_trunk-0_rh7.src.rpm),
I get this nsGetServiceByCID error when visiting java.sun.com with either Sun's
JRE (1.3.1 and 1.4.0beta) or IBM's JRE (1.3.1) installed (only one JRE installed
at a time):

LoadPlugin: failed to initialize shared library
/opt/IBMJava2-13/jre/bin/libjavaplugin_oji.so
[/opt/IBMJava2-13/jre/bin/libjavaplugin_oji.so: undefined symbol:
__vt_17nsGetServiceByCID]


Running Red Hat 7.2, rawhide. I've gone through all the Bugzilla entries about the JRE or
Java VM, and none of the workarounds I've seen work. Sorry if this is a
duplicate; I did search for bugs referring to __vt_17nsGetServiceByCID and
couldn't find any.
Have you symlinked the plugin into Mozilla's plugin directory ?
Yes, /usr/lib/mozilla/plugins/libjavaplugin_oji.so is a symlink to
/opt/IBMJava2-13/jre/bin/libjavaplugin_oji.so.
I had the same problem with Mozilla 0.9.6 compiled on my home system. I am using
gcc 3.0 and JRE 1.3.1 (tried 1.4 too).

The source of this problem is how GCC mangles C++ symbol names. Old GCC
(before 3.0) uses the prefix "__vt_" for vtable symbols and GCC 3.0.x uses "_ZTV"
instead. I was unable to find why it was changed or a way to specify this prefix
for GCC. It seems that JDK 1.3 and 1.4 for Linux were compiled with GCC 2.x, so the
same name is encoded in different ways in the Java Plugin library and the Mozilla
shared libraries. This is why the loader cannot resolve this name.
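
To make the naming difference concrete, here is a minimal Python sketch that builds the vtable symbol for a class under both schemes. The function name `vtable_symbol` and the `abi` labels are made up for illustration, and it assumes a plain class name (not nested, not a template); real mangling has many more cases.

```python
def vtable_symbol(class_name: str, abi: str) -> str:
    """Return the vtable symbol name for a simple (non-nested) class.

    abi is an illustrative label: "gcc2" for the pre-3.0 scheme,
    "gcc3" for the scheme used from GCC 3.0 on.
    """
    # Both schemes encode the class as <length><name>.
    encoded = f"{len(class_name)}{class_name}"
    if abi == "gcc2":
        return "__vt_" + encoded
    if abi == "gcc3":
        return "_ZTV" + encoded
    raise ValueError(f"unknown abi label: {abi!r}")


if __name__ == "__main__":
    for abi in ("gcc2", "gcc3"):
        print(abi, vtable_symbol("nsGetServiceByCID", abi))
```

The plugin asks the loader for the first name, while a gcc 3 build of Mozilla exports only the second, hence the undefined symbol.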

So I recommend installing gcc <3.0 and recompiling Mozilla. I tried gcc 2.95.3 on
my system and now everything works correctly, including the Java plugin from JDK 1.3.
Confirm.
I followed Nikolay's suggestion to compile using gcc 2.9x instead of gcc 3, and it now
works for me.

Perhaps configure should check for gcc3 and issue a warning?
Status: UNCONFIRMED → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
Mangling was changed but this was done on purpose and there is no way to change
it back.
The root of this is major C++ ABI changes in gcc 3.0. 
This basically means that any C++ lib must be recompiled to be used 
with g++ 3.0.

See http://www.gnu.org/software/gcc/bugs.html for details.

I am not sure whether it is possible to change the Java plugin implementation so
that it does not use the C++ API directly. Maybe we can implement
a C glue layer in OJI that will encapsulate all the functionality needed
by the Java plugin? This layer would be recompiled by the compiler
used to build Mozilla, and therefore the dependency on the compiler
used for the Java plugin would be much weaker.

Anyway, this is more a Java plugin problem than an OJI one.
Reopen.
Plugins work if Mozilla was compiled with a pre-3.x gcc, but the
problem still exists if it was compiled with a 3.x compiler.
I think this bug should stay open, and at least the documentation
must be updated.
Status: RESOLVED → UNCONFIRMED
Resolution: WORKSFORME → ---
Still seeing this as of 3rd February on mozilla-gcc3 binary builds, latest
'official' Mozilla/Netscape Linux JRE download from plugin download page. Not
sure if anyone was watching this bug, but I'll just change back to the older
compilers' builds.
This issue needs attention from the plugin developers' side.
Cc-ing relevant people.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I don't see what the plugin team can do about this.

If the Mozilla organization wants to expose a different, C-based interface for
embedding Java, we can move in that direction with the next release.  If
they want to standardize on a set of C++ compilers that are internally
consistent, we can update our build process to comply with that.

Ultimately, I think this is something that Mozilla needs to take the lead on.

Steven, the problem is not so much that we have different C++ bindings as
that the JRE plugin is using these C++ calls.  The JRE began using both the
unfrozen interfaces and concrete base classes before they were reviewed and
finalized.

Yes, I agree, the use of unfrozen interfaces is definitely part of the problem.
However, since they represented the only way to integrate Java into Mozilla at
any given point in time I see no other option but to have used them.

I guess we could explore the alternative of not even attempting to make Java
work with mozilla until there is a viable set of frozen interfaces.

Barring that, we just have to live with the fact that some versions of the
JRE/plugin aren't going to work with some versions of Mozilla (and that given the
really long development cycles here at Sun that might mean a spell of time where
there is no version of Java available to Mozilla).

This does not mean I'm happy about this state of affairs, but I don't see any
way around it.  The best we can hope to do is minimize the number of things that
add to the problem.

 
The only symbol that I see being used in the *windows* build is:
    xpcom.dll
              100071FC Import Address Table
              100081A8 Import Name Table
                     0 time date stamp
                     0 Index of first forwarder reference

                 4BD 
?GetGlobalServiceManager@nsServiceManager@@SAIPAPAVnsIServiceManager@@@Z

I get the exact same problem on my FreeBSD 4.5-RELEASE box.  I had previously
had 4.4-RELEASE installed, and it also did not work then.  

The error I get when I run mozilla from cmd line is:

LoadPlugin: failed to initialize shared library
/usr/X11R6/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so
[/usr/X11R6/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so:
Undefined symbol "__vt_17nsGetServiceByCID"]

Before this I got a few other errors until I fixed them by copying files from a
different directory to the mozilla plugin directory.

# mozilla
LoadPlugin: failed to initialize shared library
/usr/X11R6/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so
[Shared object "libgdk-1.2.so.0" not found]

After I fixed that one I got this one:

# mozilla
LoadPlugin: failed to initialize shared library
/usr/X11R6/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so
[Shared object "libstdc++-libc6.1-1.so.2" not found]

Then this one:

# mozilla
LoadPlugin: failed to initialize shared library
/usr/X11R6/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so
[Shared object "libm.so.6" not found]

Then this one after:

# mozilla
LoadPlugin: failed to initialize shared library
/usr/X11R6/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so
[Shared object "libc.so.6" not found]

Then finally the final one which I posted first.  Hope this helps some. It
sounds familiar/similar/same as the posted ones on this thread already, but it
said its only been tested on Sun and IBM, so here is mine =).  Hope you guys can
get this fixed sometime, as it is a slight inconvenience.  Mozilla is wonderful
besides this little problem, have had almost no other problems.  

-Frank
Same bug on Solaris using Forte compiler build of 0.98

LoadPlugin: failed to initialize shared library
/netopt/SUNWns6-6.0/java/plugin/sparc/ns600/libjavaplugin_oji.so [ld.so.1:
/src/ns/milestone.newer.cc/mozilla/dist/bin/mozilla-bin: fatal: relocation
error: file /netopt/SUNWns6-6.0/java/plugin/sparc/ns600/libjavaplugin_oji.so:
symbol _vt.17nsGetServiceByCID: referenced symbol not found]

this is unsupported import by the java plugin.
Regarding comment #15:

If you are using the plugin on Solaris, don't use the version that is in the
ns600 directory.  That version is built with gcc and was only meant for the
Netscape 6.0 browser.  Use the version in the ns610 directory.
Everyone,

When you write a comment saying you have reproduced this problem, please state
which versions of the plugin and browser you are using.

Can someone confirm that this problem occurs with the RELEASE version of Java
1.4 and Mozilla on Linux (or anywhere else)?  I have checked the .so files and
we don't seem to be importing the offending symbol in that version.

At this point, I believe the only mangled symbol we import in the 1.4 version of
the plugin (on any platform) is nsServiceManager::GetGlobalServiceManager.

With Mozilla 0.9.8 that I compiled with gcc 3.0.2
and sun j2sdk 1.4 RELEASE version mozilla crashes
while starting (platform: solaris 2.8)
IsPluginFile(/afs/nd.edu/user19/bkovacs/src/mozilla/modules/plugin/samples/defau
IsPluginFile(/afs/nd.edu/user19/bkovacs/usr/j2sdk1.4.0/jre/plugin/sparc/ns610/li
LoadPlugin() /afs/nd.edu/user19/bkovacs/usr/j2sdk1.4.0/jre/plugin/sparc/ns610/li
INTERNAL ERROR on Browser End: No manager for initializing factory?

System error?:: Error 0
Just to correct some misconceptions which came up on irc this evening, AIUI:

gcc 3.0 and gcc 2.95 have different name mangling, as well as a different ABI.

This has nothing to do with using unfrozen interfaces - this will be a problem
as long as the sun plugin links any mozilla library compiled with a different
compiler ABI, or to a different version of libstdc++. (The latter may work, but
mysteriously crash at random points in time, though)

There isn't anything mozilla can do about this, AFAIK. If it was just name
mangling, we could probably export an alias, or something, but the ABI changed,
and so everything else will be different too, I think.
Well, unfrozen interfaces are also a problem: they make
upgrading the plugin or Mozilla quite tricky because of compatibility problems.

As for the ABI changes, the solution is to use a C API rather than a C++ API between
the plugin and Mozilla. In fact the most important part is the callbacks from the plugin.
This could be implemented on the Mozilla side, and then ABI changes would not
affect us. Using a C++ API to interact with third-party plugins does
not really look like a great idea to me.

Or maybe I do not understand something?
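
The argument for a C API can be illustrated with a toy model of symbol lookup. In this Python sketch, a symbol with C linkage keeps its plain name whichever compiler built it, while a symbol with C++ linkage gets an ABI-specific decoration (a stand-in for real mangling). The entry-point name `oji_get_service_manager` is hypothetical, as are the helper names and the `abi` labels. The model covers only name lookup; as noted elsewhere in this thread, object layout differences between ABIs are a separate issue it does not address.

```python
def decorate(name: str, linkage: str, abi: str) -> str:
    """Name the dynamic loader sees for `name` (toy model, not real mangling)."""
    if linkage == "C":
        return name                 # extern "C": same name under every ABI
    if linkage == "C++":
        return f"{abi}::{name}"     # stand-in for ABI-specific mangling
    raise ValueError(f"unknown linkage: {linkage!r}")


def unresolved_imports(imports, host_abi: str, plugin_abi: str):
    """Return the imports the plugin cannot find in the host.

    imports is a list of (name, linkage) pairs that the host defines
    and the plugin references.
    """
    host_exports = {decorate(n, l, host_abi) for n, l in imports}
    return [n for n, l in imports
            if decorate(n, l, plugin_abi) not in host_exports]


if __name__ == "__main__":
    imports = [
        ("oji_get_service_manager", "C"),                      # hypothetical C entry point
        ("nsServiceManager::GetGlobalServiceManager", "C++"),
    ]
    print(unresolved_imports(imports, host_abi="gcc3", plugin_abi="gcc2"))
```

With mismatched compilers only the C++ import fails to resolve; with matching compilers the list is empty.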
Maybe those guys at SUN could release a gcc-3.0.x compiled version for all those
poor guys who cannot use their plugin with their own gcc-3.0.x compiled Mozilla
- like me. ;-) Or maybe there is a way around importing the mangled
nsServiceManager::GetGlobalServiceManager symbol?
I think that the non-deprecated frozen interfaces to get the component manager
(i.e. the symbols which need to be linked to) are extern "C", although I'm not sure.

Fixing the name mangling alone isn't guaranteed to be enough to stave off binary
incompatibility between compilers, of course, although since we're only passing
around pointers to abstract classes using only non-virtual single inheritance
maybe we'll be lucky for a while.

Plus the new ABI is apparently cross-vendor, although I think it has changed
between 3.0 and what will be 3.1 to fix bugs in conformance to that standard.
*** Bug 134807 has been marked as a duplicate of this bug. ***
Reporting the same bug in the Mandrake Cooker RPM of Mozilla 1.0RC2.  I've tried
it with the jre.xpi mentioned at http://www.mozilla.org/releases , the latest
version 1.4 from Sun and Netscape's JRE 1.3.1 xpi.

# mozilla -g
LoadPlugin: failed to initialize shared library
/usr/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so
[/usr/lib/mozilla/plugins/java2/plugin/i386/ns600/libjavaplugin_oji.so:
undefined symbol: __vt_17nsGetServiceByCID]

Perhaps, going by the comments above, the Mandrake binary is compiled with GCC 3.x
This bug should be duped with / made dependent on bug 124003.
Excuse me, that should be bug 124006. Sorry for the spam.
See bug raised on Java's site (Make Java 1.4 plugin work with gcc3 compiled
mozilla - http://developer.java.sun.com/developer/bugParade/bugs/4687814.html)
which Sun have rejected as they do not support Mozilla.

I suspect the only way this will get resolved is if:
a) The "Community" kick up enough fuss to make Sun think again or
b) Netscape specify that Netscape version 7.0 on Unix will use gcc3.
Assuming Sun still want to support Netscape, they should release a new JRE in time.

Are there any pointers to the Plugin API? If the API is C++ I'm surprised the
Codeweavers plugins didn't break, as I haven't touched/upgraded them since way
before my distro started compiling with gcc3.

Can any of the binutils tools be used to identify the gcc version used to
compile a plugin? I suspect other plugins may well be affected by this issue.
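
One rough way to approach the question above is to look at which mangling scheme a plugin's undefined symbols use. The Python sketch below classifies symbol names with two heuristic regular expressions and can be fed text in the format printed by `nm -D --undefined-only` on the plugin's .so file. It is not a demangler, it cannot tell the exact gcc version, and the function names are illustrative.

```python
import re

# Heuristic patterns, not a full demangler.
ITANIUM = re.compile(r"^_Z")                     # scheme used by gcc 3.x
OLD_GXX = re.compile(
    r"^(?:__vt_|_vt\.)\d+\w+$"                   # old-style vtable symbols
    r"|^[A-Za-z_]\w*?__(?:\d+[A-Za-z_]|F)\w*$"   # name__<len><class>... or name__F...
)


def classify_symbol(name: str) -> str:
    """Guess which mangling scheme produced a symbol name."""
    name = name.split("@", 1)[0]   # drop a symbol-version suffix if present
    if ITANIUM.match(name):
        return "gcc3-style C++"
    if OLD_GXX.match(name):
        return "gcc2-style C++"
    return "C or unknown"


def classify_nm_output(text: str) -> dict:
    """Classify the undefined ('U') symbols in nm-style output."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[-2] == "U":
            result[fields[-1]] = classify_symbol(fields[-1])
    return result


if __name__ == "__main__":
    sample = "         U __vt_17nsGetServiceByCID\n         U printf\n"
    for symbol, kind in classify_nm_output(sample).items():
        print(f"{symbol}: {kind}")
```

A plugin whose undefined C++ symbols are all gcc2-style is unlikely to load into a host that exports only gcc3-style names.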
Blocks: 124006
Can we use kaffe instead of waiting for Sun to recompile a gcc-3 compatible
version?  

There is some source code for a kaffe-mozilla plugin at
http://www.kaffe.org/ftp/pub/packages/kaffe-mozilla-oji/

One way around this is to compile the jdk from source.

I downloaded the Sun JDK 1.4 source code and have managed to get the jdk
compiled with gcc3. I am not clear on Sun's guidelines about releasing patches
to their code or release binary versions of the jre. Could someone clarify?
Folks at both Netscape and mozilla.org are interested in making the gcc 3.1(.1?)
the default compiler for Mozilla on Linux at some point in the not too distant
future.  We'd like to see plugins continue to work.  In particular:

bbaetz said: 
> I think that the non-deprecated frozen interfaces to get the componant manager
> (ie symbols which need to be linked too) are extern "C", although I'm not 
> sure.

Looking in xpcom/glue and xpcom/glue/standalone, this appears to be true.  

Joe: is there any chance you could at least fix the next release of the 1.4.1
JRE to statically link against the XPCOM glue library so that this particular
problem goes away?
This, together with:

  http://bugzilla.mozilla.org/attachment.cgi?id=90140&action=view

from bug #154206 (http://bugzilla.mozilla.org/show_bug.cgi?id=154206), is
an attempt to add the missing symbols as pointers to the new mangled names.

Due to my very limited ASM/C++ knowledge, it is no work of art, and I know
it could be done much more dynamically.

Anyhow, it should correspond to the symbols generated by gcc-3.1.  It resolves
the missing symbols problem, but Mozilla then just segfaults :/

I am posting this because I am stuck, and maybe it will spark some inspiration
in someone else out there.
The gcc C++ ABI changed.

Bug 154206 will only work for plugins which, while written in c++, don't use C++
to call into mozilla.
Current error on GCC 3.1 builds with 1.4.1 plugin:

LoadPlugin: failed to initialize shared library
/usr/java/j2re1.4.1/plugin/i386/ns600/libjavaplugin_oji.so
[/usr/java/j2re1.4.1/plugin/i386/ns600/libjavaplugin_oji.so: undefined symbol:
GetGlobalServiceManager__16nsServiceManagerPP17nsIServiceManager]
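
For comparison, the following Python sketch builds the name of this static method under both mangling schemes. It handles only the simple shape seen here (a non-nested class and one parameter that is a pointer, or pointer to pointer, to a different class); the function name and `abi` labels are illustrative, and the rules for templates, namespaces and repeated types are deliberately left out.

```python
def mangle_static_method(cls: str, method: str, arg_class: str,
                         pointer_depth: int, abi: str) -> str:
    """Mangle cls::method(arg_class *...*) for the simple case described above."""
    # 'P' marks one level of pointer in both schemes; the class is <len><name>.
    arg = "P" * pointer_depth + f"{len(arg_class)}{arg_class}"
    if abi == "gcc2":
        # <method>__<len><class><args>
        return f"{method}__{len(cls)}{cls}{arg}"
    if abi == "gcc3":
        # _ZN<len><class><len><method>E<args>
        return f"_ZN{len(cls)}{cls}{len(method)}{method}E{arg}"
    raise ValueError(f"unknown abi label: {abi!r}")


if __name__ == "__main__":
    for abi in ("gcc2", "gcc3"):
        print(abi, mangle_static_method(
            "nsServiceManager", "GetGlobalServiceManager",
            "nsIServiceManager", 2, abi))
```

The gcc2 form reproduces the symbol the plugin is asking for in the error above; a gcc 3.1 build of Mozilla would export the method under the `_ZN...` form instead.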
So what would be a good solution to the problem here?
1) Would compiling the JRE with a static link to the XPCOM glue library (from comment #31) work?
2) Is there a way to end this unfrozen interface thing, which seems to be the
root cause of the problem?
3) Should we implement a C glue layer in OJI that will encapsulate all the
functionality needed by the Java plugin (from comment #6)?
Or are there any other promising solutions?

Before Mozilla/Netscape start using gcc 3.1.x on Linux, we need to find a solution.

dmose mentioned that even when he wrapped the missing symbol to call the real
one from Mozilla, it didn't work.

The g++ ABI changed, and there's nothing Mozilla can do about it. If you only use
functionality from XPCOM, then static linking to the glue lib may work.

If Java uses other functionality based on C++ (nsIURI, for example), then it's
probably impossible; does it?
For whatever reason, I thought that sun gave up on the xpcom plugin routine...
maybe wishful thinking. The first question is why are you using this call to
begin with?  There is absolutely no reason as near as i can tell to continue
using this function.  If it means breaking older nscp/mozilla builds, so be it.  

> 1)Would compile JRE with static link to xpcom glue library (from comment #31)
work?

This may work.  But you have to do this correctly, right?  I can offer my
assistance to make this happen.

>2)Is there a way to end this unfrozen interface thing, which seems to be the
root cause of the problem?

Ah.. No.. This one can't be pinned on an unfrozen interface.  The problem is
that you're not making your component "standalone" - you're depending on pieces that
you should not depend on and that are subject to change.

The root of the problem is that SUN has not accepted any offer of mine, mike
shaver's, or any one with much more mozilla knowledge than your staff to inspect
your code to ensure that it is robust to mozilla changes. We could have
discovered most of these problems a few years ago?  Then again, maybe not.  I
remember telling you about this bug months ago. :-/

Ah well.... the offer still stands.

>3)Should we implement a C glue layer in OJI what will encapsulate all
functionalities needed by java plugin (from comment #6)? 
Or any other promising solutions?

Yeah.  This would be good.  But I doubt that this will be free of problems.
What I think that we *should* do is work more closely on getting what we have today right.

dougt
This symbol is part of the support for regxpcom.  Since it is our intention to
strip out regxpcom support from the plugin in the near future this problem might
just go away without the need to add glue layers or wrappers.
>Ah.. No.. This one can't be pinned on an unfrozen interface.  The problem is
>that your not making your component "standalone" - your depending on pieces
>that you should not and are subject to change.  

>The root of the problem is that SUN has not accepted any offer of mine, mike
>shaver's, or any one with much more mozilla knowledge than your staff to
>inspect your code to ensure that it is robust to mozilla changes. We could
>have discovered most of these problems a few years ago?  Then again, maybe
>not.  I remember telling you about this bug months ago. :-/

>Ah well.... the offer still stands.

As long as we are pointing the finger of "root cause" lets not forget MOZILLA's
steadfast refusal to document anything having to do with being a component or
using regxpcom while insisting that Sun adopt this interface model for the Java
plugin (apparently without any good reason) before it was finished.

I think there is enough "root cause" here for us all to share. 
huh?  There is no "refusal to document anything having to do with being a
component or using regxpcom".  Suggesting this implies that there is some
conspiracy.  The problem is that there has been lack of time and of resources to
do this critical part.  

Now, don't get me wrong, the developers working on mozilla (nscp mainly) suck
for not being able to push back on the whizbang features when critical stuff like
embedding, documentation, and samples doesn't exist.  However, who knows if we would
be having this conversation if we didn't build a bad-ass browser first?

In any case, if you're developing a plugin or component for something like
mozilla, the first question which you must answer is what is available to you.
Clearly, in almost all cases linking directly to some C++ symbol is NOT a good
idea, right?  In fact, I can't think of any plugin API where you would want to do
this.

And yes, I understand that this plugin was supposed to support both old
browsers which didn't have proper APIs and new versions that do.  Maybe that is
the real problem: we didn't cut bait on older versions when we should have.
Hmmm...

Let me see if I follow your logic:

1) The first thing I must do, as a developer, is figure out what is available to me.

2) No documentation exists nor is any forthcoming, so I must read the code (this
is the specific advice I have received from Netscape/Mozilla when asking for
guidance in these areas).

3) I find in the code an exported C++ symbol that does exactly what I need to
make my plugin work.  Other parts of mozilla are using it.  There appears to be
 NO other way to make my plugin work (at that time) without using it.

Conclusion:

Don't use that symbol because I should just "know it is not a good idea".

-----
Further, my previous statements about documentation were not meant to imply that I
thought the Mozilla engineers got together each morning over coffee, decided not
to document anything and then had a good laugh.  Although, as the time without
documentation begins to be measured in years, you have to wonder.

It did mean, however, that the Mozilla engineers/management need to take SOME
responsibility for problems that arise as a result of THEIR CHOICE to spend
their time implementing new features instead of providing documentation for
existing features (especially when they expect others to use those features
properly).

Also, Netscape engineers need to take SOME responsibility for shipping a product
based on pre-release code (i.e. Mozilla pre-1.0).  Perhaps it would have been
better to ship only betas of 6.0 until Mozilla shipped.  At least then Sun
could more easily justify dropping support for the early Netscapes to its
customers.

As I said, trying to support regxpcom was a big mistake on Sun's part and we are
going to correct that in the next release. 

I should have competed with Microsoft.
One thing we very much need to take responsibility for is delegating an
important part of our content support story to a closed-source plugin with long
release cycles maintained as though it were an open source part of Mozilla that
could be rebuilt in sync.

I recommend (and did years ago, when Ed Burns was still maintaining this stuff)
that people who are writing a Mozilla component

 * wait until APIs are frozen, at which point documentation has a chance to 
   catch up (and which documentation is sorely needed; I'd always hoped that
   Sun would cough up docs for the OJI infrastructure, to help things like
   waterfall and Kaffe, but you know how it goes), or

 * lobby to become part of the mozilla/extensions world, where they can be
   rebuilt in sync with incompatible changes without requiring a multi-season
   respin that will always be out of sync, or

 * pick a plugin to copy, realizing that things which build _with_ Mozilla
   and things that build _outside_ of Mozilla have different binary 
   compatibility requirements, and taking appropriate additional care if they
   choose to keep their distance for whatever business/legal/etc. reasons.

I realize that this is sometimes hard to do if you don't understand Mozilla in
detail, which is why I offered more than a year ago to fix it up _myself_, in
spite of the fact that working on JVM stuff repels me to my very core.  The
repeated offer was declined, which I took to mean that Sun either had a
solution, or didn't care about the problem.  Maybe there's a third
interpretation I'm missing, though, in which case I'm always eager to learn things.

Note that using raw C++ interfaces will kill you eventually anyway.  Witness the
fact that gcc 2.9[56], gcc 3.1 and gcc 3.2 all have different ABIs.  This is one
of the reasons that COM was invented, and XPCOM from it -- we don't get paid by
the |virtual| or anything.

I believe that if Mozilla is going to have a Java story that isn't as filled
with pain and disappointment -- on both sides, no doubt -- as our current one,
we're going to have to find and integrate an OJI plugin into the Mozilla source
base, which will use the JRI or some other standard, documented, never-changed
API to talk to the JVM.  I'd love for Sun to contribute theirs, but it seems
unlikely.  The waterfall code seems a good base for us to work from, if we can
get it updated and integrate, or maybe another vendor (IBM?) will donate some code.

(Having Netscape distribute its own OJI plugin seems like something that would
have avoided this long ago in a galaxy far far away, but that's purely
speculation.  Nobody could have guessed that the APIs would change between NS6.0
and Mozilla 1.0.)

I am speaking, of course, just for myself, and this shouldn't be taken as
mozilla.org dictum.  I'm just another hacker who will someday want to run an
applet in his browser.
Your write up of my logic is correct.  Someone at Sun should have known that it
was not the right thing to do.  And, if they didn't know that linking to a C++
export could break when the compiler changes, they could have asked someone.

I do agree that quickly adopting XPCOM plugin scriptability cost us big time.  I
would say sorry, but I was not part of the decision to integrate Java using
XPCOM.  I came in after this decision and have been plugging the holes where I
can.
  
I think we're getting rather far afield from the specific issue of this bug.  It
sounds like it would be useful and productive for various folks to get together
and try and hash out a roadmap for Java that will work better for everyone in
the future.  But let's not do that in this bug; if someone wants to drive that,
post a note to n.p.m.oji, and we can take it from there.

In particular, the hope here is the next release of the various JRE's will be
able to work with gcc 3.2 builds of Mozilla. It sounds like, if I'm reading
Steven's last comment correctly, that perhaps dropping the regxpcom support from
Sun's JRE will be sufficient to do this.  You mention that this is planned for
the next JRE release. Is that the 1.4 release that's currently in beta?  Is
there any sort of approximate time frame for whichever release you mean: 3
months?  a year?  Further, at what point will it be known whether dropping the
regxpcom support will really be enough?
WRT #42

Huh?

WRT #43

I would be all in favour of browser vendors integrating Java into their
applications themselves.  Of course, this is the situation that existed with
Netscape originally in 4.X.  The problem was that their turnaround time for
adopting new versions of Java was infinite (still shipping 1.1.X).  Hence the plugin.

One of the things I (and others) have been pushing for at Sun is an API above
JNI for putting a VM inside an application for the purposes of running applets
(and doing other plugin-like things).

Unfortunately, I find myself spending most of my time having long conversations in
Bugzilla about documentation, backward compatibility, Sun policies and offers of
assistance.  :-)

WRT #44

You are correct, it wasn't the right thing to do.  The right thing to do would
have been to follow the documentation, but there was none.  Then the right
thing to do would have been to ask the Mozilla/Netscape engineers how to do it
correctly, but their answer to every question (when I actually got an answer)
was "look in the code" and I had already done that.  Maybe the really right
thing to do would have been to convince my boss that support for Netscape before
Mozilla ships wasn't something Sun should be doing.

In case you didn't infer it from my use of the first person, I am/was the
engineer assigned the task of adding regxpcom support to the plugin.  Hence I am
the person who actually adopted the use of these C++ symbols.  Did I know it
would lead to problems down the road? Of course.  Did I care? Of course.  Did I
have any other alternatives? No.

WRT #45

I agree.

Unfortunately, it is too late in the development cycle to just take the code out
of the current beta product, 1.4.1.  (That is to say, I can't do it without
approval, and we would only get approval as the result of an escalation of a bug
filed by an outside interest, and since this is only a problem with Mozilla right
now it is unlikely that it would be escalated.)  While I'm the guy who makes the
changes, I'm not the guy who decides what changes get made.  Sorry, I know it
sucks.

I hope to have it out in 1.4.2 (or maybe 1.4.1_XX if possible).  Best case would
be availability some time around the end of the year for the latter and sometime
in the middle of next year for the former.

I will know if dropping support fixes this problem long before that, and if it
does not I will continue to search for a way to make the plugin work with
Mozilla (3.X compiled) while still maintaining compatibility with Netscape releases.

Steve,
Would this change be a relatively low-risk one? If so, given the big impact
it has (it makes or breaks the Java plugin on Linux when moving to gcc3), is there
a chance to make it a patch to JRE 1.4.1, instead of 1.4.2?
Cc-ing Jim, too.

As far as it being a Mozilla-only problem, this actually affects both Mozilla
and Netscape.  Because Netscape is gracious enough to provide a lot of the
build-support infrastructure to Mozilla, practical concerns mean that Netscape
and Mozilla will both need to switch their builds over to the newer GCC at the
same time.  I believe both entities are interested in doing so because of the
-O2 that we will then be able to turn on.
*** Bug 157589 has been marked as a duplicate of this bug. ***
Per this past Monday's meeting, Steve Katz of Sun is going to take out the code
in the JRE referencing the troubled symbol in the next JRE release. In the meantime,
we can test whether the change will work. There are two ways to do it:
1) Give Steve a gcc3 compiler, so the Java group can build Mozilla with gcc 3
and then test it with the modified JRE. Doug Turner is going to send a pointer
to gcc3 to Steve and a couple of other folks (including me).
2) The Java group provides modified JRE bits to Dan, so he can test them with his
gcc 3 compiled Mozilla.

Personally, I think 2) may be easier.
Some update on this one. The Sun Java group (Steve Katz) is going to make the change
(taking out the code referencing the deprecated symbol) and send the binary of
the modified JRE to Netscape (Dan Mosedale) to test in the next couple of weeks.
*** Bug 161650 has been marked as a duplicate of this bug. ***
Are there any updates on this problem from some knowing soul? As RH is shipping
with gcc 3.2 now (RH 8.0), I think the urgency of solving this (somehow) is getting
bigger and bigger.
BTW, SuSE 8.1 also shipped with gcc-3.2 as the default (and only included) compiler.
This means that all RH 8.0 and SuSE 8.1 users (and soon users of other distros) can't
compile Mozilla and have Java support working, unless they want to compile
Java as well.

Joe Chou, danm:
any update on the status of a new J2RE that should work, and an ETA for its release?
Chris Petersen is a new QA contact for oji component. His email is:
petersen@netscape.com
Assignee: joe.chou → petersen
fixing small error for pmac@netscape.com (filter with : SPAMMAILSUCKS)
Assignee: petersen → joe.chou
QA Contact: pmac → petersen
Steve, if you have the gcc 3 compiled JRE somewhere and need me to deliver it to
Netscape for testing, please let me know.
To my knowledge, the JRE is not and will not be compiled with gcc 3.0 now or in
the near future.  
As a Java Licensee, can Netscape provide a gcc3 compiled JDK (which is also
tested for compliance against the Java Compatibility Kit)?
Well, using gcc 3.2 + -O2 over the 3.5-4 year old egcs compiler does give
mozilla approximately a 7-10% speedup in page load (see some of the numbers in
bug 53486)

SuSE have apparently dropped Java support because there is no 3.2-compatible
plugin - see http://sdb.suse.de/sdb/en/html/mozilla.html. According to bug
158385 comment 6, Mandrake is now building with gcc 3.2, and I think Debian is
planning to move too, RSN, so there goes the Java plugin on those platforms, too.

IIRC, however, RedHat's mozilla rpms are compiled with gcc-2.96 so they do work
with the java plugin - some of those distributions may do the same, I suppose.

Note that because of the vtable layout changes between egcs and gcc-3.2, the 3.2
plugin will almost certainly have to be a separate build; just eliminating c++
linkage isn't going to be enough if java uses any c++ interfaces implemented by
mozilla.
Steven: Patrick Beard just confirmed that Bradley's last paragraph is correct:
just removing C++ linkage will not be sufficient to fix this bug.  The only way
to make this work is to make a JRE built with gcc 3.2 (or 3.2.1, I think)
available. Sorry I didn't realize this earlier.
*** Bug 178823 has been marked as a duplicate of this bug. ***
*** Bug 180517 has been marked as a duplicate of this bug. ***
*** Bug 180918 has been marked as a duplicate of this bug. ***
What is the status of this bug?  The equivalent bug on Sun's site:

http://developer.java.sun.com/developer/bugParade/bugs/4687814.html

is marked "will not be fixed".  This is dumb.  Someone needs to stop pointing
fingers and make Java and Mozilla work together on current distributions.

And just to fume, whose brain damage was it to use C++ as a binary interface
to 3rd party plugins, anyway?  Never once, in the whole history of the world,
has there been a stable C++ ABI.  Good grief...
What follows is a personal comment and is not meant to imply anything about what
Sun will or will not do:

The reason the Sun bug is marked as "will not be fixed" is because at that time
Sun did not officially support Mozilla with the plugin.  

Sun's focus was on supporting the released versions of Netscape, then addressing
what issues we could to also make that same version of the plugin work with
Mozilla.  In short: We were trying, but not promising anything.

Now that Mozilla has released a product I suspect the support story here at Sun
will be re-evaluated.
*** Bug 181659 has been marked as a duplicate of this bug. ***
reassign to me
Assignee: joe.chou → joshua.xia
*** Bug 182295 has been marked as a duplicate of this bug. ***
*** Bug 173179 has been marked as a duplicate of this bug. ***
I believe this bug isn't just present on Linux, but any platform which can use
GCC 3.x (e.g. when I compiled mozilla on Solaris with GCC 3.2.1 and installed
the 1.4.1 java plugin, it crashed. Removing the java plugin alleviated the problem).
Just to keep everyone up to date.

The offending symbols have been removed from the Java Plug-in currently under
development here at sun (1.4.2).

As a result, Mozilla built with gcc 3.x is able to load the shared object (the
plugin built with 2.9x).  

However, for some reason when the plugin tries to acquire the ServiceManager from
the nsISupports pointer supplied to it in NSGetFactory(), we end up with the
QueryInterface call returning success but the out-parameter of the call set to NULL.
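
The symptom described here (QueryInterface reporting success while leaving the out-parameter NULL) can at least be detected on the caller's side. A minimal sketch, assuming hypothetical stand-in types: `Result`, `Base`, `GoodObject`, `BrokenObject` and `AcquireChecked` are illustrative names, not the real XPCOM headers or the plugin's code. It only catches the mismatch; it does not address the underlying ABI difference.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are NOT the XPCOM types.
using Result = std::uint32_t;
constexpr Result kOk = 0;           // assumed "success" code
constexpr Result kNoInterface = 1;  // assumed "failure" code

struct Base {
    // Same shape as a COM-style QueryInterface: status code plus out-parameter.
    virtual Result QueryInterface(int iid, void** out) = 0;
};

// Well-behaved object: fills the out-parameter whenever it reports success.
struct GoodObject : Base {
    Result QueryInterface(int iid, void** out) override {
        if (iid != 1) { *out = nullptr; return kNoInterface; }
        *out = this;
        return kOk;
    }
};

// Simulates the reported symptom: success code, but the out-parameter stays null.
struct BrokenObject : Base {
    Result QueryInterface(int, void** out) override {
        *out = nullptr;
        return kOk;
    }
};

// Treats "success with a null out-parameter" as a failure.
inline bool AcquireChecked(Base* obj, int iid, void** out) {
    assert(out != nullptr);  // the caller must supply somewhere to store the result
    *out = nullptr;
    if (obj == nullptr) return false;
    const Result rv = obj->QueryInterface(iid, out);
    return rv == kOk && *out != nullptr;
}
```

With a check like this the plugin would fail early with a clear error instead of dereferencing a null service manager later on.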

---

On a related topic.  Can someone point me at a document that defines the memory
layout of an XPCOM interface?  Is it the same as COM?  Is it the memory layout
of the vtable produced by the gcc 2.9x compiler?  Does the gcc3.x compiler
produce this same layout?

> As a result, Mozilla built with gcc 3.x is able to load the shared object (the
> plugin built with 2.9x).  

This won't work. (You can in fact get to this stage with the existing plugin
just by creating a wrapper library for the missing symbol)

> However, for some reason when the plugin tries to acquire the ServiceManager from
> the nsISupports pointer supplied to it in NSGetFactory(), we end up with the
> QueryInterface call returning success but the out-parameter of the call set to
> NULL.

The ABI has changed between egcs and gcc-3.2. It's not only the mangling (which
we've already seen), but the virtual table layout, class layout, and so on. This
means that you can get it to load, and it will run as long as your plugin never
calls into mozilla using a c++ interface pointer.

!!!!You need to compile the java plugin with gcc 3.2 for the java plugin to work
with 3.2!!!! It may be possible for that plugin to then work on 3.1, if the
3.1->3.2 abi changes don't affect what mozilla does, but I doubt it will work on
3.0 (although we _may_ be lucky, since everything is done via com, without
multiple/virtual inheritance), but it definitely won't work on 2.x.

The _only_ way to get this to work is to have two separate builds, one done with
egcs, and one with gcc 3.2. (And possibly _another_ for 3.0, if you feel like
it, but 3.0 had a few bugs which miscompiled mozilla at one point. It is
possible that a 3.2 build may work with 3.0, but it may also have lots of little
problems which are hard to track down.)

> On a related topic.  Can someone point me at a document that defines the memory
> layout of an XPCOM interface?  Is it the same as COM?  Is it the memory layout
> of the vtable produced by the gcc 2.9x compiler?  Does the gcc3.x compiler
> produce this same layout?

No, and that's the problem. I don't believe that the egcs ABI was ever specified
formally, although I didn't look that hard for it. The 3.2 abi is at
http://www.codesourcery.com/cxx-abi/ (apart from a few known bugs, which 3.2.1
warns about when compiling with -Wabi)

If you're thinking of a marshalling layer, forget it - you'd have to handle
every single object passed between moz and your plugin (including those created
by moz), rewriting its virtual table along the way. Not to mention dealing with
the features which we disable under egcs (Using -fshort-wchar for unicode
strings under 3.x, for starters).

You need two builds for the plugin.
WRT #73

Actually, I was not thinking of a marshalling layer.

I was thinking that merely switching to a new compiler is not an option for
mozilla.org, if that compiler does not produce the memory layout specified by XPCOM.

While compiling the plugin with the 3.x compiler may allow it to work with the
3.x compiled browser, it does not address this more fundamental issue.  In point
of fact, every plugin and xpcom component developed for the 2.9x browser would
have to be recompiled.  

[finger pointing mode on]
It seems to me this could be viewed as a failure of XPCOM to meet part of its
design criteria.  All the complexity of a component object model with none of
the benefit.
[finger pointing mode off]




> While compiling the plugin with the 3.x compiler may allow it to work with the
> 3.x compiled browser, it does not address this more fundamental issue.  In point
> of fact, every plugin and xpcom component developed for the 2.9x browser would
> have to be recompiled.  

This is the case, yes.  You have to recompile everything.  If you manage to get
something to work it's just luck.

> 
> [finger pointing mode on]
> It seems to me this could be viewed as a failure of XPCOM to meet part of its
> design criteria.  All the complexity of a component object model with none of
> the benefit.
> [finger pointing mode off]

Assuming that the compiler was stable, sure.  Too bad gcc isn't.  This isn't
XPCOM's fault; it's entirely in the laps of the people who keep changing the damn
ABI.
WRT #75

Actually, it should not matter that the compiler was/is not stable or if the 
ABI changes.  XPCOM defines (or should define) the layout of the interface in 
memory.  If that layout is not what the compiler produced natively then it 
should have been hand coded.
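
As a sketch of what "hand coded" could mean here: an interface can be declared as an explicit table of function pointers, so that slot order comes from the struct declaration and the platform's C ABI rather than from the C++ compiler's own vtable layout. The names below (`Counter`, `CounterVtbl`, `MakeCounter`) are hypothetical and are not XPCOM's actual definitions.

```cpp
#include <cassert>
#include <cstdint>

extern "C" {

struct Counter;  // hypothetical example object

// Hand-written function table: slot order is fixed by this declaration,
// not by whatever vtable layout the C++ compiler happens to use.
struct CounterVtbl {
    std::uint32_t (*AddRef)(Counter* self);
    std::uint32_t (*Release)(Counter* self);
    std::int32_t (*GetValue)(Counter* self);
};

// The object starts with a pointer to its function table, COM-style.
struct Counter {
    const CounterVtbl* vtbl;
    std::uint32_t refcnt;
    std::int32_t value;
};

static std::uint32_t CounterAddRef(Counter* self) { return ++self->refcnt; }

static std::uint32_t CounterRelease(Counter* self) {
    assert(self->refcnt > 0);  // releasing an object nobody holds is a bug
    return --self->refcnt;
}

static std::int32_t CounterGetValue(Counter* self) { return self->value; }

static const CounterVtbl kCounterVtbl = {CounterAddRef, CounterRelease, CounterGetValue};

}  // extern "C"

// Builds a counter with one reference held by the caller.
static inline Counter MakeCounter(std::int32_t value) {
    return Counter{&kCounterVtbl, 1u, value};
}
```

A caller then writes `c.vtbl->GetValue(&c)` instead of `c->GetValue()`. The cost is more verbose code on both sides, which is presumably part of why the compiler's default layout was the "easy" road.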

Since no one seems to be able to point at the XPCOM spec that defines the 
layout, I think it is fair to assume that it is (by default) the layout that 
existed in Mozilla 1.0 when it shipped.  As such before mozilla.org starts 
shipping a browser built with a different compiler they should ensure that the 
XPCOM layout that browser uses matches the one Mozilla 1.0 used.

In short, this is a problem with Mozilla's implementation.




steve, you are right.  xpcom defines the binary format of the interface in
memory.  XPCOM took the "easy" road and used the default layout of the vtable as
defined by the compiler.  There was a discussion about a gcc compiler option
which would allow a sort of backward compatibility story, but as I understand it,
the option does not work.  bbaetz, do you remember what that option was?


dougt: Basically, that wasn't really an option, from what I can recall. I can't
quite remember the details, but that option isn't documented, and ISTR it not
being available in older versions and/or being deprecated.

steve: I think we all agree that this is not an ideal solution. Nevertheless,
it's the situation we're stuck with. The g++-3.2 ABI was designed to be stable
and cross platform, while the egcs one wasn't. Whilst there are known bugs in
the 3.2 ABI, I don't think they affect mozilla's interfaces (has anyone compiled
with -Wabi under 3.2.1 to check this?) The unstable ABI was known and announced
_before_ 1.0 was released. I asked for this fact to be documented _specifically_
so that we didn't have someone coming up later and stopping us from doing a
compiler upgrade on the basis of the definition of @FROZEN. See bug 143887 + the
relevant doc links/newsgroup postings in that bug.

Upgrading to g++-3.2 gives mozilla a measurably large (ie approximately 8-10%)
speed improvement, as well as allowing us to run with a supported compiler using
libraries which ship and are installed by default on current linux
distributions. There have been several reports in this bug (and others) of
people downloading the jdk source and recompiling it under gcc-3.2, so it's
clearly possible for java to run that way with minimal, if any, changes required
by Sun.
> Whilst there are known bugs in the 3.2 ABI, I don't think they affect mozilla's
> interfaces (has anyone compiled with -Wabi under 3.2.1 to check this?)

That would definitely be worth checking.  However, we may want to just force the
3.2 ABI using -fabi-version with newer versions of gcc once we switch so that we
end up using the exact same ABI that the vendors have already shipped with.  At
some future date, we will then (probably) have this issue again.
That will work as long as we (and sun, and anyone else) don't link to a c++
library with a different abi version.

Remember, we only care about -Wabi issues with interfaces (and a few concrete
things like nsCOMPtr), which only use single inheritance, and are not empty, and
have no data members, and don't use templates or exceptions and only pass POD
types around. IOW, they're pretty simple from an ABI POV.
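
Part of that description can be checked mechanically. A rough sketch, assuming a hypothetical interface (`ISupportsLike`, `ISample`) and a heuristic helper (`LooksLikeSimpleInterface`) that are not part of Mozilla: an abstract, polymorphic type that is exactly one pointer wide carries only a vtable pointer and no data members. That size assumption holds for the common compilers I know of but is not guaranteed by the language, and the check says nothing about templates, exceptions, or argument types.

```cpp
#include <type_traits>

// Hypothetical interface in the style described: single inheritance,
// pure virtual methods only, no data members, plain (POD) argument types.
struct ISupportsLike {
    virtual unsigned AddRef() = 0;
    virtual unsigned Release() = 0;
};

struct ISample : ISupportsLike {
    virtual int GetValue(int* out) = 0;
};

// Counter-example: adding a data member makes the object layout ABI-sensitive.
struct WithData : ISupportsLike {
    int cached;
};

// Heuristic check (an assumption, not a compiler guarantee): an abstract,
// polymorphic type that is one pointer wide holds only a vtable pointer.
template <typename T>
constexpr bool LooksLikeSimpleInterface() {
    return std::is_polymorphic<T>::value &&
           std::is_abstract<T>::value &&
           sizeof(T) == sizeof(void*);
}

static_assert(LooksLikeSimpleInterface<ISample>(), "ISample should be vtable-only");
static_assert(!LooksLikeSimpleInterface<WithData>(), "WithData carries a data member");
```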

I think it may be better to leave the abi-version as implied, and at the time
when gcc switches the default, we can set -fabi-version=0 if and only if -Wabi
shows a difference for any frozen interface. I may build 3.2.1, and try a moz
build this weekend, if I find time, even though of course that list won't be
final. I wonder if I could set it up to build against sdk/* with -Wabi -Werror,
or something....
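
Related to the -fabi-version idea: GCC exposes the C++ ABI version it is compiling for through the predefined macro `__GXX_ABI_VERSION`. A minimal sketch of how a host and a component could each record that value at build time and compare the two at load time. The comparison policy and the function names (`CompilerAbiVersion`, `SameAbi`) are assumptions for illustration, not something Mozilla or the Java plugin is known to do.

```cpp
// __GXX_ABI_VERSION is predefined by GCC when compiling C++; other compilers
// may not define it, in which case this sketch reports -1 ("unknown").
constexpr long CompilerAbiVersion() {
#ifdef __GXX_ABI_VERSION
    return __GXX_ABI_VERSION;
#else
    return -1;
#endif
}

// Hypothetical load-time policy: accept a component only if both sides
// know their ABI version and the two values match.
constexpr bool SameAbi(long host, long component) {
    return host >= 0 && component >= 0 && host == component;
}
```

A matching value would not prove two binaries are compatible (the C++ runtime library matters too), but a mismatch would be a cheap way to refuse a load that is likely to crash.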
Can someone tell me where people who are building the plugin (and java)
themselves are getting the source from?  Are they all licensees?  Is it
available on-line somewhere?  
Status: NEW → ASSIGNED
The JDK source code is available from
<http://wwws.sun.com/software/java2/download.html>. One has to sign up for an
account and accept the SCSL (Sun Community Source License) to download the
source code.
WRT #82

Thanks.

-----

I (me personally, not in any way an official voice of Sun) think it is really
unlikely that Sun will support Mozilla until it achieves some level of stability
with respect to its object model.  Especially since it is fully anticipated
that this problem will come up again with some as yet unknown future gcc version. 

However since it is just a simple recompile [sic #78], I (me personally, not in
any way an official voice of Sun) suggest that mozilla.org become a Java
licensee, recompile with the appropriate compilers and distribute the built
version off their website.
Or as a better long term strategy get gcj and its VM up to speed.
Reporter,
Can we close this one, since this problem will be resolved in JRE1.4.2?
Whiteboard: close
*** Bug 181855 has been marked as a duplicate of this bug. ***
Pete:
When J2RE 1.4.2 is out, and we can see this really fixed, then you can resolve
this bug as fixed. We'll be happy to verify it. :)

Please don't close the bug before that, because people seeing this problem with
current builds and the newest available JRE should still find it, as long as
that combination shows the bug.
OK, let's keep this bug open till the JRE1.4.2 is out. I think it needs several
months.
I have a suggestion: why doesn't Sun contribute the source code to the Mozilla
project so that the plugin builds along with Mozilla?  That way, I can keep my
gcc that I need for GNUstep, a critical plugin is *always* built with the
browser and therefore *always* compatible, and I can stop cursing at Sun.

I know that this wouldn't fix the larger problem.  I also know that even trying
to explain this concept to Sun's IP lawyers (much less, convince them it's a
good idea) is about as difficult as explaining differential calculus to an
anteater.  They'd raise bloody hell.

But:

1) There is literally no way I would have known what the problem was if I hadn't
spent 1.5 hours searching around through bugzilla.  There's nothing on the
Mozilla download page, there's nothing on the Sun download page.  This isn't a
plugin for the Whiz-Bang-2000 Proprietary Media Format.  This is an essential
component we're talking about here.  Let's not hide this under a bushel, OK?

2) Having to download a plugin for something so critical as Java is stupid in
the first place.  It should "just work" out of the box.

3) Development of the plugin would probably speed up if it were part of the
Mozilla tree.

From what I've seen of Sun's legal and executive machine, I'm wasting a lot of
time typing all of this.  But I can dream, can't I? :-)
*** Bug 189244 has been marked as a duplicate of this bug. ***
*** Bug 192857 has been marked as a duplicate of this bug. ***
The new java plugin from blackdown works with gcc-3.2. Download from
http://www.blackdown.org/java-linux/mirrors.html (not all mirrors have it yet,
though) They ship two builds; one for gcc 3.2 and one for gcc-2.95. The gcc-2.95
build will stop being made at some point in the future, apparently.

I guess we should leave this open until Sun does the same.
I didn't find any RPMs for RH8 there. Do you have any idea where these
can be found?
*** Bug 157358 has been marked as a duplicate of this bug. ***
*** Bug 198471 has been marked as a duplicate of this bug. ***
This has been fixed on JPI side (JPI provide gcc3.x build)
I don't suppose that's going to be publicly available any time soon?
Blizzard,

The jre1.4.2-beta can be downloaded from http://java.sun.com now.
->Fixed in JRE1.4.2
Status: ASSIGNED → RESOLVED
Closed: 23 years ago → 21 years ago
Resolution: --- → FIXED
The jre 1.4.2 beta on Sun's site is not compatible with mozilla compiled with
gcc 3.2 under Solaris - using it just results in a crash on startup.
ln -s .../j2sdk1.4.2/jre/plugin/i386/ns610-gcc32/libjavaplugin_oji.so plugins

Works like a charm (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4)
Gecko/20030624 on RedHat 8.0).
I am having this same problem with mozilla 1.7alpha, it was working with 1.6. I
removed the old plugin and installed a new one from sun with same results (I
only installed the plugin not the whole jre, I will try that next).
Product: Core → Core Graveyard