Closed Bug 5369 Opened 25 years ago Closed 25 years ago

[PP]Crash on startup because of Java Plug-in 1.3 for Netscape Navigator.

Categories

(Core Graveyard :: Java: OJI, defect, P2)

x86
Windows 95
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: desale, Assigned: drapeau)

References

Details

(Whiteboard: Crash due to JRE bug in Sun; workaround available; Must add release notes to download new JVM)

Attachments

(3 files)

Crash on startup because of Java Plug-in 1.3 for Netscape Navigator.

Product: Seamonkey [Apprunner]
Build: 04/21/99
Platform: PC.
OS: Windows-95.

Steps to reproduce the problem:
1] Install today's [04/21/99] build.
2] Run the application.

Expected Results: Application should run.
Actual Results: Application crashes.

Description: The new build installs without any problem. But when I try to run
the application, it crashes and gives me the error "You performed illegal
operation". This happens because the application looks in the plugins
directory of Communicator 4.5, where there is a file called "npjava32.dll" that
causes the problem. The file "npjava32.dll" belongs to the plugin called Java
Plug-in 1.3 for Netscape Navigator.
      If we rename this file to "npjava32.bak" or any other name, everything
works fine and the application does not crash. So the file "npjava32.dll" is
causing the problem on startup.
Priority: P3 → P2
QA Contact: 3849 → 4616
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → DUPLICATE
Please see Bug# 3800


*** This bug has been marked as a duplicate of 3800 ***
Status: RESOLVED → REOPENED
Bug 5369 and bug 3800 appear to be the same, but I don't think they are exactly
the same. Both bugs are related to Java plugins, but bug 3800 in particular
happens when you visit www.javasoft.com and it tries to run the applet, so
there is no problem at startup.
For some reason bug 3800 has been marked as fixed and verified, and it has this
comment: "that it's happening because of there is a version mis-match
between the NSGetFactory() the plugin supports and the NSGetFactory() we expect.
The latest plugin does not work because there is still a mismatch, but it no
longer crashes:
ftp1.netscape.com/private/sun_oji/java-plugin-030199/jre1_2_2-win.exe
Please download this and verify that the crash is fixed."

I tried to download the new plugin from this site, but it seems it's not there
anymore. And bug 3800 is still there on Windows 95 with today's build, without
the new plugin installed [and that makes sense]. It is possible that bug 3800
will go away once I manage to download the new plugin.

This bug 5369, on the other hand, happens at startup. So whenever you need to
run the application, you have to rename the file "npjava32.dll" to something
else. I think these two bugs are a little different from each other.
The bugs appear to be different, but they are the same.  Bug 3800 only _used_ to
happen when you hit a page which required Java, but since we now load Java at
startup to initialize LiveConnect, this crash occurs at startup as well.

Please try the new Java plugin at:

http://warp/core/plugins/drops/oji/oji-installer.zip

to test it out.  Sorry for the confusion!
Thanks for the information, now I can see that these bugs are the same.
I installed the new plugin, but the crash still seems to be there. I mean bug
3800 still seems to be there with the new plugin, even though it has already
been marked fixed and verified. And this bug itself is also still there.
Since 3800 is already marked as verified [so nobody will be working on that
bug, right?], I think we should leave this bug open until the crash is
resolved.
If you have additional information, for example that it all depends on
something else, please let me know. In that case we can close this bug.
So you installed the new JRE from the oji-installer.zip file and you still see a
crash?  It works fine for me.  hmmm.

Andrei - can you test this out on your machine?
Yeah, I installed jre1_2_2-win from the oji-installer.zip. Basically
jre1_2_2-win is an exe file that installs the new plugin. I ran this file and
it installed the new plugin, and it updated the "npjava32.dll" file as well. I
know it actually updated the plugin because when I tried to rename
"npjava32.bak" back to "npjava32.dll" [I had renamed it earlier in order to run
the application], it did not allow me to, since there was already a new
"npjava32.dll" file there. So it updated the plugin, that is for sure. And the
crash is still there.
And bug 3800 is also still there with the new plugin.
desale - can we arrange a time to meet so I can look at the problem on your
machine?  I cannot repro the problem with the latest installer and build.
Sure, let me know what time works for you. Tomorrow [Friday 04/23/99] I do not
have any meetings. Let me know a good time and we can meet in my cube, and
I'll be able to show you.
Target Milestone: M5
Alex, do you need my help? I do not see the crash on my machine.
Whiteboard: test system configuration problem?
I want to move this to m6 if the problem is not widespread.
what is the latest status of testing, and understanding the bug?
This morning I tested it on today's build, which is 05-03-08. The bug is still there.
Target Milestone: M5 → M6
Resolution: DUPLICATE → ---
Status: REOPENED → ASSIGNED
Stanley should be providing a new installer soon.  I'll post the location here
when I get it.
Whiteboard: test system configuration problem? → need drop from sun?
Target Milestone: M6 → M7
sounds like we need a new drop from Sun to fix this.
any one know the status?
moving to m7.  let me know if it gets here quicker.
Please try the new drop at:

http://dante.mcom.com/jre1_3-win.exe
have you had a chance to try the new drop?
Yeah, I tried this new drop, but it seems it's still happening. The application
runs only if I rename NPjava32.dll to something else. Otherwise it crashes on
startup.
Target Milestone: M7 → M8
Reassigning all OJI bugs to george.drapeau@eng.sun.com
Assignee: drapeau → edburns
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Whiteboard: need drop from sun? → Tested with July 2 Build of Java Plugin 1.3, no problems
Status: RESOLVED → REOPENED
Reopening, since still crashing.
Status: REOPENED → ASSIGNED
I have tested this bug with mozilla from 7/12/99 and the Java 1.3 plugin from
7/9/99 and it does indeed crash.
Target Milestone: M8 → M9
moving this off to m9 since it's a problem that has been around
for a while.  if we get a fix for this today or tomorrow let me
know and we can see about getting it put on the m8 branch.
Summary: Crash on startup because of Java Plug-in 1.3 for Netscape Navigator. → Crash onn startup because of Java Plug-in 1.3 for Netscape Navigator.
Summary: Crash onn startup because of Java Plug-in 1.3 for Netscape Navigator. → Crash on startup because of Java Plug-in 1.3 for Netscape Navigator.
Resolution: FIXED → ---
Tried this with Java Plugin 1.3 from July 8, 1999 and it still crashed.

The problem is: in file ./js/src/liveconnect/jsj.c, in the function
init_java_VM_reflection(), there is a call to the macro
LOAD_FIELD_OBJ(java.lang.Void, TYPE, "Ljava/lang/Class;", jlVoid).  This call
fails, printing the following console message:

initialization error: Can't read static field java.lang.Void.TYPE

I'm currently trying to track down a fix for this, starting with the last
person who modified the file, fur@netscape.com.  Any information anyone has on
this would be greatly appreciated.
I'm handling fur's buglist while he's on sabbatical.

I wasn't able to get the JRE from the http://dante.mcom.com/jre1_3-win.exe
location, but did pick up the earlier oji installer. After moving the
npjava32.dll into the plugin directory for my build, I can't reproduce the bug.
The fact that it fails on the last reflection is weird - clearly we've
connected to the VM and have been successful getting other reflections -
including the field id for this one. Can I debug this on anyone's machine?
Roger Lawrence, I reported this bug and it's still reproducible on my machine.
If you want to use my machine to look at what exactly happens, you are always
welcome.
It seems to be OS version dependent - desale showed me it failing on Win95 rev
C, and said that it doesn't fail on rev A. Doesn't fail on the JS lab win95 box
(plain Win95 install)
Assignee: edburns → rogerl
Status: ASSIGNED → NEW
The problem is in file ./js/src/liveconnect/jsj.c, line 271:

LOAD_FIELD_OBJ(java.lang.Void,
TYPE,               "Ljava/lang/Class;",            jlVoid);

Commenting out this line alleviates the crash, enabling the browser to start
and allowing the OJI plugin to work.  However, I'm not sure of the
consequences, so I reassigned the bug to Roger Lawrence, who is investigating
it.
Status: NEW → ASSIGNED
The consequences of commenting out the line are that the reflection into JS of
any constructor will fail since the void type field is used to specify the
return type.
The problem here is that the plug-in is returning an error code 0x80004005 when
the field java.lang.Void.TYPE is accessed. I'm still trying to track down an
answer to what that error code means.
I debugged the call into the JRE plug-in thus far - after a small amount of
error checking on the incoming parameters, a virtual dispatch is made from the
'this' object which calls a stub function that returns 0x10001. This code
triggers an immediate error return of the afore-mentioned 0x80004005 error code
and kaboom, that's all she wrote.
It looks like the plug-in is just not allowing access to the static field.
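
For reference, a sketch of how a code like 0x80004005 splits into fields. It assumes the plug-in follows the Windows HRESULT layout, which the comments here do not confirm. Under that layout the value is the generic E_FAIL ("unspecified failure"), and XPCOM's NS_ERROR_FAILURE has the same numeric value, so the code by itself says little about the cause. The `ResultCode` class is a hypothetical helper, not part of Mozilla or the plug-in.

```java
/** Splits a 32-bit COM-style result code into severity, facility and code fields. */
final class ResultCode {
    final boolean failure; // bit 31: set when the call failed
    final int facility;    // bits 16-26: which subsystem produced the code
    final int code;        // bits 0-15: the subsystem-specific status

    ResultCode(int raw) {
        this.failure = (raw >>> 31) == 1;
        this.facility = (raw >>> 16) & 0x7FF;
        this.code = raw & 0xFFFF;
    }

    @Override
    public String toString() {
        return String.format("failure=%b facility=%d code=0x%04X", failure, facility, code);
    }

    public static void main(String[] args) {
        // The value reported in this bug.
        System.out.println(new ResultCode(0x80004005));
    }
}
```

Run as is, this prints `failure=true facility=0 code=0x4005`: a failure from facility 0 (the generic facility) with status 0x4005.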
So should this static field be changed to public?  Can you tell me exactly in
what class this field lives?

Thanks,

Ed
Sure, the field being accessed is called 'TYPE'. It's in the class
'java.lang.Void' and is of type 'Class'.
It will probably take some time to get this into the JRE.  What are the
consequences of commenting the offending line out so people can use the Java
Plugin in the meantime?
Moving out to M10 per chofmann's request
There are a few things I don't understand about this bug:

1) LiveConnect should not be initialized by the OJI subsystem during browser
startup.  Rather, initialization should be lazy, occurring the first time
LiveConnect is employed on an HTML page, i.e. by access to a static field/method
or to elements of the document.applets array.  I believe that Warren told me
that lazy initialization was working months ago, so if there is a regression
here, then someone familiar with the OJI code should take a look.

2) Even if LiveConnect initialization fails, the OJI code which invokes it
should fail gracefully rather than crashing.  It should be trivially possible to
use the Java plugin to run applets even without LiveConnect having initialized.

3) Why isn't java.lang.Void.TYPE publicly accessible in the JRE ?  Hasn't this
been a public field since the first release of JDK 1.1 ?
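
Points 1 and 2 above describe a lazy, fail-soft startup. A minimal Java sketch of that pattern follows. It is not the OJI code: `LazySubsystem` and its `starter` callback are hypothetical names standing in for LiveConnect and its initialization routine.

```java
import java.util.function.Supplier;

/** Starts a subsystem on first use and remembers a failure instead of propagating it. */
final class LazySubsystem<T> {
    private final Supplier<T> starter; // hypothetical stand-in for the real startup routine
    private boolean attempted;
    private T instance;                // stays null if startup failed

    LazySubsystem(Supplier<T> starter) {
        this.starter = starter;
    }

    /** Returns the started subsystem, or null if startup failed; never throws. */
    synchronized T getOrNull() {
        if (!attempted) {
            attempted = true;          // try exactly once, even after a failure
            try {
                instance = starter.get();
            } catch (RuntimeException e) {
                instance = null;       // degrade: callers simply see "not available"
            }
        }
        return instance;
    }

    boolean isAvailable() {
        return getOrNull() != null;
    }

    public static void main(String[] args) {
        LazySubsystem<String> broken = new LazySubsystem<>(() -> {
            throw new IllegalStateException("Can't read static field java.lang.Void.TYPE");
        });
        LazySubsystem<String> working = new LazySubsystem<>(() -> "bridge ready");
        System.out.println("broken available: " + broken.isAvailable());
        System.out.println("working available: " + working.isAvailable());
    }
}
```

Nothing runs until the first call to `getOrNull()`, and a failed start leaves the rest of the program usable. That matches the behaviour asked for here: applets keep working even when the scripting bridge cannot initialize.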
This from Tom Ball, manager of the JRE plugin for 1.3:

Ed,

TYPE has always been public (see attached).  Changing this API won't fix
your problem, as it's somewhere else.

Does LOAD_FIELD_OBJ instantiate an object?  The Void class doesn't allow any
objects of its type to be created.  If LOAD_FIELD_OBJ does, then jsj.c needs
a LOAD_FIELD_CLS macro that references the Void class directly.

Tom

> From: Ed Burns <ed.burns@sun.com>
> To: ryang@arago.eng.sun.com, tball@arago.eng.sun.com,
drapeau@arago.eng.sun.com
> Subject: java.lang.Void TYPE field
>
> Hello Robert,
>
> bug 5369, "Crash on startup because of Java Plug-in 1.3 for Netscape
> Navigator." is being caused by the fact that the field TYPE in
> java.lang.Void is not public.  Can you please make this field public?
> For more information, please see
>
> http://bugzilla.mozilla.org/show_bug.cgi?id=5369
>
> This bug prevents anyone from using OJI in the current mozilla.
>
> Ed
/*
 * @(#)Void.java	1.6 98/09/21
 *
 * Copyright 1996-1998 by Sun Microsystems, Inc.,
 * 901 San Antonio Road, Palo Alto, California, 94303, U.S.A.
 * All rights reserved.
 *
 * This software is the confidential and proprietary information
 * of Sun Microsystems, Inc. ("Confidential Information").  You
 * shall not disclose such Confidential Information and shall use
 * it only in accordance with the terms of the license agreement
 * you entered into with Sun.
 */

package java.lang;

/**
 * The Void class is an uninstantiable placeholder class to hold a
 * reference to the Class object representing the primitive Java type
 * void.
 *
 * @author  unascribed
 * @version 1.6, 09/21/98
 * @since   JDK1.1
 */
public final
class Void {

    /**
     * The Class object representing the primitive Java type void.
     */
    public static final Class TYPE = Class.getPrimitiveClass("void");

    /*
     * The Void class cannot be instantiated.
     */
    private Void() {}
}
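
A quick way to confirm from Java code what the listing above shows, namely that TYPE is public, static and final and holds the Class object for void. This is only a sketch: `VoidTypeProbe` is a hypothetical name, and reading the field through `java.lang.reflect` is a Java-side analogue of the native static-field lookup discussed in this bug, not the same code path.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

/** Reads java.lang.Void.TYPE by name using Java reflection. */
final class VoidTypeProbe {
    /** Returns the value of Void.TYPE, or null if the lookup or read fails. */
    static Object readVoidType() {
        try {
            Field field = Void.class.getField("TYPE"); // getField only sees public members
            return field.get(null);                     // null receiver: the field is static
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    /** True if the field is declared public static final. */
    static boolean isPublicStaticFinal() {
        try {
            int mods = Void.class.getField("TYPE").getModifiers();
            return Modifier.isPublic(mods) && Modifier.isStatic(mods) && Modifier.isFinal(mods);
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("public static final: " + isPublicStaticFinal());
        System.out.println("value is void.class: " + (readVoidType() == void.class));
    }
}
```

If both lines print `true` on a given JRE, the field itself is fine there, and a failure seen through the plug-in would have to come from the native access path.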
Summary: Crash on startup because of Java Plug-in 1.3 for Netscape Navigator. → [PP]Crash on startup because of Java Plug-in 1.3 for Netscape Navigator.
Whiteboard: Tested with July 2 Build of Java Plugin 1.3, no problems → Very platform specific. Windows-95, Service Pack: C
Just wanted to add that it fails only on Windows 95, and only with
Service Pack C. It does not fail with Service Pack A. So it is very
platform specific.
Roger and I looked at this bug a bit yesterday and discovered some new details:

The bug is reproducible on NT, not just Win 95/C.

There was no crash, only two assertion failures, which can safely be skipped in
the debugger, with the browser subsequently starting up normally.  I'm assuming,
therefore, that the release build does not crash, though we have not tested that
theory.  If this is the case, this bug can be considered to have lower priority.

Desale, have you tested with a release build ?

I realized that this bug looked a lot like one we had seen in another vendor's
JVM.  In that case, the JVM performed lazy initialization of static fields, i.e.
so that the static initializer was run the first time a field was accessed.  The
bug was that accessing a static field via native code (using the JNI) would
never trigger the initializer.  The workaround was to call a method which
accessed the field from *Java* code first, thus triggering the call to the
static initializer, and then subsequent accesses to the field from native code
would work correctly.  I tried enabling the code in jsj.c that was ifdef'ed out
under JAVA_STATIC_INITIALIZER_BUG.  Surprisingly, this did not seem to cure the
problem.  I have yet to verify, however, that this workaround code has not
suffered from bit-rot, since it hasn't been tested in more than a year.

Ed, any news from the JVM team as to the status of this JDK bug ?
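
The lazy static initialization described above can be seen from plain Java. The sketch below uses hypothetical names (`StaticInitDemo`, `Holder`) and shows only the Java-side half: the initializer runs on the first read from Java code, and only once. It does not reproduce the JNI side of the other vendor's bug.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Shows that a class's static initializer runs on first use from Java code, and only once. */
final class StaticInitDemo {
    static final AtomicInteger INIT_RUNS = new AtomicInteger();

    /** Hypothetical stand-in for a class whose static field native code wants to read. */
    static final class Holder {
        static final Object TYPE;
        static {
            TYPE = new Object();          // assigned in the initializer, so not a compile-time constant
            INIT_RUNS.incrementAndGet();  // records that the initializer ran
        }
    }

    /** Touches the field from Java, which forces Holder to be initialized first. */
    static Object touchFromJava() {
        return Holder.TYPE;
    }

    public static void main(String[] args) {
        System.out.println("initializer runs before first use: " + INIT_RUNS.get());
        touchFromJava();
        touchFromJava();
        System.out.println("initializer runs after two reads:  " + INIT_RUNS.get());
    }
}
```

The counter reads 0 before the first access and 1 after both reads. Calling something like `touchFromJava()` before any native access is the shape of the workaround described above.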
It's interesting that it's reproducible on Win-NT. I'm going to verify that.
Scott Furman, I always use release builds to test this.
I'll add some notes once I've tested it on Win-NT too.
After reading Scott's post in n.p.m.oji, I've resigned myself to use bugzilla
for this discussion.

So, Scott, you mentioned that JDK is returning null for the value of
java.lang.Void.TYPE without throwing an exception.  What should happen ideally?

Ed
The value of java.lang.Void.TYPE should be an instance of java.lang.Class.

Something that Roger reminded me of (and that I had missed in his earlier
comments) is that the calls to the JDK are being intercepted by a "secure proxy
JNI" that sits in front of the raw JNI method calls, so as to ensure that method
invocations are made using the Java 2 security model, i.e. from a secure
context.  This proxy code resides in the jsjpi DLL that is supplied by Sun as
part of the JDK.  So, it is completely possible that the bug lies in this
wrapper code and not the JDK itself, but since we do not have the source, we
cannot easily do further debugging.

So, is the Java plugin source code available somewhere, or can someone at Sun
who is familiar with the code be brought on board ?
Forgot to mention: The method in question is GetStaticField(), a member of the
nsISecureEnv interface.
Blocks: 5429
*** Bug 11730 has been marked as a duplicate of this bug. ***
I've discovered the underlying reason for this bug.  The problem is in
kestrel/ext/plugin/oji-plugin/src/win32/core/CSecureJNIEnv.cpp
CSecureJNIEnv::GetStaticField().  The problem is that
m_env->GetVersion() is equal to JNI_VERSION_1_1, which is 0x00010001.  But
why is m_env's version 0x00010001, and not 0x00010002?  Where does m_env's
value come from?  The following stack trace is from Mozilla on NT with the JRE
plugin for Java 1.3.

CJavaJNI::GetJNIEnv() line 140
CJavaVMService::GetJNIEnv(CJavaVMService * const 0x00415d84) line 132 + 15 bytes
CJavaPluginFactory::CreateSecureEnv(CJavaPluginFactory * const 0x003755a8,
JNIEnv_ * 0x018fef80, nsISecureEnv * * 0x018fefa0) line 501 + 33 bytes
ProxyJNIEnv::ProxyJNIEnv(nsIJVMPlugin * 0x003755a8, nsISecureEnv * 0x00000000)
line 1502 + 23 bytes
CreateProxyJNI(nsIJVMPlugin * 0x003755a8, nsISecureEnv * 0x00000000) line 1515
+ 35 bytes
JVM_GetJNIEnv() line 288 + 11 bytes
create_java_vm_impl(SystemJavaVM * * 0x0012fbd0, JNIEnv_ * * 0x0012fbc8, void *
0x00000000) line 433 + 5 bytes
JSJ_ConnectToJavaVM(SystemJavaVM * 0x00000000, void * 0x00000000) line 476 + 21
bytes
nsJVMManager::MaybeStartupLiveConnect() line 717 + 9 bytes
nsJVMManager::StartupLiveConnect(nsJVMManager * const 0x018fd6f8, JSRuntime *
0x00d4c6c0, int & 0) line 150 + 11 bytes
nsJSEnvironment::nsJSEnvironment() line 451 + 22 bytes
nsJSEnvironment::GetScriptingEnvironment() line 432 + 27 bytes
NS_CreateScriptContext(nsIScriptGlobalObject * 0x018fdb54, nsIScriptContext * *
0x00b07b80) line 474 + 5 bytes
nsWebShell::CreateScriptEnvironment() line 3164 + 20 bytes
nsWebShell::GetScriptGlobalObject(nsWebShell * const 0x00b07b60,
nsIScriptGlobalObject * * 0x0012fce8) line 3192 + 11 bytes
DocumentViewerImpl::Init(DocumentViewerImpl * const 0x018fa660, void *
0x000304ca, nsIDeviceContext * 0x00b04f80, nsIPref * 0x00a9a150, const nsRect &
{...}, nsScrollPreference nsScrollPreference_kAuto) line 374 + 16 bytes
nsWebShell::Embed(nsWebShell * const 0x00b07b50, nsIContentViewer * 0x018fa660,
const char * 0x00b07320, nsISupports * 0x00000000) line 942 + 69 bytes
nsDocumentBindInfo::OnStartRequest(nsDocumentBindInfo * const 0x00b00450,
nsIChannel * 0x018af480, nsISupports * 0x00000000) line 1953 + 36 bytes
nsOnStartRequestEvent::HandleEvent(nsOnStartRequestEvent * const 0x01887630)
line 212
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x01887634) line 149 + 12 bytes
PL_HandleEvent(PLEvent * 0x01887634) line 509 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00b25ce0) line 470 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x007c033c, unsigned int 49420, unsigned int 0,
long 11689184) line 932 + 9 bytes
USER32! 77e71250()

So apparently the JNIEnv that comes back from the call to
CJavaJNI::GetJNIEnv() has a version of JNI_VERSION_1_1.  How can this
be?  Does the call

if (m_vm->AttachCurrentThread((void **) &env, NULL) != 0)

somehow attach to a bogus thread?

In any case, this is the reason for bug 5369.
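
For readers unfamiliar with the version values above: JNI's GetVersion() packs the major version into the high 16 bits and the minor version into the low 16 bits, so JNI_VERSION_1_1 is 0x00010001 and JNI_VERSION_1_2 is 0x00010002. The sketch below decodes them. `JniVersion` is a hypothetical helper, and why GetStaticField() rejects a 1.1 environment is not stated in this bug.

```java
/** Decodes the 32-bit value returned by JNI's GetVersion(). */
final class JniVersion {
    static final int JNI_VERSION_1_1 = 0x00010001;
    static final int JNI_VERSION_1_2 = 0x00010002;

    /** Major version: the high 16 bits. */
    static int major(int version) {
        return version >>> 16;
    }

    /** Minor version: the low 16 bits. */
    static int minor(int version) {
        return version & 0xFFFF;
    }

    /** True if the reported version is at least the required one. */
    static boolean atLeast(int version, int required) {
        if (major(version) != major(required)) {
            return major(version) > major(required);
        }
        return minor(version) >= minor(required);
    }

    static String describe(int version) {
        return major(version) + "." + minor(version);
    }

    public static void main(String[] args) {
        System.out.println("reported: " + describe(JNI_VERSION_1_1));
        System.out.println("meets 1.2: " + atLeast(JNI_VERSION_1_1, JNI_VERSION_1_2));
    }
}
```

With the value seen in the debugger this prints `reported: 1.1` and `meets 1.2: false`.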
This has been confirmed as a JRE bug and has been filed into bugtraq, Sun's bug
system.  I'm told the problem is in the hotspot vm.  The workaround is to cause
the plugin to use the "classic vm".  Here is a workaround:

try specifying the classic VM in the plug-in's control
panel.  If that doesn't work, delete the JRE's jre\bin\hotspot directory
to force use of classic.

Could someone please test this workaround and let me know if it really works?
I couldn't see how to specify the classic VM in the plug-in control panel, so
I deleted the hotspot directory as you suggested. Apprunner came up fine after
that so I guess you're on to something here.
In the meantime, you can copy the "classic" VM from
%JDKHOME%/jre/1.3/bin/classic (copy the whole contents of the directory to
hotspot),
and set a "-classic" flag in the runtime arguments in the Java Plug-in's
Control Panel.
*** Bug 12582 has been marked as a duplicate of this bug. ***
Component: Plug-ins → OJI
Given this is a confirmed JRE bug, I'm re-assigning it. Ha, free at last!
Assignee: rogerl → drapeau
Status: ASSIGNED → NEW
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → INVALID
Since the bug is in the Sun JRE using the HotSpot VM and not in Mozilla code,
I'm going to close out the bug.  The bug comments mention a workaround, so
customers who wish to use the latest Java Plug-In have a way of working around
the JRE bug until it is fixed by Sun.
Status: RESOLVED → VERIFIED
Status: VERIFIED → REOPENED
Resolution: INVALID → ---
I wanted to reopen this bug and was just waiting for some information. I was
planning to do it tomorrow or so, and I really didn't know that
"edburns@acm.org" had verified this one.
I want this bug open for several reasons, and I'm going to keep it open while I
collect and verify the other information.
I'm clearing the invalid resolution too.
After reviewing the bug log -- which is quite lengthy -- I believe this bug
should probably stay open for just a bit longer. Let me explain why:

Is there anything we, as Netscape, can do to alleviate the crash issue? I
noticed that a workaround has been noted; can we incorporate this workaround
anywhere so the user does not have to do it? Our customers may not be
sophisticated enough to apply such a workaround. If, after brainstorming, it is
determined that we cannot incorporate the workaround into the code somehow,
could we possibly write a detailed, step-by-step process that even my mother
could follow? The simpler we make this now, the fewer calls tech support will
get down the road.

You guys are creative and extremely bright -- can you come up with a way where
we can incorporate the work around so the customer doesn't have to be exposed to
it? And when the offending code is fixed, the customer won't even know that
there was an issue. Any ideas?
Thanks Beth for your comments.
The reasons why we want this bug open include: 1] Documentation. If the bug
is closed, it won't be documented. 2] We are providing this plugin
"npjava32.dll" with the installation, so an end user is not going to
understand why the application is crashing. For computer-literate people it's
different. But what about people who don't even know where these plugins are
located or what their purpose is?
I still believe that we need to come up with some solution where the end user
will not be confused.
As Beth mentioned, if we could come up with something so that the customer
won't even know there is an issue, that would be really great.
The reason I feel there should be some solution is that it's a CRASHER. It
CRASHES the application on startup.
Whiteboard: Very platform specific. Windows-95, Service Pack: C → Crash due to JRE bug in Sun, workaround available, but not checked in
A lot of people are not getting this and still have this crash.  We know
that Java will not be part of base install and that this is not a bug.  Folks
just need to remove this file.  But this continues to show up consistently on
the top ten Talkback crashers list.  Is there any way to detect the old plugin
and prompt the user to remove it?  Any ideas?
Ed Burns supplied a simple workaround that essentially disables LiveConnect.  We
could live with losing that feature for a beta, but eventually we have to
reenable it, and anyone that failed to update their Java plugins would still
have the crash.

Instead, we might consider adding code to OJI that detects the buggy beta
version of the JRE and inhibits LiveConnect startup.  (Or, if a repaired version
of the JRE is available for download, we should just disable Java in the browser
altogether, so as to force users to update.)
Whiteboard: Crash due to JRE bug in Sun, workaround available, but not checked in → Crash due to JRE bug in Sun, workaround available, but not checked in, should be complete by 10/01/1999
leger@netscape.com  09/24/99 14:59:
> A lot of people are not getting this and still have this crash.  We know
> that Java will not be part of base install and that this is not a bug.  Folks
> just need to remove this file.  But this continues to show up consistently on

"Are not getting this"?  Do you mean "are not seeing this bug manifest itself"?

"Folks just need to remove this file"?  What file?

Additional Comments From fur@netscape.com  09/24/99 15:32:

> Ed Burns supplied a simple workaround that essentially disables LiveConnect.

Disables LiveConnect IF the VM doesn't have the fix.

Please note that this bug is only manifesting itself with the beta 1 JDK1.3
VM.  I suggest putting something in the release notes.  Something like: "Don't
use JDK1.3 Beta 1 with mozilla."  Or: "If you really want to use JDK1.3 beta,
[insert the text for the user workaround described above]."

> Instead, we might consider adding code to OJI that detects the buggy beta
> version of the JRE and inhibits LiveConnect startup.  (Or, if a repaired
> version of the JRE is available for download, we should just disable Java in
> the browser altogether, so as to force users to update.)

I need more evidence that many people will be experiencing this bug before I
implement a special-case fix such as you describe.  In any case, there appears
to be no way to discover if the vm has the bug or not from an external entity.
The bug is way down in the plugin's bowels.

Ed
>> Ed Burns supplied a simple workaround that essentially disables LiveConnect.
> Disables LiveConnect IF the VM doesn't have the fix.

I don't think this patch does the trick because it tests whether or not static
field access is broken by attempting to fetch the value of a static field.
Doesn't that cause the crash on release builds that we are trying to prevent ?
(It apparently does not cause a crash on debug builds.)

As an aside, your patch cripples LiveConnect whether or not the correct JVM is
installed, i.e. LiveConnect will work normally, except when a Java constructor
is called from JS, in which case a crash will ensue.  (This is because the patch
causes jlVoid_TYPE to remain uninitialized in all cases.)
Using the Oct 1 build, I get a crash on install.
According to talkback report: 14018513
the top of the stack is here: JPINS32.DLL + 0xd25e (0x013cd25e)

My machine is a Win95 laptop that has never had seamonkey w/Java installed on it
before. Today I checked the Java option and experienced a crash on install and
every time I startup.

Recommend we update java binaries or gray out the check box option for java.
Adding cathleen to the cc: line per chofmann.
Full Stack Trace:

   JPINS32.DLL + 0xd25e (0x013cd25e)
   ProxyJNIEnv::ProxyJNIEnv
      [d:\builds\seamonkey\mozilla\modules\oji\src\ProxyJNI.cpp, line 1503]
   CreateProxyJNI
      [d:\builds\seamonkey\mozilla\modules\oji\src\ProxyJNI.cpp, line 1516]
   JVM_GetJNIEnv
      [d:\builds\seamonkey\mozilla\modules\oji\src\jvmmgr.cpp, line 290]
   create_java_vm_impl
      [d:\builds\seamonkey\mozilla\modules\oji\src\lcglue.cpp, line 433]
   JSJ_ConnectToJavaVM
      [d:\builds\seamonkey\mozilla\js\src\liveconnect\jsj.c, line 456]
   nsJVMManager::MaybeStartupLiveConnect
      [d:\builds\seamonkey\mozilla\modules\oji\src\nsJVMManager.cpp, line 718]
   nsJVMManager::StartupLiveConnect
      [d:\builds\seamonkey\mozilla\modules\oji\src\nsJVMManager.h, line 125]
   nsJSEnvironment::nsJSEnvironment
      [d:\builds\seamonkey\mozilla\dom\src\base\nsJSEnvironment.cpp, line 532]
   nsJSEnvironment::GetScriptingEnvironment
      [d:\builds\seamonkey\mozilla\dom\src\base\nsJSEnvironment.cpp, line 507]
   NS_CreateScriptContext
      [d:\builds\seamonkey\mozilla\dom\src\base\nsJSEnvironment.cpp, line 559]
   nsWebShell::CreateScriptEnvironment
      [d:\builds\seamonkey\mozilla\webshell\src\nsWebShell.cpp, line 3203]
   nsWebShell::GetScriptGlobalObject
      [d:\builds\seamonkey\mozilla\webshell\src\nsWebShell.cpp, line 3235]
   DocumentViewerImpl::Init
      [d:\builds\seamonkey\mozilla\layout\base\src\nsDocumentViewer.cpp, line 355]
   USER32.DLL + 0x4624 (0xbff64624)
   0xa7f000f8
Target Milestone: M10 → M11
we are closing down m10.  moving this to m11.  lets try for early m11.
1st, I like Chris Saito's recommendation that we get new JDK binaries,
especially since the current stuff is Beta (it's not an FCS's Java VM
that's causing the problem, it's a Beta release).  The HotSpot VM bug
has been fixed, and I'm awaiting word from the JDK group here at Sun
as to when they plan on releasing the latest build.  As you know, our (Sun's)
idea of release schedules is much less quick than Mozilla's, but the Sun JDK
group is planning an "early access" release of JDK 1.3
which has the fix in it.  I've requested that Netscape folks
be granted access.

Status of this request: Netscape folks will be given early access to the latest
JDK 1.3 with the HotSpot VM bug fix; I will get the bundle from them, and will
ftp to somebody at Netscape.  Also, you can register for Early Access by
becoming a member of the Java Developer Connection:

http://developer.java.sun.com/

which is free.

Then, fill out this form to ask for the Early Access JDK release:

http://developer.java.sun.com/developer/surveys/restricted/early_access_survey.html


Meanwhile, we're in parallel working on the "right" way to make
Mozilla more bulletproof, so that even if somebody has an errant VM,
Mozilla will not crash, and LiveConnect will continue to initialize
correctly.  I have Jeff Dyer working on this (the JavaScript group
knows him well), and I expect to hear status from him in the next 48
hours or so.
*** Bug 14015 has been marked as a duplicate of this bug. ***
*** Bug 15955 has been marked as a duplicate of this bug. ***
The patch above has the effect of posting an error and disabling
LiveConnect when there is a problem initializing the reflection of Java classes.
The dubious part of the patch is that it compiles out the call to
JSJ_DisconnectFromJavaVM that is made when reflection initialization fails.
It appears that this function is trying to undo things that have not yet been
done (in this case), but I need somebody more familiar with this code to say how
it should be changed. As it is, the vm is not being freed, nor is it put on the
linked list of vms to be freed later on. The other change being made by this
patch is to make the pre-condition guards in the nsCLiveconnect methods more
robust. This fix will guard LC against other kinds of vm failures as well as
the kestrel bug.
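
For readers following along, the behaviour described above (post an error,
disable LiveConnect, and have later entry points check a pre-condition) can be
sketched roughly as follows. This is a minimal illustration, not the patch
itself: LCState, LiveConnectGuard, Startup and Usable are hypothetical names
that do not come from the Mozilla tree, and reflectOk merely stands in for
whatever the real class-reflection step reports.

```cpp
#include <cstdio>

// Hypothetical model of LiveConnect's lifecycle; not taken from the Mozilla source.
enum class LCState { Uninitialized, Ready, Disabled };

struct LiveConnectGuard {
    LCState state = LCState::Uninitialized;

    // reflectOk is an assumed stand-in for "Java class reflection succeeded".
    bool Startup(bool reflectOk) {
        // Once a decision has been made, keep it: a failed startup stays disabled.
        if (state != LCState::Uninitialized) return state == LCState::Ready;
        if (!reflectOk) {
            std::fprintf(stderr,
                         "LiveConnect: Java class reflection failed; disabling\n");
            state = LCState::Disabled;
            return false;
        }
        state = LCState::Ready;
        return true;
    }

    // Pre-condition guard that every public entry point would consult first.
    bool Usable() const { return state == LCState::Ready; }
};
```

The point of latching the Disabled state is that a broken vm produces one
error message and a non-functional LiveConnect, rather than repeated
initialization attempts.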
Jeff, have you tested your patches on a *release* build ?  The reason I ask is
that we seem to have two different problems: The first one is that assertion
failures are triggered when LiveConnect fails to start the JVM.  Obviously,
these assertions only occur in the debug build and, last I checked, they can be
safely skipped.  The release build exhibits a totally different problem and
crashes while attempting to create the JVM.  (See Jan Leger's stack trace.)  I
don't *think* your patch could fix the release crash bug, but I have not tried
it myself.
When I run the release build I do not see any of the problems described in this
bug report.
Sorry to make you go through this, Jeff, but I'm really confused:  I see how
commenting out the call to JSJ_DisconnectFromJavaVM() will kill the assertions
we were seeing on the debug builds, but I don't see how it accounts for the
stack trace we have seen in the release build crash reports.  You're saying that
when you leave this one call in, there is a crash with the stack trace reported
above, and when you comment out the call, there is no crash.  Is that right ?
How can you explain this ?
>Sorry to make you go through this, Jeff, but I'm really confused:

And now I am too.

>You're saying that when you leave this one call in,
>there is a crash with the stack trace reported above,
>and when you comment out the call, there is no crash.
>Is that right ?

I'm not saying that.

>How can you explain this ?

There are two completely independent bugs being discussed in this report. I
fixed the one that has to do with reflecting java.lang.Void.TYPE, but not the
one that has to do with creating the vm. Since my patched release build worked
fine, I did not recognize the difference.

I'll keep at it.
So here is the story as best as I can tell it.

When using kestrel beta 1 with hotspot enabled, a crash occurs on line 514 of
CJavaJNI.cpp. Obviously, either the address is bad or there is a bug in the code
that gets called at that point.

506:/*
507: * Creates a Java VM for the specified runtime library handle.
508: */
509:jint
510:CJavaJNI::JNI_CreateJavaVM(JavaVM **vmp, JNIEnv **envp, void *vmargs)
511:{
512:    JVM_CREATE proc;
513:    proc = (JVM_CREATE) GetProcAddress(m_hMod, "JNI_CreateJavaVM");
514:    return proc != 0 ? (*proc)(vmp, envp, vmargs) : -1;
515:}

The call stack at this point is given below. One requirement for this patch is
that it be implemented fully on the mozilla side (perhaps in oji). The last oji
call on the stack is ProxyJNIEnv::ProxyJNIEnv. At the point that this call is made,
the only reference we have to the Java plugin is nsIJVMPlugin.

CJavaJNI::JNI_CreateJavaVM(JavaVM_ * * 0x02df1a9c, JNIEnv_ * * 0x0012e398, void
* 0x0012a248) line 514
CJavaJNI::StartJavaVirtualMachine() line 809 + 26 bytes
CJavaJNI::Initialize(IResourceBundle * 0x02df1a20, IJavaVMCallBack * 0x02c719e0)
line 85 + 8 bytes
CActivatorJNI::Initialize(HINSTANCE__ * 0x02d80000, IJavaVMCallBack *
0x02c719e0) line 90 + 19 bytes
CJavaVMService::Initialize(CJavaVMService * const 0x02df1a84, IJavaVMCallBack *
0x02c719e0) line 71 + 32 bytes
CJavaPluginApp::StartJVM() line 892 + 26 bytes
CJavaPluginFactory::CreateSecureEnv(CJavaPluginFactory * const 0x02c67158,
JNIEnv_ * 0x02302cb0, nsISecureEnv * * 0x02302cd0) line 484 + 10 bytes
ProxyJNIEnv::ProxyJNIEnv(nsIJVMPlugin * 0x02c67158, nsISecureEnv * 0x00000000)
line 1502 + 23 bytes
CreateProxyJNI(nsIJVMPlugin * 0x02c67158, nsISecureEnv * 0x00000000) line 1515 +
35 bytes
JVM_GetJNIEnv() line 288 + 11 bytes
create_java_vm_impl(SystemJavaVM * * 0x0012fac4, JNIEnv_ * * 0x0012fabc, void *
0x00000000) line 436 + 5 bytes
JSJ_ConnectToJavaVM(SystemJavaVM * 0x00000000, void * 0x00000000) line 440 + 21
bytes

My question is can anyone think of a creative way to identify the vm from
outside the plugin, given the available plugin interfaces? If not, then the fix
must go into the plug-in, which gets us no closer than the jdk fix that
is forthcoming.

I see no way to patch mozilla to avoid this crash.
Any suggestions?
> My question is can anyone think of a creative way to
> identify the vm from outside the plugin, given the available
> plugin interfaces? If not, then the fix must go into the
> plug-in, which gets us no closer than the jdk fix that is
> forthcoming.

I'm guessing that there's no way to cure this problem within
Mozilla without adding checks that operate outside the
bounds of the OJI interfaces.  Is there anything distinctive
about the 1.3 beta plugin DLLs ?  Can we locate the files in
the plugins directory and checksum them to detect the
problematic plugin just prior to LC initialization ?
*** Bug 15924 has been marked as a duplicate of this bug. ***
Following up on Scott's last comment, my belief is that since it's such a
difficult problem to detect given our current constraints, it's best to wait for
a Java VM fix.

I've sent email to Mozilla staff with instructions on getting an Early Access
release of the JVM 1.3, which fixes the problem.  This VM will be widely
available soon, but for the purposes of testing and not holding up internal
development, Netscape folks can get it now.  Talk to Chris Hofmann for details.
Can someone please review the 10/13/99 patch for acceptability for checkin?

Thanks,

Ed
Ed, it sounds like you need to talk to Jeff Dyer.  I reviewed the 10/13 patch on
10/13 and Jeff agreed with me that, although the patch will quell the assertion
failures in the debug build, it will not prevent the crash in the release build.
The only known solutions to this bug (really two bugs) are to replace the
JVM/plugin with the new version or to disable LiveConnect and/or Java until a
new version can be made available.

Incidentally, this whole incident brings to mind a deficiency in JNI that there
is no easy way to determine the version/vendor of the JVM so as to work around
bugs such as this one.
While the patch I posted on 10/13 does not fix the crash in the release build of
kestrel beta1, it does fix the other problem described by this bug report. My
guess is that it will avoid crashers when other (pre-release) vms start to
become available as plugins. It is a low risk fix that only adds code that gets
executed in the negative case. For these reasons I suggest applying it as soon
as possible.
I still don't understand how the patch does anything but get rid of assertion
failures and add some sanity-checking, though both are worthy goals and we
should check the changes in for that reason alone.  Jeff, my only comment on the
patch is that checking for obj==NULL in the JSObject instance methods seems
excessively paranoid.  It shouldn't be possible to call an instance method
through a null instance pointer.
>Jeff, my only comment on the patch is that checking for obj==NULL in the
>JSObject instance methods seems excessively paranoid.  It shouldn't be possible
>to call an instance method through a null instance pointer.

Since obj is just another argument to a method of the public liveconnect
interface, it can be anything including NULL. It is the responsibility of the
plugin to supply it. In the case that Java class reflection fails and
LiveConnect is not initialized but still being called from the Applet, the Sun
plugin was passing NULL for the obj argument. Maybe this will never happen
again, but since Moz does not have control over how it is called, it seems like
a good idea to guard against it.
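
To make the distinction concrete, here is a minimal sketch of this kind of
guard. It is not the code from the patch: Status, ScriptObject and GetMember
are hypothetical names, and the real nsCLiveconnect methods have different
signatures.

```cpp
// Hypothetical status codes and object type; not the real LiveConnect interface.
enum class Status { Ok, NotInitialized, NullArgument };

struct ScriptObject { int member; };

// obj arrives as an ordinary argument supplied by the plugin, so unlike
// `this` in an instance method it can legitimately be null.
Status GetMember(bool liveConnectReady, const ScriptObject* obj, int* out) {
    if (!liveConnectReady) return Status::NotInitialized;  // LC never came up
    if (obj == nullptr || out == nullptr) return Status::NullArgument;
    *out = obj->member;
    return Status::Ok;
}
```

The checks only cost anything in the failure case, which matches the
"low risk, negative case only" argument made for the patch.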

By the way you are right about the asserts in JSJ_DisconnectFromJavaVM. They are
benign. It would be better for that function to be called as it is in the
original code.
OK, I guess I don't understand how it is the plugin ends up calling the
LiveConnect methods with obj==null, but since it actually happens, maybe it
isn't so paranoid after all :)

Feel free to list me as a reviewer for the check-in.
starting to look at wrapping up m11.  did we get this worked out?
Whiteboard: Crash due to JRE bug in Sun, workaround available, but not checked in, should be complete by 10/01/1999 → Crash due to JRE bug in Sun, no workaround available; Must add release notes to download new JVM
Target Milestone: M11 → M12
m12
Target Milestone: M12 → M11
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
drapeau has checked in for m11
Probably seems kind of stupid, but what is the actual fix for this bug (if
there is one)?
I can verify that it does occur on a Win95 machine running JRE 1.3 beta (latest
public release version) and the Java Plug-In.  It did not occur before
installation of JRE 1.3 beta.  The crash on startup brings up a Win31-style
"Program Error - Closing Program" dialog on the initial run of the Nov. 12
build.  Starting a second time brings a general GPF crash (the console showed
the same file error, in the line "Could not find -Xrun library: jdwp.dll"); an
attempt to revive with Norton CrashGuard 2000 yielded the aforementioned Win31
crash.
Leaving status alone until a response is received.
The bug is in the Java plugin and we were never able to come up with a fix to
Mozilla that would work around the bug, short of disabling Java/LiveConnect
completely in the browser.  The workaround is to disable HotSpot as described in
the release notes.  The "fix" is to download the latest version of Java, which
no longer has the bug, when it's posted to the Sun web site.
Whiteboard: Crash due to JRE bug in Sun, no workaround available; Must add release notes to download new JVM → Crash due to JRE bug in Sun; workaround available; Must add release notes to download new JVM
Status: RESOLVED → REOPENED
George, Jeff,
Rob is seeing the same failure in other JVM's.  I've added him to the cc list so
he can add detail to my assertion.
= C =
I look forward to Rob's description, but first just one general comment. When a
vm crashes on startup, it's probably not a Mozilla bug. This was actually the
case with the JDK 1.3 Beta bug described above. We chose to track it as a
mozilla bug because of the importance of having a first working java plugin.
From here on, I suggest reporting such bugs (including JDK bugs) to the plugin
developer.
Resolution: FIXED → ---
Clearing FIXED resolution due to reopen.
Target Milestone: M11 → M12
Moving from M11 due to reopen...moving to M12 for eval on a fix for that
milestone.
Target Milestone: M12 → M13
Per paw, would not hold this for M12.  paw, please provide an M12 release note
if needed at:
http://bugzilla.mozilla.org/show_bug.cgi?id=17788
Thanks...moving to M13
Status: REOPENED → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
Sorry to take so long to chime in.  What I AM seeing is the same failure in
jsj.c (loading java.lang.Void), using the IBM JDK 1.1.8 on Linux.
What I AM NOT seeing is a crash in Mozilla as a result of this, and with
revision 1.17 of jsj.c, lcshell is crash free as well (it just doesn't work.)

I am still searching for a JVM on Linux that works correctly with LiveConnect.
Suggestions would be appreciated.

Changing back to FIXED as all I've added is a new "Broken JVM" to the list.  The
existing fix is still valid.
Marking verified.
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard