Closed Bug 339947 Opened 15 years ago Closed 15 years ago

Java XPCOM proxies can get used (resurrected) after having been garbage collected causing VM crash

Categories

(Core Graveyard :: Java to XPCOM Bridge, defect)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ttudor, Assigned: jhpedemonte)

Details

(Keywords: crash, fixed1.8.1)

Attachments

(2 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3
Build Identifier: 

This bug is quite subtle and (very) difficult to reproduce without a small hack in XPCOMJavaProxy.java. The description is rather long and relates to several intricacies of the java garbage collection. 

EXAMPLE AND BUG DESCRIPTION :

Consider the following Java code (only the relevant pieces are shown)

public class AbstractMozillaBrowser {
  private nsIWebBrowser webBrowser;
...	    
  void activate() {
    nsIWebBrowserFocus webBrowserFocus = 
      (nsIWebBrowserFocus)webBrowser.queryInterface(
        nsIWebBrowserFocus.NS_IWEBBROWSERFOCUS_IID
      );
    webBrowserFocus.activate();
  }

  void deactivate() {
    nsIWebBrowserFocus webBrowserFocus = 
      (nsIWebBrowserFocus)webBrowser.queryInterface(
        nsIWebBrowserFocus.NS_IWEBBROWSERFOCUS_IID
      );
    webBrowserFocus.deactivate();
  }
...
}




Reproducible: Sometimes

Steps to Reproduce:
Step 1.

When "activate()" is called a new webBrowserFocus java XPCOM proxy is created and put in the global NativeToJavaProxyMap (see GetNewOrUsedJavaProxy() in nsJavaXPCOMBindingUtils.cpp and CreateJavaProxy() in nsJavaWrapper.cpp). To be precise, a JNI weak reference to the newly created webBrowserFocus proxy is put in the map.

When "activate()" returns webBrowserFocus goes out of scope. It is now in an "unreachable" state (see java.lang.ref javadoc). The JNI weak reference BEING WEAKER THAN a java weak reference, the object does get UNREACHABLE and NOT weakly reachable - see http://java.sun.com/j2se/1.4.2/docs/guide/jni/jni-12.html#weakrefs

Being unreachable makes our webBrowserFocus proxy a good candidate for garbage collection. It's life is hereon in the hands of the garbage collector. Let's suppose that the garbage collector decides to put the object in a "finalizable" state (see http://java.sun.com/docs/books/jls/third_edition/html/execution.html#12.6 for a description of unfinalized, finalizable and finalized states).

Step 2.

deactivate() gets called. Our step 1 webBrowserFocus is still in the NativeToJavaProxyMap (the gc hasn't finalized it yet). When we try to NativeToJavaProxyMap::Find it, we have a jni weak reference in there. As stated by the the jni spec (http://java.sun.com/j2se/1.4.2/docs/guide/jni/jni-12.html#weakrefs again) the "weak global reference will not become functionally equivalent to NULL until after the completion of the the finalizer for the referenced object". So we will be able to get a new strong local reference to the old proxy.

Bug :

HOWEVER, having a new (strong) reference to a finalizable object DOES NOT stop the gc from finalizing it. The object will not be released, but it will be finalized. This is a rather unknown and unused feature(?) of the java garbage collector known as object resurrection. 

In our case finalizing the proxy while still using it causes great and unrecovarable damage. The underlying XPCOM object connection is lost (see finalizeProxy() in nsJavaWrapper.cpp). In the worst case the finalization takes place while the main thread is processing the deactivate() call, thinking that the proxy retrieved from the map is valid.

Event if we could find a way to cope with finalization happenning in the middle of another jni call, object resurrection is still a very bad thing for us. The vm specification gurantees that the finalizer is called exactly ONCE for every object. So our resurrected proxy would never be finalized again (even when its memory is actually released) and our related XPCOM object would inevitably leak.

HACK FOR HITTING THIS BUG :

In XPCOMJavaProxy.java, in the invoke() method, just before delegating to XPCOM, call the garbage collector. That is, around line 140 change :

  // If not already handled, pass method calls to XPCOM object.
  return callXPCOMMethod(aProxy, methodName, aParams);

to :

  System.gc();
  // If not already handled, pass method calls to XPCOM object.
  return callXPCOMMethod(aProxy, methodName, aParams);




Proposed solution (patch coming) :

Instead of keeping a jni weak reference in the NativeToJavaProxyMap we could keep a real java.lang.ref.WeakReference (a jni global reference to a java.lang.ref.WeakReference to be more precise).

In the NativeToJavaProxyMap::Find method we can use the get() method of the WeakReference class to get a strong reference to the proxy object. Unlike a jni weak reference, a java.lang.ref.WeakReference is automatically cleared by the gc before finalization of the referent (see java.lang.ref API). So we can either create a strong reference and no finalization will ever happen before releasing it (this is NOT object resurrection), or get NULL and go on as if we didn't find it.
Keywords: crash
Attached patch patch (obsolete) — Splinter Review
This patches uses java.lang.ref.WeakReference (global jni refs to WeakReferences) instead of jni weak references in NativeToJavaProxyMap.
Good catch.  Even though I cannot recreate the crash (using System.gc()), I understand what the issue is.  The relevant section from the listed URL is:

"The weak global reference is weaker than Java's internal references to objects requiring finalization. A weak global reference will not become functionally equivalent to NULL until after the completion of the the finalizer for the referenced object, if present."

Given that, I think we would also need to update the jweak usage in nsJavaXPTCStub and nsJavaXPTCStubWeakRef.  I'll test your patch and change those files also.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I think that (but I might be wrong) such an approach is overkill for XPTCStubs. 

The problem appears only with objects that require finalization and that can be resurrected through a jni weak reference. Example :

1. (Java)    Proxy P gets unreachable.
2. (Java GC) P is scheduled for finalization (gets finalizable).
3. (JNI)     P is resurrected through a jni weak reference (possible, P hasn't been finalized)
4. (Java GC) P is finalized, but not released (it has just been resurrected)
5. (JNI)     Resurrected P is used as if P had never been finalized
6. (JNI)     Crash

For a Java XPTCStub, S, it's java object, JOBJ, can only be released when all java references to JOBJ have been lost AND all xpcom references to its XPCOM object have been cleared (we might still have some weak xpcom references). 

In such a case the worst scenario I can see is :

1. (Java)    JOBJ gets unreachable.
2. (Java GC) JOBJ is scheduled for finalization (gets finalizable). 
3. (JNI)     JOBJ is resurrected through one of the jni weak references pointing at OBJ (we have one in the stub S itself and one in each JavaXPTCStubWeakReference pointing to S).
4. (JAVA GC) JOBJ is finalized but not released. 
5. (JNI)     JOBJ continues to be used. It's finalize() method will never be called again.

This scenario poses a problem if and only if JOBJ's finalize() method exists 
and if JOBJ wasn't meant to be resurrected (in the first scenario, proxy P actually HAS a finalize() method and IS NOT meant to be resurrected). 

Chances are this is quite rare a situation - JOBJ has great chances to be just a simple xpcom interface implementation (like nsIObserver, nsIWebBrowserListener, etc.). 
We can get along with the problem by simply mentioning in the documentation  that every java xpcom implementation must be designed to be resurrectable. If we want to stay simple, we can just advice against relying upon finalize() in java xpcom interface implementations.

What do you think ?
(In reply to comment #3)
> I think that (but I might be wrong) such an approach is overkill for XPTCStubs.

Does using WeakReference instead of jweak add much overhead?  I'd imagine they are quite similar.  And although using WeakReference might be more involved, I'd prefer to be correct and consistent, rather than forcing implementations to be resurrectable or avoid finalize().

As an aside, I've read on java.sun.com that using finalize() for clean up purposes is discouraged.  Apparently, a JVM is not required to call finalize() when running the GC.  So for JavaXPCOM, we'd need to move to using PhantomReferences.  But that's another bug, which I need to open...
Attached patch remove all jweaks (obsolete) — Splinter Review
This is what I had in mind.  Tudor, what do you think?
Works for me. In fact you're right, it doesn't add much overhead and it's much more clear. And avoiding jni weak references can only be a good thing.

Just a two minor typos in nsJavaXPTCStub::AddRefInternal 
  
  jobject referent = env->CallObjectMethod(mJavaWeakRef, getReferentMID);
  if (!env->IsSameObject(referentObj, NULL))  // referentObj should be referent

and nsJavaXPTCStubWeakRef::QueryReferent

  jobject javaObject = GetJNIEnv()->CallObjectMethod(mWeakRef, getReferentMID);
  if (env->IsSameObject(referentObj, NULL))
     return NS_ERROR_NULL_POINTER;
  // env should be declared and referentObj should be javaObject
 
When applied with patches for bug 337675 and bug 338110 a fuzz factor of 6 is needed

  patch --fuzz=6 <patch

From what I've read in the java lang spec the finalize() method must be run before the storage of an object is reclaimed. Could you point me to the page/message where you've found that the gc is not required to run finalize()? (Some code in the project we're on relies on finalize() and I'd like to fully understand what happens).

Anyway, if we can rely on finalize(), I think it's simpler to do so instead of using PhantomReferences that we would have to fully manage (create a ref queue, listen to it in another thread or check it from time to time on the main thread, etc.)
Previous patch without typos.
Attachment #224180 - Attachment is obsolete: true
Attachment #224445 - Attachment is obsolete: true
Attachment #224528 - Flags: review+
Attachment #224528 - Flags: approval-branch-1.8.1+
(In reply to comment #6)
> Could you point me to the
> page/message where you've found that the gc is not required to run finalize()?

Here are some articles/posts that I have saved on this subject, although there are more mentions of this available:
http://java.sun.com/developer/TechTips/2000/tt0124.html#tip1
http://forum.java.sun.com/thread.jspa?forumID=256&threadID=273275
Checked in to trunk and 1.8 branch.  ->FIXED
Status: NEW → RESOLVED
Closed: 15 years ago
Keywords: fixed1.8.1
Resolution: --- → FIXED
I just struggled with this bug for 2 days, and then found this bug fixed :(.  I just wrote a test program which almost always reproduces this bug and post here FYI.
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.