Closed Bug 87389 Opened 23 years ago Closed 23 years ago

XPConnect proto's not cleared during document transition

Categories

(Core :: XPConnect, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: jst, Assigned: dbradley)

Details

(Whiteboard: [PDT+] new patch?)

Attachments

(3 files)

XPConnect must *not* re-use the XPConnect prototypes when going from document to
document, this lets one page override methods by modifying the prototype on any
DOM object/class and this change remains in all other documents that are loaded
into the same window, this is a must fix for rtm. There's two parts to this fix,
one part is to make nsXPConnect::InitClasses() blow away the proto's in the
current scope, and the other is to make the DOM code re-set the prototype of the
global object to a new fresh XPConnect proto. David has a fix for the XPConnect
part, I'll attach a fix for the DOM part of this.
Patch to follow, basically removes the wrapped protos from the scope's wrapped
proto map.
Status: NEW → ASSIGNED
Attaching another patch. This patch invalidates just the JS Proto Object not the
wrapper itself. The wrapper native proto's are now reconnected to a new JS
Objects when nsXPConnect::InitClassesis called. This prevents changes in proto
objects' values appearing across documents on the prototypes.

The wrapped native proto now lives until all the JS Proto's that were created
from it have been finalized via a mark and sweep system. The patch hooks into
JS's marking of the proto objects to determine when to prepare the proto's for
destroying. During the finalization of proto objects if the wrapper is not
marked it gets put on the list of dying proto's to be destroyed at the end of GC.

jst, add any additional comments you think necessary.
Need r's and sr's for this. Pay particular attention to potential leaks, in JS
as well as XPConnect. Thanks.
Whiteboard: [PDT+]
First, the quote below from email I got from jst...

"... (the last patch) does indeed fix the problem but I've seen crashes 
happen a few times when running with those changes, the crashes are 
non-trivial to reproduce, one crash I saw was in the DOM helper code 
where we tried to set the proto of a XPCWrappedNativeProto's JSObject 
and the JSObject had been deleted, the other crash I saw was due to a 
dangling XPCWrappedNativeProto reference from an XPCWrappedNative."

Someone should have already posted this problem info to this bug report (given 
that there is a posted request for code review).

Now, someone please explain why the scheme in the first patch was not good 
enough. It seems to me that simply making these shared proto objects 
non-findable ought to have been enough. I'd have cleared the map in one call 
rather than enumerating, but that's a minor issue. I also think I'd rather see 
an explicit (new) method call in nsIXPConnect to force this clearing rather than 
piggy-backing the InitClasses call. The DOM could make this call only when it 
has just called JS_ClearScope for the window.

Nevertheless, what is the problem with simply making the proto objects 
non-findable? Please explain.
I forget the exact details as to why we went from the scheme in the first patch
to the more complicated approach in the second patch. I know the first patch
introduced crashes where we ended up deleting XPCWrappedNativeProto objects to
which there were still live references through private data in JSObjects, and
there were also problems with "old" shared XPCWNP's removing the entry for their
classinfo from the map when the old proto object was finalized, even if there
were "new" proto objects still alive. At the time I didn't see an clean and easy
way to fix those problems (and there could've been other problems too), because
of that the second approach was tested, which didn't turn out to work completely
either. John, if this explanation gives you any hints on how to fix the problems
with the first or the second patch then please let us know...

I don't know this code well enough to know what the best approach is here, both
the patches in this bug do fix the problem, but they both introduce crashes so
they're not ready for prime time yet.
I don't see a need for a new method that does this, to me it makes sense to have
InitClasses do this and the DOM code only calls InitClasses either on a new
context or after we've called JS_ClearScope().
I did think of the problem of removing the wrong XPCWNP entry from the map after 
I wrote the above. I think is is fixable by the simple change to 
XPCWrappedNativeProto::JSProtoObjectFinalized...

Index: xpcwrappednativeproto.cpp
===================================================================
RCS file: /cvsroot/mozilla/js/src/xpconnect/src/xpcwrappednativeproto.cpp,v
retrieving revision 1.5
diff -u -r1.5 xpcwrappednativeproto.cpp
--- xpcwrappednativeproto.cpp   2001/06/13 02:16:29     1.5
+++ xpcwrappednativeproto.cpp   2001/07/15 10:00:13
@@ -115,7 +115,13 @@
     // Map locking is not necessary since we are running gc.

     if(IsShared())
-        GetScope()->GetWrappedNativeProtoMap()->Remove(mClassInfo);
+    {
+        // Only remove this proto from the map if it is the one in the map.
+        ClassInfo2WrappedNativeProtoMap* map =
+            GetScope()->GetWrappedNativeProtoMap();
+        if(map->Find(mClassInfo) == this)
+            map->Remove(mClassInfo);
+    }

     GetRuntime()->GetDyingWrappedNativeProtoMap()->Add(this);

I'm trying to think how any XPCWNP objects could be deleted if any JSObject's 
private data pointed to it. The XPCWNP is only added to the dying proto list 
(and later deleted) if its JSObject had been destroyed 
(XPCWrappedNativeProto::JSProtoObjectFinalized was called). This should only be 
happening for protos not being used by a live wrapper since MarkForValidWrapper 
will mark the JSObject of the XPCWNP.

The one place where I think this might be a problem is in the shutdown code... 
XPCWrappedNativeProto::SystemIsBeingShutDown needs to get called for the XPCWNP 
objects that have been removed from the classinfo key'd map but which have not 
yet been destroyed. We can do this by adding a single (per runtime) simple 
hashtable (keyed by the object's pointer). When we remove the XPCWNP from the 
classinfo key'd map we add it to this hashtable (call it the "detached proto 
table"). In XPCWrappedNativeProto::JSProtoObjectFinalized we remove the XPCWNP 
from the detached proto table. At system shutdown time we traverse the detached 
proto table and call XPCWrappedNativeProto::SystemIsBeingShutDown for each 
entry.

I don't *think* we'll need the detached proto table for anything else (except 
diagnostics), but it will be available to track these XPCWNP objects as they 
progress toward death. I don't expect there will be many or that they'll live 
long.

Maybe I'm missing something, But I think this will do the trick without adding 
much complication.

I guess I don't care if there is a separate api call or if this works 
automatically as part of InitClasses.

I'm looking closer at dbradley's second patch...

I'm not happy that there are two paths that lead to the delete of the proto: in 
WrappedNativeProtoSweeper and in DyingProtoKiller. I'm not comfortable with 
multiple JSObjects pointing to the same XPCWNP object. I *like* my original 
scheme of having the finalization of the proto's JSObject be the one and only 
thing that can trigger XPCWNP destruction. 

I think that second patch has ordering problems between the JS finalization and 
the marking system used in xpconnect. Specifically, imagine the case where a 
given XPCWNP object is referenced by two JSObjects (i.e. 
XPCWrappedNativeProto::CreateJSObject has been called twice). If one of those 
JSObjects is finalized then the XPCWNP will get moved to the dying proto list 
and will get killed later. Yet, the other JSObject is still referencing it. The 
xpconnect marking of the XPCWNP done of all the entries in the map by 
MarkAllWrappedNativesAndProtos won't help preserve the XPCWNP because it happens 
*after* all the JSObject finalization is done. I think this is a fatal flaw 
(though I'm inferring this only from inspecting the patch).

Again, I think the plan I outlined above is better.

Are there test cases that can trigger a repeatable failure of the first patch? I 
have not tried it yet, but I think the modifications I proposed will fix the 
problems I've imagined.
Do we think the new patch might be the one?
Whiteboard: [PDT+] → [PDT+] new patch?
Yes, this is the one, it's already checked in on the trunk (as of 15 minutes
ago). Once I'm done with some additional testing I'll land this patch on the
branch. So far we've run browser buster, pageload perf tests (in the interest of
loading more pages), DOM test suites, and a bunch of other testing to verify
these changes do the right thing w/o breaking things.
Changes are checked in on both the branch and the trunk, marking FIXED.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Marking Verified - 
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: