Open Bug 363383 Opened 18 years ago Updated 2 years ago

moving XBL-bound nodes between documents makes them unusable (customize toolbar dialog)

Categories

(Core :: DOM: Core & HTML, defect)

x86
Windows XP
defect

Tracking

()

REOPENED
mozilla1.9alpha8

People

(Reporter: jwkbugzilla, Assigned: peterv)

References

Details

(Keywords: regression)

Attachments

(4 files, 3 obsolete files)

adoptNode() sometimes seems to create a strange result. Steps to reproduce:

1. Take a current trunk build (20061125 is the first using adoptNode() in the customization dialog but 20061210 shows the same behavior).
2. Install Adblock Plus 0.7.2.2 from https://addons.mozilla.org/firefox/1865/ and restart.
3. Open customization dialog and try to drag the Adblock Plus (ABP) button there.

The button won't appear in the dialog, instead you get the following message in the error console:

Error: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://global/content/customizeToolbar.js :: setWrapperType :: line 452"  data: no]

Investigation shows that function wrapPaletteItem() in customizeToolbar.js uses adoptNode() on the aPaletteItem parameter - and after that requesting any property of this node will throw NS_ERROR_XPC_BAD_OP_ON_WN_PROTO. It only happens to the Adblock Plus button so I tested what makes it special - it comes out that removing disabled="true" attribute in the button's declaration fixes it (strange thing is that by the time customization dialog is opened the disabled attribute should have been removed anyways).

Replacing adoptNode() by importNode() in customizeToolbar.js fixes this trivially, however I don't think this is the desired solution. Blame shows to bug 330677, CC'ing peterv and bz.
Looks like the dragging problem described in Bug 362760. Same error I see now.
*** Bug 362760 has been marked as a duplicate of this bug. ***
Attached patch v1 (obsolete) — Splinter Review
Does this patch help?
Assignee: general → peterv
Status: NEW → ASSIGNED
Yes, it fixes the problem.
Isn't the real problem the XBL binding?  That is, reparenting a node with XBL attached to it, esp. into a different document, probably fails...  I'd like someone to try attaching XBL to HTML and seeing whether they get problems.
Flags: blocking1.9?
Comment on attachment 248196 [details] [diff] [review]
v1

We still want this hack imho, I wasn't planning on fixing bug 235640 simultaneously with implementing adoptNode. Note that bug 235640 also talks about customize toolbar, so if we don't take this then why is it not a dupe?
(I'll add a comment pointing to bug 235640)
Attachment #248196 - Flags: superreview?(bzbarsky)
Attachment #248196 - Flags: review?(bzbarsky)
Actually, maybe you're right, I remember looking into this at some point. I think the problem was that XBL inserts itself in the prototype chain, and XPConnect didn't find it's own wrapper so it couldn't reparent. Maybe we should close bug 235640 instead, and use this one to track the XBL problem then.
Summary: adoptNode() makes node unusable (customize toolbar dialog) → moving XBL-bound nodes between documents makes them unusable (customize toolbar dialog)
Comment on attachment 248196 [details] [diff] [review]
v1

Let's try fixing the real problem first.
Attachment #248196 - Flags: superreview?(bzbarsky)
Attachment #248196 - Flags: superreview-
Attachment #248196 - Flags: review?(bzbarsky)
Attachment #248196 - Flags: review-
Attached file Testcase (XBL file)
Attached file Testcase
Loading that testcase I see in the console:

Adding binding to HTML element (no implementation).
Adding binding to HTML element (with implementation).
Adding binding to XUL element.
Adopting HTML element.
Adopting bound HTML element (no implementation).
Adopting bound HTML element (with implementation).
WARNING: Moving XPConnect wrappedNative to new scope, but can't fixup __proto__: file mozilla/js/src/xpconnect/src/xpcwrappednative.cpp, line 1101
Adopting XUL elements.
WARNING: Moving XPConnect wrappedNative to new scope, but can't fixup __proto__: file mozilla/js/src/xpconnect/src/xpcwrappednative.cpp, line 1101
WARNING: Moving XPConnect wrappedNative to new scope, but can't fixup __proto__: file mozilla/js/src/xpconnect/src/xpcwrappednative.cpp, line 1101
Done adopting XUL elements.

Then:
- clicking first two buttons works (plain HTML element and bound HTML element without implementation)

- clicking the third button (bound HTML element with implementation) doesn't work and outputs:

JavaScript error: , line 0: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: https://bugzilla.mozilla.org/attachment.cgi?id=248252 :: onxblclick :: line 20"  data: no]

- clicking the two XUL buttons doesn't work and outputs:

************************************************************
* Call to xpconnect wrapped JSObject produced this error:  *
[Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://global/content/bindings/general.xml :: get_disabled :: line 0"  data: no]
************************************************************
JavaScript error: , line 0: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: data:text/xml,<html xmlns='http://www.w3.org/1999/xhtml'><body><form id='aform' /></body></html> :: oncommand :: line 1"  data: no]
JavaScript error: , line 0: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://global/content/bindings/button.xml :: get_autoCheck :: line 0"  data: no]

************************************************************
* Call to xpconnect wrapped JSObject produced this error:  *
[Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://global/content/bindings/general.xml :: get_disabled :: line 0"  data: no]
************************************************************
JavaScript error: , line 0: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://global/content/bindings/button.xml :: get_autoCheck :: line 0"  data: no]
JavaScript error: , line 0: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: https://bugzilla.mozilla.org/attachment.cgi?id=248252 :: onxblcommand :: line 27"  data: no]
Attached patch v2 (obsolete) — Splinter Review
This seems to fix the testcase, could someone try it with customizing the toolbar?
Attachment #248196 - Attachment is obsolete: true
No, that doesn't fix it. I am able to move the button into the customize window and it appears there but then I start getting exceptions from the Adblock Plus code that it adjusting the icon's status regularly:

Error: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://adblockplus/content/overlay.js :: anonymous :: line 198"  data: no]

Closing the customize dialog gives me then:

Error: [Exception... "Illegal operation on WrappedNative prototype object"  nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)"  location: "JS frame :: chrome://global/content/bindings/text.xml :: get_accessKey :: line 30"  data: no]
Source File: chrome://global/content/bindings/text.xml
Line: 30

The browser's menus don't re-enable and the toolbar changes are not saved.
*** Bug 364555 has been marked as a duplicate of this bug. ***
Attached patch v2.1 (obsolete) — Splinter Review
Let's do this for now. I tried fixing reparenting of wrappers for XUL elements, but I haven't been able to figure out what exactly goes wrong. I propose we leave that to be fixed in bug 235640.
Attachment #248260 - Attachment is obsolete: true
Attachment #251187 - Flags: superreview?(jst)
Attachment #251187 - Flags: review?(jst)
So the issue is that the wrapper determines the security context.  So we should really fix bug 235640 for 1.9, imo.
Comment on attachment 251187 [details] [diff] [review]
v2.1

This looks good to me. r+sr=jst
Attachment #251187 - Flags: superreview?(jst)
Attachment #251187 - Flags: superreview+
Attachment #251187 - Flags: review?(jst)
Attachment #251187 - Flags: review+
Actually, something's not right. Here's what I think is happening: with my patch we end up changing the prototype of the JS class corresponding to an XBL binding. That class is shared by other elements that aren't necessarily reparented and we mess up their prototype chain. Bz: we probably need to blow away the XBL implementation chain and rebuild it when reparenting (or more precisely when changing scopes), right? I guess it's time to make nsBindingManager::ChangeDocumentFor work. Sigh.
Hmm...  Yeah, that could be it.  :(
Attached patch v1Splinter Review
This seems to work, I think this might even fix bug 235640. I didn't try to reuse ChangeDocumentFor, it seems like that should actually be more of a BindToTree/UnbindFromTree kind of API. I'm not 100% sure about the AllowScripts() check, it could be that we insert the prototype, turn off scripting and then try to remove the prototype. But that should be fixed by removing the prototypes when scripting is turned off IMHO.
Attachment #251187 - Attachment is obsolete: true
Attachment #251527 - Flags: superreview?(jst)
Attachment #251527 - Flags: review?(bzbarsky)
It might take me a week or two to get to this.  :(
Lets block on this at least for now so that we don't loose it given that there's a patch here already.
Flags: blocking1.9? → blocking1.9+
Comment on attachment 251527 [details] [diff] [review]
v1

>Index: content/base/src/nsNodeUtils.cpp
>+    nsXBLBinding* binding = nsnull;

Don't we also need to remove the binding from the old binding manager and put it in the new one?  Probably while holding a ref to the binding...

>Index: content/xbl/src/nsXBLBinding.cpp

>@@ -754,14 +754,30 @@ nsXBLBinding::InstallImplementation()
>+  nsresult rv;

Initialize this to NS_OK?  Otherwise if there is no mNextBinding and AllowScripts() is false we'll return a random value....

>+#ifdef DEBUG
>+    {
>+      if (mPrototypeBinding->HasJSClass()) {

Maybe add a comment that says we're saving our class name for later use in assertions?

>+nsXBLBinding::GetBoundElementJSObject(JSContext** aContext)
>+    nsIScriptGlobalObject *global = document->GetScriptGlobalObject();

Should we be using GetScopeObject here, perhaps?  This seems like something we might want to do even if the document has been unloaded....

Should probably add some comments in the header for this method that it should only be called when AllowScripts() and mPrototypeBinding->HasJSClass() are true.  Otherwise we're passing out an unrooted JSObject here, and who knows what'll happen with it.  Actually, even if mPrototypeBinding->HasJSClass() is true we could still not have a JSObject yet (e.g. if something failed during binding install).  But fixing XBL to actually think about error-handling is sorta out of the scope of this bug, eh?  ;)

>+nsXBLBinding::RemoveFromJSPrototypeChain()

>+  // XXX Sanity check to make sure our class name matches
>+  // Pull ourselves out of the proto chain.
>+  JSObject* ourProto = ::JS_GetPrototype(cx, scriptObject);

Can't this be null at this point if script on a page somewhere set __proto__ to null on an XBL-bound node at some point?

That is, perhaps what we really need here is a non-debug check that the proto is non-null and that JSClass on the proto is the right thing (e.g. that the page didn't set it to some random object) before removing the proto.  In fact, perhaps we should walk the proto chain or something, in case something got inserted in between the node and the XBL proto?  I'm not really sure...

How does this interact with scriptable plugins (which can have more proto-munging coming from the plugin code)?

I guess this code already looked like this, though, so maybe it's not a big deal to leave it as-is and file a bug to make it better.

>@@ -899,42 +983,7 @@ nsXBLBinding::ChangeDocument(nsIDocument
>+nsXBLBinding::OwnerDocumentWillChange()
>+  if (mPrototypeBinding->HasJSClass()) {
>+    RemoveFromJSPrototypeChain();

If AllowScripts() is false, this could trigger creation of a wrapper where we didn't have one before, no?  Is that desirable?  In that case it'll also remove some random non-XBL prototype from the proto chain; I'm pretty sure that's not wanted.

>-nsXBLBinding::InitClass(const nsCString& aClassName,

Huh.  So this was completely unused?  Nice!

r=bzbarsky with the above issues fixed.
Attachment #251527 - Flags: review?(bzbarsky) → review+
Comment on attachment 251527 [details] [diff] [review]
v1

- In nsXBLBinding::RemoveFromJSPrototypeChain():

+  // XXX Sanity check to make sure our class name matches
+  // Pull ourselves out of the proto chain.
+  JSObject* ourProto = ::JS_GetPrototype(cx, scriptObject);

I share bz's concerns here. But yeah, it's old code, but probably a problem waiting to be noticed. Very worth filing a new bug on this, unless you want to add the checking here. And FWIW, I think this should check that the JSClass of the proto is the right class, not just the right name.

sr=jst
Attachment #251527 - Flags: superreview?(jst) → superreview+
should be easy to automate a test for this
Flags: in-testsuite?
(In reply to comment #24)
> >@@ -754,14 +754,30 @@ nsXBLBinding::InstallImplementation()
> >+  nsresult rv;
> 
> Initialize this to NS_OK?  Otherwise if there is no mNextBinding and
> AllowScripts() is false we'll return a random value....

FWIW, and AFAICT, this comment is not true. The code returns NS_OK if
!mNextBinding and !AllowScripts().
Oh, indeed.  I totally misread that...  Sorry about that.
Attachment #259323 - Flags: review?(jst)
Attachment #259323 - Flags: review?(jst) → review+
Target Milestone: --- → mozilla1.9alpha6
Target Milestone: mozilla1.9alpha6 → mozilla1.9beta1
Comment on attachment 251527 [details] [diff] [review]
v1

This doesn't actually work. This patch is pretty old, I do remember testing this but I have no idea if I tested the wrong thing or if things have changed in the meantime. The problem is that GetBinding in nsNodeUtils::CloneAndAdopt doesn't find the binding because it has been removed by the nsBindingManager::ChangeDocumentFor call in nsGenericElement::UnbindFromTree.
Attachment #251527 - Flags: superreview-
Attachment #251527 - Flags: superreview+
Attachment #251527 - Flags: review-
Attachment #251527 - Flags: review+
So is customizing toolbars broken because of this? Or is this bug just about XBL sucking? If it's just XBL suckage (which isn't even a regression) then this is probably not even a blocker.

If customizing toolbars are indeed broken we probably need to up the priority.
Priority: -- → P4
> So is customizing toolbars broken because of this?

In general, yes.  Some items can't be dragged.  See comment 0.

Oh, and it's a regression from the need to do adoptNode to move between documents; adoptNode is not playing nice with the XBL.
Keywords: regression
Ok, so we need to either work around this in the customize toolbar code, or get the underlying problem fixed.

Peter, do you have time to fix this in the adoptNode code for beta2? If not we need to punt it off to the frontend people to work around it per one of the suggestions in comment 0
Priority: P4 → P2
I'm going to need some help to evaluate this. The basic problem is that XPConnect can't reparent its wrapper if there's an XBL class on the prototype chain (because it doesn't find its wrapper as the first prototype). I tried adding code to unhook the XBL prototypes before reparenting and rehook them after, but we remove the XBL bindings on UnbindFromTree and so I can't use the fact that there's a binding to unhook/rehook the prototypes. Personally I find it odd that we blow away the binding without unhooking the prototype. I see three ways to deal with that: don't remove bindings on UnbindFromTree and move them to the new ownerDocument's binding manager, remove XBL prototypes when the XBL binding is removed, or use a different mechanism to know if a node has an XBL prototype (something like checking for flag NODE_MAY_BE_IN_BINDING_MNGR and looking up the class in the hash of XBL classes).
Boris: I'd especially appreciate your opinion on this.
> Personally I find it odd that we blow away the binding without unhooking the
> prototype.

As of bug 398135 getting fixed, we should be unhooking the prototype.  So is this bug as filed effectively fixed?

That said, any content page can mess with the prototype chain, right?  And you're saying that doing that will break reparenting on adopt?  Perhaps the real issue here is this block in XPConnect:

1192             // We only try to fixup the __proto__ JSObject if the wrapper
1193             // is directly using that of its XPCWrappedNativeProto.
1194 
1195             if(wrapper->HasProto() &&
1196                JS_GetPrototype(ccx, wrapper->GetFlatJSObject()) ==
1197                oldProto->GetJSProtoObject())

Perhaps we should walk up the proto chain until we get to oldProto->GetJSProtoObject() and then replace it in the proto chain with newProto->GetJSProtoObject()?  jst seems to have blame for this code; I'd be interested in knowing why it's the way it is.
If you mess with the proto-chain you are on your own IMHO. I.e. I don't care if that breaks moving nodes between documents, as long as the breakage doesn't manifest itself as a security problem, but that doesn't seem to be the case here right?
(In reply to comment #36)
> As of bug 398135 getting fixed, we should be unhooking the prototype.

Don't we only do that for style bindings?

> Perhaps we should walk up the proto chain until we get to
> oldProto->GetJSProtoObject() and then replace it in the proto chain with
> newProto->GetJSProtoObject()?  jst seems to have blame for this code; I'd be
> interested in knowing why it's the way it is.

Like in attachment 248260 [details] [diff] [review]?
> If you mess with the proto-chain you are on your own IMHO.

Agreed, as long as we don't leak and don't have a security problem.  We still set the new XPCWrappedNativeScope on the XPCWrappedNative, right?

That said, having the scope inconsistent seems to be asking for trouble, so we should fix it if we can.

> Don't we only do that for style bindings?

Ah, indeed.  So a binding added explicitly with addBinding won't get removed.  Just another special case of someone messing with the proto chain, basically.

> Like in attachment 248260 [details] [diff] [review]?

Yes, like that.  Since the style binding issue should be gone at this point, I think it should be a much safer change...

Alternately, we could blow away non-style binding protos on adopt.  Might be safer.
(In reply to comment #36)
> As of bug 398135 getting fixed, we should be unhooking the prototype.  So is
> this bug as filed effectively fixed?

Yes, it seems to be. At least I cannot reproduce the problem any more.
Awesome, lowering the priority then.

Is the underlying XBL issue still there, or can we mark this FIXED?
Flags: blocking1.9+ → blocking1.9-
Priority: P2 → --
The issue could still be triggered using addBinding, yes.  And the underlying issue that we incorrectly reparent some stuff is still there....
One option would be to remove all the in-between protos (between the object and the proto it "should" have) when adopting...
WORKSFORME in Firefox 3.0.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME
Blocks: abp
(In reply to comment #44)
> WORKSFORME in Firefox 3.0.

Even when using addBinding?
Sorry, reopening - I forgot that this bug is wider than my toolbar customization issues. Testcase still has issues.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
QA Contact: ian → general
No longer blocks: abp
Component: DOM → DOM: Core & HTML
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 4 duplicates.
:peterv, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(peterv)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(peterv)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: