Closed
Bug 418443
Opened 16 years ago
Closed 16 years ago
XPCOM embedded SpiderMonkey cannot access DOM properties - new in Beta 3
Categories
(Core :: JavaScript Engine, defect, P2)
Tracking
()
RESOLVED
INVALID
mozilla1.9
People
(Reporter: ron.langham, Assigned: jst)
References
Details
(Keywords: regression)
Attachments
(1 obsolete file)
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1) Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b3) Gecko/2008020514 Firefox/3.0b3 We have a XPCOM extension that has its own internal SpiderMonkey engine that runs scripts on the loaded page Document. We injected some javascript into the page via XUL. A load event in the page then call methods in our XPCOM DLL that then passes the document object to our internal SM engine for other processing on the page. This has been working for a couple of years in FF 1.5, 2.0, and worked fine up to Beta 2, but broke in Beta 3. From high-level view, within the load event of the page, we add a 'document' variable to our SM that refers to the same document object in the main page. We use the nsIDOMHTMLDocument object that we get from the event and add it to our SM variables. We compile the JS script using the proper 'Principal' method, JS_CompileScriptForPrincipals. Other variables are added using JS_EvaluateScriptForPrincipals. When we execute our script using JS_ExecuteScript, everything runs fine, we are able to access all document elements, properties, etc. We can access all other DOM elements and their properties in the page with no problem. If we then call a method in the javascript using JS_CallFunctionName immediately after the JS_ExecuteScript, then we start seeing problems. e.g. // this works ok JS_ExecuteScript(m_pJSContext, m_pJSGlobalObject, m_pJSScript, &vRet); // this returns FALSE JS_CallFunctionName(m_pJSContext, m_pJSGlobalObject, sName.c_str(), (uintN)argc, argv, &vRet); The problem occurs if the method in the JS tries to access any properties on any of the DOM elements. The 'document' object is available and a typeof(document) properly returns [object HTMLDocument]. Other elements can be retrieved using document.getElementById('validName'). The element returned is valid and a typeof returns the proper value. The problem occurs when there is any access to a property on any DOM element. For instance, the following examples cause the JS method to abort with no exception as soon as they are executed, the JS_CallFunctionName method simply returns FALSE. temp = document.location; temp = typeof(document.location); temp = elem.innerHTML; // where elem is a valid value retrieved by getElementById temp = typeof(elem.innerHTML); This error of accessing any property is very similar to what is seen in JS_ExecuteScript if we do not compile and load variables using the Principal methods. Therefore, I suspect the the Principals assigned is lost for some reason when we call JS_CallFunctionName. In other tests, I was able to do reflection on DOM objects in the method and see all the properties, e.g. for (var prop in elem), but it just fails when I try to access one, even just doing a typeof() on an element property. I started going back to older nightly builds in order to track down when this bug occurred. I was able track down the problem to a few lines of code added to the NewCompilerFunction() in jsparse.c on 12/19 by Igor Bukanov in version 3.316 of jsparse.c. The change was related to bug 398219. There were many other changes similar to this, as well as some Principal specific code removed from fun_resolve() method in jsfun.c, but this code change didn't seem to affect the problem. If I commented out the code in NewCompilerFunction() as shown below in the Beta 3 release source code, our extension then began to work again with no problems. I copied the new patched js3250.dll to the Beta 3 release and did our tests. NewCompilerFunction(...) { // ... fun = js_NewFunction(cx, NULL, NULL, 0, JSFUN_INTERPRETED | lambda, parent, atom); // COMMENT OUT THE LINES BELOW //if (fun && !(tc->flags & TCF_COMPILE_N_GO)) { // STOBJ_SET_PARENT(fun->object, NULL); // STOBJ_SET_PROTO(fun->object, NULL); //} return fun; } Here are more low-level details on our implementation... 1) Take over the TabOpen event on Firefox and When a Tab component is first created, we initialize SM and get principals, e.g. nsCOMPtr<nsIJSRuntimeService> rtsvc; m_servMan->GetServiceByContractID("@mozilla.org/js/xpc/RuntimeService;1", NS_GET_IID(nsIJSRuntimeService), getter_AddRefs(rtsvc)); rtsvc->GetRuntime(&m_pJSRuntime); ... nsAXPCNativeCallContext *callContext; m_xpc->GetCurrentNativeCallContext(&callContext); ... callContext->GetJSContext(&m_pJSCallContext); ... sSecurityManager->GetPrincipalFromContext(m_pJSCallContext, getter_AddRefs(principal)); ... principal->GetJSPrincipals(m_pJSCallContext, &m_pJSPrincipals); ... m_pJSContext = JS_NewContext(m_pJSRuntime, JS_STACK_SIZE); ... m_pJSGlobalObject = JS_NewObject(m_pJSContext, &globalClass, nsnull, nsnull); JS_InitStandardClasses(m_pJSContext, m_pJSGlobalObject); ... m_xpc->InitClasses(m_pJSContext, m_pJSGlobalObject); These value are maintained in a C++ instance attached to the tab document. 2) On a page load event, take the nsIDOMHTMLDocument of the page and create new JS objects in our SM, add a variable 'document' into our SM for the document so that our scripts can access it as normal. e.g. nsCOMPtr<nsIXPConnectJSObjectHolder> wrapper; m_xpc->WrapNative(m_pJSContext, m_pJSGlobalObject, pDoc, NS_GET_IID(nsIDOMHTMLDocument), getter_AddRefs(wrapper)); ... JSObject* ourJSObject; wrapper->GetJSObject(&ourJSObject); ... JS_DefineProperty(m_pJSContext, m_pJSGlobalObject, "document", OBJECT_TO_JSVAL(ourJSObject), nsnull, nsnull, JSPROP_PERMANENT); 3) Load and compile our script using JS_CompileScriptForPrincipals. e.g. m_pJSScript = JS_CompileScriptForPrincipals(m_pJSContext, m_pJSGlobalObject, m_pJSPrincipals, sProcessedScript.c_str(), sProcessedScript.size(), scriptName.c_str(), 0); 4) Execute our script using JS_ExecuteScript... JSBool bRet = JS_ExecuteScript(m_pJSContext, m_pJSGlobalObject, m_pJSScript, &vRet); 5) Then try to call a method in the script JSBool bRet = JS_CallFunctionName(m_pJSContext, m_pJSGlobalObject, sName.c_str(), (uintN)argc, argv, &vRet); Scripts run fine as long as they do not access properties on any of the DOM elements. The script has no problem with other variables, even other of our XPCOM objects that we injected in to the SM engine. Our Firefox add-on is a very large and complex add-on. It relies heavily on some XPCOM classes, e.g. CIDs, IIDs, etc. Therefore, we generally have to compile the Firefox source and copy over the SDK directory and some unfrozen in order to properly re-compile our extension. I can be available to test a patched js3250.dll or assist in any other method. I can also be open to chat on IRC. Reproducible: Always Steps to Reproduce: Full details in the 'Details' section Actual Results: Full details in the 'Details' section Expected Results: javascript loaded in our SM should be able to access the DOM, properties, etc without error. This worked until Beta 3. Bug is CRITICAL to us. Without the fix, our extension will no longer be able to run under Firefox 3!!!
Reporter | ||
Updated•16 years ago
|
Version: unspecified → Trunk
Updated•16 years ago
|
Assignee: nobody → general
Status: UNCONFIRMED → NEW
Component: Extension Compatibility → JavaScript Engine
Ever confirmed: true
Product: Firefox → Core
QA Contact: extension.compatibility → general
Version: Trunk → unspecified
Comment 1•16 years ago
|
||
Brendan, Blake, Igor, what's up here? We should be picking up the function's principals from the script, I would have thought...
Comment 2•16 years ago
|
||
Errrr, not if it's a cloned function, in that case, we'll walk up the parent chain, which has been severed! We used to fail silently in this case (see bug 409514) but I should have also added an NS_WARNING. Can someone test to see if that's what's happening here?
Comment 3•16 years ago
|
||
(In reply to comment #2) > Errrr, not if it's a cloned function, in that case, we'll walk up the parent > chain, which has been severed! It is the parent chain of the original function stored in JSScript that is severed, not the parent chain of the clone. The clone's parent chain should include JSContext.globalObject.
Comment 4•16 years ago
|
||
I am concerned you are not using the JS GC rooting (root set management) API at all, or correctly. You show retained C++ JSObject pointers, and even JSScript * usage that may be GC-unsafe. Can you show more code, perhaps by email? You need to follow the rules of SpiderMonkey's exact GC, which are tricky and onerous but which can be followed. I'm not saying any GC safety bugs in your addon are involved in the proplem you report here -- it's not clear yet either way (they could be related). /be
Comment 5•16 years ago
|
||
As I understood it, the function is defined by the script during the JS_ExecuteScript call. So it shouldn't be a cloned function. Perhaps the real issue is in one of the functions on one of our wrappers, of course...
Reporter | ||
Comment 6•16 years ago
|
||
(In reply to comment #4) > I am concerned you are not using the JS GC rooting (root set management) API at > all, or correctly. You show retained C++ JSObject pointers, and even JSScript * > usage that may be GC-unsafe. Can you show more code, perhaps by email? Yes, let me extract some of our SpiderMonkey wrapper code and I will email to you. Also, regarding comment #5, yes it is a function that is defined within the script. Thanks.
Reporter | ||
Comment 7•16 years ago
|
||
I have sent our Spider Monkey wrapper code and details on how it is used to Brendan
Reporter | ||
Comment 8•16 years ago
|
||
Any update on this bug?
Comment 9•16 years ago
|
||
Ron, although this bug has been nominated for 1.9, it has not been assigned nor given a sufficient priority to be considered at this point. Pinging brendan or igor about the priority and blocking status:
Updated•16 years ago
|
Flags: tracking1.9? → blocking1.9?
Updated•16 years ago
|
Assignee: general → brendan
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
Updated•16 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.9
Updated•16 years ago
|
Assignee: brendan → igor
Status: ASSIGNED → NEW
Comment 10•16 years ago
|
||
Ron: You might get more eyes on this with a simplified testcase attached -to- the bug, instead of sent privately to Brendan... is that possible?
Comment 11•16 years ago
|
||
Igor took this bug, he and I know what is going on. Plan is to fix general bug and then Ron should do some fast testing. /be
Reporter | ||
Comment 12•16 years ago
|
||
It would be somewhat difficult to provide a simple testcase, probably a couple of days work or more in order to make a new extension that does this. Our current extension has a very large code base. If Brendan and Igor know what the bug is, then I could grab the nightly build once checked in and verify on our end, if that will be suffucient. Thanks for handling this issue!
Updated•16 years ago
|
Priority: P2 → P1
Comment 13•16 years ago
|
||
The fix uses the global object if function's parent is not set.
Attachment #309174 -
Flags: review?(brendan)
Comment 14•16 years ago
|
||
Comment on attachment 309174 [details] [diff] [review] minimal fix If I have two windows that can communicate, and get a reference to a parent-less function in the second window from a script running in the first, then call it, the call should be statically scoped to the second window, not the first. But how would one get a reference to a parent-less compiler-created function object? Maybe the fix is nearer to the answer to that question. /be
Comment 15•16 years ago
|
||
(In reply to comment #14) > (From update of attachment 309174 [details] [diff] [review]) > If I have two windows that can communicate, and get a reference to a > parent-less function in the second window from a script running in the first, > then call it, the call should be statically scoped to the second window, not > the first. Without the original change that made useless JSFunction wrappers at least not to leak the scope used during compilation in the scenario above the function will be executed using that compiler scope. I claim that using the current global window is not worse than using the compilation scope.
Comment 16•16 years ago
|
||
(In reply to comment #15) > (In reply to comment #14) > > (From update of attachment 309174 [details] [diff] [review] [details]) > > If I have two windows that can communicate, and get a reference to a > > parent-less function in the second window from a script running in the first, > > then call it, the call should be statically scoped to the second window, not > > the first. > > Without the original change that made useless JSFunction wrappers at least not > to leak the scope used during compilation in the scenario above the function > will be executed using that compiler scope. I claim that using the current > global window is not worse than using the compilation scope. The current global window will be the wrong one, it will be the dynamic scope: In window w1, with target name "w1": <script> this.name = "w1"; var x = 42; var f = leak_proto(function () alert(x)); </script> in window w2, loaded after w1 from same origin: <script> var x = 24; var w1 = frames[0]; w1.f(); </script> Without the original change that fixed the leak, 42 would be alerted, per ECMA static scope rules. With the patch, unless I am mistaken 24 would be alerted. I don't believe that leak_proto can be written except by using the JS API, due to the primordial bug where JSFunction pointers are exposed and a non-commutative fun->object link can be followed. But that's what is happening for Ron et al.'s add-on. Of course we agree on the long-term fix, but it would be best if the short-term one did not introduce dynamic scoping. To retain static scope linkage without making a leak hazard, we would need a weak reference instead of a strong one from the parent slot of the proto-function to its window. This means overhead in tracking such proto-functions from their window and nulling their parent links when the window is finalized. Is this any easier than the long-term fix? /be
Comment 17•16 years ago
|
||
Oops, w2's source should be: <script> var x = 24; var w1 = window.open("", "w1"); w1.f(); </script> /be
Updated•16 years ago
|
Priority: P1 → P2
Comment 18•16 years ago
|
||
P2 per FF3 meeting today
Comment 19•16 years ago
|
||
(In reply to comment #16) > To retain static scope linkage without making a leak hazard, we would need a > weak reference instead of a strong one from the parent slot of the > proto-function to its window. The weak reference would not solve the problem of null scope after GC collects parent's scope.
Updated•16 years ago
|
Attachment #309174 -
Attachment is obsolete: true
Attachment #309174 -
Flags: review?(brendan)
Comment 20•16 years ago
|
||
I think I will have the proper fix bases on making JSFunction equivalent to JSObject lter today. Most of the preparation work will be done as a part of bug 423874.
Comment 21•16 years ago
|
||
(In reply to comment #12) > It would be somewhat difficult to provide a simple testcase, probably a couple > of days work or more in order to make a new extension that does this. Our > current extension has a very large code base. If Brendan and Igor know what > the bug is, then I could grab the nightly build once checked in and verify on > our end, if that will be suffucient. I have landed the bug 424376 which should address the issue. Could you check your extension with a a nightly built compiled after 2008-03-23 12:00 UTC (04:00 AM PST)?
Reporter | ||
Comment 22•16 years ago
|
||
(In reply to comment #21) Yes, I will get source and start preparing today. I will try to let you know later, or tomorrow at the latest . Thanks > (In reply to comment #12) > > It would be somewhat difficult to provide a simple testcase, probably a couple > > of days work or more in order to make a new extension that does this. Our > > current extension has a very large code base. If Brendan and Igor know what > > the bug is, then I could grab the nightly build once checked in and verify on > > our end, if that will be suffucient. > > I have landed the bug 424376 which should address the issue. Could you check > your extension with a a nightly built compiled after 2008-03-23 12:00 UTC > (04:00 AM PST)? >
Reporter | ||
Comment 23•16 years ago
|
||
(In reply to comment #21) It appears that the problem is still present. Same thing is happening. I did a check on calling a simple function that did not access any DOM and it worked ok, but as soon as I tried something that accessed a property on a DOM element (e.g. document), then had the same problem. I did a double check on the code to make sure I had the latest. I inspected the CVS Log on the js.c file that you checked in at 3/23 3:04 AM and my code had the same change, so I believe I am working with the proper code. > (In reply to comment #12) > > It would be somewhat difficult to provide a simple testcase, probably a couple > > of days work or more in order to make a new extension that does this. Our > > current extension has a very large code base. If Brendan and Igor know what > > the bug is, then I could grab the nightly build once checked in and verify on > > our end, if that will be suffucient. > > I have landed the bug 424376 which should address the issue. Could you check > your extension with a a nightly built compiled after 2008-03-23 12:00 UTC > (04:00 AM PST)? >
Reporter | ||
Comment 24•16 years ago
|
||
(In reply to comment #21) fyi, I made the same patch that I did in Beta 3, e.g. commenting out a few lines in the NewCompilerFunction in jsparse.c and did another test. In this case, it no longer fixes the issue as it did in Beta 3. > (In reply to comment #12) > > It would be somewhat difficult to provide a simple testcase, probably a couple > > of days work or more in order to make a new extension that does this. Our > > current extension has a very large code base. If Brendan and Igor know what > > the bug is, then I could grab the nightly build once checked in and verify on > > our end, if that will be suffucient. > > I have landed the bug 424376 which should address the issue. Could you check > your extension with a a nightly built compiled after 2008-03-23 12:00 UTC > (04:00 AM PST)? >
Reporter | ||
Comment 25•16 years ago
|
||
(In reply to comment #21) I have created a simple extension that shows the bug. I have sent the extension, source, and details to Igor and Brendan. > (In reply to comment #12) > > It would be somewhat difficult to provide a simple testcase, probably a couple > > of days work or more in order to make a new extension that does this. Our > > current extension has a very large code base. If Brendan and Igor know what > > the bug is, then I could grab the nightly build once checked in and verify on > > our end, if that will be suffucient. > > I have landed the bug 424376 which should address the issue. Could you check > your extension with a a nightly built compiled after 2008-03-23 12:00 UTC > (04:00 AM PST)? >
Comment 26•16 years ago
|
||
Here is an explanation of bug details. Prior clearing the parent and proto slots the compiler would create internal function objects with the parent scope pointing to the global object set during compilation. Then when the compiled script was executed against this global object, the interpreter would see that the function objects had the parent scope matching the global. In this case the interpreter avoids cloning assuming the function object was created for a so-called compile-and-go script or a script that would be immediately executed after compilation only single time. Thus after JS_ExecuteScript returns the global object will be populated with functions that are not clones. For such functions the security manager uses the principals embedded into the script itself. In the case of the extension in question this principal allows access to DOM properties. Now, with the parent slot of the compile-time function object properly cleared to prevent leaks, the interpreter populates the global scope with clones of compiler-time function objects when JS_ExecuteScripts runs. For clones the security manager uses the principals embedded into the parent scope chain. For the extension in question the principals do not give access to DOM properties leading to the bug. Thus to fix the bug the security manager should stop testing for clones and should instead check if the function comes from a brutally shared chrome script. Only in this case the security manager should extract the principals from the scope chain. Now, given my little experience working with principals implementations, I am not the right person to address this.
Assignee | ||
Comment 28•16 years ago
|
||
Igor/Brendan, how can caps tell if a function comes from a brutally shared chrome script? Right now such scripts just get caught in the check that tests whether a function is a clone or not. I'm not sure what's different between a function cloned from a brutally shared chrome script and one cloned by the interpreter during normal script execution.
Comment 29•16 years ago
|
||
(In reply to comment #28) > Igor/Brendan, how can caps tell if a function comes from a brutally shared > chrome > script? That is the essence of the bug - xpconnect has to use its own way to find out that and not to rely on checking if the function was cloned. I was thinking to mark somehow the principals that is stored in the shared script and its functions. But I do not know how to do that.
Reporter | ||
Comment 30•16 years ago
|
||
Thanks to Igor, Brendan, and Johnny, I now have a solution. They proposed that I implement nsIScriptObjectPrincipal and provide to the JS engine the principals. The instance of the nsIScriptObjectPrincipal object is set in the private area of the global data. The security manager will then find the instance and get the principals from it. I have implemented this in our code and tests seem to show that it works. This bug can be closed! Thanks!
Assignee | ||
Comment 31•16 years ago
|
||
Awesome! Thank you Ron for working through this! Closing bug as invalid based on the above comment.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → INVALID
You need to log in
before you can comment on or make changes to this bug.
Description
•