Closed Bug 157797 Opened 22 years ago Closed 20 years ago

XmlRpc.asyncCall produces "<i4>[xpconnect wrapped nsISupportsPRInt32]</i4>" instead of proper value

Categories

(Core :: XML, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED INVALID

People

(Reporter: cdmojoli, Assigned: dbradley)

Details

Attachments

(4 files)

I. SYNOPSIS

nsXmlRpcClient.js sometimes produces incorrect output in _generateArgumentBody
when invoking toString() of the wrapped primitive. Instead of a proper
request substring "<value><i4>50</i4></value>" we get
"<value><i4>[xpconnect wrapped nsISupportsPRInt32]</i4></value>" which
confuses the server.

II. CAUSED BY PROGRAMMATIC ERROR

I have found one way of inequivocally causing this ill-effect is by
incorrect programming:
---------------
var intParm=xmlRpc.createType(xmlRpc.INT, {});
intParm.data="i shouldn't be a string";
---------------
But this is simply incorrect programming and not a bug.

III. CAUSED BY SUSPECTED RACE CONDITION OR LOCKING ISSUE

The incorrect behavior can happen with (very reviewed) good code, but
it happens rarely. This lead me to suspect of some sort of race
condition.

The following code works pretty much always:
---------------
1. var xmlRpc = Components.classes['@mozilla.org/xml-rpc/client;1']
.createInstance(Components.interfaces.nsIXmlRpcClient);
2. xmlRpc.init("some url");
3. var intParm=xmlRpc.createType(xmlRpc.INT, {});
4. intParm.data=Number(document.getElementById("someID").value);
5. xmlRpc.asyncCall(someListener, null, 'module.method', [intParm], 1);
---------------
_Rarely_ it will fail and the problem is not data-related. When it
does fail the problem is always in the invocation of toString()
method within nsXmlRpcClient.js. (NB: calling toString() from your code does work.)

We have found that inserting an alert() between instructions #4 and #5 of the
above example will trigger the ill-effect regardless of what string is being
alerted. Further testing revealed the alert() itself is no the problem but
rather the *modal* nature of the alert. (NB: inserting a modal window or alert
before instruction #3 is NOT a problem.)

The modal window problem always causes failure except _perhaps_ the first time
it is called. Subsequent calls throughout the lifetime of Mozilla will fail.

IV. FURTHER SPECULATION

This problem seems to be deeply rooted. I suspect this is a symptom of an
underlying problem in either XPCOM or XPConnect but I am not really proficient
in them. I have read of threading issues of createInstance(), "JS rooting of
primitives" and JSContext problems. All might be related but are for now outside
my grasp.

I suspect the modal window trigger somehow forces the race condition or locking
issue that otherwise only pops up sporadically.

V. CIRCUMVENTION OF THE PROBLEM

While not a solution or explanation of the problem, an ammendment of the
nsXmlRpcClient.js appears to be possible. In _generateArgumentBody() simply
replace instances of obj.toString() for obj.data where appropriate. This avoids
the XPConnect/XPCOM call to toString() and uses plain JS objects.

I have to test some more on this issue but I have been trying to identify this
problem since March and don't think I can progress any further without your
comments.
Our XML-RPC uses XMLHttpRequest, which has problems with modal dialogs, at least
partly because it's implementation of synchronous load relies on modal dialog
event loop. This part of this bug could be a dupe.

I have no idea why/how we would output "[xpconnect wrapped nsISupportsPRInt32]"
instead of the actual number, though.
We'd get that if for some reason the toString() method of the object itself did
not get called and the default XPConnect toString() got called instead...
Heikki, XML-RPC no longer has a synchronous load.  You can only call it async.
(or did I miss something?)

Would it be worthwhile to change it to use the .toString workaround?
I agree with #2 as it appears that toString() is not virtualized even though the
internal code explicitly and successfully QI's to the specific primitive interface.

Maybe "virtualized" is not the proper word under XPCOM you know what I mean.

Why would this virtualization fail sometimes? Are there bugs similar to this in
other parts of Mozilla? Maybe the correct component of this bug is XPCOM. More
interestingly, if the known iteraction of modal windows and syncLoading is
thought no longer applicable to this bug (#3), why does it always trigger
whatever condition causes toString() to fail?

Please remember that the actual interaction with modal displays is not my real
concern. If they were the sole cause of the ill-effect I would simply make sure
not to have them in the sensitive sections. I mention it because it produces the
same ill-effect as the unknown sporadic condition and may thus be correlated. 

BTW, due to my pressing needs I will patch nsXmlRpcClient.js and simply install
it as a separate implementation of nsIXmlRpcClient. Should I upload the
forthcoming patch for consideration?
is thera any reason why 
Components.classes["@mozilla.org/supports-string;1"]
          .createInstance(Components.interfaces["nsISupportsString"]);
cant be used instead of
xmlRpc.createType(xmlRpc.STRING, {}); ?

I tried using it but the value never serialise. Instead it come up with
[xpconnect wrapped nsISupportsString] instead?
Mike, there seems to be some sort of context about the created XPConnected field
and the JS string attached to the .data property of the field. This problem
seems deeply buried.

For now, I am avoiding this problem by avoiding modal dialogs, but I need them.

At some point I will replace those XPConnected variables holding the parameters
for pure JS objects. This will make the modified component useless from
languages other than JS but it will work always.

I wonder if the SOAP infrastructure suffers a similar problem.
Could someone attach a testcase to this bug?
Changing the testcase by removing:
var stateCode = xmlRpc.createType(xmlRpc.INT, {});		
and replace it with:
var stateCode = Components.classes["@mozilla.org/supports-PRInt32;1"]
          .createInstance(Components.interfaces.nsISupportsPRInt32);
Will illustrate comment 5. (which im not sure if it's by design or bug)

Basically theres the two problem I know of. Which I suspected is related. One is
shown in the testcase, the fact that once you used the variable created by
xmlRpc::createType() that variable is not useable anymore unless re-initialised.
Which is what cdmojoli@idea.com.py experiencing.

The other problem is variable created not using xmlrpc library is useless.

Move the line:
var stateCode = xmlRpc.createType(xmlRpc.INT, {}); to inside the getState()
function will make things work. 
If anyone would like to try this, just drop the attached file into the
components directory.

Instead of asking for "@mozilla.org/xml-rpc/client;1" ask for
"@idea.com.py/xml-rpc/client;1". The implementation difference is so small that
it is almost trivial but there is a possibility (I am not sure) that the
modified implementation will be unuseable from a non JS invoker.

I have introduced no API changes whatsoever so your client should run
unmodified.

This version has completely eliminated the effects of this bug for me.

(BTW, the contract name is '@idea.com.py' simply not to pollute the
'@mozilla.org' namespace, and not to indicate any sort of authorship.)
I have been using the attachment
http://bugzilla.mozilla.org/attachment.cgi?id=100049&action=view since I posted
it and have never since seen this bug reemerge. I now install this version with
my apps.

(I really think someone in XPConnect ought to be interested in this bug.)
QA Contact: petersen → rakeshmishra
Giving to Samuel.
Assignee: heikki → samuel
I find the same behavior described in this bug present in attachment 121022 [details] of
bug 197087.

I have commented about it in the aforesaid bug.
From the behavior it looks like in some cases you're getting the toString
provided by XPConnect. I could see a number of places to look at, but it's a
little puzzling that the behavior is inconsistent. It's either a race condition
or uninitialized memory.

I'll go ahead and take this. Samuel, if you want it back feel free to grab it
from me ;-)
Assignee: samuel → dbradley
QA Contact: rakeshmishra → ashishbhatt
David, I have been reading a bunch a of XPConnect bugs and it is clear you
understand them well.

We have worked around this bug, but we recently isolated another similarly
puzzling effect in nsXmlRpcClient. We have not created a new bug. Please read
bug 197087 (c#17) and look at attachment 138337 [details] [diff] [review] (very brief patch).

There is now way the behavior there described isn't at least related to XPConnect.

The good thing is that we can write a test case that will trigger the problem
over 80% of the time. The workaround we propose has survived our test case.

The test case strategically places a single sleep() in the "first" of a set of
parallel xmlrpc.asyncCall, causing most others to exhibit the behavior in c#17.

If you are interested in more information, or access to our test case, please do
not hesitate to ask. Despite having worked around both problems (this and
c#17's) we are interested in helping solve the underlying problem(s).

I am willing to assign someone to producing a minimal test case. (Our test case
is not minimal, it is part of a large project.)
So from the error message and the patch, it looks like the error is at:

parent._inProgress = false;

Realize that the error message is probably a bit misleading. From what I see in
the code it's generated in js_ValueToNonNullObject. Which means that parent is
null. Which is a little different than not having any properties. Maybe you
already know that, but wasn't sure. What you're saying still should hold true.

What would be interesting to know is to do an alert(this) when you setup the
onload. And then do an alert(e.target) in the handler and report back what it says.

I can't think of any reason at the moment when or why a wrapped xpconnect thing
would have a null parent. Looks like it might be some issue with either double
wrapping, or wrapping and unwrapping. If you can get a simplified case, I might
be able to take a look via the debugger.

Well,if you put a dump() instead of an alert():
this in asyncCall: nsXmlRpcClient
e.target _onload: nsXMLHttpRequest

in asyncCall: this.xmlhttp.parent = this;
xmlhttp is asigned to jswrapper of nsXMLHttpRequest (not component itself). So
apparently if the JSContext of response is different to context where parent was
asigned to jswrapper a new wrapper is created without parent attached property
or the same wrapper is used but can't see its parent property.
I think isn't correct have a depend of a wrapper. 
So I just need pass this (nsXmlRpcClient) to _onload in a wrapper indepedent way
and indepedent to "this" evaluation.
this testcase is <i4>bug</i4>, because don't need a server's response.
here you can see when toString() is confused.
you need to change mozilla/components/nsXmlRpcClient.js to: const DEBUG = true;

you need to have dump enabled.

btw, about my previous commnet, please read: "xmlrpc is asigned to" instead
"xmlhttp".
and bugzilla denied privileges so please: download the file to use it.
testcase for race condition. commented in comments : #15 ~ #17
notes:
1- you need dump enabled.
2- you need be in chrome.
3- Execute and see your console and your JavaScript console.

This is what happens when you dispatch a lot of asyncCalls in xmlrpc
(implemented with xmlhttprequest). Some response have not the same jscontext
and can't see global variables previous modified in dispatcher's context.
Looking at the first testcase, I think I can see what the problem is.  You're
making a new call to the xmlrpc client from inside the onResult handler.  This
means you are in the wrong JS context.  That's why it won't work.

The second testcase is because you need to use createType to create your variables.

The third testcase looks invalid to me due to an apparent (and understandable)
misunderstanding of setTimeout.  A timer will not fire until control leaves your
JS code.  So what you're doing is setting the timer, waiting for a bit, then
nulling the variable before leaving the function.  At that point the timer
fires, because the time has long passed, but the object is null because you set
it to null.
about testcase 1: I'm agree.

about testcase 2: Now use createType is not an option , it's imperative, right?
I'm agree.

about testcase 3: you are right!!! (or I was wrong!!! ;) ) I was thinking that
if setTimeout didn't have control of JS then force it creating a new JSContext
to process it... but as I said : I was wrong.... ;)

Maybe this one must be INVALID with a note about createType thing.
This bug is no longer valid. We have exercised the original code, our modified
code, as well as attachment 121822 [details] [diff] [review] (and our modified version too) and it seems
that with the current state of matters, this bug, as expressed in the summary is
no longer real.

There were some real issues, such as the modal window interaction which was
solved outside this bug.

I would like to point out that in researching this problem we found issues with
the dependence on JS context reutilization and property adding to XPCOM wrappers
that leave room for improvement. These are outside the scope of this bug.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
I agree.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: