Closed Bug 49572 Opened 24 years ago Closed 23 years ago

Support "responseText" property for XMLHttpRequest

Categories

(Core :: XML, enhancement, P3)

enhancement

Tracking

()

VERIFIED FIXED
mozilla0.9

People

(Reporter: taras.tielkes, Assigned: hjtoi-bugzilla)

Details

(Whiteboard: [fixinhand])

Attachments

(10 files)

8.89 KB, patch
Details | Diff | Splinter Review
16.92 KB, patch
Details | Diff | Splinter Review
16.65 KB, patch
Details | Diff | Splinter Review
17.66 KB, patch
Details | Diff | Splinter Review
30.91 KB, text/xml
Details
22.18 KB, patch
Details | Diff | Splinter Review
43.75 KB, patch
Details | Diff | Splinter Review
21.62 KB, patch
Details | Diff | Splinter Review
23.69 KB, patch
Details | Diff | Splinter Review
23.27 KB, patch
Details | Diff | Splinter Review
The IE5 object XMLHttpRequest has "responseText" property that is accessible 
from script. This is very useful when the response is not actually an XML 
document.

This should be easy to implement.
Reassigning to vidur@netscape.com, as he owns the XMLHttpRequest component.
Assignee: nisheeth → vidur
Supporting responseText has an additional advantage.

Many ISP's do not support mime type text/xml which is required when doing a
XMLHttpRequest GET.  In IE, if the returned mime type is text/plain, the
contents of the document are placed in responseText even though it was not
parsed.  This allows a developer to work around the *stupid* ISP by parsing the
responseText using IE's loadXML method.

If Mozilla supported responseText in a similar fashion, we would be able to use
DOMParser::parseFromString to parse the document using the returned responseText.

I have already run into this situation with my ISP and have implemented the
workaround for IE. It would really be nice if I could do the same for Mozilla.
I think I have solved this. In nsXMLHttpRequest::Send() I create a responseText
stream listener that stores the listener returned by the StartDocumentLoad()
call. I give the responseTextListener to the channel's AsyncRead() method. Then,
in the responseTextListener's OnDataAvailable() I peek at the data in the stream
and build up the responseText string, forwarding that and all other calls to the
real stream listener. In OnStopRequest() I have the responseText finished
(regardless of its mime type).

There is one problem, though. Currently I do not know of a way to peek at the
input stream. If I do Read(), I clear the stream and the XML parser gets nothing.

jst said vidur has some code in his tree that would allow me to peek at the data
(from the channel?). He was hoping that code would be ready to land next week.

There is also something I was wondering about. Does IE5 store everything the
server sends in this property, regardless of mime type? I am thinking that it
doesn't make much sense if the data is binary (images for example).
Status: NEW → ASSIGNED
Using IE5's XMLHttpRequest to do a sync GET on a jpg image gives the following 
properties:

status         200
statusText     "OK"
responseXML    Empty XML Document 
responseText   ? Accessing this gives a system error
responseBody   appears to be the binary contents of the image
responseStream ?
readyState     4

getAllResponseHeaders() returns (for my example) 

"Server: Microsoft-IIS/5.0
Date: Wed, 22 Nov 2000 23:53:03 GMT
Content-Type: image/jpeg
Accept-Ranges: bytes
Last-Modified: Thu, 26 Oct 2000 09:08:29 GMT
ETag: \"60e4954e2c3fc01:be7\"
Content-Length: 30599

"

So maybe if we added a responseBody that would contain the raw binary result of 
the GET we could completely emulate MS's XMLHttpRequest.
If the received data is valid text, then put it in the responseText property. 
The question is "what is valid text?". I don't know how IE5 decides here, or 
why Bob's sample gives an error, but I assume some requirements for "a valid 
text string" are evaluated.

I don't see any use for accessing anything other than a text response through 
responseText.
perhaps it uses the response mime type: 

1.  put response bits into responseBody 
2.  if text/* => put something in responseText
3.  if text/xml => parse into responseXML
4.  if application/xml => ?
Bob, it would be that IE5 works that way. I'll check and try to load a text 
file named "abc.abc".

By the way, even when XML is loaded (and recognised) in IE, the resposeText 
property is still available.

IMO it should work like this:
1) If load successfull (and an HTTP error like 404 *is* a successfull load!!)
2) If it's a valid string, put it in responseText
3) Store the HTTP response code etc.
3) *after that* try to parse it
4) If it's legal xml, put the tree in responseXML
I've just tested this with IE5, and a text file that is server 
as "application/octet-stream" is full accessible through resposeText.
Taras,  XMLHttpRequest on IE5 will place response content with mime type 
text/plain into responseText without trying to parse it into an XML Document.  
I use that to work around the problem of ISP's not supporting text/xml by 
parsing the responseText 'by hand' so to speak.  

I just got your bugmail about application/octet-stream.  Did your file contain 
legal XML? Did IE try to parse it?  I guess what approach we take depends on 
how closely we want to mimic IE's behavior.  

To flesh out my example along the lines of yours, I think the following would 
work:

1.  parse HTTP Response Headers and set the status and statusText properties.

2.  place the response Body into the responseBody property.  In case of a not 
found or not authorized response etc, this will be the web server's error page 
for that response and the mime type would be text/html. In the case that the 
request didn't complete, the responseBody would be empty.  What would cause the 
request not to complete? timeout on the response?

3.  convert responseBody to text and store in responseText by dropping 'non-
text' characters?

4.  if response mime type is type/xml and statusText is "OK" , attempt to parse 
into responseXML.   This is what IE does I think.  We could try to parse any 
response mime type which might make life simpler in some ways, but would be 
different from IE. I don't have an opinion on which is better.

5.  in the event of parsing errors create a property of the responseXML object 
that corresponds to IE's parseError object and fill in accordingly.

what do you think?
The WIP I just attached contains the idea for the fix. The patch also contains
some other stuff. The part that should make the work is #if 0'd out - if you
enable it you will get the responseText property but not responseXML. Need some
way of peeking at the input stream or maybe another input stream we will pass on
to the parser.
Whiteboard: [wip for fix]
Target Milestone: --- → mozilla0.9
I am taking all XMLExtras bugs to make it easier to see what I am working on...
Assignee: vidur → heikki
Status: ASSIGNED → NEW
Prpoposed fix 1 implements responseText property, and responseBody method.
responseText is simple, but responseBody is defined in IDL as:

+  /**
+   * The response to the HTTP request as raw data.
+   * @param size  The size of the data block.
+   * @param block The data block.
+   * @return      For languages that accept this as return
+   *              value (JavaScript, for example): the data block.
+   */
+  void responseBody(out unsigned long size,
+                 [retval, size_is(size)] out string block);


This is how you can use this from JS:

// request is an XMLHttpRequest object
var length = new Object();
var dummy  = new Object();
var data = request.responseBody(length,dummy);

Here is a mail from jband when I asked how should I pass binary data through
XPConnect - I chose to use string data type:

jband:
>8 and 16 bit strings are assumed to be zero terminated (and any
>embedded zero is treated as the end of string). So, bare string
>or wstring are not very good data types for binary data.
>
>If you use [array] of type octet (or unsigned short, or unsigned
>int) then this maps well for C++. But, for JS this maps to a JS
>array of int values (and in JS very large or small ints - ints
>that need more than 31 significant bits - become doubles). This
>is workable, but not compact.
>
>Another option is "wstring with size". In idl you do this like:
>
>  void sendBlock(in PRUint32 size,
>                 [size_is(size)] in wstring block);
>
>This allows for blocks of chars with embedded nulls allowed. In
>JS this becomes a single JSString (where, again embedded nulls
>are allowed). You can use 'string' or 'wstring'. But with JS the
>'chars' in the 'string' get expanded to 16 bit anyway - so it
>uses twice as much space when converted for JS. For compactness
>in either language "wstring with size" is a pretty good option.
>Though, it is ugly in that it uses a string type for what is
>really binary data.

I still need to test this a bit. Especially, I am unsure if this works when
OnDataAvailable() is called multiple times (lots of data).
Status: NEW → ASSIGNED
Whiteboard: [wip for fix] → [fixinhand]
Doh, I immediately noticed that responseBody() method was buggy. I have fixed
that in my tree.

Some words about how this was implemented.

The most difficult part was to be able to look at the stream necko is providing
us without clearing it. I copied the principle from parser for this. We
basically have a callback that gets called when there is data. We copy the data
and say we did not consume it, and then pass it on to the parser. Here is one
catch: if OnDataAvailable() gets called multiple times I do not know if this
works - in the worst case the parser steals the input stream the first time it
sees it or something. If that were the case, this would break. More testing needed.

Then there is the question of what is text. I chose to implement a simple test
that checks the data passed in by necko and if there are zeros in it we assume
it is not a string. I do not know what happens if we are loading an XML file in
Unicode, though. If we pass the "string test", we copy data into the
responseText member and then to the caller. In other words, if nobody asks for
responseText we don't pay the storage price. Right now I am caching the
responseText, but that could be changed so that we would get it from
responseBody every time. It would save space, but would slow things down. The
usual tradeoff.

Finally, the responseBody() method. I have a small helper class that takes care
of the buffer. It would probably be best to store the buffer on disc so that we
wouldn't need to pay the memory penalty, but right now it is a simple memory
buffer. As jband pointed out there are several ways responseBody could have been
defined. I chose "string" because it was simplest to do, and it is efficient
with C++.

Finally the patch contains lots of places where I changed an assignment to a
plain constructor call, which saves a few instructions per use.
Ideally this bug should be fixed at the same time as bug 45552 and bug 45627.
Fixes for those bugs are attached into bug 55627. Testcases for all of these
bugs are attached to that bug as well, but here is a link to a good one:

http://bugzilla.mozilla.org/showattachment.cgi?attach_id=21078

I am pretty happy with the latest patch in this bug except for one thing: it
crashes every time in nsIUnicodeDecoder::Convert() (at last with the long
testcase). If I have shorter testcase, it crashes somewhere else. So far I have
no idea what is going on (probably something simple and I am blind to my own
mistake).

I am no longer storing mResponseText at all, I create it from responseBody every
time.

I have tested with a long document, and we correctly get called by necko (the
data comes in about 4k chunks in my tests, but there is some strange logic in
necko which determines buffer sizes).

I haven't yet tested with Unicode and images.
Ok, I think I'm finally gonna try and get this checked in. There is still one
major fault in the implementation that I can see and it is the fact that we do
not do charset detection if it was not an XML document. I want to open a new bug
on that and attack that later with ftang or someone more knowledgeable in the
issue...

I attached a Unicode test case, but I think XMLHttpRequest does not send it as
Unicode to the server (at least I am seeing normal char buffer when we get the
document back from the server).

So if someone could show me how to have the server really send a response in
Unicode as well as an image (or provide testcases) it would be helpful (my CGI
script writing has rusted badly...).
Oh crap, the latest diff again contains stuff it shouldn't (about serializer).

Anyway, vidur looked my patch over, and I fixed some things:

- Try to detect the charset from HTTP headers for responseText
- Use realloc
- Release listener as soon as we are done
- Replace illegal chars for responseText instead of bailing out

I also looked at nsIByteBuffer, but it reads from stream which means I cannot
use it.

responseBody is method which is different from IE, so that will need some more
thought (I don't know how to make it an attribute). Maybe we should leave it out
from the interface until we see a need for it?

It might be that the ReadSegments() is not correct, I will need to discuss this
with Rick Potts and/or necko people.

While testing I have found out that we always try to parse the response as XML.
I made the server send "text/plain" but if the data was XML I still got the
document. If I used mime type "image/jpeg" the send() call never returned.
Just thinking aloud, feel free to ignore... What should we do if a) a connection
was never made or b) the connection broke unexpectedly? In other words, if we
failed to connect to server or we got a network timeout. If the user then asks
for responseText/Body, should we just give out the data we got with NS_OK
return, or with NS_SUCCESS_PARTIAL (or some such) or throw an exception?
Some comments:
About resposeBody: this is never used in IE5, since its data type (byte array) 
is not accessible from JScript (in current implementations). Also, on the web, 
ASCII/UTF/XML formats make more sense. All browser use of XMLHTTP in IE5 works 
with text.

About network errors:
This is what IE5 does: if it's a HTTP error, responseText(and of course 
responseXML) is "undefined" (no real exception/error, you just get this back, 
same as uninitialized ECMAScript strings, I believe). The result (HTTP code) 
and resultText (HTTP error body) properties will be populated though.

Is there's a error on the TCP level, a "Catastrophic error" box will appear, 
and script execution will halt.

RE: Bob Clary's comments: to my knowledge, if the content-type of the HTTP 
respose is not text/xml, IE5 will not try to parse it, and the property will be 
null.(Yes, even if the file is legal XML. Parsing happens during transfer, not 
by "invoking" the responseXML property.) Agree with all you comments.
I discussed with harishd, and he said the parser will always copy the whole
buffer that it gets in OnDataAvailable(), so my remaining question has been
answered, I think. Now to find reviewers...
The patch does not seem to follow the naming convention. Other than that 
r=harishd.
The patch is following the general format of the whole file, and I'd rather not
spent too much time right now on changing the format to conform to the generic
guidelines. Most of the XMLExtras code fails to follow the convention that
function arguments begin with 'a'.
r=rpotts.  I've reviewed the nsIStreamListener portions of the patch :-)

-- rick
The reason I am using my own buffer storage implementation is that there is no
way to append char* data to an nsCString - it accepts only PRUnichar and
different Unicode string classes with conversion.
nsCString inherits nsAWritableCString which does have Append(char *aData,
PRUint32 aLength) so you should be able to use that, even if the string contains
embedded nulls, if not, talk to scc and see what could be done about the
problems you find.

As for the patch as a whole (modulo using nsCString and not implementing your
own buffer class) I think the patch looks good, a couple of silly nits tho:

Please fix the inconsistent code indentation at the end of
nsXMLHttpRequest::DetectCharset()

nsXMLHttpRequest::OnStartRequest() and ::OnStopRequest() use 4 space
indentation, what's up with that? :-)

Could you investigate the nsCString issue and attach a new patch, explain why
you can't use nsCString?
Ah, so it does. But it does not work if the string contains nulls.
nsCString::Length() returns 0 if the string contains nulls. I stepped in to the
Append() method and it seems the length is getting lost in there. I have opened
bug 70337 against scc to look into that. In the meanwhile, I'd like to check
this in with jst's suggestions, and when scc fixes the string I'll remove my own
buffer. I'll attach a patch.
r=jst
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Reporter, please download the latest build and verify fix.
Marking verified per last comments.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: