Closed Bug 307049 Opened 19 years ago Closed 19 years ago

XMLHttpRequest seems to try to parse the empty (!) body of the response to an HTTP HEAD request

Categories

(Core :: XML, defect, P5)

x86
Windows XP
defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: martin.honnen, Assigned: peterv)

References

()

Details

Attachments

(1 file)

The test case at
<http://home.arcor.de/martin.honnen/mozillaBugs/XMLHttpRequest/HTTPHEADTest1.html>
uses XMLHttpRequest to make asynchronous HEAD requests for an XML (text/xml) and
for an HTML (text/html) document.
The HEAD (!) request for the XML document seems to trigger an attempt to parse
something empty as XML as Mozilla displays

Error: no element found
Source File:
http://localhost/javascript/mozillaBugs/XMLHttpRequest/test2005090402.xml
Line: 3, Column: 1
Source Code:
^

in the JavaScript console.

Tested with a recent Firefox trunk nightly (Mozilla/5.0 (Windows; U; Windows NT
5.1; en-US; rv:1.9a1) Gecko/20050828 Firefox/1.6a1).

I first thought this is some kind of regression as Mozilla 1.7 releases don't
display the error message in the JavaScript console but on further investigation
it just looks as somewhere after 1.7 the Mozilla code has been changed to
display XML parse errors in the JavaScript console while the attempt to parse
something empty on a HEAD request might have gone unnoticed in 1.7.

The problem was reported in this newsgroup thread:
<http://groups.google.com/group/netscape.public.mozilla.xml/browse_frm/thread/7727bccb0a7f14e7/3b36a2f50640d892?hl=en#3b36a2f50640d892>

Expected behavior is to simply make the HEAD request and gives access to status,
status text and the response header but not to try to parse anything and display
a parse error.
The test case at
<http://home.arcor.de/martin.honnen/mozillaBugs/XMLHttpRequest/test2005090402.xml>
is a static text/xml resource which is not well-formed and lacks a root element.

When directly loaded in Mozilla 1.7.11 this displays the error "no element
found" in the browser window but no error in the JavaScript console while a
Firefox nightly displays that error in both the browser window as well as the
JavaScript console. So that is why I think the problem with XMLHttpRequest
trying to parse something after an HTTP HEAD request has probably existed in
Mozilla 1.7 but went unnoticed.
Boris, can you have a look? Thanks.
Attached patch v1Splinter Review
This seems to do the trick. However, there's more conditions when a request can
not return a body.
Assignee: xml → peterv
Status: NEW → ASSIGNED
An empty body is almost always allowed, so it's hard to detect when to avoid
parsing the response's body. Easy ones:
- HEAD request
- 1xx, 204, 304 response
- Content-Length 0

http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4 mentions the others.

It seems like we should hold off on hooking up the parser until the first call
to OnDataAvailable.
(In reply to comment #4)
> - HEAD request
> - 1xx, 204, 304 response

This code here will not see a 3xx response, btw.

> - Content-Length 0

Why should that not lead to a parse error? Clearly, the file exists, but is not
valid XML...

> It seems like we should hold off on hooking up the parser until the first call
> to OnDataAvailable.

I do think that an empty (200) response with an XML content type should show the
error.
I guess simply suppressing XML parse error reporting for XMLHttpRequest is not
really what we want, right?

I really do think that just not parsing for HEAD makes the most sense.
Comment on attachment 194857 [details] [diff] [review]
v1

Let's do this then.
Attachment #194857 - Flags: superreview?(bzbarsky)
Attachment #194857 - Flags: review?(cbiesinger)
Attachment #194857 - Flags: superreview?(bzbarsky) → superreview+
Attachment #194857 - Flags: review?(cbiesinger) → review+
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Priority: -- → P5
Target Milestone: --- → mozilla1.9alpha
I think I ran into this using nsIJSXMLHttpRequest in Firefox 2.0.0.11.  I put some logging in my code, and here's what I see:

----------
req.readyState = 2

Error: no element found
Source File: http://localhost/~robert/test.rss
Line: 1, Column: 1
Source Code:
^
----------

If I get rid of the if-modified-since code... it's only parsing if readyState == 4 AND status == 200.  Neither case is true.

Content-Type is application/xhtml+xml
For anyone else who encounters a similar problem, here's my workaround:

// Make it text/plain so there's no parsing ahead of it's time
request.overrideMimeType('text/plain');

... 

req.onreadystatechange = function (aEvt) {
  if(req.readyState == 4 && req.status == 200){
    var parser = new DOMParser();
    var dom = parser.parseFromString(theString, "text/xml");
    req.responseXML = dom;
    processResponse(req);
  } else if(req.status == 304){
    // No parsing since no data.
    processResponse(null);
  }
}


Not really eloquent, but does the job.  This gets around the xml parser firing when there's no response.
Robert, see comment 5.  Are you seeing something that contradicts what Christian said there?
Per: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5

"The 304 response MUST NOT contain a message-body, and thus is always terminated by the first empty line after the header fields."

When the server returns a 304 and no content I *am* getting an xml error.  xml parsing of null is silly since the header clearly indicates no payload.

I don't think the xml parser should be invoked on a 304 unless the xmlhttprequest is smart enough to use the response from cache.

In order to ensure a change in server config never busts an xmlHttpRequest one is required to change the content type and parse it themselves.  Technically not hard to do, but a notable caveat.
> When the server returns a 304 

The point is, 304 is handled by necko internally.  That is, on a 304 response necko reads the data from the cache and hands it back to the caller with a 200 status.

I suppose if we get 304 but there is nothing in the cache the 304 might make it out to the necko consumer...  But we shouldn't be setting the request headers that make a 304 response possible if the data is not in the cache.  So either necko is screwing up, the XMLHttpRequest consumer is setting headers they have no business setting, or the server is returning a 304 when it wasn't given if-modified-since.  Do you know which it is in your case?
I'm going to presume the closest match is:
XMLHttpRequest consumer is setting headers they have no business setting

Since I'm doing:
request.setRequestHeader("If-Modified-Since", lastModDate);

I understand the internals, but still don't agree with xmlHttpRequest throwing an xml error in this case.  A 304 by definition has no payload, hence no xml parsing should take place.  Regardless of the reason for the 304 response, it shouldn't be parsed.  It's up to the consumer to decide if the 304 is appropriate.
Perhaps a followup bug is in order to figure out what to do with non-2xx responses.  For example, should a 404 response be parsed?  This should be covered by the XMLHttpRequest spec, really.
Agreed.  I'll file a new bug for good measure.
For anyone who cares, it's bug 411060.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: