Closed Bug 120836 Opened 24 years ago Closed 23 years ago

XMLHttpRequest.responseText fails when character set indicated in headers

Status:

VERIFIED FIXED

Milestone:

mozilla1.0

People

(Reporter: matthew, Assigned: hjtoi-bugzilla)

Details

(Keywords: intl, regression)

Attachments

(5 files, 2 obsolete files)

Fix, needs testing and cleanup (obsolete), 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 3.33 KB, patch
annotations.asc.xml, 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 1.31 KB, text/xml
annotations.xml, 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 1.33 KB, text/xml
test asc, 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 2.14 KB, text/html
test other, 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 2.14 KB, text/html
Cleaned up, tested, proposed fix (obsolete), 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 3.70 KB, patch, harishd : review+, jst : superreview+
Removed unneeded free, 23 years ago, Heikki Toivonen (remove -bugzilla when emailing directly), 3.67 KB, patch, scc : review+, scc : superreview+, scc : approval+

Matthew Wilson

Reporter

Description

•

24 years ago

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:0.9.7) Gecko/20011221
BuildID: 2001122106

When an XMLHttpRequest object retrieves an XML document whose character set is indicated in the HTTP headers, but not in the XML declaration, reading responseText can fail.

Reproducible: Always

Steps to Reproduce:
I have an XML document encoded in us-ascii, containing characters with codes above 127. It indicates this by sending a content type of "text/xml; charset=us-ascii". The XML declaration does not indicate an encoding. I retrieve this document via an XMLHttpRequest.

Actual Results:
A call to responseText returns an UNEXPECTED_FAILURE.

Expected Results:
The call to responseText should return the text of the document (allowing for any JavaScript restrictions on character sets etc.; I don't know what they are).

Adding a bit of debug output to XMLHttpRequest, it looks as though it is getting the character set from the nsIDocument at http://lxr.mozilla.org/seamonkey/source/extensions/xmlextras/base/src/nsXMLHttpRequest.cpp#370 which returns a character set of UTF-8 (even though this is not specified in the XML). As a result, the code that gets the character set from the HTTP headers at http://lxr.mozilla.org/seamonkey/source/extensions/xmlextras/base/src/nsXMLHttpRequest.cpp#374 is never reached. Hence the document is read as UTF-8 (which it is not) and the conversion fails.

I've taken the document, removed all private information (I hope!) and placed it at http://crashonline.org.uk/test/annotations.asc.xml. A "GET" XMLHttpRequest for this URL should show the failure. As a comparison, the same document at http://crashonline.org.uk/test/annotations.xml indicates the character set in the XML declaration.
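The failure mode described above can be illustrated outside the browser. The following is a minimal Python sketch, not the Mozilla code: it reads the charset parameter from a Content-Type value and shows that a body containing a byte above 127 is rejected by a strict UTF-8 decode. The helper name charset_from_content_type and the sample body (with byte 0xE9 standing in for the accented character) are assumptions made for illustration; the real test document's bytes are not shown in this report.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the lowercased charset parameter of a Content-Type value.

    Hypothetical helper for illustration; returns `default` when the
    header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(default)


# Hypothetical body: byte 0xE9 stands in for a character code above 127.
body = b"<name>R\xe9millard</name>"

# The header from the report names the charset explicitly.
header_charset = charset_from_content_type("text/xml; charset=us-ascii")

# Ignoring the header and decoding strictly as UTF-8 fails, because
# 0xE9 starts a multi-byte sequence that the next byte does not continue.
try:
    body.decode("utf-8")
    strict_utf8_ok = True
except UnicodeDecodeError:
    strict_utf8_ok = False
```

Here header_charset comes back as the value the server sent, while the strict UTF-8 decode raises, which mirrors the conversion failure reported for responseText.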

Matthew Wilson

Reporter

Comment 1

•

24 years ago

I was looking at working on a patch for this. It would decide whether to call DetectCharset based on the value from GetDocumentCharsetSource. However the current charset seems to get set near the end of nsXMLDocument::StartDocumentLoad http://lxr.mozilla.org/seamonkey/source/content/xml/document/src/nsXMLDocument.cpp#633 which doesn't set the character set source (at least not in the document). Is that a fault?
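The idea in this comment, consulting the HTTP headers only when the document's charset did not come from an explicit declaration, might look roughly like the Python sketch below. The CharsetSource values and the pick_charset function are hypothetical names for illustration only; they are not the constants or methods used in the Mozilla tree, and the sketch deliberately models only the "defaulted" case rather than a full precedence order.

```python
from enum import Enum


class CharsetSource(Enum):
    """Hypothetical record of where a document's charset came from."""
    DEFAULT = "default"                  # nothing declared, fallback value
    HTTP_HEADER = "http-header"          # Content-Type charset parameter
    XML_DECLARATION = "xml-declaration"  # encoding="..." in the XML prolog


def pick_charset(doc_charset, doc_source, header_charset):
    """Use the header charset only if the document's charset is a fallback.

    Returns a (charset, source) pair. An explicitly declared document
    charset is left alone; a defaulted one is replaced when the HTTP
    headers supply a value.
    """
    if doc_source is CharsetSource.DEFAULT and header_charset:
        return header_charset, CharsetSource.HTTP_HEADER
    return doc_charset, doc_source
```

With a document whose charset is a defaulted UTF-8 and a header that says us-ascii, pick_charset returns the header value; with an encoding taken from the XML declaration, the header is not consulted. This only works if the source is recorded when the charset is set, which is the gap the comment asks about.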

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

24 years ago

Keywords: intl, nsbeta1+

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Updated

•

24 years ago

Priority: -- → P2

Target Milestone: --- → mozilla1.0

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 2

•

23 years ago

Actually I think this is a duplicate. We don't support HTTP headers for XML yet, so it would be small wonder if this worked.

*** This bug has been marked as a duplicate of 93218 ***

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → DUPLICATE

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 3

•

23 years ago

Hmm... after some more testing I realized this is not a dupe. We *should* try to give out some text. Fix coming up.

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 4

•

23 years ago

Attached patch Fix, needs testing and cleanup (obsolete) — Details — Splinter Review

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 5

•

23 years ago

Attached file annotations.asc.xml — Details

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 6

•

23 years ago

Attached file annotations.xml — Details

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 7

•

23 years ago

Attached file test asc — Details

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 8

•

23 years ago

Attached file test other — Details

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 9

•

23 years ago

Attached patch Cleaned up, tested, proposed fix (obsolete) — Details — Splinter Review

This might have been a regression, since I believe the scanner code was changed in response to a change in the decoders, but XMLHttpRequest was not updated. Now XMLHttpRequest again follows the scanner and things seem to work.

What the code does is try to convert the text to the requested character set. If it finds errors (like characters that are illegal in that set), it replaces those characters with U+FFFD, which will show as '?' in the browser, and continues until the whole buffer has been converted.

Note that I don't seem to be able to see the accented e in Remillard. I see ? both in normal document load and now also using XMLHttpRequest. I do see the correct letter in view source, though.

Another thing to note is that when testing http://www.mozilla.org/xmlextras/xgetinvalid.html we do not completely clear the document tree before the parsererror element. Something strange is going on there.

These two observations are different bugs that I will file later.
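The replace-and-continue behaviour described in this comment can be sketched in Python as follows. This is an illustration of the technique only, not the patch attached to this bug; decode_lenient is a hypothetical name, and the sample bytes (0xE9 for the accented letter) are an assumption.

```python
def decode_lenient(data: bytes, charset: str) -> str:
    """Decode `data`, replacing each undecodable span with U+FFFD.

    On a decode error, keep the text converted so far, emit one
    replacement character, skip the offending bytes, and carry on
    until the whole buffer has been converted.
    """
    out = []
    pos = 0
    while pos < len(data):
        try:
            out.append(data[pos:].decode(charset))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into data[pos:].
            out.append(data[pos:pos + err.start].decode(charset))
            out.append("\ufffd")
            pos += err.end
    return "".join(out)


# Hypothetical input: 0xE9 is not a legal us-ascii byte.
text = decode_lenient(b"R\xe9millard", "ascii")
```

For this input the result matches Python's built-in bytes.decode(charset, errors="replace"), which would be the idiomatic choice in practice; the explicit loop is shown only to make the "replace and continue" step visible.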

Attachment #75100 - Attachment is obsolete: true

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 10

•

23 years ago

Yes, confirmed that this is a regression. It works fine in NS 6.2.1 which is based on 0.9.4.

Keywords: regression

harishd

Comment 11

•

23 years ago

Comment on attachment 75255
Cleaned up, tested, proposed fix

>+ if (!outBuffer) {
>+ nsMemory::Free(outBuffer);

This doesn't make sense :).

Attachment #75255 - Flags: review+

Johnny Stenback (:jst)

Comment 12

•

23 years ago

Comment on attachment 75255
Cleaned up, tested, proposed fix

What harishd said, sr=jst

Attachment #75255 - Flags: superreview+

Scott Collins

Comment 13

•

23 years ago

Comment on attachment 75255
Cleaned up, tested, proposed fix

You need to address the reviewers' comments before approval; that is, get rid of the inappropriate |Free|.

Attachment #75255 - Flags: needs-work+

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 14

•

23 years ago

Attached patch Removed unneeded free — Details — Splinter Review

The only difference to the previous patch is the removed free.

Attachment #75255 - Attachment is obsolete: true

Scott Collins

Comment 15

•

23 years ago

Comment on attachment 75403
Removed unneeded free

a=scc, and bringing forward previously good r and sr

Attachment #75403 - Flags: superreview+

Attachment #75403 - Flags: review+

Attachment #75403 - Flags: approval+

Heikki Toivonen (remove -bugzilla when emailing directly)

Assignee

Comment 16

•

23 years ago

Checked in.

Status: REOPENED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

Chris Petersen

Comment 17

•

23 years ago

Changing QA Contact

QA Contact: petersen → rakeshmishra

Rakesh Mishra

Comment 18

•

23 years ago

Verified on the trunk build 2002-05-07-08-trunk on Windows 2000.

Status: RESOLVED → VERIFIED