Closed Bug 382947 Opened 17 years ago Closed 17 years ago

XMLHttpRequest should only set the charset in the Content-Type request header if we know the charset

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sciguyryan, Assigned: sciguyryan)

References

Details

Attachments

(1 file, 3 obsolete files)

We shouldn't make assumptions about the charset for the reason as described by bz in bug 362043 comment 12.
Flags: blocking1.9?
Attached patch Patch v1 (obsolete) — Splinter Review
Patch attempt v1
Attachment #267006 - Flags: superreview?(peterv)
Attachment #267006 - Flags: review?(peterv)
Ideally we would also check whether a charset header already exists. Having two charset parameters only because the caller "did things right" is not nice.
Attachment #267006 - Flags: superreview?(peterv)
Attachment #267006 - Flags: review?(peterv)
Attached patch Patch v2 (obsolete) — Splinter Review
Patch v2

Same as patch v1 but also does as Wladimir suggested.
Attachment #267006 - Attachment is obsolete: true
Attachment #267011 - Flags: superreview?(peterv)
Attachment #267011 - Flags: review?(peterv)
Flags: in-testsuite?
I'm confused.  How could we ever not know the charset of strings coming from JavaScript?
To clarify. There are three options for the XMLHttpRequest.send parameter:

1. A plain string, AFAICT only the lower byte of the characters is interpreted in this case. This string could (and should) be the result of nsIScriptableUnicodeConverter.ConvertFromUnicode, consequently we don't know the charset.
2. A stream to be read from. Here the charset isn't known either.
3. A DOM document. That's the only case where we actually know the charset.
Note that for the 1 case the specification currently requires to encode as UTF-8. Not sure what case 2 is supposed to mean.
How should one send a string in a different encoding then? I have one extension that communicates with the server in windows-1251, that will be problematic with this change.
(In reply to comment #5)
> 1. A plain string, AFAICT only the lower byte of the characters is interpreted
> in this case. This string could (and should) be the result of
> nsIScriptableUnicodeConverter.ConvertFromUnicode, consequently we don't know
> the charset.

The code seems to be doing a conversion to UTF-8, are you saying that is not working?

(In reply to comment #7)
> Note that for the 1 case the specification currently requires to encode as
> UTF-8. Not sure what case 2 is supposed to mean.

I assume the charset parameter should be replaced in the Content-Type header set with setRequestHeader?
(In reply to comment #9)
> The code seems to be doing a conversion to UTF-8, are you saying that is not
> working?
> 

I think the point being argued here is that we just shouldn't set a charset parameter if we don't know the charset anyway... don't convert and just leave it alone as I understand it.

> (In reply to comment #7)
> I assume the charset parameter should be replaced in the Content-Type header
> set with setRequestHeader?
> 

We could do that too or just leave the header alone and change nothing - it was sent as being encoded in a specific charset so they probably want it encoded as such.
Flags: blocking1.9? → blocking1.9+
Note that comment 5 is wrong.  If we have a string, it's an honest to God Unicode string (UTF-16) that we convert to UTF8 and send as far as XMLHttpRequest is concerned.  If someone is passing in the result of ConvertToUnicode, then it will in fact be this bizarro "byte-inflate stuff" thing that XPConnect does with the non-intl-safe ACString.  People really shouldn't do that.  Really.

The issue Peter is pointing out with the patch is that when sending a string we _do_ want to send the UTF-8 charset.  Furthermore, I'd think we always want to set (or rather reset) the charset param in the upload stream content type for strings and documents (since there we _do_ know what encoding we're sending, and it's UTF-8 or the document encoding respectively).  For streams, we should leave it as-is.
Boris, I have an extension that has to communicate with the server in windows-1251. The only way I found to make this work was to pass XMLHttpRequest the result of ConvertToUnicode. And I absolutely agree with you that you shouldn't do that - but what *should* you do? Do I have to do bizarro "string byte-inflate to stream stuff" now?
You basically got lucky, in that all the bytes you passed in coverted to a single byte in UTF-8.  If you happened to use something that didn't do that, you'd get garbage on the server end, because what XMLHttpRequest is sending when you do that is:

1)  Take the UTF-16
2)  Convert to windows-1251
3)  Insert a zero byte every other byte
4)  Treat the result as UTF-16 and convert to UTF8
5)  Send the result of that.

I'm frankly rather surprised that you didn't hit any problems; you must not have been sending many (any?) non-ASCII chars.

> Do I have to do bizarro "string byte-inflate to stream stuff" now?

No, just use convertToInputStream() instead of ConvertToUnicode() on the scriptable converter.  Then pass in that input stream.  And set your charset on the channel (as I assume you already do).
(In reply to comment #13)
> I'm frankly rather surprised that you didn't hit any problems; you must not
> have been sending many (any?) non-ASCII chars.

Actually a lot, and this hack is already four years old (the method you suggest was added far later). No problems whatsoever as far as I know.
Comment on attachment 267011 [details] [diff] [review]
Patch v2

New patch coming up shortly.
Attachment #267011 - Flags: superreview?(peterv)
Attachment #267011 - Flags: review?(peterv)
Attached patch Patch v3.0 (obsolete) — Splinter Review
Patch v3.0

Sets the charset data information for everything but streams.
Attachment #267011 - Attachment is obsolete: true
Attachment #270716 - Flags: superreview?(peterv)
Attachment #270716 - Flags: review?(peterv)
Attached patch Patch v3.1Splinter Review
Patch v3.1

Fixed a small error in the last patch.
Attachment #270716 - Attachment is obsolete: true
Attachment #270717 - Flags: superreview?(peterv)
Attachment #270717 - Flags: review?(peterv)
Attachment #270716 - Flags: superreview?(peterv)
Attachment #270716 - Flags: review?(peterv)
Attachment #270717 - Flags: review?(peterv) → review?(bzbarsky)
Comment on attachment 270717 [details] [diff] [review]
Patch v3.1

Sure, but I basically suggested this fix, so I still think peterv should look at it.
Attachment #270717 - Flags: review?(bzbarsky) → review+
Attachment #270717 - Flags: superreview?(peterv) → superreview+
Keywords: checkin-needed
content/base/src/nsXMLHttpRequest.cpp 1.187
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Depends on: 397234
Depends on: 398280
This patch produce that upload of files don't work any more. Now boundary parameter is deleted from the header. Look at the bug https://bugzilla.mozilla.org/show_bug.cgi?id=393968
Depends on: 393968
Component: DOM: Mozilla Extensions → DOM
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: