Closed
Bug 382947
Opened 17 years ago
Closed 17 years ago
XMLHttpRequest should only set the charset in the Content-Type request header if we know the charset
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
People
(Reporter: sciguyryan, Assigned: sciguyryan)
References
Details
Attachments
(1 file, 3 obsolete files)
2.07 KB,
patch
|
bzbarsky
:
review+
peterv
:
superreview+
|
Details | Diff | Splinter Review |
We shouldn't make assumptions about the charset for the reason as described by bz in bug 362043 comment 12.
Flags: blocking1.9?
Assignee | ||
Comment 1•17 years ago
|
||
Patch attempt v1
Attachment #267006 -
Flags: superreview?(peterv)
Attachment #267006 -
Flags: review?(peterv)
Comment 2•17 years ago
|
||
Ideally we would also check whether a charset header already exists. Having two charset parameters only because the caller "did things right" is not nice.
Assignee | ||
Updated•17 years ago
|
Attachment #267006 -
Flags: superreview?(peterv)
Attachment #267006 -
Flags: review?(peterv)
Assignee | ||
Comment 3•17 years ago
|
||
Patch v2
Same as patch v1 but also does as Wladimir suggested.
Attachment #267006 -
Attachment is obsolete: true
Attachment #267011 -
Flags: superreview?(peterv)
Attachment #267011 -
Flags: review?(peterv)
Updated•17 years ago
|
Flags: in-testsuite?
Comment 4•17 years ago
|
||
I'm confused. How could we ever not know the charset of strings coming from JavaScript?
Comment 5•17 years ago
|
||
To clarify. There are three options for the XMLHttpRequest.send parameter:
1. A plain string, AFAICT only the lower byte of the characters is interpreted in this case. This string could (and should) be the result of nsIScriptableUnicodeConverter.ConvertFromUnicode, consequently we don't know the charset.
2. A stream to be read from. Here the charset isn't known either.
3. A DOM document. That's the only case where we actually know the charset.
Comment 7•17 years ago
|
||
Note that for the 1 case the specification currently requires to encode as UTF-8. Not sure what case 2 is supposed to mean.
Comment 8•17 years ago
|
||
How should one send a string in a different encoding then? I have one extension that communicates with the server in windows-1251, that will be problematic with this change.
Comment 9•17 years ago
|
||
(In reply to comment #5)
> 1. A plain string, AFAICT only the lower byte of the characters is interpreted
> in this case. This string could (and should) be the result of
> nsIScriptableUnicodeConverter.ConvertFromUnicode, consequently we don't know
> the charset.
The code seems to be doing a conversion to UTF-8, are you saying that is not working?
(In reply to comment #7)
> Note that for the 1 case the specification currently requires to encode as
> UTF-8. Not sure what case 2 is supposed to mean.
I assume the charset parameter should be replaced in the Content-Type header set with setRequestHeader?
Assignee | ||
Comment 10•17 years ago
|
||
(In reply to comment #9)
> The code seems to be doing a conversion to UTF-8, are you saying that is not
> working?
>
I think the point being argued here is that we just shouldn't set a charset parameter if we don't know the charset anyway... don't convert and just leave it alone as I understand it.
> (In reply to comment #7)
> I assume the charset parameter should be replaced in the Content-Type header
> set with setRequestHeader?
>
We could do that too or just leave the header alone and change nothing - it was sent as being encoded in a specific charset so they probably want it encoded as such.
Flags: blocking1.9? → blocking1.9+
Comment 11•17 years ago
|
||
Note that comment 5 is wrong. If we have a string, it's an honest to God Unicode string (UTF-16) that we convert to UTF8 and send as far as XMLHttpRequest is concerned. If someone is passing in the result of ConvertToUnicode, then it will in fact be this bizarro "byte-inflate stuff" thing that XPConnect does with the non-intl-safe ACString. People really shouldn't do that. Really.
The issue Peter is pointing out with the patch is that when sending a string we _do_ want to send the UTF-8 charset. Furthermore, I'd think we always want to set (or rather reset) the charset param in the upload stream content type for strings and documents (since there we _do_ know what encoding we're sending, and it's UTF-8 or the document encoding respectively). For streams, we should leave it as-is.
Comment 12•17 years ago
|
||
Boris, I have an extension that has to communicate with the server in windows-1251. The only way I found to make this work was to pass XMLHttpRequest the result of ConvertToUnicode. And I absolutely agree with you that you shouldn't do that - but what *should* you do? Do I have to do bizarro "string byte-inflate to stream stuff" now?
Comment 13•17 years ago
|
||
You basically got lucky, in that all the bytes you passed in coverted to a single byte in UTF-8. If you happened to use something that didn't do that, you'd get garbage on the server end, because what XMLHttpRequest is sending when you do that is:
1) Take the UTF-16
2) Convert to windows-1251
3) Insert a zero byte every other byte
4) Treat the result as UTF-16 and convert to UTF8
5) Send the result of that.
I'm frankly rather surprised that you didn't hit any problems; you must not have been sending many (any?) non-ASCII chars.
> Do I have to do bizarro "string byte-inflate to stream stuff" now?
No, just use convertToInputStream() instead of ConvertToUnicode() on the scriptable converter. Then pass in that input stream. And set your charset on the channel (as I assume you already do).
Comment 14•17 years ago
|
||
(In reply to comment #13)
> I'm frankly rather surprised that you didn't hit any problems; you must not
> have been sending many (any?) non-ASCII chars.
Actually a lot, and this hack is already four years old (the method you suggest was added far later). No problems whatsoever as far as I know.
Assignee | ||
Comment 15•17 years ago
|
||
Comment on attachment 267011 [details] [diff] [review]
Patch v2
New patch coming up shortly.
Attachment #267011 -
Flags: superreview?(peterv)
Attachment #267011 -
Flags: review?(peterv)
Assignee | ||
Comment 16•17 years ago
|
||
Patch v3.0
Sets the charset data information for everything but streams.
Attachment #267011 -
Attachment is obsolete: true
Attachment #270716 -
Flags: superreview?(peterv)
Attachment #270716 -
Flags: review?(peterv)
Assignee | ||
Comment 17•17 years ago
|
||
Patch v3.1
Fixed a small error in the last patch.
Attachment #270716 -
Attachment is obsolete: true
Attachment #270717 -
Flags: superreview?(peterv)
Attachment #270717 -
Flags: review?(peterv)
Attachment #270716 -
Flags: superreview?(peterv)
Attachment #270716 -
Flags: review?(peterv)
Assignee | ||
Updated•17 years ago
|
Attachment #270717 -
Flags: review?(peterv) → review?(bzbarsky)
Comment 18•17 years ago
|
||
Comment on attachment 270717 [details] [diff] [review]
Patch v3.1
Sure, but I basically suggested this fix, so I still think peterv should look at it.
Attachment #270717 -
Flags: review?(bzbarsky) → review+
Updated•17 years ago
|
Attachment #270717 -
Flags: superreview?(peterv) → superreview+
Assignee | ||
Updated•17 years ago
|
Keywords: checkin-needed
Comment 19•17 years ago
|
||
content/base/src/nsXMLHttpRequest.cpp 1.187
Comment 20•17 years ago
|
||
This patch produce that upload of files don't work any more. Now boundary parameter is deleted from the header. Look at the bug https://bugzilla.mozilla.org/show_bug.cgi?id=393968
Updated•12 years ago
|
Component: DOM: Mozilla Extensions → DOM
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•