Bug 416178 - XMLHttpRequest posts set charset= in Content-Type header, breaking some webservers
Status: RESOLVED DUPLICATE of bug 918742
Product: Core
Classification: Components
Component: DOM
Version: 1.9.0 Branch
Hardware: All All
Importance: -- major
Assigned To: Nobody; OK to take it and work on it
QA Contact: Andrew Overholt [:overholt]
Duplicates: 455318 455850
Flags: jst: blocking1.9-
Reported: 2008-02-07 11:07 PST by Alexander Klimetschek
Modified: 2015-08-25 03:59 PDT


Description Alexander Klimetschek 2008-02-07 11:07:41 PST
User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; de; rv:1.9b3) Gecko/2008020511 Firefox/3.0b3
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; de; rv:1.9b3) Gecko/2008020511 Firefox/3.0b3

Firefox 3 beta 2 and beta 3 RC are (again) sending

Content-Type: application/x-www-form-urlencoded; charset=UTF-8

for a form POST. The "; charset=UTF-8" breaks many webservers, resulting in error or empty pages.

This is an old problem: appending the charset was previously deemed unworkable because the number of webservers that can't handle it is too large, and it was removed from earlier Mozilla/Gecko versions; see these bugs for example (there are many bugs on this topic):

I have already seen this (current) bug, which is about a similar problem with multipart forms:

Reproducible: Always

Steps to Reproduce:
1. Create an HTML page with a form with method=POST
2. Open this page with Firefox (3b3)
3. Submit the form and trace the HTTP request
Actual Results:  
This HTTP Header is part of the request:

Content-Type: application/x-www-form-urlencoded; charset=UTF-8

Expected Results:  
This is the lowest common denominator for webservers:

Content-Type: application/x-www-form-urlencoded

Many servers choke on header params in the form of "; key=value" in the Content-Type header.

I used my profile from FF 2.0 for the FF 3 beta, so I am a normal "upgrader" - no special configs involved, in case that matters for such an "experimental feature" ;-)
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2008-02-07 16:48:11 PST
Uh... do you actually have a form that breaks that you can point me to?  There is no code in beta3 to do this for urlencoded form submissions, so I don't see how it could possibly be happening.
Comment 2 Alexander Klimetschek 2008-02-07 16:55:47 PST
Oh, I forgot to mention that it happens when doing an XHR with the Prototype JavaScript lib. Haven't tested normal forms yet - but I thought it always happens ;-)

Here is a (hopefully) self-contained example:

    <form id="login" action="" method="post">
        <h2>Please Login to Post</h2>
        <input type="text" name="username" value="" class="login"/>
        <label for="username"><small>ISID</small></label>

        <input type="password" name="password" value="" class="login"/>
        <label for="password"><small>Password</small></label>
        <p class="postmetadata">
            <input class="link" type="submit" value="Submit" />
        </p>
    </form>
    <script src="/blog/js/prototype.js" type="text/javascript"></script>
    <script type="text/javascript">
        var login = $("login");
        login.onsubmit = function() {
            var url = login.action;
            new Ajax.Updater("formplaceholder", url, {
                evalScripts: true,
                method: "post",
                parameters: login.serialize(true),
                contentType: "application/x-www-form-urlencoded",
                encoding: ""
            });
            return false;
        };
    </script>
    <div id="formplaceholder"></div>
Comment 3 Boris Zbarsky [:bz] (still a bit busy) 2008-02-07 17:00:53 PST
Oh.  XHR.  That matters!

*** This bug has been marked as a duplicate of bug 413974 ***
Comment 4 Alexander Klimetschek 2008-03-11 12:31:33 PDT
Tested with Firefox 3.0 beta 4 (on Mac) and the fix for bug 413974 (which I assume should be fixed in beta 4) DOES NOT solve this problem.

I think marking this bug as a duplicate was wrong from the beginning.
- bug 413974 is about enctype multipart/form-data
- this bug is about enctype application/x-www-form-urlencoded
Comment 5 Boris Zbarsky [:bz] (still a bit busy) 2008-03-11 12:44:23 PDT
> DOES NOT solve this problem.

In that case, please put up a testcase showing the bug.  That is, a web page that I can go to to see the issue, as well as the source of the server code involved.  I did test exactly this situation when writing the patch for bug 413974, and in my testing it is fixed.

>- bug 413974 is about enctype multipart/form-data

The same codepath is used for both enctypes.
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2008-03-11 12:48:20 PDT
Oh, I should have clarified.  The new code WILL always send the charset.  It will put the charset right after the MIME type before all other params (instead of at the end of the MIME type, where it used to go).

I guess your original report claims that even this will break servers.  I suppose we could special-case this one particular MIME type in XMLHttpRequest.  That seems uncalled-for to me, but if the breakage is really that widespread we will have to.

Nominating for blocking, but honestly, I've seen no other reports of this being a problem.
Comment 7 Alexander Klimetschek 2008-03-11 15:13:26 PDT
Yes, it's the mere existence of "; charset=" that breaks the webserver.

I have already posted a sample HTML page in comment #2. But it's difficult to provide the webserver, because it's proprietary. The parsing bug is fixed there for a new version, but there are many large-scale installations out there. The problem is that parameters cannot be parsed at all for an XHR POST request, which can be tricky in Ajax-heavy applications. Updating the servers is not a quick option.

Unfortunately I wasn't able to find a public installation that has a POST form using XHR, but if there is one, this is an extremely difficult problem to spot. I also assume there are lots of other webserver implementations with the same problem; I guessed so from reading this older issue on the same topic: bug 7533 (interesting comments start at #34). I know it's over 8 years old (whew!), but I wouldn't be confident that this bug is gone from the web, especially since none of the other main browsers send the charset.
Comment 8 Alexander Klimetschek 2008-03-11 15:17:21 PDT
And here is a possible solution that keeps the feature but lets XHR developers opt out: don't add ";charset=utf-8" automatically if the XHR request only specifies a MIME type. But if the JavaScript code explicitly sets the content type to something including the charset, keep it (and do whatever the underlying code might do, e.g. reformatting or security checks).
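That opt-out rule can be sketched as a small pure function (the helper name is hypothetical; this is not Gecko code, only an illustration of the proposed behavior):

```javascript
// Sketch of the proposed opt-out rule (hypothetical helper, not Gecko's code):
// keep an author-supplied bare MIME type verbatim; only normalize a charset
// that the author explicitly included; never add one behind their back.
function finalContentType(authorValue, bodyEncoding) {
  if (authorValue === null) {
    // No author header at all: the browser may pick a sensible default.
    return "application/x-www-form-urlencoded";
  }
  if (/;\s*charset=/i.test(authorValue)) {
    // Author opted in to a charset: align it with the actual body encoding.
    return authorValue.replace(/charset=[^;]*/i, "charset=" + bodyEncoding);
  }
  // Author set a bare MIME type: send it untouched (the opt-out).
  return authorValue;
}
```

Under this rule, `finalContentType("application/x-www-form-urlencoded", "UTF-8")` would return the bare MIME type unchanged, matching the Safari behavior described in comment 9.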
Comment 9 Alexander Klimetschek 2008-03-11 15:31:51 PDT
Safari 3.0.4 behaves that way: if you don't specify "; charset=..." in the XHR content-type header, it won't add it. If you do, it will be passed through.
Comment 10 Boris Zbarsky [:bz] (still a bit busy) 2008-03-11 16:25:18 PDT
The whole point of adding a charset is that we need to identify the charset of the data, because otherwise neither the sender nor the receiver know what encoding the data is sent in.  We used to do what Safari 3.0.4 does.  It breaks sites, as it happens.
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2008-03-11 16:26:34 PDT
Note that the "breaks many web servers" claim could really use some backing up....
Comment 12 Alexander Klimetschek 2008-03-11 16:28:55 PDT
Firefox 3 beta 2, 3 and 4 never behaved like Safari 3.0.4. The charset was *always* appended, even if none was set by the JavaScript code in the XHR object.
Comment 13 Boris Zbarsky [:bz] (still a bit busy) 2008-03-11 16:48:30 PDT
Yes, but Firefox 2 behaved like Safari 3.0.4.
Comment 14 Alexander Klimetschek 2008-03-11 16:58:53 PDT
I see, didn't know that.

The problem of the unknown charset in browser requests is solved by web frameworks in different ways anyway (we, for example, use a hidden parameter FormEncoding...). It would be cool to rely on the charset in the Content-Type header as the standard, but it must be possible to avoid breaking existing servers that are buggy.
Comment 15 Jonas Sicking (:sicking) No longer reading bugmail consistently 2008-03-11 17:30:54 PDT
The spec says that when a string is sent it should always be UTF-8 encoded. In that case it seems like the receiver will always know the encoding, so sending it seems pointless.

The spec also says to append the charset parameter, but maybe we can get the spec changed on that. Mailing them now.
Comment 16 Boris Zbarsky [:bz] (still a bit busy) 2008-03-11 19:16:56 PDT
> it seems like the receiver will always know the encoding and so
> sending it seems pointless.

Only if the receiver knows that it's being sent an XMLHttpRequest and can special-case the text processing. Since this whole bug is about using XMLHttpRequest to generate generic form submissions, the receiver knows no such thing in the cases in this bug, or in many other cases. With forms, you can at least select the encoding the receiver expects if absolutely needed, but with XMLHttpRequest you always get UTF-8, so the only way to make it work is to tell the receiver so, and to fix all receivers to follow the specs instead of writing an "it happens to work" parser by trial and error.
Comment 17 Johnny Stenback (:jst) 2008-03-12 15:29:55 PDT
Per discussion with Jonas, we've decided that we won't change how Firefox works here. If there's evidence that a large set of real websites break due to this, we'd be willing to reconsider this decision, but with only one bug report and no real feel for the number of broken sites it doesn't seem worth undoing this change.
Comment 18 Alexander Klimetschek 2008-03-12 15:58:31 PDT
IMHO there aren't any bug reports yet because FF3 is still in beta and there isn't wide adoption yet.
Comment 19 Sean Fitzgerald 2008-06-11 16:29:09 PDT
I'll back up this report, and I'll even show you an example of it in action. 

The server is running helma, and echo's back the post object.

Comment 20 Sean Fitzgerald 2008-06-11 16:29:44 PDT
And yes, that should read "echoes".
Comment 21 Matthias Versen [:Matti] 2008-09-18 13:57:34 PDT
*** Bug 455318 has been marked as a duplicate of this bug. ***
Comment 22 Paul Downey 2008-09-19 04:19:25 PDT
We're being hit by this issue using proprietary enterprise systems, notably CA Siteminder and other security firewalls which have rules to reject Content-Type values with the charset asserted.

The current situation means we cannot use XMLHttpRequest to POST to such servers. We don't have access to the server code, some of which is in firmware and on machines beyond our influence, and without a workaround from JavaScript, users are being advised not to use Firefox 3.0 or 3.1.
Comment 23 Christian :Biesinger (don't email me, ping me on IRC) 2008-09-19 04:36:19 PDT
*** Bug 455850 has been marked as a duplicate of this bug. ***
Comment 24 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 04:39:59 PDT
Paul, have you considered raising your problem in the W3C group working on XMLHttpRequest?  I don't have a problem with adding a way for the page to opt-out of sending the charset header, but I'd be happier doing that in a way allowed by the spec instead of just doing something random.
Comment 25 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 04:50:50 PDT
I'm also having a hard time believing that these firewalls really reject _everything_ with a charset, since so much of the web carries charset params on Content-Types....
Comment 26 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 04:57:47 PDT
Paul, I'd also be interested in knowing your exact situation:  What you're setting Content-Type to, what type of object you're passing to send(), what header you're getting as a result, and what your firewall actually accepts and rejects.
Comment 27 Paul Downey 2008-09-19 06:43:33 PDT
Boris, thanks for replying.

I haven't contacted the W3C WG, but could if there's a spec issue. My reading of:

is that adding "; charset=..." is within specification but optional, and what there is of

doesn't demand asserting the charset, but that's not so clear.

From experiments with curl, and from diffing the working form POST headers (which are identical apart from the charset string) against the non-working XHR, I'm pretty convinced our issue is directly attributable to the Content-Type value not being *exactly* "application/x-www-form-urlencoded".

The difficulty with Firefox 3.0 and 3.1 appending the charset is that we have no mechanism to not send it. I'd suggest either reverting, or giving us an option to remove it, ideally from JavaScript, given that this behavior is not backwards compatible, nor exhibited by any other browser we can find.

Firefox's current behaviour is to assert a charset of UTF-8, regardless of whether one is missing or supplied with a different value. We'd like to be able to programmatically send no charset value. I'd also note that this new behavior introduced by Firefox 3.0 covers something the JavaScript caller can easily supply themselves, in a cross-browser way, if they choose to assert the charset as UTF-8.

I'd offer more detail on the actual implementations causing the issue, but they're security gateways and single sign-on services deployed inside a large enterprise, and as you can imagine, this is a sensitive area.
Comment 28 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 06:55:14 PDT
Paul, you're not looking at the latest spec version.  The latest one, cleverly not linked from anywhere useful, is at

For what it's worth, I raised the issue with the relevant working group.
Comment 29 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 07:01:03 PDT
In case you want to follow up, this is the '[XHR] Some comments on "charset" in the Content-Type header' thread in the mailing list.
Comment 30 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 07:02:14 PDT
And again for what it's worth, I think that getting the spec changed here would benefit from hard data that this is the only way to deal with the problem.  I realize that you may not be willing to provide this data in public, but the W3C has provisions for private communication of sensitive information for spec editors, I believe.
Comment 31 Richard Sears 2008-09-19 10:56:57 PDT
Our server (4d_WebStar_D/7.8) does not parse form variables when any charset value is added to Content-Type. So the Firefox 3 behavior of adding the charset broke the pages using multipart/form-data POST with XMLHttpRequest. Our code could not access the submitted form data.

I was able to workaround the problem by using sendAsBinary() instead of send() when browser is Firefox3.
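For illustration, the workaround from comment 31 might look roughly like this (the body-building helper, boundary and field names are made up; sendAsBinary() was a non-standard Firefox-only API and has since been removed, see comment 43):

```javascript
// Build a multipart/form-data body by hand, so the page controls the exact
// Content-Type header instead of letting Firefox 3's send() append
// "; charset=UTF-8". Illustrative sketch only; names are hypothetical.
function buildMultipartBody(fields, boundary) {
  var parts = [];
  for (var name in fields) {
    parts.push("--" + boundary + "\r\n" +
               'Content-Disposition: form-data; name="' + name + '"\r\n\r\n' +
               fields[name] + "\r\n");
  }
  // Closing delimiter ends the multipart message.
  return parts.join("") + "--" + boundary + "--\r\n";
}

// Usage sketch (browser only; sendAsBinary existed only in old Firefox):
// var xhr = new XMLHttpRequest();
// var boundary = "----bug416178boundary";
// xhr.open("POST", "/submit");
// xhr.setRequestHeader("Content-Type",
//                      "multipart/form-data; boundary=" + boundary);
// var body = buildMultipartBody({ username: "alex" }, boundary);
// if (xhr.sendAsBinary) xhr.sendAsBinary(body); else xhr.send(body);
```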
Comment 32 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 11:37:50 PDT
Richard, you need to fix your server to actually follow the HTTP specification....
Comment 33 Richard Sears 2008-09-19 12:47:44 PDT
Boris, unfortunately I don't have access to the source code that controls this.  We are using an old version and are unable to upgrade or switch systems right now. 
The workaround will keep us going for a while.
I guess the thing that bothers me about the Firefox3 behavior is that it changes a value specifically set using setRequestHeader().  That seems like an odd thing to do.
Comment 34 Boris Zbarsky [:bz] (still a bit busy) 2008-09-19 12:50:57 PDT
It's not that odd in cases where that value is inconsistent with other data we have, for what it's worth...  Say if you explicitly set a charset other than UTF-8, and we encode the data as UTF-8.

In any case, please take spec issues to the W3C?
Comment 35 Alexander Klimetschek 2008-09-21 04:18:51 PDT
(In reply to comment #34)
> It's not that odd in cases where that value is inconsistent with other data we
> have, for what it's worth...

I think this is just too much "magic". If you set request headers directly, you expect them to be used - even if you do it wrong, i.e. when the body actually contains a different charset.

What is the advantage of the current implementation? For most systems none, since they rely on different ways of passing the charset to the server (hidden form values etc.; in most cases this is UTF-8). So if they should switch to the proper way of putting the charset in the Content-Type header, they should be able to choose that on their own - when their servers and firewalls are ready to handle that format.
Comment 36 Boris Zbarsky [:bz] (still a bit busy) 2008-09-21 06:08:09 PDT
Alexander, you're saying that Mozilla should send malformed HTTP requests, violating the HTTP spec, just because the page author asked it to?  I don't think so.

> What is the advantage of the current implementation?

Much better functioning with cross-site XMLHttpRequest, where the server and the XMLHttpRequest caller are completely independent.
Comment 37 Alexander Klimetschek 2008-09-22 09:44:33 PDT
(In reply to comment #36)
> Alexander, you're saying that Mozilla should send malformed HTTP requests,
> violating the HTTP spec, just because the page author asked it to?  I don't
> think so.

Well, not setting the charset in the Content-Type header does not violate the HTTP spec. Also, using XHR as a page developer is more like using an HTTP client library than a user-driven browser, so you need ways to control what is actually sent from your code.
Comment 38 guille.rodriguez 2014-12-11 02:24:47 PST
The W3C spec explicitly states that charset is not allowed for application/x-www-form-urlencoded:
Comment 39 Boris Zbarsky [:bz] (still a bit busy) 2014-12-11 08:05:17 PST
None of the text you link to talks about the request headers involved.

Though it does suggest that servers shouldn't use the Content-Type request header for charset info.  However in practice some do...
Comment 40 guille.rodriguez 2014-12-11 08:49:36 PST
The linked spec says that the charset is not allowed for this MIME type. This would suggest that "application/x-www-form-urlencoded; charset=utf-8" is not a valid MIME type string.

Also, it seems that Firefox adds the charset even when the application explicitly sets the content type to "application/x-www-form-urlencoded" (no charset) via setRequestHeader(). The XMLHttpRequest spec specifies two cases where the user agent should modify the Content-Type when sending the XHR data:

1. If a Content-Type header is in author request headers and its value is a valid MIME type that has a charset parameter whose value is not a case-insensitive match for encoding, and encoding is not null, set all the charset parameters of that Content-Type header to encoding.

2. If no Content-Type header is in author request headers and mime type is not null, append a Content-Type header with value mime type to author request headers.

This does not include the case where the application has set the Content-Type _without_ a charset. Firefox still mangles the content-type in this case.
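As a rough illustration, the two quoted spec steps could be modeled like this (hypothetical helper, greatly simplified - it treats the header as a plain string rather than a parsed MIME type, which the real algorithm does not):

```javascript
// Simplified model of the two quoted spec steps (not Gecko's implementation).
// authorHeader: Content-Type from setRequestHeader(), or null if never set.
// mimeType:     the default MIME type derived from the body, or null.
// encoding:     the encoding the body was actually serialized in, or null.
function applySendSteps(authorHeader, mimeType, encoding) {
  if (authorHeader !== null) {
    // Step 1: only rewrite the header if it already carries a charset
    // parameter that disagrees with the actual body encoding.
    if (encoding !== null &&
        /;\s*charset=/i.test(authorHeader) &&
        !new RegExp("charset=" + encoding + "\\b", "i").test(authorHeader)) {
      return authorHeader.replace(/charset=[^;]*/gi, "charset=" + encoding);
    }
    // A header without any charset parameter passes through untouched.
    return authorHeader;
  }
  // Step 2: no author header at all; fall back to the body's MIME type.
  return mimeType;
}
```

Note how neither step covers appending a charset to an author-supplied header that has none - which is exactly the case this comment says Firefox still mangles.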
Comment 41 Michael Holroyd 2015-08-19 12:03:28 PDT
When using Amazon S3's signed PUT requests, this bug is causing the content-type header to not match what is required (in this case it should be the string "application/json" as in the signature, but is "application/json; charset=UTF-8" instead), and thus the request fails because the signatures do not match.

I don't understand why this is marked RESOLVED WONTFIX - this is obviously a bug, prevents sending perfectly reasonable requests that are within the spec, and does so by "magically" changing the headers from what was explicitly requested. At a minimum there must be a way to "opt-out" of this behavior.
Comment 42 Jonas Sicking (:sicking) No longer reading bugmail consistently 2015-08-19 12:57:19 PDT
Doesn't the spec define the behavior here?
Comment 43 Anne (:annevk) 2015-08-19 13:19:39 PDT
It does, in step 4, which is in line with what comment 40 outlines, though that comment points to a document we should not look at.

Removing the whiteboard comment since we removed sendAsBinary() so that no longer works as a workaround.

Reopening since we should tweak our behavior here per the specification, although I suggest that bz signs off on it since he largely instigated the whole charset business in the first place, iirc.
Comment 44 2015-08-24 05:39:31 PDT

I went through the entire thread. It's important to realize that in most cases it's impossible to convince the API provider to make changes like this, because they do not want to touch an existing stable system.
Our company caters to the BPO industry, which has all kinds of browsers in the enterprise market, and we vow to support all of them. But the API we are using has the same problem with the charset. It's very important that Mozilla resolves this as soon as possible. Safari, IE and Chrome all work fine; only Firefox has this charset issue.

I ask you not to leave us handicapped in this regard. For a start, may I ask how many reports on this thread you need before deciding this must be fixed? At least that would be a starting point on this issue.
Comment 45 Hallvord R. M. Steen [:hallvors] 2015-08-25 03:59:14 PDT
This is a known bug and we want to fix it - it's required for conformance with the spec. Note how several sub-tests here fail because Gecko adds too many ";charset=UTF-8" parameters:
(This is a test in the official W3C XMLHttpRequest test suite)

I think this is a dupe of bug 918742, which is about fixing failures on that test. Since it's a real world problem we should nudge the priority of that bug upwards, but unfortunately I don't know when somebody will get around to fixing it.

*** This bug has been marked as a duplicate of bug 918742 ***
