multipart/form-data POST data should include charset for text controls




HTML: Form Submission
18 years ago
18 years ago


(Reporter: Adrian Havill, Assigned: Eric Pollmann)



Firefox Tracking Flags

(Not tracked)



(3 attachments)



18 years ago
HTML 4.01 says that all non-ASCII data should be POSTed via multipart/form-data
<URL:> and that
"User agents should supply the "Content-Type" header, accompanied by a "charset"
parameter." Lynx is one browser that does this.

This is necessary for server-side components that use a generic MIME processer
to process
non-ASCII data correctly.

Comment 1

18 years ago
Created attachment 18022 [details] [diff] [review]
Writes a Content-Type line with a charset iff charset != us-ascii

Comment 2

18 years ago
I agree with the overall idea of putting in the charset but nsFormFrame owners
and int'l folks should review this. ->pollmann(?)
Assignee: gagan → pollmann
Ever confirmed: true

Comment 3

18 years ago
Created attachment 18098 [details] [diff] [review]
updated patch (includes some suggested changes)

Comment 4

18 years ago
Some suggested changes included in the above patch:

1) Add logic to increment the calculated content length
2) Remove conversion of charset to cstring and instead use EqualsWithConversion

I haven't really looked at this patch in depth to determine if it would affect
existing pages.  Adrian, can you clarify any thoughts on that?

I'm also not sure how we will ensure that if this case is hit we switch to
posting using multipart/form-data if we were in the middle of a
application/x-www-form-urlencoded post was that kind of switching what the spec
meant to happen?
Component: Networking → Form Submission
OS: Linux → All
Hardware: PC → All

Comment 5

18 years ago
Regarding how it would affect existing pages, I seriously doubt it would affect
pages in
the wild because multipart/form-data POSTs a) rare and b) when they are used,
they are used
exclusively for uploading files, because script authors do not like the
additional complexity of MIME parsing, and there is no (current) benefit of
using multipart/form-data over application/x-www-form-urlencoded (there would be
if the charset was present).

I don't understand what was meant by "switching" to multipart/form-data if we
were in the
middle of a application/x-www-formurlencoded.

While we're at it, we should add the "Content-Transfer-Encoding: binary", as
detailed in
the HTML 4.01 example for file upload, to the case where the component is a
file, so that
authors that decide to use a premade MIME component won't have problems with the
8-bit data injected in the post causing the MIME parser to complain/fail.

Comment 6

18 years ago
> I don't understand what was meant by "switching" to multipart/form-data if we
> were in the middle of a application/x-www-formurlencoded.

I think that was just me misunderstanding your earlier description.  :)  If I
understand correctly now, this fix should only change things if the form
encoding is set to multipart/form-data.

In that case I'd agree that the risk is pretty low - not to many people out
there using multipart/form-data for text input.. :)

Comment 7

18 years ago
> While we're at it, we should add the "Content-Transfer-Encoding: binary", as
> detailed in the HTML 4.01 example for file upload. <snip>

This also sounds reasonable, though it would require quite a bit more testing -
it would be a miracle if we didn't hit a hand-rolled CGI unable to handle this
extra header before the data, but it could happen.  :)  Perhaps we should work
on this as a separate bug so that the two don't get tied together.

Comment 8

18 years ago
Content-Transfer-Encoding header suggestion moved to separate Bugzilla entry.


Comment 9

18 years ago
After looking at the patch and the suggestion to use EqualsWithConversion, I
noticed the ifs are missing a bang "!" operator so the effect is opposite...
only forms in us-ascii will get the charset applied.

Comment 10

18 years ago
Created attachment 18451 [details] [diff] [review]
fixed above suggested patch (refs to nsAutoString, us-ascii ignore logic)


18 years ago
Depends on: 65092


18 years ago
No longer depends on: 65092


18 years ago
Blocks: 65092

Comment 11

18 years ago
this bug has a patch which needs review. adding keywords.
Keywords: patch, review

Comment 12

18 years ago
Oops, this slid under the radar...  Thanks for the good catch (missing !).
r=pollmann for this change.  I'll try to get a super-review and then check in
this change.

Comment 13

18 years ago
moving to 0.9 if we really have a patch. Please set target milestone to Future 
if we don't have a solution
Target Milestone: --- → mozilla0.9

Comment 14

18 years ago
don't use this version of the patch as the patch will need to be patched again
to handle the rest of the things that need to be in the multipart/form-data
according to HTML 4-- there's so much to multipart/form-data in the standards
that isn't supported that this is only the tip of the iceberg.

Use the latest patch in bug 44464.

For better or worse, it's macroized so you can remove (for those that hate
macros, by manually deleting the macro lines or those that like them, via
setting the functionality you want to true (1) or false (0)) the HTML 4
compliance that you don't want (for those concerned about the lack of testing
against interoperability with legacy non-HTML4.01/non-RFC2388 compliant
file-upload scripts)

*** This bug has been marked as a duplicate of 44464 ***
Last Resolved: 18 years ago
Resolution: --- → DUPLICATE

Comment 15

18 years ago
Verified dup.
You need to log in before you can comment on or make changes to this bug.