Whine to console when a form is submitted using an encoding that can't represent all of Unicode

RESOLVED FIXED in mozilla12

Status

Core
HTML: Form Submission
--
enhancement
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: hsivonen, Assigned: hsivonen)

Tracking

Trunk
mozilla12
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

5 years ago
See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-November/033991.html

When Web apps that solicit textual input from users transfer the input to the server using an encoding that can't encode all of Unicode, user input (including names of people) may break.

To call Web author attention to this problem, Firefox should whine to console when a form is submitted using an encoding that can't encode all of Unicode and, therefore, can corrupt user input.
(Assignee)

Comment 1

5 years ago
Created attachment 582281 [details] [diff] [review]
Whine to console about form submission encodings that can't encode all of Unicode

GB18030 can encode all of Unicode, so it is excluded from the whining. UTF-16 is mapped to UTF-8 before the whining check.
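
For illustration only, the rule stated in this comment could be sketched roughly as follows in Python. This is not the code from the attached patch; the names `FULL_UNICODE_ENCODINGS`, `UTF16_LABELS`, `submission_encoding`, and `should_whine` are hypothetical, and the label sets are deliberately minimal.

```python
# Hypothetical sketch of the check described above; not the patch itself.

# Encodings treated as able to represent all of Unicode (no warning).
FULL_UNICODE_ENCODINGS = {"utf-8", "gb18030"}

# Labels that are mapped to UTF-8 before the check is made.
UTF16_LABELS = {"utf-16", "utf-16le", "utf-16be"}


def submission_encoding(label: str) -> str:
    """Return the normalized encoding used for submission (UTF-16 becomes UTF-8)."""
    name = label.strip().lower()
    return "utf-8" if name in UTF16_LABELS else name


def should_whine(label: str) -> bool:
    """Return True if submitting a form in this encoding should log a warning."""
    return submission_encoding(label) not in FULL_UNICODE_ENCODINGS


for label in ("UTF-8", "UTF-16", "GB18030", "windows-1252", "Shift_JIS"):
    print(label, should_whine(label))
```

Under these assumptions, only the last two labels in the loop would produce a warning.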
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
(Assignee)

Comment 2

5 years ago
Created attachment 582852 [details] [diff] [review]
Whine to console about form submission encodings that can't encode all of Unicode
Attachment #582281 - Attachment is obsolete: true
Attachment #582852 - Flags: review?(bugs)

Comment 3

5 years ago
Comment on attachment 582852 [details] [diff] [review]
Whine to console about form submission encodings that can't encode all of Unicode

This can be a bit noisy, but is an important thing.
Attachment #582852 - Flags: review?(bugs) → review+
(Assignee)

Comment 4

5 years ago
Thanks for the r.

https://hg.mozilla.org/integration/mozilla-inbound/rev/f4f47800d2ff
Flags: in-testsuite+

Comment 5

5 years ago
https://hg.mozilla.org/mozilla-central/rev/f4f47800d2ff
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla12

Comment 6

5 years ago
(In reply to Henri Sivonen (:hsivonen) from comment #0)
> See
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-November/033991.html
> 
> When Web apps that solicit textual input from users transfer the input to
> the server using an encoding that can't encode all of Unicode, user input
> (including names of people) may break.
Actually, this is a problem of poorly written server-side software, as the reply says. If the encoding cannot represent a character, the character is converted to a character reference. Server-side software can then restore the original character. If the form encoding is changed to UTF-8, the server-side software (and all existing data) needs to be updated anyway. I doubt this deserves a warning.
(Assignee)

Comment 7

5 years ago
(In reply to Masatoshi Kimura [:emk] from comment #6)
> If the form encoding is changed to UTF-8,
> server-side software (and all existing data) needs to be updated anyway.

Yes, the server needs fixing, too. It doesn't make sense to fix just the form.

Maybe the wording needs to be tweaked to make this clear?
(Assignee)

Comment 8

5 years ago
Oh, and the server cannot unambiguously repair a submission in which something has been converted to character references, because it can't tell whether substrings that look like character references are user input or artifacts of lossy conversion.
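
A minimal Python sketch of that ambiguity, assuming a form submitted as windows-1252 and using Python's `xmlcharrefreplace` error handler as a stand-in for the browser's fallback to decimal character references (the helper name `submit_field` is hypothetical):

```python
def submit_field(text: str, encoding: str) -> bytes:
    """Encode a field value, replacing unencodable characters
    with decimal character references such as &#9731;."""
    return text.encode(encoding, errors="xmlcharrefreplace")


# The user typed a snowman (U+2603), which windows-1252 cannot encode.
from_snowman = submit_field("\u2603", "windows-1252")

# The user literally typed the seven characters "&#9731;".
from_literal = submit_field("&#9731;", "windows-1252")

# Both inputs yield identical bytes, so the server cannot tell them apart.
print(from_snowman, from_literal, from_snowman == from_literal)
```

The same two inputs stay distinct when the form is submitted as UTF-8, because the snowman is then encoded directly rather than replaced.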