Bug 708620 - Whine to console when a form is submitted using an encoding that can't represent all of Unicode
Status: RESOLVED FIXED
Product: Core
Classification: Components
Component: HTML: Form Submission
Version: Trunk
Platform: All All
Importance: -- enhancement
Target Milestone: mozilla12
Assigned To: Henri Sivonen (:hsivonen)
Reported: 2011-12-08 07:26 PST by Henri Sivonen (:hsivonen)
Modified: 2012-04-10 01:15 PDT
hsivonen: in‑testsuite+


Attachments
Whine to console about form submission encodings that can't encode all of Unicode (5.92 KB, patch)
2011-12-16 08:18 PST, Henri Sivonen (:hsivonen)
no flags
Whine to console about form submission encodings that can't encode all of Unicode (5.92 KB, patch)
2011-12-19 09:18 PST, Henri Sivonen (:hsivonen)
bugs: review+

Description Henri Sivonen (:hsivonen) 2011-12-08 07:26:42 PST
See http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-November/033991.html

When Web apps that solicit textual input from users transfer the input to the server using an encoding that can't encode all of Unicode, user input (including names of people) may break.

To call Web author attention to this problem, Firefox should whine to console when a form is submitted using an encoding that can't encode all of Unicode and, therefore, can corrupt user input.
Comment 1 Henri Sivonen (:hsivonen) 2011-12-16 08:18:41 PST
Created attachment 582281
Whine to console about form submission encodings that can't encode all of Unicode

GB18030 can encode all of Unicode, so excluding it from the whining. UTF-16 is mapped to UTF-8 before the whining check.
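For illustration only, the check described above could be sketched as follows in Python. This is not the code from the attached patch (which lives in Gecko); the names FULL_UNICODE_ENCODINGS, effective_submission_encoding, and should_warn are hypothetical, and the list of UTF-16 labels is an assumption.

```python
# Hypothetical sketch of the warning check; not the actual Gecko patch.

# Encodings that can represent every Unicode character, so no warning is needed.
FULL_UNICODE_ENCODINGS = {"utf-8", "gb18030"}

# Assumed set of UTF-16 labels; these are mapped to UTF-8 before the check.
UTF16_LABELS = {"utf-16", "utf-16le", "utf-16be"}


def effective_submission_encoding(label: str) -> str:
    """Return the encoding actually used for submission, given a charset label."""
    name = label.strip().lower()
    if name in UTF16_LABELS:
        return "utf-8"
    return name


def should_warn(label: str) -> bool:
    """True if submitting with this encoding can corrupt user input."""
    return effective_submission_encoding(label) not in FULL_UNICODE_ENCODINGS
```

Under these assumptions, should_warn("windows-1252") is true, while should_warn("GB18030") and should_warn("UTF-16") are both false.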
Comment 2 Henri Sivonen (:hsivonen) 2011-12-19 09:18:27 PST
Created attachment 582852
Whine to console about form submission encodings that can't encode all of Unicode
Comment 3 Olli Pettay [:smaug] 2011-12-29 16:08:13 PST
Comment on attachment 582852
Whine to console about form submission encodings that can't encode all of Unicode

This can be a bit noisy, but is an important thing.
Comment 4 Henri Sivonen (:hsivonen) 2012-01-02 06:47:28 PST
Thanks for the r.

https://hg.mozilla.org/integration/mozilla-inbound/rev/f4f47800d2ff
Comment 5 Marco Bonardo [::mak] 2012-01-03 03:42:50 PST
https://hg.mozilla.org/mozilla-central/rev/f4f47800d2ff
Comment 6 Masatoshi Kimura [:emk] 2012-04-08 16:24:54 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #0)
> See
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-November/033991.html
> 
> When Web apps that solicit textual input from users transfer the input to
> the server using an encoding that can't encode all of Unicode, user input
> (including names of people) may break.
Actually, this is a problem of poorly written server-side software, as the reply says. If the encoding cannot represent a character, the character is converted to a character reference, and server-side software can restore the original character from it. If the form encoding is changed to UTF-8, the server-side software (and all existing data) needs to be updated anyway. I doubt it deserves a warning.
Comment 7 Henri Sivonen (:hsivonen) 2012-04-10 01:14:29 PDT
(In reply to Masatoshi Kimura [:emk] from comment #6)
> If the form encoding is changed to UTF-8,
> server-side software (and all existing data) needs to be updated anyway.

Yes, the server needs fixing, too. It doesn't make sense to fix just the form.

Maybe the wording needs to be tweaked to make this clear?
Comment 8 Henri Sivonen (:hsivonen) 2012-04-10 01:15:58 PDT
Oh, and the server cannot unambiguously repair a submission where something has been converted to character references, because it can't tell whether substrings that look like character references are user input or artifacts of lossy conversion.
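
The ambiguity can be demonstrated with a minimal Python sketch. It assumes that Python's "xmlcharrefreplace" error handler is a reasonable stand-in for the browser's conversion of unencodable characters to decimal character references; the helper name submit_as is hypothetical.

```python
def submit_as(text: str, encoding: str) -> bytes:
    """Encode a form value, replacing unencodable characters with &#NNNN;.

    Approximates browser behavior using Python's xmlcharrefreplace handler.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


# The user typed a single snowman character (U+2603).
typed_snowman = submit_as("\u2603", "iso-8859-1")

# The user literally typed the seven characters "&#9731;".
typed_literal = submit_as("&#9731;", "iso-8859-1")

# Both submissions arrive at the server as identical bytes.
print(typed_snowman == typed_literal)
```

Because the two inputs produce the same bytes, the server has no way to tell which one the user actually entered.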
