Closed Bug 563820 Opened 14 years ago Closed 14 years ago

server fails to accept passwords with utf8 encoded characters

Categories

(Cloud Services Graveyard :: Server: Sync, defect)

x86
macOS
defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dwalkowski, Assigned: telliott)

Details

Attachments

(1 file)

Steps to reproduce:

* use weave client add-on to change your password to something containing a non-ascii character.  I tried paßßword, and also pássword.  
* the changed password is accepted, and is stored correctly in the password manager of Firefox.
* disconnect from weave
* you can no longer log in, from any client. (I tested the iphone client also, it was denied too)
addendum: the server returns a 401
hmm, this used to be a problem, but was fixed a while ago.
Flags: blocking-weave1.3+
Summary: Weave 1.3 BLOCKER: server fails to accept passwords with utf8 encoded characters → server fails to accept passwords with utf8 encoded characters
Les, this is also happening on the stage server. Can you make sure the new reg code handles UTF-8 passwords correctly?
Starting to try working through this, but are we sure this worked before I started hacking on it?

The first thing is that I can't *create* a new account using "paßßword" as a password - not on any of dev, stage, or prod

But, I haven't touched the code that seems to be problematic. The code accepting a PUT to create a new account uses json_decode(), and that turns up NULL when it gets this JSON from the Weave Sync client:

{"password":"paflflword","email":"lorchard@mozilla.com","captcha-challenge":null,"captcha-response":null}

The PHP tests run fine with a password of "paßßword", but they send this JSON generated by PHP's json_encode():

{"password":"pa\u00df\u00dfword","email":"test@test.com"}

If I skip json_encode() and build the JSON by hand in PHP, I can send this (also accepted):

{"password":"paßßword","email":"test@test.com"}

So... the literal ASCII string "\u00df" in JSON is "ß" in Unicode, and the byte sequence 0xC3 0x9F is "ß" in UTF-8 encoding.

But, Weave Sync is sending the byte 0xDF for "ß", which looks like it's not UTF-8 encoded and is instead raw Unicode.

My Unicode-fu is not strong, so I'm a little confused here whether Weave Sync is doing the wrong thing, or whether PHP's json_decode() is broken.

Still looking into the password change failure case...
(In reply to comment #4)

> But, Weave Sync is sending the byte 0xDF for "ß", which looks like it's not
> UTF-8 encoded and is instead raw Unicode.

Oh wait, 0xDF is also "ß" in ISO-8859-1, so fix_utf8_encoding() on the PHP side seems like it should be turning this into 0xC3 0x9F but isn't.  Still digging...
So, it looks like mb_detect_encoding() wasn't quite detecting encodings.  It kept claiming that an ISO-8859-1 string was UTF8 and so the fix never happened.  Here are some patches that seem to fix things on my dev box:

http://hg.mozilla.org/users/lorchard_mozilla.com/weaveserver-sync-patches/file/93a8420363e5/utf8_enc_fix

http://hg.mozilla.org/users/lorchard_mozilla.com/weaveserver-registration-patches/file/f6eca5432d42/utf8_enc_fix

http://hg.mozilla.org/users/lorchard_mozilla.com/weaveserver-registration-admin-patches/file/00ffb81fcf9f/utf8_enc_fix

If this worked in the past, it might have been a php.ini setting changed since the default detect_order for mb_detect_encoding() appears to be sourced from there.

In the future (2.0 maybe?), it seems like it would be good to require that the client send a Content-Type header with charset to explicitly declare what encoding to expect.  Attempting to detect encodings for Unicode / ISO-8859-1 / UTF-8 leads to madness.
Attachment #443879 - Flags: review? → review?(telliott)
Comment on attachment 443879 [details] [diff] [review]
UTF-8 fix in registration (should apply to sync & registration-secure)

shouldn't hurt anything, but I have no idea why this would actually do anything!
Attachment #443879 - Flags: review?(telliott) → review+
(In reply to comment #8)
> shouldn't hurt anything, but I have no idea why this would actually do
> anything!

It seemed like, at least on my laptop, that ISO-8859-1 wasn't in the default order for encoding detection.  This should explicitly stick it in there.  Seems weird though.

But, just pushed this patch to sync, registration, and registration-secure:

http://hg.mozilla.org/labs/weaveserver-sync/rev/24bba735d86d
http://hg.mozilla.org/labs/weaveserver-registration/rev/cd22f3da185e
http://hg.mozilla.org/labs/weaveserver-registration-secure/rev/7ec9bd414c38
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: