Bug 546330 - http Digest Authentication fails to re-request the page with non-ASCII username
Status: NEW
Keywords: intl
Product: Core
Classification: Components
Component: Networking: HTTP
Version: unspecified
Platform: x86_64 Windows 7
Importance: -- normal with 1 vote
Assigned To: Nobody; OK to take it and work on it
: Patrick McManus [:mcmanus]
Depends on: 41489
Blocks: 1212727
Reported: 2010-02-15 13:20 PST by Alois Reisinger
Modified: 2016-02-04 23:59 PST


Description Alois Reisinger 2010-02-15 13:20:06 PST
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.2) Gecko/20100115 Firefox/3.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; de; rv:1.9.2) Gecko/20100115 Firefox/3.6

With HTTP digest authentication, a browser is supposed to retry the request (see section 3.2.2 of RFC 2617), passing an Authorization header line.
This works in general. But if a username (no password) containing a special character such as the German umlaut "ä" (ISO-8859-1 code 228, hex E4) is entered, the browser acts as if the authentication had been canceled by the user.

Reproducible: Always

Steps to Reproduce:
0. Prepare with an HTTP sniffing tool like Fiddler
1. Point your Firefox to
2. Enter one special umlaut character into the username textbox
3. Hit Enter
Actual Results:  
Firefox does not retry the request with the supplied username

Expected Results:  
Firefox should retry the request with the authentication line

see for all the source code behind this microsite.
Source code of .htaccess and the index.php is provided.
I used Fiddler to check whether any requests were made.
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2010-02-15 18:03:48 PST
Sounds like a likely confusion between UTF-8 and ISO-8859-1 somewhere...  Honza, do you know anything about this offhand?
Comment 2 Boris Zbarsky [:bz] (still a bit busy) 2010-02-15 18:06:24 PST
When I try the steps in comment 0 I get back a page that says "died" and nothing else.  Is that indicating that the bug was observed?
Comment 3 Boris Zbarsky [:bz] (still a bit busy) 2010-02-15 18:17:30 PST
Apparently yes, to answer my question from comment 2.

A log in a debug build per shows this: 

-1606793440[90a4e0]: nsHttpDigestAuth::GenerateCredentials [challenge=Digest realm="Test - test",qop="auth",nonce="4b79fe27f35ff",opaque="293906c87ba9c07bbcc539da83dde298"]
-1606793440[90a4e0]:    nonce_count=00000001
-1606793440[90a4e0]:    cnonce=01b6730aae57c007
-1606793440[90a4e0]: WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /Users/bzbarsky/mozilla/vanilla/mozilla/netwerk/protocol/http/src/nsHttpDigestAuth.cpp, line 358
-1606793440[90a4e0]: nsHttpChannel::OnAuthCancelled [this=1ec456f0]

Relevant code:

356   authString.AssignLiteral("Digest username=");
357   rv = AppendQuotedString(cUser, authString);
358   NS_ENSURE_SUCCESS(rv, rv);

Looks like AppendQuotedString just can't handle non-ASCII as written.  It bails out because non-ASCII chars test true for <= 31 (since they're negative, as signed chars).  But if it didn't it would just create an invalid header anyway, per the comment at the end of it.
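
To illustrate the signed-char point above, here is a minimal sketch. It is not the actual AppendQuotedString code; the function names are hypothetical and only model a "<= 31" control-character test applied to a byte held in a signed char versus an unsigned char.

```cpp
#include <cassert>

// Hypothetical model of a control-character test done on a signed char.
// Bytes >= 0x80 become negative after the cast, so they also satisfy "<= 31".
bool failsControlCheckSigned(unsigned char byte) {
    signed char c = static_cast<signed char>(byte);
    return c <= 31;
}

// The same test on the unsigned value only rejects real control characters.
bool failsControlCheckUnsigned(unsigned char byte) {
    return byte <= 31;
}

// Usage example: 0xC3 is the first UTF-8 octet of "ä".
void example() {
    assert(failsControlCheckSigned(0xC3));     // rejected as if it were a control char
    assert(!failsControlCheckUnsigned(0xC3));  // accepted when compared as unsigned
    assert(!failsControlCheckSigned('a'));     // plain ASCII passes either way
}
```

Comparing as unsigned only removes the early bail-out; as noted above, the resulting header would still carry bytes outside what a quoted-string is defined to hold.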
Comment 4 Alois Reisinger 2010-02-15 22:03:03 PST
just added a line to the to indicate if the bug was fixed and to clarify the expected result.
New expected: output of "bug fixed" after the entering of an umlaut (ä)
New actual result: output of "died"

Please refer to the
Comment 5 Honza Bambas (:mayhemer) 2010-02-16 13:56:19 PST
nsHttpDigestAuth::AppendQuotedString seems to expect ISO-8859-1 as input, but it gets a UTF-8 encoded string and walks each distinct octet of it.  It breaks not because the value of ä is e4, but because ä is encoded as two UTF-8 octets, c3 a4, and the first one is what fails that condition.
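
The two byte sequences mentioned above can be reproduced with a small sketch. The helper names encodeUtf8 and encodeLatin1 are made up for illustration and are not Gecko APIs; the UTF-8 encoder assumes a valid Unicode scalar value as input.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 octets.
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: ISO-8859-1 can only represent U+0000..U+00FF,
// and each such code point maps to the single byte with the same value.
bool encodeLatin1(char32_t cp, std::string& out) {
    if (cp > 0xFF) return false;
    out += static_cast<char>(cp);
    return true;
}

// Usage example for U+00E4 ("ä").
void example() {
    std::string latin1;
    assert(encodeLatin1(0xE4, latin1));
    assert(latin1 == std::string("\xE4"));            // one octet: e4
    assert(encodeUtf8(0xE4) == std::string("\xC3\xA4")); // two octets: c3 a4
}
```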

When I look into the patch for bug 41489, it seems it changes lossy convert to regular convert in the basic auth code.  Digest code seems to use regular convert already.

As I read RFC 2047, we should probably use 'encoded-word' for this that includes character set (UTF-8 in our case), encoding (probably 'q') and the encoded string.  I'm just not sure if "An 'encoded-word' may replace a 'text' token" applies also in case of a 'quoted-string' that is by def a 'text' except the '"' char.
Comment 6 Boris Zbarsky [:bz] (still a bit busy) 2010-02-16 14:03:47 PST
Oh, I see.  This bug is about digest auth, not basic auth.  Gotcha.

So 2047-encoding is the only way to stuff these chars into this header, no?
Comment 7 Honza Bambas (:mayhemer) 2010-02-16 14:08:26 PST
If there is no conversion from UTF-8 to ISO Latin-1 (I expect not) after which we may use actual quotation of the string, then I cannot see another option at the moment, but I'm still googling.
Comment 8 Honza Bambas (:mayhemer) 2010-02-16 14:18:05 PST
Taking back: An 'encoded-word' MUST NOT appear within a 'quoted-string'.

Read more:
Comment 9 Honza Bambas (:mayhemer) 2010-02-16 14:26:03 PST
My other idea is to follow the solution from bug 41489 comment 84: first, send just the lower byte of each character (lossy convert, quoted), then try to send it in UTF-8 (quoted).
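
As a rough sketch of what "send just the lower byte" means, assuming the username is held as UTF-16 code units: the helper name lossyLowByte is hypothetical and this is not the conversion routine Gecko actually uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: keep only the low 8 bits of every UTF-16 code unit.
// For U+0000..U+00FF this yields the ISO-8859-1 byte; anything above is mangled.
std::string lossyLowByte(const std::u16string& in) {
    std::string out;
    out.reserve(in.size());
    for (char16_t unit : in) {
        out += static_cast<char>(unit & 0xFF);
    }
    return out;
}

// Usage example.
void example() {
    // U+00E4 ("ä") survives as the single Latin-1 byte e4.
    assert(lossyLowByte(std::u16string(1, char16_t(0x00E4))) == std::string("\xE4"));
    // U+0161 ("š") loses its high byte and silently turns into 0x61 ("a").
    assert(lossyLowByte(std::u16string(1, char16_t(0x0161))) == "a");
}
```

The second case shows why this can only be a first attempt: names outside Latin-1 collide with unrelated ASCII names, so a UTF-8 retry would still be needed.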

It's not said anywhere how to say what character encoding is used for a quoted-string or how to encode username in the Authorization header.

I'm going to check how some server implementations deal with non-ASCII chars in the username.
Comment 10 Honza Bambas (:mayhemer) 2010-02-17 06:29:34 PST
Alois: how did you actually set the username on the server?

bz: for example, htdigest for Apache configuration takes the username directly from the command line and computes HA1 directly from the bytes taken from a terminal (a native encoding).  mod_auth_digest.c takes the de-quoted string as is and computes HA1.  I'm no expert on passing non-ASCII characters from a terminal to arguments, but it seems to me that clients must know the encoding used to generate the password on the server, which is obviously impossible.
Comment 11 Honza Bambas (:mayhemer) 2010-02-17 06:37:03 PST
Actually, I can see two problems here:

1. We have to send the username= portion in the Authorization header. It is by definition a quoted-string, which is unable to carry any encoding information (comment 8), and I haven't found any encoding requirement for the header or for a quoted-string in any RFC.

2. We have to compute the HA1 hash from the byte representation of the user name, and it is not said anywhere which character set or encoding the name should be in before we pass it to the hash function.

Issue 1 also affects basic auth, which has to base64-encode the user name's byte representation, but it is not said which charset it has to be in. That is actually the reason why bug 41489 is still not fixed.
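
To make problem 2 concrete: RFC 2617 section 3.2.2.2 defines A1 as username ":" realm ":" password, and HA1 is the MD5 of those bytes. The sketch below only builds the A1 byte string (the MD5 step is left out) and shows that the hash input already differs between the two encodings. buildA1 is a hypothetical helper, the realm is taken from the log in comment 3, and the password "pw" is made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: concatenate the already-encoded bytes into A1.
// HA1 would be MD5(A1); it is omitted here to keep the sketch self-contained.
std::string buildA1(const std::string& userBytes,
                    const std::string& realm,
                    const std::string& passwordBytes) {
    return userBytes + ":" + realm + ":" + passwordBytes;
}

// Usage example: the same username "ä" in two encodings.
void example() {
    const std::string userLatin1 = "\xE4";      // ISO-8859-1: one octet
    const std::string userUtf8   = "\xC3\xA4";  // UTF-8: two octets
    const std::string realm = "Test - test";

    const std::string a1Latin1 = buildA1(userLatin1, realm, "pw");
    const std::string a1Utf8   = buildA1(userUtf8, realm, "pw");

    // Different input bytes, so client and server compute different HA1 values
    // unless both sides happen to pick the same encoding.
    assert(a1Latin1 != a1Utf8);
    assert(a1Utf8.size() == a1Latin1.size() + 1);
}
```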
Comment 12 Boris Zbarsky [:bz] (still a bit busy) 2010-02-17 06:48:29 PST
So I guess the question is whether we're trying to make non-ASCII usernames "work" or whether we're just trying to make sure we don't fall back to showing the 401 body even if the user enters a non-ASCII username...  The latter is what comment 0 is about, strictly speaking...
Comment 13 Michal Novotny (:michal) 2010-02-17 07:16:13 PST
Regarding #c10, we could IMHO assume that native encoding on the apache server is UTF-8. At least this is true for most linux distros (already for several years). So we can send the username as quoted UTF8. FYI this is what Opera does.
Comment 14 Michal Novotny (:michal) 2010-02-17 07:21:17 PST
Btw. it is also a question how to handle non-ASCII passwords. They aren't sent over the wire, but it is a question in which form they should be hashed. Non-quoted UTF-8?
Comment 15 Alois Reisinger 2010-02-17 13:18:07 PST
In fact, I would recommend two steps.
First, the violation of RFC 2617 should be corrected.
I think it's better to send "somehow" converted characters back to the server and retry the request as stated in the RFC. In this (admittedly very bad) case, someone could implement their own (or the correct) algorithm on the server side. At least there would be a chance to. Right now there is no chance at all.

BTW, in my implementation I go fully UTF-8, as most sites do nowadays. This is why I think switching to UTF-8 as the default (like Opera) would also be a very good choice.

Second, I'd recommend some extension to the RFC itself, like a charset parameter in the WWW-Authenticate line and in the Authorization line.
I thought I had read something like this in an RFC draft somewhere, but I can't find it yet. But I guess this Bugzilla entry is the wrong place to discuss this. Maybe one of the fellow readers has the right connection to the correct place...
Comment 16 Honza Bambas (:mayhemer) 2010-02-17 13:35:22 PST
Alois, good summary, UTF-8 is a good way to go here, also supported by comment 13.

bz, do you agree?  Adding a new error code is not a simple way to go if we want this fix for a branch, since it needs new strings.
Comment 17 Boris Zbarsky [:bz] (still a bit busy) 2010-02-17 16:05:10 PST
> BTW, in my implementation i go fully UTF-8

Note that doing that on our end violates RFC 2617...

I guess that's where the extension comes in?
Comment 18 Honza Bambas (:mayhemer) 2010-02-18 10:57:32 PST
(In reply to comment #17)
> > BTW, in my implementation i go fully UTF-8
> Note that doing that on our end violates RFC 2617...

Because quoted-string is supposed to be in 8859-1 charset?

> I guess that's where the extension comes in?

Use auth-param for this?

As I read RFC 2617, it seems we should hash directly the quoted string (with potential =NN in it) but w/o the leading and trailing '"'.  That's something apache is not doing...  Seems like something is really missing in that RFC.

To move to a conclusion: do we want to support non-ASCII chars (>127) in user names?  Do we want to at least try something that works with most server implementations, which seem to differ as well?  WONTFIX this?
Comment 19 Boris Zbarsky [:bz] (still a bit busy) 2010-02-18 11:04:06 PST
> Because quoted-string is supposed to be in 8859-1 charset?


> Seems like something is really missing in that RFC.

Yes, like provisions for anything other than West European characters.

Have we tried contacting the relevant IETF folks about this?
Comment 20 Alois Reisinger 2010-02-22 22:26:27 PST
Just for my information:
Is anyone taking care of this? (contacting the cool guys at IETF ;-)
Comment 21 Honza Bambas (:mayhemer) 2010-02-23 06:59:33 PST
I'll do that soon.
Comment 22 Alois Reisinger 2010-04-21 13:28:18 PDT
Just to keep it on our minds....   any news on that?
Comment 23 Honza Bambas (:mayhemer) 2010-05-16 06:42:43 PDT
(In reply to comment #19)
> Have we tried contacting the relevant IETF folks about this?
I did on  No answer from the admin.  Any suggestions about a forum to post to?
Comment 24 Boris Zbarsky [:bz] (still a bit busy) 2010-05-17 06:55:03 PDT
No idea.  I would have thought the IETF would answer its mail....
Comment 25 Honza Bambas (:mayhemer) 2010-05-17 06:58:50 PDT
I got an answer from Henrik Nordstrom: feel free to propose an extension to both the basic and digest auth schemes.  I heard there were some already proposed or drafted.  I'll look for them, or let's propose one ourselves.
Comment 26 Julian Reschke 2010-05-17 07:03:44 PDT
(In reply to comment #24)
> No idea.  I would have thought the IETF would answer its mail....

The IETF is a loose connection of people. There's no "IETF" entity one sends email to.

That being said the feedback from the HTTPbis WG IMHO is clear (and has been for some time); you'll need to extend the authentication schemes.
Comment 27 Patrick McManus [:mcmanus] 2016-02-04 13:49:36 PST
possible this should be wontfix
Comment 28 Julian Reschke 2016-02-04 23:59:30 PST
is supposed to answer this.
