Bug 857291 - SPNEGO / MS KRB5 no longer working. Tries to use NTLM SSP instead.
Status: VERIFIED FIXED
Keywords: regression
Product: Core
Classification: Components
Component: Networking: HTTP
Version: 20 Branch
Platform: x86 Windows 7
Importance: -- normal with 4 votes
Target Milestone: mozilla23
Assigned To: Patrick McManus [:mcmanus]
Duplicates: 857483 858049 858555 858647 860539
Blocks: 807678 858634
Reported: 2013-04-02 14:03 PDT by James Abbatiello
Modified: 2013-04-12 13:48 PDT

Attachments
v1 - backout of bug 807678 (112.73 KB, patch)
2013-04-04 13:22 PDT, Honza Bambas (:mayhemer)
no flags
[FOR REFERENCE ONLY] Reversed patch v1 to compare with patch from bug 807678 (112.64 KB, patch)
2013-04-04 13:29 PDT, Honza Bambas (:mayhemer)
no flags
patch of just DNS internals v0 (3.97 KB, patch)
2013-04-05 13:07 PDT, Patrick McManus [:mcmanus]
jaas: review+
bajaj.bhavana: approval-mozilla-aurora+
bajaj.bhavana: approval-mozilla-beta+
lukasblakk+bugs: approval-mozilla-release+

Description James Abbatiello 2013-04-02 14:03:18 PDT
User Agent: Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130326150557

Steps to reproduce:

SPNEGO is no longer working properly on some intranet websites that I use.  These worked fine with Firefox 19.0.2 but are now broken on 20.0.

I have packet captures with details but I don't want to post them publicly since I'm not confident in my ability to sanitize them of all private information.  I could send them privately to someone if needed.

Relevant about:config prefs (with domain name replaced with example.com):
network.automatic-ntlm-auth.allow-non-fqdn = true
network.automatic-ntlm-auth.trusted-uris = http://example.com, https://example.com
network.negotiate-auth.allow-non-fqdn = true
network.negotiate-auth.delegation-uris = http://example.com, https://example.com
network.negotiate-auth.trusted-uris = http://example.com, https://example.com
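
For anyone trying to reproduce this, the prefs above could also be placed in a profile's user.js instead of being set by hand in about:config. The following is only a sketch: the helper name `render_user_js` is made up for illustration, and example.com is the same placeholder domain used above.

```python
import json

# Prefs copied from the list above; example.com is the reporter's placeholder.
PREFS = {
    "network.automatic-ntlm-auth.allow-non-fqdn": True,
    "network.automatic-ntlm-auth.trusted-uris": "http://example.com, https://example.com",
    "network.negotiate-auth.allow-non-fqdn": True,
    "network.negotiate-auth.delegation-uris": "http://example.com, https://example.com",
    "network.negotiate-auth.trusted-uris": "http://example.com, https://example.com",
}


def render_user_js(prefs):
    """Hypothetical helper: render a dict of prefs as user.js lines."""
    # json.dumps produces JavaScript-compatible literals
    # (true/false for booleans, double-quoted strings).
    return "\n".join(
        f"user_pref({json.dumps(name)}, {json.dumps(value)});"
        for name, value in prefs.items()
    )


if __name__ == "__main__":
    print(render_user_js(PREFS))
```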



Actual results:

The server sends a 401 response with "WWW-Authenticate: Negotiate".  Firefox sends a new request with "Authorization: Negotiate <base64 data>".  The base64 payload is decoded by Wireshark as "NTLM Secure Service Provider".  The server doesn't like this and sends another 401.



Expected results:

In previous versions of Firefox the request that was sent is decoded by Wireshark as "GSS-API Generic Security Service Application Program Interface" with "SPNEGO - Simple Protected Negotiation" inside.  It advertises 4 MechTypes: MS KRB5, KRB5, NEGOEX, and NTLMSSP.  The auth succeeds using MS KRB5.
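
The difference between the two payloads can also be checked without Wireshark. The sketch below assumes only the general wire formats: a raw NTLMSSP message starts with the ASCII signature "NTLMSSP" followed by a NUL byte, while a GSS-API initial token starts with the byte 0x60 and, for SPNEGO, carries the OID 1.3.6.1.5.5.2 near the start. The function name and the returned labels are hypothetical, and invalid base64 is not handled.

```python
import base64

NTLMSSP_SIGNATURE = b"NTLMSSP\x00"
# DER encoding of the SPNEGO mechanism OID 1.3.6.1.5.5.2
SPNEGO_OID = bytes.fromhex("06062b0601050502")


def classify_negotiate_token(header_value):
    """Hypothetical helper: classify an 'Authorization' header value."""
    scheme, _, b64 = header_value.strip().partition(" ")
    if scheme.lower() != "negotiate" or not b64:
        return "not-negotiate"
    token = base64.b64decode(b64.strip())
    if token.startswith(NTLMSSP_SIGNATURE):
        # What Firefox 20 was sending in this report.
        return "raw-ntlmssp"
    # GSS-API initial context tokens begin with tag 0x60; the mechanism
    # OID follows the length bytes, so it sits within the first few bytes.
    if token[:1] == b"\x60" and SPNEGO_OID in token[:16]:
        return "gssapi-spnego"
    return "unknown"
```

Pasting the value of a captured `Authorization` header into this function should be enough to tell which of the two behaviours a given build shows, without sharing the capture itself.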
Comment 1 Matthias Versen [:Matti] 2013-04-02 15:43:10 PDT
One way to help would be to find the regression range.
We have a tool for the regression range search: http://mozilla.github.com/mozregression/
You have to specify a profile because you need your changed prefs; without one, the tool creates a new profile for each tested build.
Comment 2 Patrick McManus [:mcmanus] 2013-04-02 16:13:11 PDT
Crud. I wonder if this is a dup of 804605. I caused that regression and backed the code out, so it went off my list, but it's clear from the comment trail over there that the problem persisted for that reporter. My bad.

Honza, can you look at this? I can't easily test any of the Windows auth stuff and I don't really understand it deeply.
Comment 3 Honza Bambas (:mayhemer) 2013-04-02 16:44:12 PDT
I'll take a look.
Comment 4 James Abbatiello 2013-04-02 17:03:01 PDT
Running mozregression produced:

Last good nightly: 2012-07-26
First bad nightly: 2012-07-27

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=20db7c6d82cc&tochange=8b96a33ecbd2
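
For readers unfamiliar with how a range like this is found: the search is a bisection over nightly build dates. The sketch below is not mozregression's actual code. It is a generic illustration that assumes a build, once broken, stays broken, and it uses a stand-in predicate in place of actually downloading and testing a nightly.

```python
from datetime import date, timedelta


def bisect_nightlies(good, bad, is_bad):
    """Narrow a (good, bad) date pair down to two adjacent days.

    is_bad(day) stands in for manually testing that day's nightly.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return good, bad


if __name__ == "__main__":
    # Stand-in predicate matching the result reported above.
    first_bad = date(2012, 7, 27)
    print(bisect_nightlies(date(2012, 1, 1), date(2013, 4, 2),
                           lambda day: day >= first_bad))
```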
Comment 5 Patrick McManus [:mcmanus] 2013-04-02 17:07:17 PDT
Thanks Honza and James. James, that points at the code I suspected in comment 2. Unfortunately, I believe that was backed out. Hopefully Honza has an easy enough test setup to figure out where that went wrong.
Comment 6 James Abbatiello 2013-04-03 15:44:39 PDT
I ran a test with some logging turned on.  Nightly from 2012-07-26 reports this:

Resolving host [friendly.example.com].
DNS lookup for host [friendly.example.com] blocking pending 'getaddrinfo' query.
Calling getaddrinfo for host [friendly.example.com].
Lookup completed for host [friendly.example.com].
Using SPN of [HTTP/canonical.example.com]

It correctly finds the canonical name and everything works.  Then on 2012-07-27 things look a bit different.  The page no longer loads but the code still gets the canonical name:

Resolving host [friendly.example.com].
DNS lookup for host [friendly.example.com] blocking pending 'getaddrinfo' query.
Calling getaddrinfo for host [friendly.example.com].
Suspending the transaction, asynchronously prompting for credentials
Lookup completed for host [friendly.example.com].
nsHttpChannelAuthProvider::OnLookupComplete this=f286060 rv=0
nsHttpChannelAuthProvider::OnLookupComplete this=f286060 resolved to canonical.example.com

A nightly from 2012-12-24 looks much the same.  Then on 2012-12-25, it changes:

Resolving host [friendly.example.com].
DNS lookup for host [friendly.example.com] blocking pending 'getaddrinfo' query.
Suspending the transaction, asynchronously prompting for credentials
Calling getaddrinfo for host [friendly.example.com].
Lookup completed for host [friendly.example.com].
nsHttpChannelAuthProvider::OnLookupComplete this=124566a0 rv=0
nsHttpChannelAuthProvider::OnLookupComplete this=124566a0 resolved to friendly.example.com

It can no longer find the canonical name.  After the backout mentioned above the log looks much like the original but now without the canonical name:

Resolving host [friendly.example.com].
DNS lookup for host [friendly.example.com] blocking pending 'getaddrinfo' query.
Calling getaddrinfo for host [friendly.example.com].
Lookup completed for host [friendly.example.com].
Using SPN of [HTTP/friendly.example.com]

I think something else broke between 2012-12-24 and 2012-12-25 so when the backout happened CNAME resolution was still broken somehow.  The pushlog is http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d348dbf1dab4&tochange=dc2abccc2adb and http://hg.mozilla.org/mozilla-central/rev/7f5fad93ef78 for Bug 807678 seems a likely candidate, especially http://hg.mozilla.org/mozilla-central/rev/7f5fad93ef78#l13.50
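
To make the log lines above concrete: the SPN is built from the canonical name that getaddrinfo returns when asked for it. The sketch below uses Python's socket module, not Firefox's C++ DNS code, and assumes the lookup is made with AI_CANONNAME. The helper name `spn_for_host`, the injectable `resolver` parameter, and the fallback to the original host name are all assumptions made for illustration.

```python
import socket


def spn_for_host(host, service="HTTP", resolver=socket.getaddrinfo):
    """Hypothetical helper: build 'SERVICE/canonical-name' for a host."""
    # Without AI_CANONNAME the canonname field of the results is empty.
    results = resolver(host, None, flags=socket.AI_CANONNAME)
    # Each result is (family, type, proto, canonname, sockaddr); the
    # canonical name is carried in the first entry.
    canonname = results[0][3] if results else ""
    # Assumption: fall back to the name as typed if no canonical name
    # came back, which yields the broken HTTP/friendly.example.com case.
    return f"{service}/{canonname or host}"
```

With a resolver that returns the canonical name, this gives `HTTP/canonical.example.com` as in the 2012-07-26 log. With one that returns no canonical name, it gives `HTTP/friendly.example.com` as in the last log.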
Comment 7 Honza Bambas (:mayhemer) 2013-04-03 16:38:49 PDT
James, do you want to say that Nightly from 2012-12-24 doesn't suffer from this bug?


Check also https://bugzilla.mozilla.org/show_bug.cgi?id=804605#c35.
Comment 8 James Abbatiello 2013-04-03 17:28:19 PDT
Honza, no, 2012-12-24 does not work for me.  Everything that I've tested from 2012-07-27 or later is broken.

I was suggesting that perhaps there are two problems.  One was caused by
https://hg.mozilla.org/mozilla-central/rev/959f9da9f85e (2012-07-26)
and was backed out by
https://hg.mozilla.org/mozilla-central/rev/4a1188e7f538 (2013-01-22)

The other was perhaps caused by
https://hg.mozilla.org/mozilla-central/rev/7f5fad93ef78 (2012-12-23)
and is still present through today.

Since these date ranges overlap there was no window where things started working again.
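
The overlap argument can be checked mechanically. The sketch below takes the three changeset dates above at face value and assumes, per the hypothesis in this comment, that either change alone is enough to break authentication. It models changeset dates only, not the nightly build that first picked each change up.

```python
from datetime import date, timedelta

# Dates as given in this comment.
FIRST_REGRESSION = (date(2012, 7, 26), date(2013, 1, 22))  # landed, backed out
SECOND_REGRESSION_LANDED = date(2012, 12, 23)  # no backout at time of writing


def is_broken(day):
    """True if either suspected regression is in the tree on that day."""
    first_start, first_end = FIRST_REGRESSION
    first_active = first_start <= day < first_end
    second_active = day >= SECOND_REGRESSION_LANDED
    return first_active or second_active


def working_days(start, end):
    """List the days in [start, end] on which neither regression is active."""
    days = (start + timedelta(days=n) for n in range((end - start).days + 1))
    return [d for d in days if not is_broken(d)]


if __name__ == "__main__":
    # An empty list means there was never a working window after 2012-07-26.
    print(working_days(date(2012, 7, 26), date(2013, 4, 3)))
```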
Comment 9 Honza Bambas (:mayhemer) 2013-04-04 11:33:42 PDT
James, I've created an experimental build from which one of the suspected patches has been backed out:

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/honzab.moz@firemni.cz-24c4514bc842/try-win32/firefox-20.0.en-US.win32.installer.exe

Please try it and let me know. Thanks a lot for the help!
Comment 10 James Abbatiello 2013-04-04 11:42:03 PDT
Honza, that build works for me!
Comment 11 Honza Bambas (:mayhemer) 2013-04-04 11:49:10 PDT
James, thanks a lot!  I'll check how this works for bug 804605 and we may have a patch then :)
Comment 12 Honza Bambas (:mayhemer) 2013-04-04 13:22:47 PDT
Created attachment 733526 [details] [diff] [review]
v1 - backout of bug 807678

[Approval Request Comment]
Regression caused by (bug #): 807678
User impact if declined: Bug 804605 and this bug
Testing completed (on m-c, etc.): none, only checked by reporters using mozilla-release based custom build
Risk to taking this patch (and alternatives if risky): Need to be evaluated
String or IDL/UUID changes made by this patch: 
  IDL: nsISocketTransport, nsIDNSRecord, nsISOCKSSocketInfo
  Strings: none
Comment 13 Honza Bambas (:mayhemer) 2013-04-04 13:29:10 PDT
Created attachment 733528 [details] [diff] [review]
[FOR REFERENCE ONLY] Reversed patch v1 to compare with patch from bug 807678
Comment 14 Patrick McManus [:mcmanus] 2013-04-04 17:04:29 PDT
*** Bug 858049 has been marked as a duplicate of this bug. ***
Comment 15 Robert Longson 2013-04-05 07:20:23 PDT
*** Bug 858555 has been marked as a duplicate of this bug. ***
Comment 16 Patrick McManus [:mcmanus] 2013-04-05 11:45:11 PDT
*** Bug 857483 has been marked as a duplicate of this bug. ***
Comment 17 Matthias Versen [:Matti] 2013-04-05 11:49:49 PDT
*** Bug 858647 has been marked as a duplicate of this bug. ***
Comment 18 Patrick McManus [:mcmanus] 2013-04-05 13:07:54 PDT
Created attachment 734017 [details] [diff] [review]
patch of just DNS internals v0
Comment 19 Patrick McManus [:mcmanus] 2013-04-05 13:20:50 PDT
I've just started a build that will include the patch from comment 18. It will take an hour or two.

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-d83057e4e731

The build in comment 9 successfully identified the problem but touches a lot more code than we would ideally like to backport - this change is a potential candidate for backport. It still needs review and more importantly testing from someone impacted by the problem.

None of the devs on this bug have a setup to reproduce this, so we're relying on the reporters. Thank you :)

I have been able to reproduce a problem with the DNS canonicalization and this fix is based on that repro, but I haven't been able to do it in a Kerberos setting.
Comment 21 Andriy Syrovenko 2013-04-05 15:06:00 PDT
(In reply to Patrick McManus [:mcmanus] from comment #19)
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-d83057e4e731

This build works fine for me. Thanks.

Please note, however, that FF21 beta seems to be affected as well.
Comment 22 Patrick McManus [:mcmanus] 2013-04-05 15:30:40 PDT
(In reply to Andriy Syrovenko from comment #21)
> (In reply to Patrick McManus [:mcmanus] from comment #19)
> > http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mcmanus@ducksong.com-d83057e4e731
> 
> This build works fine for me. Thanks.
> 
> Please note, however, that FF21 beta seems to be affected as well.

Thanks! (Yes, 20->23 are all impacted. We'll take that into consideration when deciding where to land whatever the final patch turns out to be.)
Comment 23 Alex Keybl [:akeybl] 2013-04-05 16:34:13 PDT
Please immediately land on Aurora once this has landed on mozilla-central. We're going to build with a beta for bug 857672 on Monday EOD, so we'll need to make a decision on the best path forward for 20/21 before then.

(In reply to Honza Bambas (:mayhemer) from comment #12)
> User impact if declined: Bug 804605 and this bug

We need much more help with user impact here. Bug 804605 only has two votes and 11 CCs. If we were to decline this, our read is that this would impact a very small group of users who may want to consider using the ESR. Are we underestimating the impact?

Also placing a needinfo on Matt to see if he's seeing mentions of this on SUMO.

> Risk to taking this patch (and alternatives if risky): Need to be evaluated

Who is going to perform this evaluation? This needs to happen _immediately_

> String or IDL/UUID changes made by this patch: 
>   IDL: nsISocketTransport, nsIDNSRecord, nsISOCKSSocketInfo

I don't have a good sense for how many third party plugins may be impacted by this. I'm hoping Jorge can help with that evaluation.

What other options do we have that don't impact the IDL for FF20/21?
Comment 24 Matt Grimes [:Matt_G] 2013-04-05 16:46:50 PDT
I'm not seeing anything right now on SUMO or Input. Perhaps we'll see something over the weekend. I'll keep an eye on it and let you know.
Comment 25 Jorge Villalobos [:jorgev] 2013-04-05 16:59:39 PDT
All the changes I see happen in [noscript] functions, meaning that this would only impact binary add-ons, and we can't really know how many use these interfaces. Given that these are networking interfaces, I think there's a good chance something will break.

If this is going to land on beta, it should happen as early as possible. I would recommend not landing on beta if the impact of not landing is minor.
Comment 26 Honza Bambas (:mayhemer) 2013-04-05 18:22:20 PDT
(In reply to Alex Keybl [:akeybl] from comment #23)
> > Risk to taking this patch (and alternatives if risky): Need to be evaluated
> 
> Who is going to perform this evaluation? This needs to happen _immediately_

I believe Patrick's focused patch is what we will take on all branches.

Patrick feel free to obsolete the backout patch to prevent confusion.

> 
> > String or IDL/UUID changes made by this patch: 
> >   IDL: nsISocketTransport, nsIDNSRecord, nsISOCKSSocketInfo
> 
> I don't have a good sense for how many third party plugins may be impacted
> by this. I'm hoping Jorge can help with that evaluation.
> 
> What other options do we have that don't impact the IDL for FF20/21?

Again, Patrick's patch.
Comment 27 Patrick McManus [:mcmanus] 2013-04-05 19:13:40 PDT
(In reply to Matt Grimes, Mozilla SUMO (irc: Matt_G) from comment #24)
> I'm not seeing anything right now on SUMO or Input. Perhaps we'll see
> something over the weekend. I'll keep an eye on it and let you know.

Thanks Matt. Keywords would be kerberos, windows auth, integrated windows auth, spnego, maybe even ntlm.

The new patch touches much less code and no IDLs. Once it gets review I'll nom it back to 20 and Alex can decide where it should go based on what you've seen.
Comment 28 Matt Grimes [:Matt_G] 2013-04-06 14:20:05 PDT
Thanks Patrick. I've got one report each for Kerberos and NTLM on Input at this point. Still nothing on SUMO. I'll take a look again tomorrow, but I'm thinking this will be pretty low volume.
Comment 29 Patrick McManus [:mcmanus] 2013-04-08 09:03:16 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/b1f9f2bcaf16
Comment 30 Patrick McManus [:mcmanus] 2013-04-08 09:11:25 PDT
Comment on attachment 734017 [details] [diff] [review]
patch of just DNS internals v0

[Approval Request Comment]
Regression caused by (bug #): 807678
User impact if declined: A subset of users depending on "integrated windows authentication" to access either intranet servers or proxies will not be able to do so. The affected users will be using services that are set up with non-canonical DNS names. There isn't a workaround.
Testing completed (on m-c, etc.): Comments 20 and 21 contain validation of this change by end users affected by it. I was able to use a debugger to confirm that the internal behavior change is as desired, but nobody at Mozilla can test the end-user scenario fully due to lack of deployment of these enterprise auth systems.
Risk to taking this patch (and alternatives if risky): IDL risk is the only risk. This code is only used by Windows integrated auth, which is broken without the change.
String or IDL/UUID changes made by this patch: This does not change any IDLs or structures referenced by IDLs. It does change a structure that is in DNS.h, which is included by the IDL, but I do not believe that should be a compatibility problem. Josh, an sr, was asked to consider that in his review of the patch.
Comment 31 bhavana bajaj [:bajaj] 2013-04-08 10:14:39 PDT
Adding needinfo on Tyler and Matt to see if we have anything new on SUMO/Input here.
Comment 32 Tyler Downer [:Tyler] 2013-04-08 10:20:20 PDT
We haven't gotten anything on the SUMO forums; Matt may have seen data on Input.
Comment 33 Matt Grimes [:Matt_G] 2013-04-08 10:22:06 PDT
We have one more mention of Kerberos on Input over the weekend and that's it. I think visibility is extremely low.
Comment 34 Chris Kloosterman 2013-04-08 11:01:09 PDT
Note that a lot of users using integrated windows authentication are the same enterprise users that also use roaming profiles and appdata redirection.  As Firefox 20.0 is totally broken for these users at the moment because of bug 857672, IWA can't even be tested by them yet.
Comment 35 bhavana bajaj [:bajaj] 2013-04-08 11:50:06 PDT
Comment on attachment 734017 [details] [diff] [review]
patch of just DNS internals v0

Approving on beta/aurora. Although the impact is limited to a subset of users depending on "integrated windows authentication", there is no workaround and the try builds have been verified by a few people affected by this issue.

Checked with Jorge on add-on compat impact; since the updated patch does not have IDL changes, we should be good on that front.

It will be helpful to gather more feedback from our beta users that the issue is resolved once the patch lands, in preparation for taking this on 20.0.1.

Please land on mozilla-beta ASAP to get this into our Fx 21 beta 2 build, which is going to build soon. Thanks!
Comment 37 Ryan VanderMeulen [:RyanVM] 2013-04-08 17:15:47 PDT
https://hg.mozilla.org/mozilla-central/rev/b1f9f2bcaf16
Comment 38 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2013-04-09 10:56:07 PDT
James, can you please verify if this is fixed for you?

Nightly: ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/
Aurora: ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-aurora/
Beta: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/21.0b2-candidates/build1/

Thank you
Comment 40 Ryan VanderMeulen [:RyanVM] 2013-04-09 12:08:02 PDT
This doesn't apply trivially to mozilla-release, so Patrick will need to take care of landing this once it's approved.
Comment 42 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2013-04-09 12:23:15 PDT
(In reply to James Abbatiello from comment #41)
> I've tested the following and they all work for me:
> ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-central/firefox-23.
> 0a1.en-US.win32.installer.exe
> ftp://ftp.mozilla.org/pub/firefox/nightly/latest-mozilla-aurora/firefox-22.
> 0a2.en-US.win32.installer.exe
> ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/21.0b2-candidates/
> build1/win32/en-US/Firefox%20Setup%2021.0b2.exe

Thank you very much James.
Comment 43 Lukas Blakk [:lsblakk] use ?needinfo 2013-04-09 12:27:57 PDT
Comment on attachment 734017 [details] [diff] [review]
patch of just DNS internals v0

We'll take this on mozilla-release since it's a low risk change to code that we expect to be contained to windows auth, it's windows-only like our other 20.0.1 driver (bug 846848), and it's been verified on our patched builds (thank you James).
Comment 44 Patrick McManus [:mcmanus] 2013-04-09 13:23:19 PDT
https://hg.mozilla.org/releases/mozilla-release/rev/346a2850042d
Comment 45 Cornel Ionce [QA] (:cornel_ionce) 2013-04-10 04:31:01 PDT
James, can you please verify if this is also fixed on the candidate build?

ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/20.0.1-candidates/build1/win32/en-US/Firefox%20Setup%2020.0.1.exe
Comment 46 Andriy Syrovenko 2013-04-10 05:45:30 PDT
(In reply to Cornel Ionce [QA] from comment #45)
> James, can you please verify if this is also fixed on the candidate build?
> 
> ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/20.0.1-candidates/
> build1/win32/en-US/Firefox%20Setup%2020.0.1.exe

I'm not James, but the bug is reproducible in my environment as well. :)

The referenced build works fine for me. Well done. Thanks.
Comment 47 Loic 2013-04-10 17:22:09 PDT
*** Bug 860539 has been marked as a duplicate of this bug. ***
Comment 48 jdmc 2013-04-10 18:47:36 PDT
Thanks Loic. I downloaded the 20.0.1 candidate build from comment #46 and it works in our environment as well.
Comment 49 Andrew Phillips 2013-04-11 08:09:57 PDT
I had this problem. I also downloaded the 20.0.1 candidate build from comment #46 and it fixed the problem.
Comment 50 Daniel Holbert [:dholbert] 2013-04-12 13:48:27 PDT
AFAICT, this bug isn't currently listed in the 20.0.1 release notes at https://www.mozilla.org/en-US/firefox/20.0.1/releasenotes/ -- should it be?
