Bug 494969 - GSSAPI negotiate authentication may fail if /etc/resolv.conf changes
Status: RESOLVED FIXED
Whiteboard: [3.6.x]
Keywords: fixed1.9.0.18
Product: Core
Classification: Components
Component: Networking
Version: unspecified
Platform: x86 Linux
Importance: -- normal
Target Milestone: ---
Assigned To: Kai Engert (:kaie)
Reported: 2009-05-26 15:25 PDT by Simo Sorce
Modified: 2010-02-18 07:05 PST
.2-fixed
.8-fixed


Attachments
Patch v1 (1.45 KB, patch)
2009-10-08 15:40 PDT, Kai Engert (:kaie)
cbiesinger: review+
Patch v1 fixed (1.49 KB, patch)
2009-10-26 15:43 PDT, Kai Engert (:kaie)
dveditz: approval1.9.2.2+
dveditz: approval1.9.1.8+
dveditz: approval1.9.0.18+

Description Simo Sorce 2009-05-26 15:25:47 PDT
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.10) Gecko/2009042708 Fedora/3.0.10-1.fc10 Firefox/3.0.10
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.10) Gecko/2009042708 Fedora/3.0.10-1.fc10 Firefox/3.0.10

Filing under Networking because the ultimate bug is in a missing res_init() (I suppose) somewhere.

When using GSSAPI negotiate authentication, a failure may occur if /etc/resolv.conf has changed (e.g. because you connected to a VPN).

Looking at network traces, the queries for the host we are connecting to show that the main engine is correctly using the new DNS servers, while the GSSAPI resolver calls are still going to the old DNS server that was specified in resolv.conf.

Apparently a res_init() call is missing somewhere.

Setting:
export NSPR_LOG_MODULES=negotiateauth:4

I see in the logs that the failing call is:
gss_init_sec_context_ptr() in nsAuthGSSAPI::GetNextToken()

It fails with:
245614352[7fae0e754040]: gss_init_sec_context() failed: An invalid name was supplied
Hostname cannot be canonicalized

This happens in mozilla-1.9.1/extensions/auth/nsAuthGSSAPI.cpp, line 463.

I can't tell, though, whether it is OK to just add a res_init() call before it or whether res_init() should be triggered elsewhere.

HTH

Reproducible: Sometimes

Steps to Reproduce:
The best way I could reproduce this failure was to connect to my company's network over a VPN, kinit, and authenticate to an internal web site using GSSAPI negotiate auth. Then let the ticket expire and disconnect from the network.
Wait until the next day, try to connect again, and see that the connection fails (the host name can't be resolved at all).
Reconnect to the VPN (this changes /etc/resolv.conf to query the internal DNS servers), kinit again, and reload the page. Firefox now finds the web server, but auth fails because the GSSAPI queries are still going out to the old DNS server.
Actual Results:  
SSO auth fails.

Expected Results:  
Auth works.

As a workaround you can save and quit Firefox and restart it, but that's annoying.
Comment 1 Zack Cerza 2009-09-03 09:06:56 PDT
Also seeing this:

Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.2) Gecko/20090803 Fedora/3.5.2-2.fc11 Firefox/3.5.2
Comment 2 Kai Engert (:kaie) 2009-10-08 14:16:46 PDT
The original testcase doesn't work for me reliably.
However, the following does:

- start firefox with vpn disconnected
- load any public page, and firefox will use the public dns
- connect the vpn, which will change the dns settings
- try kerberos auth and it fails, despite having a valid ticket
Comment 3 Kai Engert (:kaie) 2009-10-08 14:28:58 PDT
Firefox never calls res_init and thus never refreshes dns config explicitly.

There is a single reference to this function in the Mozilla code, within a comment in nsHostResolver.cpp:

"Use a persistent thread pool in order to avoid spinning up new threads all the time. In particular, thread creation results in a res_init() call from libc which is quite expensive."
Comment 4 Kai Engert (:kaie) 2009-10-08 15:37:03 PDT
Ok, res_init is deprecated, one should use res_ninit instead.

I see that Mozilla indeed calls res_ninit on failures, at most once per second.
This is why we actually succeed loading the intranet page after vpn connect.
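
The "at most once per second" behaviour could be sketched roughly as follows. This is not Mozilla's host-resolver code: the class name ThrottledResolverReset and the injected reinit callable are hypothetical, chosen so the throttling logic can be read on its own. In a real build the callable would presumably wrap res_ninit(&_res).

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical helper: runs a resolver re-init callable, but never more
// often than once per second.
class ThrottledResolverReset {
 public:
  using Clock = std::chrono::steady_clock;

  explicit ThrottledResolverReset(std::function<bool()> reinit)
      : reinit_(std::move(reinit)) {
    assert(reinit_);  // an empty callable would be a programming error
  }

  // Call after a lookup failure. Returns true only if the re-init actually
  // ran and reported success; returns false if it was throttled or failed.
  bool Reset(Clock::time_point now = Clock::now()) {
    if (has_last_ && now - last_ < std::chrono::seconds(1)) {
      return false;  // re-initialized less than a second ago: skip
    }
    last_ = now;
    has_last_ = true;
    return reinit_();
  }

 private:
  std::function<bool()> reinit_;
  Clock::time_point last_{};
  bool has_last_ = false;
};
```

A caller would construct one instance per resolver thread and call Reset() after each failed lookup. Passing the time explicitly is only a convenience for exercising the throttle without sleeping.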

However, the kerberos service code (contained in system libs) doesn't seem to do that, but probably calls the DNS resolver directly.

Mozilla's res_ninit() calls inside the host-resolver code don't help us, as Mozilla's resolver runs on its own separate thread.

This means, on the thread where we call the gssapi functions, Mozilla never calls res_ninit, and the system gssapi implementation apparently doesn't do it on its own.

I can confirm that a call to res_ninit(), just before calling gss_import_name, fixes this bug.
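
A minimal sketch of such a call, assuming a glibc-style <resolv.h> in which _res refers to the calling thread's resolver state. The helper name RefreshResolverForThisThread is hypothetical and is not taken from the attached patch. The GSSAPI call appears only as a comment to mark where the refresh would sit.

```cpp
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

// Hypothetical helper: re-read the resolver configuration (for example
// /etc/resolv.conf) for the calling thread.
// Returns 0 on success and -1 on failure, following res_ninit().
int RefreshResolverForThisThread() {
  return res_ninit(&_res);
}

// Intended placement, on the same thread that runs the GSSAPI calls:
//
//   RefreshResolverForThisThread();
//   ... gss_import_name(...) ...
```

The call has to run on the thread that makes the GSSAPI calls, since the re-initialization done on the host-resolver thread does not carry over to it.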

I'm not yet sure whether it is fine to add this directly to the extensions/auth code. On Linux, res_ninit is implemented inside libc.

On other systems, could it be implemented by some networking lib that only gets linked by Mozilla's libnecko XPCOM module? If so, we'd need to implement or extend a service inside necko for the call to res_ninit, and have extensions/auth call this service remotely.
Comment 5 Kai Engert (:kaie) 2009-10-08 15:40:03 PDT
Created attachment 405350 [details] [diff] [review]
Patch v1

This is the patch I used and tested on Linux.

As explained, I don't know whether res_ninit can be found on all platforms using the libs currently referenced from libauth, or whether some platforms require libs that are referenced only by necko.
Comment 6 Kai Engert (:kaie) 2009-10-26 12:11:25 PDT
checked in
http://hg.mozilla.org/mozilla-central/rev/f825915212d4
Comment 7 Kai Engert (:kaie) 2009-10-26 12:13:56 PDT
Comment on attachment 405350 [details] [diff] [review]
Patch v1

Proposing correctness fix for stable branches.
Comment 8 Kai Engert (:kaie) 2009-10-26 12:35:38 PDT
reopening, patch broke the tree...
Comment 9 Kai Engert (:kaie) 2009-10-26 15:43:40 PDT
Created attachment 408486 [details] [diff] [review]
Patch v1 fixed

Sigh, how come I added an #ifdef around the include, but failed to add an #ifdef around the actual function call?

Thanks to Stephen Gallagher and blassey, whose messages suggested the same fix.

Attaching new patch, which built fine on TryServer.
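
The mistake described in this comment is guarding the #include but not the call. A minimal sketch of the consistent form follows, assuming HAVE_RES_NINIT is the configure-time macro. Both the macro name and the helper MaybeRefreshResolver are assumptions for illustration and are not copied from the patch.

```cpp
// Guard the header with the feature macro ...
#if defined(HAVE_RES_NINIT)
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#endif

// Hypothetical helper. Returns true if the resolver state was re-read,
// false if res_ninit is unavailable on this platform or the call failed.
bool MaybeRefreshResolver() {
  // ... and guard the call with the same macro, so platforms without
  // res_ninit still compile.
#if defined(HAVE_RES_NINIT)
  return res_ninit(&_res) == 0;
#else
  return false;
#endif
}
```

When the macro is not defined, the function compiles to a no-op that returns false.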
Comment 10 Kai Engert (:kaie) 2009-10-30 02:18:46 PDT
Second checkin attempt:
http://hg.mozilla.org/mozilla-central/rev/c814dffbd980
Comment 11 Kai Engert (:kaie) 2009-10-30 06:39:26 PDT
Comment on attachment 408486 [details] [diff] [review]
Patch v1 fixed

This checkin attempt succeeded, tinderboxes look good.

Nominating for stable branches.
Comment 12 Samuel Sidler (old account; do not CC) 2009-11-04 15:20:03 PST
Comment on attachment 408486 [details] [diff] [review]
Patch v1 fixed

We'll look at this for the next cycle, after it's gotten baking on 1.9.2. Kai: You should ping a 1.9.2 driver for approval.
Comment 13 Daniel Veditz [:dveditz] 2009-12-02 15:34:33 PST
kaie: you'll need to lobby for the bug on 1.9.2 through email or IRC; the approval requests for non-blockers are getting missed.
Comment 14 Daniel Veditz [:dveditz] 2009-12-14 14:27:09 PST
Comment on attachment 408486 [details] [diff] [review]
Patch v1 fixed

Approved for 1.9.1.7 and 1.9.0.17, a=dveditz for release-drivers
Comment 15 Daniel Veditz [:dveditz] 2010-02-02 02:05:37 PST
Checking in extensions/auth/nsAuthGSSAPI.cpp;
/cvsroot/mozilla/extensions/auth/nsAuthGSSAPI.cpp,v  <--  nsAuthGSSAPI.cpp
new revision: 1.15; previous revision: 1.14
Comment 16 Daniel Veditz [:dveditz] 2010-02-02 02:12:36 PST
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/44ee53072ead
Comment 17 Daniel Veditz [:dveditz] 2010-02-05 13:52:09 PST
Comment on attachment 408486 [details] [diff] [review]
Patch v1 fixed

Approved for 1.9.2.2, a=dveditz for release-drivers
