Closed Bug 734229 Opened 12 years ago Closed 5 years ago

DOS a proxy server after failed authentication

Categories

(Core :: Networking, defect, P3)

10 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 486508

People

(Reporter: robert, Assigned: abartlet)

References

Details

(Keywords: csectype-dos, sec-low, Whiteboard: [sg:dos (server)][necko-triaged][ntlm])

Attachments

(2 files, 2 obsolete files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20100101 Firefox/10.0.1
Build ID: 20120209085555

Steps to reproduce:

Environment:

1) Client running Firefox 10 on Linux configured with that proxy and Kerberos auth enabled
2) Squid proxy server with Kerberos authentication and NTLM auth (in that priority order)
3) Squid rules to block some domains if the user is not part of a special group


Actual results:

1) User navigate to blocked_domain.com
2) Proxy check the domains require authentication in order to check the user. The proxy logs:

TCP_DENIED/407 GET http://www.blocked_domain.com/ - NONE/- text/html

3) As Kerberos is enabled, Firefox try again with the Kerberos authentication header
4) Proxy receive and validates the user. The user is not authorized. Logs:

TCP_DENIED/407 GET http://www.blocked_domain.com/ user@EXAMPLE.COM NONE/- text/html

5) Firefox receive the 407 status code, try the next advertised authentication protocol that is NTLM, request the user and password. The user writes the wrong user and password combination

6) Proxy receive the NTLM header. The user is not authorized. Logs:

TCP_DENIED/407 GET http://www.blocked_domain.com/ user@EXAMPLE.COM NONE/- text/html

7) At this points Firefox continue retrying with hundreds of requests per second, The proxy logs each time

TCP_DENIED/407 GET http://www.blocked_domain.com/ user@EXAMPLE.COM NONE/- text/html

As an special case, if the user uses the correct user and password for the NTLM authentication, everything works as expected, Firefox gets the 407 response for the NTLM authentication and does not try anymore


Expected results:

Firefox after receiving 407 for both Kerberos and NTLM authentication, it must stop trying

Tried the same setup with another Kerberos enabled browser, Chrome,  and it does not hang in an infinite loop, avoiding DOSing the proxy:

Authentication methods Squid configuration

auth_param negotiate program /usr/lib64/squid/squid_kerb_auth
...
auth_param ntlm program /usr/bin/ntlm_auth
...

Basic ACL setup for Squid

acl grp_vip proxy_auth superuser
acl domains_blocked dstdomain .blocked_domain.com
http_access deny domains_blocked !grp_vip

This is a very complex environment to reproduce, anything that I can do to help with this, please ask

Note: a security bug because multiple users can DOS a very small proxy server, not sure if this should be private
Component: Untriaged → Networking
Product: Firefox → Core
QA Contact: untriaged → networking
Whiteboard: [sg:dos (server)]
Keywords: csec-dos
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: sec-low
Group: network-core-security
Group: network-core-security
Andrew - do you think you could help us out with confirming/fixing this too?
Yes, I can help out with this one.  (I was involved with NTLM in squid back when it was new, so I understand the moving parts here).
> Yes, I can help out with this one.

Awesome!  Go for it.
Assignee: nobody → abartlet
Something isn't right here - is squid really giving the same error code for 'forbidden to access' and 'wrong username/password'? 

Do you have a network trace?

In the meantime, I'm going to try and reproduce, to at least see if I can make Mozilla give up.
Flags: needinfo?(robert)
I've reproduced the issue, and it is worse than was imagined.

In my setup, I have configured NTLM and Kerberos, but kerberos isn't fully and correctly configured, as it tries to get a ticket for localhost, not the name specified. 

Then I get an NTLM prompt, and I put in x and y as username/password.  This is denied, and the loop of NTLM authentication starts, hammering the server with NTLMSSP_NEGOTIATE requests.

In short, this could happen simply due to network issues getting to the KDC and without strange proxy configuration steps like banned sites.
Sorry for the multiple follow-ups.  The issue as I see it is that nothing is waiting on NTLM exchange to complete.  In my case it sends NTLM_NEGOTIATE packets on a CONNECT hundreds of times on the same socket, without even waiting for a response.  There should NEVER be more than one NTLM or Negoitate exchange on a socket, as there is no multiplex ID, so if we can identify that a socket has NTLM or Negotiate in progress, we should at least be able to error out if we try and do another on the same socket.
The attached network capture shows the problem nicely.  

I've configured squid 'incorrectly' with ntlm_auth, so the Negotiate auth fails due to the wrong keytab.  This nicely triggers the error state. 

In my case, I don't see any failed authentication via NTLM, just the stream of negotiate packets.  This capture is from nightly (git hash 2bfbd70e6249419651c174d0e0285a7b984ba37d) plus my changes in bug 423758.
Flags: needinfo?(robert)
This patch avoids the DOS in my configuration.  It still pops up a pointless NTLM login dialog, the output of which will never be used, but I hope this helps find the right way to fix this. 

I'll keep looking, but I didn't want this useful partial state to be lost.
Attachment #8529961 - Flags: feedback?(honzab.moz)
My further thought on this is that in general, after a failed authentication, we really should tear down the channel.  I've found that mozilla isn't able to successfully authentication against Squid after a failed NTLM authentication, and I think this is related.
Comment on attachment 8529961 [details] [diff] [review]
0001-Attempt-to-fix-part-of-bug-734229-by-refusing-to-re-.patch

Review of attachment 8529961 [details] [diff] [review]:
-----------------------------------------------------------------

seems good to me, thanks.

sorry for such a long feedback
Attachment #8529961 - Flags: feedback?(honzab.moz) → feedback+
Patch, 8 line context, rebased on current master and re-verified.  What do we need to do next?
Attachment #8529961 - Attachment is obsolete: true
Flags: needinfo?(honzab.moz)
Attachment #8591480 - Flags: review?(dkeeler)
Flags: needinfo?(dkeeler)
Attachment #8591480 - Flags: review?(honzab.moz)
Comment on attachment 8591480 [details] [diff] [review]
0001-Attempt-to-fix-part-of-bug-734229-by-refusing-to-re-.patch

Review of attachment 8591480 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, but please address the style issues.
Also, you've verified that this doesn't interfere with multiple sites that may be configured differently, right? (It looks like we create a new instance or at least call .Init on the module for each domain we connect to, but it would be good to verify.)

::: security/manager/ssl/src/nsNTLMAuthModule.cpp
@@ +1000,5 @@
>    mDomain = domain;
>    mUsername = username;
>    mPassword = password;
> +  mNTLMNegotiateSent = false;
> +  

nit: trailing whitespace

@@ +1028,5 @@
>    //
>    if (PK11_IsFIPS())
>      return NS_ERROR_NOT_AVAILABLE;
>  
>    // if inToken is non-null, then assume it contains a type 2 message...

nit: move and/or update this comment

@@ +1030,5 @@
>      return NS_ERROR_NOT_AVAILABLE;
>  
>    // if inToken is non-null, then assume it contains a type 2 message...
> +  if (mNTLMNegotiateSent)
> +    {

nit: please follow the brace placement and indentation guidelines as outlined here: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Control_Structures

@@ +1039,5 @@
> +				inTokenLen, outToken, outTokenLen);
> +	}
> +      else
> +	{
> +	  NS_ERROR("NTLM negoitate already sent and presumably rejected by the server, refusing to send another");

Please use LOG("..."); instead of NS_ERROR. Also break long lines (>80 chars).

@@ +1042,5 @@
> +	{
> +	  NS_ERROR("NTLM negoitate already sent and presumably rejected by the server, refusing to send another");
> +	  rv = NS_ERROR_UNEXPECTED;
> +	}
> +    }  

nit: trailing whitespace

@@ +1047,5 @@
>    else
> +    {
> +      if (inToken)
> +	{
> +	  NS_ERROR("NTLM negoitate not sent but NTLM reply already received?!?");

LOG("...");
Attachment #8591480 - Flags: review?(dkeeler) → review+
Comment on attachment 8591480 [details] [diff] [review]
0001-Attempt-to-fix-part-of-bug-734229-by-refusing-to-re-.patch

Review of attachment 8591480 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with the styling issues addressed.
Attachment #8591480 - Flags: review?(honzab.moz) → review+
Land it! :)
Flags: needinfo?(honzab.moz)
Related to this is bug 1057266, which suggests we are sending NTLM authenticate messages more than once per TCP socket.  That is the core of this bug as well, that we send an NTLM negotiate message more than once per TCP socket. 

(Squid should also drop the socket, I'll raise that with them). 

However, in attempting to validate the patch, I'm having trouble loading images from another NTLM authenticated server (an alias), which would show if the patch is correct or not in the multi-domain case.  I'll file a bug for that.
Adding a blocker on the bug 1159584 that is preventing me testing the fix here.
Depends on: 1159584
Bug 1057266 also appears to be related to this.  We need a strong tie between the TCP socket and the authenticated state, for NTLM and Negotiate.  Once one authentication has been attempted, we need to never again attempt authentication - we should drop the socket and start again if we think we need to try again.
No longer depends on: 1159584
Now that I understand what preference (network.auth.allow-subresource-auth) is needed to do the requested test, I confirm that we don't break with multiple NTLM authentication attempts in parallel, nor NTLM proxy authentication in general.
Attachment #8591480 - Attachment is obsolete: true
Attachment #8599662 - Flags: checkin?(dkeeler)
Have I got everything right here?
Flags: needinfo?(honzab.moz)
Comment on attachment 8599662 [details] [diff] [review]
Re-formatted patch for Do not re-send NTLMSSP negotiate packets on an existing socket

https://hg.mozilla.org/integration/mozilla-inbound/rev/4ea6bc9e2fd5
Flags: needinfo?(honzab.moz)
Attachment #8599662 - Flags: checkin?(dkeeler) → checkin+
https://hg.mozilla.org/mozilla-central/rev/4ea6bc9e2fd5

Given the "partially address" I presume there was meant to be a "leave-open" keyword.
Thanks.  BTW, I've (finally) advised the Squid team of this, and my analysis is that they should be closing the connection as suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=1057266#c3

They are CC'ed.
Keywords: leave-open
The behaviour after a failed authentication will probably be resolved once we figure out bug 486508.
Status: NEW → ASSIGNED
Depends on: 486508
Group: core-security → network-core-security
Whiteboard: [sg:dos (server)] → [sg:dos (server)][necko-backlog]
Gary, do you have cycles to check if this is fixed as bug 486508 and related has landed on nightly?

Thanks.
Flags: needinfo?(gary)
Should be able to do this either today, or early next week.
(In reply to Honza Bambas (:mayhemer) from comment #24)
> Gary, do you have cycles to check if this is fixed as bug 486508 and related
> has landed on nightly?
> 
> Thanks.

bug 486508 didn't fix the remaining issue, described in my new bug 1309438 from my earlier investigations.
Flags: needinfo?(gary) → needinfo?(honzab.moz)
(In reply to Gary Lockyer from comment #26)
> (In reply to Honza Bambas (:mayhemer) from comment #24)
> > Gary, do you have cycles to check if this is fixed as bug 486508 and related
> > has landed on nightly?
> > 
> > Thanks.
> 
> bug 486508 didn't fix the remaining issue, described in my new bug 1309438
> from my earlier investigations.

OK, then it sounds like we may fix bug 1309438 and duplicate this one to it.

I'm also thinking of backing https://hg.mozilla.org/mozilla-central/rev/4ea6bc9e2fd5 out and just adding a comment that we never allow reuse of the module instance more than once on a connection and make sure of it in upper layers (by closing the connection and starting over in the http channel).  But that's not critical now.
Flags: needinfo?(honzab.moz)
Depends on: 1309438
Priority: -- → P3
Whiteboard: [sg:dos (server)][necko-backlog] → [sg:dos (server)][necko-triaged]
Whiteboard: [sg:dos (server)][necko-triaged] → [sg:dos (server)][necko-triaged][ntlm]

The leave-open keyword is there and there is no activity for 6 months.
:nhi, maybe it's time to close this bug?

Flags: needinfo?(nhnguyen)

:mayhemer, from comment #27 it seems we have fixed the issue? Is there any more to be done for this bug?

Flags: needinfo?(nhnguyen) → needinfo?(honzab.moz)

Yep, looks like. During the work on this bug it was found that bug 486508 was the core fix. The followup bug 1309438 is IMO unrelated (I didn't make a note why it would be).

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(honzab.moz)
Resolution: --- → DUPLICATE
Group: network-core-security
You need to log in before you can comment on or make changes to this bug.