Closed Bug 1484149 Opened 2 years ago Closed 1 year ago
Cache racing breaks NTLM authentication - Load / NTLM Auth / cache issue in Firefox and Sharepoint on premises
Hi, I have the same issue, in SharePoint 2013 and now in a newly installed SharePoint 2016. The unsatisfying workarounds we currently use: 1. Reload the page 2. Open SharePoint in private mode (no cache)
Correct, that is exactly the same behaviour I have noticed. If you disable the cache, then Sharepoint pages load without any issues. I hope some FF/SP expert can help us. Thanks
Hi, We have the same issue with SharePoint 2010. Please Help. Kind Regards
Hi, we have the same issue with SharePoint 2016 On-Premise and Firefox 60.0.2 :-(. All users are affected. A page reload is needed very often because the page is not loading completely.
Hi, see also this Mozilla Support Forum entry: https://support.mozilla.org/de/questions/1213246 Since the creator of the post used Firefox 59, it shouldn't be related to Quantum. We experience the same behavior as described in Firefox 61 and 62 across several users in our company. We do use SharePoint 2007 (NTLM) and 2013 (Negotiate / Kerberos), but the problem is not limited to it. It also happens on our on-premise Team Foundation Server with NTLM and on custom ASP.NET applications hosted by our team on IIS.
Hi, for us it is not limited to SharePoint either; other sites using NTLM are affected (served by IIS and Apache).
(In reply to Sebastian Segerer from comment #5) > Hi, > see also this Mozilla Support Forum entry: > https://support.mozilla.org/de/questions/1213246 > Since the creator of the post used Firefox 59, it shouldn't be related to > Quantum. > > We experience the same behavior as described in Firefox 61 and 62 across > several users in our company. > We do use SharePoint 2007 (NTLM) and 2013 (Negotiate / Kerberos), but the > problem is not limited to it. It also happens on our on-premise Team > Foundation Server with NTLM and on custom ASP.NET applications hosted by our > team on IIS. Sebastian, please read the initial description of the issue. I clearly stated that I noticed this issue when working with Firefox 60 (Quantum version). When I mention Firefox v.52, what I mean is that that version doesn't show the SharePoint NTLM load issue.
@Alberto Suarez I was referring to the creator of the linked support.mozilla.org post, in which Firefox 59 was used.
(In reply to Sebastian Segerer from comment #8) > @Alberto Suarez > I was referring to the creator of the linked support.mozilla.org post, in > which Firefox 59 was used. My apologies, Sebastian. I have taken a look at the post you shared. It seems that many users are experiencing the same issue. Thanks.
Based on many users experiencing this issue I am placing this under Core:Networking, so someone from the team can look into this issue. Thanks!
Component: Untriaged → Networking
Product: Firefox → Core
Thanks for putting this to the right component and thanks for the report. Alberto, Sebastian, I will kindly ask you to produce HTTP logs according to https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging. Please set the list of logging modules (MOZ_LOG) to: timestamp,rotate:400,nsHttp:5,cache2:5,negotiateauth:5,NTLM:5 As the logs may contain sensitive information, it will be best to send them to my bugzilla email directly. Thank you!
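A minimal sketch of how the requested logging setup could be scripted, assuming Firefox is started from a Python launcher. `MOZ_LOG` and `MOZ_LOG_FILE` are the environment variables described on the HTTP logging page; the helper name `logging_env` and the log file path are illustrative assumptions, not something taken from this bug.

```python
import os

# Module list copied verbatim from the request in this bug.
MOZ_LOG_MODULES = "timestamp,rotate:400,nsHttp:5,cache2:5,negotiateauth:5,NTLM:5"


def logging_env(log_file, base_env=None):
    """Return a copy of the environment with Firefox HTTP logging switched on.

    `log_file` is a path chosen by the caller (hypothetical in this sketch).
    `base_env` defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_LOG"] = MOZ_LOG_MODULES
    env["MOZ_LOG_FILE"] = log_file
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` together with the path of the local Firefox binary, which depends on the installation and is deliberately left out here.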
Assignee: nobody → honzab.moz
Status: UNCONFIRMED → NEW
Component: Networking → Networking: HTTP
Ever confirmed: true
Priority: -- → P2
So, the problem is that when we race cache with network responses, NTLM authentication can be broken. What I can see in a privately provided log is that a _first request on a new connection_ is being raced with cache. We send a GET request (no authentication) and also open an entry that can be used w/o revalidation. The cache wins even before we get the first 401: NTLM response. The channel is finished, the 401 response is thrown away, and the resource is satisfied from the cache.

The expected processing chain is to authenticate the connection with three request/response rounds: 401, 401, 200. The authentication state is kept in the requesting channel, so when that channel is finished, the next request on the connection has to start over, doing again a plain GET w/o any auth headers and restarting the NTLM auth process from scratch. If it happens that a previous raced request has already sent an NTLM message type 1, this likely confuses the server as it expects NTLM message type 3 in the next request. That is likely the cause of unsatisfied loads from the server, but I have to carefully look into the log again later.

Possible fixes for the problem found so far:
- in case of Basic or Digest auth we keep an information on a cache entry that it was served with authentication what makes us revalidate it ; I believe this also disallows cache racing (Michal?), but I can see we are sending conditional headers in requests... OTOH, this may be an incomplete solution when NTLM is established later for the resource (we have an entry w/o "auth" marking)
- let the channel finish the authentication despite a cache win ; this is probably the "easiest" and most clear way to fix this bug
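The sequence described above can be sketched as a toy model. This is not Necko code: `NtlmConnection`, `run_channel` and `cache_wins_after` are hypothetical names. How a real server reacts to an out-of-order message is not stated in this bug, so the model only records that it happened.

```python
class NtlmConnection:
    """Toy server-side view of a connection-bound NTLM handshake."""

    def __init__(self):
        self.awaiting_type3 = False   # a type 1 message was answered with a challenge
        self.authenticated = False
        self.out_of_order = False     # assumption: we only flag the confusion

    def handle(self, auth):
        """`auth` is None (plain GET), "type1" or "type3"; returns a status code."""
        if self.awaiting_type3:
            self.awaiting_type3 = False
            if auth == "type3":
                self.authenticated = True
                return 200
            self.out_of_order = True  # server expected type 3, got something else
            return 401
        if self.authenticated:
            return 200
        if auth == "type1":
            self.awaiting_type3 = True  # this 401 carries the challenge
        return 401


def run_channel(conn, cache_wins_after=None):
    """Send GET, GET+type1, GET+type3; stop once the cache has won the race.

    `cache_wins_after` is the number of requests already sent when the
    channel finishes from cache; None means the channel is never interrupted.
    """
    seen = []
    for sent, auth in enumerate([None, "type1", "type3"]):
        if cache_wins_after is not None and sent >= cache_wins_after:
            break
        seen.append(conn.handle(auth))
    return seen
```

With an uninterrupted channel the model yields the 401, 401, 200 chain. If a channel finishes from cache after sending type 1, the plain GET of the next channel on the same connection arrives where the server expected type 3, and `out_of_order` gets set.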
Summary: Load / NTLM Auth / cache issue in Firefox and Sharepoint on premises → Cache racing breaks NTLM authentication - Load / NTLM Auth / cache issue in Firefox and Sharepoint on premises
Whiteboard: [necko-triaged][ntlm] → [necko-triaged][ntlm][http-conn]
Hmm.. I inspected the provided log more in detail and I see a different problem (actually a second one). Michal, I will send you the log with ref to the affected channel to inspect. It's also definitely related to racing.
The second problem, that actually manifests as reported - responses are hanging - is the following:
- we do a request
- find an entry that needs to be re-validated with the server
- we do a request
- we do it on a new connection
- we get a 401:NTLM response => the cache racing algorithm takes it as if the first response came from the network, but that is a totally wrong assumption
- we do the full NTLM authentication round: another GET, 401, GET 304
- now the server has confirmed that the cached entry can be used
- we call ReadFromCache, but it's skipped because cache racing believes we have already provided the response from network

This leads to total omission of calling OnStartRequest/OnDataAvailable/OnStopRequest of the final listener (HttpChannelParent) and thus the child request is hanging forever. If this happens for a top level page, a user just stares at a blank page and spinning throbber.

P1, as this is a serious bug for corporate users. I'll file bugs to disable cache racing on ESR branches.
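The hang described above can be illustrated with a small simulation. Everything here is a hypothetical stand-in: `load`, `network_response_seen` and the `count_401_as_network_win` switch are not the real racing bookkeeping in nsHttpChannel, and the switch is not the actual patch; only the three callback names come from the comment.

```python
LISTENER_CALLS = ["OnStartRequest", "OnDataAvailable", "OnStopRequest"]


def load(statuses, count_401_as_network_win=True):
    """Return the listener callbacks fired for a channel that receives
    `statuses` from the network while holding a cache entry to revalidate."""
    network_response_seen = False  # toy version of "network won the race"
    for status in statuses:
        if status == 401:
            if count_401_as_network_win:
                # The behaviour described in the comment: an auth challenge
                # is mistaken for the first real network response.
                network_response_seen = True
            continue  # the channel retries with the next NTLM message
        if status == 304:
            if network_response_seen:
                return []  # reading from cache is skipped, listener never called
            return list(LISTENER_CALLS)  # content delivered from cache
        if status == 200:
            return list(LISTENER_CALLS)  # content delivered from network
    return []
```

`load([401, 401, 304])` returns an empty list, which corresponds to the hanging child request; with `count_401_as_network_win=False` the same sequence delivers all three callbacks.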
Priority: P2 → P1
I'd love to verify that Basic/Digest are not affected, in all scenarios, see   https://searchfox.org/mozilla-central/rev/ce57be88b8aa2ad03ace1b9684cd6c361be5109f/netwerk/protocol/http/nsHttpChannel.cpp#4345-4348
For all affected users, the actual fix/good workaround is to switch 'network.http.rcwn.enabled' to |false| in about:config.
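For rolling the workaround out to several machines, the same preference can be written to a profile's `user.js` file instead of being toggled by hand in about:config. A minimal sketch, assuming the profile directory is already known; the helper name `disable_rcwn` is hypothetical, and locating the profile is left to the caller.

```python
from pathlib import Path

# Preference named in the comment above, in user.js syntax.
PREF_LINE = 'user_pref("network.http.rcwn.enabled", false);\n'


def disable_rcwn(profile_dir):
    """Append the workaround pref to user.js in `profile_dir` if it is missing.

    Returns the path of the user.js file. Firefox reads user.js at startup,
    so a restart is needed for the change to take effect.
    """
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text() if user_js.exists() else ""
    if PREF_LINE not in existing:
        if existing and not existing.endswith("\n"):
            existing += "\n"
        user_js.write_text(existing + PREF_LINE)
    return user_js
```

Calling the helper twice on the same profile leaves a single copy of the line in the file.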
Thanks Honza, I'm testing the workaround on a few machines and so far so good.
status-geckoview62=wontfix because NTLM is not a critical use case for Focus+GeckoView. We don't need to uplift a fix for GeckoView 62 in Focus 7.0.
(In reply to Honza Bambas (:mayhemer) from comment #12) > - in case of Basic or Digest auth we keep an information on a cache entry > that it was served with authentication what makes us revalidate it ; I > believe this also disallows cache racing (Michal?) We don't use this information when we're deciding whether to race or not because we don't know it. It's not stored in the index and we don't have the entry.
(In reply to Honza Bambas (:mayhemer) from comment #14) > The second problem, that actually manifests as reported - responses are > hanging - is the following: > - we do a request > - find an entry that needs to be re-validated with the server > - we do a request > - we do it on a new connection > - we get a 401:NTLM response > => the cache racing algorithm takes it as if that the first response came > from the network, but that is a totally wrong assumption > - we do the full NTLM authentication round: another GET, 401, GET 304 > - now the server has confirmed that the cached entry can be used To not end up with 304 response while we don't have the entry, we remove conditional headers before we send the request. This landed in bug 1382831. I need to understand more how NTLM works to understand why the problem persists for NTLM.
So the problem is that we remove the conditional headers only in the first request but not in subsequent requests. It seems we need to change the condition a bit at https://searchfox.org/mozilla-central/rev/ce57be88b8aa2ad03ace1b9684cd6c361be5109f/netwerk/protocol/http/nsHttpChannel.cpp#1186.
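The difference between stripping conditional headers once and stripping them on every request of the authentication round can be sketched as follows. This is an illustration of the hypothesis in this comment, not the condition at the linked line: the function names are made up, the header list covers only two common conditional headers, and the NTLM tokens are placeholders.

```python
CONDITIONAL_HEADERS = ("If-None-Match", "If-Modified-Since")


def strip_conditional(headers):
    """Return a copy of `headers` without the conditional request headers."""
    return {k: v for k, v in headers.items() if k not in CONDITIONAL_HEADERS}


def auth_round_requests(base_headers, strip_every_request):
    """Build the headers of the three requests of one NTLM round.

    With `strip_every_request=False` only the first request loses its
    conditional headers, which is the behaviour described in the comment.
    """
    sent = []
    for index, auth in enumerate([None, "NTLM <type 1>", "NTLM <type 3>"]):
        headers = dict(base_headers)
        if auth is not None:
            headers["Authorization"] = auth  # placeholder token, not a real message
        if index == 0 or strip_every_request:
            headers = strip_conditional(headers)
        sent.append(headers)
    return sent
```

In the first mode the final, authenticated request still carries `If-None-Match`, so the server may answer 304 even though the channel has no cache entry to serve; in the second mode no request of the round can produce a 304.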
RCWN has been disabled for this week's forthcoming ESR 60.2.2 release, which should resolve this issue for those users.
After studying the code, it seems that this is a dupe of bug 1477684 which was fixed in version 62 and was uplifted to ESR 60.2. Alberto, what version of ESR did you use when you were able to reproduce the bug? ESR 60 or ESR 60.2?
From an end-user perspective, I can confirm that I did not encounter this problem for some days / maybe weeks; probably since the FF 62 update mid September. I'm sorry I did not report this earlier, but I wasn't sure if I was just "lucky" to not have this issue for some days. I also just checked with our team and no one has had this behaviour anymore.
In our case, I've provided a log from 60.0 to Honza. I've updated Firefox to 60.2, I guess I should switch back to true "network.http.rcwn.enabled" and try.
(In reply to Arnaud Meurou from comment #25) > I've updated Firefox to 60.2, I guess I should switch back to true > "network.http.rcwn.enabled" and try. Yes, please enable rcwn again and let me know whether ESR 60.2 works correctly. Thanks.
Great news! Thanks. When confirmed, we can back bug 1494405 out from ESR.
(In reply to Sebastian Segerer from comment #24) > From an end-user perspective, I can confirm that I did not encounter this > problem for some days / maybe weeks; probably since the FF 62 update mid > September. > I'm sorry I did not report this earlier, but I wasn't sure if I was just > "lucky" to not have this issue for some days. > I also just checked with our team and no one has had this behaviour anymore. Yes, as Sebastian has observed, I have not run into this issue since the last FF update. Currently using FF ESR 60.2
(In reply to Michal Novotny (:michal) from comment #23) > After studying the code, it seems that this is a dupe of bug 1477684 which > was fixed in version 62 and was uplifted to ESR 60.2. > > Alberto, what version of ESR did you use when you were able to reproduce the > bug? ESR 60 or ESR 60.2? ESR 60
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1477684
OK, it works for me too with 60.2! Thanks guys
Fixed in 62 and ESR in bug 1477684.