GetNextTokenCompleteEvent doesn't handle cancellation properly
Categories
(Core :: Networking: HTTP, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox86 | --- | fixed |
People
(Reporter: kershaw, Assigned: kershaw)
Details
(Whiteboard: [necko-triaged])
Attachments
(5 files)
This bug is from bug 1668828 comment #20.
See the log below. The channel 0x7fe047a5e000
is dealing with Negotiate
authentication and the transaction pump is suspended here.
2020-12-16 05:32:23.664090 UTC - [Parent 165689: Main Thread]: V/nsHttp nsHttpChannel::ContinueProcessResponse3 [this=0x7fe047a5e000, rv=804b0001]
2020-12-16 05:32:23.664098 UTC - [Parent 165689: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::ProcessAuthentication [this=0x7fe03f71c5b0 channel=0x7fe047a5e5d0 code=401 SSLConnectFailed=0]
2020-12-16 05:32:23.664106 UTC - [Parent 165689: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::PrepareForAuthentication [this=0x7fe03f71c5b0 channel=0x7fe047a5e5d0]
2020-12-16 05:32:23.664111 UTC - [Parent 165689: Main Thread]: D/nsHttp proxy continuation state has been reset
2020-12-16 05:32:23.664119 UTC - [Parent 165689: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetAuthenticator [this=0x7fe03f71c5b0 channel=0x7fe047a5e5d0]
2020-12-16 05:32:23.664127 UTC - [Parent 165689: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetCredentialsForChallenge [this=0x7fe03f71c5b0 channel=0x7fe047a5e5d0 proxyAuth=0 challenges=Negotiate]
2020-12-16 05:32:23.664138 UTC - [Parent 165689: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetIdentityFromURI [this=0x7fe03f71c5b0 channel=0x7fe047a5e5d0]
2020-12-16 05:32:23.664144 UTC - [Parent 165689: Main Thread]: V/nsHttp nsHttpAuthCache::GetAuthEntryForDomain 0x7fe05022f850 [realm=]
2020-12-16 05:32:23.664161 UTC - [Parent 165689: Main Thread]: V/nsHttp nsHttpAuthCache::LookupAuthNode 0x7fe05022f850 key=':https://wiki-int.example.org:-1' found node=0x7fe03f784348
2020-12-16 05:32:23.664167 UTC - [Parent 165689: Main Thread]: V/nsHttp returning 0x7fe047afc470
2020-12-16 05:32:23.664403 UTC - [Parent 165689: Main Thread]: D/nsHttp identity invalid = 0
2020-12-16 05:32:23.664411 UTC - [Parent 165689: Main Thread]: V/nsHttp nsHttpChannel::ConnectionRestartable this=0x7fe047a5e000, restartable=1
2020-12-16 05:32:23.664433 UTC - [Parent 165689: Main Thread]: V/nsHttp Suspending the transaction, asynchronously prompting for credentials
After awhile the channel 0x7fe047a5e000
is canceled, but the transaction pump is never resumed. In the end, the transaction and it's connection are not released and we run out of all slots of active connections.
2020-12-16 05:32:23.696950 UTC - [Parent 165689: Main Thread]: V/nsHttp nsHttpChannel::Cancel [this=0x7fe047a5e000 status=804b0002]
The problem is in GetNextTokenCompleteEvent::Cancel. When nsHttpChannel::Cancel()
is called, GetNextTokenCompleteEvent::Cancel()
is also called, but this function doesn't invoke the nsIHttpAuthenticatorCallback
callback. To fix this problem, we should make sure nsHttpChannelAuthProvider::OnAuthCancelled()
will be called when GetNextTokenCompleteEvent
is canceled.
Assignee | ||
Comment 1•5 years ago
|
||
Assignee | ||
Updated•5 years ago
|
Comment 2•5 years ago
|
||
Hi Kershaw,
I’m glad the log (in bug 1668828) was helpful.
I downloaded the nightly build linked from bug 1668828. That build appears to prevent Firefox from failing to release active connections.
However, it seems to cause a different issue: now, when taking the actions that would before cause Firefox to fail to release active connections, occasionally the Negotiate authentication fails, and the web server prompts me for Basic authentication credentials. E.g.:
https:://wiki-int.example.org is requesting your username and
password. The site says: “EXAMPLE.ORG (AD)”
The pop-up isn't fatal; if I dismiss it (and keep dismissing it whenever it appears, as it will appear repeatedly), I can interact with the web server just fine.
I've attached a lightly-redacted HTTP log (log.txt-main.60758.moz_log.gz) that demonstrates this behavior. As soon as I received the requesting your username and password pop-up, I dismissed the pop-up, waited for about:networking
to show no active/idle connections, and then ended the log.
An observation: this is almost identical to an issue I have observed in Thunderbird for years: when Thunderbird is acting as an IMAP client to an IMAP server, and AUTH=GSSAPI authentication is in use, opening a folder with multiple messages and rapidly attempting to view those messages (e.g., pressing F repeatedly and quickly after opening the first message in the folder) will usually generate pop-up errors about failed GSSAPI authentication. But if I view the messages slowly, and allow each message to fully load before pressing F to move to the next one, then I don't see the error messages about GSSAPI authentication failing.
(What it feels like is that Firefox is using a mutex of some sort around Negotiate authentication, such that if I create a situation where multiple threads are attempting to perform it in quick succession, threads that fail to grab the mutex give up and fall back to Basic authentication. But that's just my speculation.)
On the bright side, this is a great improvement over the previous behavior, as having to dismiss the authentication pop-ups is a lot less intrusive than having to restart Firefox entirely once it runs out of all slots for active connections!
Comment 3•5 years ago
|
||
It occurred to me after attaching the previous log that it might be useful to have a log where I have already given Firefox my Basic authentication credentials, and therefore it can fall back from Negotiate to Basic authorization without needing to prompt me.
I captured this log in the same way as the previous one, but I primed Firefox with my Basic auth credentials before I started the test. (Unless I am misreading, it looks like this test resulted in 11 total authentications, 6 with Negotiate, and 5 with Basic.)
Please let me know if you need any other information. Thanks!
Assignee | ||
Comment 4•5 years ago
|
||
Depends on D100057
Assignee | ||
Comment 5•5 years ago
|
||
Assignee | ||
Comment 6•5 years ago
|
||
Assignee | ||
Comment 7•5 years ago
|
||
(In reply to James Ralston from comment #2)
Created attachment 9194053 [details]
http log file demonstrating fallback to Basic auth behaviorHi Kershaw,
I’m glad the log (in bug 1668828) was helpful.
I downloaded the nightly build linked from bug 1668828. That build appears to prevent Firefox from failing to release active connections.
However, it seems to cause a different issue: now, when taking the actions that would before cause Firefox to fail to release active connections, occasionally the Negotiate authentication fails, and the web server prompts me for Basic authentication credentials. E.g.:
https:://wiki-int.example.org is requesting your username and
password. The site says: “EXAMPLE.ORG (AD)”The pop-up isn't fatal; if I dismiss it (and keep dismissing it whenever it appears, as it will appear repeatedly), I can interact with the web server just fine.
I've attached a lightly-redacted HTTP log (log.txt-main.60758.moz_log.gz) that demonstrates this behavior. As soon as I received the requesting your username and password pop-up, I dismissed the pop-up, waited for
about:networking
to show no active/idle connections, and then ended the log.An observation: this is almost identical to an issue I have observed in Thunderbird for years: when Thunderbird is acting as an IMAP client to an IMAP server, and AUTH=GSSAPI authentication is in use, opening a folder with multiple messages and rapidly attempting to view those messages (e.g., pressing F repeatedly and quickly after opening the first message in the folder) will usually generate pop-up errors about failed GSSAPI authentication. But if I view the messages slowly, and allow each message to fully load before pressing F to move to the next one, then I don't see the error messages about GSSAPI authentication failing.
(What it feels like is that Firefox is using a mutex of some sort around Negotiate authentication, such that if I create a situation where multiple threads are attempting to perform it in quick succession, threads that fail to grab the mutex give up and fall back to Basic authentication. But that's just my speculation.)
On the bright side, this is a great improvement over the previous behavior, as having to dismiss the authentication pop-ups is a lot less intrusive than having to restart Firefox entirely once it runs out of all slots for active connections!
Thanks for your great help!
I think I've fixed the problem about the authentication prompt. It'd be great if you could use this build to test again.
Comment 8•5 years ago
|
||
(In reply to Kershaw Chang [:kershaw] from comment #7)
I think I've fixed the problem about the authentication prompt. It'd be great if you could use this build to test again.
I downloaded the linked build and used it all of yesterday. I could not reproduce the issue with the authentication prompt appearing, and I was using the build extensively in scenarios that previously produced the authentication pop-up (i.e., interacting with web servers and using Negotiate authentication).
I also could not get the build to lose track of connections: about:networking
never showed any Active or Idle connections except ones which were actually active/idle.
I will continue to test, but: my organization is now on holiday break through the rest of 2020. Since this issue depends on Negotiate authentication, and I only use Negotiation when interacting with my organization's web servers, I won't be testing scenarios that would reproduce the issue with any consistency until 2021-01-04.
If, based on the above feedback, you are confident enough with these patches to push them to nightly, then I will switch to using the Firefox nightly builds. Otherwise, I'll continue to use the linked build.
Finally, thank you for unraveling and fixing this bug (and bug 1668828)! We strongly suspected there was some sort of issue when using Firefox to interact with certain web servers, but because we only saw the issue when interacting with some of our organization's web servers, we didn't know whether the bug was client side or server side (e.g., the Atlassian stack was doing something wrong), and we were hesistant to point the finger at Firefox without hard evidence that the issue was client side. We're glad we were able to at least provide the information that helped you find the issue and resolve it.
Comment 9•5 years ago
|
||
I used the linked build all of yesterday, and saw no issues: Firefox did not lose track of active connections, and did not generate authentication pop-ups.
I will continue to use this build until I see (or am told) that the fix has made it into the regular nightly builds.
If I can get Firefox to either lose track of active connections or generate an authentication pop-up, I will post a follow-up comment on this ticket. Otherwise, please assume I have not been able to reproduce the problem, and this bug is truly squashed. Thanks!
Assignee | ||
Comment 10•5 years ago
|
||
(In reply to James Ralston from comment #9)
I used the linked build all of yesterday, and saw no issues: Firefox did not lose track of active connections, and did not generate authentication pop-ups.
I will continue to use this build until I see (or am told) that the fix has made it into the regular nightly builds.
Sorry that the patches in this bug are not landed due to the holiday break. I'll try to get this done asap.
If I can get Firefox to either lose track of active connections or generate an authentication pop-up, I will post a follow-up comment on this ticket. Otherwise, please assume I have not been able to reproduce the problem, and this bug is truly squashed. Thanks!
Thanks for your great help!
Comment 11•5 years ago
|
||
Comment 12•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3d3e88a9ed3a
https://hg.mozilla.org/mozilla-central/rev/002da22721a1
https://hg.mozilla.org/mozilla-central/rev/a36eac3e7d2e
Description
•