Closed Bug 1582589 Opened 5 years ago Closed 5 years ago

Client cert checking fails when CAs differ on server and client certs

Categories

(DevTools :: Netmonitor, defect, P2)

70 Branch

Tracking

(firefox70 wontfix, firefox71 wontfix, firefox72 fixed)

RESOLVED FIXED
Firefox 72

People

(Reporter: darakian, Assigned: Honza)

Details

(Keywords: regression)

Attachments

(2 files)

Attached file ClientCertFail.txt

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:69.0) Gecko/20100101 Firefox/69.0

Steps to reproduce:

Using an internal site I can reliably trigger a crash in ThreadSafeDevToolsUtils.js at 90:13 when visiting a client cert required page.

Actual results:

Crash in some sublogic which causes cert checking to fail.

Expected results:

Client cert checking should pass (it does so on chrome/safari).

Across repeated attempts to access the page, access was granted once. The Firefox version in use is 70.0b7 (64-bit).

Hi Jon,

Thank you for your report.

I understand that you are using an internal site; for that reason we wouldn't be able to replicate this behavior. Could you provide us with a way to test this on our end, as well as the specific steps to follow?

Please let us know if this is reproducible on the latest Firefox Nightly; you can download it from here: https://nightly.mozilla.org/.

In the meantime, I'll add a product and component so the corresponding dev team can take a look at this. Feel free to change it if you think this is not the right component.

Regards,

Component: Untriaged → Security: PSM
Flags: needinfo?(darakian)
Product: Firefox → Core
Component: Security: PSM → General
Product: Core → DevTools

I'm not sure how specific I can get because I don't have complete knowledge of the infrastructure in use, but I will try my best. For what it's worth, Nightly seems to have fixed this issue: I can consistently access my internal page. Also, if you'd like me to run some tests I'd be happy to, as long as I can filter the results before sending them back to you.

As for the setup: we have an HAProxy load balancer (I'm unsure of the version, but I can check) checking for client certs when accessing a given path, /foo. To be honest, I'm not sure what other details would be useful in testing. Given that this is a piece of infrastructure that I didn't build, I'm not 100% sure how to recreate it.

Flags: needinfo?(darakian)

The exact nightly version I tested was 71.0a1 (2019-09-23) (64-bit)

Thanks for filing. The attached file mentions that the error occurs in network-response-listener.js and that the failure code is 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsICacheInfoChannel.isRacing].
The code can be seen here: https://searchfox.org/mozilla-central/rev/45f30e1d19bde27bf07e47a0a5dd0962dd27ba18/devtools/server/actors/network-monitor/network-response-listener.js#335

I'm moving this to the network component for investigation.
It seems that without a URL to reproduce and a set of steps, this is going to be hard, but maybe the error is enough for people to understand what may be going on and make the code safer around this part.

Component: General → Netmonitor

Glad to help as much as I can. I'm going to be fairly conservative about what I share given that this is an internal-only web portal, but I'd be happy to run tests around this. The issue seems to be a race condition with a high likelihood of crashing in FF version 70, and it seems to be resolved (or at least the race has different likely outcomes) in FF version 71.

The platform method HttpChannelChild::IsRacing throws the NS_ERROR_NOT_AVAILABLE exception if it isn't called during or after OnStartRequest. I.e. mAfterOnStartRequestBegun must be true.
https://searchfox.org/mozilla-central/rev/23f836a71cfe961373c8bd0d0219ec60a64b3c8f/netwerk/protocol/http/HttpChannelChild.cpp#3133

DevTools calls isRacing at one place only
https://searchfox.org/mozilla-central/rev/23f836a71cfe961373c8bd0d0219ec60a64b3c8f/devtools/server/actors/network-monitor/network-response-listener.js#335

It's called within _getSecurityInfo, which is called only from onStartRequest
https://searchfox.org/mozilla-central/rev/23f836a71cfe961373c8bd0d0219ec60a64b3c8f/devtools/server/actors/network-monitor/network-response-listener.js#224

@mayhemer: how come we can get the exception?
Is mAfterOnStartRequestBegun properly set at all times?

Honza

Flags: needinfo?(honzab.moz)

fwding to michal.

Flags: needinfo?(honzab.moz) → needinfo?(michal.novotny)

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #7)

@mayhemer: how come we can get the exception?
Is mAfterOnStartRequestBegun properly set at all times?

Honza

As the name says - this is set after we have started the notification but not AFTER WE HAVE MADE THE NOTIFICATION (yeah, a weird name, I have to admit).
https://searchfox.org/mozilla-central/search?case=true&q=mAfterOnStartRequestBegun
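
To make the timing described above concrete, here is a minimal JavaScript model. It is not the real C++ implementation in HttpChannelChild; `ModelChannel`, `deliverOnStartRequest` and the `afterOnStartRequestBegun` field are hypothetical names that only mimic the behavior stated in this thread: the flag is set when the OnStartRequest notification begins, and isRacing throws if it is called before that point.

```javascript
// Simplified model only: NOT the real HttpChannelChild code.
// All names here are hypothetical stand-ins for illustration.
class ModelChannel {
  constructor(listener) {
    this.listener = listener;
    // Mirrors the role of mAfterOnStartRequestBegun as described above.
    this.afterOnStartRequestBegun = false;
  }

  isRacing() {
    if (!this.afterOnStartRequestBegun) {
      // Stands in for the NS_ERROR_NOT_AVAILABLE failure.
      throw new Error("NS_ERROR_NOT_AVAILABLE");
    }
    return false;
  }

  deliverOnStartRequest() {
    // The flag flips when the notification begins, before the listener runs.
    this.afterOnStartRequestBegun = true;
    this.listener.onStartRequest(this);
  }
}

// Results of isRacing() calls made from inside the listener.
const calls = [];
const channel = new ModelChannel({
  onStartRequest(ch) {
    calls.push(ch.isRacing());
  },
});

// Calling too early throws in this model.
let threwBefore = false;
try {
  channel.isRacing();
} catch (e) {
  threwBefore = true;
}

// Calling from within onStartRequest succeeds in this model.
channel.deliverOnStartRequest();
```

Under this model a call made from inside onStartRequest should not throw, which is why the exception reported in this bug is surprising.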

So, a bit more input on this. During some testing on Monday (2019-10-07) between ~10am and 1pm PST, Firefox was well behaved and reliably let me access my page. I picked up testing again today and the cert issue is back.

@Jon, any update on this? Still reproducible?

Honza

Flags: needinfo?(darakian)

Hi Jan,

Yes still reproducible. I'm now using 71.0b4 (64-bit) to test and the error I'm getting is

Exception { name: "NS_ERROR_NOT_AVAILABLE", message: "Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsICacheInfoChannel.isRacing]", result: 2147746065, filename: "resource://devtools/server/actors/network-monitor/network-response-listener.js", lineNumber: 335, columnNumber: 0, data: null, stack: "NetworkResponseListener.prototype._getSecurityInfo<@resource://devtools/server/actors/network-monitor/network-response-listener.js:335:26\nexports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22\nonStartRequest@resource://devtools/server/actors/network-monitor/network-response-listener.js:224:10\n", location: XPCWrappedNative_NoHelper }
ThreadSafeDevToolsUtils.js:90:13
    reportException resource://devtools/shared/ThreadSafeDevToolsUtils.js:90
    makeInfallible resource://devtools/shared/ThreadSafeDevToolsUtils.js:117
    onStartRequest resource://devtools/server/actors/network-monitor/network-response-listener.js:224

Let me know if you'd like an expanded version of that error message.

On FF version 72.0a1 (2019-10-25) (64-bit)
Everything works.

Flags: needinfo?(darakian)

Thanks Jon!

Given that we don't have a test case, let's use a try/catch statement to make the isRacing API call safer.

Honza
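
A minimal sketch of the kind of guard proposed here, for readers following along. This is not the landed patch: `makeChannel` and `safeIsRacing` are hypothetical names, the channel is a plain-object stand-in for nsICacheInfoChannel, and falling back to `false` when the call throws is an assumption made for illustration.

```javascript
// Failure code reported in this bug.
const NS_ERROR_NOT_AVAILABLE = 0x80040111;

// Hypothetical stand-in for a channel. When `available` is false,
// isRacing() throws, as in the exception quoted in this thread.
function makeChannel(available, racing) {
  return {
    isRacing() {
      if (!available) {
        const err = new Error("NS_ERROR_NOT_AVAILABLE");
        err.result = NS_ERROR_NOT_AVAILABLE;
        throw err;
      }
      return racing;
    },
  };
}

// Hypothetical helper: wrap the call in try/catch so an exception cannot
// interrupt the caller. Returning false on failure is an assumption.
function safeIsRacing(channel) {
  try {
    return channel.isRacing();
  } catch (e) {
    return false;
  }
}

// Usage: the first call swallows the exception, the second passes through.
const whenUnavailable = safeIsRacing(makeChannel(false, true));
const whenAvailable = safeIsRacing(makeChannel(true, true));
```

The guard does not explain why the exception is raised; it only keeps the rest of the security-info collection from being aborted by it.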

Assignee: nobody → odvarko
Status: UNCONFIRMED → ASSIGNED
Has STR: --- → no
Ever confirmed: true
Priority: -- → P2
Pushed by dwalsh@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9e324517aced
Client cert checking fails when CAs differ on server and client certs r=davidwalsh

Which version should I look for to test this change? Alternatively, is there a way I can hot-patch this JS in my current install?

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 72

(In reply to Jon from comment #16)

Which version should I look for to test this change? Alternatively, is there a way I can hot-patch this JS in my current install?

The patch landed in Firefox 72, which is the current Nightly.
You can download a Nightly build from here:
https://www.mozilla.org/en-US/firefox/nightly/all/

Honza

Cool. I'll make a note to comment again here if I see an issue in 72 (once that lands in stable). Thanks :)

(In reply to Jon from comment #19)

Cool. I'll make a note to comment again here if I see an issue in 72 (once that lands in stable). Thanks :)

Great, thanks! As soon as the fix is verified in Nightly we can ask for an uplift to 71.
Honza

Flags: needinfo?(michal.novotny)

Jon, can you please check if 72 is fixed?
Honza

Flags: needinfo?(darakian)

72 was working before. I've updated to 72.0a1 (2019-11-05) (64-bit) and that is also working.

Flags: needinfo?(darakian)

71.0b7 (64-bit) however does not work.

Comment on attachment 9104859 [details]
Bug 1582589 - Client cert checking fails when CAs differ on server and client certs

Beta/Release Uplift Approval Request

  • User impact if declined: Client certification fails if DevTools are opened (and the Network panel activated)
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Small code change (just try-catch statement)
  • String changes made/needed:
Attachment #9104859 - Flags: approval-mozilla-beta?

For what it's worth, when v71 was Nightly it was working as well. I'm not sure if everything from Nightly 71 made it into beta 71, but I think it's worth pointing out.

The versions affected seem a bit unclear to me. Jon mentioned that this was working in 71.0a1 and 72.0a1 before the patch, but he also said that it is not working in 71.0b4 and b7. That seems to indicate that there is a regression in 71 beta, maybe due to an uplift in beta 4, or that the bug manifests with a preference difference between nightly and beta. Could we maybe provide a try build on the beta branch with this patch before uplifting? The patch doesn't look harmful, but we still have several betas in which to uplift it, and I'd like to make sure we don't uplift something in a beta build only to back it out later because it was not the cause of the problem.

Flags: needinfo?(odvarko)

To add to this again: a colleague of mine is using FF stable (not sure which specific version) on Linux and does not have issues.

(In reply to Pascal Chevrel:pascalc from comment #26)

The versions affected seem a bit unclear to me. Jon mentioned that this was working in 71.0a1 and 72.0a1 before the patch, but he also said that it is not working in 71.0b4 and b7. That seems to indicate that there is a regression in 71 beta, maybe due to an uplift in beta 4, or that the bug manifests with a preference difference between nightly and beta. Could we maybe provide a try build on the beta branch with this patch before uplifting? The patch doesn't look harmful, but we still have several betas in which to uplift it, and I'd like to make sure we don't uplift something in a beta build only to back it out later because it was not the cause of the problem.

Yeah, I am also confused by this. But given that I don't have a test case it's hard to verify.
I think that we can let this ride the trains to avoid any risk.

Honza

Flags: needinfo?(odvarko)

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #28)

Yeah, I am also confused by this. But given that I don't have a test case it's hard to verify.
I think that we can let this ride the trains to avoid any risk.

Honza

Let's do that, thanks.

Attachment #9104859 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

I'm happy to test specific builds if that helps.

Since a new update got pushed out I gave it another run and I think I need to correct myself on my prior report. I discovered that I had manually imported my cert into FF nightly and not into FF beta. Upon importing my cert directly into FF beta access worked as expected.

In both nightly and beta setting security.enterprise_roots.enabled to true in about:config with no cert loaded causes FF to fail certificate authentication. I have FF set to prompt me to choose a cert and in both cases I get prompted when I load the cert manually, but not when I try to read from the system keychain.

Could it be that macOS keychain access is broken?

Just tested and FF stable also works when manually loading the certificate.

FF stable version tested was 71.0 (64-bit)

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Hey all, just wanted to ping here again to make sure that the updated behavior has been noted. If I don't see a reply I'll make a new issue.

Flags: needinfo?(odvarko)

Jon, does this work for you or did you file a new issue?
If yes, what's the number?

Honza

Flags: needinfo?(odvarko) → needinfo?(darakian)

It did not. I filed a new bug here.
https://bugzilla.mozilla.org/show_bug.cgi?id=1611865

I apologize, but my explanation in this thread was incorrect. I was confusing myself by pre-loading a cert into Firefox and then promptly forgetting that I had done so, which led me to believe that it was being read from the system keychain.

Flags: needinfo?(darakian)