1530429 - SSL validation sometimes fails when it should pass for page loaded on startup (race?)

Reporter

Description

5 years ago

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0

Steps to reproduce:

An automated test I maintain enrols a client system as a member of a FreeIPA domain. This happens via the Cockpit web UI in a Firefox instance. Part of FreeIPA domain enrolment involves adding a new CA cert - the FreeIPA domain's own CA - to the system trust store. This is done during the enrolment, using the Fedora 'shared system certificate' mechanism (see https://fedoraproject.org/wiki/Features/SharedSystemCertificates - basically the CA cert file is dropped in a special location and 'update-ca-trust' is run, which, among other things, updates the NSS database of trusted CA certs). That Firefox instance then exits. In a subsequent step, the test runs a new Firefox instance from a console like this:

startx /usr/bin/firefox -width 1024 -height 768 https://ipa001.domain.local

which should immediately load the FreeIPA webUI. That server uses an SSL cert issued by the FreeIPA domain CA, so it should be trusted.
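For anyone trying to reproduce this outside the test, the trust-store step described above can be approximated as follows. This is a minimal sketch, not what FreeIPA enrolment actually runs. It assumes the standard Fedora anchors directory /etc/pki/ca-trust/source/anchors. The function name install_ca_cert and its parameters are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: the standard Fedora location for locally trusted CA anchors.
ANCHORS_DIR = Path("/etc/pki/ca-trust/source/anchors")


def install_ca_cert(cert_file, anchors_dir=ANCHORS_DIR, run=subprocess.run):
    """Copy a PEM CA certificate into the anchors directory, then refresh the trust store.

    `run` is injectable so the function can be exercised without root.
    """
    dest = Path(anchors_dir) / Path(cert_file).name
    shutil.copyfile(cert_file, dest)
    # update-ca-trust regenerates the consolidated trust data; it needs root on a real system.
    run(["update-ca-trust"], check=True)
    return dest
```

On a real system this would be run as root, after which a freshly started Firefox is expected to trust certs issued by that CA.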

Actual results:

Usually, this works fine, and the page loads correctly. However, about 30% of the time, we see a certificate error instead. The error is SEC_ERROR_UNKNOWN_ISSUER: "Peer's Certificate issuer is not recognized". So, I looked into this further: I had the test show the details of the error (the actual certificate chain it received from the server), and then go into the Firefox Certificate Manager and page through the entries. This showed that the expected certificate for the 'DOMAIN.LOCAL' CA is actually present in the list. I had the test export this certificate so we could look at it; manually comparing that certificate to the one shown as 'not recognized' in the SEC_ERROR_UNKNOWN_ISSUER error shows they are identical. Finally, I had the test try simply refreshing the page when it hits the certificate error...and lo and behold, on refresh the page loads fine.

I suspect there is a race here between load/refresh of the system trusted certificate store, and load of a page specified on the Firefox command line. If the latter happens fast, the certificate can be erroneously rejected in this way.

Here's a sample of the test in the version where it provides additional data on the certificate failure:

https://openqa.stg.fedoraproject.org/tests/484084
https://openqa.stg.fedoraproject.org/tests/484084#downloads

On the "Logs & Assets" page (the second link), 'freeipa_webui-test.crt' is the CA cert for the FreeIPA domain CA, as exported from the 'Certificate Manager' within Firefox. The screenshots https://openqa.stg.fedoraproject.org/tests/484084#step/freeipa_webui/14 and https://openqa.stg.fedoraproject.org/tests/484084#step/freeipa_webui/15 show the certificate chain that Firefox rejected; you can manually verify that the CA cert in the chain is identical to the one that we find in the Certificate Manager. 'freeipa_webui-certs' is the output of a further check I had the test run, where it uses openssl s_client to contact the same URL, and display and verify the certificate chain; it displays the same chain, and accepts it.
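The "identical certificate" check does not have to be done by eye. A rough sketch using only the Python standard library: hash the DER bytes of each PEM certificate and compare the SHA-256 fingerprints. The helper names below are illustrative, and the code assumes both certificates are available as PEM text (for example the exported 'freeipa_webui-test.crt' and the issuer cert saved from the openssl s_client output).

```python
import hashlib
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def pem_fingerprint(pem_text):
    """Return the SHA-256 hex fingerprint of the first certificate found in pem_text."""
    # Cut out the first BEGIN/END block so surrounding text or extra certs are ignored.
    start = pem_text.index(PEM_HEADER)
    end = pem_text.index(PEM_FOOTER, start) + len(PEM_FOOTER)
    der = ssl.PEM_cert_to_DER_cert(pem_text[start:end])
    return hashlib.sha256(der).hexdigest()


def same_certificate(pem_a, pem_b):
    """True if both PEM strings encode byte-identical certificates."""
    return pem_fingerprint(pem_a) == pem_fingerprint(pem_b)
```

Two certificates with the same fingerprint are byte-for-byte the same, which is the property the manual comparison above relies on.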

https://openqa.stg.fedoraproject.org/tests/484652 is an example of the test in the form where it retries when it hits the certificate error. You can see the certificate error at the seventh and eighth screenshots of the 'freeipa_webui' test module - https://openqa.stg.fedoraproject.org/tests/484652#step/freeipa_webui/7 and https://openqa.stg.fedoraproject.org/tests/484652#step/freeipa_webui/8 . The next screenshot - https://openqa.stg.fedoraproject.org/tests/484652#step/freeipa_webui/9 - is after the test then clicks the 'refresh' button: as you can see, the same URL loaded just fine. You can also see a video of this happening - https://openqa.stg.fedoraproject.org/tests/484652/file/video.ogv (skip to about 1:28 - note the video is faster than real time and skips some frames, it is constructed from screenshots).
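The retry workaround amounts to the pattern below. This is only a sketch of the idea in Python, not the actual openQA test code. CertificateError, load_with_retry and the load_page callable are hypothetical stand-ins for "the browser showed SEC_ERROR_UNKNOWN_ISSUER" and "load (or refresh) the URL".

```python
import time


class CertificateError(Exception):
    """Hypothetical marker for a certificate error page being shown."""


def load_with_retry(load_page, attempts=3, delay=1.0, sleep=time.sleep):
    """Call load_page() until it stops raising CertificateError.

    Returns (result, attempt_number). Re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return load_page(), attempt
        except CertificateError as err:
            last_error = err
            if attempt < attempts:
                sleep(delay)  # give the trust store a moment before refreshing
    raise last_error
```

Note that a retry like this only hides the problem in the test; it says nothing about why the first load fails.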

Expected results:

The SSL cert should be considered valid every time: if I run the test 100 times, validation should succeed 100 times, rather than failing ~30% of the time.