Closed Bug 1593167 Opened 5 years ago Closed 5 years ago

Intermittent mis-reporting potential security risk SEC_ERROR_UNKNOWN_ISSUER

Categories

(NSS :: Libraries, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: subscriptions, Assigned: ueno)

References

Details

Attachments

(8 files)

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0

Steps to reproduce:

Opened browser and navigated to mlb.com and nfl.com.

Actual results:

Received warning.
Warning: Potential Security Risk Ahead
SEC_ERROR_UNKNOWN_ISSUER

Certificate is from DigiCert ECC Secure Server CA

Serial Number:
0B:54:63:60:DA:9F:10:FF:FA:65:F9:D9:DF:5E:C1:25

SHA-256 Fingerprint:
A6:F9:49:FE:96:2C:EF:6A:5F:E7:43:2B:20:E4:AF:47:8F:82:AB:54:E5:69:AF:C3:C3:22:9E:0F:54:E0:FB:D0

Expected results:

The site should have been verified as secure.

Additional Details:
This is an install of Firefox 70 on Fedora 30 and 31.

I uninstalled 70 and reinstalled 69. Under 69 I was unable to reproduce this error.

This error does not happen consistently. If I close the browser after receiving the error, reopen the browser the error will not be present. The error happens roughly 1 out of 5 attempts.

I compared the certificate received from an error session, a successful session and all the sessions in Firefox 69 and they were the same. Firefox 69 did not exhibit the issue in 30 attempts.

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → Security: PSM
Product: Firefox → Core

This is probably caused by Fedora system nss used with Firefox, see https://bugzilla.redhat.com/show_bug.cgi?id=1752303.

Summary: Intermittent mis-reporting potential security risk SEC_ERROR_UNKNWON_ISSUER → Intermittent mis-reporting potential security risk SEC_ERROR_UNKNOWN_ISSUER

Can you run Firefox with the environment variable MOZ_LOG set to certverifier:5, reproduce the error, and attach the output here? Thanks!

Flags: needinfo?(subscriptions)

child log requested

Flags: needinfo?(subscriptions)

log file as requested

attached two logs... there is a third log, but it is 124 mb and would not allow me to attach here.

One more file with only certverifier:5 set in logging.

Also experiencing this, on Fedora 30. Can attach logs if needed.

Thanks! Unfortunately the issue isn't clear from that log. Can you try to find when this stopped working using https://mozilla.github.io/mozregression/quickstart.html?

Flags: needinfo?(subscriptions)

Alright, I'm using mozregression to try to reproduce the bug. Struggling so far... will keep you posted.

Hello Dana,

I'm attaching a couple of logs that are collected by accessing the same site with the same configuration (i.e., using p11-kit-trust.so as libnssckbi.so); the former is a successful case and the other is a failed case.

Those are the output of the following command:
MOZ_LOG="certverifier:5" P11_KIT_DEBUG=trust firefox https://phabricator.services.mozilla.com
where I changed p11-kit to print the thread name in addition to PID.

One thing I noticed is that, in the failed case, Firefox doesn't query CKA_MOZILLA_CA_POLICY. It seems that mozilla::psm::NSSCertDBTrustDomain::FindIssuer couldn't find the issuer of a certain certificate, but I'm not sure what is really happening there.

Attached file log in the failed case
Attached file certinfo.zip

Have also experienced this with Firefox 71.0b11 on Windows 8.1 when https://accounts.firefox.com was supposed to be loading. It got reported as MOZILLA_PKIX_ERROR_MITM_DETECTED. The site worked after a browser restart. Attached is the information from about:certificate as pdf (neither ctrl+a nor manually selecting works as expected) and the certificate for accounts.firefox.com and the certificate chain.

Now that the certificate loads, it lists DigiCert Global Root CA which was not listed when it didn't work.

(In reply to Daiki Ueno [:ueno] from comment #14)

One thing I noticed is that, in the failed case, Firefox doesn't query CKA_MOZILLA_CA_POLICY. It seems that mozilla::psm::NSSCertDBTrustDomain::FindIssuer couldn't find the issuer of a certain certificate, but I'm not sure what is really happening there.

I dug it further with rr and found that the actual difference seems to be in this code path:

NSSCertDBTrustDomain::FindIssuer
  -> CheckCandidates
    -> ShouldSkipSelfSignedNonTrustAnchor
      -> NSSCertDBTrustDomain::GetCertTrust
        -> CERT_GetCertTrust

When it fails, ShouldSkipSelfSignedNonTrustAnchor reports true on the "DigiCert Global Root CA" cert, based on the check with CERT_GetCertTrust against a temp certificate created by CERT_NewTempCertificate. However, this function for some reason returns a cached certificate, which does not have CERTDB_TRUSTED_CA bit set.

The offending certificate comes from a different thread which reads the cache as below. Now the question is why this could happen (in particular, more frequently with p11-kit-trust.so). Any ideas?

#0  add_certificate_entry (store=0x7f4aeaa16870, cert=0x7f4ae6b56888)
#1  nssCertificateStore_AddLocked (cert=0x7f4ae6b56888, store=0x7f4aeaa16870)
#2  nssCertificateStore_FindOrAdd
#3  0x00007f4b098a018f in NSSCryptoContext_FindOrImportCertificate
    (cc=cc@entry=0x7f4aeaa16830, c=c@entry=0x7f4ae6b56888)
#4  0x00007f4b098859ce in CERT_NewTempCertificate
    (handle=<optimized out>, derCert=derCert@entry=0x7f4af0b7e298, nickname=nickname@entry=0x0, isperm=isperm@entry=0, copyDER=copyDER@entry=1)
#5  0x00007f4b0984a733 in CERT_DecodeCertFromPackage
    (certbuf=0x7f4ae6b3a400 "0\202\003\257\060\202\002\227\240\003\002\001\002\002\020\b;\340V\220BF\261\241uj\311Y\221\307J0\r\006\t*\206H\206\367\r\001\001\005\005", certlen=<optimized out>)
#6  0x00007f4b0476e7c3 in nsNSSCertificate::InitFromDER(char*, int)
    (this=this@entry=0x7f4ae6b23ba0, certDER=<optimized out>, derLen=<optimized out>)
#7  0x00007f4b0476e8fd in nsNSSCertificate::InitFromDER(char*, int)
    (derLen=<optimized out>, certDER=<optimized out>, this=0x7f4ae6b23ba0)
#8  nsNSSCertificate::Read(nsIObjectInputStream*)
    (this=0x7f4ae6b23ba0, aStream=0x7f4ae6b45aa0)
#9  0x00007f4b013d4e6b in nsBinaryInputStream::ReadObject(bool, nsISupports**)
    (this=this@entry=0x7f4ae6b45aa0, aIsStrongRef=aIsStrongRef@entry=true, aObject=0x7f4af0b7e418)
#10 0x00007f4b0475aaae in nsNSSCertList::Read(nsIObjectInputStream*)
    (this=0x7f4ae6b47640, aStream=0x7f4ae6b45aa0)
#11 0x00007f4b013d4e6b in nsBinaryInputStream::ReadObject(bool, nsISupports**)
    (this=0x7f4ae6b45aa0, aIsStrongRef=<optimized out>, aObject=0x7f4af0b7e538)
#12 0x00007f4b0476078d in NS_ReadOptionalObject(nsIObjectInputStream*, bool, nsISupports**) (aResult=0x7f4af0b7e538, aIsStrongRef=true, aStream=0x7f4ae6b45aa0)
#13 NS_ReadOptionalObject(nsIObjectInputStream*, bool, nsISupports**)
    (aResult=0x7f4af0b7e538, aIsStrongRef=true, aStream=0x7f4ae6b45aa0)
#14 mozilla::psm::TransportSecurityInfo::Read(nsIObjectInputStream*)
    (this=0x7f4ae1f36100, aStream=0x7f4ae6b45aa0)
#15 0x00007f4b013d4e6b in nsBinaryInputStream::ReadObject(bool, nsISupports**)
    (this=0x7f4ae6b45aa0, aIsStrongRef=<optimized out>, aObject=0x7f4af0b7e738)
#16 0x00007f4b014e2ea3 in NS_DeserializeObject(nsTSubstring<char> const&, nsISupports**) (str=..., obj=0x7f4af0b7e738)
#17 0x00007f4b016bb425 in mozilla::net::CacheEntry::GetSecurityInfo(nsISupports**) (this=0x7f4ae1f36600, aSecurityInfo=0x7f4ae6b466d8)
#18 0x00007f4b017a9f03 in mozilla::net::nsHttpChannel::OpenCacheInputStream(nsICacheEntry*, bool, bool)
    (this=0x7f4ae6b46000, cacheEntry=0x7f4ae6b458e0, startBuffering=<optimized out>, checkingAppCacheEntry=<optimized out>)
#19 0x00007f4b017c8d11 in mozilla::net::nsHttpChannel::OnCacheEntryCheck(nsICacheEntry*, nsIApplicationCache*, unsigned int*)
    (this=0x7f4ae6b46000, entry=0x7f4ae6b458e0, appCache=<optimized out>, aResult=<optimized out>)

It turned out to be a race condition: when the first time the "DigiCert Global Root CA" is checked with CERT_NewTempCertificate, the ckbi module (either the stock one or p11-kit-trust.so) might not be loaded, and thus the lookup fails. In that case the function creates a temp certificate without any trust and put it in the cache.

Previously, there was a race condition: when the first time
CERT_NewTempCertificate is called, the ckbi module might not be
loaded, and thus the lookup fails. In that case the function creates a
temp certificate without any trust and put it in the cache, which
later interferes with certificate chain validation.

Status: UNCONFIRMED → NEW
Ever confirmed: true

Thanks for looking into this! Why doesn't the certificate's trust get updated when the module loads?

Flags: needinfo?(dueno)

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #20)

Thanks for looking into this! Why doesn't the certificate's trust get updated when the module loads?

Do you mean that NSS should invalidate the certificate cache (NSSCryptoContext::certStore) on module loading? It could help, but it wouldn't remove the race condition completely. In NSS, the actual place the race is happening is in SECMOD_LoadUserModule, which is called from Firefox directly:

SECMODModule *
SECMOD_LoadUserModule(char *modulespec, SECMODModule *parent, PRBool recurse)
{
    // START
    SECStatus rv = SECSuccess;
    SECMODListLock *moduleLock = SECMOD_GetDefaultModuleListLock();
    SECMODModule *newmod = SECMOD_LoadModule(modulespec, parent, recurse);
    // (*)
    if (newmod) {
	SECMOD_GetReadLock(moduleLock);
        // END
        rv = STAN_AddModuleToDefaultTrustDomain(newmod);
	SECMOD_ReleaseReadLock(moduleLock);
        if (SECSuccess != rv) {
            SECMOD_DestroyModule(newmod);
	    return NULL;
        }
    }
    return newmod;
}

where CERT_NewTempCertificate is called from another thread between START and END. It would be still possible that a temp cert is inserted at (*), because NSSTrustDomain_FindCertificate* checks the new module only after STAN_AddModuleToDefaultTrustDomain is done.

Flags: needinfo?(dueno)

When we're looking up the trust of the certificate in NSSCertDBTrustDomain::GetCertTrust, the loadable roots module is guaranteed to have loaded (assuming it exists) - see bug 1578882. It sounds like when we call CERT_NewTempCertificate at that point, NSS is returning the preexisting temporary certificate created by the network cache (that's what that stack is in comment 17). Is that correct? Should it be? Is there a way to make NSS return a certificate as it exists on a different module? Or perhaps what we want is a way to query a certificate's trust on all modules?

Flags: needinfo?(dueno)

If this is a race condition triggered only by the performance of p11-kit-trust.so, shouldn't we also see the issue with Mozilla's builds in some cases? Does dropping caches (via sysctl) and/or heavily stressing CPU or IO make it appear more often?

Is libnssckbi.so doing something synchronously which p11-kit-trust is doing asynchronously? I.e. by the time a function call in the former returns, the state which triggers the bug has already been left, but this is not the case for the latter.

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #22)

When we're looking up the trust of the certificate in NSSCertDBTrustDomain::GetCertTrust, the loadable roots module is guaranteed to have loaded (assuming it exists) - see bug 1578882. It sounds like when we call CERT_NewTempCertificate at that point, NSS is returning the preexisting temporary certificate created by the network cache (that's what that stack is in comment 17). Is that correct?

Yes, exactly.

Should it be?

This certificate caching seems inevitable as CERT_NewTempCertificate is indirectly called from CERT_DecodeCertFromPackage (and that means that NSS assumes that builtin roots are always loaded before calling those functions, I suppose).

Is there a way to make NSS return a certificate as it exists on a different module? Or perhaps what we want is a way to query a certificate's trust on all modules?

It seems to be possible with PK11_GetAllTokens and PK11_FindCertFromDERCertItem, but it affects performance a lot in my test.

Flags: needinfo?(dueno)

(In reply to Jan Alexander Steffens [:heftig] from comment #24)

If this is a race condition triggered only by the performance of p11-kit-trust.so, shouldn't we also see the issue with Mozilla's builds in some cases? Does dropping caches (via sysctl) and/or heavily stressing CPU or IO make it appear more often?

I haven't looked at the performance difference closely, but it is easily reproducible with the stock libnssckbi.so by adding a sleep around:
https://searchfox.org/mozilla-central/source/security/nss/lib/ckfw/nssck.api#1849

On my environment, 50 msec (PR_Sleep(PR_MillisecondsToInterval(50));) is sufficient.

Is libnssckbi.so doing something synchronously which p11-kit-trust is doing asynchronously? I.e. by the time a function call in the former returns, the state which triggers the bug has already been left, but this is not the case for the latter.

Could be the other way around. While libnssckbi.so uses separate locks for the module, slots, tokens, and objects, p11-kit-trust uses a global lock for most of the functions.

When a builtin root module is loaded after some temp certs being
loaded, our certificate lookup logic preferred those temp certs over
perm certs stored on the root module. This was a problem because such
temp certs are usually not accompanied with trust information.

This makes the certificate lookup logic capable of handling such
situations by checking if the trust information is attached to temp
certs and otherwise falling back to perm certs.

Assignee: nobody → dueno
Attachment #9111593 - Attachment description: certdb: prefer perm certs over temp certs when trust is not available → Bug 1593167, certdb: prefer perm certs over temp certs when trust is not available
Attachment #9111593 - Attachment description: Bug 1593167, certdb: prefer perm certs over temp certs when trust is not available → Bug 1593167, certdb: propagate trust information if trust module is loaded afterwards

Thank you to all that have worked on this.

Flags: needinfo?(subscriptions)
Component: Security: PSM → Libraries
Product: Core → NSS
QA Contact: jjones
Version: 70 Branch → other
Status: NEW → ASSIGNED
Priority: -- → P1
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.49
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: