Closed Bug 787155 Opened 12 years ago Closed 11 years ago

crash in PKIX_List_GetItem (disable it on B2G)

Categories

(Core :: Security: PSM, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
1.0.1 Madrid (19apr)
blocking-b2g tef+
Tracking Status
firefox21 --- wontfix
firefox22 --- wontfix
firefox23 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- fixed

People

(Reporter: wsmwk, Assigned: briansmith)

References

Details

(Keywords: crash, perf, Whiteboard: [tbird crash][b2g-crash][eta:2013-04-19][madrid])

Crash Data

Attachments

(2 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-3e9ed381-d082-4f27-9c6d-926a22120828 .
============================================================= 
0	nss3.dll	PKIX_List_GetItem	security/nss/lib/libpkix/pkix/util/pkix_list.c:1553
1	nss3.dll	pkix_CacheCert_Lookup	security/nss/lib/libpkix/pkix/util/pkix_tools.c:1072
2	nss3.dll	pkix_Build_GatherCerts	security/nss/lib/libpkix/pkix/top/pkix_build.c:1800
3	nss3.dll	pkix_BuildForwardDepthFirstSearch	security/nss/lib/libpkix/pkix/top/pkix_build.c:2377
4	nss3.dll	pkix_Build_InitiateBuildChain	security/nss/lib/libpkix/pkix/top/pkix_build.c:3615
5	nss3.dll	PKIX_BuildChain	security/nss/lib/libpkix/pkix/top/pkix_build.c:3786
6	nss3.dll	CERT_PKIXVerifyCert	security/nss/lib/certhigh/certvfypkix.c:2204
7	xul.dll	nsNSSCertificate::hasValidEVOidTag	security/manager/ssl/src/nsIdentityChecking.cpp:1189
8	xul.dll	nsNSSCertificate::getValidEVOidTag	security/manager/ssl/src/nsIdentityChecking.cpp:1224
9	xul.dll	nsNSSCertificate::GetIsExtendedValidation	security/manager/ssl/src/nsIdentityChecking.cpp:1250 

firefox bp-ece0c31e-cdad-48fc-aa94-41dbf2120803

crashes in all releases of the past year. and all OS
nothing especially useful in any of the crash comments afaict.
I have this type of crash on SeaMonkey [1], there are many crashes on Firefox [2], so IMHO it isn't Thunderbird-specific.
[1] https://crash-stats.mozilla.com/report/index/bp-e781b49c-7333-40be-85bc-f40c72121215
[2] https://crash-stats.mozilla.com/report/list?signature=PKIX_List_GetItem
Component: Security → Security: PSM
Product: Thunderbird → Core
This caused an OS restart crash on b2g upon launching the homescreen. Doing the following STR:

1. Flash the 3/29 build
2. Complete FTE and enable data
3. Unlock lockscreen
4. Switched to homescreen on the right

https://crash-stats.mozilla.com/report/index/bp-f298fc2c-8eca-43db-a36e-d7ca02130329

Brian - Any ideas? Why is the homescreen experiencing an OS restart crash here? Should we nom to block leo or tef?
tracking-b2g18: --- → ?
Flags: needinfo?(bsmith)
Whiteboard: [tbird crash] → [tbird crash][b2g-crash]
I am working on a patch that will remove all of this code from B2G 1.0 at least, and I have another patch that will allow us to remove this code from (at least the default configuration of) Firefox and Thunderbird.
Assignee: nobody → bsmith
Flags: needinfo?(bsmith)
(In reply to Jason Smith [:jsmith] from comment #2)
> Why is the homescreen experiencing an OS restart crash
> here?

This code (PKIX_*) is executed during the course of validating an SSL certificate for a site that has an Extended Validation (EV) certificate. This is the code that decides, in non-B2G products, whether the lock icon should be green or gray. Since we don't make that distinction in B2G (especially for apps where there isn't even an address bar), we can disable this functionality for B2G. Not only will this work around this problem, but it will be a significant performance enhancement for these EV sites due to the savings of revocation checking overhead.

> Should we nom to block leo or tef?

Sure, it seems reasonable to nom this to block. I will have a patch very soon to disable this code path.
I think Brian intended to nom this as a blocker.
blocking-b2g: --- → tef?
tracking-b2g18: ? → ---
Oh, and once we have a patch, an STR showing this was the cause would be nice.
blocking-b2g: tef? → tef+
Priority: -- → P1
Whiteboard: [tbird crash][b2g-crash] → [tbird crash][b2g-crash][eta:2013-04-15]
Target Milestone: --- → mozilla23
Hi Brian. Past 4/15 now - do you have a new ETA for this?
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-15] → [tbird crash][b2g-crash][eta:2013-04-15][status: needs new ETA]
Sorry, I meant to write "2013-04-16" as the original ETA anyway.

I have a patch that I need to run through try and get reviewed. This is a simple patch (#ifdefing out some unneeded code), so it should go quickly.
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-15][status: needs new ETA] → [tbird crash][b2g-crash][eta:2013-04-16]
https://tbpl.mozilla.org/?tree=Try&rev=d52de52751a0

^ This patch is a superset of the patch that will fix this bug. Pushing back ETA because cviecco has limited availability and it may take longer to get a review, once I post a patch for review, tomorrow, Pacific time.
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-16] → [tbird crash][b2g-crash][eta:2013-04-19]
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-19] → [tbird crash][b2g-crash][eta:2013-04-19][madrid]
Target Milestone: mozilla23 → Madrid (19apr)
This patch fixes the bug by ensuring that libpkix never runs in B2G (PKIX_List_GetItem is part of libpkix). The only thing we use libpkix for is the EV indicator, and B2G doesn't have an EV indicator anyway. Not only will this fix the crash, but it will be a "free" performance improvement for EV sites because (a) we don't do the extra certificate verification to see if the cert is EV, (b) we don't do the extra OCSP checks that are part of that extra certificate verification, and (c) we don't have to do the "GetPreviousCert" trip through the main thread's event loop to do the caching of the EV status.
Attachment #739380 - Flags: review?(honzab.moz)
Keywords: perf
OS: All → Gonk (Firefox OS)
Hardware: All → ARM
No longer blocks: 863534
I filed NSS bug 863534 for fixing this more generally in NSS.
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-19][madrid] → [tbird crash][b2g-crash][eta:2013-04-19][madrid][awaiting review]
I thought we had a flag to switch PKIX usage on and off (nsNSSComponent::globalConstFlagUsePKIXVerification), why is there so much changes needed in the patch then?
Summary: crash in PKIX_List_GetItem → crash in PKIX_List_GetItem (disable it on B2G)
Comment on attachment 739380 [details] [diff] [review]
Do not use libpkix in B2G at all, to avoid crash in PKIX_List_GetItem

How this patch fixes this crash https://crash-stats.mozilla.com/report/index/5ef6e795-c9f4-4ede-932a-b22fe2130419 ?
Comment on attachment 739380 [details] [diff] [review]
Do not use libpkix in B2G at all, to avoid crash in PKIX_List_GetItem

Review of attachment 739380 [details] [diff] [review]:
-----------------------------------------------------------------

This fixes the only one crash ever reported on B2G: https://crash-stats.mozilla.com/report/list?product=B2G&query_search=signature&query_type=contains&query=PKIX_List_GetItem&reason_type=contains&date=04%2F19%2F2013%2015%3A55%3A21&range_value=2&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=PKIX_List_GetItem

::: security/manager/ssl/src/nsIdentityChecking.cpp
@@ +1308,5 @@
>  
>  nsresult
>  nsNSSCertificate::getValidEVOidTag(SECOidTag &resultOidTag, bool &validEV)
>  {
> +#ifdef NSS_NO_LIBPKIX

Looks like you cannot get here according the previous #ifndef condition.

Leftover?  Since you don't declare the method for no-libpkix too.

@@ +1382,5 @@
>  
>      outDottedOid = oid_str;
>      PR_smprintf_free(oid_str);
>    }
> +#endif

#else
outDottedOid.Truncate(); or return NS_ERROR_FAILURE;
#endif

::: security/manager/ssl/src/nsNSSCertHelper.cpp
@@ +2097,5 @@
>        return rv;
>  
>      if (!validEV)
>        ev_oid_tag = SEC_OID_UNKNOWN;
> +#endif

Nit: Just put this #endif above one line?

::: security/manager/ssl/src/nsNSSCertificate.cpp
@@ +1277,5 @@
> +      return nsrv;
> +    RefPtr<nsCERTValInParamWrapper> survivingParams;
> +    nsrv = inss->GetDefaultCERTValInParam(survivingParams);
> +    if (NS_FAILED(nsrv))
> +      return nsrv;

This check seems to be important to me for both compilation variations.  It's on purpose to check we can instantiate the component prior any NSS calls here (PSM panic).

::: security/manager/ssl/src/nsNSSCertificateDB.cpp
@@ +582,5 @@
> +        return nsrv;
> +      RefPtr<nsCERTValInParamWrapper> survivingParams;
> +      nsrv = inss->GetDefaultCERTValInParam(survivingParams);
> +      if (NS_FAILED(nsrv))
> +        return nsrv;

Same concern about checking nss component here.  I'd rather leave it as is.

More places here in this file.

Also GetDefaultCERTValInParam doesn't seems not to PKIX_List_GetItem at all.

Maybe I don't see the reason, just explain in the code.

::: security/manager/ssl/src/nsSSLStatus.cpp
@@ +6,5 @@
>  
>  #include "nsSSLStatus.h"
>  #include "plstr.h"
>  #include "nsIClassInfoImpl.h"
> +#include "nsIIdentityInfo.h"

Why needed?
Attachment #739380 - Flags: review?(honzab.moz) → review+
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-19][madrid][awaiting review] → [tbird crash][b2g-crash][eta:2013-04-19][madrid][awaiting review response from bsmith]
Comment on attachment 739380 [details] [diff] [review]
Do not use libpkix in B2G at all, to avoid crash in PKIX_List_GetItem

Review of attachment 739380 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the quick review, Honza.

There are a few layers to this patch. Perhaps I should have broken it up into fewer patches.

First, the most important change is the change to nsIdentityChecking.cpp to ensure that we never do the EV checks, because EV checks always use libpkix, regardless of the value of globalConstFlagUsePKIXVerification. This change alone would be enough to fix this bug, AFAICT.

Second, all the other callers of CERT_PKIXVerifyCert are #ifdefed out just to be extra careful.

Third, since we're not doing EV validation any more for B2G (because there is no UI for it yet), we might as well #ifdef out all of the extra stuff that was added just to support EV validation. This is why all the DefaultCERTValInParam stuff and the changes to nsNSSIOLayer.cpp were done.

Fourth, there is another part to this patch, which I will post in a follow-up bug, that makes it possible to build NSS without libpkix at all, to save some space in NSS on B2G. (libpkix is a fairly large part of NSS, in terms of object code size and in terms of lines of code and complexity.) This is another reason I #ifdef'd out every call to CERT_PKIXVerifyCert, because otherwise linking to the minimized NSS would fail.

::: security/manager/ssl/src/nsIdentityChecking.cpp
@@ +1308,5 @@
>  
>  nsresult
>  nsNSSCertificate::getValidEVOidTag(SECOidTag &resultOidTag, bool &validEV)
>  {
> +#ifdef NSS_NO_LIBPKIX

Good catch. Fixed.

@@ +1382,5 @@
>  
>      outDottedOid = oid_str;
>      PR_smprintf_free(oid_str);
>    }
> +#endif

Fixed by doing outDottedOid.Truncate() at the start of the function.

::: security/manager/ssl/src/nsNSSCertHelper.cpp
@@ +2097,5 @@
>        return rv;
>  
>      if (!validEV)
>        ev_oid_tag = SEC_OID_UNKNOWN;
> +#endif

I think it is clearer the way I wrote it, so I am leaving it as-is.

::: security/manager/ssl/src/nsNSSCertificate.cpp
@@ +1277,5 @@
> +      return nsrv;
> +    RefPtr<nsCERTValInParamWrapper> survivingParams;
> +    nsrv = inss->GetDefaultCERTValInParam(survivingParams);
> +    if (NS_FAILED(nsrv))
> +      return nsrv;

The do_GetService for nsINSSComponent was only added when the globalConstFlagUsePKIXVerification option was added, so it should only be needed for the libpkix case.

If this is a problem, then it would be a problem in many other methods of this class that also assume that PSM/NSS has been initialized.

I think the best thing to do is check that PSM was instantiated in nsNSSCertificate::Create so that we don't have to worry about initializing PSM/NSS in any nsNSSCertificate method. I filed a bug for that in 863860.
https://hg.mozilla.org/mozilla-central/rev/f9405d359a22
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Depends on: 864633
This will need a branch-specific patch for the b2g18 uplift.
Flags: needinfo?(bsmith)
Whiteboard: [tbird crash][b2g-crash][eta:2013-04-19][madrid][awaiting review response from bsmith] → [tbird crash][b2g-crash][eta:2013-04-19][madrid]
This broke my build on osx 10.7.5, the problem is pretty obvious:

/Users/shanec/moz/m-c/security/manager/ssl/src/nsNSSCertificateDB.cpp:1362:73: error: use of undeclared identifier 'nsrv'
      nsCOMPtr<nsINSSComponent> inss = do_GetService(kNSSComponentCID, &nsrv);
                                                                        ^
/Users/shanec/moz/m-c/security/manager/ssl/src/nsNSSCertificateDB.cpp:1364:16: error: use of undeclared identifier 'nsrv'
        return nsrv;
Comment on attachment 739380 [details] [diff] [review]
Do not use libpkix in B2G at all, to avoid crash in PKIX_List_GetItem


>@@ -1349,34 +1343,45 @@ nsNSSCertificateDB::FindCertByEmailAddre

>+#ifndef NSS_NO_LIBPKIX
>     }
>     else {
>+      nsCOMPtr<nsINSSComponent> inss = do_GetService(kNSSComponentCID, &nsrv);
>+      if (!inss)
>+        return nsrv;
>+      RefPtr<nsCERTValInParamWrapper> survivingParams;
>+      nsresult nsrv = inss->GetDefaultCERTValInParam(survivingParams);
>+      if (NS_FAILED(nsrv))
>+        return nsrv;
>       CERTValOutParam cvout[1];
>       cvout[0].type = cert_po_end;


That is using nsrv before it is defined.
(In reply to Shane Caraveo (:mixedpuppy) from comment #20)
> That is using nsrv before it is defined.

Fixed in bug 864633. It turns out that I made a mistake and defined NSS_NO_LIBPKIX everywhere.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #18)
> This will need a branch-specific patch for the b2g18 uplift.

I will merge the patch for bug 864633 into this patch, and merge it into b2g18.
Flags: needinfo?(bsmith)
Brian, it's getting pretty late in the game for tef blockers. Can you do the uplift soon please?
Flags: needinfo?(bsmith)
cviecco, this was a non-trivial merge due to many conflicting changes. Please sanity-check it for me.
Attachment #747649 - Flags: review?(cviecco)
Flags: needinfo?(bsmith)
Comment on attachment 747649 [details] [diff] [review]
Patch merged to mozilla-b2g18 + fix for bug 864633

Review of attachment 747649 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. EV is avoided.
Attachment #747649 - Flags: review?(cviecco) → review+
Attachment #747649 - Attachment is obsolete: true
Was the change on the way the intermediate certificates are validated intended? 
https://uapush-nv.srv.openwebdevice.com/about (signed with a Versign intermediate certificate that isn't supplied by the server) worked before this patch (and still works if I revert the changes) but it doesn't anymore. It fails with "The certificate is not trusted because no issuer chain was provided."

We can fix that server side, but the thing is that the server works correctly on Firefox desktop, IE, and "old" B2G builds.
Flags: needinfo?(bsmith)
(In reply to Antonio Manuel Amaya Calvo from comment #30)
> Was the change on the way the intermediate certificates are validated
> intended? 

It's complicated. I will try to be precise:

1. The behavior you see is expected because Gecko doesn't try to fetch intermediate certificates through the AIA mechanism.

2. It is *unexpected* that there is any change in behavior here though, since your server isn't even EV.

Is it possible that you are noticing the effects of a race condition? In particular, is it possible that when it "worked" your device had already made a connection to another server that did supply the intermediate, before your connection to https://uapush-nv.srv.openwebdevice.com/about? If so, that first connection that supplied the intermediate cert would have caused Gecko to cache the intermediate such that the subsequent connection to https://uapush-nv.srv.openwebdevice.com/about would succeed.
Flags: needinfo?(bsmith)
(In reply to Antonio Manuel Amaya Calvo from comment #30)
> We can fix that server side, but the thing is that the server works
> correctly on Firefox desktop, IE, and "old" B2G builds.

I tried it in Nightly desktop AND Firefox Beta desktop with a *new* profile and I got the same error in both:

uapush-nv.srv.openwebdevice.com uses an invalid security certificate.
The certificate is not trusted because no issuer chain was provided.
(Error code: sec_error_unknown_issuer)
(In reply to Brian Smith (:bsmith) from comment #32)
> (In reply to Antonio Manuel Amaya Calvo from comment #30)
> > We can fix that server side, but the thing is that the server works
> > correctly on Firefox desktop, IE, and "old" B2G builds.
> 
> I tried it in Nightly desktop AND Firefox Beta desktop with a *new* profile
> and I got the same error in both:
> 
> uapush-nv.srv.openwebdevice.com uses an invalid security certificate.
> The certificate is not trusted because no issuer chain was provided.
> (Error code: sec_error_unknown_issuer)

I just tried with a 20.0 with a new profile and you're right, it failed... and then a little while after it worked, without me visiting any site. I think https://versioncheck.addons.mozilla.org/ is the culprit of that :).

B2G, though, was a fresh install (compiled after doing a backout of this patch, flashed and tried directly)... maybe some of the sites it autoconnects uses the same CA too and I just was too slow checking. I didn't knew the intermediate CAs supplied by other sites were cached and reused.

Ok, since this is an expected (yet strange) behavior, we'll just fix up the server (as we were going to do anyway :) ) and consider this closed. 

Thanks Brian!
You need to log in before you can comment on or make changes to this bug.