Closed Bug 1181814 Opened 9 years ago Closed 8 years ago

Pick up FIPS-140 certification work done by Red Hat

Categories

(NSS :: Libraries, defect)

defect
Not set
normal

Tracking

(firefox42 affected)

RESOLVED FIXED
Tracking Status
firefox42 --- affected

People

(Reporter: elio.maldonado.batiz, Assigned: rrelyea)

References

Details

Attachments

(15 files, 4 obsolete files)

117.34 KB, patch
Details | Diff | Splinter Review
35.04 KB, patch
Details | Diff | Splinter Review
6.48 KB, patch
Details | Diff | Splinter Review
84.55 KB, patch
Details | Diff | Splinter Review
186.48 KB, patch
elio.maldonado.batiz
: review+
Details | Diff | Splinter Review
436.23 KB, patch
Details | Diff | Splinter Review
35.01 KB, patch
elio.maldonado.batiz
: review+
Details | Diff | Splinter Review
2.53 KB, patch
Details | Diff | Splinter Review
444.45 KB, patch
KaiE
: review+
Details | Diff | Splinter Review
2.03 KB, patch
Details | Diff | Splinter Review
278.71 KB, patch
Details | Diff | Splinter Review
6.78 KB, patch
Details | Diff | Splinter Review
7.32 KB, text/plain
Details
1.54 KB, patch
wtc
: review+
Details | Diff | Splinter Review
1.72 KB, patch
rrelyea
: superreview-
Details | Diff | Splinter Review
      No description provided.
Blocks: 1181807
The FIPS-140 related patch, or set patches, will be submitted by Robert Relyea.
Target Milestone: --- → 3.20
Assignee: nobody → rrelyea
This patch picks up the following downstream patches:
nss-softoken-3.16-fips-post.patch
nss-softoken-3.16-fips-remo-old-test.patch
nss-softoken-3.16-ppc-no-init_support.patch
nss-softoken-3.16-freebl_dyload.patch

These patches implement the following:
1) FIPS now require that post and integrity tests run at dll load time. This patch arranges for those to happen at dll load time.
2) For platforms that use freeb low hash to support hashing in glibc, all glibc application end up running post and integrity tests at load time even if they don't do hashing. To prevent that, freebl because a stub which calls the real freebl only if you need to do hashing.

This is patch 1 of 4 for we needed to handle the latest FIPS validation.
Attachment #8641127 - Flags: review?(kaie)
Attachment #8641127 - Flags: review?(ekr)
Attached patch fips.patchSplinter Review
This patch contains core FIPS required changes, namely:
Extra CPS data clearing. Changes to the way we handle continuous RNG check, FIPS-186 specific checks on generated RSA keys, Generating just a G parameter from P and Q.
Attachment #8641304 - Flags: review?(wtc)
Attachment #8641304 - Flags: review?(emaldona)
The fips.patch pick up the following patches in the downstream package:
nss-softokn-3.16-fips.patch
nss-softokn-3.16-addG.patch
nss-softokn-3.16-rsa-fips-186.patch
The fips.patch is 2 of 4.
Attached patch level1.patchSplinter Review
patch 3 of 4: allow a FIPS level 1 mode in softoken.

On RHEL-7 we make NSS automatically switch to FIPS mode if the system is in FIPS mode. Unfortunately this caused 2 issues: 1) openNewDB didn't work well when in FIPS mode, and 2) some apps could only use NSS in level 1 mode (no password on the database, or no database altogether).

To fix this we allowed NSS to operate in FIPS level 1 mode if there is no password on the database. This patch allows that to happen.
level1.patch includes the following downstream patches:
nss-softokn-allow-level1.patch
nss-softokn-3.16-allow_level1_init.patch
nss-softokn-3.16-fips_user_slots.patch

The following related downstream patch is already checked into the tree:
limit-create-fipscheck.patch
Comment on attachment 8641381 [details] [diff] [review]
level1.patch

You didn't ask for review on this patch, but it's the shortest, so I took a quick look.

I would like to make the very general statement, that more commenting of the changes would be helpful.

There are many changes that seem to change semtantics, but I have no idea if the change makes sense or not, so I cannot really review them.

For example, function FC_GetTokenInfo previously added the CKF_LOGIN_REQUIRED bit. With your change, you're no longer setting the bit, but instead, you assume it's already set, and query it.

I believe I cannot do a semantic review. All I can do is check for potential obvious coding errors, which can be found without knowing the semantics behind your changes.
Comment on attachment 8641127 [details] [diff] [review]
fips-post-2.patch

Review of attachment 8641127 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/freebl/Makefile
@@ +50,3 @@
>  	LOWHASH_SRCS = nsslowhash.c
>  	LOWHASH_EXPORTS = nsslowhash.h
> +	MAPFILE_SOURCE = freebl_hash_vector.def

The above assignment is unnecessary and should be deleted.

Because of the define statement in the next, the following code block will be executed, and the above assignment overridden.

::: lib/freebl/loader.c
@@ -95,5 @@
> -}
> -#else
> -/* default case, for platforms/ABIs that have only one freebl shared lib. */
> -static const char * getLibName(void) { return default_name; }
> -#endif

It seems like your patch won't build, because you've deleted all definitions of function getLibName() (including the default one), but freebl_LoadDSO() still calls that function.

::: lib/freebl/manifest.mn
@@ +108,5 @@
>  	des.c \
>  	drbg.c \
>  	cts.c \
>  	ctr.c \
> +	fipsfreebl.c \

I don't see this new file anywhere in your patches.

You should run "hg stat" to see the files that are new in your tree (shown with a leading question mark). You need to "hg add" them, prior to running "hg diff".

::: lib/freebl/nsslowhash.c
@@ -90,5 @@
> -    return( CKR_OK );
> -}
> -
> -static CK_RV
> -freebl_fips_SHA_PowerUpSelfTest( void )

Can you please briefly explain why it's ok to remove all these power up tests?

I think I'll stop reviewing now, because your patches remove a large anmount of code, without adding them back elsewhere.

Please doublecheck that you've included all new files in your patches (which potentially will show that you're simply moving code to other places?).
re comment 8.

Thanks for looking at the short patch.

This patch changes softokn from always requiring login in FIPS mode, to using the existance of the database password to determine if we are in FIPS level2 mode or not.

The code in fipstoken.c implements the delta between FIPS and softoken. So, for instance, the CKF_REQUIRE_LOGIN is set inside NSC_GetTokenInfo(), which FC_GetTokenInfo() calls. In the old way (Always level 2 in FIPS mode), this flag would always force a token login in FIPS mode. This patch wants to allow the token to run in level 1 mode, so we no longer always force the CKF_REQUIRE_LOGIN flag. We check the CKF_REQUIRED_LOGIN, and turn off the level2 semantics if it's not set. This is the only place we set isLevel2 to false.

The other thing it does is enforce login state checks on more operations. These enforced login state checks are only required in level2, thus the new isLevel2 variable.

The patch also fixes a bug, where we have tokens in FIPS mode, but the slot ID is user slots, not the system slots. These were returning the wrong values for mechanism info, and this patch corrects that.
Attachment #8641381 - Flags: review?(kaie)
> The above assignment is unnecessary and should be deleted.
> Because of the define statement in the next, the following code block will be executed, and the above assignment > overridden.

No, FREEBL_BUILD_LOWHASH and FREEEBL_LOWHASH_BUILD are two different variables. The first tells the make file that it needs to make the recursive call to make, and the second is set on that recursive call. Two different shared libraries are built as a result, the first with freebl_hash_vector.def as the def file and the second with freebl_hash.def as the def file. It's a bit convoluted, but I'm just reusing the existing mechanism here, in which multiple freebl shared libraries are built by recursive calls to this makefile.

> It seems like your patch won't build, because you've deleted all definitions of function getLibName()
> (including the default one), but freebl_LoadDSO() still calls that function.

Good catch. It builds for me because the patch is missing a file, blname.c where getLibName() was moved to. I'll include a patch with blname.c fipsfreebl.c freebl_hash_vector.def and  lowhash_vector.c (thanks hg status for helping find those).

> I don't see this new file anywhere in your patches.

yup they are missing.

> Can you please briefly explain why it's ok to remove all these power up tests?

Most of them have moved (to files that are missing in this patch). Several old export strength non-FIPS cipher tests were removed because they aren't needed in FIPS tests anymore.

bob
This is the missing files from fips-post-2.patch.
Attachment #8645117 - Flags: review?(kaie)
Attached patch fipstest.patchSplinter Review
This patch updates the fiptest program to 1) handle new ciphers (like gcm), 2) to automate the running of the CAVS tests, and 3) to function with the latest CAVS tests.

NOTE: no attempt has been made to solve the glaringly poor handling of errors in the original fipstest. The program still crashes if you pass it non-existant files, for instance.
Attachment #8645118 - Flags: review?(emaldona)
Attachment #8645118 - Flags: review?(dkeeler)
(In reply to Robert Relyea from comment #11)
> > The above assignment is unnecessary and should be deleted.
> > Because of the define statement in the next, the following code block will be executed, and the above assignment > overridden.
> 
> No, FREEBL_BUILD_LOWHASH and FREEBL_LOWHASH_BUILD are two different
> variables. The first tells the make file that it needs to make the recursive
> call to make, and the second is set on that recursive call.


Ok, that's confusing. I didn't notice they were different variables. They are named too similarly, and the different order of words isn't sufficient to understand the different meaning.

I think at least one of the variables should be renamed.

I suggest to rename the variable for "must trigger recursion" to FREEBL_LOWHASH_TRIGGER_BUILD or similar.

I suggest to rename the variable for "currently building lowhash" to FREEBL_LOWHASH_BUILDING or similar.
Comment on attachment 8641304 [details] [diff] [review]
fips.patch

In rijndael.c you're adding a line that fails to build:

rijndael.c: In function ‘AES_InitContext’:
rijndael.c:1167:7: error: ‘AESContext’ has no member named ‘mode’
     cx->mode = mode;
It seems that your patches might be on top of a snapshot that isn't in sync with the upstream NSS repository.

In the fipstest.patch, you're modifying nonexisting functions "alloc_value" and "isblankline".
More compiler errors:

pkcs11c.c: In function ‘nsc_parameter_gen’:
pkcs11c.c:3569:5: error: ‘rv’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  if (rv != SECSuccess) {

It suggest that you're checking the wrong variable for error here:

	crv = sftk_Attribute2SSecItem(arena, &params->prime, key, CKA_PRIME);
	if (rv != SECSuccess) {
	    goto loser;
	}


and again here:

	crv = sftk_Attribute2SSecItem(arena, &vfy->seed, key, CKA_NSS_PQG_SEED);
	if (rv != SECSuccess) {
	    goto loser;
	}



Unused variable:

lowhash_vector.c:24:20: error: ‘NameOfThisSharedLib’ defined but not used [-Werror=unused-variable]
 static const char *NameOfThisSharedLib = 


A linker failure:

ar cr Linux4.0_x86_64_gcc_glibc_PTH_64_DBG.OBJ/libnssdbm.a ... ar: Linux4.0_x86_64_gcc_glibc_PTH_64_DBG.OBJ/lgfips.o: No such file or directory


I don't see a file lgfips.c in your patches, but you're adding it in the post-2 patch
I think the compile errors may be because Martin has checked in the stricter 'fail on warnings' code. The patched compiled on the trunk at the time I attached them 2 weeks ago. I'll attach the full patch as I have it now..
Attached patch Full version of the patch. (obsolete) — Splinter Review
This version does not fix things like Kai's rename suggestion yet, or merge any of fixes needed after Martin's patches, but it should answer some of the other error Kai said, like lgfips.c and the ridjndael issue.
(In reply to Robert Relyea from comment #20)
It seems that I should review this patch instead of those other ones I still have pending. Don't see flags so I better check, does this one obsolete previous ones?
-Elio
Flags: needinfo?(rrelyea)
(In reply to Elio Maldonado from comment #21)
> (In reply to Robert Relyea from comment #20)
> It seems that I should review this patch instead of those other ones I still
> have pending. Don't see flags so I better check, does this one obsolete
> previous ones?

Elio, no, this patch was supposed to be the set union of all individual patches, attached at my request, to simplify testing all pieces together.

(I said supposed, because apparently it's not the full patch.)
(In reply to Robert Relyea from comment #19)
> Created attachment 8646991 [details] [diff] [review]
> Full version of the patch.

No, this isn't a full version.
It's the same attachment as the one in comment 13.
It has files from nss/cmd/fipstest, only.
Attached patch full.patchSplinter Review
Arg, I attached the wrong file.
Attachment #8646991 - Attachment is obsolete: true
Flags: needinfo?(rrelyea)
Comment on attachment 8641304 [details] [diff] [review]
fips.patch

Review of attachment 8641304 [details] [diff] [review]:
-----------------------------------------------------------------

I compare rsa.c after applying the patch with what we have for nss-softokn on RHEL-7.1 (and RHEL-7..2) and they are different.

[emaldona@dhcp-16-216 upstream]$ diff -u bug-1181814-fips-patch/nss/lib/freebl/rsa.c /home/emaldona/work4nss/BRANCHES4RHEL/nss-softokn/rhel-7.1/nss-softokn-3.16.2.3/nss/lib/freebl/rsa.c
--- bug-1181814-fips-patch/nss/lib/freebl/rsa.c	2015-08-13 16:07:59.899347826 -0700
+++ /home/emaldona/work4nss/BRANCHES4RHEL/nss-softokn/rhel-7.1/nss-softokn-3.16.2.3/nss/lib/freebl/rsa.c	2015-08-13 16:23:19.167759896 -0700
@@ -343,11 +343,10 @@
 	    if (rsa_fips186_verify(&p, &q, &d, keySizeInBits) ){
 		break;
 	    }
-	    prerr = PORT_GetError();
-	} else {
 	    prerr = SEC_ERROR_NEED_RANDOM; /* retry with different values */
+	} else {
+	    prerr = PORT_GetError();
 	}
-	prerr = PORT_GetError();
 	kiter++;
 	/* loop until have primes */
     } while (prerr == SEC_ERROR_NEED_RANDOM && kiter < max_attempts);

Why is there are difference?

::: lib/freebl/rsa.c
@@ -133,1 +133,1 @@
> >  

if (rsa_fips186_verify(&p, &q, &d, keySizeInBits) ){
 		break;
 	    }
 	    prerr = SEC_ERROR_NEED_RANDOM; /* retry with different values */
 	}
 	kiter++;
 	/* loop until have primes */
     } while (prerr == SEC_ERROR_NEED_RANDOM && kiter < max_attempts);
Previous version didn't apply cleanly due to lib/freebl/dh.c code changes since nss-3.16.2.3, on which the previous one was based. This modified one does and is what I'm reviewing.
Attachment #8641304 - Attachment is obsolete: true
Attachment #8641304 - Flags: review?(wtc)
Attachment #8641304 - Flags: review?(emaldona)
Though not caused by the patch under review as is related to a different commit, I should mention it here, before I forget, and is has to do with softokn/pksc11.c where some names for mechanisms are different from what Red Hat has.

--- /home/emaldona/work4nss/ALLBRANCHES4RHEL/nss-softokn/rhel-7.1/nss-softokn-3.16.2.3/nss/lib/softoken/pkcs11c.c       2015-08-14 08:23:41.540656268 -0700
+++ nss/lib/softoken/pkcs11c.c  2015-08-14 08:03:39.389773352 -0700
@@ -2519,17 +2519,17 @@
     case CKM_TLS_PRF_GENERAL:
        crv = sftk_TLSPRFInit(context, key, key_type, HASH_AlgNULL, 0);
        break;
-    case CKM_TLS12_MAC: {
-       CK_TLS12_MAC_PARAMS *tls12_mac_params;
+    case CKM_TLS_MAC: {
+       CK_TLS_MAC_PARAMS *tls12_mac_params;
        HASH_HashType tlsPrfHash;
        const char *label;
 
-       if (pMechanism->ulParameterLen != sizeof(CK_TLS12_MAC_PARAMS)) {
+       if (pMechanism->ulParameterLen != sizeof(CK_TLS_MAC_PARAMS)) {
            crv = CKR_MECHANISM_PARAM_INVALID;
            break;
        }
-       tls12_mac_params = (CK_TLS12_MAC_PARAMS *)pMechanism->pParameter;
-       if (tls12_mac_params->prfHashMechanism == CKM_TLS_PRF) {
+       tls12_mac_params = (CK_TLS_MAC_PARAMS *)pMechanism->pParameter;
+       if (tls12_mac_params->prfMechanism == CKM_TLS_PRF) {
            /* The TLS 1.0 and 1.1 PRF */
            tlsPrfHash = HASH_AlgNULL;
            if (tls12_mac_params->ulMacLength != 12) {
@@ -2539,7 +2539,7 @@
        } else {
            /* The hash function for the TLS 1.2 PRF */
            tlsPrfHash =
-               GetHashTypeFromMechanism(tls12_mac_params->prfHashMechanism);
+               GetHashTypeFromMechanism(tls12_mac_params->prfMechanism);
            if (tlsPrfHash == HASH_AlgNULL ||
                tls12_mac_params->ulMacLength < 12) {
                crv = CKR_MECHANISM_PARAM_INVALID
Comment on attachment 8645118 [details] [diff] [review]
fipstest.patch

Review of attachment 8645118 [details] [diff] [review]:
-----------------------------------------------------------------

In validate1.sh I don't know what extraneous_response=${3} and extraneous_fax=${4} are for. Some comments on the script would help. 
By the way, since I hardly ever use sed so this scripts taking me a while to review. The others ones look fine so far.

::: cmd/fipstest/aes.sh
@@ +1,1 @@
>  #!/bin/sh

Shouldn't the copyright notice be kept? The same comment applies to the other files.
In validatate.1.sh is see
sed -e 's;
;;g' -e 's;	; ;g' -e '/^#/d' $extraneous_response ${TESTDIR}/resp/${name}.rsp > /tmp/y1
would that ne equivalent to this one?
sed -e 's;;;g' -e 's;	; ;g' -e '/^#/d' $extraneous_response ${TESTDIR}/resp/${name}.rsp > /tmp/y1
Assuming so, then what's the sed -e 's;;;g' doing? Looks to my naive reading like replacing empty with empty. Another question that some of of the calling scripts seem to call this one with extraneous_response=' ' and I see $extraneous_fax being some complicated expression that looks like a sed script. Somehow you are trying to get rid on some unwanted lines. Without further help I can't give a trustworthy review of of the validate.1.sh portion.
Flags: needinfo?(rrelyea)
Comment on attachment 8648086 [details] [diff] [review]
fips.patch revised so it applies cleanly

r+ in general. My only reservation is the naming discrepancies with downstream I pointed out in https://bugzilla.mozilla.org/show_bug.cgi?id=1181814#c27
Attachment #8648086 - Flags: review+
re comment 27: I'll make sure the correct names from the spec are used. Once in, if the RHEL versions are wrong, we can update the RHEL versions when you next rebase (probably keep an alias for nss-softoken).
Flags: needinfo?(rrelyea)
> Why is there are difference? (comment 25). 

Looks like a bad patch application. The downstream version is correct and the current patch is wrong. I'll update.
RE comment 28
> In validate1.sh I don't know what extraneous_response=${3} and extraneous_fax=${4} are for.

Validate1.sh is a helper shell script that each of the base test shell scripts call to help validate that the generated response (response) matches the given response (fax). Sometimes (depending on the individual tests) there are extraneous output in either or both response and fax files. These allow the caller to pass in additional sed commands to clear out those extraneous outputs before we compare the two files.

from comment 28
> sed -e 's;
;;g' -e 's;	; ;g' -e '/^#/d' $extraneous_response ${TESTDIR}/resp/${name}.rsp > /tmp/y1
> would that ne equivalent to this one?
> sed -e 's;;;g' -e 's;	; ;g' -e '/^#/d' $extraneous_response ${TESTDIR}/resp/${name}.rsp > /tmp/y1

here's the sed command with the white space characters made visible:
sed -e 's;^M;;g' -e 's;<tab>; ;g'

It removes the carriage return (1/2 of the windows new line), changes tabs to a single space, and deletes any comment lines (starting with #).
(In reply to Elio Maldonado from comment #28)
> 
> Shouldn't the copyright notice be kept? The same comment applies to the
> other files.

missing license header in files:
    aes.sh
    aesgcm.sh
    dsa.sh
    ecdsa.sh
    hmac.sh
    rng.sh
    rsa.sh
    sha.sh
    tdea.sh
    tls.sh
You have moved
    fips_hashBuf
    fips_hashLen
    fips_hashOid
    sha_get_hashType
    getLibName


You removed:
-freebl_fips_MD2_PowerUpSelfTest
-freebl_fips_MD5_PowerUpSelfTest
-sftk_fips_RC2_PowerUpSelfTest
-sftk_fips_RC4_PowerUpSelfTest
-sftk_fips_DES_PowerUpSelfTest
-sftk_fips_MD2_PowerUpSelfTest
-sftk_fips_MD5_PowerUpSelfTest


You moved and renamed:

freebl_fips_HMAC
-sftk_fips_HMAC

freebl_fips_ECDSA_Test
-sftk_fips_ECDSA_Test

-sftk_fips_DES3_PowerUpSelfTest
freebl_fips_DES3_PowerUpSelfTest

-sftk_fips_AES_PowerUpSelfTest
freebl_fips_AES_PowerUpSelfTest

-sftk_fips_HMAC_PowerUpSelfTest
freebl_fips_HMAC_PowerUpSelfTest

-sftk_fips_SHA_PowerUpSelfTest
-freebl_fips_SHA_PowerUpSelfTest
freebl_fips_SHA_PowerUpSelfTest

-sftk_fips_ECDSA_PowerUpSelfTest
freebl_fips_ECDSA_PowerUpSelfTest

-sftk_fips_DSA_PowerUpSelfTest
freebl_fips_DSA_PowerUpSelfTest

-sftk_fips_RNG_PowerUpSelfTest
freebl_fips_RNG_PowerUpSelfTest

-sftk_fipsPowerUpSelfTest
-freebl_fipsPowerUpSelfTest
freebl_fipsPowerUpSelfTest


Did you make any changes inside the functions that you moved?
(It would be very tedious work to check if you have changed anything.)


You changed:
freebl_fips_RSA_PowerUpSelfTest
Attachment #8641304 - Attachment is obsolete: false
I took your full patch (13000 lines), removed everything where you just seem to be moving code (5000 lines), did a patch that ignores whitespace changes, and the remaining patch is about 4300 lines.

I looked through those 4300 lines, trying to find semantic changes that are outside of the selftests and test scripts.

Should someone with a deeper understanding of the math behind this code review these changes?
(In reply to Kai Engert (:kaie) from comment #37)
> I looked through those 4300 lines, trying to find semantic changes that are
> outside of the selftests and test scripts.
> 
> Should someone with a deeper understanding of the math behind this code
> review these changes?

To make it clear, the previous comment added a subset of your patch, which shows the potential semantic that you made, which might need explanation or review from someone else.
You've introduced a new shared library, (lib)freeblpriv3(.so/.dll)
The old freebl3 library became tiny, you probably changed it into a forwarder?

For the record, can you please briefly explain why this was necessary?


It has the consequence that all applications that ship or package NSS will have to update their shipping logic to include this additional file.

Also, all logic that produces .chk files seems to be required to include this additional new library in the .chk handling, too, correct?


Your patch adds three new shared-library entry points (DllMain), for fipsfreebl, fipstest, lgfips. Under which circumstances are these built?

If I understand correctly, it seems lgfips is just a new entry point for existing library nssdbm, and fipstest appears to be for the existing softokn3 library, and fipsfreebl seems to be for the forwarder replacement freebl3 library.
Comment on attachment 8645118 [details] [diff] [review]
fipstest.patch

Review of attachment 8645118 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure I have much to contribute to this patch, review-wise. Overall, my comments would be to split out the whitespace-only changes and make sure there isn't any unnecessary trailing whitespace. Since this code doesn't affect gecko, it's not clear it needs review from someone in the Mozilla organization.

::: cmd/fipstest/fipstest.c
@@ +60,5 @@
>      int i;
>      unsigned char offset;
>      *byteval = 0;
>      for (i=0; i<2; i++) {
> +        if (c2[i] >= '0' && c2[i] <= '9') {

It would be easier to review this patch if the whitespace-only changes were separated out into their own patch.

@@ +1013,5 @@
> +    unsigned char plaintext[10*16];     /* 1 to 10 blocks */
> +    unsigned int plaintextlen;
> +    unsigned char ciphertext[11*16];    /* 1 to 10 blocks + tag */
> +    unsigned int ciphertextlen;
> +    unsigned char aad[10*16];            /* 1 to 10 blocks + tag */

Should this be 11*16 if it's 1 to 10 blocks and a tag block?

@@ +2035,5 @@
> +                }
> +            } else {
> +                if (!last) {
> +                    count++; 
> +                } else { 

There's some trailing whitespace around here - might as well get rid of it as well.
Attachment #8645118 - Flags: review?(dkeeler)
re comment 35,

Good catch, I've also now added license headers to validate*.sh and runtest.sh

re comment 36, 

> You removed:
These are all non-FIPS algorithms, so post isn't necessary in fips mode. Many of these are also deprecated and shouldn't be used normally anyway.

> You moved and renamed:

> Did you make any changes inside the functions that you moved?
No, they just moved.

re comment 39

> You've introduced a new shared library, (lib)freeblpriv3(.so/.dll)
> The old freebl3 library became tiny, you probably changed it into a forwarder?

Yes.

> For the record, can you please briefly explain why this was necessary?

This was only done on platforms that where glibc links with libfreebl3.so. The issue is when those platforms run prelink, the modify the actual library. This causes fips integrity checks to fail, since you are modifying the library. There are 2 ways do deal with this modification: 1) black-list softokn and libfreebl from prelink, and 2) use the prelink binary to restore the shared libraries to the pristine condition before checking their integrity.

In RHEL 5 we did option 1). Since nss-softokn is dlopened by nss and nss-softokn dlopens libfreebl, doing so perturbed nothing.
In RHEL 6, glibc linked directly with libfreebl. black-listing libfreebl caused most of the system to be black-listed. Because of that we implemented option 2.
When we did our last FIPS validation, NIST now requires that the integrity check happens at when the dll is loaded. This means any application the links with glibc (well actually libcrypt in glibc) would call the prelink library at load time, whether or not it ever actually calls libcrypt itself. This causes havoc with performance, signals (handling sigchild properly, in a thread safe manner), and selinux (prelink tried setting bits on the rebuilt library that the application that called it didn't have permission to set). The simple solution that didn't involve changing glibc or prelink, was to return to scenario 1, and force glibc to dlopen the real freebl library. libfreebl became a stub (only on those platforms that support calling libfreebl standalone), which calls libfreeblpriv. softokn calls libfreeblpriv directly, so apps like mozilla don't need libfreebl on those platforms anymore.

NOTE: NSS already has platform specific libfreebl implementations (See solaris, for instance). It will affect packaging tools that need to know about libfreeblpriv.


> Your patch adds three new shared-library entry points (DllMain), for fipsfreebl, fipstest, lgfips.

DllMain is only on Windows, but each of these has some sort of startup function (bl_startup_tests in freebl).

> Under which circumstances are these built?

They are always built, but they aren't turned on unless one of the following happens:
1) Windows is defined (DllMain handles that). or
2) GCC is defined and NSS_NO_INIT_SUPPORT has not been set.
3) USE_INIT_PRAGMA has been set.

If a platform doesn't fit into 1, 2, or 3, you do need to turn on NSS_NO_INIT_SUPPORT, which still runs the post, but only as part of normal initialization. If you don't, NSS will fail in FIPS mode on that platform (normal NSS will continue).

> If I understand correctly, it seems lgfips is just a new entry point for existing library nssdbm,
It's the new code to manage integrity and post and library load time.

> and fipstest appears to be for the existing softokn3 library,
Sort of, it's was there before, to be called by C_Initialize(). Now it also has the code the handle running at dlload time.

> and fipsfreebl seems to be for the forwarder replacement freebl3 library.
No it's for libfreeblpriv itself. There are not integrity checks for the forwarder, which is outside the binary library.
RE comment 37

> I looked through those 4300 lines, trying to find semantic changes that are outside of the selftests and
> test scripts.
>
> Should someone with a deeper understanding of the math behind this code review these changes?
>

There are 2 sets of changes. 

The changes for mpprime.c:

nBits is the size of the prime we are trying to generate. We generate primes by picking random odd numbers of the desired size and testing them for primality. Our primality test is probablistic. That is, if the test fails, we know for sure that the value is composite. If the test succeeds, then we know that there is a certain chance that it's prime. Rerunning the test improves our confidence that the number is prime. num_tests is the number of times we have to run the test to be confident that the number is prime. We used to use the table in applied cryptography for reasonable guesses. FIPS now has more stringent requirements (FIPS is based on the fact that number may not be random, but supplied to you, in which case there are some composites that could fake out the probabilistic tests more frequently). The comments in the code point to the tables.

The changes in rsa.c

phi(N) is called the order of the group. It's the smallest number such that a^phi(N) mod N = 1 for any a relatively prime to N. If N=p*q and p and q are prime, the phi(N) = LCM(p-1,q-1), where LCM is the Least common multiple. The old code calculated phi(N) as (p-1)*(q-1), which is too big, but it works mathematically because it's a multiple of phi(N) and if a^phi(N) mod N = 1, then a^(k*phi(N)) mod N = (a^phi(N))^k mod N = 1^k mod N = 1.

So the old code worked, but created private keys that were longer than necessary, and didn't match the standards for generating private keys.
RE comment 40:

> Should this be 11*16 if it's 1 to 10 blocks and a tag block?

yup!

> There's some trailing whitespace around here - might as well get rid of it as well.

OK,
Attached patch full.patch.2Splinter Review
Attachment #8650576 - Flags: review?(kaie)
Oops, I forgot the comment section: This is a full patch with all the review comments to date incorporated.
Comment on attachment 8650576 [details] [diff] [review]
full.patch.2

r=kaie

If you can think of individual changes that should get another closer look, please request an additional review for those small subsets.


Some final comments:

In rng_vst and rng_mct, are you sure you want to change variable b back to uninitialized?


(In reply to Robert Relyea from comment #34)
> here's the sed command with the white space characters made visible:
> sed -e 's;^M;;g' -e 's;<tab>; ;g'
> 
> It removes the carriage return (1/2 of the windows new line), changes tabs
> to a single space, and deletes any comment lines (starting with #).

It would be good to have that as a comment.
Attachment #8650576 - Flags: review?(kaie) → review+
Attached patch fix-errs.patchSplinter Review
You haven't yet fixed some compiler errors, which are the result of Martin's changes to treat warnings as errors.

You'll need this attached patch to build your new code, on top of your patch.

I think it's fine to cast away const, because your new combined array is const again.
Hmm, I compiled with martin's patch, maybe it's a different version of gcc?
(In reply to Robert Relyea from comment #48)
> Hmm, I compiled with martin's patch, maybe it's a different version of gcc?

possible, my fedora 22 system uses gcc 5.1.1
(In reply to Robert Relyea from comment #34)
> RE comment 28
> > In validate1.sh I don't know what extraneous_response=${3} and extraneous_fax=${4} are for.
> 
> Validate1.sh is a helper shell script that each of the base test shell
> scripts call to help validate that the generated response (response) matches
> the given response (fax). Sometimes (depending on the individual tests)
> there are extraneous output in either or both response and fax files. These
> allow the caller to pass in additional sed commands to clear out those
> extraneous outputs before we compare the two files.

Thank you for the explanation. I also looked at http://csrc.nist.gov/groups/STM/cavp/ and in particular http://csrc.nist.gov/groups/STM/cavp/documents/dss/DSAVS.pdf 
> 
> from comment 28
> > sed -e 's;
> ;;g' -e 's;	; ;g' -e '/^#/d' $extraneous_response
> ${TESTDIR}/resp/${name}.rsp > /tmp/y1
> > would that ne equivalent to this one?
> > sed -e 's;;;g' -e 's;	; ;g' -e '/^#/d' $extraneous_response ${TESTDIR}/resp/${name}.rsp > /tmp/y1
> 
> here's the sed command with the white space characters made visible:
> sed -e 's;^M;;g' -e 's;<tab>; ;g'
> 
> It removes the carriage return (1/2 of the windows new line), changes tabs
> to a single space, and deletes any comment lines (starting with #).

and downloading some of the test vectors helps also helps. The DSA tests with several validation steps was the most elaborate that needed the extra sed processing.
Attachment #8645118 - Flags: review?(emaldona) → review+
This introduced some build errors. See patch at:
https://codereview.appspot.com/262260043/
Flags: needinfo?(martin.thomson)
r+; please land and unbreak the build.
Flags: needinfo?(martin.thomson)
Attachment #8641127 - Flags: review?(kaie)
Attachment #8641381 - Flags: review?(kaie)
Attachment #8645117 - Flags: review?(kaie)
Based on full.patch.2 plus fix_errors patch from Kai updated for current sources were among others things warning are treated as errors. Locally ran all.sh which reported about 14000 tests which is consistent with recent builds. Caveat emptor: done in a rather mechanical fashion and I didn't try to address other issues such as possible change in semantics. Built upon was already done and hopefully serves as starting point for further work.
Attachment #8701162 - Flags: review?(rrelyea)
Comment on attachment 8701162 [details] [diff] [review]
full patch 3  - updated for current state of the sources

Review of attachment 8701162 [details] [diff] [review]:
-----------------------------------------------------------------

Build fails on OS X and Windows: comments for fipsfreebl.c and lgglue.c. Easy fix for the latter.

::: lib/freebl/fipsfreebl.c
@@ +44,5 @@
> +
> +/* Windows pre-defined entry */
> +#ifdef XP_WIN
> +#include <windows.h>
> +#include <primpl.h>

Attempt build on windows 7 failed with primpl.h: no such file or directory

@@ +52,5 @@
> +    DWORD fdwReason,     // reason for calling function
> +    LPVOID lpReserved )  // reserved
> +{
> +    // Perform actions based on the reason for calling.
> +    switch( fdwReason ) 

nit: ws

@@ +53,5 @@
> +    LPVOID lpReserved )  // reserved
> +{
> +    // Perform actions based on the reason for calling.
> +    switch( fdwReason ) 
> +    { 

nit: ws

@@ +1617,5 @@
> +    rv = freebl_fips_DSA_PowerUpSelfTest();
> +
> +    if( rv != SECSuccess )
> +        return rv;
> +    

nit: ws

@@ +1638,5 @@
> + * table. The standalone use can operation without nspr or nss-util, while
> + * the joint use requires both to be loaded. Certain functions (like RSA)
> + * needs locking from NSPR, for instance.
> + *
> + * At load time, we need to handle the two uses separately. If nspr and 

nit: ws

@@ +1639,5 @@
> + * the joint use requires both to be loaded. Certain functions (like RSA)
> + * needs locking from NSPR, for instance.
> + *
> + * At load time, we need to handle the two uses separately. If nspr and 
> + * nss-util  are loaded, then we can run all the selftests, but if nspr and 

nit: ws

@@ +1653,5 @@
> +
> +/*
> + * accessors for freebl
> + */
> +PRBool BL_POSTRan(PRBool freebl_only) 

nit: ws

@@ +1654,5 @@
> +/*
> + * accessors for freebl
> + */
> +PRBool BL_POSTRan(PRBool freebl_only) 
> +{ 

nit: ws

@@ +1675,5 @@
> +     * libraries, but now we want to use more than just a standalone freebl.
> +     * This requires the other libraries to be loaded.
> +     * If they are now loaded, Try to run the rest of the selftests,
> +     * otherwise fail (disabling access to these algorithms)  */
> +    self_tests_ran = PR_TRUE; 

nit: ws

@@ +1690,5 @@
> +
> +/*
> + * This function is called at dll load time, the code tha makes this 
> + * happen is platform specific on defined above.
> + */ 

nit: ws

@@ +1697,5 @@
> +{
> +    const char *libraryName;
> +    PRBool freebl_only = PR_FALSE;
> +    SECStatus rv; 
> +

nit: ws

@@ +1741,5 @@
> +    }
> +}
> +
> +/*
> + * this is called from the freebl init entry points that controll access to 

nit: ws

@@ +1749,5 @@
> +SECStatus
> +BL_FIPSEntryOK(PRBool freebl_only) {
> +#ifdef NSS_NO_INIT_SUPPORT
> +   /* this should only be set on platforms that can't handle one of the INIT
> +    * schemes.  This code allows those platforms to continue to function, 

nit: ws

::: lib/softoken/lgglue.c
@@ +249,3 @@
>  static PRLibrary *legacy_glue_lib = NULL;
>  static SECStatus 
> +sftkdbLoad_Legacy()

No longer takes arguments yet most calls elsewhere call it via rv = sftkdbLoad_Legacy(PR_FALSE); For some reason it compiles on Fedora system but it fails on OS X.
Attached patch full patch V4 (obsolete) — Splinter Review
Gets rid of the unwanted #include <primpl.h> lines and sftkdbLoad_Legacy invoked without arguments throughout. Built fine and all.sh passed in Fedora, OS X, and Windows 7 VMs.
Attachment #8701162 - Attachment is obsolete: true
Attachment #8701162 - Flags: review?(rrelyea)
Attachment #8703281 - Flags: review?(rrelyea)
Target Milestone: 3.20 → 3.22
Blocks: 1009429
Bob, Could you please review this new version? It would be nice if it could make it in time for 3.22 and will also help us make progress with Bug 100929. Thanks, - Elio
Flags: needinfo?(rrelyea)
Correction:  Bug 100929 is fixed, I meant to type Bug 923089.
Blocks: 923089
No longer blocks: 1009429
We won't try to jam this into 3.22, better to drop in in early for 3.23 I'll review your patch.

bob
Flags: needinfo?(rrelyea)
Elio, what is your diff from the patch4?
never mind comment 56 and comment 57 tell me the answer.

bob
Attachment #8703281 - Flags: review?(rrelyea) → review+
Target Milestone: 3.22 → 3.23
full patch v4 no longer applied cleanly due to substantial changes in the sources, mostly for ChaCha20/Poly1305, since the v4 as approved so I had to update the patch to account for this.
Attachment #8703281 - Attachment is obsolete: true
made with diff --unified full.4.patch full.5.patch > full4to5.patch
Pushed: https://hg.mozilla.org/projects/nss/rev/43e2d2e1cb74
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Build failure on OSX 32-bit:

gcc -arch i386 -o Darwin9.8.0_OPT.OBJ/mangle -O2 -fPIC -Di386 -fno-common -pipe -DDARWIN -DHAVE_STRERROR -DHAVE_BSD_FLOCK  -Wall -DNSS_NO_GCC48 -DXP_UNIX -DSHLIB_SUFFIX=\"dylib\" -DSHLIB_PREFIX=\"lib\" -UDEBUG -DNDEBUG -DUSE_UTIL_DIRECTLY -DNO_NSPR_10_SUPPORT -DSSL_DISABLE_DEPRECATED_CIPHER_SUITE_NAMES -I../../../../dist/Darwin9.8.0_OPT.OBJ/include -I../../../../dist/public/nss -I../../../../dist/private/nss  Darwin9.8.0_OPT.OBJ/mangle.o   -L../../../../dist/Darwin9.8.0_OPT.OBJ/lib -lplc4 -lplds4 -lnspr4  
../../../coreconf/nsinstall/Darwin9.8.0_OPT.OBJ/nsinstall -R -m 775 Darwin9.8.0_OPT.OBJ/mangle ../../../../dist/Darwin9.8.0_OPT.OBJ/bin
cd Darwin9.8.0_OPT.OBJ ; sh "/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/."/sign.sh "/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_OPT.OBJ" \
	"/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/Darwin9.8.0_OPT.OBJ" Darwin \
	"/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_OPT.OBJ/lib" "/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_OPT.OBJ/lib/libsoftokn3.dylib"
/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/Darwin9.8.0_OPT.OBJ/shlibsign -v -i /Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_OPT.OBJ/lib/libsoftokn3.dylib
/Users/buildbot/slavedir/2-osx105-x32-OPT/hg/nss/cmd/shlibsign/./sign.sh: line 13: 51854 Illegal instruction     "${2}"/shlibsign -v -i "${5}"
make[2]: *** [../../../dist/Darwin9.8.0_OPT.OBJ/lib/libsoftokn3.chk] Error 132
make[1]: *** [libs] Error 2
make: *** [libs] Error 2
TB [2016-02-18 08:10:51] NSS - build - 32 bits - OPT FAILED


full log here, but I've already pasted the interesting part:
https://bot.nss-crypto.org:8011/builders/2-osx105-x32-OPT/builds/861/steps/shell/logs/stdio
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Windows failed, too:


cl WIN954.0_OPT.OBJ\\pk11_chacha20poly1305_unittest.obj WIN954.0_OPT.OBJ\\pk11_pbkdf2_unittest.obj WIN954.0_OPT.OBJ\\pk11_prf_unittest.obj WIN954.0_OPT.OBJ\\pk11_rsapss_unittest.obj WIN954.0_OPT.OBJ\\pk11_gtest.obj -FeWIN954.0_OPT.OBJ/pk11_gtest.exe -link -nologo -SUBSYSTEM:CONSOLE,5.01  ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\gtest.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\sectool.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\nssutil3.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\smime3.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\ssl3.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\nss3.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\plc4.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\plds4.lib ..\\..\\..\\dist\\WIN954.0_OPT.OBJ\\lib\\nspr4.lib   Ws2_32.lib
if test -f WIN954.0_OPT.OBJ/pk11_gtest.exe.manifest; then \
		mt.exe -NOLOGO -MANIFEST WIN954.0_OPT.OBJ/pk11_gtest.exe.manifest -OUTPUTRESOURCE:WIN954.0_OPT.OBJ/pk11_gtest.exe\;1; \
		rm -f WIN954.0_OPT.OBJ/pk11_gtest.exe.manifest; \
	fi
nsinstall -m 775 WIN954.0_OPT.OBJ/pk11_gtest.exe ../../../dist/WIN954.0_OPT.OBJ/bin
make[2]: Leaving directory `/d/slavedir/2-win-x32-OPT/hg/nss/external_tests/pk11_gtest'
cd ssl_gtest; make libs
make[2]: Entering directory `/d/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest'
cl -FoWIN954.0_OPT.OBJ/libssl_internals.obj -c -O2 -MD -w44267 -w44244 -w44018 -w44312 -FS -arch:IA32 -EHsc -nologo -DNOMINMAX -W3 -nologo -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_WARNINGS -WX -DXP_PC -UDEBUG -DNDEBUG -DWIN32 -D_X86_ -D_WINDOWS -DWIN95 -DUSE_UTIL_DIRECTLY -DNO_NSPR_10_SUPPORT -DSSL_DISABLE_DEPRECATED_CIPHER_SUITE_NAMES -DNSS_USE_STATIC_LIBS -I../../external_tests/google_test/gtest/include -I../../external_tests/common -I../../../dist/WIN954.0_OPT.OBJ/include -I../../../dist/public/nss -I../../../dist/private/nss -I../../../dist/public/nspr -I../../../dist/public/nss -I../../../dist/public/libdbm -I../../../dist/public/gtest  -I../../lib/ssl "/d/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest/libssl_internals.c"
libssl_internals.c
d:\slavedir\2-win-x32-opt\hg\nss\external_tests\ssl_gtest\libssl_internals.h(16) : error C2146: syntax error : missing ')' before identifier 'serverKeyBits'
d:\slavedir\2-win-x32-opt\hg\nss\external_tests\ssl_gtest\libssl_internals.h(16) : error C2061: syntax error : identifier 'serverKeyBits'
d:\slavedir\2-win-x32-opt\hg\nss\external_tests\ssl_gtest\libssl_internals.h(16) : error C2059: syntax error : ';'
d:\slavedir\2-win-x32-opt\hg\nss\external_tests\ssl_gtest\libssl_internals.h(16) : error C2059: syntax error : ','
d:\slavedir\2-win-x32-opt\hg\nss\external_tests\ssl_gtest\libssl_internals.h(17) : error C2059: syntax error : ')'
d:/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest/libssl_internals.c(29) : error C2146: syntax error : missing ')' before identifier 'serverKeyBits'
d:/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest/libssl_internals.c(29) : error C2061: syntax error : identifier 'serverKeyBits'
d:/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest/libssl_internals.c(29) : error C2059: syntax error : ';'
d:/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest/libssl_internals.c(29) : error C2059: syntax error : ','
d:/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest/libssl_internals.c(29) : error C2059: syntax error : ')'
make[2]: *** [WIN954.0_OPT.OBJ/libssl_internals.obj] Error 2
make[2]: Leaving directory `/d/slavedir/2-win-x32-OPT/hg/nss/external_tests/ssl_gtest'
make[1]: *** [libs] Error 2
make[1]: Leaving directory `/d/slavedir/2-win-x32-OPT/hg/nss/external_tests'
make: *** [libs] Error 2
TB [2016-02-18 08:04:13] NSS - build - 32 bits - OPT FAILED


full log:
https://bot.nss-crypto.org:8011/builders/2-win-x32-OPT/builds/760/steps/shell_1/logs/stdio
(In reply to Kai Engert (:kaie) from comment #67)
> Build failure on OSX 32-bit:
> 
I built on OS X but it was 64-bit and Darwin15.3, Sorry, I should have done Windows also.
Fix for bustage due to fips_mode_available https://hg.mozilla.org/projects/nss/rev/6c6c113700a1
Attached file stack-osx32.txt
I've looked at the OSX 32 bit failure.

The shlibsign utility crashes with a memory access violation, so something more serious seems to be going on.

See the attachment. I'll back your changes out.
I'm trying to debug the shlibsign crash.

The execution of pr_LoadViaDyld() corrupts the calling stack.

The first time it gets called with libsoftokn3.dylib.
That triggers sftk_startup_tests(), freebl_RunLoaderOnce(), freebl_LoadDSO(), etc.,
and results in calling pr_LoadViaDyld(libfreebl3.dylib).

After execution of this second pr_LoadViaDyld(), the middle of the calling stack is damaged.

This correct stack:
#2  0x000692c3 in PR_LoadLibraryWithFlags (libSpec={type = PR_LibSpec_Pathname, value = {pathname = 0x104b80 "/Users/buildbot/test/hgcopy/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_DBG.OBJ/lib/libfreebl3.dylib", mac_named_fragment = {fsspec = 0x104b80, name = 0x104b10 "/Users/buildbot/test/hgcopy/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_DBG.OBJ/lib/libsoftokn3.dylib"}, mac_indexed_fragment = {fsspec = 0x104b80, index = 1067792}, pathname_u = 0x104b80}}, flags=10) at ../../../../pr/src/linking/prlink.c:418
#3  0x00237983 in loader_LoadLibInReferenceDir () at seed.c:367
#4  0x002379eb in loader_LoadLibrary () at seed.c:367
#5  0x00237abf in freebl_LoadDSO () at seed.c:368
#6  0x0007205a in PR_CallOnce (once=0x24085c, func=0x237a79 <freebl_LoadDSO>) at ../../../../pr/src/misc/prinit.c:778
#7  0x00237bee in freebl_RunLoaderOnce () at seed.c:369
#8  0x00237c18 in BL_Init () at seed.c:369
#9  0x00203c46 in sftk_startup_tests () at sha_fast.c:317

gets replaced by this nonsense stack:
#2  0x00237983 in SEED_encrypt (s=0x104b10 "/Users/buildbot/test/hgcopy/nss/cmd/shlibsign/../../../dist/Darwin9.8.0_DBG.OBJ/lib/libsoftokn3.dylib", d=0x23dfba "�", ks=0x8fe2fc20) at seed.c:367
#3  0x002379eb in SEED_encrypt (s=0x23dfba "�", d=0x24120d "�E\f��\005�E��E\b�@\bH;E�\017��\017����t*����\001", ks=0xbfffe0d8) at seed.c:367
#4  0x00237abf in SEED_encrypt (s=0x240860 "", d=0x1 <Address 0x1 out of bounds>, ks=0x4) at seed.c:368
#5  0x0007205a in PR_CallOnce (once=0x24085c, func=0x237a79 <SEED_encrypt+1714>) at ../../../../pr/src/misc/prinit.c:778
#6  0x00237bee in SEED_encrypt (s=0x104030 "", d=0xdae98 "\t", ks=0x15b) at seed.c:369
#7  0x00237c18 in SEED_encrypt (s=0x2d <Address 0x2d out of bounds>, d=0x1 <Address 0x1 out of bounds>, ks=0xbfffe164) at seed.c:369
#8  0x00203c46 in shaCompress (X=0x4, inbuf=0xbfffeefc) at sha_fast.c:317
Flags: needinfo?(rrelyea)
Kai: thanks a lot for the debugging info. Do you have a Mac?
With what you found, this should not be hard to track down
inside a debugger.
(In reply to Wan-Teh Chang from comment #74)
> Kai: thanks a lot for the debugging info. Do you have a Mac?
> With what you found, this should not be hard to track down
> inside a debugger.

I debugged it on the Darwin9 VM that we use as the 32bit buildbot slave.
It's inside the Mozilla community infrastructure.

Do you have any suggestion on how to debug further?

The corruption happens during the call to the NSCreateObjectFileImageFromFile() function.
IIUC this is a OS library call.

If that's true, I'd conclude the corruption happens during a shared library init function, inside libfreebl3, which gets automatically executed during loading.

Can you make suggestions which function this might be, and how to debug it?
Should a simple breakpoint work, on a module that gets loaded this way?
Hi Kai: I would run shlibsign directly, from inside the debugger.
Set a breakpoint in sftk_startup_tests, and go from there. At
this point, all you need is a lot of patience.

I think AddressSanitizer also works on Mac OS X. That would also
be a good tool to try.
(In reply to Wan-Teh Chang from comment #76)
> Hi Kai: I would run shlibsign directly, from inside the debugger.

That's what I do.

> Set a breakpoint in sftk_startup_tests, and go from there. At
> this point, all you need is a lot of patience.

It's not as simple as that.
Execution is much further than sftk_startup_tests already, 12 levels further down on the stack.

The corruption happens when calling a system function.

I don't want to step through undocumented assembler code of a system call.
Ok, BL_Init() is a working breakpoint, which will allow me to investigate further.
I've learned that with this patch, bl_startup_tests is the new init function for libfreebl3

By the time we reach bl_startup_tests, the stack is already corrupted.

I see that the library init mechanism is called recursively.
The execution of bl_startup_tests triggers the loading of libfreebl3 and the call to bl_startup_tests.

I wonder if there is a system limitation on Darwin9/OSX 10.5, where this kind of recursion doesn't work.

I see that Darwin10/OSX 10.6 is the minimal system requirement for Firefox >= 38.

I'll test if it works on our Darwin 10 machine. If it does, then maybe we can ignore that it's failing on Darwin 9.
Kai: thanks a lot for debugging this. Based on your comment 73,
comment 75, and comment 79, I think this is an infinite recursion
caused by the linker incorrectly binding the BL_Init() call made
by bl_startup_tests() (which is in libfreebl3.dylib) to the BL_Init()
symbol in lib/freebl/loader.c, which is actually in libsoftokn3.dylib
and outside libfreebl3.dylib, rather than the BL_Init() symbol in
libfreebl3.dylib.

We already added the -Bsymbolic linker flag in lib/freebl/Makefile
to deal with a similar issue before, and Mac OS X doesn't need that
linker flag. Perhaps a shared library init function needs a different
solution on Mac OS X 10.5. Perhaps the shared library init function
support has some bugs, for example, perhaps an init function can't
load a shared library.

We have several solutions:

1. Disable the shared library init functions for Mac OS X 10.5.
This may need to be done by having the init functions check the
OS version at run time and return immediately if it's <= 10.5.

2. Require Mac OS X 10.6 or later, as you suggested. This is the
current minimum requirement for both Firefox and Chrome.

3. Get rid of lib/freebl/loader.c. We aren't taking advantage of
it right now, not even for Solaris SPARC. I think we should plan
to do this work although this requires someone familiar with
build systems.
> 1. Disable the shared library init functions for Mac OS X 10.5.
> This may need to be done by having the init functions check the
> OS version at run time and return immediately if it's <= 10.5.

This would be my preference. There should already be an option in the patch to do this. We do this on PPC Linux already (though we trigger it in the .spec) . (You need to compile with -DNO_INIT_SUPPORT)

> 3. Get rid of lib/freebl/loader.c. We aren't taking advantage of
> it right now, not even for Solaris SPARC. I think we should plan
> to do this work although this requires someone familiar with
> build systems.

We're using it on Linux systems, so this isn't an option.
Flags: needinfo?(rrelyea)
Hi Gaston, we intend to add this patch to NSS. Would you like to test if NSS still works on BSD with it? Thanks in advance.
(In reply to Robert Relyea from comment #81)
>
> > 3. Get rid of lib/freebl/loader.c. We aren't taking advantage of
> > it right now, not even for Solaris SPARC. I think we should plan
> > to do this work although this requires someone familiar with
> > build systems.
> 
> We're using it on Linux systems, so this isn't an option.

Bob, I think you're referring to the NSSLOW_ and NSSLOWHASH_ functions
exported in lib/freebl/freebl_hash.def. Those functions are not
defined in lib/freebl/loader.c. So removing lib/freebl/loader.c
should not affect this use case on Linux systems.
Because there's consensus to deprecate OSX 10.5, I've disabled our 32-bit 10.5 buildbot, and requested that it gets upgraded to OSX 10.6, in bug 1255114.

I believe this means we can land the patch again.
I've repeated Elio's pushs from comment 66 and comment 70.

https://hg.mozilla.org/projects/nss/rev/19e76dfdb803
https://hg.mozilla.org/projects/nss/rev/7c54c665206c
Target Milestone: 3.23 → 3.24
Sorry, didnt manage to find time to build nss tip outside mozilla-central, it always fail with missing nspr headers.
Darn, figured it out.

'gmake NSPR_INCLUDE_DIR=/usr/local/include/nspr NSPR_LIB_DIR=/usr/local/lib/ USE_64=1 NSS_DISABLE_GTESTS=1' on nss tip as of now builds fine on OpenBSD/amd64 (at least libfreebl and libsoftokn).
Landry, thanks a lot for your feedback, that's good news!
We have a bustage on Windows:
lib/freebl/rijndael.c(1303) : error C2065: 'MP_32BIT_MAX' : undeclared identifier

This is a local symbol, defined in mpi.h

rijndael.c includes it, #ifdef USE_HW_AES

However, the new code that was added in this bug, apparently wants to use the symbol in software AES operations, too, based on this comment:
  "We do it here to cover both hardware and software GCM operations."

After discussion with Franziskus, we agreed to attempt to unconditionally include mpi.h as an attempted bustage fix:
https://hg.mozilla.org/projects/nss/rev/6f7b0bdbfa04
An additional bustage fix was necessary, because PR_STATIC_ASSERT cannot be used in the middle of a scope:
https://hg.mozilla.org/projects/nss/rev/73b2ff243748
We have one more bustage, that is limited to the RHEL 5.x platform.
I believe the fix is easy, therefore the tree remains open.

I've opened a blocker bug client, which we should attempt to resolve ASAP.
Bob, the new code that you had added has the following comment:

  /* PR_ATOMIC_SET macro is implemented with compiler intrinsics, if we don't
   * have the intrinsic, it will be set to an NSPR PR_Atomic function, which
   * will show up as missing at build time. If we get that missing function,
   * Then we'll need to implement a PR_AtomicSet() function for that platform
   */

This is failing on RHEL 5.11, because it uses gcc 4.1.2 and compiles for i386.

The required macros apparently aren't available in that combination, according to bug 415563.

We have the choice to:

(a) either depcrate i386, and require i486 for building NSS.

    This would require the NSPR fix from bug 1256665, plus a build system change 
    in NSS, to enforce the use of -march=i486 on 32bit x86.

(b) or you find a different solution for that platform.

What is your preference?

I've been told that RHEL 5.x requires i486.
Flags: needinfo?(rrelyea)
FYI, the combination of the patch from bug 1256665, plus this patch to require i486 when building freebl, builds successfully on RHEL 5.11 32bit.

diff -r 0af5db46bd80 lib/freebl/Makefile
--- a/lib/freebl/Makefile       Tue Mar 15 14:12:11 2016 +1100
+++ b/lib/freebl/Makefile       Tue Mar 15 11:24:02 2016 -0400
@@ -229,6 +229,7 @@
 endif
 ifeq ($(CPU_ARCH),x86)
     ASFILES  = mpi_x86.s
+    CFLAGS  += -march=i486
     DEFINES += -DMP_ASSEMBLY_MULTIPLY -DMP_ASSEMBLY_SQUARE 
     DEFINES += -DMP_ASSEMBLY_DIV_2DX1D -DMP_USE_UINT_DIGIT
     DEFINES += -DMP_IS_LITTLE_ENDIAN
I'm OK requiring i486 in RHEL 5
Flags: needinfo?(rrelyea)
Depends on: 1256665
Attached patch 1181814-atomic-x86.patch (obsolete) — Splinter Review
Wan-Teh has suggested, instead of fixing bug 1256665 in a general, that we change lowhash_vector to call __sync_lock_test_and_set directly on 32-bit x86, instead of calling PR_ATOMIC_SET.

This patch implements that suggestion. I've tested that it works on RHEL 5.11
Attachment #8730906 - Flags: review?(rrelyea)
Comment on attachment 8730906 [details] [diff] [review]
1181814-atomic-x86.patch

Review of attachment 8730906 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/freebl/lowhash_vector.c
@@ +106,5 @@
>  /* remove when NSPR pratom.h is updated. s390 has atomic intrinsics,
> + * but nspr doesn't know it.
> + * Also, the i386 that we care to support also supports them,
> + * but nspr limits its use. */
> +#if defined(__s390__) || defined(__s390x__) || defined(__i386__)

I suggest we simply replace PR_ATOMIC_SET with __sync_lock_test_and_set,
for all processor architectures.

Also, if this library can be linked with -lpthread, then pthread_once()
is the best solution.
Comment on attachment 8730906 [details] [diff] [review]
1181814-atomic-x86.patch

Review of attachment 8730906 [details] [diff] [review]:
-----------------------------------------------------------------

r+ rrelyea. Note wtc's comment. We probably should update nspr as he suggests.

bob
Attachment #8730906 - Flags: review?(rrelyea) → review+
(In reply to Robert Relyea from comment #96)
> 
> r+ rrelyea. Note wtc's comment. We probably should update nspr as he
> suggests.

In my understanding, he doesn't suggest to update nspr.
Wan-Teh, is this what you suggest?
Attachment #8730919 - Flags: review?(wtc)
For the record, I've jumped in to push this forward, and I had assumed it would be quick and easy. I didn't expect that so many changes might be necessary as part of pushing this patch.

If the patches I have attached aren't acceptable, I'd like to ask Bob/Elio to continue this work, as it's assigned to them.

(This is true in particular, if Wan-Teh insists on using pthread.)
Bob, Elio:

I am surprised by the new Linux libfreebl3.so library, which loads
libfreeblpriv3.so. Why is this required for FIPS 140 validation?

UPDATE: I see Kai and Bob already discussed this issue in comment 39
and comment 41. I don't fully understand the problem to be able to
propose a better solution. For the current solution, we should
name the two libraries as follows:

1. libfreeblpriv3.so should be renamed libfreebl3.so (its original name).

2. The new Linux-only libfreebl3.so library should be renamed
libfreebl_stub3.so, libfreebl_loader3.so, or something along those
lines. This shared library does the job of libfreebl.a, which I
understand cannot be a system library in Fedora because it is a
static library.

Another potential good name for this new Linux-only shared library
is liblowhash3.so.

Two other suggestions:

3. lowhash_vector.c should be renamed lowhash_loader.c
because it is closer to loaderc than to ldvector.c.

4. If I understand your requirements correctly, I believe
freebl_hash.def does not need to export FREEBL_GetVector
any more.
Comment on attachment 8730919 [details] [diff] [review]
1181814-atomic-x86-v2.patch

Review of attachment 8730919 [details] [diff] [review]:
-----------------------------------------------------------------

Kai: this patch is what I suggested. Note that this homegrown implementation
of one-time initialization is broken. You merely fixed a build error.

::: lib/freebl/lowhash_vector.c
@@ +115,5 @@
>    /* PR_ATOMIC_SET macro is implemented with compiler intrinsics, if we don't
>     * have the intrinsic, it will be set to an NSPR PR_Atomic function, which
>     * will show up as missing at build time. If we get that missing function,
>     * Then we'll need to implement a PR_AtomicSet() function for that platform
>     */

This comment block should be removed.
Attachment #8730919 - Flags: review?(wtc) → review+
(In reply to Wan-Teh Chang from comment #101)
> 
> Kai: this patch is what I suggested. Note that this homegrown implementation
> of one-time initialization is broken. You merely fixed a build error.

Bob, FYI.
Comment on attachment 8730919 [details] [diff] [review]
1181814-atomic-x86-v2.patch

Thanks Wan-Teh, checked in as you suggested.

https://hg.mozilla.org/projects/nss/rev/4333a65c4de5
Attachment #8730919 - Flags: checked-in+
Attachment #8730906 - Attachment is obsolete: true
Please try this patch.

Since lib/freebl/lowhash_vector.c is Linux only, we can
load libfreeblpriv3.so from a library constructor function.
Then we don't need to do lazy one-time initialization.

If this patch works, please remove the freebl_RunLoaderOnce()
function completely and use a better name for my_init(), which
probably should also be declared as static.
Attachment #8732016 - Flags: superreview?(rrelyea)
Attachment #8732016 - Flags: review?(kaie)
Comment on attachment 8732016 [details] [diff] [review]
Load libfreeblpriv3.so from a library constructor function

Since I haven't worked on this code, but only helped out doing the commit to HG, I'd like to forward the request to Bob.
Attachment #8732016 - Flags: review?(kaie)
I attempted an experimental Firefox build with a NSS beta snapshot, but it's failing.

shlibsign fails with: PR_CALL_ONCE_ERROR

 09:38:34     INFO -  Executing /builds/slave/try-l64-d-00000000000000000000/build/src/obj-firefox/dist/bin/shlibsign -v -o ../../dist/firefox/libsoftokn3.chk -i ../../dist/firefox/libsoftokn3.so
 09:38:34     INFO -  C_Initialize failed: 0x00000030, CKR_DEVICE_ERROR
 09:38:34     INFO -  NSPR error code: -5925: The one-time function was previously called and failed. Its error code is no longer available
 09:38:34     INFO -  Initiailzing softoken failed: 0x00000030, CKR_DEVICE_ERROR
 09:38:34     INFO -  NSPR error code: -5925: The one-time function was previously called and failed. Its error code is no longer available
 09:38:34     INFO -  moduleSpec configdir='' certPrefix='' keyPrefix='' secmod='' flags=noCertDB, noModDB

seen here: https://treeherder.mozilla.org/logviewer.html#?job_id=18353074&repo=try
(scroll a few lines up)

I couldn't reproduce locally.

Apparently the failure happens during the packaging stage.
I don't know how to execute this packaging step on my system.
Flags: needinfo?(rrelyea)
Blocks: 1258375
> 1. libfreeblpriv3.so should be renamed libfreebl3.so (its original name).

Not possible. We can't rename it because it will break already compiled programs. glibc already links with libfreebl3.so, so it's the one that can't be renamed. libfreeblpriv3.so could be renamed (though it will cause us pain because we have to deal with both names for a period of time, but that can be done with just a link).

I guess we could rename libfreebl3.so and use a link to go back, but libfreeblpriv3.so must have a distinct name from libfreebl3.so.

> 4. If I understand your requirements correctly, I believe
> freebl_hash.def does not need to export FREEBL_GetVector
> any more.

This is transitional. Applications that were built using older versions will still try to dlopen libfreebl3.so and call FREEBL_GetVector. However, that could be carried as a separate patch for RHEL systems. New releases built from the ground up don't need it. I suggest keeping it in for a period of time (a release or two) and then dropping it. That way we don't force distributions to do a massive rebuild of all NSS applications at once (such rebuilds usually are triggered by new compiler releases).

> Since lib/freebl/lowhash_vector.c is Linux only, we can
> load libfreeblpriv3.so from a library constructor function.
> Then we don't need to do lazy one-time initialization.

I'm not thrilled by this idea. The code in question gets called out of libcrypt in glibc. I'm leery about putting to much pressure at library load time (loading a library at library load time sounds like a good way to hose the loader). I think just picking up your patch to fix the private PR_CallOnce() code should be sufficient.

> I attempted an experimental Firefox build with a NSS beta snapshot, but it's failing.
> Apparently the failure happens during the packaging stage.
> I don't know how to execute this packaging step on my system.

This looks like an initialization issue (likely to be in the code that initializes NSS). NSS local builds run this automatically if you build the tools. It's trying to do a shlibsign to create a new .chk file. Are you running with wtc's suggested load at dlload time? or is this without that patch.

bob
Flags: needinfo?(rrelyea)
Comment on attachment 8732016 [details] [diff] [review]
Load libfreeblpriv3.so from a library constructor function

Review of attachment 8732016 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not really comfortable doing a dlopen in a middle of a load time constructor. Someone loading another library while you are in the middle of loading a library doesn't seem like a good thing to me.
Attachment #8732016 - Flags: superreview?(rrelyea) → superreview-
OK, I'm going to soften that a bit. It looks like we do that already in softokn (which loads the same target library we are talking about in order to do the .chk file checks). libfreebl3.so is a different matter, though. It's linked with libcrypt.so, which is linked with 80% of the applications on linux. Most of those applications don't actually call libcrypt. If we do the ldopen at libfreebl3.so load time, we also trigger the immediate .chk file check. Multiply that with pretty much every exec call on linux and you have a pretty big performance hit (expecially for applications that will never actually do a hash), so I still think it's a bad idea to load it at dlopen time. (In fact one of the big benefits we got when we create libfreeblpriv3.so is the fact that we don't have to do the .chk for every one of these applications).

bob
Comment on attachment 8732016 [details] [diff] [review]
Load libfreeblpriv3.so from a library constructor function

Bob: the one-time initialization code is broken
because it lacks memory barriers.

Also, sleeping for one second is very long. We
can use usleep() or nanosleep() instead.

If we can link with the pthread library, then using
pthread_once() is a good solution.
We can't link with pthreads for the same reason we can't link with NSPR. I guess I misunderstood your patch. I thought it fixed the barrier issue.
I'm also OK with usleep() or nanosleep(). We just need to give up the CPU.

bob
So the package issue is because the symbolic link to libfreeblpriv3.so hasn't been made in the target bin directly. When I create the link the package completes.

BTW kai, to manually run package:

./mach package

in the mozilla-central directory.

bob
Since thee issue is in the mozilla package files, I've attached the patch to bug 1258375 .
No longer blocks: 923089
Bob's patches to fix Firefox have been landed.

I think this can be marked fixed.
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
Blocks: 1262768
Bob, could you please write a short summary of these changes for the NSS release notes?
Flags: needinfo?(rrelyea)
For release notes:

NSS softoken has been updated with the latest NIST guidance (as of 2015).
   a. software integrity checks and POST functions are executed on shared memory load.
   b. count and gsm modes have checks to prevent counter overflow.
   c. Additional CSPs are zeroed in the code.
   d. Use new guidance for how many rabin-miller tests are needed to verify a prime based on prime size.
NSS softoken has also been updated to allow NSS to run in FIPS level-1 (no password). This mode is triggered by setting the database password to NULL. In FIPS mode, it's permissible to move from level-1 to level-2 (by setting an appropriate password), but not the reverse.
Flags: needinfo?(rrelyea)
Kai: Bob's comment seems to have some typos.

(In reply to Robert Relyea from comment #117)
> For release notes:
> 
> NSS softoken has been updated with the latest NIST guidance (as of 2015).
>    a. software integrity checks and POST functions are executed on shared
> memory load.

I think "shared memory" should be "shared library".

>    b. count and gsm modes have checks to prevent counter overflow.

I think "count" should be "counter" and "gsm" should be "GCM".

>    d. Use new guidance for how many rabin-miller tests are needed to verify
> a prime based on prime size.

Nit: "rabin-miller" should be capitalized: Rabin-Miller.

> NSS softoken has also been updated to allow NSS to run in FIPS level-1 (no
> password). This mode is triggered by setting the database password to NULL.

Bob: should "NULL" be "an empty string"?
setting needinfo so it's more likely that Bob will see Wan-Teh's questions.
Flags: needinfo?(rrelyea)
(In reply to Wan-Teh Chang from comment #118)
> > NSS softoken has also been updated to allow NSS to run in FIPS level-1 (no
> > password). This mode is triggered by setting the database password to NULL.
> 
> Bob: should "NULL" be "an empty string"?

I'm not certain about the behaviour of upstream code (there was some changes and I haven't tested it), but there is a difference in database/module initialisation between empty string as a password and no password at all ("NULL" password) as used in RHEL.
Yes, by NULL password, I mean the empty string. I'm not sure if passing NULL in will crash, produce an error, or be treated as an empty string, but the actual test is softoken checks to see if the empty string generates the key used to unlock the database, and if it does it uses level 1. 

As for the test of wtc's comments, the are true corrections to my typos.
Flags: needinfo?(rrelyea)

Comment on attachment 8641127 [details] [diff] [review]
fips-post-2.patch

clearing 7 year old review request on a closed bug;).

Attachment #8641127 - Flags: review?(ekr)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: