Closed Bug 1058812 Opened 10 years ago Closed 10 years ago

mozilla::pkix rejects trust anchor certificates that are (self-)signed using MD5- or MD2- based signatures

Categories

(Core :: Security: PSM, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla35
Tracking Status
firefox32 --- unaffected
firefox33 + verified
firefox34 + verified
firefox35 + verified

People

(Reporter: keeler, Assigned: keeler)

References

Details

(Keywords: regression, relnote)

Attachments

(7 files, 10 obsolete files)

6.71 KB, patch
keeler
: review+
Details | Diff | Splinter Review
16.03 KB, patch
keeler
: review+
Details | Diff | Splinter Review
48.33 KB, patch
keeler
: review+
Details | Diff | Splinter Review
6.67 KB, patch
keeler
: review+
Details | Diff | Splinter Review
36.70 KB, patch
keeler
: review+
Details | Diff | Splinter Review
57.05 KB, application/zip
Details
37.90 KB, patch
keeler
: review+
Details | Diff | Splinter Review
Currently it appears that mozilla::pkix verifies signatures on trust anchors. This results in a compatibility problem with our remaining MD5 roots (e.g. "GTE CyberTrust Global Root") because MD5 is untrusted as a signature algorithm (although users do have the option to add a certificate override, so this may be WONTFIX/INVALID).
Here's an example site that chains to this root: https://www.wind.it/
Thank you David! I report here also the result of the openssl call:

$ openssl s_client -connect www.wind.it:443 < /dev/null
CONNECTED(00000003)
depth=2 C = US, O = GTE Corporation, OU = "GTE CyberTrust Solutions, Inc.", CN = GTE CyberTrust Global Root
verify return:1
depth=1 O = Cybertrust Inc, CN = Cybertrust SureServer Standard Validation CA
verify return:1
depth=0 C = IT, ST = Italia, L = Milano, O = WIND SPA, OU = TN.IT.OP.SI., CN = www.wind.it
verify return:1
---
Certificate chain
 0 s:/C=IT/ST=Italia/L=Milano/O=WIND SPA/OU=TN.IT.OP.SI./CN=www.wind.it
   i:/O=Cybertrust Inc/CN=Cybertrust SureServer Standard Validation CA
 1 s:/O=Cybertrust Inc/CN=Cybertrust SureServer Standard Validation CA
   i:/C=US/O=GTE Corporation/OU=GTE CyberTrust Solutions, Inc./CN=GTE CyberTrust Global Root
 2 s:/C=US/O=GTE Corporation/OU=GTE CyberTrust Solutions, Inc./CN=GTE CyberTrust Global Root
   i:/C=US/O=GTE Corporation/OU=GTE CyberTrust Solutions, Inc./CN=GTE CyberTrust Global Root
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIEKzCCAxOgAwIBAgIOAQAAAAABNlh4HSZd7FUwDQYJKoZIhvcNAQEFBQAwUDEX
MBUGA1UEChMOQ3liZXJ0cnVzdCBJbmMxNTAzBgNVBAMTLEN5YmVydHJ1c3QgU3Vy
ZVNlcnZlciBTdGFuZGFyZCBWYWxpZGF0aW9uIENBMB4XDTEyMDMyODA4NTEyN1oX
DTE1MDMyODA4NTEyN1owbzELMAkGA1UEBhMCSVQxDzANBgNVBAgTBkl0YWxpYTEP
MA0GA1UEBxMGTWlsYW5vMREwDwYDVQQKEwhXSU5EIFNQQTEVMBMGA1UECxMMVE4u
SVQuT1AuU0kuMRQwEgYDVQQDEwt3d3cud2luZC5pdDCCASIwDQYJKoZIhvcNAQEB
BQADggEPADCCAQoCggEBAJ0tDMcy74qFd5SoaGxux4N13Pq3cPBl/iZvF+JyaTft
PtWL2i7B2yRNDF2NBYaK2/rECIMB3Zwwhh4YEe6RxIScO8p7rRAjlEfBIC9ABlC7
XRYVIB6btn89SjC1DR2kTSiSmitOTfY3M1GEx4pNLM5wOuR29y5AwKHDVje/5DEA
NQut3OxBRskBjXvjNlafZfQDnlN0ee8Ag4WoI1owxYZNBiiADTfSYnIvhFkGW21X
fl3W3iJ98+c5WNTj4QCesENmrSNMHWKQ/pYsHcyAieiXhfyYieoZ0ashHKiqHwVE
KtOGq6NAO3bH2xyPEGBl2qC7Erv8d5SAWb6yi/gyydsCAwEAAaOB4zCB4DAfBgNV
HSMEGDAWgBTNOpafrm4PQFwcSPhLLbhxAeuJ2jA5BgNVHR8EMjAwMC6gLKAqhiho
dHRwOi8vY3JsLm9tbmlyb290LmNvbS9TdXJlU2VydmVyRzIuY3JsMB0GA1UdDgQW
BBTlhQ+N10kt1x+cp3Gl3YCYHh+WGDAJBgNVHRMEAjAAMA4GA1UdDwEB/wQEAwIF
oDAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwEQYJYIZIAYb4QgEBBAQD
AgbAMBYGA1UdEQQPMA2CC3d3dy53aW5kLml0MA0GCSqGSIb3DQEBBQUAA4IBAQAy
ws4HEpACbpn5fTbLPOJqiyfUoijnlncG2myjygEhybPds9G0y752H6pRNWaNifmg
1M2+js3gesT4C+rpQWi43FDu0/ixZs4G9k+lLnOn32SEGqjzcwd/nS1FnnT8MoAQ
q332uQaayKQ8QNVV3jC+bV1rrW3iObhBsYlAX5VzOsTBqiYzxTcwSW5B7EHlwnqB
Z13+lXzQnati9MVITjYQVfrMN9YgrhRrC6dM8n/1CG6R0klKFeEXDDRUU3pRlJyt
3NPGt5LVy9X2zv5A28Yi3aGhRO2tObSj+vauHG49XdjDHjrz+d82ch8HUqcEG3pz                                                                                                                          
phiE3exGSCkDgPX5NvEU                                                                                                                                                                      
-----END CERTIFICATE-----                                                                                                                                                                 
subject=/C=IT/ST=Italia/L=Milano/O=WIND SPA/OU=TN.IT.OP.SI./CN=www.wind.it                                                                                                                
issuer=/O=Cybertrust Inc/CN=Cybertrust SureServer Standard Validation CA                                                                                                                  
---                                                                                                                                                                                       
No client certificate CA names sent                                                                                                                                                       
---                                                                                                                                                                                       
SSL handshake has read 3075 bytes and written 634 bytes                                                                                                                                   
---                                                                                                                                                                                       
New, TLSv1/SSLv3, Cipher is AES256-SHA                                                                                                                                                    
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : AES256-SHA
    Session-ID: C9F85D51F2A5A3365BA59A4584C2102BC8403F9F723C67812E8C4F5F059B3291
    Session-ID-ctx: 
    Master-Key: F1581ED4E3297E7DD8B46C0E3E65AEA2C9D43C05B20CA7B6661EFC20D3C7A3F99A8D038516E865843E8F0C83D2A71A5F
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket:
    0000 - e9 aa 14 c8 b0 0e df aa-1d b0 9f 2b 32 db 5e 05   ...........+2.^.
    0010 - 6a 0d 87 7c a0 dc b2 b3-91 11 b3 63 cb 73 a6 38   j..|.......c.s.8
    0020 - 6c 48 bd d8 d8 0a e4 77-2f 8f c9 07 6b 25 87 54   lH.....w/...k%.T
    0030 - 5a 95 6d 14 3a 5c e7 06-93 7c b7 ac ba a1 62 20   Z.m.:\...|....b 
    0040 - e4 0a c8 c4 5c f9 62 50-6b da db 3f 26 43 7d f3   ....\.bPk..?&C}.
    0050 - 66 b2 82 d5 5f 26 1d 25-2c ae 90 ef 98 e3 51 d8   f..._&.%,.....Q.
    0060 - 57 50 ff 50 88 6a 06 78-0c 5a 16 a2 9c 0b 79 42   WP.P.j.x.Z....yB
    0070 - 55 a5 b0 69 0a c2 65 ff-93 97 a6 8e cc 8b 70 ea   U..i..e.......p.
    0080 - fa b9 ff f7 a0 f5 37 00-a3 22 d6 96 5d 07 9f 19   ......7.."..]...
    0090 - 25 65 60 29 d1 31 9c 66-1d 1e c3 5b 9f b5 aa 28   %e`).1.f...[...(

    Start Time: 1409077457
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
DONE
My guess is that we're not really verifying the signature of trust anchor certificates, but rather the parsing of tbsCertificate.signature fails because we don't recognize the MD5 signature algorithm. Thus we should just need to add support for parsing the RSA-MD5 and maybe RSA-MD2 (I think there may still be one such trusted CA!) OIDs to der::SignatureAlgorithm, and then add explicit cases in pkixnss.cpp to cause MD5-based signatures to be rejected during verification. Easy.
Summary: mozilla::pkix verifies signatures on trust anchors (problematic with MD5 roots) → mozilla::pkix rejects trust anchor certificates that are (self-)signed using MD5- or MD2- based signatures
I imagine that many or all such certificates are SHA-1 signed and/or are using 1024-bit RSA keys. That should be factored into the prioritization here.

Also, we can probably parse the MD5-based and MD2-based OIDs, and/or any other unrecognized OIDs, into a generic SignatureAlgorithm::unsupported_algorithm value in the SignatureAlgorithm parser; there's no need within mozilla::pkix to recognize the individual algorithms.
Attached patch patch (obsolete) — Splinter Review
Something like this?
Assignee: nobody → dkeeler
Status: NEW → ASSIGNED
Attachment #8486032 - Flags: review?(brian)
Comment on attachment 8486032 [details] [diff] [review]
patch

Review of attachment 8486032 [details] [diff] [review]:
-----------------------------------------------------------------

In addition to the test you already wrote, I think it is important to have tests that verify that we reject otherwise-valid end-entity certificates, intermediate certificates, OCSP responses, and OCSP responder certificates that are signed with unsupported signature algorithms, and that we do not reject otherwise-valid trust anchor certificates signed with unsupported signature algorithms.

::: security/pkix/include/pkix/pkixtypes.h
@@ +73,5 @@
>    dsa_with_sha256 = 17,
>  
>    // id-dsa-with-sha1 (OID 1.2.840.10040.4.3, RFC 3279 Section 2.2.2)
>    dsa_with_sha1 = 18,
> +

Please add some documentation describing what this is for.

::: security/pkix/lib/pkixnss.cpp
@@ +132,5 @@
>        break;
>      case SignatureAlgorithm::dsa_with_sha1:
>        pubKeyAlg = SEC_OID_ANSIX9_DSA_SIGNATURE;
>        digestAlg = SEC_OID_SHA1;
>        break;

You will need to have an explicit case for unsupported_algorithm here to make Windows FAIL_ON_WARNINGS happy, IIRC.

@@ -134,5 @@
>        pubKeyAlg = SEC_OID_ANSIX9_DSA_SIGNATURE;
>        digestAlg = SEC_OID_SHA1;
>        break;
>      default:
> -      PR_NOT_REACHED("unknown signature algorithm");

We should keep the PR_NOT_REACHED for this default case.

I think we should add checks to the code to check against unsupported_algorithm before calling trustDomain.VerifySignedData and trustDomain.CheckPublicKey, so that we can guarantee that VerifySignedData and CheckPublicKey are never called with an unsupported algorithm. We chould put such a check in CheckIssuerIndependentProperties (only allow unsupported_algorithm if the certificate is a trust anchor, according to TrustDomain.GetCertTrust).

Or, at least we should provide some documentation for implementers of the TrustDomain interface as to what to return for unsupported_algorithm.
Attachment #8486032 - Flags: review?(brian) → review-
This is an example of the type of check I suggested in my previous comment. Similar checks are needed in pkixocsp.cpp and before the call to CheckPublicKey and maybe other places.
Attached patch patch v2 (obsolete) — Splinter Review
Thanks for the review - here's an updated patch with tests.
Attachment #8486032 - Attachment is obsolete: true
Attachment #8489059 - Attachment is obsolete: true
Attachment #8489070 - Attachment is obsolete: true
Attachment #8493307 - Flags: review?(brian)
Comment on attachment 8493307 [details] [diff] [review]
patch v2

We need this sooner rather than later, so r? to rbarnes.
Attachment #8493307 - Flags: review?(rlb)
Comment on attachment 8493307 [details] [diff] [review]
patch v2

Monica, if you have some time to look at the gtests, that would be great - thanks.
Attachment #8493307 - Flags: review?(mmc)
Comment on attachment 8493307 [details] [diff] [review]
patch v2

Review of attachment 8493307 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/pkix/include/pkix/pkixtypes.h
@@ +81,5 @@
> +  unsupported_algorithm = UNSUPPORTED_ALGORITHM_FLAG | 0,
> +
> +  // md5WithRSAEncryption (OID 1.2.840.113549.1.4, RFC 3279 Section 2.2.1)
> +  // This is so we can test with a specific unsupported algorithm.
> +  unsupported_rsa_pkcs1_with_md5 = UNSUPPORTED_ALGORITHM_FLAG | 1,

Please don't add this testing-specific stuff to the non-test part of mozilla::pkix. Doing so makes it harder to understand what mozilla::pkix really cares about. For example, with your patch it is unclear whether mozilla::pkix treats PKCS#1 with MD5 differently than other unsupported algorithms, and that lack of clarity is bad.

Instead, change the testing logic so that it identifies signature algorithms using a different type--e.g. a ByteString containing the TLV for the SignatureAlgorithm. Changing it to be a ByteString of the TLV is needed for other reasons anyway. (See other bugs we're CC'd on together.)

::: security/pkix/lib/pkixbuild.cpp
@@ +187,2 @@
>    rv = trustDomain.VerifySignedData(subject.GetSignedData(),
>                                      potentialIssuer.GetSubjectPublicKeyInfo());

It would be better to define a function that centralizes this logic:

   // Don't pass anything signed by an unsupported algorithm to the TrustDomain.
   if (unsupported signature algorithm) {
     return Result::ERROR_CERT_SIGNATURE_ALGORITHM_DISABLED;
   }
   return trustDomain.VerifySignedData(...);

That way, we don't have to worry about them becoming out of sync. DRY.

::: security/pkix/test/gtest/pkixcert_signature_algorithm_tests.cpp
@@ +34,5 @@
> +CreateCert(const char* issuerCN,
> +           const char* subjectCN,
> +           EndEntityOrCA endEntityOrCA,
> +           Input signature, // the DER encoding of signatureAlgorithm
> +           SignatureAlgorithm signatureAlgorithm,

As I mentioned elsewhere in this review, you shouldn't use the SignatureAlgorithm type for this.

@@ +148,5 @@
> +class pkixcert : public ::testing::Test
> +{
> +};
> +
> +TEST_F(pkixcert, TrustedIssuerWithUnsupportedSignatureAlgorithm)

If these tests only differ in which algorithms are used, then please switch them to the data-driven approach, like I've done in the tests for bug 1063281, using TEST_P and friends. That would make the tests much easier to understand, review, and extend.
Attachment #8493307 - Flags: review?(brian) → review-
Comment on attachment 8493307 [details] [diff] [review]
patch v2

Thanks for the review, Brian. Clearing other reviews now since this patch will probably change considerably.
Attachment #8493307 - Flags: review?(rlb)
Attachment #8493307 - Flags: review?(mmc)
Brian, thanks for looking at the patches in this bug. I think it would be best if you kept reviewing them, if you're available. If not, let me know and I'll figure something else out.
Attachment #8493307 - Attachment is obsolete: true
Attachment #8498366 - Flags: review?(brian)
Shoot - forgot to actually instantiate some of the tests. This should do it.

https://tbpl.mozilla.org/?tree=Try&rev=50a536cb0911
Attachment #8498369 - Attachment is obsolete: true
Attachment #8498369 - Flags: review?(brian)
Attachment #8498372 - Flags: review?(brian)
Comment on attachment 8498366 [details] [diff] [review]
patch 1/3: check for unsupported signature algorithms

Review of attachment 8498366 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/pkix/include/pkix/pkixtypes.h
@@ +74,5 @@
>  
>    // id-dsa-with-sha1 (OID 1.2.840.10040.4.3, RFC 3279 Section 2.2.2)
>    dsa_with_sha1 = 18,
> +
> +  // Used to indicate any algorithm not supported by mozilla::pkix.

s/algorithm not supported by mozilla::pkix/unsupported algorithm/

::: security/pkix/lib/pkixbuild.cpp
@@ +179,5 @@
>      return RecordResult(rv, keepGoing);
>    }
>  
> +  rv = SafeVerifySignedData(trustDomain, subject.GetSignedData(),
> +                            potentialIssuer.GetSubjectPublicKeyInfo());

Maybe s/Safe/Wrapped/ or similar, since it isn't a safety issue.

::: security/pkix/lib/pkixutil.h
@@ +193,5 @@
>         + ((year - 1u) / 400u); // except years divisible by 400.
>  }
>  
> +// First ensures that the signature algorithm in the given signed data is
> +// supported by mozilla::pkix.

The main purpose of this function, currently, is to ensure that we do not call the TrustDomain's VerifySignedData function if the algorithm is unsupported. Similar, but different.
Attachment #8498366 - Flags: review?(brian) → review+
Attachment #8498367 - Flags: review?(brian) → review+
Comment on attachment 8498372 [details] [diff] [review]
patch 3/3: actually test handling unsupported signature algorithms

Review of attachment 8498372 [details] [diff] [review]:
-----------------------------------------------------------------

Great work!

::: security/pkix/test/gtest/pkixcert_signature_algorithm_tests.cpp
@@ +19,5 @@
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */

Should this be the CC0 (public domain) license text now?

@@ +71,5 @@
> +{
> +public:
> +  AlgorithmTestsTrustDomain(ByteString rootDER, ByteString rootSubjectDER,
> +               /*optional*/ ByteString intDER,
> +               /*optional*/ ByteString intSubjectDER)

nit: ByteString const&

@@ +96,5 @@
> +      trustLevel = TrustLevel::InheritsTrust;
> +    }
> +    return Success;
> +  }
> +

I suggest we define this (in pkixtestutil):

bool
InputEqualsByteString(Input input, const ByteString& bs)
{
   Input bsInput;
   if (bsInput.Init(bs.data(), bs.length() != Success) {
     // Init can only fali if it is given a bad pointer or if
     // the input is too long, which won't ever happen. Plus,
     // if it does, it is OK to call abort since this is only
     // test code.
     abort();
   }
   return InputsAreEqual(input, bsInput);
}

Then the below code could be simplified.

@@ +124,5 @@
> +      Input intCert;
> +      rv = intCert.Init(intDER.data(), intDER.length());
> +      if (rv != Success) {
> +        return rv;
> +      }

All of the above can be simplified quite a bit by using the function I suggest above.

@@ +143,5 @@
> +  }
> +
> +  virtual Result VerifySignedData(const SignedDataWithSignature& signedData,
> +                                  Input subjectPublicKeyInfo)
> +  {

EXPECT_NE(SignatureAlgorithm::unsupported_algorithm, signedData.algorithm);

@@ +174,5 @@
> +  // The certificate generated for the given rootSignatureAlgorithm is the
> +  // trust anchor.
> +  const ByteString& endEntitySignatureAlgorithm;
> +  const ByteString& optionalIntermediateSignatureAlgorithm;
> +  const ByteString& rootSignatureAlgorithm;

Probably better to just copy the strings instead of using ByteString&, since it is safer (avoids UaF).

@@ +182,5 @@
> +class pkixcert_IsValidChainForAlgorithm
> +  : public ::testing::Test
> +  , public ::testing::WithParamInterface<ChainValidity>
> +{
> +};

NIT: I suggest you define this class after the CHAIN_VALIDITY array. (I learned it is better that way when working on the tests for bug 1063281. Sometimes you can re-use the test data for parameterized test classes, and it is better to keep the test class near the TEST_P declarations for that test class.)

@@ +183,5 @@
> +  : public ::testing::Test
> +  , public ::testing::WithParamInterface<ChainValidity>
> +{
> +};
> +

Nit: You could define:

static const ByteString NO_INTERMEDIATE; // empty

And then use that instead of "ByteString()" below. Seems like it would be clearer.

@@ +184,5 @@
> +  , public ::testing::WithParamInterface<ChainValidity>
> +{
> +};
> +
> +const ChainValidity CHAIN_VALIDITY[] =

static

@@ +186,5 @@
> +};
> +
> +const ChainValidity CHAIN_VALIDITY[] =
> +{
> +  // The trust anchor may use an unsupported signature algorithm.

s/may use/may have a signature with/

@@ +193,5 @@
> +    md5WithRSAEncryption,
> +    true
> +  },
> +
> +  // Non-trust anchors may not use unsupported signature algorithms.

s/may not use unsupported signature algorithms/must not have a signature with an unsupported signature algorithm/.

@@ +203,5 @@
> +  { sha256WithRSAEncryption,
> +    md5WithRSAEncryption,
> +    sha256WithRSAEncryption,
> +    false
> +  },

Should we add RSA-MD2 too, since we know that there is/was a root cert with that algorithm?

Please file a follow-up bug for expanding the test coverage here. We should be testing that we don't accept algorithms that don't specify a hash and we should completely cover all the algorithms we do support, eventually. (That's out of scope for this bug though.)

@@ +226,5 @@
> +  const char* intermediateCN = "CN=Intermediate";
> +  ScopedTestKeyPair intermediateKey;
> +  ByteString intermediateSubjectDER;
> +  ByteString intermediateEncoded;
> +  if (!chainValidity.optionalIntermediateSignatureAlgorithm.empty()) {

if you take my suggestion for "NO_INTERMEDIATE" then you should change this condition to "!= NO_INTERMEDIATE".

::: security/pkix/test/gtest/pkixocsp_VerifyEncodedOCSPResponse.cpp
@@ +469,5 @@
>        return ENCODING_FAILED;
>      }
> +    const ByteString& signatureAlgorithm = signatureAlg
> +                                             ? *signatureAlg
> +                                             : sha256WithRSAEncryption;

The alignment is off. The "=" and "?" and ":" should be in the same column. (At least, that change is needed if it is desired to match the code I've written.)

@@ +578,5 @@
>    ByteString signerDER(CreateEncodedCertificate(
>                            ++rootIssuedCount, rootName,
>                            now - (10 * Time::ONE_DAY_IN_SECONDS),
>                            now - (2 * Time::ONE_DAY_IN_SECONDS),
> +                          signerName, extensions, nullptr, rootKeyPair.get(),

You could just name a variable RSA_SHA256 and pass that instead. It would be clearer at these call sites and you wouldn't have to special case nullptr, and you would be able to use a const& instead of const*.

@@ +852,5 @@
> +  ByteString responseString(
> +               CreateEncodedIndirectOCSPSuccessfulResponse(
> +                         "good_indirect_unsupportedSignatureAlgorithm",
> +                         OCSPResponseContext::good, byKey,
> +                         &md5WithRSAEncryption));

Add a comment calling out that the algorithm ID is for the certificate's signature, not the OCSP response's signature, and that the OCSP response, since it isn't clear here at the call site.

::: security/pkix/test/lib/pkixtestnss.cpp
@@ +79,5 @@
>      SECOidTag signatureAlgorithmOidTag;
>      if (signatureAlgorithm == sha256WithRSAEncryption) {
>        signatureAlgorithmOidTag = SEC_OID_PKCS1_SHA256_WITH_RSA_ENCRYPTION;
> +    } else if (signatureAlgorithm == md5WithRSAEncryption) {
> +      signatureAlgorithmOidTag = SEC_OID_PKCS1_MD5_WITH_RSA_ENCRYPTION;

You can also use SECOID_GetAlgorithmTag.
Attachment #8498372 - Flags: review?(brian) → review+
Thanks for the review. I believe I've addressed everything you pointed out, but I'm going to do another pass to make sure (probably early next week, since I'm out of the office tomorrow). In the meantime, I filed bug 1077192 for the follow-up additional testing.
Well, I was going to check this in, but inbound is closed again. In the meantime, here's a good try run: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=82bfbbda6e84
Had to rebase. This is the patch as checked in.
Attachment #8501310 - Attachment is obsolete: true
Attachment #8501853 - Flags: review+
Looks like the ASAN failures were intermittent. Sorry for the trouble :(
Roots that will be impacted if this bug is not fixed in Firefox 33...

Symantec:
Equifax Secure Global eBusiness CA-1 (1024-bit root)
Equifax Secure eBusiness CA-1 (1024-bit root)
Thawte Premium Server CA (1024-bit root)
Thawte Server CA (1024-bit root)

Verizon:
GTE CyberTrust Global Root (1024-bit root)

Netlock:
NetLock Expressz (Class C) Tanusitvanykiado (1024-bit root)
NetLock Kozjegyzoi (Class A) Tanusitvanykiado (2048-bit root)
NetLock Uzleti (Class B) Tanusitvanykiado (1024-bit root)

We've always told the CAs that the signature algorithm in the root certs don't matter, because we don't use it. So, we haven't told them to migrate off of their MD5 roots. We've been working to move them off of the 1024-bit roots, but there is still a lot of work to do, which is why those 1024-bit root changes have been postponed. 

I think this is going to have a very big impact if we release Firefox 33 with it.
(In reply to Kathleen Wilson from comment #30)
Correction to Netlock, only their 2048-bit root will be impacted by this bug:
Netlock:
NetLock Kozjegyzoi (Class A) Tanusitvanykiado (2048-bit root)
[Tracking Requested - why for this release]: see comment 30
Keywords: relnote
David, due to the emergency, we need an uplift request more than a tracking request

Kathleen, this bug has been reported a month ago, we are hearing about this just a few days before the release. Could you evaluate the impact to users if we do not take this patch? I understand it is important, I just would like to understand how this will impact users (website not available?)

I am also a bit concern by the size of the changes.
(In reply to Sylvestre Ledru [:sylvestre] from comment #34)
> Kathleen, this bug has been reported a month ago, we are hearing about this
> just a few days before the release. Could you evaluate the impact to users
> if we do not take this patch? I understand it is important, I just would
> like to understand how this will impact users (website not available?)

I apologize for not raising this earlier and for not having current data.

David was trying to get it done in time for the release and I assumed it was going to make it in. Unfortunately, he wasn't able to get it done in time.

I have some slightly stale data I can share on *some* of the roots that are impacted by this bug...

Symantec:
Equifax Secure Global eBusiness CA-1 (1024-bit root) -- 
See https://bugzilla.mozilla.org/show_bug.cgi?id=986019#c4
The data Symantec provided a few months ago was that there were still over 1,700 valid certs chaining up to this root. They are actively working to move all of those customers off of that root by Firefox 37. I was planning to request new data on this root in November, to make sure it could be removed in Firefox 37, so I don't have new data to report.

Equifax Secure eBusiness CA-1 (1024-bit root)
Thawte Premium Server CA (1024-bit root)
Thawte Server CA (1024-bit root)
For these, see the data provided here: 
https://groups.google.com/d/msg/mozilla.dev.security.policy/N1myPWWp3mc/JF0zhQmpcAMJ
Resulting analysis: "As such, I'd say that removing those roots now would be premature."

Verizon:
GTE CyberTrust Global Root (1024-bit root) -- See https://bugzilla.mozilla.org/show_bug.cgi?id=1047011#c2
We tried to remove this root in Firefox 32, but had to put it back in because it's removal would still cause too many compatibility problems. We are planning to try again in Firefox 35. Verizon and their customers are working on this, but not ready yet.

Netlock:
NetLock Kozjegyzoi (Class A) Tanusitvanykiado (2048-bit root) -- I don't have data on this one, because it's not one of the 1024-bit roots that we've been working on phasing out.
(In reply to Sylvestre Ledru [:sylvestre] from comment #34)
> I just would
> like to understand how this will impact users (website not available?)

Correct. Any user browsing to a website with an SSL cert chaining up to one of these root certificates will get the Untrusted Connection error. So, either will will have to post something about those users not using Firefox 33, or would have to provide instructions about clicking through the Untrusted Connection Error.

This would look something like:
Users of Firefox 33 who browse to a website with a cert chaining to an MD5 root will get the following Untrusted Connection error:
This Connection is Untrusted
You have asked Firefox to connect securely to <site URL>, but we can't confirm that your connection is secure.
Normally, when you try to connect securely, sites will present trusted identification to prove that you are going to the right place. However, this site's identity can't be verified.
What Should I Do?
If you usually connect to this site without problems, this error could mean that someone is trying to impersonate the site, and you shouldn't continue.
<site URL> uses an invalid security certificate. The certificate is not trusted because it was signed using a signature algorithm that was disabled because that algorithm is not secure. (Error code: sec_error_cert_signature_algorithm_disabled)

Users can over-ride the error by selecting "I Understand the Risks", "Add Exception...", and "Confirm Security Exception"
==
David, could you prepare an uplift request if we decide to make second build of 33.0?
Thanks
Flags: needinfo?(dkeeler)
(In reply to David Keeler (:keeler) [use needinfo?] from comment #0)
> Here's an example site that chains to this root: https://www.wind.it/

When I go this with a Nightly that is a few days old, I do not see any error and it actually ends up at the non-SSL equivalent.
Could we have a testcase that really shows the issue so we can QA the fix? Hopefully along with steps to reproduce and explanation of expected and buggy behavior.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #38)
> (In reply to David Keeler (:keeler) [use needinfo?] from comment #0)
> > Here's an example site that chains to this root: https://www.wind.it/
> 
> When I go this with a Nightly that is a few days old, I do not see any error
> and it actually ends up at the non-SSL equivalent.
> Could we have a testcase that really shows the issue so we can QA the fix?
> Hopefully along with steps to reproduce and explanation of expected and
> buggy behavior.

It seem that the changed the certificate: now it is issued by Symantec, so it work now.

$ openssl s_client -connect www.wind.it:443 < /dev/null
CONNECTED(00000003)
depth=3 C = US, O = "VeriSign, Inc.", OU = Class 3 Public Primary Certification Authority
verify return:1
depth=2 C = US, O = "VeriSign, Inc.", OU = VeriSign Trust Network, OU = "(c) 2006 VeriSign, Inc. - For authorized use only", CN = VeriSign Class 3 Public Primary Certification Authority - G5
verify return:1
depth=1 C = US, O = Symantec Corporation, OU = Symantec Trust Network, CN = Symantec Class 3 EV SSL CA - G3
verify return:1
depth=0 1.3.6.1.4.1.311.60.2.1.3 = IT, businessCategory = Private Organization, serialNumber = 05410741002, C = IT, postalCode = 00148, ST = Roma, L = Roma, street = VIA CESARE GIULIO VIOLA 48, O = WIND Telecomunicazioni S.p.A., OU = Cyber Security and Data Breach Defence, CN = www.wind.it
verify return:1
Kathleen, when you mention 1700 certificates, does that mean 1700 websites?
FYI, chatting with QE, we didn't see much bugs related to this issue during the aurora/beta cycle.
Flags: needinfo?(kwilson)
Do we have any form of telemetry data from the release channel on number of "untrusted cert" warnings displayed? Assuming so, can we see what effect this bug has had on those numbers for Firefox 33 builds?
I'm working on a patch for 33 now. The good news is the bug fix only touches a handful of lines. The bad news is the test changes are much more involved.

The telemetry probe that would get us the most insight is SSL_CERT_ERROR_OVERRIDES. For each entry in bucket 1, a user has connected to a site that did not require an override. For each entry in bucket >1, a user has connected to a site that did require an override. (So, this does not tell us directly how many times a particular error has been encountered.) Bucket 8 is the error SEC_ERROR_CERT_SIGNATURE_ALGORITHM_DISABLED, which is the one we're concerned about. From the telemetry dashboard, for 31 beta, 23K connections required an override for this error (compared to 58G non-error connections). For 32 beta, it was 28K vs 53G. For 33 beta, 4.2M connections required an override for this error vs 27G non-error connections.
Flags: needinfo?(dkeeler)
(In reply to David Keeler (:keeler) [use needinfo?] from comment #42)
> I'm working on a patch for 33 now. The good news is the bug fix only touches
> a handful of lines. The bad news is the test changes are much more involved.
The patches already landed are not enough? Any ETA? We are waiting for the patch to start the build for next Monday push to the mirror.

By the way, you confirm that it also affects Fennec, right?
Flags: needinfo?(dkeeler)
(In reply to Sylvestre Ledru [:sylvestre] from comment #43)
> (In reply to David Keeler (:keeler) [use needinfo?] from comment #42)
> > I'm working on a patch for 33 now. The good news is the bug fix only touches
> > a handful of lines. The bad news is the test changes are much more involved.
> The patches already landed are not enough? Any ETA? We are waiting for the
> patch to start the build for next Monday push to the mirror.

I meant I'm working on a branch-specific patch. The bug fix required a minor rebase. The patches will need more work, but shouldn't be fundamentally different. I should be able to finish by the end of the day, Pacific time.

> By the way, you confirm that it also affects Fennec, right?

Yes, this affects Fennec as well.
Flags: needinfo?(dkeeler)
Approval Request Comment
[Feature/regressing bug #]: mozilla::pkix refactoring
[User impact if declined]: see above comments
[Describe test coverage new/current, TBPL]: test coverage is good. See also: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=06f0a5009d22
[Risks and why]: low - the actual code changes are minimal
[String/UUID change made/needed]: none
Attachment #8503360 - Flags: review+
Attachment #8503360 - Flags: approval-mozilla-release?
Approval Request Comment
These are the tests for the bug fix patch.
Attachment #8503361 - Flags: review+
Attachment #8503361 - Flags: approval-mozilla-release?
Attachment #8503360 - Flags: approval-mozilla-release? → approval-mozilla-release+
Comment on attachment 8503361 [details] [diff] [review]
(for 33) patch 2/2: tests

Not waiting for treeherder. It won't change much if it fails in the try or in the mozilla-release branch.
Attachment #8503361 - Flags: approval-mozilla-release? → approval-mozilla-release+
Confirmed issue on Fx33, release candidate build 1. 
Verified fixed on Fx33, release candidate build 2.

The following sites were used to test with, all of which chain to root certs signed by MD5 and which were affected by the regression. They came from bugs referenced in comment 30 and comment 35.

https://www.191.alice.it/
https://www.agenziafarmaco.gov.it/
https://my.oakwood.edu/
https://network.endian.com/

I don't have any examples of live sites that chain to the Netlock root certs. If people feel that is important to test, I'd really appreciate some help obtaining more info. However, if people feel that we've covered the actual issue with the above sites, perhaps testing every CA's MD5-signed cert isn't important.
Clearing my needinfo request, because I believe the question was answered with the telemetry data provided in Comment #42 

Thanks to all of you who worked on this.

Matt, no need to test with every MD5 root. The test sites you listed in Comment #49 are sufficient.

Thanks!
Flags: needinfo?(kwilson)
I ran a compatibility test this weekend for Fx33. The goal was to try to understand what telemetry is telling us, and to try to make sure we understand the source of the problem.

I ran the test between Fx33 and Fx31. I would have compared vs Fx32 but due to build issues, I went with Fx31, and it should not matter.

Using a list of top 200k SSL sites that I've used before, I just ran the first 20k or so, for the sake of time. 

Fx33 vs Fx31 comparison:
    * ~20000 sites run in Fx33
    * 2870 error sites produced in Fx33, and then run in Fx31

Of these error sites: 
    * 222 total site changes in Fx33
    * 26 sites were broken only in Fx33 (actually 29, minus 3 network hiccups)
    * 193 changed error messages from Fx31 to Fx33
    * 84 of these changed errors became "sec_error_cert_signature_algorithm_disabled" in Fx33 (but were not in Fx31)

Quick examination says that some affected sites chain to 1024-bit MD5 certs, which is consistent with the bug that we fixed.

As far as breakage goes, if we went with the admittedly small data sample above, we could extrapolate that we broke 0.13% of SSL sites (26/20000). 

The increase in telemetry could be partly explained by the change in error message. While we indeed had an increase in broken sites, we had a much bigger increase in sites that reported as broken with this particular error, but were already broken because of another SSL violation.

Of 2870 error sites:
    * 26 newly broken
    * 88 newly indicate "sec_error_cert_signature_algorithm_disabled" 
    * 4 already indicated "sec_error_cert_signature_algorithm_disabled" 

If we compare 4 instances in Fx31 with 88 total instances in Fx33, that's a 22x increase.

In comment #42, telemetry shows that this error required an override for 23k connections in Fx31 beta, versus 4.2 million for Fx33 beta. That's an increase of 183x. 

In summary, the numbers don't match exactly and I can't explain it entirely. I will say that my sample size is small, and also that my testing does not replicate real-world usage. However, this data may help us understand why we see such a large change in telemetry, but not a huge outcry from the public. Most of the sites that generate this error in affected builds of Fx33 were likely already broken.
(In reply to Matt Wobensmith from comment #51)
> ...
> Using a list of top 200k SSL sites that I've used before, I just ran the
> first 20k or so, for the sake of time. 
> ...
> If we compare 4 instances in Fx31 with 88 total instances in Fx33, that's a
> 22x increase.
> 
> In comment #42, telemetry shows that this error required an override for 23k
> connections in Fx31 beta, versus 4.2 million for Fx33 beta. That's an
> increase of 183x. 

I could guess that the discrepancy could stem from sites which got broken but are not part of your top SSL sites, e.g. self-signed certificates on sites which are accessed mostly locally, such as routers, virtual machines, test servers, and other mostly-not-public-facing servers.
Correction:
    * 88 newly indicate "sec_error_cert_signature_algorithm_disabled" 
Should read:
    * 84 newly indicate "sec_error_cert_signature_algorithm_disabled"
(In reply to Matt Wobensmith from comment #51)
> Using a list of top 200k SSL sites that I've used before, I just ran the
> first 20k or so, for the sake of time. 
> 
> Fx33 vs Fx31 comparison:
>     * ~20000 sites run in Fx33
>     * 2870 error sites produced in Fx33, and then run in Fx31
> 
> Of these error sites: 
>     * 222 total site changes in Fx33
>     * 26 sites were broken only in Fx33 (actually 29, minus 3 network
> hiccups)
>     * 193 changed error messages from Fx31 to Fx33
>     * 84 of these changed errors became
> "sec_error_cert_signature_algorithm_disabled" in Fx33 (but were not in Fx31)

Do you have some numbers about what was the previous error message in Fx31? How many changed from untrusted issuer to signature algorithm disabled?


> Quick examination says that some affected sites chain to 1024-bit MD5 certs,
> which is consistent with the bug that we fixed.

Does the 1024 bit part matter at all? Or is it just the MD5 that's making problems? Just curious because the CAcert root is MD5-self-signed but a 4096 bit RSA key.


> Of 2870 error sites:
>     * 26 newly broken
>     * 84 newly indicate "sec_error_cert_signature_algorithm_disabled" 
>     * 4 already indicated "sec_error_cert_signature_algorithm_disabled" 
> 
> If we compare 4 instances in Fx31 with 88 total instances in Fx33, that's a
> 22x increase.
> 
> In comment #42, telemetry shows that this error required an override for 23k
> connections in Fx31 beta, versus 4.2 million for Fx33 beta. That's an
> increase of 183x.

You also need to take into account that the telemetry data is per connection and your statistic is per site if I'm not mistaken. So if the sites you observed that changed from some other error to algorithm disabled are more popular than the sites that already had the error, then it would explain why the relative increase per connection is much higher than the increase per site.

For example there's a very popular German blog https://blog.fefe.de/ that uses a CAcert certificate which as far as I can tell is affected by this bug. I guess that many geeks, nerds and hackers are in the Fx beta which also happens to be the main audience of that blog so the telemetry data might be somewhat biased. Now before the bug almost all of them probably had the CAcert certificate imported in their trust store or added a permanent exception to that special site (does that trigger another entry in the telemetry data on a repeated visit? there's no error page after all). Now with the bug that cert is being rejected because of the signature algorithm and therefore the exception/imported root doesn't keep Firefox from showing the error page and that will happen a few thousand times. In your statistic on the other hand it only shows as a single site that switched from one error message (untrusted issuer) to another (algorithm disabled).


> In summary, the numbers don't match exactly and I can't explain it entirely.
> I will say that my sample size is small, and also that my testing does not
> replicate real-world usage. However, this data may help us understand why we
> see such a large change in telemetry, but not a huge outcry from the public.
> Most of the sites that generate this error in affected builds of Fx33 were
> likely already broken.

Another reason might be that the CAcert root is probably affected by this bug which is not in the Mozilla trust store. So in your world they're already broken while in the real world they're used for sites that don't target the wide audience and therefore can live with requiring people to import that root (even simplified in organisations where it can be pre-installed) or add a permanent exception (which is equivalent to self-signed).
Hi Neo - I agree with your observations. 

FWIW, as I've hopefully conveyed, my testing is not complete by any means... it's only a quick attempt to understand the discrepancy between the telemetry data and the lack of public response to what was obviously a real bug. 

With the caveat that I only tested 20k sites - and clearly not representative of the whole web - I'm including here a zip file of errors I got from both Fx31 and Fx33, which you can diff yourself. This will tell you what errors were reported for broken sites and how this bug affected the reported error message. 

Hope this is helpful. We intend to have much more comprehensive coverage of SSL compatibility in future releases, so that we can catch regressions sooner.
Approval Request Comment
see previous approval request comments - this is just the test patch rebased for 34
Attachment #8505130 - Flags: review+
Attachment #8505130 - Flags: approval-mozilla-beta?
Comment on attachment 8501309 [details] [diff] [review]
patch 1/3: check for unsupported signature algorithms v2

Approval Request Comment
(see previous approval request comments)

This applies on 34, so requesting uplift approval.
Here's a try run, just for the record:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=bcbd67624a76
And, to fix the build issues, a subsequent try run:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1bc2bdc58041

(it's just the test patch that was broken - it's fixed, now)

Note for whoever checks this in: the "1/3" in the commit comment should probably be "1/2"
Attachment #8501309 - Flags: approval-mozilla-beta?
Comment on attachment 8501309 [details] [diff] [review]
patch 1/3: check for unsupported signature algorithms v2

We have already taken this fix in 33. Need to get it into 34 as well. Beta+
Attachment #8501309 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8505130 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
As per comment # 42, I checked the SEC_ERROR_CERT_SIGNATURE_ALGORITHM_DISABLED bucket (#8) under the SSL_CERT_ERROR_OVERRIDES probe and got the following results:

FX 33 Release:

- fx 33 (2014/10/11) - 15.48M hits (4.73k hits listed under bucket #8)
- fx 33.0.1 (2014/10/23) - 11.18k hits (0 hits listed under bucket #8)

FX 34 Beta:

- b1 (2014/10/14) - 18.32M hits (697.41k hits listed under bucket #8)
- b2 (2014/10/20) - 9.08M hits (80 hits listed under bucket #8)
- b3 (2014/10/23) - 479.49k hits (0 hits listed under bucket #8)

The patch landed in fx34 on 2014/10/16 as per comment #59 which explains why there were so many hits in b1. I'll keep an eye on the data and post another update with more information once more builds get released under fx 34.
I just realized I used the "Submissions" count rather than the total count: (the bucket #8 numbers were correctly used) 

FX 33 Release:

- fx 33 (2014/10/11) - 2.48G non-error hits (4.84k hits listed under bucket #8)
- fx 33.0.1 (2014/10/23) - 5.89G non-error hits (1 hit listed under bucket #8)

FX 34 Beta:

- b1 (2014/10/14) - 2.63G non-error hits (708.77k hits listed under bucket #8)
- b2 (2014/10/20) - 1.38G non-error hits (87 hits listed under bucket #8)
- b3 (2014/10/23) - 270.33M non-error hits (0 hits listed under bucket #8)
Depends on: 1096901
Verified fixed in Fx34, Fx35 2014-11-19.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.