sort out crash report certificate issues on Maemo

RESOLVED FIXED

Status

RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: ted, Assigned: ted)

Tracking

Trunk
ARM
Maemo
Bug Flags:
in-testsuite +

Details

Attachments

(1 attachment, 3 obsolete attachments)

I got crash reporting working on Maemo, but submitting the report fails:
[Sun Apr  4 2010 01:15:07 PM EDT] Crash report submission failed: Peer certificate cannot be authenticated with known CA certificates

Either the CA that signed the cert for https://crash-reports.mozilla.com isn't in the root store on the N900, or our usage of libcurl isn't picking up the right certs, or maybe libcurl just isn't configured right on the N900.
Blocks: 557114

Comment 1

9 years ago
IIRC the libcurl on N900 doesn't come with any root certs. That's why xchat over SSL doesn't work by default either.
Yuck. Looking at the docs for libcurl, curl_easy_setopt has a CURLOPT_SSLCERT option that lets you specify a CA cert file. I guess we can ship the CA cert we need alongside the crashreporter binary and enable it before sending.

Comment 3

9 years ago
the system certificate store is owned by something that provides a PKCS11 api which integrates with nss; so you can use nss to get access to the system certificate store :).

if you want to file a bug complaining about libcurl not interoperating with the system store, you can do so at https://bugs.maemo.org

https://bugs.maemo.org/show_bug.cgi?id=3003 indicates it was "not an official part of the diablo platform"

I'm told that for fremantle it was supplied by the connectivity team, so you could file a bug here:
https://bugs.maemo.org/enter_bug.cgi?product=Connectivity&component=Networking
(and it shouldn't be rejected, although I wouldn't bet on whether it'll be fixed...)
That sounds like a good plan for the future. I'll try to remember to file that bug upstream.

In the interim, my thoughts were:
1) ship a PEM formatted copy of the CA cert we need alongisde the crashreporter, like crashreporter.crt
2) Change the Breakpad HTTPUpload code to allow us to pass this cert file name:
http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/google-breakpad/src/common/linux/http_upload.cc#59
3) Use curl_easy_setopt with CURLOPT_CAINFO to point to this cert file:
http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTCAINFO

This should be sufficient.
Assignee: nobody → ted.mielczarek
Status: NEW → ASSIGNED
Created attachment 438098 [details] [diff] [review]
Ship Equifax CA cert as crashreporter.crt, use it if present in Maemo crashreporter

I don't know who to ask for review on this, but this makes our crashreporter able to submit reports on Maemo.

Shipping the Equifax CA Cert like that is kind of ugly, but I don't have a better idea.
You definitely should talk to IT first before doing this, as you're basically tying their hands as to what CA they can use for this cert.
If we're content with the hack of shipping a PEM-encoded cert file alongside our install for curl's sake, why not just add a build step on maemo that dumps the complete contents of certdata.txt?  curl can take a bundle of CAs, iinm. If it's done as a build step, it ought to stay in sync with changes to our root, and pulling the necessary info from certdata.txt is something other utils[1] already do.

[1] http://www.unspecific.com/ssl/mkcabundle.pl
We can definitely do that. The libcurl docs state "CURLOPT_CAINFO
Pass a char * to a zero terminated string naming a file holding one or more certificates to verify the peer with", so we can have as many CA certs listed in the file as we want.

(Also, I didn't know certdata.txt was where that list of CAs lived!)
I'll rework this patch to use that perl script that Johnath linked. I've emailed the author of it asking if we can use it under an open source license, since it lacks a license header.
(In reply to comment #9)
> I'll rework this patch to use that perl script that Johnath linked. I've
> emailed the author of it asking if we can use it under an open source license,
> since it lacks a license header.

The inimitable Mr. Orton, it seems, lives in our bugzilla as well.
Using that script works just fine, I've got a simple local patch that does that. I'll wait for Joe to chime in re: script licensing.

Reed: using the contents of our CA store should make this a non-issue for IT, right?
(In reply to comment #3)
> I'm told that for fremantle it was supplied by the connectivity team, so you
> could file a bug here:
> https://bugs.maemo.org/enter_bug.cgi?product=Connectivity&component=Networking
> (and it shouldn't be rejected, although I wouldn't bet on whether it'll be
> fixed...)

I filed this upstream: https://bugs.maemo.org/show_bug.cgi?id=9884
(In reply to comment #11)
> Reed: using the contents of our CA store should make this a non-issue for IT,
> right?

Correct.

Comment 14

9 years ago
We stopped using mkcabundle.pl since it doesn't respect the trust bits in the database, but I can give you a copy under a free license if you really want.

We now use a python script from Debian, hopefully this URL won't get munged:

http://cvs.fedoraproject.org/viewvc/F-13/ca-certificates/certdata2pem.py?revision=1.1&content-type=text%2Fplain&view=co

Comment 15

9 years ago
Sorry, hit return too soon...

Note that certdata2pem.py spits out many individual PEM files, but it would be trivial to adapt it to output them all, concatenated.

The trust bits matter because there can be (and currently are) certs in certdata.txt which have all the trust bits unset, and hence should not be produced in the PEM bundle output.  Also there are certs in there which are only trusted for code signing, etc.
(In reply to comment #15)
> Sorry, hit return too soon...
> 
> Note that certdata2pem.py spits out many individual PEM files, but it would be
> trivial to adapt it to output them all, concatenated.
> 
> The trust bits matter because there can be (and currently are) certs in
> certdata.txt which have all the trust bits unset, and hence should not be
> produced in the PEM bundle output.  Also there are certs in there which are
> only trusted for code signing, etc.

Yep - that's officially a deal-breaker of a big deal. There are certs in our root added without trust explicitly because we know they are fraudulent. Thanks for the updated info, Joe. 

Ted - workable?
Sounds great, especially since that script is Python instead of Perl. :)

Thanks Joe!
Created attachment 438163 [details] [diff] [review]
Ship crashreporter.crt built from certdata.txt, use it if present in Maemo crashreporter

Here's a patch that uses that Python script to generate crashreporter.crt from our certdata.txt. Much nicer!
Attachment #438098 - Attachment is obsolete: true
Attachment #438163 - Flags: review?(mark.finkle)
Comment on attachment 438163 [details] [diff] [review]
Ship crashreporter.crt built from certdata.txt, use it if present in Maemo crashreporter

johnath: if you could give this a once-over just to make sure you're cool with how I'm manhandling our cert store, that'd be great.

I made some minor changes to the Python script:
1) Take input from sys.stdin instead of a file
2) Hardcode the blacklist
3) Output errors to sys.stderr
4) Print all converted output to sys.stdout instead of separate files
Attachment #438163 - Flags: review?(johnath)
(In reply to comment #12)
> I filed this upstream: https://bugs.maemo.org/show_bug.cgi?id=9884

I forgot to mention, upstream gave me a workaround, which is cool, but it only works on the N900. Since we're still supporting the N810, the patch here is what we need. (The N900 has a certificates dir, but libcurl isn't configured to see it by default. The N810 doesn't even have a cert dir, apparently.)
Comment on attachment 438163 [details] [diff] [review]
Ship crashreporter.crt built from certdata.txt, use it if present in Maemo crashreporter

>+def label_to_filename(label):
>+    label = label.replace('/', '_')\
>+        .replace(' ', '_')\
>+        .replace('(', '=')\
>+        .replace(')', '=')\
>+        .replace(',', '_') + '.crt'
>+    return re.sub(r'\\x[0-9a-fA-F]{2}', lambda m:chr(int(m.group(0)[2:], 16)), label)

This function isn't used at all. Maybe drop it?
Comment on attachment 438163 [details] [diff] [review]
Ship crashreporter.crt built from certdata.txt, use it if present in Maemo crashreporter

I looked at the crashreporter and breakpad changes. Looks good to me.
Attachment #438163 - Flags: review?(mark.finkle) → review+
Comment on attachment 438163 [details] [diff] [review]
Ship crashreporter.crt built from certdata.txt, use it if present in Maemo crashreporter

>diff --git a/toolkit/crashreporter/client/certdata2pem.py b/toolkit/crashreporter/client/certdata2pem.py
>new file mode 100755
>--- /dev/null
...
>+# Hard-code blacklist
>+blacklist = ['"MD5 Collisions Forged Rogue CA 25c3"']
>+
>+# Build up trust database.
>+trust = dict()
>+for obj in objects:
>+    if obj['CKA_CLASS'] != 'CKO_NETSCAPE_TRUST':
>+        continue
>+    if obj['CKA_LABEL'] in blacklist:
>+        print >>sys.stderr, "Certificate %s blacklisted, ignoring." % obj['CKA_LABEL']
>+    elif obj['CKA_TRUST_SERVER_AUTH'] == 'CKT_NETSCAPE_TRUSTED_DELEGATOR':
>+        trust[obj['CKA_LABEL']] = True
>+    elif obj['CKA_TRUST_EMAIL_PROTECTION'] == 'CKT_NETSCAPE_TRUSTED_DELEGATOR':
>+        trust[obj['CKA_LABEL']] = True
>+    elif obj['CKA_TRUST_SERVER_AUTH'] == 'CKT_NETSCAPE_UNTRUSTED':
>+        print >>sys.stderr, '!'*74
>+        print >>sys.stderr, "UNTRUSTED BUT NOT BLACKLISTED CERTIFICATE FOUND: %s" % obj['CKA_LABEL']
>+        print >>sys.stderr, '!'*74
>+    else:
>+        print >>sys.stderr, "Ignoring certificate %s.  SAUTH=%s, EPROT=%s" % \
>+              (obj['CKA_LABEL'], obj['CKA_TRUST_SERVER_AUTH'],
>+               obj['CKA_TRUST_EMAIL_PROTECTION'])
...
>+for obj in objects:
>+    if obj['CKA_CLASS'] == 'CKO_CERTIFICATE':
>+        if not obj['CKA_LABEL'] in trust or not trust[obj['CKA_LABEL']]:
>+            continue
>+        sys.stdout.write("-----BEGIN CERTIFICATE-----\n")
>+        sys.stdout.write("\n".join(textwrap.wrap(base64.b64encode(obj['CKA_VALUE']), 64)))
>+        sys.stdout.write("\n-----END CERTIFICATE-----\n")
>+


First of all, congrats on the new baby!  :)

Second - my python's not all that, but can you clarify something for me, here? Why do we need a blacklist as a separate code path, given that we already skip over certs that are untrusted for server auth? The blacklisted one would pass that test too, and be left off the trusted list, no? And that would also save you from having to keep the blacklist updated, since we just added a bunch more untrusted roots as a deprecation phase (see bug 554334).

Also - why are we including certs trusted for email protection in this case? I know it was originally a general purpose script, and many CAs will have multiple trust bits because they issue different categories of thing off the same root. But if a CA in our root only has the email trust bit set, if they don't have the server trust bit set, then I don't think we want them appearing in a list that's meant to identify a crash report server.

I think that means that the whole trust db block there could just reduce to:

# Build up trust database.
trust = dict()
for obj in objects:
    if obj['CKA_CLASS'] != 'CKO_NETSCAPE_TRUST':
        continue
    if obj['CKA_TRUST_SERVER_AUTH'] == 'CKT_NETSCAPE_TRUSTED_DELEGATOR':
        trust[obj['CKA_LABEL']] = True
    else:
        print >>sys.stderr, "Ignoring certificate %s.  SAUTH=%s" % \
              (obj['CKA_LABEL'], obj['CKA_TRUST_SERVER_AUTH'])

I guess you could put the elif UNTRUSTED block in, if you wanted that logged separately.

What do you think?
That sounds like the right thing to do, and that's exactly why I asked for your review. Thanks! I'll get a new patch up when I get out from under my bugmail backlog.
I'm having issues with johnath's suggested change. If I run the Python script from the current patch here, everything works fine on the N900. If I make his suggested changes, which should only leave out CAs that are trusted just for email signing, then it no longer works. I'm testing with this (much simpler) command:
openssl verify -CAfile test.crt -purpose sslserver crashreports.crt

Where "test.crt" is the output of the Python script, and "crashreports.crt" is the cert from crash-reports.mozilla.com in PEM format.
I think there's a bug in the openssl that ships with Maemo. If I run the same test with the same list of certs on OS X (where openssl doesn't ship with a built-in list of CA certs), it works fine. I'll see if I can work around it.
Created attachment 442462 [details] [diff] [review]
With an awful hack

So the OpenSSL bug seems to have to do with the ordering of certs in the file. I added a gross hack here to just reverse the list of certs before writing them out to the PEM file, and this works on my N900.

It's not a great solution, and could be fragile if we add more CAs to certdata.txt, but it works for now, and the bug is filed upstream.
Attachment #438163 - Attachment is obsolete: true
Attachment #442462 - Flags: review?(johnath)
Attachment #438163 - Flags: review?(johnath)
Comment on attachment 442462 [details] [diff] [review]
With an awful hack

>+# Build up trust database.
>+trust = dict()
>+for obj in objects:
>+    if obj['CKA_CLASS'] != 'CKO_NETSCAPE_TRUST':
>+        continue
>+    # We only want certs that are trusted for SSL server auth
>+    if obj['CKA_TRUST_SERVER_AUTH'] == 'CKT_NETSCAPE_TRUSTED_DELEGATOR':
>+        trust[obj['CKA_LABEL']] = True
>+
>+# For some reason, OpenSSL on Maemo has a bug where if we output
>+# the certs in order it fails to be able to verify the server certificate.
>+# Reversing the order makes things work. This is likely to be fragile
>+# if the NSS certdata.txt changes. The bug is filed upstream:
>+# https://bugs.maemo.org/show_bug.cgi?id=10069
>+objects.reverse()
>+for obj in objects:
>+    if obj['CKA_CLASS'] == 'CKO_CERTIFICATE':
>+        if not obj['CKA_LABEL'] in trust or not trust[obj['CKA_LABEL']]:
>+            continue
>+        sys.stdout.write("-----BEGIN CERTIFICATE-----\n")
>+        sys.stdout.write("\n".join(textwrap.wrap(base64.b64encode(obj['CKA_VALUE']), 64)))
>+        sys.stdout.write("\n-----END CERTIFICATE-----\n\n")

So, first of all, I only reviewed the python script, and I don't know python, so this is really mostly a "logical" review.

This reversing business is a mess, and I really worry that it will suddenly stop working some day. That leads me to 3 questions:

1) What happens if this breaks in deployment? Does the crash report just fail-to-get-there, or does something more... tragic... occur?
2) Is there any way we can get automated tests here so that a future change to certdata is quickly spotted -- *before* deployment? I don't see tests in this patch, but I could believe they exist elsewhere.
3) Would it work to just filter out the "problem" cert (with a similar comment about how it causes funny bustage, filed uptream bug, &c)? That *feels* more resilient to me than .reverse(), if we can really localize it to that cert.
(In reply to comment #29)
> This reversing business is a mess, and I really worry that it will suddenly
> stop working some day. That leads me to 3 questions:

I completely agree.

> 1) What happens if this breaks in deployment? Does the crash report just
> fail-to-get-there, or does something more... tragic... occur?

The crash reporter will simply report "failed to submit crash report", and the crash will be saved in the "pending" directory. The user can visit about:crashes and click on the crash to resubmit it from within the app (which will work because it uses necko+NSS).

> 2) Is there any way we can get automated tests here so that a future change to
> certdata is quickly spotted -- *before* deployment? I don't see tests in this
> patch, but I could believe they exist elsewhere.

This specific case is difficult to test, since we need to test that submitting works when the server certificate is one that was issued by a real CA. Our test harnesses use a fake CA and server certificates issued by that CA, so they're not a valid test. We don't currently have any tests that test submission from the crash reporter anyway, which is kind of bad but it's also been kind of difficult to test.

That being said, I can write a hacky xpcshell test that just uses a copy of the crash-reports.mozilla.com certificate and invokes the `openssl verify` command using the generated PEM file to check that this specific case works.

> 3) Would it work to just filter out the "problem" cert (with a similar comment
> about how it causes funny bustage, filed uptream bug, &c)? That *feels* more
> resilient to me than .reverse(), if we can really localize it to that cert.

This doesn't inspire me with any more confidence, since I don't know what salient feature of the cert is causing this failure. I can give it a shot though, and combined with a unit test as described above, it should at least let us know if things break.
Created attachment 442699 [details] [diff] [review]
Less awful hack + unit test

How does this strike you? I need to try the test on-device still, but it works on my desktop machine.
Attachment #442462 - Attachment is obsolete: true
Attachment #442699 - Flags: review?(johnath)
Attachment #442462 - Flags: review?(johnath)
Attachment #442699 - Flags: review?(johnath) → review+
Comment on attachment 442699 [details] [diff] [review]
Less awful hack + unit test

Yeah, this looks good to me (with the caveats above about only reading the python, not knowing python, and not having run this -- basically being totally unqualified to review this :)

>diff --git a/toolkit/crashreporter/client/maemo-unit/test_maemo_certs.js b/toolkit/crashreporter/client/maemo-unit/test_maemo_certs.js
>new file mode 100644
>--- /dev/null
>+++ b/toolkit/crashreporter/client/maemo-unit/test_maemo_certs.js
>+  let shscript = do_get_file("opensslverify.sh");
>+  let cacerts = do_get_file("crashreporter.crt");
>+  let servercert = do_get_file("crashreports.crt");
>+  let args = [shscript.path, cacerts.path, servercert.path];
>+  process.run(true, args, args.length);
>+
>+  do_check_eq(process.exitValue, 0);
>+}

Can we make that do_check more wordy so that when it fails, people have any hope of reconstructing why?  :)  A comment would be okay as a fallback, but ideally something in the log around the failure.

r=me for the pieces I can review, with that change. Thanks for the multiple rounds here, Ted.
Thanks! I added a dump statement there that should hopefully help. I think it's going to be confusing no matter what if this test starts failing, but at least we have *something*. I appreciate your reviewing despite claiming ignorance. ;-)

Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/9cbe87222f44
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment on attachment 442699 [details] [diff] [review]
Less awful hack + unit test

Drivers: we'd like to take this patch to enable crash reporting on Fennec. It has minimal impact on Firefox builds (only 22 lines of the patch touch common code).
Attachment #442699 - Flags: approval1.9.2.5?

Updated

8 years ago
Attachment #442699 - Flags: approval1.9.2.5? → approval1.9.2.5+
Pushed to 1.9.2:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/1bad18064ab6
status1.9.2: --- → .5-fixed
You need to log in before you can comment on or make changes to this bug.