Closed
Bug 974151
Opened 10 years ago
Closed 10 years ago
Production and staging telemetry-experiments.mozilla.org
Categories: Infrastructure & Operations Graveyard :: WebOps: Other (task)
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: benjamin, Assigned: cturra
Whiteboard: [business - new app]
The telemetry experiments system will need a domain with staging and production versions with an SSL certificate. We will probably need to pin the certificate in the client the same way we pin the AMO cert, since it will be used to serve executable code updates. The system will be much like FHR: scripts which produce flat files to serve, no dynamic server code. We intend for this to go into production for nightly/aurora in 4 weeks.
Updated•10 years ago
Assignee: server-ops → server-ops-webops
Component: Server Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
QA Contact: shyam → nmaul
Comment 1•10 years ago
Background: FHR is served from the CDN: the scripts to generate the static content run on the CDN origin. Jake set it up and knows all :)
Comment 2•10 years ago
FHR (fhr.cdn.mozilla.net) does not pin cert... how certain are we that this is a requirement here? It affects what we do with SSL and/or CDNs. Relevant note, most properties used by the browser (snippets, fxfeeds, fhr, crash-reports, etc) don't use pinned certs.
Comment 3•10 years ago
(In reply to Jake Maul [:jakem] from comment #2)
> FHR (fhr.cdn.mozilla.net) does not pin cert... how certain are we that this
> is a requirement here? It affects what we do with SSL and/or CDNs. Relevant
> note, most properties used by the browser (snippets, fxfeeds, fhr,
> crash-reports, etc) don't use pinned certs.

AUS is pinned to a couple of specific Certificate Authorities, but afaik we don't do any kind of pinning anywhere else.
Reporter
Comment 4•10 years ago
I'm still working on getting guidance from the sec teams about that, but since this will be shipping executable code to Firefox users I expect it will likely be necessary to pin the cert the same way we do for AMO (and AUS, right?). "ships executable code to the browser" seems to be our criterion for needing a pinned cert.
Assignee
Comment 5•10 years ago
:bsmedberg - can you please point me to where the code for this project is hosted? similar to fhr, we'll set up an auto-deploying dev environment for your dev/testing and one for prod, which we can assist with deployments for.
Flags: needinfo?(benjamin)
Updated•10 years ago
Whiteboard: [business - new app]
Reporter
Comment 6•10 years ago
We did security review yesterday and the conclusion is that we do need to pin the cert for this service the same way we do for AMO and AUS in the client.

Here is the initial code for the system, which does nothing but publish an empty manifest: http://hg.mozilla.org/users/bsmedberg_mozilla.com/telemetry-experiment-server/

Build-time requirement: python and genshi
Runtime requirement: mod_rewrite rules via .htaccess, but if you have a different desired solution for mapping URLs like /manifest/Firefox/27.0/beta to flat files like firefox-manifest.json I'm happy to do something different.
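The URL-to-flat-file mapping mentioned in comment 6 can be sketched in Python. This is only an illustration of the rewrite logic, not the server's actual .htaccess rules. The function name `manifest_file_for` is hypothetical, and the assumption that the file name depends only on the lowercased product name is inferred from the single example given (/manifest/Firefox/27.0/beta to firefox-manifest.json).

```python
import re

# Hypothetical URL shape: /manifest/<product>/<version>/<channel>
_MANIFEST_URL = re.compile(
    r"^/manifest/(?P<product>[^/]+)/(?P<version>[^/]+)/(?P<channel>[^/]+)/?$"
)


def manifest_file_for(path):
    """Return the flat file a manifest URL would map to, or None if no match."""
    match = _MANIFEST_URL.match(path)
    if match is None:
        return None
    # Assumption: only the product selects the file; version and channel
    # are accepted in the URL but do not change the result.
    return match.group("product").lower() + "-manifest.json"


print(manifest_file_for("/manifest/Firefox/27.0/beta"))  # firefox-manifest.json
```

Because every URL for a given product resolves to the same static file, the same effect could be achieved by any web server or CDN rule that can rewrite a path prefix, which is why the mod_rewrite requirement is negotiable.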
Flags: needinfo?(benjamin)
Assignee
Updated•10 years ago
Assignee: server-ops-webops → cturra
Assignee
Comment 7•10 years ago
i have completed the dev setup for this environment. every 15 minutes an update script is run to do a hg pull/update and rebuild the webroot (destination path). you can access dev at: https://telemetry-experiment-dev.allizom.org/
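A minimal sketch of what one cycle of such an update script might do, assuming it is triggered by a scheduler every 15 minutes. `hg pull -u` is the standard Mercurial command to pull and update in one step. The build step (`build.py` taking the webroot as its argument) is hypothetical, since the real build command is not given in this bug, and the paths in the usage note are placeholders.

```python
import subprocess


def update_commands(webroot):
    """Build the command list for one update cycle (nothing is executed here)."""
    return [
        # Pull new changesets and update the working copy in one step.
        ["hg", "pull", "-u"],
        # Hypothetical build step: regenerate the flat files into the webroot.
        ["python", "build.py", webroot],
    ]


def run_update(repo_dir, webroot, runner=subprocess.check_call):
    """Run each command from inside the repository checkout.

    `runner` is injectable so the sequence can be exercised without
    Mercurial installed; check_call raises if a command fails, which
    stops the cycle before a broken checkout is rebuilt.
    """
    for command in update_commands(webroot):
        runner(command, cwd=repo_dir)
```

Calling `run_update("/path/to/checkout", "/path/to/webroot")` from a scheduled job would perform one pull, update, and rebuild.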
Assignee
Comment 8•10 years ago
:bsmedberg - before i get too far into the production setup for this, i just wanted to confirm the url i am using is okay with you? generally, we use singular names and it looks like this is how you've also named your hg repo. the plan for production will be: telemetry-experiment
Flags: needinfo?(benjamin)
Reporter
Comment 9•10 years ago
Hrm, everything else has been "experiments" (that's the name we expose in the UI). But I also don't care that much because the website itself isn't really user-visible; the HTML content is mainly intended for developers.
Flags: needinfo?(benjamin)
Assignee
Comment 10•10 years ago
good news! the production environment is now set up for this new service and completely fronted with one of our CDNs (just like fhr).

$ curl -I https://telemetry-experiment.cdn.mozilla.net/
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: generic2.webapp.phx1.mozilla.com
Content-Type: text/html; charset=UTF-8
Strict-Transport-Security: max-age=15768000 ; includeSubDomains
Accept-Ranges: bytes
ETag: "1b9"
Last-Modified: Wed, 12 Mar 2014 18:30:08 GMT
X-Cache-Info: not cacheable; response specified "Cache-Control: no-cache"
Content-Length: 441
Expires: Thu, 13 Mar 2014 06:23:34 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Thu, 13 Mar 2014 06:23:34 GMT
Connection: keep-alive
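Two of the headers in that response are worth reading closely: the Cache-Control value explains why the CDN reports the response as not cacheable, and the Strict-Transport-Security max-age works out to about half a year. The sketch below parses both. The helper names are hypothetical, and `is_cacheable` is a deliberate simplification (it treats any no-cache or no-store directive as "not cacheable") rather than a full implementation of HTTP caching rules.

```python
def parse_directives(value, separator):
    """Split a header value into a dict of lowercased directive -> argument."""
    directives = {}
    for part in value.split(separator):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip() or None
    return directives


def is_cacheable(cache_control):
    """Simplified check: no-cache or no-store means a shared cache skips it."""
    directives = parse_directives(cache_control, ",")
    return "no-cache" not in directives and "no-store" not in directives


def hsts_max_age_days(hsts):
    """Convert the HSTS max-age directive from seconds to days."""
    directives = parse_directives(hsts, ";")
    return int(directives["max-age"]) / 86400


print(is_cacheable("max-age=0, no-cache, no-store"))                  # False
print(hsts_max_age_days("max-age=15768000 ; includeSubDomains"))      # 182.5
```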
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Reporter
Comment 11•10 years ago
Looking at the production certificate, should this be pinned to "Cybertrust Public SureServer SV CA" in the client with a CN of "*.cdn.mozilla.net"?
Flags: needinfo?(cturra)
Assignee
Comment 12•10 years ago
to be perfectly honest, issuer pinning concerns me in this case. akamai is hosting the ssl certificate for us, since they're the cdn we're using in this case. we have no control over which certificate authority they use or when they reissue. this could cause us all sorts of headaches down the road.
Flags: needinfo?(cturra)
Comment 13•10 years ago
Chris, for certPinning (bug 744204, coming real soon now (TM)) for the CDN, I am currently pinning to the keys of several CAs (currently all of Verisign, GTE, Equifax, GeoTrust, DigiCert, Thawte and Baltimore). Are you comfortable with that set? If not, who do you think I should get in touch with to augment the set?
Assignee
Comment 14•10 years ago
that seems like a fair approach, and in this case, the root ca is: Baltimore CyberTrust Root. does this issuer pinning code follow cert chains all the way back to the root?
Comment 15•10 years ago
(In reply to Chris Turra [:cturra] from comment #14)
> that seems like a fair approach, and in this case, the root ca is:
> Baltimore CyberTrust Root. does this issuer pinning code follow cert chains
> all the way back to the root?

Correct, the test ensures that there is an intersection between the specified keys and the keys in the computed chain (including the root)[1]. In the future we could even pin intermediates, once we have a non-optimistic algorithm for certificate path building.

[1] There is a default exception so that when a chain terminates in a non-built-in root we do not enforce pinning (self mitm)
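The test described in comment 15 can be sketched as a set intersection. This is an illustration of the rule as stated in the comment, not the code from bug 744204. The function name, the use of plain strings as key identifiers, and the `root_is_builtin` flag are all assumptions made for the example.

```python
def chain_satisfies_pins(chain_keys, pinned_keys, root_is_builtin):
    """Return True if the connection passes the pinning test.

    chain_keys:      key identifiers for every cert in the computed chain,
                     including the root.
    pinned_keys:     key identifiers the site is pinned to.
    root_is_builtin: False when the chain ends in a user-added root, in
                     which case pinning is not enforced (footnote [1]).
    """
    if not root_is_builtin:
        return True
    # Pass when at least one key in the chain is in the pinned set.
    return not set(chain_keys).isdisjoint(pinned_keys)


pins = {"baltimore-root", "digicert-root"}
chain = ["leaf", "sureserver-intermediate", "baltimore-root"]
print(chain_satisfies_pins(chain, pins, root_is_builtin=True))   # True
```

Because the whole chain is checked, pinning the root key keeps working when the CDN reissues the leaf or switches intermediates under the same root, which is the property asked about in comment 14.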
Reporter
Comment 16•10 years ago
I don't understand how the discussion here affects the situation for shipping this in Firefox 30. Using CertUtils.jsm I can validate any aspect of the site certificate that is exposed on nsIX509Certificate{,2,3} which includes the following attributes:

.nickname (N/A)
.emailAddress (N/A)
.subjectName "CN=*.cdn.mozilla.net,OU=IT,O=Mozilla,L=Mountain View,ST=CALIFORNIA,C=US"
.commonName "*.cdn.mozilla.net"
.organization "Mozilla"
.organizationalUnit "IT"
.sha1Fingerprint "3D:FA:78:5B:D4:CF:A3:A6:0A:89:BF:70:39:1E:50:63:0B:EB:ED:8A"
.issuerName "CN=Cybertrust Public SureServer SV CA,O=Cybertrust Inc"
.serialNumber "01:00:00:00:00:01:44:0F:FD:39:11:20:74:BA"
.issuerCommonName "Cybertrust Public SureServer SV CA"

But unless we added some new feature to CertUtils.jsm there isn't a way to specify that it be a cert for *.cdn.mozilla.net and have specific root attributes.
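The kind of check comment 16 describes, comparing attributes of the site certificate against expected values, can be sketched as follows. This is written in Python for illustration and does not reproduce the CertUtils.jsm API. The function name and the use of a plain dict for the certificate are assumptions, while the attribute values are the ones quoted in the comment.

```python
def mismatched_attributes(cert, expected):
    """Return the names of expected attributes the cert does not match."""
    return [name for name, want in expected.items() if cert.get(name) != want]


# Leaf certificate attributes as listed in comment 16.
site_cert = {
    "commonName": "*.cdn.mozilla.net",
    "organization": "Mozilla",
    "issuerCommonName": "Cybertrust Public SureServer SV CA",
}

# Pin on the subject name and the direct issuer only.
expected = {
    "commonName": "*.cdn.mozilla.net",
    "issuerCommonName": "Cybertrust Public SureServer SV CA",
}

print(mismatched_attributes(site_cert, expected))  # []
```

The limitation raised in the comment is visible here: only attributes of the site certificate itself are available to compare, so nothing about the root at the end of the chain can be expressed this way.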
Comment 17•10 years ago
It does not affect what is shipping in Firefox 30.
Reporter
Comment 18•10 years ago
ok that leaves us back at what I should pin the client to in Firefox 30.
Flags: needinfo?(cturra)
Assignee
Comment 19•10 years ago
i don't believe i should be the one making that decision. we're stuck with what we have from akamai. i would prefer, if possible, that we pin trust on the root certificate authority (Baltimore CyberTrust Root) rather than the intermediate issuer.
Flags: needinfo?(cturra)
Reporter
Comment 20•10 years ago
I don't have the technical ability to pin on a root cert, which is why I raised the pinning concern as a possible blocker for using the CDN. So now we're using the CDN and can't pin properly? Or can I just pin against the cert that we're actually using on the CDN?
Comment 21•10 years ago
I am uncomfortable with the idea of cert pinning altogether, at least in this form... doubly so when we're pinning against a cert purchased and provided by a 3rd party.

Here's the scenario that scares me most: say we pin this in Firefox 30 (the exact cert, intermediate, root, whatever). What happens when the cert is changed (expires, new provider, etc)? Obviously we can change the pin in *new* versions, but there's always a trickle of users that (for some reason) get stuck/abandoned on older versions... they're not going to get updated. Hence, this functionality will simply be broken for them somehow (I don't know what that UI looks like).

Who can accept responsibility for this type of risk? How many people (or what %) are we willing to abandon over pinning? How will we make sure this cert pinning is not simply forgotten about in future when changes need to be made? I think these are questions that need a higher-level business owner to answer, not us folk on the ground implementing it. Someone like Bob Moss, perhaps?

I know this sounds very negative, but I feel it's important. We very nearly got badly burned by the pinning on aus3.mozilla.org just last year. This of course isn't *that* bad, but I'm still concerned by it. Pinning has multi-year ramifications, and so (IMO) deserves special consideration.
Comment 22•10 years ago
Some quick responses/questions on my way out the door:

1) Pin failures here are much less tragic than pin failures on AUS. Pin failures on AUS mean losing the ability to update all users forever. Pin failures on telemetry experiments mean losing the ability to push new experiments until we can update our users to look for new pins. (AIUI! Benjamin, correct me swiftly if I'm wrong) We should not treat them as comparable risk profiles (though in the absence of that clarity, I think Jake is right to raise it)

2) Once we get Camilo's cert pinning code in, do we intend to remove this special-purpose pinning and use that generic pinning instead? I think I hear that happening here, but that also scopes down our risk for this approach to N releases.

3) If we have a security requirement around pinning that can't be satisfied by our CDN, then we shouldn't use the CDN for it. Unless I'm misunderstanding things (possible! I'm standing up as I type this! There's wine waiting for me!) this is about delivering occasional chunks of experiment code to millions, but not tens or hundreds of millions, of users from time to time. Is akamai necessary/desirable for solving that problem, particularly given the peculiarities of that code's security properties?

4) Jake asks about how many people we're willing to abandon, but unless I've deeply misunderstood, in no case should this feature (pinned or otherwise) cause us to lose users when it breaks. Benjamin?

I can be the guy to yea/nay that risk on behalf of moz, but I'll need to understand those questions to do so. If they are as I suspect (pin failures are indeed not scary; we do plan to switch to camilo's pinning soon; we don't need CDN; we won't lose users) then I hereby do so.
Reporter
Comment 23•10 years ago
1) correct, we lose the ability to push new experiments or remote-kill bad ones, but it doesn't affect other functionality

2) AIUI Camilo's pinning requires server cooperation, but yes, we plan to move to the less fragile thing soon.

3) During security review pinning was identified as a requirement, since we know that our code-deployment services have been the subject of certificate-spoofing attempts in the past. The expected load on this service is 1-2M daily pings for a manifest. .xpi fetches might peak on experiment-deployment days at 150k/day but will typically be almost 0.

4) None of this should cause abandonment.
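To put the load figures from point 3 in perspective, the daily counts can be converted to average requests per second. This is simple arithmetic on the numbers quoted above. It assumes traffic is spread evenly over the day, which real traffic is not, so actual peaks would be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_rate(requests_per_day):
    """Average requests per second, assuming an even spread over the day."""
    return requests_per_day / SECONDS_PER_DAY


for label, per_day in [
    ("manifest pings (low estimate)", 1_000_000),
    ("manifest pings (high estimate)", 2_000_000),
    (".xpi fetches (deployment-day peak)", 150_000),
]:
    print(f"{label}: {average_rate(per_day):.1f} req/s")
# manifest pings (low estimate): 11.6 req/s
# manifest pings (high estimate): 23.1 req/s
# .xpi fetches (deployment-day peak): 1.7 req/s
```

At roughly 12 to 23 requests per second of static files on average, the numbers are consistent with the question in comment 22 about whether a CDN is strictly necessary for this service.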
Comment 24•10 years ago
How do you feel, Jake?
Comment 25•10 years ago
Given the decreased risk here as compared to something like AUS, I'm much less concerned than I was back in comment 21, months ago. If there's no risk of abandoning users, then the biggest risk would seem to be the accidental invalidation or delay of an experiment, should the CDN vendor change their cert on us unexpectedly. That's much more recoverable, because at the very least future versions would be able to pin on the new thing properly, and we'd just lose the time in-between. r+ from me
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard