Closed Bug 1279662 (RansomPKP) Opened 8 years ago Closed 4 years ago

"Abusing Bleeding Edge Web Standards for AppSec Glory" (HPKP DOSsing)

Categories

(Core :: Security: PSM, defect, P3)

RESOLVED DUPLICATE of bug 1412438

People

(Reporter: dveditz, Unassigned)

Details

(Keywords: sec-other)

Bryant Zadegan and Ryan Lester are giving a Black Hat & DefCon talk about abusing HPKP as a "novel attack vector" under the title "Abusing Bleeding Edge Web Standards for AppSec Glory." Apparently two other parties are affected by the same problem, probably Chrome and Opera (at least, those are the only others listed at http://caniuse.com/#search=hpkp).

They say "It's not really a flaw in your implementation of the protocol, though. Rather, it seems to be a circumstantially emergent vulnerability arising from gaps in the HPKP protocol and novel capabilities internet-wide."

This bug is currently a placeholder for additional details from them.
Flags: sec-bounty?
Summary: HPKP attack vector → Sweet indeed cheesy seated poofs (HPKP vuln placeholder)
[From follow-up mail from Bryant Zadegan sent to Mozilla, Google, EFF, and LetsEncrypt]

All concerned,

What we're describing below is not a single specific vulnerability but rather an aggregate vulnerability arising from a combination of otherwise-low or informational findings which enable a massively potent new attack vector.

Please note that the timeframe on this report is limited mostly due to a self-imposed requirement on our end to thoroughly validate the novel attack pattern described in this report. Due to the fact that this is the centerpiece for the Black Hat/DEF CON talk "Abusing Bleeding Edge Web Standards for AppSec Glory," disclosure takes place either on August 3rd or August 4th depending on the schedule imposed by Black Hat.

A succinct list of the low-ranked defects or informational observations which enable this potent attack pattern is as follows:

--- Let's Encrypt blindly trusts that the machine requesting the certificate is doing so with human authorization. This is the easiest to mitigate and is what currently most effectively enables this attack.

--- Let's Encrypt inconsistently applies rate limiting to certificate generation. This was only just confirmed+reproduced and is the only finding we will submit in a subsequent report.

--- Let's Encrypt's actual rate limiting is insufficient to restrict the damage potential posed by this attack.

--- HPKP max-max-age at one year in Firefox allows for an effective ransoming of Firefox users.

--- HPKP persistence in Chrome's Incognito and Firefox's Private Browsing modes in violation of privacy-centric threat models for these modes gives users no mechanism for avoiding an HPKP lock-out (what we're calling HPKP Suicide™) short of manually clearing keys or reinstalling the browser, something most lay users will never figure out.

--- The HPKP specification does not seem to suggest (let alone require) additional verification of a request to enable HPKP. In turn, both Chrome and Firefox blindly trust that HPKP headers have been issued with the authorization of site owners.

Cheers,

-Bryant Zadegan and Ryan Lester

keybase.io/bryant
(Ryan's key is attached)


----------------------

Vulnerability Details:

----------------------

Upon compromising a targeted site which may not already have HPKP enabled, an attacker can, through rapid key rotation enabled by a supported CA (such as Let's Encrypt), hold access to a site by a vast number of users up for ransom.

In more detail: emergent conditions arising from the existence of HPKP and a novel mechanism for producing trusted certificates (Let's Encrypt, henceforth LE) enable a new class of ransomware. Specifically, malware infecting a web server can leverage LE and HPKP to hold users of a site up for ransom, meaning that vast quantities of visitors to a website (on the order of millions of uniques) may be denied access for as little as 60 days and as long as a year until the targeted site pays a ransom to retrieve the master key. The low-to-informational findings enumerated in the introductory section summarize the architectural decisions through which this vulnerability emerged.

We've seen conversation and speculation about this topic arise spontaneously on numerous forums such as Hacker News (e.g. https://news.ycombinator.com/item?id=11679887), and the authors of the HPKP specification even discuss hostile pinning in the context of naïve denial of service (https://tools.ietf.org/html/rfc7469#section-4.5), but we've never seen anybody identify these final steps required to successfully weaponize this as a persistent attack. The missing component that nobody seems to have designed involves the use of rapid key rotation to rotate disposable keys while pinning a recovery key, something which can currently be accomplished quite successfully via LE and very much in spite of LE's use of rate limiting. In the future and as anonymous cryptocurrencies take off, we anticipate that providers of paid certificates with little to no significant rate limiting enabled will be usable in implementing this attack in place of using LE. It is for this reason that we've also included the Google and Mozilla security teams on this report, though we believe the initial mitigation priority rests with LE.

In its own right, this aggregate vulnerability is not of critical severity due to the need for a machine compromise to enable the described attack. However, both its nature as a novel attack pattern arising from a combination of architectural and implementation-specific compromises in the current SSL/TLS landscape as well as the imminent financial damage enabled by this attack are what justify high severity as well as a need for a resolution by committee, specifically the committee included on this thread at a minimum.

Reproduction steps for the attack pattern (which we're dubbing RansomPKP in the spirit of unnecessarily naming security issues) can be found below.


---------------------------------------------

RansomPKP Reproduction Steps / What Happened:

---------------------------------------------

1) Determine your target.

2) Generate a fixed keypair from a machine not affiliated with your target (what we're calling a Ransom keypair).

3) Take control of the target web server and include a payload that will accomplish the following steps. Include the Ransom public key hash with the payload.

4) From the targeted web server, generate the first of a rotating keypair + CSR (what we're calling a Lockout keypair).

5) Follow the ransom lockout process described below. This step may loop.

6) Receive payment for Ransom keypair and disclose Ransom keypair to the target, ending the attack.

The ransom lockout process:

While the number of users for whom HPKP headers are set is less than *n* (a value to be determined by the attacker's research for the given target),

1) Install (or assure the presence of already-installed) HPKP headers containing the Ransom and Lockout key hashes. Set Max Age to one year.

2) If the number of users who've received the headers reaches *n*,

2a) Generate new Lockout keypair + CSR.

2b) Send CSR to LE for certificate generation.

2c) Update TLS certificate and HPKP header with new Lockout key and key hash.

Note: depending on the target, this process may be modified such that the payload maintains stealth for as long as possible prior to the *initial* rotation of keys. This ensures maximum coverage of users who will be locked out before the target detects the attack. The moment the key is rotated the first time, it's assumed that the clock on resolution begins counting down as the initial reports of a TLS error will likely overwhelm the target's support staff.
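
For readers unfamiliar with the header referenced in step 1 of the lockout process: RFC 7469 defines each pin as the base64-encoded SHA-256 digest of a certificate's DER-encoded SubjectPublicKeyInfo, carried in a `Public-Key-Pins` header alongside a `max-age` in seconds. The Python below is only a minimal sketch of that encoding and of the header's shape. The function names and the placeholder pin strings are illustrative assumptions, not anything taken from the report.

```python
import base64
import hashlib
from typing import Iterable

ONE_YEAR = 365 * 24 * 60 * 60  # 31536000 seconds, the one-year max-age named in step 1


def pin_sha256(spki_der: bytes) -> str:
    """Base64 of SHA-256 over DER-encoded SubjectPublicKeyInfo bytes (RFC 7469)."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def public_key_pins(pins: Iterable[str], max_age: int) -> str:
    """Assemble a Public-Key-Pins header value from already-computed pins."""
    directives = [f'pin-sha256="{pin}"' for pin in pins]
    directives.append(f"max-age={max_age}")
    return "; ".join(directives)


# Placeholder strings, not real key hashes.
example = public_key_pins(["PRIMARY_PIN_BASE64=", "BACKUP_PIN_BASE64="], ONE_YEAR)
```

A browser that has noted such a header refuses connections to the host unless the presented chain contains a key matching one of the pins, which is why losing access to every pinned key locks users out until `max-age` elapses.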

-------------

Version info:

-------------

Chrome 51.0.2704.84

Firefox 47

LE Client (Certbot): 0.8.1

--------------------------

What should have happened:

--------------------------

We frankly don't know. As stated earlier, this vulnerability appears to be an emergent property of HPKP and new, free certificate authorities such as LE. However, as paid CAs continue to catch up and eventually add features such as rapid key rotation, this will soon be a much more common problem. For that reason, the issue should be mitigated before it becomes not just a viable class of attack (as it already is) but a common one.

On technical merits alone, this would normally be considered moderate severity. However, considering that this opens a new vector for extremely potent ransomware, we're of the position that business impact upgrades the severity.

Additionally, we're slated to disclose this finding at the following talk to be presented at both Black Hat and DEF CON: "Abusing Bleeding Edge Web Standards for AppSec Glory". This puts the earliest release date as either August 3rd or 4th.

----------------------------------------------------------

Existing Mitigations, Strategies, and Counter-Mitigations:

----------------------------------------------------------

To begin: This attack pattern is a non-starter without some mechanism to infect the target. However, once the target has been compromised, this strategy can be used to quickly and effectively extort thousands of dollars.

Touching on specific mitigations present in affected technologies: The Chrome team has implemented a max-max-age of 60 days on HPKP. This inadvertently also reduced the impact of this vulnerability considerably for users of Chrome given that users of Chrome will now only be potentially ransomed for no longer than 60 days as of the version of Chrome (51) noted in this report. This is in reference to the following issue:

https://bugs.chromium.org/p/chromium/issues/detail?id=523654
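
As a rough illustration of the max-max-age mitigation described above, the following Python sketch parses the `max-age` directive out of a `Public-Key-Pins` header value and clamps it to 60 days. It is not Chrome's or Firefox's actual implementation; the function name and the decision to return `None` for a missing or malformed directive are assumptions made for the example.

```python
from typing import Optional

MAX_MAX_AGE = 60 * 24 * 60 * 60  # 60 days in seconds (5184000)


def effective_max_age(header_value: str, cap: int = MAX_MAX_AGE) -> Optional[int]:
    """Return the pin lifetime a client would honor, or None if unusable."""
    for directive in header_value.split(";"):
        # partition() splits on the first "=", so base64 padding in pins is harmless.
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() != "max-age":
            continue
        try:
            requested = int(value.strip().strip('"'))
        except ValueError:
            return None
        if requested < 0:
            return None
        return min(requested, cap)
    return None  # no max-age directive present
```

With this cap, a header requesting a year (`max-age=31536000`) is honored for 5184000 seconds at most, while shorter lifetimes pass through unchanged.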

Furthermore, it's understood that a victim may counteract the attack by publicizing remediation steps on various social media platforms, etc. to clear the ransom state for its users. We'll discuss in Aggravating Factors why this might not be effective.

Finally, one suggested mitigation made within the HPKP spec itself (https://tools.ietf.org/html/rfc7469#section-4.5) is the use of Certificate Transparency (https://tools.ietf.org/html/rfc6962). This may potentially work in discovering attacks should an interested party be actively monitoring certificate transparency logs (per https://tools.ietf.org/html/rfc6962#section-7.2), but this may not be effective at mitigating attacks against high-traffic properties timeboxed within, for instance, a 24 hour window.

--------------------

Aggravating Factors:

--------------------

Both Chrome and Firefox enforce HPKP in Incognito/Private Browsing modes in _blatant violation_ of the privacy-first threat model for Incognito/Private Browsing modes. In doing so, it's worth noting that site lock-outs and ransoms will persist even when lay users switch to Incognito. This references Firefox bug https://bugzilla.mozilla.org/show_bug.cgi?id=1242226, previously opened by Ryan Lester.

Speaking to publicizing remediation steps to a general audience: a tactically priced ransom may make payment of the ransom more cost effective than attempting to teach lay users how to reset their list of pinned keys, so we anticipate that RansomPKP attacks charging no more than a few tens of thousands of dollars for a wide spread of large targets will still succeed. In the course of a more thorough attack where interfaces with the general public are also disabled, an attacker may be able to extort orders of magnitude more from the target.

Finally, LE inconsistently handles rate limiting, but even with rate limiting in place, the fact that 20 certificates can still be issued per week for a given domain makes this attack extremely potent.
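
To put the 20-per-week figure in perspective, the arithmetic can be sketched in a few lines of Python. The only input taken from the report is the limit of 20 certificates per week per domain; the function names and the 60-day comparison window are assumptions chosen for illustration.

```python
CERTS_PER_WEEK = 20          # figure quoted in the report
HOURS_PER_WEEK = 7 * 24      # 168


def hours_between_issuances(per_week: int = CERTS_PER_WEEK) -> float:
    """Average spacing between certificates if the weekly limit is fully used."""
    return HOURS_PER_WEEK / per_week


def issuances_in(days: int, per_week: int = CERTS_PER_WEEK) -> int:
    """Whole certificates obtainable over a span of days at the weekly limit."""
    return days * per_week // 7
```

At 20 per week a new certificate is available on average every 8.4 hours, or 171 over a 60-day pin lifetime. A limit of one per week would stretch that spacing to 168 hours.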

--------------------

Potential Solutions:

--------------------

In short: there needs to be a better long-term process for validating whether a domain has been enrolled for LE. Likewise, changes should be considered to HPKP implementations, such as universal adoption of 60 day max-max-age (currently in place with Chrome) as well as the banishment of--or potential to bypass via user action--HPKP in Incognito/Private Browsing modes.

Immediate correction: allowing sites to opt out of LE should mitigate the risk to major properties or to properties which explicitly wish not to use LE certificates in the short term.

Additionally, lowering the rate limit to only a handful of certificates per week, if not only one per week or even one per month, may be enough to mitigate risks posed to larger institutions with better detection capabilities. It may be worth requiring an extra step of authentication to enable the current rate limits on key rotation.

In depth: While this issue can be handled in a number of ways, the following is the most conservative mitigation proposal we were able to come up with, with particular emphasis on keeping UX changes and implementation work on LE/EFF's part to a minimum:

1. Leverage DNS as proof that the true owner of any given domain actually wants a LE certificate. Simplest approach would be to require some arbitrary global constant in a TXT record for every domain included in a cert request, e.g. "LETS_ENCRYPT". With this, an attacker can't start generating a fresh LE cert for a website they don't own just by compromising the box; they'd also need control of its DNS.

2. For additional assurance, allow anyone to explicitly blacklist their domain from LE. This could be done with a command like `certbot blacklist -d cyph.com --email ryan@cyph.com`. On LE's end, this blacklist would just be a server-side list of domains with associated email addresses. If a cert with a blacklisted domain included were ever requested, it would be automatically rejected.

 — a. To remove a domain from the blacklist (`blacklist --remove`?), a confirmation link received via email would have to be clicked. The user initiating the blacklist reversal would have the option to receive the link at 1) the originally specified recovery address, 2) the admin address specified in the domain's whois info, or 3) any of a small set of predefined constants (e.g. letsencrypt@cyph.com and security@cyph.com).

 — b. See 3(b).

3. Make the following changes to the registration/recovery contact email address logic:

 — a. If you aren't willing to disable `--register-unsafely-without-email` entirely, at least highly discourage it (more so than you already are) with a big scary warning about the RansomPKP risk.

 — b. Check IP addresses of A and MX records for the address hostname; if there's a match, reject it and have certbot display an explanation that using an email account hosted on the same machine as the web server effectively circumvents LE's 2FA requirement and opens them up to a new class of ransomware (we can provide a link to include for more info on RansomPKP post-disclosure).

4. Add in a configurable cancellation window / time delay for cert generation (default to maybe a day or so?), with logic in the timing of automated renewals to account for whatever that delay is set to. When a certificate is requested for an FQDN whose root domain has already had itself or any of its subdomains issued a cert by LE, the previous email address on record for that domain is sent a link to cancel the request. If the cancellation link isn't clicked within the previously specified window, the certificate is issued.

 — a. There'd be no delay for a cert request wherein every domain included has never previously been issued a cert by LE.

5. Further to #3, it would be ideal (but maybe not necessary) if, after an actual issuance, another email were sent out to the same email address with a link to reverse the issuance (revoke cert on LE's end + blacklist the domain + restore the previous cert on the affected machine). This link should expire at the same time as the previous cert.

 — a. Not sure whether this is already the case, but it might be helpful for the automated renewals to occur a few days earlier than necessary so as to not make this email seem pointless. Alternatively, only send this email when a cert is issued well before the previous one's expiration date and/or the key pair used changes.
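
To make the interaction between proposals 1, 2, and 4 concrete, here is a minimal Python sketch of a CA-side issuance decision. Everything in it is hypothetical: `CaState`, `decide`, `naive_root`, and the injected `txt_lookup` callable are names invented for this example and do not describe Let's Encrypt's actual code, and the root-domain logic is deliberately naive.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Iterable, List, Set

OPT_IN_TOKEN = "LETS_ENCRYPT"  # the example TXT constant from proposal 1


class Decision(Enum):
    REJECT = "reject"
    DELAY = "delay"    # hold for the cancellation window and notify the prior contact
    ISSUE = "issue"


@dataclass
class CaState:
    """Hypothetical server-side state; not a real Let's Encrypt structure."""
    blacklist: Set[str] = field(default_factory=set)      # proposal 2
    issued_roots: Set[str] = field(default_factory=set)   # proposal 4


def naive_root(fqdn: str) -> str:
    """Last two labels only; a real system would need the Public Suffix List."""
    return ".".join(fqdn.lower().rstrip(".").split(".")[-2:])


def decide(domains: Iterable[str],
           txt_lookup: Callable[[str], List[str]],
           state: CaState) -> Decision:
    names = [d.lower().rstrip(".") for d in domains]
    for name in names:
        if name in state.blacklist:
            return Decision.REJECT            # proposal 2: blacklisted domain
        if OPT_IN_TOKEN not in txt_lookup(name):
            return Decision.REJECT            # proposal 1: no DNS opt-in
    if any(naive_root(name) in state.issued_roots for name in names):
        return Decision.DELAY                 # proposal 4: cancellation window applies
    return Decision.ISSUE                     # proposal 4(a): nothing issued before
```

Passing the DNS lookup in as a callable keeps the sketch free of any particular resolver library and makes the decision logic easy to exercise with canned records.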

-----------------

Bug Report Notes:

-----------------

Tim Willis (@google), per request, this should be used to generate a new bug report within the Chrome team. Please put bryant@zadegan.net (Bryant's personal address) as the primary, ryan@cyph.com (Ryan's business address, copied) as the secondary.

- Note for the sake of the report that this is a Security issue which requires confidentiality.

- Note for the sake of the report that this affects any Chrome build which supports HPKP. Releases prior to 51 are more significantly affected due to the unrestricted max-max-age.

Daniel Veditz (@mozilla), this should be reasonably shoehorned into the stub bug which you created, https://bugzilla.mozilla.org/show_bug.cgi?id=1279662. Please add bugzilla@zadegan.net (Bryant's bugzilla-specific alias) for access, with hacker@linux.com (Ryan's personal address) as secondary.

- Note for the sake of the report that this is a Security issue which requires confidentiality.

- Note for the sake of the report that this affects any Firefox build which supports HPKP.

--------------------

Disclosure Timeline:

--------------------

This finding will be disclosed as early as August 3rd and as late as August 4th, 2016 given that it's the centerpiece of the talk "Abusing Bleeding Edge Web Standards for AppSec Glory," accepted to both Black Hat and DEF CON this year. Much of our time was spent cementing the validity of the find, but we're now entirely confident that this can be used as described to mount effective ransomware campaigns against sites vulnerable to any variety of traditional injection attacks. We feel that the immediate mitigation of allowing sites to blacklist themselves from LE is a sufficiently simple mitigation to impose within the next 48 days between now and the earliest date of public disclosure.
Depends on: 1242226
Is killing a site for "only" 60 days better than a year in practice? Either way users are going to have to figure out how to recover and we don't have a good story for that. That said, LE certs don't live for a year and it doesn't make sense to allow an HPKP max-age that's longer than the expiration date on the cert. It's true that sites might use the same key on a re-issued cert but we shouldn't count on that being the case.
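
A minimal sketch of the idea in the comment above, assuming a client that knows the serving certificate's expiry: honor a pin for no longer than the certificate remains valid, and never longer than a fixed ceiling. The function name and the 60-day default are assumptions for illustration, not a description of what Firefox does.

```python
from datetime import datetime, timedelta, timezone

SIXTY_DAYS = 60 * 24 * 60 * 60  # assumed ceiling, in seconds


def bounded_max_age(requested: int, now: datetime, cert_not_after: datetime,
                    hard_cap: int = SIXTY_DAYS) -> int:
    """Clamp a requested max-age to the certificate's remaining lifetime and a cap."""
    remaining = max(0, int((cert_not_after - now).total_seconds()))
    return min(requested, remaining, hard_cap)
```

The trade-off raised in the comment still applies: a site that reuses its key on a re-issued certificate would see its pins lapse earlier than it asked for.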
Depends on: 1280417
Filed bug 1280417 about comment 2 so it's not a distraction to the main problem here.
Fair enough. We're actually not aware of how many bugs might be opened as a result of this; we just know that as of right now, the current state of HPKP combined with LE's capabilities is what enables this attack.

We would've just sent this to Let's Encrypt, but I know that Cyph uses DigiCert's rapid key rotation function in production right now, so it's only a matter of time before this attack is something paid CAs will facilitate as well.

But you're right. Killing a site for 60 days is still bad, but at least it's not life-of-certificate bad, which in this case appears to be 90 days for Let's Encrypt.

P.S. Might we have access to bug 1280417 as well?
Summary: Sweet indeed cheesy seated poofs (HPKP vuln placeholder) → "Abusing Bleeding Edge Web Standards for AppSec Glory" (HPKP DOSsing)
Alias: RansomPKP
> 1. Leverage DNS as proof that the true owner of any given domain actually wants a LE certificate. Simplest approach
> would be to require some arbitrary global constant in a TXT record for every domain included in a cert request, e.g.
> "LETS_ENCRYPT". With this, an attacker can't start generating a fresh LE cert for a website they don't own just by
> compromising the box; they'd also need control of its DNS.

The CAA spec may be relevant, and AFAIK Let's Encrypt honors CAA records. Of course, if you're one of the millions of sites that is itself using an LE cert, then you may not be able to use CAA. I suppose you could remove the record whenever you needed to renew and then add it back.
Meant to include the spec link: https://tools.ietf.org/html/rfc6844
As far as I can tell, CAA isn't actually a requirement for using Let's Encrypt. It certainly simplifies the solution (Ryan and I worked on this without having known about or found the RFC), but a site would essentially be obligated to implement CAA in order to defend itself from Let's Encrypt being used in an attack against it.

On top of that, it wouldn't seem to defend against the scenario of a site already using LE being attacked. 

The simplified solution (to at least stop the vector enabled by Let's Encrypt) would be for Let's Encrypt to check for CAA RRs permitting issuance before even allowing the certificates to be issued.

Thoughts?
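
The difference between the default CAA behavior in RFC 6844 and the stricter opt-in check suggested above can be sketched in Python. Records are modeled as simple `(tag, value)` pairs that a resolver has already fetched; the function name and the `require_opt_in` flag are assumptions for this example, and the sketch ignores details such as climbing the DNS tree and the `issuewild` tag.

```python
from typing import Iterable, Tuple


def caa_permits(records: Iterable[Tuple[str, str]], ca_domain: str,
                require_opt_in: bool = False) -> bool:
    """Decide whether ca_domain may issue, given a name's relevant CAA records."""
    # Keep only the issuer domain from each "issue" value, dropping parameters.
    issuers = [value.split(";")[0].strip().lower()
               for tag, value in records if tag.lower() == "issue"]
    if not issuers:
        # RFC 6844 default: no "issue" property means issuance is unrestricted.
        # The stricter proposal treats the absence of an opt-in as a refusal.
        return not require_opt_in
    return ca_domain.lower() in issuers
```

Under the default, a domain with no CAA records is fair game for any CA. With `require_opt_in=True`, issuance would need an explicit `issue "letsencrypt.org"` record, which means an attacker who controls only the web server could not obtain a fresh certificate.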
Update for the thread:

We have two other discussion threads going on at the same time, one with LE and the other with Google.

With regards to LE, there was a proposed simplification of RansomPKP by Jacob Hoffman-Andrews which would rely on establishing a TLS relay on the compromised machine and ending the TLS connection at an attacker-owned endpoint, allowing an attacker to forgo Rapid Key Rotation by maintaining control of just one key (the Ransom key). We've whiteboarded this alternate proposal and believe it's viable, though for a number of reasons which we can enumerate later but which mostly reduce down to the resources needed and resulting ROI, we disagree that this version of the attack is as effective as the RKR version. Still, the point was made that there exist potentially multiple ways to successfully use HPKP to hold a site's userbase for ransom, something not at all considered during the drafting of HPKP beyond a mere DoS via hostile pinning. Additionally, it still points to risks created by a lack of human validation (or at least a secondary authorization channel) of a Let's Encrypt request. 

LE just now concluded with "Wontfix," and although we believe that any number of fixes by LE could substantially reduce the near-term potential for harm enabled by the service (such as dramatically reducing the rate limit, among others listed in the original email), it's within their right to assert that LE won't take action.

Since Ryan and I have an attack which clearly works and have no obvious mitigation paths in front of us in light of LE's "Wontfix"... we're out of ideas for mitigating the attack enabled by LE.

Google appears to be treating this seriously and has at least assigned a team to research the issue, so my hope is that *some* kind of mitigation can be implemented either before or soon after the talks are given, even if that doesn't come from LE. Looking forward to insights from the Mozilla side, though I still steadfastly believe that LE has primary mitigation responsibility here and would prefer the LE team to "see the light" so to speak.
(This seems to be sort-of a meta bug, so I'm marking it as P3 in accordance with new bug triage guidelines.)
Priority: -- → P3
(In reply to David Keeler [:keeler] (use needinfo?) from comment #10)
> (This seems to be sort-of a meta bug, so I'm marking it as P3 in accordance
> with new bug triage guidelines.)

Apologies for the belated reply; this works for us. On that note, what are we looking at in terms of timelines and potential mitigations as a result of the P3 designation?
We've implemented a 60-day max-max-age to match the spec and Chrome in bug 1285052. Currently that will ship in Firefox 50 (November?) but I'll see if we can move that up. Firefox 48 (released right before Black Hat) would be ideal but that will require special dispensation.
Depends on: 1285052
Keywords: sec-other
Flags: sec-bounty?
Dan, is there anything else to do here?
Flags: needinfo?(dveditz)
I guess not. And the talk went public at BH/DC so we should be able to unhide this, right?
Flags: needinfo?(dveditz)
Group: crypto-core-security

HPKP was disabled by default in bug 1412438.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE