Bug 1701192 (Closed) - Opened 3 years ago, Closed 3 years ago

Size limit of SiteSecurityServiceState.txt and Firefox cache partitioning make HSTS unreliable

Categories: Core :: Security: PSM, defect, P2
Firefox 87, defect

Tracking

Status: RESOLVED FIXED
Target Milestone: 91 Branch
Tracking Status: firefox91 --- fixed

People

(Reporter: fbausch, Assigned: keeler)

References

Details

(Whiteboard: [psm-assigned])

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0

Steps to reproduce:

By using Firefox as a normal user, I filled the HSTS cache in SiteSecurityServiceState.txt. (I've been using the affected profile for about 2 years, but I'm not sure when the cache reached the limit.) As soon as the SiteSecurityServiceState.txt file reaches 1024 lines, new entries are ignored.

Actual results:

By chance I realized that a website I re-visited was not forced to HTTPS, although the HSTS header was set.
It was really hard to find the reason for this behavior because Firefox does not log the eviction of HSTS entries from SiteSecurityServiceState.txt once it has been filled up to the limit of 1024 lines.
The reason for this behavior is that the algorithm decides to drop the newly added HSTS entry if the cache already contains 1024 entries.
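Below is a minimal Python sketch of the behavior described above (purely illustrative; it is not the actual PSM implementation, and the store layout is a simplification):

# Illustrative model of the reported behavior, not Firefox's actual code:
# once the HSTS store already holds MAX_ENTRIES entries, the *newly* observed
# header is the one that gets discarded, so the new site never becomes protected.

MAX_ENTRIES = 1024  # observed limit of SiteSecurityServiceState.txt

def remember_hsts(store: dict, host: str, header: str) -> None:
    if host in store:
        store[host] = header          # refreshing an existing entry still works
    elif len(store) < MAX_ENTRIES:
        store[host] = header          # room left: the new entry is persisted
    # else: store is full -> the new entry is silently dropped

store = {f"site{i}.example": "max-age=31536000" for i in range(MAX_ENTRIES)}
remember_hsts(store, "new-site.example", "max-age=31536000")
assert "new-site.example" not in store   # HSTS for the new site was never recorded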

Recently, Firefox additionally added cache partitioning. Now, in the case of CDNs and similar services that set the HSTS header, each site that uses a CDN causes an additional HSTS entry for that CDN. Cache partitioning therefore inflates the HSTS cache even further, so the limit of 1024 entries is reached even more quickly.

My HSTS file currently has 1024 entries:
wc -l ./.mozilla/firefox/<id>.default/SiteSecurityServiceState.txt
1024 ./.mozilla/firefox/<id>.default/SiteSecurityServiceState.txt

and has a size of 81 KB. So the 1024-entry limit cannot be motivated by size constraints; the file could easily be 10 or even 100 times larger.

Expected results:

Using Firefox as a normal user over some months or years should not result in a state where HSTS is de facto silently disabled.

HSTS is an important security mechanism that is advocated everywhere. Site admins are encouraged to use it, and more and more websites make use of it.
Because the cache in SiteSecurityServiceState.txt is limited to 1024 entries and users are not informed when it fills up, users get a false sense of security while browsing.

While this has no immediate security impact, I think this behavior makes browsing less secure while giving a false feeling of security.

The Bugbug bot thinks this bug should belong to the 'Core::Networking: Cache' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Networking: Cache
Product: Firefox → Core
Component: Networking: Cache → Security: PSM

I did some research on this topic and asked my colleagues to run
wc -l SiteSecurityServiceState.txt

I got 12 responses from colleagues who use Firefox on a daily basis. 10 of those 12 colleagues have the HSTS cache maxed out at 1024 entries. The other 2 have been using their Firefox profiles for only a few months, but have also already reached about 800 entries.

This means, from my perspective, that the size limit of 1024 entries is not sufficient to make HSTS reliable in Firefox.

Or in other words: This is not an edge case; HSTS breaks after some months if you use Firefox frequently.

We can increase the limit, but that doesn't ultimately solve the problem. I'm wondering if we should not partition HSTS but only note hosts as HSTS if they're loaded in a first-party context. That way, third parties can't change HSTS state as a tracking vector. Christoph, does that make sense? Or am I misunderstanding the concerns with respect to partitioning and HSTS?

Flags: needinfo?(ckerschb)

From my perspective, there are several things that can be done:

  • Increase the size of the cache: I think 1024 entries are not sufficient at all. A lot of (maybe most) pages set a max-age of one or two years. Simplified, this means I can only visit 1024 sites per year without running into problems with HSTS. The RAM and disk usage of a cache that is 10 or 100 times larger should not have a great impact on today's computers.
  • Change the algorithm that evicts data from the cache: The main issue is that new entries are dropped immediately as soon as the size limit is reached. Potential alternatives: evict the oldest entry or evict a random entry (see the sketch after this list). So far, if I visited a website that sets a max-age of 5 years several times last week but will never visit it again, it will occupy at least one cache entry for the next 5 years and will not be evicted by the current algorithm. With another algorithm, it would eventually be removed.
  • Collect telemetry about the size of the cache to get an idea of what percentage of users actually hits the size limit. That would also make it possible to find out whether the mitigations have a notable effect.
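As a hedged illustration of the second point above (evicting an existing entry instead of rejecting the new one), here is a small Python sketch; representing an entry as "host -> day of last visit" is an assumption made for the example, not the real file format:

MAX_ENTRIES = 1024

# Illustrative "evict the least recently seen entry" policy (not Firefox code).
# store maps hostname -> day of last visit (larger = more recent).
def remember_hsts(store: dict, host: str, today: int) -> None:
    if host not in store and len(store) >= MAX_ENTRIES:
        oldest = min(store, key=store.get)   # entry with the oldest last-visit day
        del store[oldest]                    # make room for the new host
    store[host] = today                      # the freshly seen host is always kept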

I do not think that any HSTS header should be ignored just to keep the HSTS cache from filling up. Instead, an adequate cache size should be chosen so that evictions due to size constraints are only necessary in rare cases (e.g. a targeted attack on the HSTS cache, where the limit only needs to prevent the cache from filling up disk space or DoSing the UA).

I thought about another algorithm to decide which entry to evict from SiteSecurityServiceState.txt:

  • For each entry, compute the number of days since the last visit (today minus the 3rd column in SiteSecurityServiceState.txt).
  • For each entry, compute the days since the last visit (3rd column) divided by the number of days with a visit (2nd column) plus 1: ratio = days_since_last_visit / (days_with_visit + 1). The +1 is necessary because the value is 0 at the first visit.
  • Evict the entry with the largest ratio.

Example:
Assuming there are those entries in the cache (ratio in brackets):

  • 1.example.com visited today for the first time. (0)
  • 2.example.com visited every day for the last year (365 times), last visit was yesterday. (0.0027)
  • 3.example.com visited once a month for the last year (12 times), last visit was 30 days ago. (2.5)
  • 4.example.com visited only once half a year (182 days) ago. (182)
  • 5.example.com visited 30 times, but the last time was 330 days ago. (11)

-> 4.example.com would be evicted.

This algorithm would first evict domains that were visited rarely and a long time ago, i.e. domains that are most likely never going to be visited again. At the same time, it honors domains that are visited regularly but not daily.
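A compact Python sketch of this heuristic (illustrative only; the two per-entry counters follow the description above, not any verified file format):

# Each entry: host -> (days_with_visit, days_since_last_visit).
# days_with_visit is 0 on the first visit, as described above.
def eviction_candidate(entries: dict) -> str:
    def ratio(host: str) -> float:
        days_with_visit, days_since_last = entries[host]
        return days_since_last / (days_with_visit + 1)
    return max(entries, key=ratio)            # the largest ratio gets evicted

entries = {
    "1.example.com": (0, 0),      # first visit today             -> ratio 0
    "2.example.com": (364, 1),    # daily for a year              -> ~0.0027
    "3.example.com": (11, 30),    # monthly for a year            -> 2.5
    "4.example.com": (0, 182),    # single visit, 182 days ago    -> 182
    "5.example.com": (29, 330),   # 30 visits, last 330 days ago  -> 11
}
assert eviction_candidate(entries) == "4.example.com"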

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #3)

> We can increase the limit, but that doesn't ultimately solve the problem. I'm wondering if we should not partition HSTS but only note hosts as HSTS if they're loaded in a first-party context. That way, third parties can't change HSTS state as a tracking vector. Christoph, does that make sense? Or am I misunderstanding the concerns with respect to partitioning and HSTS?

Not allowing HSTS to be set by third parties is an option - I think Safari is currently following that strategy. I would like to loop in other folks who might have an opinion here. Obviously we need to do something, because maxing out the entries and thereby breaking HSTS is not sustainable.

Anne for web-compat and Steve for HSTS tracking vectors - any suggestions?

Flags: needinfo?(senglehardt)
Flags: needinfo?(ckerschb)
Flags: needinfo?(annevk)

Once bug 1672106 is fixed, we wouldn't have to care about third parties, as only top-level navigations would be a potential upgrade point. That seems like the most straightforward solution here. Otherwise, if we just disallow HSTS for third parties, we might end up increasing the number of insecure media requests.

Furthermore, if we are not concerned about the top-level redirect tracking vector described at https://github.com/mikewest/strict-navigation-security, we could consider collecting non-partitioned assertions from third parties for potential future top-level navigations. Might want to do some research first to see if that is worth it though.

Flags: needinfo?(annevk)

I tried to think through this again, and the benefit of allowing third parties to set HSTS is somewhat marginal because it's partitioned. Basically, you need two pages on site A that embed something from site B, where one uses HTTPS (and sets the HSTS entry) and the other uses HTTP (and gets upgraded due to the HSTS entry). And the latter case cannot be active mixed content, as that would be blocked before the HSTS upgrade kicks in, as I understand it.

So there's a question of how likely it is that a site A (which can consist of multiple origins, mind) inconsistently embeds B in a way where HSTS would end up having a positive effect for the user. It doesn't seem likely to me and I would be okay at this point with giving up third-party HSTS even if we don't fix bug 1672106 first.

Another option here would be increasing the limit from 1024 to an order of magnitude more, which would still not be close to the size of the preload list.

Dana, Tim, thoughts on either of those?

Speaking of the preload list, Tim, have we made sure that we are ignoring partitioning when doing preload lookups?

Flags: needinfo?(tihuang)
Flags: needinfo?(dkeeler)

I agree with your reasoning that allowing third parties to set HSTS is of limited utility due to partitioning. I'll go ahead and start working on changing that.

Assignee: nobody → dkeeler
Severity: -- → S2
Flags: needinfo?(dkeeler)
Whiteboard: [psm-assigned]

We don't partition the preload list, see here. And I generally agree with Dana. But I think maybe the algorithm in comment 4 is also something we could look at.

Flags: needinfo?(tihuang)

Thanks Tim! So the one thing we need to look out for is that we still end up supporting preload entries for third parties, even if we ignore HSTS there otherwise.

Flags: needinfo?(senglehardt)
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Priority: -- → P2
Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/54267d9f3d78
don't allow third-party loads to set HSTS state r=annevk,necko-reviewers,dragana

Backed out for causing mochitest failures on test_hsts_upgrade_intercept.html.

Push with failures

Failure log

Backout link

Flags: needinfo?(dkeeler)
Flags: needinfo?(dkeeler)
Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9bae0f6ea847
don't allow third-party loads to set HSTS state r=annevk,necko-reviewers,dragana
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch