CSP errors related to the recent about:blank change, possibly specific to iframes
Categories
(Core :: DOM: Security, defect, P1)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox-esr140 | --- | unaffected |
| firefox148 | --- | wontfix |
| firefox149 | --- | fixed |
| firefox150 | --- | fixed |
People
(Reporter: alternativerockingchair, Assigned: vhilla)
References
(Regression)
Details
(Keywords: regression)
Attachments
(4 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:148.0) Gecko/20100101 Firefox/148.0
Steps to reproduce:
I noticed that after updating Firefox to 148.0, one page on my local HomeAssistant instance started failing to load due to some CSP errors (shared below). The URI: https://homeassistant.my.domain/hacs/dashboard (replaced my actual domain name with "my.domain", but otherwise that's the path).
The page uses an iframe to display content from the Home Assistant Community Store (HACS). Apparently starting from a blank iframe, because it has no src attribute? The inspector just shows <iframe title="HACS" class="loaded"> for the iframe element (with contents inside).
Reproducible with a fresh profile, safe mode, etc. - doesn't seem to be related to addons at all.
As a point of reference, in the latest Chrome build (145.0.7632.160, Windows 10 x64), this page works fine.
I used mozregression to check between the 144 and 148 releases (since I wasn't certain what the "last good" version was, I went back a good way), and eventually it pointed me to this pushlog: https://hg-edge.mozilla.org/integration/autoland/pushloghtml?fromchange=37f1c536fdb89cd69039774a87f3ad8a94b95328&tochange=86c4f03a43058ea62f76021ad5a50e6199c34567
I realize that this may not be super helpful without an actual website where you can reproduce this yourselves, but I don't want to expose my HomeAssistant instance to the internet at large (and it requires a login to reach this page). I'm happy to provide extra details if it would help, run tests, etc. Worst case, I could probably spin up a dummy instance to share, but would still rather avoid that if possible.
Actual results:
Depending on page load timing (refresh vs. shift-refresh?), I'm seeing one of these two CSP errors occur when loading this page. From the Firefox console:
Content-Security-Policy: The page’s settings blocked a script (script-src-elem) at https://homeassistant.my.domain/frontend_latest/custom-panel.db002d67f98b98db.js from being executed because it violates the following directive: “script-src 'self' 'report-sample' 'unsafe-inline' 'unsafe-eval' https://cdnjs.cloudflare.com/ajax/libs/”
Content-Security-Policy: The page’s settings blocked a worker script (worker-src) at https://homeassistant.my.domain/hacsfiles/frontend/frontend_latest/sort-filter-worker.5d60c2ed348e2db0.js from being executed because it violates the following directive: “worker-src 'self' 'report-sample'”
In both cases, the link in the console to "view source in debugger" links to about:blank.
My nginx reverse proxy in front of HomeAssistant received CSP reports like these corresponding to the above in-browser errors, respectively:
{
  "csp-report": {
    "blocked-uri": "https://homeassistant.my.domain/frontend_latest/custom-panel.db002d67f98b98db.js",
    "column-number": 1,
    "disposition": "enforce",
    "document-uri": "about",
    "effective-directive": "script-src-elem",
    "original-policy": "default-src 'none'; script-src 'self' 'report-sample' 'unsafe-inline' 'unsafe-eval' https://cdnjs.cloudflare.com/ajax/libs/; style-src 'self' 'report-sample' 'unsafe-inline' https://cdnjs.cloudflare.com/ajax/libs/; worker-src 'self' 'report-sample'; img-src 'self' data: https:; font-src 'self' data: https://cdnjs.cloudflare.com; connect-src 'self' https://raw.githubusercontent.com/home-assistant/ https://brands.home-assistant.io/; manifest-src 'self'; form-action 'self'; object-src 'none'; base-uri 'none'; upgrade-insecure-requests; frame-ancestors 'self' moz-extension:; report-to csp-reports; report-uri https://csp-reports.my.domain/reports; frame-src 'self'; media-src 'self' blob:",
    "referrer": "",
    "status-code": 0,
    "violated-directive": "script-src-elem"
  }
}
{
  "csp-report": {
    "blocked-uri": "https://homeassistant.my.domain/hacsfiles/frontend/frontend_latest/sort-filter-worker.5d60c2ed348e2db0.js",
    "column-number": 1,
    "disposition": "enforce",
    "document-uri": "about",
    "effective-directive": "worker-src",
    "original-policy": "default-src 'none'; script-src 'self' 'report-sample' 'unsafe-inline' 'unsafe-eval' https://cdnjs.cloudflare.com/ajax/libs/; style-src 'self' 'report-sample' 'unsafe-inline' https://cdnjs.cloudflare.com/ajax/libs/; worker-src 'self' 'report-sample'; img-src 'self' data: https:; font-src 'self' data: https://cdnjs.cloudflare.com; connect-src 'self' https://raw.githubusercontent.com/home-assistant/ https://brands.home-assistant.io/; manifest-src 'self'; form-action 'self'; object-src 'none'; base-uri 'none'; upgrade-insecure-requests; frame-ancestors 'self' moz-extension:; report-to csp-reports; report-uri https://csp-reports.my.domain/reports; frame-src 'self'; media-src 'self' blob:",
    "referrer": "",
    "status-code": 0,
    "violated-directive": "worker-src"
  }
}
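For anyone triaging similar reports on their own report endpoint, the following is a minimal sketch of how the relevant fields could be pulled out of a report body like the ones above. The function name `summarize_csp_report` and the abridged sample are assumptions made for illustration; they are not part of nginx, Firefox, or HomeAssistant.

```python
import json
from urllib.parse import urlsplit


def summarize_csp_report(raw: str) -> dict:
    """Pull out the fields of a legacy `csp-report` body that matter here."""
    report = json.loads(raw)["csp-report"]
    blocked = urlsplit(report["blocked-uri"])
    document_uri = report["document-uri"]
    return {
        "directive": report["effective-directive"],
        "blocked_origin": f"{blocked.scheme}://{blocked.netloc}",
        "document_uri": document_uri,
        # The reports above carry the bare string "about" rather than a
        # full URL, so match on the prefix instead of parsing it.
        "from_about_document": document_uri.startswith("about"),
    }


# Abridged copy of the first report above (original-policy omitted).
SAMPLE = json.dumps({
    "csp-report": {
        "blocked-uri": "https://homeassistant.my.domain/frontend_latest/custom-panel.db002d67f98b98db.js",
        "disposition": "enforce",
        "document-uri": "about",
        "effective-directive": "script-src-elem",
    }
})

summary = summarize_csp_report(SAMPLE)
print(summary)
```

A report whose blocked origin is the site's own origin while `from_about_document` is true matches the pattern described in this bug.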
Expected results:
I wouldn't expect any CSP errors, since 'self' seems like it should allow loading the scripts that are being blocked, as both the blocked scripts and the page being loaded are under https://homeassistant.my.domain, i.e., should share the same origin.
Another possibly-useful bit of information: if I add https://homeassistant.my.domain to the script-src, worker-src, and connect-src directives, then everything seems fine. I ran into the connect-src breakage after adding this workaround to the other two - its errors were basically the same as the others, but I can share them as well if interested.
It seems to me like Firefox is using about:blank for "self" in some instances now where it didn't before? Or enforcing CSP rules where it didn't before. That could explain why the above workarounds help - they're allowing content from https://homeassistant.my.domain to be loaded even from about:blank.
I'm by no means an expert though. If this new Firefox behavior is actually correct now, I can try raising a bug with the HACS maintainers instead to find a solution for this, but this strikes me as surprising, and something that might break other websites too (though I haven't personally noticed any yet).
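The hypothesis above can be illustrated with a deliberately simplified model of 'self' matching. This is a sketch under assumptions, not Firefox's implementation: real CSP matching has additional rules (scheme upgrades, ports, paths), and the helper names `origin`, `self_allows`, and `host_source_allows` are made up for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return (scheme, host, port); about:blank yields ('about', None, None)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def self_allows(self_uri: str, resource_url: str) -> bool:
    """Simplified 'self': allow only if the resource shares the origin of self_uri."""
    return origin(self_uri) == origin(resource_url)


def host_source_allows(host_source: str, resource_url: str) -> bool:
    """Simplified explicit host source; it never consults the self URI."""
    return origin(host_source) == origin(resource_url)


SCRIPT = "https://homeassistant.my.domain/frontend_latest/custom-panel.db002d67f98b98db.js"

# 'self' resolved against the embedding page.
print(self_allows("https://homeassistant.my.domain/hacs/dashboard", SCRIPT))
# 'self' resolved against about:blank.
print(self_allows("about:blank", SCRIPT))
# The workaround: an explicit host entry in the directive.
print(host_source_allows("https://homeassistant.my.domain", SCRIPT))
```

In this model the first and third checks pass and the second fails, which is consistent with the workaround helping if 'self' is being resolved against about:blank.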
Comment 1•2 months ago
The Bugbug bot thinks this bug should belong to the 'Core::Audio/Video: Playback' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Updated•2 months ago
Comment 2•2 months ago
Set release status flags based on info from the regressing bug 543435
:vhilla, since you are the author of the regressor, bug 543435, could you take a look? Also, could you set the severity field?
For more information, please visit BugBot documentation.
| Assignee | ||
Comment 3•2 months ago
It's plausible that bug 543435 introduced a bug here. I spun up a HomeAssistant instance locally with HACS, but can't reproduce the issue in 148 or Nightly. It seems like a simple iframe, nested in the top document, with no sandboxing, and the iframe head contains no meta elements. It's in a shadow DOM, but that should be irrelevant here.
| Assignee | ||
Comment 4•2 months ago
I'm unsure whether the call nsresult rv = csp->SetRequestContextWithDocument(blankDoc); in nsDocShell::CreateAboutBlankDocumentViewer, as added by bug 543435, is correct, since it means we set mSelfURI to about:blank. But the same happens in Document::InitCSP. Before that happens, we init the policy container and parse frame-src: self against the old mSelfURI. The policy container comes from the embedder document. Maybe there is some edge case where the CSP is parsed later, after mSelfURI was changed.
If I add a CSP script-src: "self" via a meta tag within the about:blank itself, there are CSP violations that weren't there before bug 543435.
Comment 5•2 months ago
Can you please share the whole test case that changed? I do think this is the likely culprit. Our handling of 'self' in general is kind of broken (bug 1899512) and if we suddenly get a different mSelfURI that seems like the most likely cause.
In Document::StartDocumentLoad we have this hack for parsing CSPs before setting the mSelfURI, we are probably not hitting that code anymore?
| Assignee | ||
Comment 6•2 months ago
script.js doesn't need to exist; serve the directory via e.g. python -m http.server
Edit: No violation in Chrome, but I seem to have minimized it too far and now I get a violation in old Firefox too.
| Assignee | ||
Comment 7•2 months ago
(In reply to Tom Schuster from comment #5)
Can you please share the whole test case that changed? I do think this is the likely culprit. Our handling of 'self' in general is kind of broken (bug 1899512) and if we suddenly get a different mSelfURI that seems like the most likely cause.
Sorry, test.html shows what I mean, and the behavior differs from Chrome. But when I saw no violation in my build from before bug 543435, that was because I added the meta tag to the transient about:blank that gets thrown out on load and that we got rid of in bug 543435. So I don't think this is the issue.
Perhaps we want to EnsureIPCPoliciesRead() somewhere in CreateAboutBlankDocumentViewer too? But I don't know why policies should get to about:blank via IPC, it's not cross process after all.
| Assignee | ||
Comment 8•2 months ago
Thanks for the report! Could you try to attach logs and a profile?
With a new Firefox profile (or as few tabs and extensions as possible), go to about:logging, paste nsDocShell:4,CSPContext:4,CSPOrigin:4, click "start logging", reproduce the issue, and click "stop logging". A profile opens; click upload in the top right and paste the link here.
| Reporter | ||
Comment 9•2 months ago
Ah, my apologies! I should have been clearer - HomeAssistant on its own doesn't seem to send any CSP headers, and HACS does still work for me in 148 if I access my HomeAssistant instance directly. However, most of the time I access it through an nginx reverse proxy that I use to provide ACME TLS certs for all my self-hosted services, and that's where I introduced the CSP headers.
I don't see any built-in way to set arbitrary headers directly in HomeAssistant, so for you to reproduce this locally you'd probably also need to stick a reverse proxy in front of your local HomeAssistant instance and set the headers there. I'm happy to run tests though!
I gathered the logs you requested using a new profile. Did a quick find/replace on my domain in the .json and gzipped it back up - I assume that shouldn't hurt anything, but if so I could send you the original file directly. Uploaded here: https://share.firefox.dev/4syGKu4
Let me know if there's anything else I can do to help!
| Assignee | ||
Comment 10•2 months ago
Thank you!
Interesting, the CSP violation happens in the parent after the content about:blank finished loading and indeed, selfURI is about:blank.
| Reporter | ||
Comment 11•2 months ago
In case it helps, I thought to gather logs for both cases where I was originally seeing this. Not sure which I caught in the first profile so I captured both here, using the same new Firefox instance as before.
script-src-elem violation: https://share.firefox.dev/4dieJm2
- Open tab at https://homeassistant.my.domain/hacs/dashboard, then start logging
- Click the refresh button in Firefox
- Observe HACS fails to load - the whole right pane is empty; script-src-elem CSP violation in the console.
- Stop logging
worker-src violation: https://share.firefox.dev/4s4bsvt
- Open tab at https://homeassistant.my.domain/hacs/dashboard, then start logging
- Shift-click the refresh button in Firefox
- Observe HACS seems to load fine - it shows all the integrations, etc.
- Type anything in the filter field - no filtering occurs; worker-src CSP violation in the console.
- Stop logging
These behaviors are repeatable, so whatever is happening to cause the different outcomes maybe isn't timing related as I originally speculated, so much as related to the state of the page and how it's loaded. Anyway thanks for looking into it!
| Assignee | ||
Comment 12•2 months ago
The request does go through in the content, it's the parent that cancels it. Content and parent parse the CSP at different times with different selfURI. In the parent, the request is canceled and the CSP parsed right around a SERVICE_WORKER_FETCH_INTERCEPTION_DURATION_MS_2 marker.
With this, I finally got Claude to create a test case using a SW intercept that seems promising, i.e. fails since bug 543435 with a very similar profile.
Strangely, before bug 543435 the CSP is parsed with a similar timing, but the correct selfURI. I'm still debugging what changed. My guess is there was some timing change that caused a new variation of bug 1899512.
| Assignee | ||
Comment 13•2 months ago
I think before bug 543435, nsGlobalWindowInner::EnsureClientSource got called before mSelfURI was set to about:blank. It serialized the policy container and sent it to the parent while mSelfURI still pointed to the parent's URI. Then CSP checks in the child passed because the CSP was already parsed, while CSP checks in the parent passed because it used an incorrect mSelfURI?
| Assignee | ||
Comment 14•2 months ago
Here's the test from Claude that I broke down a bit.
- index.html is a pre-existing issue similar to the attached test.html, a shared worker from an about:blank causes CSP violations
- index3.html is the actual test case for this bug. Use the python script for the correct headers, go to localhost:8080/index3, reload to get a CSP violation.
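The attached Python script is not reproduced in this thread. As a rough stand-in, a minimal sketch of a static file server that adds a CSP header to every response could look like the following. The policy string, the `CSPHandler` and `make_server` names, and the choice of `127.0.0.1` are assumptions for illustration; port 8080 follows the URL mentioned above. Unlike the attachment, this sketch does not map `/index3` to `index3.html`.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical policy for illustration; the attachment's real policy may differ.
CSP = "default-src 'none'; script-src 'self'; worker-src 'self'; frame-src 'self'"


class CSPHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory with a CSP header on every response."""

    def end_headers(self):
        # Add the header just before the header block is flushed.
        self.send_header("Content-Security-Policy", CSP)
        super().end_headers()


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Create (but do not start) a local server using CSPHandler."""
    return ThreadingHTTPServer(("127.0.0.1", port), CSPHandler)


# To serve the current directory: make_server().serve_forever()
```

Overriding `end_headers` keeps the sketch short because both file responses and error responses pass through it.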
Comment 15•2 months ago
Vincent: Can you give us an assessment of the priority and severity for this bug? Thanks.
| Assignee | ||
Comment 16•2 months ago
To summarize the issue.
Bug 543435 initialized the CSP and added SetRequestContextWithDocument in nsDocShell::CreateAboutBlankDocumentViewer. This means that nsCSPContext::mSelfURI is changed earlier from the parent URI to the iframe's URI.
I don't remember why we added it there specifically. My guess is that, since the initial about:blank load no longer went through a channel, Document::InitCSP was no longer called and this initialization needed to happen somewhere.
nsGlobalWindowInner::EnsureClientSource is called from CreateAboutBlankDocumentViewer (through Embed, SetupNewViewer, Init, InitInternal, SetNewDocument, ExecutionReady) and calls SetPolicyContainer. It serializes / copies the policy container including mSelfURI and this is also propagated to the parent. The change introduced by bug 543435 changes what self is at that point.
I don't know the details about how the service worker is involved and under what circumstances a script load involves the parent. But when a same origin script is loaded in this about:blank, we somehow involve the parent for the service worker interception, de-serialize this policy container, parse the CSP and get a violation.
This largely seems like a consequence of bug 1899512, and I suspect it's a bug that mSelfURI was different between content document and client source.
- The best option would probably be to fix bug 1899512.
- Alternatively, we can consider restoring this mismatch by doing the CSP initialization / the SetRequestContextWithDocument call in nsDocShell::CompleteInitialAboutBlankLoad.
  - Try shows one WPT failure so far for this. I added that one in bug 2002073 and we would've failed it prior to bug 543435, so it might be ok to fail it again. https://treeherder.mozilla.org/jobs?repo=try&revision=467bdfb776f3b1a35fc45a950c96e85bcebf29b6
- Or, perhaps we can change what policyContainerArgs_ contains.
  - The simplest way would be to add another hack somewhere so that mClientSource->SetPolicyContainer sets selfURI to the inherited CSP's / parent's URI for initial about:blank.
  - Or we serialize the policy container's CSP's mPolicies too so that the parent doesn't re-parse the policies based on the wrong self.
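The ordering problem summarized above can be modeled with a small toy simulation. This is only a sketch under assumptions: `SerializedPolicy`, `parent_allows`, and `load_initial_about_blank` are hypothetical names that stand in for the serialized policy container, the parent-side re-parse, and the initial about:blank setup. None of them are Gecko APIs, and the origin comparison is far simpler than real CSP matching.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


def same_origin(a: str, b: str) -> bool:
    """Toy origin check on (scheme, host, port); about:blank has no host."""
    pa, pb = urlsplit(a), urlsplit(b)
    return (pa.scheme, pa.hostname, pa.port) == (pb.scheme, pb.hostname, pb.port)


@dataclass(frozen=True)
class SerializedPolicy:
    policy_text: str  # the unparsed policy, e.g. "script-src 'self'"
    self_uri: str     # what 'self' resolves against when the policy is re-parsed


def parent_allows(serialized: SerializedPolicy, request_url: str) -> bool:
    """Hypothetical parent-side check: re-parse and resolve 'self' via self_uri."""
    return "'self'" in serialized.policy_text and same_origin(serialized.self_uri, request_url)


EMBEDDER = "https://homeassistant.my.domain/hacs/dashboard"
SCRIPT = "https://homeassistant.my.domain/frontend_latest/custom-panel.db002d67f98b98db.js"


def load_initial_about_blank(set_self_before_serialize: bool) -> bool:
    """Return whether the parent would allow SCRIPT for the given ordering."""
    self_uri = EMBEDDER  # the policy container is inherited from the embedder
    if set_self_before_serialize:
        # Models SetRequestContextWithDocument running before serialization.
        self_uri = "about:blank"
    # Models the policy container being copied and sent to the parent.
    serialized = SerializedPolicy("script-src 'self'", self_uri)
    return parent_allows(serialized, SCRIPT)


print(load_initial_about_blank(set_self_before_serialize=False))
print(load_initial_about_blank(set_self_before_serialize=True))
```

In this model the request is allowed when serialization happens first and blocked when the self URI is switched to about:blank first. That corresponds to the option of moving the call later; the option of serializing the parsed policies would instead make the parent-side result independent of `self_uri`.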
| Assignee | ||
Comment 17•2 months ago
My guess is S3 based on the two existing reports. But there is the risk that some high-impact pages have a SW and frame-src: "self" or similar. I'll attach a patch and hope it can land quickly.
| Assignee | ||
Comment 18•2 months ago
Comment 19•2 months ago
It does feel quite error prone to me that we have the PolicyContainer both on the ClientSource and the Document. I don't really know the history of this set-up, but I suspect only having this in one place would be better. We also only sync from ClientSource to the Document.
| Assignee | ||
Comment 20•2 months ago
Bug 543435 landed Nov 24 and on Nov 25, bug 2001842 adjusted the test expectations for /content-security-policy/inheritance/frame-src-javascript-url.html from timeout to fail. The attached patch caused it to time out again, so I suspect bug 543435 accidentally improved / changed our behavior here. I'm confused why that test didn't show up in my numerous try runs, but a change here is plausible.
This test operates on a transient about:blank (from CreateAboutBlankDocumentViewer). Before bug 543435, SetRequestContextWithDocument hadn't happened yet at that point, so nsCSPContext::mInnerWindowID was that of the parent, and the test timed out because the expected CSP violation was raised on the parent window. Bug 543435 changed this: the CSP violation is raised correctly in the iframe, but somehow still with the wrong blockedURI. The attached patch keeps calling SetRequestContextWithDocument for the transient about:blank, but at a later time, so we end up with the CSP violation on the parent window again. I'm not quite sure yet why the call is too late, but the CSP is checked in the parent process. So likely, by the time we update mInnerWindowID, the wrong CSP was already sent to the parent. We could consider setting only mSelfURI later.
To avoid timeouts on CI, we can just fail immediately if we get the violation on the wrong window. The attached patch is a regression nevertheless, but maybe it's better than blocking requests incorrectly?
Comment 21•2 months ago
Comment 22•2 months ago
| bugherder | ||
| Assignee | ||
Comment 23•2 months ago
A fix landed in Firefox Nightly. Could you verify that your HomeAssistant instance works fine?
Comment 26•2 months ago
The patch landed in nightly and beta is affected.
:vhilla, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- See https://wiki.mozilla.org/Release_Management/Requesting_an_Uplift for documentation on how to request an uplift.
- If no, please set status-firefox149 to wontfix.
For more information, please visit BugBot documentation.
| Reporter | ||
Comment 27•2 months ago
(In reply to Vincent Hilla [:vhilla] from comment #23)
A fix landed in Firefox Nightly. Could you verify that your HomeAssistant instance works fine?
Yes! Thank you very much - all seems fine when I use the nightly build.
Comment 28•2 months ago
firefox-beta Uplift Approval Request
- User impact if declined/Reason for urgency: Content requests in about:blank iframes might be incorrectly blocked when service worker controlled.
- Code covered by automated testing?: yes
- Fix verified in Nightly?: yes
- Needs manual QE testing?: no
- Steps to reproduce for manual QE testing: See attached test. There are two reported bugs in the wild, but none that we know of on publicly accessible pages.
- Risk associated with taking this patch: low
- Explanation of risk level: Small change that restores previous behavior. That behavior was/is buggy in other, less severe ways though. If a CSP violation occurs for a transient about:blank in an iframe, it might be raised on the embedder.
- String changes made/needed?: No
- Is Android affected?: yes
| Assignee | ||
Comment 29•2 months ago
Original Revision: https://phabricator.services.mozilla.com/D286934
| Assignee | ||
Comment 30•2 months ago
Thanks for testing and reporting the bug!
Uplift request to beta is done. I'm unsure if this warrants a release uplift, I tend towards no based on the two reports we had and 149 ships soon.
Comment 32•2 months ago
| uplift | ||