Open Bug 1892069 Opened 5 months ago Updated 5 months ago

Retries of downloads can cause a different file to download from a website (esp. noticeable due to dom.block_download_insecure)

Categories

(Toolkit :: Downloads API, defect, P3)

Firefox 125
defect

Tracking


Tracking Status
relnote-firefox --- 125+
firefox-esr115 --- unaffected
firefox125 blocking disabled
firefox126 blocking disabled
firefox127 blocking disabled

People

(Reporter: git, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0

Steps to reproduce:

Using Firefox 125 and verified with a fresh copy of FF Nightly 127.0a1 (2024-04-17) (64-bit) on Fedora Linux 39.

Go to http://tools.buzzstream.com/link-building-extract-urls (Note the HTTP URL scheme.)

Paste in some HTML containing links, such as the following (the exact content doesn't matter):

<ol>
<li><h3><a href="https://www.example.org">ORG</a></h3></li>
<li><h3><a href="https://www.example.com">COM</a></h3></li>
<li><h3><a href="https://www.example.net">NET</a></h3></li>
</ol>

Press the "Create CSV" button.

Observe that the website's own HTML is downloaded, rather than a CSV.

Then go to about:config and set dom.block_download_insecure to false.

Repeat the above steps, but now observe that a CSV file is downloaded.

Actual results:

The webpage downloaded its own HTML, rather than the CSV file it was meant to create.

Expected results:

Even over HTTP, the correct file should have been downloaded.

Attached video shows reproduction steps.

I should note that the website works fine with an HTTPS scheme, so presumably it is the blocking of HTTP downloads that is breaking it.

The Bugbug bot thinks this bug should belong to the 'Core::DOM: Core & HTML' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → DOM: Core & HTML
Product: Firefox → Core
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(ckerschb)
Keywords: regression
Regressed by: 1877195

James, thanks for filing the bug and reporting this problem. I can reproduce this on my machine.

@Gijs I am having a hard time wrapping my head around this. Within Bug 1877195 we are basically just updating ClassifyDownload() and, similar to before, returning nsITransfer::DOWNLOAD_POTENTIALLY_UNSAFE in case it's not a potentially trustworthy URI (see the actual code). However, that shouldn't interfere with any of the actual download machinery and shouldn't cause the webpage HTML to appear in the downloaded *.csv instead of the expected values.

I guess I need a little support in investigating what's going on here. Can you help a little?

Flags: needinfo?(ckerschb) → needinfo?(gijskruitbosch+bugs)
See Also: → 1892115
Duplicate of this bug: 1892211
Duplicate of this bug: 1892115

The download in comment 0 is the result of a POST request.

In this case, the page posts TO ITSELF. Accordingly, the download URL is identical to the page URL (the form data is not in the URL).

When clicking Allow download, the Browser Console shows that instead of repeating the original POST request, Firefox performs a GET request for the same URL, which returns the wrong content.

If I use the Page Inspector to edit the form method to GET, and then repeat the test, the form data is added to the URL. In that scenario, when clicking Allow download, the extra GET request gets the correct response and Firefox is able to save the expected file.

Is it safe to repeat a blocked POST request? If not, is it possible to cache the response somewhere so it can be written to disk correctly?

Based on the reproducible URL (http://www.primefaces.org:8080/showcase/ui/file/download.xhtml) in bug 1892115, this is what I observed.

There are 3 http requests involved in a successful download case.

E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid=409e1 HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid= HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   Referer: http://www.primefaces.org:8080/showcase/ui/file/download.xhtml?jfwid=409e1
E/nsHttp   Cookie: JSESSIONID=node08gve5yr2fnhi1vkupnn1ez50l7520294.node0
E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid=42911 HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   Cookie: JSESSIONID=node08gve5yr2fnhi1vkupnn1ez50l7520294.node0; pf.initialredirect-42911=true

Note that the actual download seems to be completed in the third request (/showcase/ui/file/download.xhtml?jfwid=42911).

However, this is what happened in the failed case.

E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid=409e1 HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid= HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   Referer: http://www.primefaces.org:8080/showcase/ui/file/download.xhtml?jfwid=409e1
E/nsHttp   Cookie: JSESSIONID=node01826ckxg5nwr01l95asxidtxak7518176.node0
E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid=9ef2e HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   Referer: http://www.primefaces.org:8080/showcase/ui/file/download.xhtml?jfwid=409e1
E/nsHttp   Cookie: JSESSIONID=node01826ckxg5nwr01l95asxidtxak7518176.node0; pf.initialredirect-9ef2e=true
E/nsHttp   GET /showcase/ui/file/download.xhtml?jfwid=9ef2e HTTP/1.1
E/nsHttp   Host: www.primefaces.org:8080
E/nsHttp   Referer: http://www.primefaces.org:8080/
E/nsHttp   Cookie: JSESSIONID=node01826ckxg5nwr01l95asxidtxak7518176.node0; primefaces.download_ui_file_download=null

The fourth request is the one that retries the download, and it appears that the problem is in the Cookie header.
The cookie primefaces.download_ui_file_download was from the server's response to the third request:

E/nsHttp   HTTP/1.1 200 OK
E/nsHttp   Cache-Control: no-cache, no-store, must-revalidate
E/nsHttp   Pragma: no-cache
E/nsHttp   Expires: 0
E/nsHttp   Content-Type: image/jpg
E/nsHttp   Content-Disposition: attachment;filename="downloaded_boromir.jpg"; filename*=UTF-8''downloaded_boromir.jpg
E/nsHttp   Set-Cookie: primefaces.download_ui_file_download=true; Path=/showcase

Apparently, we might need to clear/reset some state before retrying the download.

I'm seeing a POST request on the PrimeFaces page as well (for the Download button, not the Ajax Download button).

Setting Fx126 as disabled, the regressor was backed out of beta

In either the POST or GET case, why is the site using a form? If they've put some kind of anti-CSRF or anti-bot token in a hidden field, and it's a one-time-use token or has a short expiration time, that could make it impossible to ever re-download such a file. Or maybe it's a digital asset that you've just purchased, and if you re-submit then you'll be purchasing it again. Sites shouldn't be doing e-commerce on insecure connections, but that doesn't mean 100% of them comply (PCI DSS rules forbid it, but western big-bank credit cards aren't the only payment systems in the world).

There are other potential reasons they've used a form that would be perfectly fine to cache and re-send.

We should not try to guess whether a transaction is safe to repeat. If the site is telling us not to cache it, we should assume asking a second time might get a different result. There are other reasons a site may say not to cache, but if it's explicitly a download (as in comment 8, Content-Disposition: attachment) that the user is expected to save, then they aren't doing it for privacy reasons!

Bug 1877195 was backed out from Release for Desktop 125.0.2 / Android 125.2.0 going out early next week. Additionally, a remote pref-flip hotfix (visible in about:studies as "HTTP download configuration") has been deployed to users of Desktop 125.0.1 to disable the new functionality to mitigate this issue until the dot release goes out.

This nobbled document downloads in my firm's intranet app and cost me a chunk of my evening off. I really don't see why all of these protections can't be subject to blanket exclusion (optionally at least) when the origin is a private IP address.

The bug is marked as blocking firefox127 (nightly). However, the bug still isn't assigned.

:hsinyi, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(htsai)

Setting Fx127 as disabled, the regressor was backed out of central

Flags: needinfo?(htsai)

Presumably related to GET instead of POST, a user reported on SUMO that one site reports HTTP Status 405 Method Not Allowed after clicking Allow download:

https://support.mozilla.org/questions/1445079

Added to the Fx 125.0.2 relnotes:

Reverted the changes to potentially untrusted download protection that were recently shipped in Firefox 125 due to unexpected problems reported by users. We plan to fix and re-enable these protections in a future release.

Afaict, the downloads code is currently unable to repeat the network request exactly as it was when retrying a download.
Downloads from the Web are handled by DownloadLegacy.sys.mjs, which implements nsITransfer. When the download is retried we create a DownloadCopySaver that will just use a new http channel. Many of the characteristics of the original download are lost at this point; we retain the url, the referrer, the auth header, the userContextId... not much more. To pass downloads around (and across sessions) we use a serializable object, which cannot really contain a complex request object.

There is a helper that Valentin found yesterday, called adjustChannel, which was introduced in Bug 915036 for the WebExt API and allows adjusting the download channel with a specific request method, post data and headers. Though that is currently only used by the WebExt API.
It is possible to set download.source.adjustChannel, when the download object is created, to a function setting certain characteristics (the requestMethod, postData and headers), and it will be used when retrying the download. I have a local patch doing that and it fixes the case in Bug 1892115.
BUT, setting adjustChannel makes the download not serializable, and not portable across sessions. Whether that's acceptable for POST-blocked pages is debatable.
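
For illustration only, here is a rough sketch of what such a retry-aware download source could look like, assuming the original request's method, body stream, content type and headers were captured when the download was blocked (originalURL, requestMethod, postData, contentType, headers and targetPath below are placeholder variables, not code from the actual local patch):

const { Downloads } = ChromeUtils.importESModule(
  "resource://gre/modules/Downloads.sys.mjs"
);

const download = await Downloads.createDownload({
  source: {
    url: originalURL,
    // DownloadCopySaver invokes adjustChannel on the freshly created
    // channel before the retry request is sent.
    adjustChannel: async channel => {
      channel.QueryInterface(Ci.nsIHttpChannel);
      if (postData) {
        // Re-attach the original request body; explicitSetUploadStream
        // also sets the request method for us.
        channel
          .QueryInterface(Ci.nsIUploadChannel2)
          .explicitSetUploadStream(postData, contentType, -1, requestMethod, false);
      } else {
        channel.requestMethod = requestMethod;
      }
      for (const [name, value] of headers) {
        channel.setRequestHeader(name, value, false);
      }
    },
  },
  target: { path: targetPath },
});
await download.start();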

To go a bit further, one could add requestMethod and postData to the serializable data instead of using adjustChannel. That would make POST downloads act properly across sessions.
I must note that once the download is considered "done" and moved to history, that serialized data is lost again. So if a download fails today and one retries it from history in a few days, it won't be the same. To also support that case, we should store additional metadata (method and postData) in history too.

Headers are even more complicated, as there may be many that are necessary and many that are not, or maybe some should also be removed to retry the download successfully, like in the case outlined here.
It's unlikely we can detect all those cases, so maybe we should just address the POST/postData problem and leave headers for the future?

These issues would normally affect failed downloads; failed POST downloads are probably rare enough that we didn't notice. Blocked downloads make the problem more frequent.

The problem with POST requests is that when you do them again they might not return the same contents anyway.
Let's assume there's an endpoint called /removeAndReturnData which downloads the data and removes it from the server. Even retrying with the correct method might not work. Quite the corner case, I know.

I think it's fine for these retries not to work across sessions, though it would be nice to confirm how other browsers handle this corner case too.

I think for insecure downloads we could approach this some other way, by not cancelling the channel and simply suspending it (for 1 minute). If the user chooses to retry, we can resume it. If they delete the download, or don't make any choice, we can cancel it. Though this might complicate the downloader / nsContentSecurityUtils::ClassifyDownload / nsExternalHelperAppService instead.
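
Roughly, the idea would look something like this (holdBlockedDownload and the one-minute timeout are illustrative, not existing code; the real change would live in the blocking path around ClassifyDownload / nsExternalHelperAppService):

const { setTimeout, clearTimeout } = ChromeUtils.importESModule(
  "resource://gre/modules/Timer.sys.mjs"
);

// Park the blocked channel instead of cancelling it outright.
function holdBlockedDownload(channel, timeoutMs = 60000) {
  channel.suspend();
  const timer = setTimeout(() => {
    // The user made no choice within the window: give up on the request.
    channel.cancel(Cr.NS_BINDING_ABORTED);
  }, timeoutMs);

  return {
    // "Allow download": resume the original request, so no second
    // (possibly different) request is ever made.
    allow() {
      clearTimeout(timer);
      channel.resume();
    },
    // The blocked download was deleted: cancel the original request.
    deny() {
      clearTimeout(timer);
      channel.cancel(Cr.NS_BINDING_ABORTED);
    },
  };
}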

(In reply to Valentin Gosu [:valentin] (he/him) from comment #19)

The problem with POST requests is that when you do them again they might not return the same contents anyway.
Let's assume there's an endpoint called /removeAndReturnData which downloads the data and removes it from the server.

Wouldn't that also be valid for GET? Off-hand I don't see why a server side script couldn't do the same.
I agree that a better approach would be to use a long pause, before canceling, so the channel remains the same. That will solve any short term issues, but we could also try to partially improve the long term (next day or next session) story.

Wouldn't that also be valid for GET? Off-hand I don't see why a server side script couldn't do the same.

GET requests are normally idempotent, so I assume they wouldn't return different content. At worst they should fail if you don't send authorization or cookies.

we could also try to partially improve the long term (next day or next session) story.

I agree. Some of these issues are not limited to dom.block_download_insecure

See Also: 1892115

Hello! We verified the issue with Firefox 125.0.2, 126.0b3 and 127.0a1 (2024-04-21) on Windows 10 x64, macOS 14 and Ubuntu 23 using steps from comment 0 and 1892115#c0.

I'm a bit late to this party because I was out on PTO when the incident happened.

I think Valentin and Marco have mostly answered some of the more obvious questions here.

One thing that I am somewhat intrigued by:

(In reply to Valentin Gosu [:valentin] (he/him) from comment #19)

The problem with POST requests is that when you do them again they might not return the same contents anyway.
Let's assume there's an endpoint called /removeAndReturnData which downloads the data and removes it from the server. Even retrying with the correct method might not work. Quite the corner case, I know.

I understand that this is the case for retries. But if we're blocking download requests because they're mixed content (pre-incident) or purely because they're over http (what the backed out patch did), I would expect us to block and fail before we hit the network, which makes all of this moot. Is that not the case? (perhaps because we check if the server redirects to https before complaining? Except if the risk we're trying to mitigate against is a MITM, then the attacker might be the one redirecting us to https controlled by the attacker, so that seems somewhat cold comfort...)

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(ckerschb)

Oh, and more generally when retrying POST requests from network pages, we have a dialog that asks the user to confirm that we re-submit values. That dialog needs a UX review, I expect, but perhaps we could do something similar in the download case?

(In reply to :Gijs (he/him) from comment #23)

I understand that this is the case for retries. But if we're blocking download requests because they're mixed content (pre-incident) or purely because they're over http (what the backed out patch did), I would expect us to block and fail before we hit the network, which makes all of this moot. Is that not the case? (perhaps because we check if the server redirects to https before complaining? Except if the risk we're trying to mitigate against is a MITM, then the attacker might be the one redirecting us to https controlled by the attacker, so that seems somewhat cold comfort...)

We block a request only if the request triggers a download. But we can't determine whether the request will trigger a download without hitting the network (for example, we have to check whether the response headers contain Content-Disposition: attachment).
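
Just to illustrate the point (this is not the actual C++ code path in nsExternalHelperAppService, only a sketch): the attachment disposition only becomes visible once response headers have arrived, e.g. from an http-on-examine-response observer.

const observer = {
  observe(subject) {
    const channel = subject.QueryInterface(Ci.nsIHttpChannel);
    let disposition = "";
    try {
      disposition = channel.getResponseHeader("Content-Disposition");
    } catch (e) {
      // Header absent: this response is not flagged as an attachment.
    }
    if (disposition.toLowerCase().startsWith("attachment")) {
      // Only at this point could a block/allow decision be made,
      // i.e. after the request has already hit the network.
    }
  },
};
Services.obs.addObserver(observer, "http-on-examine-response");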

(In reply to Masatoshi Kimura [:emk] from comment #25)

We block a request only if the request triggers a download. But we can't determine whether the request will trigger a download without hitting the network (for example, we have to check whether the response headers contain Content-Disposition: attachment).

Oh right, of course, not sure how I forgot this. :-(

Still curious if we think the prompt-before-resubmit would be helpful here.

Flags: needinfo?(ckerschb)

Would it be possible to hold the response/content in a temporary location? Then, if the user accepts the download, the response can be saved; otherwise it is dropped.

I'm not familiar with the implementation, but Firefox already writes a temporary .part file while downloading, so maybe this could be used: it would only be renamed when the user accepts the download, and deleted otherwise.

(In reply to :Gijs (he/him) from comment #24)

Oh, and more generally when retrying POST requests from network pages, we have a dialog that asks the user to confirm that we re-submit values. That dialog needs a UX review, I expect, but perhaps we could do something similar in the download case?

I think that is a good idea. The main issue here is that the POST request is resubmitted as a GET. We shouldn't do that without the user agreeing to it, or maybe not allow it at all.

Flags: needinfo?(valentin.gosu)
Component: DOM: Core & HTML → DOM: Security

Assigning but with low severity given this feature has been disabled.

Assignee: nobody → ckerschb
Severity: -- → S4
Whiteboard: [domsecurity-active]

As mentioned within Bug 1877195, Comment 31, we decided to not ship the HTTP download protection mechanism after all. The status quo remains that we will continue to block HTTP downloads within HTTPS pages as outlined within https://blog.mozilla.org/security/2021/10/05/firefox-93-protects-against-insecure-downloads/.

Nevertheless, the bug that surfaced when trying to ship Bug 1877195 and is described within this bug remains, and we should potentially fix it. However, I personally don't know how to do it, hence I am not a good assignee for the bug. Further, I think it's more of a 'download manager' bug, hence I am moving it over to DOM: Core & HTML again.

Assignee: ckerschb → nobody
Component: DOM: Security → DOM: Core & HTML
Whiteboard: [domsecurity-active]
Component: DOM: Core & HTML → Downloads API
Product: Core → Toolkit
Severity: S4 → S3
OS: Unspecified → All
Priority: -- → P3
Hardware: Unspecified → All
See Also: → 1820083
Summary: dom.block_download_insecure causes a different file to download from a website → Retries of downloads can cause a different file to download from a website (esp. noticeable due to dom.block_download_insecure)
Depends on: 1894667
No longer regressed by: 1877195
