Open Bug 1526731 Opened 9 months ago Updated 10 days ago

Saving a PNG image incorrectly uses WebP file extension

Categories

(Core :: Networking: HTTP, defect, P2)

65 Branch
defect

Tracking

()

Tracking Status
firefox-esr60 --- unaffected
firefox65 --- unaffected
firefox66 - wontfix
firefox67 - wontfix
firefox68 --- fix-optional
firefox69 --- fix-optional

People

(Reporter: yoasif, Unassigned, NeedInfo)

References

(Regression)

Details

(Keywords: regression)

As seen on https://www.reddit.com/r/firefox/comments/aoxp9d/webp_format_forced_on_me_and_disabling_it_causes/eg4erpk/

Starting with Firefox 65, images displayed as WebP are sometimes saved as PNG (likely because Firefox doesn't use the loaded image to save, but instead downloads from server) with the wrong file extension (webp).

This really annoys users who don't have a filetype handler for webp, even though the image is actually a PNG.

Steps to reproduce:

  1. Navigate to https://yugioh.fandom.com/wiki/Quintet_Magician
  2. Right click on the playing card image and choose "Save Image As..."
  3. Click save in the dialog that appears.

What happens:

The file is saved with filename "QuintetMagician-VB20-JP-UR.webp". Running "identify" (imagemagick) on the file reveals that the file is a PNG:

QuintetMagician-VB20-JP-UR-ff.webp PNG 300x438 300x438+0+0 8-bit sRGB 277681B 0.000u 0:00.009

Expected result:

The file is saved as: "QuintetMagician-VB20-JP-UR.png" since it is a PNG image. Chrome does this correctly.

Saving the file in Chrome vs. Firefox and running md5sum reveals that they are the same file.

This is likely related to not sending the conneg header for images when using "save as", and the server sending a different image under the same name.

Probably a duplicate of bug 1347958

If this is a duplicate of bug 1347958 then probably it would not be a new issue in 65.

I can Reproduce the issue on Nightly67.0a1 Windows10.
And right click on the image > View Image is also broken.

Regression window:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=1760e005cecc1f7eedd1e6e4c022d73375d12327&tochange=2a7dce7a42ea0527694c978e98d9af1687aecb36

Regressed by: Bug 1417463

:baku,
Your bunch of patch seems to cause the regression.
Can you please look into this?

Blocks: 1417463
Flags: needinfo?(amarchesini)

The problem is that the server returns 2 different images based on the 'accept' header values.
When the image is downloaded to be displayed, 'accept' header is set to: "image/webp,/"
But when the image is downloaded to be saved, 'accept' header is set to: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5"

We should be consistent and use the same accept header for any image request. The fetch spec suggests to use: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5": https://fetch.spec.whatwg.org/#fetching

Andrew, what do you think about changing imgRequest to use that accept header instead of what is set in pref image.http.accept?

Flags: needinfo?(amarchesini) → needinfo?(aosmond)

(In reply to Liz Henry (:lizzard) (use needinfo) from comment #2)

If this is a duplicate of bug 1347958 then probably it would not be a new
issue in 65.

Well, that depends -- seems to me like this was a change that exposed the same problem, i.e. not using image.http.accept for fetch requests.

Andrew, what do you think about changing imgRequest to use that accept header instead of what is set in pref image.http.accept?

Why not use image.http.accept for all image requests as suggested in bug 1347958 instead of using various hard-coded ones?

Any ideas on this one for 66, Gijs?

Flags: needinfo?(gijskruitbosch+bugs)

(In reply to Andrea Marchesini [:baku] from comment #4)

The problem is that the server returns 2 different images based on the 'accept' header values.
When the image is downloaded to be displayed, 'accept' header is set to: "image/webp,/"
But when the image is downloaded to be saved, 'accept' header is set to: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5"

We should be consistent and use the same accept header for any image request. The fetch spec suggests to use: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5": https://fetch.spec.whatwg.org/#fetching

Andrew, what do you think about changing imgRequest to use that accept header instead of what is set in pref image.http.accept?

I think there is another bug or two open on this topic; if I had to say anything, we should be using the fetch spec + image/webp so that the WebP detection works properly.

Flags: needinfo?(aosmond)

(In reply to Tim Spurway [:tspurway] from comment #6)

Any ideas on this one for 66, Gijs?

I don't know our image/networking code well enough to comment here. I'm moving this over to networking because that's where the regressing bug was. Looks like :baku and :aosmond more or less agree so re-pinging :baku to radar this so we can consider a fix for 66.

(In reply to Alice0775 White from comment #3)

And right click on the image > View Image is also broken.

I'd like to understand this better to make sure whatever we end up fixing actually fixes the entire issue - using 'view image' on beta shows me the same image (visually, at least - I haven't checked the bytes over the wire, though the page info dialog says it's a webp image) as on the page, as expected. Are you getting a broken image, or does reproducing this require disabling webp, or something else?

Component: File Handling → Networking: HTTP
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(amarchesini)
Flags: needinfo?(alice0775)
Product: Firefox → Core

Right click on the image > View Image Info
The image is WebP as expected.

However,
Right click on the image > View Image
And then Right click on the image > View Image Info

The image is now PNG. So I said 'broken'

Flags: needinfo?(alice0775)

I suspect there are several issues here:

  1. We send different Accept header values and the server answers with 2 different images. We must be consistent in how we use the Accept header. I think we have a good solution proposed in comment 7.

  2. The use of content-disposition to generate the file-name.

Reading the code, I see this:
https://searchfox.org/mozilla-central/rev/4587d146681b16ff9878a6fdcba53b04f76abe1d/toolkit/content/contentAreaUtils.js#144-145,152

In contentAreaUtils.js we retrieve the content-disposition from the image cache. The content-disposition is used to generate the file name (I guess). But then, the downloading of the image sends different headers then what we sent before - see my previous comment. Nothing guarantees that the server offers the same content when important headers (such as Accept) are sent differently. But more important, nothing guarantees that the server answers always at the same way in general. Maybe the second request contains a different content-disposition value...

I can work on the first part. Gijs, can you check if what I'm saying is correct for this second point? Thanks.

Flags: needinfo?(amarchesini) → needinfo?(gijskruitbosch+bugs)

(In reply to Andrea Marchesini [:baku] from comment #10)

I suspect there are several issues here:

  1. We send different Accept header values and the server answers with 2 different images. We must be consistent in how we use the Accept header. I think we have a good solution proposed in comment 7.

  2. The use of content-disposition to generate the file-name.

Reading the code, I see this:
https://searchfox.org/mozilla-central/rev/4587d146681b16ff9878a6fdcba53b04f76abe1d/toolkit/content/contentAreaUtils.js#144-145,152

In contentAreaUtils.js we retrieve the content-disposition from the image cache. The content-disposition is used to generate the file name (I guess). But then, the downloading of the image sends different headers then what we sent before - see my previous comment. Nothing guarantees that the server offers the same content when important headers (such as Accept) are sent differently. But more important, nothing guarantees that the server answers always at the same way in general. Maybe the second request contains a different content-disposition value...

I can work on the first part. Gijs, can you check if what I'm saying is correct for this second point? Thanks.

We don't hit the code you referenced because of e10s - there's no document because we're in the parent process. The content type and content disposition have been passed on through messaging before we get there. However, we do not pass that info to nsIWebBrowserPersist (and in fact, off-hand I don't see any relevant parameters where we could do so from the frontend). It's probably determined here: https://searchfox.org/mozilla-central/rev/4587d146681b16ff9878a6fdcba53b04f76abe1d/browser/actors/ContextMenuChild.jsm#504-524 but I haven't checked that.

It sounds to me like what you're getting at is that we shouldn't settle on a filename before we make the request and get a response. Unfortunately, that's not how the current code works; nsIWebBrowserPersist needs to know where it's saving things before we invoke it, so we have to prompt for a destination (or default to some "clever" filename guess in $defaultdownloaddirectory) before saving. Reworking all of that is definitely not upliftable, even if it's probably a good idea for the longer term...

Does that answer your question?

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(amarchesini)

(Alternatively, perhaps there's a way to force the webbrowserpersist to use the image cache for saving images, if it's caching the bytes we get over the network instead of some rendered representation?)

If you come up with something upliftable, let me know. Otherwise maybe we can aim for fixing this in 67 or even 68.

Gijs, Baku, is that something we should fix with a 67 uplift or do you estimate that the potential patch to uplift would be risky and it is better to let it slip to 68? Thanks

Flags: needinfo?(gijskruitbosch+bugs)

The fundamental issues here are that:

  1. we assume that we know what type of thing we're saving before we do the network request. This applies to both images and 'save page as'. Only for 'save link as' do we do some kind of "prefetch" request to see what content type and disposition the server returns, and prompt for a filename after. This has "always" (for the past 20 or so years) worked like this, and is difficult to fix. There is no strong owner for the file saving code, or people who are experts on it. A bunch of the stuff on the frontend side is a pile of helper functions that happen to mostly work today that nobody wants to touch and has little to no automated test coverage (plus, if it did, you'd have to rewrite most of the test to make a change like this). I don't think attempting to fix this in order to fix this bug is a realistic option.
  2. the image accept header code. I don't know this code and can't speak to it. :baku and :aosmond are the people to ask. This was what changed and surfaced this bug, though a similar thing could technically happen if there were other situations where we varied the accept header. I don't know why webp is not in the Accept header list for fetch, and if changing this would be likely to cause web compat issues or whatever. I also don't understand why we send "image/webp,/" for images we load in the page... why no other mime types and/or why send webp at all? Or under what circumstances we send that spec'd fetch header - you can use fetch() for non-image resources, so why does it even bother including images specifically? I also don't know what spec covers the webp stuff. Passing ni to :aosmond for all of this.
  3. the lack of cache usage. This is actually caused by the previous point: the site in comment #0 sends a "Vary: Accept" header for the image, and so now that we send a different accept header than we did the first time, necko correctly determines that it can't use the network cache for the image saving request.
  4. not having a way to specify what kind of URI we're persisting when calling nsIWebBrowserPersist::SavePrivacyAwareURI or SaveURI, and thus using the "standard" fetch headers for the save as request. This would be an API change, but would probably allow us to use a different accept header, if the right magic was invoked. I'm also not really the best person to work on this. The changes on the frontend would be relatively minimal: saveImageURL would have to pass the relevant flags to internalSave, which would have to pass them to nsIWebBrowserPersist. I don't know what would be required on that side of things to make things work correctly. Perhaps just passing a content policy for the request is enough, perhaps we need other things.

Hopefully that helps. I can't personally write a patch for this that would be upliftable to 67, I expect - unless just passing the right content policy type would be sufficient here (in which case we could probably fix this through (4) ).

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(aosmond)

Just a comment re: (2)

I don't know why webp is not in the Accept header list for fetch

I opened a spec issue for this inflexibility with whatwg. Mandating a set, limited image accept: header in a spec like that pretty much kills the possibility for conneg. I'm quite sure that that isn't the intention of the fetch spec but currently is the result if browsers are going to be compliant with it.
https://github.com/whatwg/fetch/issues/877

I also don't understand why we send "image/webp,/" for images we load in the page... why no other mime types and/or why send webp at all?

The point of sending these headers at all is important for formats that are not/cannot be presumed to be accepted/supported by "all"(*) browsers. The whole point of conneg is to use these headers to indicate what non-standard formats are supported and/or what formats are being preferred by the client. WebP is a relatively new format and support for it should be indicated in image requests so web servers don't have to take a "risk" that this new format might or might not be supported.

(*) "all" is generally a good guess as to what is in common use, and includes a good range of older versions of mainstream browsers as well as some smaller browsers. In general this is a conservative approach.

As to why it's sent in img requests despite being specced differently in necko now, that is the result of image.http.accept; I've had a related issue with this in bug 1347958 and I think ultimately this -could-? be solved by making the accept headers prefable (as opposed to hardcoded what has been done now in bug 1417463) or by at least perpetuating the image accept from image.http.accept to necko for all image requests instead of the limited use of where it is being done now...

Priority: -- → P2
No longer blocks: 1417463
Has Regression Range: --- → yes
Has STR: --- → yes
Regressed by: 1417463

Untracking and marking as fix-optional for 67 given that the bug is unassigned, is now a P2 and we are past our early beta cycle.

You need to log in before you can comment on or make changes to this bug.