Closed Bug 1526731 Opened 3 years ago Closed 2 years ago

Saving a PNG image incorrectly uses WebP file extension

Categories

(Core :: Networking: HTTP, defect, P2)

65 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla74
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- wontfix
firefox65 --- unaffected
firefox66 - wontfix
firefox67 - wontfix
firefox68 --- wontfix
firefox69 --- wontfix
firefox72 --- wontfix
firefox73 --- wontfix
firefox74 --- fixed

People

(Reporter: yoasif, Assigned: Gijs)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [necko-triaged])

Attachments

(3 files)

As seen on https://www.reddit.com/r/firefox/comments/aoxp9d/webp_format_forced_on_me_and_disabling_it_causes/eg4erpk/

Starting with Firefox 65, images displayed as WebP are sometimes saved as PNG (likely because Firefox doesn't use the loaded image to save, but instead downloads from server) with the wrong file extension (webp).

This really annoys users who don't have a filetype handler for webp, even though the image is actually a PNG.

Steps to reproduce:

  1. Navigate to https://yugioh.fandom.com/wiki/Quintet_Magician
  2. Right click on the playing card image and choose "Save Image As..."
  3. Click save in the dialog that appears.

What happens:

The file is saved with filename "QuintetMagician-VB20-JP-UR.webp". Running "identify" (imagemagick) on the file reveals that the file is a PNG:

QuintetMagician-VB20-JP-UR-ff.webp PNG 300x438 300x438+0+0 8-bit sRGB 277681B 0.000u 0:00.009

Expected result:

The file is saved as: "QuintetMagician-VB20-JP-UR.png" since it is a PNG image. Chrome does this correctly.

Saving the file in Chrome vs. Firefox and running md5sum reveals that they are the same file.

This is likely related to not sending the conneg header for images when using "save as", and the server sending a different image under the same name.

Probably a duplicate of bug 1347958

If this is a duplicate of bug 1347958 then probably it would not be a new issue in 65.

I can Reproduce the issue on Nightly67.0a1 Windows10.
And right click on the image > View Image is also broken.

Regression window:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=1760e005cecc1f7eedd1e6e4c022d73375d12327&tochange=2a7dce7a42ea0527694c978e98d9af1687aecb36

Regressed by: Bug 1417463

:baku,
Your bunch of patch seems to cause the regression.
Can you please look into this?

Blocks: 1417463
Flags: needinfo?(amarchesini)

The problem is that the server returns 2 different images based on the 'accept' header values.
When the image is downloaded to be displayed, 'accept' header is set to: "image/webp,/"
But when the image is downloaded to be saved, 'accept' header is set to: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5"

We should be consistent and use the same accept header for any image request. The fetch spec suggests to use: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5": https://fetch.spec.whatwg.org/#fetching

Andrew, what do you think about changing imgRequest to use that accept header instead of what is set in pref image.http.accept?

Flags: needinfo?(amarchesini) → needinfo?(aosmond)

(In reply to Liz Henry (:lizzard) (use needinfo) from comment #2)

If this is a duplicate of bug 1347958 then probably it would not be a new
issue in 65.

Well, that depends -- seems to me like this was a change that exposed the same problem, i.e. not using image.http.accept for fetch requests.

Andrew, what do you think about changing imgRequest to use that accept header instead of what is set in pref image.http.accept?

Why not use image.http.accept for all image requests as suggested in bug 1347958 instead of using various hard-coded ones?

Any ideas on this one for 66, Gijs?

Flags: needinfo?(gijskruitbosch+bugs)

(In reply to Andrea Marchesini [:baku] from comment #4)

The problem is that the server returns 2 different images based on the 'accept' header values.
When the image is downloaded to be displayed, 'accept' header is set to: "image/webp,/"
But when the image is downloaded to be saved, 'accept' header is set to: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5"

We should be consistent and use the same accept header for any image request. The fetch spec suggests to use: "image/png,image/svg+xml,image/;q=0.8,/*;q=0.5": https://fetch.spec.whatwg.org/#fetching

Andrew, what do you think about changing imgRequest to use that accept header instead of what is set in pref image.http.accept?

I think there is another bug or two open on this topic; if I had to say anything, we should be using the fetch spec + image/webp so that the WebP detection works properly.

Flags: needinfo?(aosmond)

(In reply to Tim Spurway [:tspurway] from comment #6)

Any ideas on this one for 66, Gijs?

I don't know our image/networking code well enough to comment here. I'm moving this over to networking because that's where the regressing bug was. Looks like :baku and :aosmond more or less agree so re-pinging :baku to radar this so we can consider a fix for 66.

(In reply to Alice0775 White from comment #3)

And right click on the image > View Image is also broken.

I'd like to understand this better to make sure whatever we end up fixing actually fixes the entire issue - using 'view image' on beta shows me the same image (visually, at least - I haven't checked the bytes over the wire, though the page info dialog says it's a webp image) as on the page, as expected. Are you getting a broken image, or does reproducing this require disabling webp, or something else?

Component: File Handling → Networking: HTTP
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(amarchesini)
Flags: needinfo?(alice0775)
Product: Firefox → Core

Right click on the image > View Image Info
The image is WebP as expected.

However,
Right click on the image > View Image
And then Right click on the image > View Image Info

The image is now PNG. So I said 'broken'

Flags: needinfo?(alice0775)

I suspect there are several issues here:

  1. We send different Accept header values and the server answers with 2 different images. We must be consistent in how we use the Accept header. I think we have a good solution proposed in comment 7.

  2. The use of content-disposition to generate the file-name.

Reading the code, I see this:
https://searchfox.org/mozilla-central/rev/4587d146681b16ff9878a6fdcba53b04f76abe1d/toolkit/content/contentAreaUtils.js#144-145,152

In contentAreaUtils.js we retrieve the content-disposition from the image cache. The content-disposition is used to generate the file name (I guess). But then, the downloading of the image sends different headers then what we sent before - see my previous comment. Nothing guarantees that the server offers the same content when important headers (such as Accept) are sent differently. But more important, nothing guarantees that the server answers always at the same way in general. Maybe the second request contains a different content-disposition value...

I can work on the first part. Gijs, can you check if what I'm saying is correct for this second point? Thanks.

Flags: needinfo?(amarchesini) → needinfo?(gijskruitbosch+bugs)

(In reply to Andrea Marchesini [:baku] from comment #10)

I suspect there are several issues here:

  1. We send different Accept header values and the server answers with 2 different images. We must be consistent in how we use the Accept header. I think we have a good solution proposed in comment 7.

  2. The use of content-disposition to generate the file-name.

Reading the code, I see this:
https://searchfox.org/mozilla-central/rev/4587d146681b16ff9878a6fdcba53b04f76abe1d/toolkit/content/contentAreaUtils.js#144-145,152

In contentAreaUtils.js we retrieve the content-disposition from the image cache. The content-disposition is used to generate the file name (I guess). But then, the downloading of the image sends different headers then what we sent before - see my previous comment. Nothing guarantees that the server offers the same content when important headers (such as Accept) are sent differently. But more important, nothing guarantees that the server answers always at the same way in general. Maybe the second request contains a different content-disposition value...

I can work on the first part. Gijs, can you check if what I'm saying is correct for this second point? Thanks.

We don't hit the code you referenced because of e10s - there's no document because we're in the parent process. The content type and content disposition have been passed on through messaging before we get there. However, we do not pass that info to nsIWebBrowserPersist (and in fact, off-hand I don't see any relevant parameters where we could do so from the frontend). It's probably determined here: https://searchfox.org/mozilla-central/rev/4587d146681b16ff9878a6fdcba53b04f76abe1d/browser/actors/ContextMenuChild.jsm#504-524 but I haven't checked that.

It sounds to me like what you're getting at is that we shouldn't settle on a filename before we make the request and get a response. Unfortunately, that's not how the current code works; nsIWebBrowserPersist needs to know where it's saving things before we invoke it, so we have to prompt for a destination (or default to some "clever" filename guess in $defaultdownloaddirectory) before saving. Reworking all of that is definitely not upliftable, even if it's probably a good idea for the longer term...

Does that answer your question?

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(amarchesini)

(Alternatively, perhaps there's a way to force the webbrowserpersist to use the image cache for saving images, if it's caching the bytes we get over the network instead of some rendered representation?)

If you come up with something upliftable, let me know. Otherwise maybe we can aim for fixing this in 67 or even 68.

Gijs, Baku, is that something we should fix with a 67 uplift or do you estimate that the potential patch to uplift would be risky and it is better to let it slip to 68? Thanks

Flags: needinfo?(gijskruitbosch+bugs)

The fundamental issues here are that:

  1. we assume that we know what type of thing we're saving before we do the network request. This applies to both images and 'save page as'. Only for 'save link as' do we do some kind of "prefetch" request to see what content type and disposition the server returns, and prompt for a filename after. This has "always" (for the past 20 or so years) worked like this, and is difficult to fix. There is no strong owner for the file saving code, or people who are experts on it. A bunch of the stuff on the frontend side is a pile of helper functions that happen to mostly work today that nobody wants to touch and has little to no automated test coverage (plus, if it did, you'd have to rewrite most of the test to make a change like this). I don't think attempting to fix this in order to fix this bug is a realistic option.
  2. the image accept header code. I don't know this code and can't speak to it. :baku and :aosmond are the people to ask. This was what changed and surfaced this bug, though a similar thing could technically happen if there were other situations where we varied the accept header. I don't know why webp is not in the Accept header list for fetch, and if changing this would be likely to cause web compat issues or whatever. I also don't understand why we send "image/webp,/" for images we load in the page... why no other mime types and/or why send webp at all? Or under what circumstances we send that spec'd fetch header - you can use fetch() for non-image resources, so why does it even bother including images specifically? I also don't know what spec covers the webp stuff. Passing ni to :aosmond for all of this.
  3. the lack of cache usage. This is actually caused by the previous point: the site in comment #0 sends a "Vary: Accept" header for the image, and so now that we send a different accept header than we did the first time, necko correctly determines that it can't use the network cache for the image saving request.
  4. not having a way to specify what kind of URI we're persisting when calling nsIWebBrowserPersist::SavePrivacyAwareURI or SaveURI, and thus using the "standard" fetch headers for the save as request. This would be an API change, but would probably allow us to use a different accept header, if the right magic was invoked. I'm also not really the best person to work on this. The changes on the frontend would be relatively minimal: saveImageURL would have to pass the relevant flags to internalSave, which would have to pass them to nsIWebBrowserPersist. I don't know what would be required on that side of things to make things work correctly. Perhaps just passing a content policy for the request is enough, perhaps we need other things.

Hopefully that helps. I can't personally write a patch for this that would be upliftable to 67, I expect - unless just passing the right content policy type would be sufficient here (in which case we could probably fix this through (4) ).

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(aosmond)

Just a comment re: (2)

I don't know why webp is not in the Accept header list for fetch

I opened a spec issue for this inflexibility with whatwg. Mandating a set, limited image accept: header in a spec like that pretty much kills the possibility for conneg. I'm quite sure that that isn't the intention of the fetch spec but currently is the result if browsers are going to be compliant with it.
https://github.com/whatwg/fetch/issues/877

I also don't understand why we send "image/webp,/" for images we load in the page... why no other mime types and/or why send webp at all?

The point of sending these headers at all is important for formats that are not/cannot be presumed to be accepted/supported by "all"(*) browsers. The whole point of conneg is to use these headers to indicate what non-standard formats are supported and/or what formats are being preferred by the client. WebP is a relatively new format and support for it should be indicated in image requests so web servers don't have to take a "risk" that this new format might or might not be supported.

(*) "all" is generally a good guess as to what is in common use, and includes a good range of older versions of mainstream browsers as well as some smaller browsers. In general this is a conservative approach.

As to why it's sent in img requests despite being specced differently in necko now, that is the result of image.http.accept; I've had a related issue with this in bug 1347958 and I think ultimately this -could-? be solved by making the accept headers prefable (as opposed to hardcoded what has been done now in bug 1417463) or by at least perpetuating the image accept from image.http.accept to necko for all image requests instead of the limited use of where it is being done now...

Priority: -- → P2
No longer blocks: 1417463
Has Regression Range: --- → yes
Has STR: --- → yes
Regressed by: 1417463

Untracking and marking as fix-optional for 67 given that the bug is unassigned, is now a P2 and we are past our early beta cycle.

See Also: → 1610214
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Flags: needinfo?(aosmond)
Flags: needinfo?(amarchesini)
Whiteboard: [necko-triaged]
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/af1b9cf9eec6
pass content policy to webbrowserpersist to improve image request headers, r=smaug,johannh
https://hg.mozilla.org/integration/autoland/rev/169c76460aa4
handle image.http.accept pref inside netwerk instead of forcing all consumers to do the same dance, r=valentin,aosmond
https://hg.mozilla.org/integration/autoland/rev/a90042cbefc6
do not mixed-content-block image loads from webbrowserpersist, r=ckerschb
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla74
Duplicate of this bug: 1610214

Hi,

I was still able to reproduce the issue on Windows 10 x64, Mac 10.14 and Ubuntu 18.04.3 LTS using the latest Firefox Nightly 74.0a1 (2020-02-10).

Should this bug be reopened or should a new bug report be submitted?

Flags: needinfo?(yoasif)
Flags: needinfo?(rmaries)
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(rmaries)

(In reply to Alexandra Martin from comment #25)

Hi,

I was still able to reproduce the issue on Windows 10 x64, Mac 10.14 and Ubuntu 18.04.3 LTS using the latest Firefox Nightly 74.0a1 (2020-02-10).

Should this bug be reopened or should a new bug report be submitted?

What steps are you using, and what are you actually seeing? Because I suspect what you're seeing is we're saving a png but it's actually a png, or we're saving a webp image with a webp extension, whereas before we were saving the file with extension X but contents Y. The fix means X and Y actually match up. I just checked again with the steps in comment #0 in a clean nightly profile on macOS (nightly from Feb 6) and I see a webp image with a webp extension.

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(alexandra.martin)

The file I end up downloading in Firefox now differs from Chromium (using Windows right now so I didn't try imagemagick); previously, I was downloading a PNG file with the wrong file extension, now it seems like I am actually downloading a WebP image with the correct file extension.

Flags: needinfo?(yoasif)

(In reply to :Gijs (he/him) from comment #26)

What steps are you using, and what are you actually seeing? Because I suspect what you're seeing is we're saving a png but it's actually a png, or we're saving a webp image with a webp extension, whereas before we were saving the file with extension X but contents Y. The fix means X and Y actually match up. I just checked again with the steps in comment #0 in a clean nightly profile on macOS (nightly from Feb 6) and I see a webp image with a webp extension.

I used the steps from comment #0. Initially I thought that the image saved from Firefox should also be a png (like the same image saved from Chrome).
Tried imagemagick for the webp image and indeed the extension matches the content. Sorry for the confusion.

Flags: needinfo?(alexandra.martin)
Regressions: 1619567
Regressions: 1619155
Blocks: 1347958

Surely this must have regressed. Or is it only fixed for PNG and not for JPG?

JPG file gets saved as WebP

Note: Manually renaming the extension to jpg works. But it's a tedious workaround.

Flags: needinfo?(rmaries)

(In reply to my comment #29)

Surely this must have regressed. Or is it only fixed for PNG and not for JPG?
JPG file gets saved as WebP
Note: Manually renaming the extension to jpg works. But it's a tedious workaround.

Update: It appears that the file content is indeed WebP but the server is serving it with a JPG extension. Somehow Firefox recognizes that the file is WebP and ignores the extension provided by the server. I guess this is the fault of the server, not Firefox.

As for manually renaming the extension, this only works for applications that can read the WebP format despite the file having a JPG extension. Applications that expect a true JPG file will throw an error or have unexpected behavior. In order to make it a true JPG file, you have to run the file through a legitimate conversion process.

(In reply to Arun Das from comment #30)

(In reply to my comment #29)

Surely this must have regressed. Or is it only fixed for PNG and not for JPG?
JPG file gets saved as WebP
Note: Manually renaming the extension to jpg works. But it's a tedious workaround.

Update: It appears that the file content is indeed WebP but the server is serving it with a JPG extension. Somehow Firefox recognizes that the file is WebP and ignores the extension provided by the server. I guess this is the fault of the server, not Firefox.

You would be surprised how the "progressively switch to webp" thing has been implemented on the website-side of things. And as the internet, we're gonna be going through that all again in the near-ish future with avif, I suspect. :-(

Anyway, if there are still issues around webp/jpg defaults, they are probably best reported in new bugs. We do already have bug 1644950 to cover the request to build in a conversion inside Firefox to move from webp to something more programs like / are able to deal with.

(In reply to Arun Das from comment #29)

Surely this must have regressed. Or is it only fixed for PNG and not for JPG?

JPG file gets saved as WebP

Note: Manually renaming the extension to jpg works. But it's a tedious workaround.

This bug has been automatically closed by bugherder. I'm not sure how my input could help here. I'm part of the code sheriffs team.

Flags: needinfo?(rmaries)
You need to log in before you can comment on or make changes to this bug.