Open Bug 448436 Opened 17 years ago Updated 6 months ago

Page Info / Media / Save As silently overwrites files

Product/Component: Firefox :: File Handling
Type: defect
Version: 3.0 Branch
Reporter: Mly
Assignee: Unassigned
Keywords: dataloss
Attachments: 1 file

User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1

If multiple URLs are selected, the Page Info / Media / Save As... dialog will overwrite existing files without warning and without a choice of alternate filename.

A good example is to attempt to save tiles from maps.google.com (note: just for testing purposes, one wouldn't want to breach a licence agreement) and note that even if 100 JPEGs were displayed on the page, only ***ONE*** file ends up being saved, with the unhelpful name "kh" (which is the base of scores of URLs of the form http://khm*.google.com/kh?v=*&hl=*&cookie=*&t=*). Note also that any existing file named "kh" in the target directory is silently overwritten by the above.

Reproducible: Always

Steps to Reproduce:
  1. Visit a page containing multiple images with the same base URL (e.g. map.jpg?x=0&y=0 and map.jpg?x=1&y=1, or x/y.jpg and z/y.jpg).
  2. Bring up the Tools / Page Info / Media dialog.
  3. Select multiple images which have the same base URL.
  4. Select "Save As..." and choose a folder from the "Select a Folder to Save the Images" dialog.

Actual Results:
Only one file is written, despite user request to save multiple files. Any existing file with the same base filename is overwritten.

Expected Results:
User should be warned if an existing file is to be overwritten. User should be offered two choices of "Overwrite existing file" versus "Skip this download" and a boolean "Apply to all downloads".

Even better would be an option to name the duplicately-named downloaded files uniquely.

BEST of all would be better generation of the base filename, taking into account more than one tiny part of the URL. (See for example Bug 448432)

I don't want to have to use something external like wget all the time just to do trivial should-be-browser-internal stuff.
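To illustrate why the URLs in the steps above collide, here is a minimal sketch of deriving a file name from only the last path segment of a URL. It is not the Firefox implementation: baseNameFromUrl and groupByBaseName are hypothetical helpers, the example.com URLs are placeholders, and the "index" fallback is an assumption made only so the sketch always returns a name.

```javascript
// Hypothetical helper (not Firefox code): use only the last path segment,
// ignoring the host, the directory and the query string.
function baseNameFromUrl(url) {
  const segments = new URL(url).pathname.split("/").filter(Boolean);
  // "index" is an assumed fallback for URLs with an empty path.
  return segments.length > 0 ? segments[segments.length - 1] : "index";
}

// Group URLs by the file name they would be saved under.
function groupByBaseName(urls) {
  const groups = new Map();
  for (const url of urls) {
    const name = baseNameFromUrl(url);
    if (!groups.has(name)) {
      groups.set(name, []);
    }
    groups.get(name).push(url);
  }
  return groups;
}

// Placeholder URLs mirroring the patterns from the steps to reproduce.
const exampleUrls = [
  "http://example.com/map.jpg?x=0&y=0",
  "http://example.com/map.jpg?x=1&y=1",
  "http://example.com/x/y.jpg",
  "http://example.com/z/y.jpg",
];

for (const [name, urls] of groupByBaseName(exampleUrls)) {
  console.log(`${name}: ${urls.length} URL(s) would be written to this name`);
}
```

Four distinct images map to only two file names here, so saving them all into one folder without a collision check leaves at most two files.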
Depends on: 175841
The way you report this bug, it's easy to confuse it with (or lump it together with) bug 448432 (which you also reported, and which you recognize is a request for enhancement, not really a bug report).

I'm able to reproduce the "files silently overwritten when you Save As multiple files in Tools : Page Info : Media dialog" problem, on OS X, Windows and Linux (so the problem is cross-platform). Here are less confusing steps-to-reproduce:

1) Visit http://www.mozilla.org/
2) Choose Tools : Page Info, then click on the Media tab.
3) Select the top two images (mozilla-16.png and body_back.gif), perhaps by shift-clicking on the second one.
4) Click on the Save As button (in the bottom right), and then choose to save these files somewhere (say to your account's Desktop).
5) Wait for them to be saved, then click on the Save As button again, and save them to the same location. Notice that you're not asked if you want to overwrite the existing files, and that the existing files are silently overwritten.
6) Select just one image in the Media tab (say mozilla-16.png).
7) Click on the Save As button, then choose to save it in the same location you've been using above. Notice that you're asked if you want to replace the existing file.

In Firefox 2 the Save As button is grayed out if you select more than one file in the Media tab -- so the problem doesn't arise.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Mac OS X → All
Hardware: Macintosh → All
Version: unspecified → 3.0 Branch
Keywords: dataloss
The relevant code is in pageInfo.js, specifically in the SaveMedia() function. When multiple rows are selected, the code uses internalSave() from contentAreaUtils.js, using an AutoChosen object to force internalSave() to use the fileName from each image's URL. So if 2 images have the same file name (but different directories or hosts), the second will overwrite the first.

I've used a better general workaround for internalSave()'s API limitations: use a fake content disposition string (and pass null for the aChosenData argument). With this approach, internalSave() will only overwrite an existing file with user approval (and will also allow the user to edit the suggested name and/or select a different directory). Of course, we don't want to launch a Save dialog for every selected image, only for those which clash with existing filenames.

In the long run, it would be nice to revise contentAreaUtils.js to provide a nicer API. I for one don't like calling a function with "internal" in its name ;-).
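The "only prompt for clashes" idea can be sketched independently of internalSave(). The following is a rough illustration, not the attached patch: planSaves is a hypothetical helper that takes plain file names and decides which of them can be written directly and which would need a Save dialog. Treating a second selected image with the same name as a clash is an assumption of this sketch.

```javascript
// Hypothetical helper (not from pageInfo.js or contentAreaUtils.js):
// split a batch of target file names into those that are safe to write
// and those that clash with a name that is already taken.
function planSaves(fileNames, existingNames) {
  const taken = new Set(existingNames);
  const plan = { saveDirectly: [], needsPrompt: [] };
  for (const name of fileNames) {
    if (taken.has(name)) {
      // Clashes with a file on disk or with an earlier item in this batch.
      plan.needsPrompt.push(name);
    } else {
      plan.saveDirectly.push(name);
      // Reserve the name so a later duplicate in the same batch also prompts.
      taken.add(name);
    }
  }
  return plan;
}

// Example: body_back.gif already exists in the target folder, and
// mozilla-16.png was selected twice (say, from two different hosts).
const plan = planSaves(
  ["mozilla-16.png", "body_back.gif", "mozilla-16.png"],
  ["body_back.gif"]
);
console.log(plan);
```

Only the names in needsPrompt would go through the dialog path; everything in saveDirectly could keep using the automatic path without risking an overwrite.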

I have tried the STR in comment #1, and there is still no prompt to alert the user about existing filenames when attempting to save more than one file simultaneously.

However, instead of silently overwriting the existing files, new files are saved with non-conflicting filenames by using a unique numeric suffix.

When attempting to save a single file, where an existing file with the same name exists, the user is still prompted to confirm overwriting of the single existing file.

I have only been able to test this behaviour on Windows 10.

Severity: normal → S3

Wow, this has been around a while. But I think I have relevant info to add.

Comment 3 says "new files are saved with non-conflicting filenames by using a unique numeric suffix". But that's not entirely true. There seems to be a race condition where existing files from the same batch save operation are overwritten. For example, consider a page with a bunch of data: URL images. They don't have names, but they will be saved as "Image", "Image(2)", etc., or as "Background", "Background(2)", etc. depending on how they are used on the page. Say you select 30 of them and try to save them into a new folder. You should end up with "Image" through "Image(30)". But you can very often end up with only 29 files or 28 or 26. If you look in your Downloads history afterward, all 30 will be listed, but you might see, for example "Image(2)" listed twice in a row, suggesting the first to be named "Image(2)" was saved and immediately overwritten.

I think it's a race condition, in that the behavior is inconsistent from one try to another, but it's very reproducible for me.
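As a rough model of how such a race could lose files, the sketch below compares two ways of naming a batch. It is a deterministic in-memory simulation of one possible interleaving, not Firefox's download code: uniqueName, saveBatchRacy and saveBatchReserved are hypothetical names, the "disk" is a Map, and the "(2)" suffix style is taken from the names reported above.

```javascript
// Hypothetical helper: pick the first free name in the style
// "Image", "Image(2)", "Image(3)", ...
function uniqueName(base, taken) {
  if (!taken.has(base)) {
    return base;
  }
  for (let n = 2; ; n++) {
    const candidate = `${base}(${n})`;
    if (!taken.has(candidate)) {
      return candidate;
    }
  }
}

// Racy model: every save checks the disk before any save has written,
// so all of them can settle on the same "free" name.
function saveBatchRacy(names, disk) {
  const chosen = names.map((name) => uniqueName(name, disk));
  chosen.forEach((target, i) => disk.set(target, `contents of item ${i}`));
  return chosen;
}

// Safer model: names are reserved in memory as soon as they are chosen,
// so later items in the batch see them even before anything is written.
function saveBatchReserved(names, disk) {
  const reserved = new Set(disk.keys());
  const chosen = names.map((name) => {
    const target = uniqueName(name, reserved);
    reserved.add(target);
    return target;
  });
  chosen.forEach((target, i) => disk.set(target, `contents of item ${i}`));
  return chosen;
}

const batch = ["Image", "Image", "Image"];
const racyDisk = new Map();
const safeDisk = new Map();
console.log(saveBatchRacy(batch, racyDisk), "files on disk:", racyDisk.size);
console.log(saveBatchReserved(batch, safeDisk), "files on disk:", safeDisk.size);
```

In the racy model every item is reported as saved but later writes replace earlier ones, which matches the symptom of duplicate names in the Downloads history. The real bug loses only a few files per batch, which would be consistent with checks and writes overlapping only some of the time.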

Example page: https://unsplash.com/
Once the page loads, open Page Info, click Media tab, sort by type, and select all the backgrounds that use data: URLs, then save them to a new folder. Check for duplicate names in Downloads history and see whether you have them all. Out of 6 times trying this, it always overwrote at least one file and as many as four. I created a test page based on these data: URLs and it reproduced the bug, but somewhat less consistently. Out of six tries, it saved all 30 four times but only 28 or 29 the other times.

Observed on Linux, using Firefox 144.0.2.

These are extremely small BMP images that I think are automatically generated anyway, so I can't imagine there would be any legit copyright issues.

I think data: URLs will be blocked if you just open this as a local file, so try one of these other ways:

  1. Serve the file from a web server.
  2. Visit some other webpage and use Web Developer Tools to replace the entire html element with the one from this file.