Open Bug 1485555 Opened 6 years ago Updated 10 months ago

"Remove from History" results in data loss on "failed" Save As (Ctrl+S)

Categories

(Toolkit :: Downloads API, defect, P3)

57 Branch
defect
Points:
5

Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- wontfix
firefox61 --- wontfix
firefox62 --- wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- fix-optional

People

(Reporter: from_bugzilla3, Unassigned)

References

(Regression)

Details

(Keywords: dataloss, regression)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0
Build ID: 20180820172315

Steps to reproduce:

1. Go to a page where something (e.g., an ad-blocking HOSTS file or an ad-blocking browser extension) prevents some irrelevant subresources from being loaded.
2. Use Ctrl+S in "Web Page, complete" mode to save the page
3. Verify that the downloads pane marked the download as failed
4. Verify that all DESIRED subresources successfully made it to disk.
5. Choose "Remove from History" for the file in the downloads panel

(I can't test right now, but I think other means of clearing download history may also trigger the misfeature, at least if it's a Private Browsing window.)


Actual results:

Starting some time between 52 ESR and current Developer Edition, the behaviour of "Remove from History" changed so that, without any indication that it's happened, the HTML file and associated _files folder will vanish from disk unless I manually move them somewhere Firefox can't find them first.


Expected results:

Modern versions of Firefox should not silently delete data that actually made it to the disk when a request to clear the corresponding history entry is made... especially when the failure is actually just an indication of successful ad-blocking.

Firefox has been throwing this spurious "failed" status since time immemorial (possibly as far back as Firefox 3.x, if I remember correctly) and, up until at least 52 ESR, it was just that... spurious.

Users in my situation have been trained to ignore it. I spent quite a bit of time cursing when I discovered that Firefox had been deleting things I was saving in Private Browsing mode for at least a week, and I had to go back and try to piece together what was supposed to be there but wasn't.

(Though it's my fault for forgetting to report it until now, this is actually one of the things that's kept me on 52 ESR up to the last minute. I'm now scrambling to design, implement, and gain sufficient confidence in a FUSE-based overlay filesystem which will allow the Save dialog to function in a reasonably familiar manner but throw up a confirmation dialog if Firefox attempts to delete something under the relevant parts of the filesystem.)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0
Build ID: 20180822221004

Regression range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a6bb991ed09261b1ee7c90a0ed88fb82465e3877&tochange=e406af77d28ddd1b59a432a4523b8a09ddf05e54

STR
1. 0.0.0.0 upload.wikimedia.org in HOSTS file
2. https://en.wikipedia.org/wiki/Main_Page
3. Menu button → Save Page As → Web page, complete
4. Download button → right-click "Wikipedia, the free encyclopedia.html" → Remove From History

Expected results
The download entry is removed from history.

Actual results
The download entry is removed from history, but the file "Wikipedia, the free encyclopedia.html" is deleted from disk as well. The folder "Wikipedia, the free encyclopedia_files" and its contents are usually unaffected. In numerous attempts, I've seen the folder deleted only once, and I don't know how to reproduce that.
Blocks: 1139913
Status: UNCONFIRMED → NEW
Has Regression Range: --- → yes
Has STR: --- → yes
Component: Untriaged → Downloads API
Ever confirmed: true
Flags: needinfo?(paolo.mozmail)
Keywords: regression
OS: Unspecified → All
Product: Firefox → Toolkit
Hardware: Unspecified → All
Version: 62 Branch → 57 Branch
Thanks for reporting and finding the regression range. This is probably something we should work on when possible.
Flags: needinfo?(paolo.mozmail)
Priority: -- → P2
This also results in data loss if you...

1. Try to download the same thing twice
2. Both times fail
3. Click "Retry" on one of them (which ends up with a _files folder and a .html file that doesn't use it, last I checked)
4. Choose "Remove From History" on both
Too late for a fix for 63. We could still take a patch for 65 though.
See Also: → 536632

Paolo, who can look more into this to drive towards a fix?

Flags: needinfo?(paolo.mozmail)

Marco is now the triage owner of the module, but I don't think he's working on bugs in this component at the moment. However, he may be able to find someone to look into this, or re-triage to a lower priority.

Since the download is failed, it's not clear to me what the correct fix would be. Probably, either the download should be considered succeeded even if a subresource was not saved, or the download should still fail and everything should be removed from disk immediately, instead of when the download is removed from history.
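
As a rough illustration of the two options above, here is a toy model in Python. It is not Firefox code: `Policy`, `Outcome`, and `outcome_for_missing_subresource` are hypothetical names made up for this sketch, and it only covers the case from this bug, where the page itself was saved but one subresource was not.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    # Option 1: the download counts as succeeded even if a subresource is missing.
    SUCCEED_WITHOUT_SUBRESOURCE = 1
    # Option 2: the download still fails, and everything is removed immediately.
    FAIL_AND_CLEAN_UP_NOW = 2


@dataclass(frozen=True)
class Outcome:
    succeeded: bool                        # status shown for the download
    delete_files_now: bool                 # remove saved files when the save ends
    delete_files_on_history_removal: bool  # remove saved files later, on "Remove from History"


def outcome_for_missing_subresource(policy: Policy) -> Outcome:
    """Model a "Web Page, complete" save where one subresource was not saved."""
    if policy is Policy.SUCCEED_WITHOUT_SUBRESOURCE:
        return Outcome(succeeded=True, delete_files_now=False,
                       delete_files_on_history_removal=False)
    return Outcome(succeeded=False, delete_files_now=True,
                   delete_files_on_history_removal=False)
```

Under either option, nothing is left to be deleted at the moment the entry is removed from history, which is the behaviour this bug complains about.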

Flags: needinfo?(paolo.mozmail) → needinfo?(mak77)

(In reply to :Paolo Amadini from comment #8)

> Since the download is failed, it's not clear to me what the correct fix would be. Probably, either the download should be considered succeeded even if a subresource was not saved, or the download should still fail and everything should be removed from disk immediately, instead of when the download is removed from history.

Returning to the old behaviour of "succeeded even if a subresource was not saved" would be the option which isn't a further regression in the face of ad-blocking extensions or HOSTS files.

Currently, I'm working around this by doing one of two things, depending on how the browser behaves:

  1. If it failed in a way which prompted it to not write the HTML file at all, hit Retry. For some reason, this often works, at the expense of saving a copy in which the URLs in the HTML file are not rewritten to use the _files folder that was produced during the first attempt. (Not a problem. I can write a Python script to patch things up as long as the files are all there.)

  2. If it did produce the HTML file, rename the HTML file and _files folder, delete it from the download history, and then rename them back after Firefox has assumed I already deleted them by hand.
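
The second workaround can be scripted. The following is a minimal sketch, assuming the saved page is `<name>.html` with a sibling `<name>_files` folder as described in this bug. The helper names (`saved_page_paths`, `hide`, `restore`) and the `.keep` suffix are made up for illustration, and the script does not talk to Firefox at all; removing the history entry is still a manual step in between.

```python
from pathlib import Path
from typing import List

# Hypothetical suffix appended while the files are hidden from Firefox.
HIDE_SUFFIX = ".keep"


def saved_page_paths(html_path: Path) -> List[Path]:
    """Return the HTML file and its "<name>_files" folder, whichever exist."""
    folder = html_path.with_name(html_path.stem + "_files")
    return [p for p in (html_path, folder) if p.exists()]


def hide(html_path: Path) -> List[Path]:
    """Rename the saved page so it no longer sits at the path Firefox recorded."""
    hidden = []
    for path in saved_page_paths(html_path):
        target = path.with_name(path.name + HIDE_SUFFIX)
        path.rename(target)
        hidden.append(target)
    return hidden


def restore(hidden: List[Path]) -> None:
    """Undo hide() by stripping the suffix from each renamed path."""
    for path in hidden:
        path.rename(path.with_name(path.name[: -len(HIDE_SUFFIX)]))
```

Usage would be: call `hide(Path("Some page.html"))`, choose "Remove from History" in the downloads panel, then pass the returned list to `restore()`.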

I really don't want to have to keep a copy of Chrome around just because Ctrl+S has effectively become non-functional in the face of ad-blocking extensions.

There are various cases where complete files are being removed, instead of partials. Bug 1501277 has a first patch fixing some of the cases. There may be more to do, but the component doesn't have a dedicated team, so we'll accept investigation or patches from the community.
I'd suggest waiting for Bug 1501277 to be fixed first and then checking the behavior again, since in the end the regression range is the same.

Points: --- → 5
Depends on: 1501277
Flags: needinfo?(mak77)
Priority: P2 → P3
No longer blocks: 1139913
Regressed by: 1139913
Severity: normal → S3