Closed Bug 1139913 Opened 9 years ago Closed 7 years ago

Downloads with blocked data should still keep the placeholder on disk

Categories

(Toolkit :: Downloads API, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla57
Tracking Status
firefox57 --- verified

People

(Reporter: Paolo, Assigned: Paolo)

References

(Blocks 1 open bug, Regressed 1 open bug)

Details

(Whiteboard: [fxprivacy])

Attachments

(1 file, 1 obsolete file)

While testing bug 1137909 it seemed to me that downloads with blocked data don't keep the zero byte placeholder on disk. This means that if the same download is retried from the original page, there can be a naming conflict.
Flags: firefox-backlog+
Flags: firefox-backlog+ → qe-verify?
Whiteboard: [fxprivacy]
Flags: qe-verify? → qe-verify+
Priority: -- → P3
Priority: P3 → P2
Priority: P2 → P1
Priority lowered to accommodate higher priority work.
Priority: P1 → P2
This bug can be solved by keeping the placeholder every time we have partial data, as indicated in bug 921052 for the case of paused downloads. If a download is paused and another one with the same name is started, a naming conflict occurs and, if the first download is resumed, both will attempt to write to the same target file, with undefined results.

The same will happen for blocked downloads, with the difference that while pausing is a secondary user interaction, download blocking doesn't require any interaction and just retrying the download from the original page would trigger this condition.

The problem could also be solved without a placeholder by checking for the existence of ".part" files when assigning a file name, however there are two difficulties. First, there are several different code paths that assign the file name, some of which just use the createUnique API. Second, when using a file picker to choose the name, the check would have to be done after the user has already chosen the file, meaning we would need to ask whether to overwrite the file in a separate message box.

To ensure we don't forget the placeholder for blocked downloads in addition to the ".part" file after a restart, bug 1139914 should also be fixed first.
Depends on: 921052, 1139914
Priority: P2 → P3
Paolo, can you give me some starters, roughly what changes I might have to make to get this working? I suppose Bug 921052 will  have to be fixed before, so you might also comment there.

Thank you!!
Flags: needinfo?(paolo.mozmail)
I think the work to resolve this bug is actually the same as bug 921052, I've not marked this one as a duplicate because it has a little bit more information and is more specific. I'll comment here for convenience but this applies to both bugs equally. By necessity, this has to be a summary rather than a detailed explanation of how the code works, but I'm available to answer any questions you might have.

Since we're changing behavior, some of the tests will need to change. The first ones I saw were the ones for "cancel()". In the case where we don't want to keep partial data to begin with (which is the case for most downloads initiated programmatically), probably we should still remove the placeholder. So, this check should stay the same:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/test/unit/common_test_Download.js?rev=29d7a174bdb4#666

If we try to keep partial data, but we are asked to remove it with "removePartialData()", we should probably still remove the placeholder. This maps to downloads manually cancelled in the user interface:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/test/unit/common_test_Download.js?rev=29d7a174bdb4#706

For canceled downloads where we still retain partial data, the placeholder should be kept instead, so we should change this test:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/test/unit/common_test_Download.js?rev=29d7a174bdb4#802

There are several other tests that check different combinations, and some of them should be updated based on the ones above. If we find new edge cases while writing the code, we should add tests for them.

Downloads where "hasBlockedData" is true should keep the placeholder on disk. This is specifically the case indicated by this bug.

For the implementation, I suggest placing the logic to determine whether to keep the placeholder or not in a single place. Ideally, DownloadSaver or DownloadTarget, instead of Download, should handle all the I/O and the existence checks, possibly using the state from the Download object as reference.

In the main code, there are a few places where we assume that an error condition in the download, which includes cancellation, should result in the placeholder being removed, which will not necessarily be the case anymore:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/src/DownloadCore.jsm?rev=29d7a174bdb4#490
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/src/DownloadCore.jsm?rev=29d7a174bdb4#2141
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/src/DownloadCore.jsm?rev=29d7a174bdb4#2543

These will need to change and make a decision based on the expected end state. Conversely, there are places where we assume we don't need to explicitly remove a placeholder:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/src/DownloadCore.jsm?rev=29d7a174bdb4#680
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/src/DownloadCore.jsm?rev=29d7a174bdb4#2224

There are also properties on DownloadTarget that store the state in the file system. I'm not sure if we should specify the behavior of "download.target.exists" if the download has not succeeded. In fact, here we assume that "exists" has to be updated only if the download succeeded:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/jsdownloads/src/DownloadCore.jsm?rev=29d7a174bdb4#926
Flags: needinfo?(paolo.mozmail)
Thanks, I'll work through it and get back to you :)
Assignee: nobody → jhofmann
Paolo, I finally gave this a shot. I moved all the deletion code into Download.removePartialData and updated the existing tests, I feel like they're covering the behavior pretty well already. Let me know what you think about this approach and if we should add another test.
Attachment #8772348 - Flags: review?(paolo.mozmail)
Comment on attachment 8772348 [details]
Bug 1139913 - Do not delete placeholders when keeping partial downloads.

https://reviewboard.mozilla.org/r/65178/#review63164

Thanks for tackling this bug!

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:849
(Diff revision 1)
> -            if (this._promiseCanceled) {
> -              yield this._promiseCanceled;
> +            let tasks = [];
> +            if (this.target.partFilePath) {

We wait on _promiseCancelled and do the checks at the beginning to avoid some possible race conditions. Unfortunately it's difficult to catch all the race conditions with the regressions tests, although they exist.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:858
(Diff revision 1)
> +              yield Promise.all(tasks);
> +            } catch (ex) {

Promise.all is not correct here: "If any of the passed in promises rejects, the all Promise immediately rejects with the value of the promise that rejected".

This means that if one deletion fails the function might return before the other file is deleted.

Anyways, I think serializing the I/O calls one after the other might even have slightly better performance than parallelizing them.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:864
(Diff revision 1)
> -            // Ask the saver object to remove any partial data.
> -            yield this.saver.removePartialData();
> +
> +            this.target.exists = false;

Actually the separation we want here is different.

- The Download object should take care of serializing all the calls to the Saver object so the latter doesn't have to care about race conditions itself.

- The Saver object should handle all the disk I/O or other operations as instructed by the Download object.

Some Saver implementations might not even use a file on disk at all. So the problem with the existing code was that I/O for the file removal was done inside the Download object to begin with. It should be delegated to the specific Saver instance instead.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:1009
(Diff revision 1)
> -      this.cancel();
> +      return this.cancel().then(() => this.removePartialData());
> -      return this.removePartialData();

The reason why this is done this way is that when the function returns "this._promiseRemovePartialData" has already been set synchronously, which is not the case anymore with the change here.

::: toolkit/components/jsdownloads/test/unit/common_test_Download.js
(Diff revision 1)
> -  // Progress properties are not reset by canceling.
> -  do_check_eq(download.progress, 50);
> -  do_check_eq(download.totalBytes, TEST_DATA_SHORT.length * 2);
> -  do_check_eq(download.currentBytes, TEST_DATA_SHORT.length);
> -

Maybe removing these lines won't be necessary anymore after the patch has been restructured to delegate the I/O to the saver.

I still have to figure out if there is a reason for these properties not being reset, for example to avoid flickering in the front-end, or is just a side effect of the current implementation that we can ignore.
https://reviewboard.mozilla.org/r/65178/#review63172

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:490
(Diff revision 1)
>          // Check for the last time if the download has been canceled. This must
>          // be done right before setting the "stopped" property of the download,
>          // without any asynchronous operations in the middle, so that another
>          // cancellation request cannot start in the meantime and stay unhandled.
>          if (this._promiseCanceled) {
>            try {
> -            yield OS.File.remove(this.target.path);
> +            yield this.removePartialData();
>            } catch (ex) {
>              Cu.reportError(ex);
>            }

Hm, also, re-reading comment 4, I've mentioned that here we might have to keep the placeholder if the download has been cancelled but we want to keep partial data. But this call seems to remove the placeholder unconditionally, is this correct? Is this caught by a regression test?

Even if the cancellation case might be slightly different than blocked downloads, and there's a separate bug on file, it's probably easier if we work on the problem in the same patch.
Thanks for the review!

(In reply to :Paolo Amadini from comment #8)
> Comment on attachment 8772348 [details]
> Bug 1139913 - Do not delete placeholders when keeping partial downloads.
> 
> https://reviewboard.mozilla.org/r/65178/#review63164
> 
> Promise.all is not correct here: "If any of the passed in promises rejects,
> the all Promise immediately rejects with the value of the promise that
> rejected".
> 

Hah, wow! Great catch. I didn't know that. 

> Some Saver implementations might not even use a file on disk at all. So the
> problem with the existing code was that I/O for the file removal was done
> inside the Download object to begin with. It should be delegated to the
> specific Saver instance instead.

Interesting, I was intrigued why this is handled by the Saver to begin with, but that makes sense.


> Hm, also, re-reading comment 4, I've mentioned that here we might have to
> keep the placeholder if the download has been cancelled but we want to keep
> partial data. But this call seems to remove the placeholder unconditionally,
> is this correct? Is this caught by a regression test?

Yeah, I think I just misread/missed that part. I'll update that line and add a test. But I think right now we're not using "cancel and keep partial data" anywhere, right?
Status: NEW → ASSIGNED
(In reply to Johann Hofmann [:johannh] from comment #10)
> But I think right now we're not using "cancel and keep partial data"
> anywhere, right?

That's what pausing a download normally does, if keeping partial data is possible.
Comment on attachment 8772348 [details]
Bug 1139913 - Do not delete placeholders when keeping partial downloads.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/65178/diff/1-2/
Attachment #8772348 - Flags: review?(paolo.mozmail)
Comment on attachment 8772348 [details]
Bug 1139913 - Do not delete placeholders when keeping partial downloads.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/65178/diff/2-3/
 
> > Hm, also, re-reading comment 4, I've mentioned that here we might have to
> > keep the placeholder if the download has been cancelled but we want to keep
> > partial data. But this call seems to remove the placeholder unconditionally,
> > is this correct? Is this caught by a regression test?
> 
> Yeah, I think I just misread/missed that part. I'll update that line and add
> a test. But I think right now we're not using "cancel and keep partial data"
> anywhere, right?

As it happens that test exists already, not 100% sure why it wasn't failing. http://searchfox.org/mozilla-central/source/toolkit/components/jsdownloads/test/unit/common_test_Download.js#794
(In reply to Johann Hofmann [:johannh] from comment #14)
> As it happens that test exists already, not 100% sure why it wasn't failing.
> http://searchfox.org/mozilla-central/source/toolkit/components/jsdownloads/
> test/unit/common_test_Download.js#794

Well, it would be good to know for sure why it didn't fail! ;-)

In this version, does the test fail as expected if we change the code to remove the file instead?

I have to put some more thought on this patch. As the name says, "this.tryToKeepPartialData" doesn't necessarily mean we can actually keep the partial data. I'm not sure if we want to use the actual state of partial data for the current download attempt for this check, or if "this.tryToKeepPartialData" is correct. Maybe there is no difference, but if there is one, we should make sure we have a test to catch it, to support future code changes.

Also checking "this.download.currentBytes <= 0" is suspicious at first glance... but again, maybe it's the correct solution. I'm afraid this review might have to wait for next week.
Comment on attachment 8772348 [details]
Bug 1139913 - Do not delete placeholders when keeping partial downloads.

https://reviewboard.mozilla.org/r/65178/#review73116

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:490
(Diff revision 3)
>          // Check for the last time if the download has been canceled. This must
>          // be done right before setting the "stopped" property of the download,
>          // without any asynchronous operations in the middle, so that another
>          // cancellation request cannot start in the meantime and stay unhandled.
> -        if (this._promiseCanceled) {
> +        if (this._promiseCanceled && !this.tryToKeepPartialData) {
>            try {
> -            yield OS.File.remove(this.target.path);
> +            yield this.saver.removePartialData();
>            } catch (ex) {
>              Cu.reportError(ex);
>            }
>  
> -          this.target.exists = false;
> -          this.target.size = 0;
> -
>            // Cancellation exceptions will be changed in the catch block below.
>            throw new DownloadError();
>          }

This is a special case for which we don't have a specific regression test. It exists because if the download succeeded but in the meantime we set _promiseCancelled, we don't have a code path that gets the download to a consistent completed state.

To keep the internal state of the Download object consistent, we just delete the target and effectively cancel the download. Since the DownloadSaver succeeded, we already renamed the ".part" file to the final name, and this results in all the data begin deleted. This can also happen when pausing, because it uses the same mechanism.

This is actually not optimal, but this issue doesn't normally occur because it is really improbable. There might be a way to fix this so that the download is kept in the completed state, but it's probably better to do that in another bug.

In other words, remove "&& !this.tryToKeepPartialData" :-)

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:2143
(Diff revision 3)
>        } catch (ex) {
>          // Ensure we always remove the placeholder for the final target file on
>          // failure, independently of which code path failed.  In some cases, the
>          // background file saver may have already removed the file.
>          try {
> -          yield OS.File.remove(targetPath);
> +          if (!this.download.tryToKeepPartialData || this.download.currentBytes <= 0) {

This condition can effectively be "!this.download.hasPartialData && !this.download.hasBlockedData". I've checked that if we have partial data, we almost certainly reported the status to the Download object at this point, and if we didn't, then there is probably very little partial data that we can safely delete.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:2147
(Diff revision 3)
>            // If we failed during the operation, we report the error but use the
>            // original one as the failure reason of the download.  Note that on
>            // Windows we may get an access denied error instead of a no such file
>            // error if the file existed before, and was recently deleted.
> -          if (!(e2 instanceof OS.File.Error &&
> +          if (!(e2 instanceof OS.File.Error && e2.becauseAccessDenied)) {
> -                (e2.becauseNoSuchFile || e2.becauseAccessDenied))) {
>              Cu.reportError(e2);
>            }

Although it applies only here, we should probably move the becauseAccessDenied special case to the removal function in DownloadSaver, with a more detailed comment on why it's there. In particular it's there because the component that executed the download may have just deleted the file.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:2222
(Diff revision 3)
>    /**
>     * Implements "DownloadSaver.removePartialData".
>     */
>    removePartialData: function ()

We should rename this DownloadSaver method to indicate what it does now. Maybe removeTargetAndPartialData, removeTarget, or removeData. The Download interface won't change of course.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:2230
(Diff revision 3)
> +            yield OS.File.remove(this.download.target.path);
> +            this.download.target.exists = false;
> +            this.download.target.size = 0;

I wonder if we should move this to DownloadTarget. It would still be called from the DownloadSaver method though, not directly by the Download object.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:2237
(Diff revision 3)
>          } catch (ex) {
>            if (!(ex instanceof OS.File.Error) || !ex.becauseNoSuchFile) {
>              throw ex;
>            }
>          }

We should ignore errors independently for the two files, a helper function might be needed.

::: toolkit/components/jsdownloads/src/DownloadCore.jsm:2546
(Diff revision 3)
>          // Ensure we always remove the final target file on failure,
>          // independently of which code path failed.  In some cases, the
>          // component executing the download may have already removed the file.
>          try {
> -          yield OS.File.remove(this.download.target.path);
> +          if (!this.download.tryToKeepPartialData || this.download.currentBytes <= 0) {
> +            yield this.removePartialData();
> +          }
>          } catch (e2) {
>            // If we failed during the operation, we report the error but use the
>            // original one as the failure reason of the download.  Note that on
>            // Windows we may get an access denied error instead of a no such file
>            // error if the file existed before, and was recently deleted.
> -          if (!(e2 instanceof OS.File.Error &&
> +          if (!(e2 instanceof OS.File.Error && e2.becauseAccessDenied)) {
> -                (e2.becauseNoSuchFile || e2.becauseAccessDenied))) {
>              Cu.reportError(e2);
>            }
>          }

Because this code is now independent from the specific DownloadSaver implementation of file I/O, we can move this code to the exception handling section of the function in the Download object that calls "DownloadSaver.execute", so we don't need to repeat it for DownloadLegacySaver and DownloadCopySaver.
Attachment #8772348 - Flags: review?(paolo.mozmail)
Looks like we did 80% of the work on this bug and then we both forgot about it?
Flags: needinfo?(jhofmann)
Oh, right, sorry for that. I can probably work on the rest very soon but I'll unassign myself for now in case anyone else wants to steal it.
Assignee: jhofmann → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(jhofmann)
Rebased and addressed my review comments. Tryserver build here:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aea461f746a25d6142e901576d149b9682a3f7a8
Assignee: nobody → paolo.mozmail
Status: NEW → ASSIGNED
Priority: P3 → P1
Attachment #8772348 - Attachment is obsolete: true
Comment on attachment 8900282 [details]
Bug 1139913 - Downloads with partial data should still keep the placeholder on disk.

https://reviewboard.mozilla.org/r/171680/#review177404

Looks like you already fixed the eslint from the previous Try, just commenting about it for sanity.
Attachment #8900282 - Flags: review?(mak77) → review+
(In reply to Marco Bonardo [::mak] from comment #23)
> Looks like you already fixed the eslint from the previous Try, just
> commenting about it for sanity.

I actually missed this, thanks for the reminder.
Pushed by paolo.mozmail@amadzone.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d2a4fa2a4005
Downloads with partial data should still keep the placeholder on disk. r=mak
Backed out for failing browser-chrome's browser_downloads_panel_height.js:

https://hg.mozilla.org/integration/mozilla-inbound/rev/b9e2b684b9883fd5dae81b3592280cd12451521d

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=d2a4fa2a4005db6c39aece63208059a86c5e35a5&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=125544622&repo=mozilla-inbound

[task 2017-08-24T13:48:07.502751Z] 13:48:07     INFO - TEST-START | browser/components/downloads/test/browser/browser_downloads_panel_height.js
[task 2017-08-24T13:48:08.103516Z] 13:48:08     INFO - TEST-INFO | started process screentopng
[task 2017-08-24T13:48:10.031001Z] 13:48:10     INFO - TEST-INFO | screentopng: exit 0
[task 2017-08-24T13:48:10.036746Z] 13:48:10     INFO - Buffered messages logged at 13:48:07
[task 2017-08-24T13:48:10.038166Z] 13:48:10     INFO - Entering test bound test_height_reduced_after_removal
[task 2017-08-24T13:48:10.039940Z] 13:48:10     INFO - Buffered messages logged at 13:48:08
[task 2017-08-24T13:48:10.041933Z] 13:48:10     INFO - TEST-PASS | browser/components/downloads/test/browser/browser_downloads_panel_height.js | 202.6666717529297 > 129.3333282470703 - 
[task 2017-08-24T13:48:10.045231Z] 13:48:10     INFO - Leaving test bound test_height_reduced_after_removal
[task 2017-08-24T13:48:10.052594Z] 13:48:10     INFO - Buffered messages finished
[task 2017-08-24T13:48:10.055897Z] 13:48:10     INFO - TEST-UNEXPECTED-FAIL | browser/components/downloads/test/browser/browser_downloads_panel_height.js | Cleanup function threw an exception - [Exception... "Component returned failure code: 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST) [nsIFile.remove]"  nsresult: "0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST)"  location: "JS frame :: chrome://mochitests/content/browser/browser/components/downloads/test/browser/head.js :: <TOP_LEVEL> :: line 27"  data: no]
Flags: needinfo?(paolo.mozmail)
Pushed by paolo.mozmail@amadzone.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e406af77d28d
Downloads with partial data should still keep the placeholder on disk. r=mak
This mochitest was affected because of the synthetic downloads it created.
Flags: needinfo?(paolo.mozmail)
https://hg.mozilla.org/mozilla-central/rev/e406af77d28d
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
This issue is verified fixed using Firefox 57.0b13 (BuildId:20171030163911) on Windows 10 64bit, macOS 10.11.6 and Ubuntu 16.04 64bit.

The zero byte placeholder is successfully kept.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Depends on: 1485555
Regressions: 1501277
No longer depends on: 1485555
Regressions: 1485555
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: