Open Bug 1323414 Opened 8 years ago Updated 9 months ago

Add streaming support to downloads

Categories

(WebExtensions :: General, enhancement, P3)

Tracking

(Not tracked)

People

(Reporter: michel.gutierrez, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: [downloads][outreach][awe: firefox@mega.co.nz], triaged)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161114144739

Steps to reproduce:

Here is a proposal for modifying the WebExtensions downloads API to allow add-ons to write files, based on the needs of Video DownloadHelper: https://github.com/mi-g/webext-stream-download-support-proposal

Anything else that provides the same functionality will do.
Status: UNCONFIRMED → NEW
Component: WebExtensions: Untriaged → WebExtensions: General
Ever confirmed: true
Whiteboard: [downloads]
See Also: → 1246236
Hi Michel,
I recently created a new pull request on the webextensions-examples repo related to "read/modify/save files" use cases (mostly based on our discussions during the All Hands):

- https://github.com/mdn/webextensions-examples/pull/171

Can you take a look at it and check how many of the usual file manipulation features would still be missing?

In particular, MutableFile seems to be an existing feature that could cover at least part of what you would like to do with the proposed changes to the downloads API.

Thanks!
Flags: needinfo?(michel.gutierrez)
Thanks Luca.

I am running some tests to see whether it meets VDH's needs, in particular how it behaves with big files and whether the generated virtual files can be automatically downloaded to the real filesystem. I will get back to you.

From a UX point of view, implementing the file writing API through the downloads API is better. In most use cases I can think of, the end user considers writing a file to be a download. Since I implemented chunked streaming downloads in VDH (using raw file writing rather than the downloads API), I have received dozens if not hundreds of support messages saying it was not working because the download did not appear in the Firefox download manager, even though VDH provides its own monitoring UI.
(In reply to Michel Gutierrez from comment #2)
> From a UX point of view, implementing the file writing API through the
> downloads API is better. In most use cases i thought of, writing a file is
> considered as a download by the end user. Since i implemented in VDH chunk
> streaming downloads (using raw file writing and not going through the
> download API), i have got dozens if not hundreds of support messages saying
> it was not working because it did not appear in the Firefox download manager
> even if VDH provides its own monitoring UI.

Except that doing so would validate the claims by the author of DownThemAll! that WebExtensions is just as much the death of DownThemAll! as Chrome's extension API.

...and I'm currently investigating the feasibility of either using something like FlashGot or writing my own WebExtensions-based integration to pair the browser with an external download manager like KGet or JDownloader. This would only reinforce that decision. (If I have long-running downloads in a separate download manager, I most certainly do NOT want them cluttering up the status view for quick little things like Ctrl+S.)
Sorry, I did not get your point. If we had this API, it would benefit DTA as well (though I'm not sure Nils is willing to pursue it).

If you download through an external download manager (which is my plan B in case Firefox does not provide a decent download interface that gives control over the data, or a usable file writing API), then you have no problem and do not need any of this.
(In reply to Michel Gutierrez from comment #4)
> Sorry i did not get your point. If we had this API, it would benefit DTA as
> well (though i'm not sure Nils is willing to pursue).

The problem is that DTA is designed to operate independently of the browser's built-in download UI, and I get the impression that the DTA developers have a very poor opinion of any kind of API restriction that would force them to remove even minor features.
> I am doing some tests to see if it matches VDH needs, in particular how it
> behaves with big files and whether the generated virtual files can be
> automatically downloaded to the real filesystem. I come back to you.

I gave it a try from a fork of your code, and I ran into two major issues:

1. Generating IDBMutableFile objects works well, except that you easily run into a quota error, generally around 2GB. I was under the impression (I read that somewhere) that the limit was half of the available disk space, but this doesn't seem to be the case.

2. I haven't found a way to get a URL for the generated IDBMutableFile objects. URL.createObjectURL fails since the object does not have a Blob interface. I presume there is a way to make it work, since this is what the jszip library used in your sample code does: build a file that can then be downloaded (at least from an <a href="...">). I'll check whether the blob URL can be used in the browser.downloads API.
I pushed my tests further.

The blob URL created by jszip can be downloaded through the browser.downloads API. However, this is limited to around 1GB before a QuotaExceeded error occurs (and it's probably not a limitation of my machine, which has 128GB of RAM).

As a result:
- The IndexedDB file API cannot be used, since no downloadable URL can be retrieved from the generated virtual files (or I missed the way to do so).
- Downloading via a blob would work (I can submit a PR with small changes to indexeddb-save-files.js for downloading the resulting zip if you want), but in both cases (blob or file), quota limitations would severely restrict VDH's ability to download big videos.
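
The blob-based route described above can be sketched roughly as follows. This is a minimal sketch, assuming the `downloads` permission and the standard `browser.downloads.download()` call; the function name `downloadBlob` and the injected `api` parameter are hypothetical, added only so the snippet can be exercised outside a browser. It does not work around the quota limit reported above.

```javascript
// Hypothetical helper: hand an in-memory Blob to the downloads API.
// `api.downloads` defaults to browser.downloads, `api.URL` to the global URL.
function downloadBlob(blob, filename, api = {}) {
  const downloads = api.downloads || globalThis.browser.downloads;
  const urlApi = api.URL || globalThis.URL;

  // The whole Blob must exist before this point, so it is subject to the
  // size limits discussed in this bug.
  const url = urlApi.createObjectURL(blob);

  // browser.downloads.download() returns a promise that resolves with the
  // download id once the download has started.
  const started = downloads.download({ url, filename, saveAs: false });

  // The caller should revoke the URL once the download has finished,
  // for example from a browser.downloads.onChanged listener.
  return { url, started };
}
```

Returning the URL alongside the promise leaves the decision of when to call `URL.revokeObjectURL()` to the caller, since revoking too early could interrupt the download.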

If you ask "do users really download videos that big?" (and I asked myself this question): when I first released the version of VDH with add-on generated MP4 files, I only supported 32-bit offsets, limiting the file size to 4GB. I got many complaints about this limitation. Users download a lot of full movies and they want them in high quality: that means big files.
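
The 4GB ceiling mentioned above follows directly from the width of the offset field. A minimal sketch of the arithmetic, assuming unsigned 32-bit byte offsets; the helper name `fitsIn32BitOffset` is hypothetical and not taken from VDH:

```javascript
// Largest byte offset an unsigned 32-bit field can hold.
const MAX_UINT32_OFFSET = 2 ** 32 - 1; // 4294967295

const GIB = 1024 ** 3;

// Hypothetical helper: true when a byte offset is representable in 32 bits.
function fitsIn32BitOffset(byteOffset) {
  return Number.isInteger(byteOffset) &&
    byteOffset >= 0 &&
    byteOffset <= MAX_UINT32_OFFSET;
}

console.log(fitsIn32BitOffset(4 * GIB - 1)); // true
console.log(fitsIn32BitOffset(4 * GIB));     // false
```

Offsets beyond this need a wider field. JavaScript numbers represent integers exactly up to 2^53 - 1, so plain numbers remain usable for file sizes far beyond 4 GiB.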
(In reply to Stephan Sokolow from comment #5)
> The problem is that DTA is designed to support operating independently from
> the browser's built-in download UI and I get the impression that the DTA
> developers have a very poor opinion of any kinds of API restrictions which
> would force them to remove even minor features.

I kind of agree that the Firefox API restrictions, introduced first in the Add-on SDK and now in WebExtensions, do not benefit anyone (except Chrome). Unlike Nils, I chose to try to work with what is there. If those limitations lead to a product that end users won't accept, VDH will follow the same path as DTA.
(In reply to Michel Gutierrez from comment #6)

> 1. generating IDBMutableFile objects works well, except that 
> you easily run into a quota error, generally around 2GB. 
> I was under the impression (i read that somewhere) that the 
> limitation was half of the available disk space, 
> but this doesn't seem to be the case.

Thanks Michel for testing this example so thoroughly. I think the information you gathered about the quota limits (and about the visibility of an ongoing download in the regular "downloads dialog/window") is very useful for moving this in the right direction.

I'll take a look at where these "quota limits" are actually enforced and try to engage the right people in a conversation about whether and how we can bend these limits for an "extension" scenario (or where we can apply some changes to make it possible in the future).

> The blob URL created by the jszip can be downloaded through 
> the browser.downloads API. However this is limited to something 
> around 1GB before getting a QuotaExceeded error (and it's probably 
> not a limitation of my machine that has 128GB of RAM).

Is the QuotaExceeded error producing any useful stack trace in the Browser Console?
 
> 2. i haven't found a way to get an URL to the generated IDBMutableFile
> objects. URL.createObjectURL fails since the object does not have a blob
> interface. I presume there is way to make it work since this is what the
> jszip library you use in your sample code does: build a file that can then
> be downloaded (at least from a <a href="...">). I'll check whether the blob
> url can be used in the browser.downloads API.

To be able to use the MutableFile as a Blob, you need to retrieve a regular File instance from the mutable file. Here is where and how it is done in the example:

- https://github.com/rpl/webextensions-examples/blob/example/indexeddb-fs/indexeddb-files/tabs/indexeddb-filehandle.js#L178-L185

Once you get the File instance from the request's success handler, you can use it as a Blob instance (in the example I currently use it in the `reader.readAsText(fileRequest.result);` step, but you can also convert it into a blob URL).
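
As a rough sketch of the step described above, assuming the Firefox-specific `IDBMutableFile.getFile()` request API used in the example; the function name `mutableFileToBlobUrl` and the injectable `urlApi` parameter are hypothetical:

```javascript
// Hypothetical helper: take a read-only File snapshot of a mutable file and
// turn it into a blob: URL.
function mutableFileToBlobUrl(mutableFile, onUrl, onError, urlApi = globalThis.URL) {
  // getFile() returns a request object; the File arrives in its success handler.
  const fileRequest = mutableFile.getFile();

  fileRequest.onsuccess = () => {
    // fileRequest.result is a regular File, which is also a Blob.
    onUrl(urlApi.createObjectURL(fileRequest.result));
  };
  fileRequest.onerror = () => onError(fileRequest.error);

  return fileRequest;
}
```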
Thanks.

(In reply to Luca Greco [:rpl] from comment #9)
> I'll take a look to where these "quota limits" are actually enforced and try
> to engage the right people into a conversation related to if and how we can
> bend these limits for an "extension" scenario (or where we can actually
> apply some changes to make it possible in the future).

Some more about those quota issues:

- When writing an IDBMutableFile, the quota error happens at exactly 2GB (with the message "The current file handle exceeded its quota limitations.").
- Any further file write attempt (even of small files) generates the same error.
- Clearing the object store does not reset the quota: writing small files still fails. Only restarting the browser (or possibly waiting some time?) resets the storage.

I pushed some experiments based on a fork of your code to GitHub: https://github.com/mi-g/webextensions-examples (branch example/indexeddb-fs). There is a fourth option in the add-on menu to generate a file.

> is the QuotaExceeded error producing any useful stacktrace in the Browser
> Console?

Unfortunately not.

> To be able to use the MutableFile as a Blob, you need to retrieve a regular
> File instance from the mutable file, here is where/how it is done in the
> example:
> 
> -
> https://github.com/rpl/webextensions-examples/blob/example/indexeddb-fs/
> indexeddb-files/tabs/indexeddb-filehandle.js#L178-L185
> 
> once you get the File instance from the request.success handler, you can use
> it as a Blob instance (in the example I currently use it in the
> `reader.readAsText(fileRequest.result);` step, but you can use it to convert
> it into a blob url as well).

I am definitely going to investigate this.
I implemented the MutableFile-to-blob code, getting a URL like blob:moz-extension://b32df641-5f41-430f-a932-0afa7d5283d4/6b56f6fe-5a41-458d-a08e-e14696042dd8 from URL.createObjectURL().

But if I try to download this URL (through browser.downloads), it fails with the error: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIAnnotationService.setPageAnnotation]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame :: resource://app/modules/DownloadsCommon.jsm :: onDownloadChanged :: line 801"  data: no]

Also, if I copy and paste the link into a new tab, the tab crashes: "Gah. Your tab just crashed."

I can, however, read the resulting Blob as an array buffer with FileReader.readAsArrayBuffer.
Hi Michel,

I looked into the "Quota limits"-side of the issues discussed above, and I've collected some useful links to the MDN docs and updated the webextension example accordingly.

Some Firefox-specific bits of the IndexedDB Storage limits behavior are described in the following MDN doc pages: 
  * https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API/Browser_storage_limits_and_eviction_criteria#Firefox_specifics
  * https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API#Storage_limits_and_eviction_criteria

In the updated version of the WebExtension example, the IndexedDB should now be opened with higher quota limits, available by requesting a "persistent" storage (besides other changes based on the ongoing review):
  - https://github.com/mdn/webextensions-examples/pull/171/commits/6c066cfff4e8c662984f704cb17c8b39211ed062

With the updated example (which uses "persistent" as the storage type), opening the db triggers a permission popup from which the user can allow or deny persistent storage for the origin. The dialog currently shows the moz-extension URL, which is not very helpful, so we should track and fix the dialog message in a separate Bugzilla issue, e.g. by changing it to show the add-on name and/or by discussing implementing the unlimitedStorage optional permission.

Once the user allows the "persistent" storage, the IndexedDB files are allowed to grow bigger than 2GB (e.g. I successfully uploaded a local directory with a 4GB file into the IndexedDB object store).
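
For reference, a minimal sketch of opening a database this way, assuming the Firefox-specific, non-standard options form of `indexedDB.open()` with `storage: "persistent"` that the updated example relies on. The helper name `openPersistentDb`, the object store name `"files"`, and the injectable `idb` parameter are hypothetical:

```javascript
// Hypothetical helper: open an IndexedDB database with Firefox's non-standard
// `storage: "persistent"` option, which triggers the permission prompt.
function openPersistentDb(name, onOpen, onError, idb = globalThis.indexedDB) {
  const request = idb.open(name, { version: 1, storage: "persistent" });

  request.onupgradeneeded = () => {
    // Create the object store on first open; "files" is an illustrative name.
    request.result.createObjectStore("files");
  };
  request.onsuccess = () => onOpen(request.result);
  request.onerror = () => onError(request.error);

  return request;
}
```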

Let me know how it goes once you give it a try.
Thanks Luca.

Indeed, opening the DB as persistent allowed me to create a 20GB file, so I presume using this mode clears the quota issue.

In VDH, as in probably most add-ons, the file download/manipulation is performed in the background and not in a content window. I tried opening the DB as persistent from the background side of the add-on and it did not work (no confirmation UI appeared). Once that dialog has been shown and accepted, the background can open the DB.

For this reason, and because showing this dialog message from the add-on would be a terrible UX choice that confuses users, a permission should be added to the manifest so that users are notified only once, at install time, that the add-on requests unlimited storage.

So, at this time the issues are:

- There is no way (that I could test) to bring a generated virtual file to the real filesystem (downloading a file-to-blob URL fails).
- The manifest should support a permission to use IndexedDB without quota restrictions.
- Fixing the previous items would almost certainly allow VDH to work, but from a usability point of view, controlling the file writing through the download manager API would be much better.

(I updated my fork of the webextensions examples.)
This bug has turned into a discussion of a couple of areas. Luca will narrow this bug down to one API and open bugs for the other parts of the discussion.
Assignee: nobody → lgreco
webextensions: --- → +
Priority: -- → P2
Whiteboard: [downloads] → [downloads]triaged
Blocks: 1310316
Summary: WebExtensions: writing files from an add-on → Add streaming support to downloads
Assignee: lgreco → kmaglione+bmo
Whiteboard: [downloads]triaged → [downloads][outreach][awe: firefox@mega.co.nz], triaged
webextensions: + → ---
Flags: needinfo?(michel.gutierrez)
Product: Toolkit → WebExtensions
Assignee: kmaglione+bmo → nobody
Priority: P2 → --
Priority: -- → P3

This will probably depend on the Streams API (bug 1503319), specifically WritableStream (bug 1474543).
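
To illustrate what the dependency on `WritableStream` implies, here is a minimal sketch of chunk-by-chunk writing with the standard Streams API. This bug does not define a streaming downloads API; the in-memory sink and the names `createCountingSink` and `writeChunks` are hypothetical stand-ins for whatever destination such an API might eventually expose.

```javascript
// Hypothetical underlying sink that only counts bytes; a real destination
// would write each chunk to disk instead.
function createCountingSink() {
  const state = { bytesWritten: 0, closed: false };
  const sink = {
    write(chunk) {
      state.bytesWritten += chunk.byteLength;
    },
    close() {
      state.closed = true;
    },
  };
  return { sink, state };
}

// Write chunks one at a time so the whole file never has to sit in memory.
async function writeChunks(writable, chunks) {
  const writer = writable.getWriter();
  for (const chunk of chunks) {
    await writer.write(chunk); // resolves after the sink has accepted the chunk
  }
  await writer.close();
}

// Usage (in an environment that provides WritableStream):
//   const { sink, state } = createCountingSink();
//   await writeChunks(new WritableStream(sink), [new Uint8Array(1024)]);
//   // state.bytesWritten is now 1024
```

Awaiting each `write()` applies backpressure: the producer does not fetch or generate the next chunk until the previous one has been handed to the sink.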

Depends on: streams-meta, 1474543
Whiteboard: [downloads][outreach][awe: firefox@mega.co.nz], triaged → [downloads][outreach][awe: firefox@mega.co.nz], triaged
Type: defect → enhancement
Depends on: 1749547
No longer depends on: 1474543
Severity: normal → S3