Closed Bug 1607791 Opened 5 years ago Closed 5 years ago

Get rid of IDBMutableFile.getFile()

Categories: Core :: Storage: IndexedDB, task, P2
Status: RESOLVED FIXED in mozilla74 (firefox74: fixed)
People: Reporter: baku; Assigned: baku
Keywords: site-compat
Attachments: 4 files

There are several reasons to remove this method:

  1. It's part of a non-standard API (FileHandle) that no other browser vendor supports.
  2. Telemetry shows it's used by 0% of websites: https://sql.telemetry.mozilla.org/queries/67436/source https://sql.telemetry.mozilla.org/queries/67432/source
  3. The exposed BlobImpl is not thread-safe. It's the only non-thread-safe BlobImpl class in our code base, and supporting it requires a lot of extra code. By removing this BlobImpl we can get rid of SameProcessSameThread, PPendingIPCBlob, and a few other things.
Assignee: nobody → amarchesini
Status: NEW → ASSIGNED
Priority: -- → P2

I slightly modified baku's query, and it seems there are only 548 uses of the getFile method (unique per client) over the past 6 months across all channels. That is really negligible, so we could just drop the method without deprecating it first. What do you think, Andrew?

Flags: needinfo?(bugmail)

I agree with dropping it immediately.

Flags: needinfo?(bugmail)
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/75bc7533c899
Get rid of IDBMutableFile.getFile() - part 1, r=asuth,janv
https://hg.mozilla.org/integration/autoland/rev/80707bcc8427
Get rid of IDBMutableFile.getFile() - part 2 - Remove SameProcessSameThread, r=asuth,sfink
https://hg.mozilla.org/integration/autoland/rev/4c92f506cf62
Get rid of IDBMutableFile.getFile() - part 3 - Rename SameProcessDifferentThread to SameProcess, r=asuth,sfink
https://hg.mozilla.org/integration/autoland/rev/f15835338319
Get rid of IDBMutableFile.getFile() - part 4 - Remove a few tests, r=janv
Backout by malexandru@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0bb7d3150587
Backed out 9 changesets (bug 1607791, bug 1605566) for causing multiple wpt failures.
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cf0bb4ff99e0
Get rid of IDBMutableFile.getFile() - part 1, r=asuth,janv
https://hg.mozilla.org/integration/autoland/rev/070db3fe8f75
Get rid of IDBMutableFile.getFile() - part 2 - Remove SameProcessSameThread, r=asuth,sfink
https://hg.mozilla.org/integration/autoland/rev/139f30e7bbc9
Get rid of IDBMutableFile.getFile() - part 3 - Rename SameProcessDifferentThread to SameProcess, r=asuth,sfink
https://hg.mozilla.org/integration/autoland/rev/70891d71f1f9
Get rid of IDBMutableFile.getFile() - part 4 - Remove a few tests, r=janv
Flags: needinfo?(amarchesini)

Hi, I have an extension that depends on this method. Do you know of any alternative to getFile() for retrieving a large binary IDBMutableFile for download? I've thought about reading 100 MB chunks into normal IndexedDB Files and then concatenating them, but that requires another copy of the storage.

Also, may I ask how long the IDBMutableFile API will last? I was hoping it would last until the writable-files (native-file-system) API is implemented, so it would remain a high-performance alternative. But if you plan to remove IDBMutableFile soon, I'll need to find another approach.

Thank you.

Flags: needinfo?(amarchesini)

> Also, may I ask how long the IDBMutableFile API will last? I was hoping it would last until the writable-files (native-file-system) API is implemented, so it would remain a high-performance alternative. But if you plan to remove IDBMutableFile soon, I'll need to find another approach.

We are planning to remove FileHandle (IDBMutableFile) soon-ish. But to be honest, we don't have any plans to implement a writable Files API.
Why are you actually using Files instead of IDB?

Flags: needinfo?(amarchesini)

(In reply to jingyu9575 from comment #12)

> Hi, I have an extension that depends on this method. Do you know of any alternative to getFile() for retrieving a large binary IDBMutableFile for download? I've thought about reading 100 MB chunks into normal IndexedDB Files and then concatenating them, but that requires another copy of the storage.

If your goal is to download and assemble a large download in pieces, you can store the pieces in smaller individual chunks in IndexedDB and only create a multipart blob made up of those Blobs when exposing the Blob for download.

That is, imagine a, b, c, and d are all 20 MB Blobs previously put() into IDB and since retrieved via get()/getAll(). You can create the multipart blob aggregate like so: let aggrBlob = new Blob([a, b, c, d], { type: 'application/octet-stream' }). There's no need to put() aggrBlob back into IndexedDB; you can then do URL.createObjectURL(aggrBlob), and the download will be performed by consuming the contents of the IDB disk-backed Blobs in sequence.
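The multipart-Blob approach above can be sketched as follows. This is a minimal self-contained sketch: the chunk Blobs are built in memory here, whereas in the real extension they would be the disk-backed Blobs retrieved from IndexedDB via get()/getAll().

```javascript
// Sketch of the multipart-Blob assembly described above.
// In a real extension these chunks would come from IndexedDB;
// here they are small in-memory Blobs for illustration.
const chunks = [
  new Blob(["part-1 "]),
  new Blob(["part-2 "]),
  new Blob(["part-3"]),
];

// The multipart Blob references the underlying chunks; no data is
// copied, and nothing needs to be put() back into IndexedDB.
const aggrBlob = new Blob(chunks, { type: "application/octet-stream" });

console.log(aggrBlob.size); // sum of the chunk sizes
console.log(aggrBlob.type);

// A download can then be triggered from an object URL:
// const url = URL.createObjectURL(aggrBlob);
```

Because the aggregate Blob only holds references to its parts, this stays cheap even when the chunks are large and disk-backed.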

Thank you both for the information.

(In reply to Andrea Marchesini [:baku] from comment #13)

> Why are you actually using Files instead of IDB?

The extension is for downloading files with multiple connections. In the end, the whole file has to be saved with browser.downloads.download, so I need the result to be a File instance.

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #14)

> That is, imagine a, b, c, and d are all 20 MB Blobs previously put() into IDB and since retrieved via get()/getAll(). You can create the multipart blob aggregate like so: let aggrBlob = new Blob([a, b, c, d], { type: 'application/octet-stream' }).

I've considered this as a fallback if there are no other alternatives. The problem is the segment size.

Let's say each blob is 20 MB. If the user closes the browser after 19 MB has been downloaded, that data is lost, and it may take minutes to download again. However, if a small blob size is chosen (e.g. 3 seconds' worth of download), the result may be tens of thousands of segments, which caused I/O performance to degrade in my tests.

With IDBMutableFile, it is possible to write very small segments with good performance. With standard Blobs, it is difficult to avoid both the performance problems and the data loss. I've thought about workarounds such as writing recovery data every second or combining existing segments periodically, but I cannot find an ideal solution.

It sounds like you have a good handle on the various factors. The one thing not discussed here is the HTTP cache. "browser.cache.disk.max_entry_size" per https://developer.mozilla.org/en-US/docs/Mozilla/Preferences/Mozilla_networking_preferences#Cache is "The maximum size of an entry in the disk cache."

Based on previous investigations, if you attempt to fetch something larger than that size from the network, its cache entry will immediately be marked for discarding as soon as the request has begun. So as long as your download size is smaller than the preference's value (and smaller than 1/8th of the disk cache limit, as also covered in that document), then it should be theoretically possible for the download to be resumed. (Byte range requests and downloads are more than a little complicated and I do not pretend to be an expert.)

It does sound like sizing the chunk size based on the expected/experienced average bandwidth may be the way to go.
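The bandwidth-based sizing suggested above could look something like this. The helper name and the specific target/clamp values are hypothetical, not anything from this bug: the idea is simply to aim for a fixed number of seconds of download per chunk, clamped so chunks are neither tiny (tens of thousands of IDB records) nor huge (minutes of lost work if the browser closes mid-chunk).

```javascript
// Hypothetical helper: pick a chunk size from the measured bandwidth.
// Aim for ~targetSeconds of download per chunk, clamped to a sane range.
function chooseChunkSize(bytesPerSecond, targetSeconds = 10,
                         minBytes = 1 << 20,    // 1 MiB floor
                         maxBytes = 64 << 20) { // 64 MiB ceiling
  const ideal = bytesPerSecond * targetSeconds;
  return Math.min(maxBytes, Math.max(minBytes, Math.round(ideal)));
}

// e.g. a ~2 MiB/s connection yields ~20 MiB chunks:
console.log(chooseChunkSize(2 * 1024 * 1024));
```

The bandwidth estimate itself could be a moving average over the last few completed chunks, so the chunk size adapts as the connection speed changes.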

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #16)

> It does sound like sizing the chunk size based on the expected/experienced average bandwidth may be the way to go.

Thank you, I think so. I will try to tune the parameters and see if the performance is sufficient.

I'm indeed using byte range requests, which will not be cached.
