Open Bug 1898785 Opened 4 months ago Updated 3 months ago

Staging folder compression step with nsIZipWriter janks the parent process main thread

Categories: Firefox :: Profile Backup, defect, P3
People: Reporter: mconley, Unassigned
References: Blocks 1 open bug
Keywords: perf
Whiteboard: [fidefe-device-migration]

After cloning the various data stores in the user's profile directory, we compress the resulting files into a single compressed file via nsIZipWriter:

https://searchfox.org/mozilla-central/rev/0529464f0d2981347ef581f7521ace8b7af7f7ac/browser/components/backup/BackupService.sys.mjs#451-499

Despite using the background queue, it looks like deflate is still happening on the main thread: https://share.firefox.dev/44VdyTu

Are we using nsIZipWriter wrong? Is there a way we can do this deflation off of the main thread?

Setting to Severity N/A for now because this feature isn't enabled by default yet - but I imagine this would block any attempt to turn it on by default.

Flags: needinfo?(dtownsend)

(In reply to Mike Conley (:mconley) (:⚙️) from comment #0)

> After cloning the various data stores in the user's profile directory, we compress the resulting files into a single compressed file via nsIZipWriter:
>
> https://searchfox.org/mozilla-central/rev/0529464f0d2981347ef581f7521ace8b7af7f7ac/browser/components/backup/BackupService.sys.mjs#451-499
>
> Despite using the background queue, it looks like deflate is still happening on the main thread: https://share.firefox.dev/44VdyTu
>
> Are we using nsIZipWriter wrong? Is there a way we can do this deflation off of the main thread?

Not easily, currently. The background queue was really built with the goal of avoiding main-thread disk I/O, assuming that the cost of compression would be relatively small.

The deflation all happens here, which is called on the main thread. One could imagine instead pushing the data to a background thread to do the work, though since it may get called again while that is happening, you have to figure out queueing the data and other fun multi-threaded things.
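The queueing concern mentioned here can be sketched with a promise chain that serializes hand-offs to an asynchronous compressor. This is an illustrative pattern only — `CompressionQueue` and `compressFn` are hypothetical names, not Gecko APIs:

```javascript
// Serialize compression jobs so a new chunk arriving while a previous one
// is still being compressed elsewhere cannot reorder or interleave output.
class CompressionQueue {
  constructor(compressFn) {
    this.compressFn = compressFn; // async worker, e.g. posts to another thread
    this.tail = Promise.resolve();
    this.results = [];
  }
  push(chunk) {
    // Chain each job after the previous one so output order matches input order.
    this.tail = this.tail.then(async () => {
      this.results.push(await this.compressFn(chunk));
    });
    return this.tail;
  }
  async done() {
    await this.tail;
    return this.results;
  }
}
```

The real work would live behind `compressFn` (e.g. a worker round-trip); the chain only guarantees ordering.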

Flags: needinfo?(dtownsend)

The severity field is not set for this bug.
:mconley, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(mconley)

Severity is set at N/A because this feature isn't enabled yet. I guess that's not enough to keep the bot happy, so I'll set it to S4.

Severity: N/A → S4
Flags: needinfo?(mconley)

So here's a thought: I believe a CompressionStream would let me compress a single stream of bytes off of the main thread. What if I use the nsIZipWriter at COMPRESSION_NONE to archive all of the various backup files into a single uncompressed file, and then run it through a CompressionStream before sending it down to be written into the single-file archive? So I'd basically be using nsIZipWriter to just... tar the files up. It'd be a .zip.gzip archive. Kinda hilarious.
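The CompressionStream half of that idea can be sketched with only web-standard APIs (the zip-at-COMPRESSION_NONE step is elided, and `gzipBytes` is just an illustrative name):

```javascript
// Gzip a buffer through CompressionStream — the same API that is available
// in workers, so in principle the deflate work can run off the main thread.
async function gzipBytes(bytes) {
  const compressed = new Blob([bytes])
    .stream()
    .pipeThrough(new CompressionStream("gzip"));
  const chunks = [];
  for await (const chunk of compressed) {
    chunks.push(chunk);
  }
  // Stitch the output chunks back into one Uint8Array.
  const out = new Uint8Array(chunks.reduce((n, c) => n + c.length, 0));
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}
```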

Alternatively, we could modify nsIZipWriter to deflate off of the main thread, but that sounds pretty annoying and error-prone.

We could also potentially introduce a new off-main-thread zipping mechanism. It looks like we already vendor in the zip Rust crate: https://searchfox.org/mozilla-central/rev/4c8627a76e2e0a9b49c2b673424da478e08715ad/Cargo.toml#190-191. This would, however, likely make backporting the backup mechanism to 115 more challenging.

(In reply to Mike Conley (:mconley) (:⚙️) from comment #4)

> So here's a thought: I believe a CompressionStream would let me compress a single stream of bytes off of the main thread. What if I use the nsIZipWriter at COMPRESSION_NONE to archive all of the various backup files into a single uncompressed file, and then run it through a CompressionStream before sending it down to be written into the single-file archive? So I'd basically be using nsIZipWriter to just... tar the files up. It'd be a .zip.gzip archive. Kinda hilarious.

Ugly but yes that would work.

> Alternatively, we could modify nsIZipWriter to deflate off of the main thread, but that sounds pretty annoying and error-prone.
>
> We could also potentially introduce a new off-main-thread zipping mechanism. It looks like we already vendor in the zip Rust crate: https://searchfox.org/mozilla-central/rev/4c8627a76e2e0a9b49c2b673424da478e08715ad/Cargo.toml#190-191. This would, however, likely make backporting the backup mechanism to 115 more challenging.

Neither of these options sounds compatible with uplifting to 115, and the zip crate isn't async anyway so you'd still have to figure out threading too.

There is another option that I think would work. Add a new method to nsIZipWriter, something like addCompressedFile. It should behave like addEntryFile when COMPRESSION_NONE is passed, but set the zip entry header's method field to ZIP_METHOD_DEFLATE. Then, for the files you give it, pass their data through a CompressionStream first, I think using the deflate-raw format.

Essentially do the compression for each file yourself, then have nsIZipWriter just store that data but mark it as compressed so it is handled correctly when extracting.

Sort of similar to your first suggestion, but you end up with a normal zip file.

I guess the other question is: are you actually getting much compression here? Not sure how well SQLite, for instance, compresses, and I imagine that is the bulk of the data.

It's a good question. This is anecdotal, but I think illustrative: for my personal everyday profile, my staging folder (the set of backed-up files) is ~140MB. Compressed down by ZIP deflate, that comes down to ~94.5MB. So it's not nothing - down by almost a third.

(And that's particularly helpful when we end up bulking it up again when base64 encoding it)
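For concreteness, the arithmetic behind those two remarks (treating MB loosely, and assuming base64's standard 4-output-bytes-per-3-input-bytes overhead):

```javascript
// Back-of-envelope from the figures in this comment.
const stagingMB = 140;   // uncompressed staging folder
const deflatedMB = 94.5; // after ZIP deflate
const savings = 1 - deflatedMB / stagingMB; // 0.325, i.e. almost a third

// base64 encodes 3 input bytes as 4 output bytes, so each variant grows by 4/3:
const base64UncompressedMB = stagingMB * 4 / 3;  // ~186.7 MB
const base64DeflatedMB = deflatedMB * 4 / 3;     // 126 MB
```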

> There is another option that I think would work. Add a new method to nsIZipWriter, something like addCompressedFile. It should behave like addEntryFile when COMPRESSION_NONE is passed, but set the zip entry header's method field to ZIP_METHOD_DEFLATE. Then, for the files you give it, pass their data through a CompressionStream first, I think using the deflate-raw format.

I really like this idea, by the way. I'm going to try to prototype this to see how difficult it would be to implement.
