Bug 1365171 (Closed): Opened 7 years ago, Closed 7 years ago

web worker data compression should preserve Content-Encoding when possible

Categories

(Core :: DOM: Workers, defect)

Version: 53 Branch
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED DUPLICATE of bug 1132041

People

(Reporter: v+mozbug, Unassigned)

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0
Build ID: 20170504105526

Steps to reproduce:

This is an enhancement request regarding compression. I researched compression some time back and decided to encode all my static files with brotli, because its output is small and fast to decompress.
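
For concreteness, here is a minimal Node.js sketch of pre-compressing a static asset with brotli at build time; the file names are illustrative, not my actual setup:

    import { brotliCompressSync, constants } from "node:zlib";
    import { readFileSync, writeFileSync } from "node:fs";

    const src = readFileSync("dist/app.js");
    const compressed = brotliCompressSync(src, {
      // Slowest/smallest setting; the cost is paid once at build time.
      params: { [constants.BROTLI_PARAM_QUALITY]: 11 },
    });
    writeFileSync("dist/app.js.br", compressed);
    // The server then answers with "Content-Encoding: br" for clients
    // that send "Accept-Encoding: br".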


Actual results:

However, in planning a web worker for offline data, I found somewhere that Firefox decompresses the data from the wire and then recompresses it with snappy, which is 1.8 times bigger for my sample data. If I misunderstand that, just point me to the proper documentation; I don't recall where I found that information, and I may have misinterpreted it. I don't find any explicit compression handling in the web worker API, which surprised me and prompted the search that turned up snappy.


Expected results:

I understand that brotli compression is slower than snappy compression, but if handed compressed data, a web worker should have the option to store it as-is, rather than decompress and recompress it, since doing both clearly takes longer and consumes more storage. Even gzip, much more commonly transferred on the wire than brotli, has a better compression ratio than snappy. Snappy is a good choice for its compression speed when the data starts out uncompressed, but preserving the better-compressed gzip or brotli data would be a good win for reducing web worker data storage.
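
To make the request concrete, a purely hypothetical sketch of the kind of opt-in I have in mind; no "storeEncoding" option exists in the Cache API, and the URL is made up:

    self.addEventListener("install", (event: any) => {
      event.waitUntil((async () => {
        const cache = await caches.open("static-v1");
        const response = await fetch("/data/catalog.json"); // brotli on the wire
        // Imagined opt-in to keep the wire bytes instead of the
        // decompress-then-snappy round trip:
        //   await cache.put("/data/catalog.json", response, { storeEncoding: "preserve" });
        await cache.put("/data/catalog.json", response); // today: decompressed, then snappy'd
      })());
    });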
Component: Untriaged → DOM: Workers
Product: Firefox → Core
Are you talking about the Cache API introduced with service workers?  We do recompress with snappy there.  We also use snappy in IDB.
Flags: needinfo?(v+mozbug)
I am, indeed, referring to the Cache API for service workers. Perhaps my remarks would be applicable to IDB as well, but I've not investigated that interface.

My perception (possibly due to ignorance) is that IDB is best used for locally generated data, which is unlikely to be pre-compressed, and snappy, having a fast compression algorithm, can be used by default to save some space. That seems like a good default tradeoff. Whether the interface should be enhanced to allow other compression algorithms (including NONE), based on the application's knowledge of the type and size of the data (possibly pre-compressed, more likely not) and its expected future reference patterns, is not clear to me; only the application can judge whether spending more time on compression to save more space is worthwhile.

However, for the service worker Cache, when data is received from a server with a compressed Content-Encoding, it would seem extremely appropriate to have a way to preserve that encoding: it was chosen by the origin server, and keeping it costs nothing, because the data is already compressed. When populating a cache for a single-page (or more complex) application with a sizable data store, both time and space could be saved during that initial load by avoiding the decompress and recompress steps. Since gzip and brotli both decompress quickly, and the server has already done the compression, the cost of their slower compression algorithms would be avoided entirely, and the additional space savings would be significant.
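
One workaround under current behavior, assuming DecompressionStream is available in the worker (it handles gzip and deflate, not brotli), is to serve the compressed bytes without a Content-Encoding header so the network stack never decodes them, cache them as-is, and decompress at read time. A sketch, with illustrative names:

    async function readCachedGzip(url: string): Promise<Response> {
      const cache = await caches.open("compressed-v1");
      let hit = await cache.match(url);
      if (!hit) {
        // e.g. "/data/catalog.json.gz" served as plain bytes; nothing to strip.
        const fetched = await fetch(url);
        await cache.put(url, fetched.clone()); // stored still-compressed
        hit = fetched;
      }
      // Unwrap at read time instead of at store time.
      return new Response(hit.body!.pipeThrough(new DecompressionStream("gzip")));
    }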

Using snappy by default for initially uncompressed data seems reasonable, but allowing the application to choose the compression algorithm for such data (including NONE) would be even more beneficial for the service worker Cache than for IDB, as I would expect the data volumes to be larger. Slower algorithms, if chosen by the application, would run in background service workers without significantly affecting interactive performance, and could yield a better space/time tradeoff than using snappy for everything.
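
For locally generated data, an application can already approximate this choice with CompressionStream before storing, assuming it is available in the worker; cache name and header are illustrative:

    async function storeGenerated(url: string, text: string): Promise<void> {
      const cache = await caches.open("generated-v1");
      // Application-chosen algorithm; gzip here, and NONE would simply skip the pipe.
      const gz = new Blob([text]).stream().pipeThrough(new CompressionStream("gzip"));
      // Record the encoding so the read side knows to mirror this with
      // DecompressionStream("gzip").
      await cache.put(url, new Response(gz, { headers: { "x-app-encoding": "gzip" } }));
    }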

While I wouldn't expect the whole gamut of compression algorithms available in the wild to be implemented in every browser, it seems foolish not to support the few the browser already implements, especially when the server supplies the data in pre-compressed form.
Flags: needinfo?(v+mozbug)
Ok, this is a dupe of bug 1132041 then.

I agree preserving the wire compression would be ideal, but it's complicated by the fact that Gecko's network stack strips it before the Cache API sees it.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE