Closed Bug 1301878 (zstd) Opened 8 years ago Closed 6 months ago

[meta] Implement support for Zstandard (zstd)

Categories

(Core :: Networking: HTTP, task, P2)

task

Tracking

()

RESOLVED FIXED

People

(Reporter: Virtual, Unassigned, NeedInfo)

References

(Depends on 1 open bug, )

Details

(Keywords: feature, meta, nightly-community, Whiteboard: [necko-triaged])

Attachments

(1 obsolete file)

Has Regression Range: --- → irrelevant
Has STR: --- → irrelevant
This bug is quite generic. Support for zstd for what? Possible replacement where brotli is currently used? (woff, HTTP compression) Is there W3C/WhatWG/IETF work underway in that direction? Relatedly, are the patent licensing terms from zstd (or anything else from Facebook, really) considered problematic? (https://github.com/facebook/zstd/blob/dev/PATENTS https://github.com/facebook/zstd/issues/335)
Flags: needinfo?(gerv)
(In reply to Mike Hommey [:glandium] (VAC: Apr 20-May 4) from comment #1)
> Relatedly, are the patent licensing terms from zstd (or anything else from
> Facebook, really) considered problematic?
> (https://github.com/facebook/zstd/blob/dev/PATENTS
> https://github.com/facebook/zstd/issues/335)

ellee knows the answer to this question :-)

Gerv
Flags: needinfo?(gerv) → needinfo?(ellee)
Flags: needinfo?(ellee)
Mike, I'll reach out to you directly. Thanks!
Elvin, I never heard back from you. I got reminded of the issue by this HN thread: https://news.ycombinator.com/item?id=14779881 , which also reminded me that the same patent license applies to react, which we already ship (and I'm not sure this issue was ever considered when react was added to the Firefox code)
Flags: needinfo?(ellee)
Sorry -- I sent you a note via e-mail. The ASF decision you referenced is helpful to know, and I'm discussing it with some other folks.
Flags: needinfo?(ellee)
Depends on: 1477516
Blocks: 1477516
No longer depends on: 1477516
Type: defect → task

Hello. I've investigated RFC 8478 a bit.

3.1.1.1.3. Dictionary_ID:
However, for frames and dictionaries distributed in public space, Dictionary_ID must be attributed carefully. The following ranges are reserved for use only with dictionaries that have been registered with IANA (see Section 6.3).

6.3. Dictionaries:
However, there are at present no such dictionaries published for public use, so this document makes no immediate request of IANA to create such a registry.

Brotli RFC 7932 includes a single public dictionary optimized for web content. Zstd RFC looks like a joke.

I did a web search and found almost nothing. Eight months have passed and nobody is going to provide these dictionaries. I can't understand how anyone can implement zstd bindings for Firefox without a clear dictionary registry. Zstd will never destroy brotli without effective dictionaries.

Please let me know if I am wrong. Thank you.

Even if zstd doesn't destroy brotli for static content, it can definitely seriously outperform all others for dynamic content. A dynamically generated php (or whatever dynamic) page can be compressed with zstd instead of deflate while transmitting to the browser. Brotli has really good performance decompressing but it uses a LOT of resources to compress. Zstd can get close-enough compression results with only a small fraction of the resources to compress while using about the same resources to decompress as brotli.

Do you think this is not reason enough?

Besides, the RFC does state there's still studying needed before optimized dictionaries are distributed... I just guess they are still working on them, as written.

Cloudflare was already using zstd successfully back in 2018, and zstd has seen many improvements since.

Since the middle of this year, I have strongly believed that (given the current selection) zstd will become the "new" de facto standard for compression, replacing deflate.

IMO: Lz4 is the quick-win compression (a role "stolen" from deflate), LZMA is compression at all costs, brotli is the static content compression and zstd is the "balanced" compression (the role deflate has now).

Cloudflare's brotli study concluded that, for streaming, brotli is even worse than zlib/gzip/deflate, so they dropped the idea.

Flags: needinfo?(aladjev.andrew)

Even if zstd doesn't destroy brotli for static content, it can definitely seriously outperform all others for dynamic content.

Hello. Yes, maybe, but I think that integration is not really possible.

Please imagine that we are trying to implement zstd support in Firefox. A web browser is a special application in terms of compatibility. Suppose we released Firefox v75 with zstd support (but without dictionary support) in 2019. That browser would need to be able to decompress any zstd content served between 2019 and 2030. The RFC contains the following text:

Frame header: ..., Dictionary_id
This is a variable size field, which contains the ID of the dictionary required to properly decode the frame.

Suppose Facebook provides a dictionary ecosystem in 2023-2024 (just my assumption). Web servers will then produce compressed content with Dictionary_ID fields set, and our Firefox v75 released in 2019 won't be able to decompress it. So I think that Firefox and other web browsers should wait until a dictionary system appears.
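
To make the concern concrete, here is a minimal sketch of how a decoder could read the Dictionary_ID field from a frame header before deciding whether it can decode the frame. It is based on my reading of the frame header layout in RFC 8478 (magic number, Frame_Header_Descriptor, optional Window_Descriptor, optional Dictionary_ID); the function name `read_dictionary_id` is hypothetical and is not part of libzstd or Firefox.

```python
import struct

# Magic number of a Zstandard frame; it is stored little-endian on the wire.
ZSTD_MAGIC = 0xFD2FB528


def read_dictionary_id(frame: bytes):
    """Return the Dictionary_ID of a zstd frame, or None if the field is absent.

    Hypothetical helper for illustration only; it inspects the header and
    does not decompress anything.
    """
    if len(frame) < 5:
        raise ValueError("frame too short for a zstd header")
    (magic,) = struct.unpack_from("<I", frame, 0)
    if magic != ZSTD_MAGIC:
        raise ValueError("not a zstd frame")

    descriptor = frame[4]
    single_segment = (descriptor >> 5) & 1   # bit 5: Single_Segment_Flag
    did_flag = descriptor & 0b11             # bits 1-0: Dictionary_ID_Flag
    did_size = (0, 1, 2, 4)[did_flag]        # field size in bytes

    # Window_Descriptor (1 byte) is present only when Single_Segment_Flag is 0.
    offset = 5 + (0 if single_segment else 1)
    if did_size == 0:
        return None
    field = frame[offset:offset + did_size]
    if len(field) != did_size:
        raise ValueError("truncated frame header")
    return int.from_bytes(field, "little")
```

A decoder without the referenced dictionary could use such a check to fail early instead of producing garbage output.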

Flags: needinfo?(aladjev.andrew)

It's not immediately obvious why 2020's Firefox would need to be able to decompress zstd-encoded responses from 2030. But maybe this is not a bad moment for a discussion about Accept-Encoding and how to handle dictionary variants of zstd.

(In reply to Andrew from comment #9)

Even if zstd doesn't destroy brotli for static content, it can definitely seriously outperform all others for dynamic content.

Hello. Yes, maybe, but I think that integration is not really possible.

Please imagine that we are trying to implement zstd support in Firefox. A web browser is a special application in terms of compatibility. Suppose we released Firefox v75 with zstd support (but without dictionary support) in 2019. That browser would need to be able to decompress any zstd content served between 2019 and 2030. The RFC contains the following text:

Frame header: ..., Dictionary_id
This is a variable size field, which contains the ID of the dictionary required to properly decode the frame.

Suppose Facebook provides a dictionary ecosystem in 2023-2024 (just my assumption). Web servers will then produce compressed content with Dictionary_ID fields set, and our Firefox v75 released in 2019 won't be able to decompress it. So I think that Firefox and other web browsers should wait until a dictionary system appears.

What is the issue? Firefox can ship with zstd already ready to have dictionaries "plugged in", except it comes with none installed initially. Later on, a quick update can add official dictionaries as they come out. This feature can even be developed such that dictionaries are searched in a directory and they are used in firefox (can be used for quick patching like this for PC that update late).
Nothing prevents Firefox from already shipping with zstd dictionary capability. (Using your dates) In 2023-2024, when a dictionary is defined in the RFC, we create a ticket and a quick patch will add the dictionary to Firefox. Even for LTS. What's the trouble?

Flags: needinfo?(aladjev.andrew)

What is the issue? Firefox can ship with zstd already ready to have dictionaries "plugged in", except it comes with none installed initially. Later on, a quick update can add official dictionaries as they come out. This feature can even be developed such that dictionaries are searched in a directory and they are used in firefox (can be used for quick patching like this for PC that update late).

There is no such directory. It is not possible to implement this against an imaginary directory ecosystem.

when a dictionary is defined in the RFC, we create a ticket and a quick patch will add the dictionary to firefox. Even for LTS. What's the trouble?

RFC 8478 does not define any way of downloading, synchronizing, protecting dictionaries, etc.

ZSTD developers said the following:

The plan is the opposite: as I described in the caniuse thread, the RFC does not standardize the use of a dictionary. Responses with Content-Encoding: zstd should not use a dictionary. If and when a dictionary-based scheme is standardized for HTTP, it will use a different content-coding identifier.

So for now integration is possible without a dictionary; some workaround for dictionary support will appear later.

Flags: needinfo?(aladjev.andrew)

(In reply to Andrew from comment #12)

What is the issue? Firefox can ship with zstd already ready to have dictionaries "plugged in", except it comes with none installed initially. Later on, a quick update can add official dictionaries as they come out. This feature can even be developed such that dictionaries are searched in a directory and they are used in firefox (can be used for quick patching like this for PC that update late).

There is no such directory. It is not possible to implement this against an imaginary directory ecosystem.

There's none because there's no zstd implemented in Firefox.

when a dictionary is defined in the RFC, we create a ticket and a quick patch will add the dictionary to firefox. Even for LTS. What's the trouble?

RFC 8478 does not define any way of downloading, synchronizing, protecting dictionaries, etc.

That's fine. Why would it define that, when it's the program's responsibility to ship the dictionary and the OS's responsibility to prevent the files from being tampered with?

ZSTD developers said the following:

The plan is the opposite: as I described in the caniuse thread, the RFC does not standardize the use of a dictionary. Responses with Content-Encoding: zstd should not use a dictionary. If and when a dictionary-based scheme is standardized for HTTP, it will use a different content-coding identifier.

So for now integration is possible without a dictionary; some workaround for dictionary support will appear later.

I didn't know that part. If that's so, then what's the problem?
Zstd could be very useful on the web for dynamic content, offering good streaming compression at very high speeds!
I don't understand what I'm missing then...

Flags: needinfo?(aladjev.andrew)

I am with brunoais: Firefox can ship now with the current state of zstd / zstd for the web. If there are updates in the future, they can be added in the future.

Type: task → enhancement

Commonly brotli decodes a web page in one millisecond, fully shadowed by the transfer in any mobile system. Definitely there is no waiting or multithreaded decoding on any mobile platform that I know of.

Dynamic brotli compression is somewhat more dense than dynamic zstd compression. Particularly so for languages with utf-8 use (like Chinese, Russian, Vietnamese, ...).

In my experiments brotli slows down less than zstd in parallel computation. This is mostly because brotli achieves compression density by more computation whereas zstd is relying on more memory access (larger window size).

Brotli is more streamable. Zstd gets its slightly better decoding speed by processing blocks of data. The format is ordered in a way that makes full streaming impossible, and some of the already received data is simply not decodable. In Brotli, streaming was a first-class citizen, and much more data can be output and used for further processing (like HTML tree construction or JavaScript parsing), streaming all the way down to the last LZ copy or literal. Blocked processing in Zstd is favorable for datacenter-level computing, but when humans are waiting for data, full streaming is more favorable.

Brotli is fast enough. Decoding speed -- while slightly slower than zstd -- is far faster than any mobile connection. Because of streaming, the decoding can happen during the transfer.

In our mobile and desktop deployments we don't observe the kind of gzip vs. brotli issues that some confused early bloggers reported. Later bloggers called out those early reports as rumors. https://certsimple.com/blog/nginx-brotli writes "To summarize Akamai's study on Brotli performance... Brotli with setting 4 is both significantly smaller AND compresses faster than gzip"

Please note that RFC 8478 paragraph "3.1.1.1.1.2. Single_Segment_Flag" leaves the window size open. Even in normal operation, with the client promising to be able to decode zstd, the server has no way to know whether the client will actually be able to decode the stream if the request is for more than 8 MB. In brotli there is no such gray area that can fragment the client space. I'd strongly recommend removing the gray area.
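
For reference, a minimal sketch of how the Window_Descriptor byte maps to a required window size, following my reading of the formula in RFC 8478 (exponent in the top five bits, mantissa in the low three). The names `window_size` and `RECOMMENDED_LIMIT` are illustrative and do not come from any library. When Single_Segment_Flag is set there is no Window_Descriptor at all, and the decoder needs room for the whole Frame_Content_Size, which is where the gray area comes from.

```python
# 8 MB: the window size the RFC recommends decoders support for interoperability.
RECOMMENDED_LIMIT = 8 * 1024 * 1024


def window_size(window_descriptor: int) -> int:
    """Compute Window_Size in bytes from a one-byte Window_Descriptor."""
    if not 0 <= window_descriptor <= 0xFF:
        raise ValueError("Window_Descriptor is a single byte")
    exponent = window_descriptor >> 3        # top 5 bits
    mantissa = window_descriptor & 0b111     # low 3 bits
    window_base = 1 << (10 + exponent)       # minimum is 1 KB
    window_add = (window_base // 8) * mantissa
    return window_base + window_add


def within_recommended_limit(window_descriptor: int) -> bool:
    """True if a decoder that supports only 8 MB windows can accept the frame."""
    return window_size(window_descriptor) <= RECOMMENDED_LIMIT
```

A client that only guarantees 8 MB would have to reject any frame for which `within_recommended_limit` is false, and the server cannot know that limit in advance.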

Please note that when zstd beats brotli in compression speed or density benchmarks, it is because it defaults to larger window sizes (often 128 MB vs. brotli's 4 MB). In a browser environment, typical deployments use a smaller window size, because larger window sizes mean more OOMs in browsers. As one data point, Facebook themselves are using a 512 kB window with brotli for dynamic HTTP compression.

Curl now supports zstd.

QA Whiteboard: qa-not-actionable

Are there any updates on the priority?

curl already supports zstd.
ZSTD can provide superior compression performance for non-static content such as HTML in HTML pages, websockets and XMLHttpRequests.
Depending on how it's tuned, it can provide ~20% time savings. The library itself even has an automatic mode which adjusts the compression level with the aim of saturating the available bandwidth.

Of its compression rivals, the biggest one, brotli, wins for static content. However, because brotli is slow to compress, it becomes unreasonably slower than zstd for dynamic content.

Flags: needinfo?(Virtual)

I'm not the right person to reply to this question. This should be asked of Mozilla developers.

Flags: needinfo?(Virtual)

(In reply to brunoais from comment #18)

Are there any updates on the priority?

curl already supports zstd.
ZSTD can provide superior compression performance for non-static content such as HTML in HTML pages, websockets and XMLHttpRequests.
Depending on how it's tuned, it can provide ~20% time savings. The library itself even has an automatic mode which adjusts the compression level with the aim of saturating the available bandwidth.

Of its compression rivals, the biggest one, brotli, wins for static content. However, because brotli is slow to compress, it becomes unreasonably slower than zstd for dynamic content.

I've provided a comparison of brotli and zstd for web data at https://github.com/andrew-aladev/brotli-vs-zstd. You can just open the chart folder and find the results you need.

For example, you can find results for html ratio, html compress performance, html decompress performance.

For example, you can find results for js ratio, js compress performance, js decompress performance.

You can find more about css, fonts, minified/not minified files, different sized files, limits etc.

But the most important question is total ratio and performance. "Total" means that a single compressed stream is created and fed a large amount (1 GB+) of unique data of the same type. Total results reveal the full potential of the compressor/decompressor.

For example: total results for html ratio, html compressor performance, html decompressor performance.

For example: total results for js ratio, js compressor performance, js decompressor performance.

We can see that zstd just ruins brotli in total results. How can the same effect be brought to single results? Facebook should create dictionaries for different data types, add them to an IANA registry, etc. I've passed this information to the Facebook zstd developers and project lead, and recommended that they add dictionaries to their list of most important milestones. But they just don't care.

PS my personal considerations: Facebook created local dictionaries for their web projects and stopped investing money in zstd. Dictionaries will never appear, we can just use zstd for creating local dictionaries for our web projects. Zstd will never replace brotli in web data compression. So I see no reason in its integration into firefox.

Flags: needinfo?(aladjev.andrew)

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --

We can see that zstd just ruins brotli in total results. How can the same effect be brought to single results? Facebook should create dictionaries for different data types, add them to an IANA registry, etc. I've passed this information to the Facebook zstd developers and project lead, and recommended that they add dictionaries to their list of most important milestones. But they just don't care.

PS my personal considerations: Facebook created local dictionaries for their web projects and stopped investing money in zstd. Dictionaries will never appear, we can just use zstd for creating local dictionaries for our web projects. Zstd will never replace brotli in web data compression. So I see no reason in its integration into firefox.

Is there anything preventing anyone from making these dictionaries? (besides time)
About adding them to the IANA registry: indeed, not just anyone can submit them. But if someone makes dictionaries, and other people interested in having zstd used on the web review and validate them, could Mozilla then submit them to IANA?

(In reply to Andrew from comment #20)

We can see that zstd just ruins brotli in total results.

If you run both with the same window size (for example 8 MB), brotli continues to win in density.

In usual use we would not want to allocate hundreds of megabytes for compression -- there can be more than one stream active in a browser at once.

ZSTD doesn't really compete with, and can't win against, brotli where brotli is already used (reused static files).
Where ZSTD mostly shines and competes is against gzip. ZSTD mostly wins at quickly compressing and quickly decompressing a data stream while also supporting dynamic compression rate to match the transfer speeds. ZSTD wins in compression speed against brotli and gzip every time (for time required to get the same final size).
ZSTD wins against gzip in almost every way.

(In reply to pooya3d from comment #40)

Support for zstd for what?

Content-Encoding (both request and response)

(In reply to pooya3d from comment #40)

Possible replacement where brotli is currently used?

No. More like a replacement for gzip/deflate where those are commonly used.
Use brotli for static content and use zstd for dynamic content (HTTP compression).
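
As a rough illustration of that split, here is a sketch of server-side content-coding selection. The preference orders (brotli first for static content, zstd first for dynamic content) are my assumption based on the argument in this comment, not behaviour of any particular server, and `pick_encoding` is a hypothetical name.

```python
def pick_encoding(accept_encoding: str, is_static: bool) -> str:
    """Choose a content-coding from an Accept-Encoding header value.

    Hypothetical sketch: q-values are only used to drop codings with q=0,
    not to rank the remaining ones.
    """
    offered = set()
    for item in accept_encoding.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        params = params.strip()
        q = 1.0
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        if token and q > 0:
            offered.add(token)

    # Assumed preference: dense compression for reusable static files,
    # fast compression for content generated per request.
    preference = ("br", "zstd", "gzip") if is_static else ("zstd", "gzip", "br")
    for coding in preference:
        if coding in offered:
            return coding
    return "identity"
```

With the header value "gzip, deflate, br, zstd", this sketch picks "br" for static content and "zstd" for dynamic content.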

(In reply to pooya3d from comment #40)

Are there W3C/WhatWG/IETF work underway in that direction?

Yes. However, some complaints relate to the lack of default dictionaries. Even without them, zstd appears to perform much better than brotli in compression speed while being a little slower than brotli at decompressing (and brotli can achieve much more compact compression).

(In reply to pooya3d from comment #40)

Relatedly, are the patent licensing terms from zstd

This is the license:
BSD license: https://github.com/facebook/zstd/blob/dev/LICENSE
GPLv2 license: https://github.com/facebook/zstd/blob/dev/COPYING
There are no patents attached to those licenses, so no issues.

Flags: needinfo?(Virtual)

There is a zstd PR about dictionaries: https://github.com/facebook/zstd/issues/3100

Attachment #9320928 - Attachment is obsolete: true

This bug is somehow being targeted by spammers, should we lock this one? (It's not that frequent though)

Restricting comments due to spam.

Restrict Comments: true
Restrict Comments: false

It's probably worthwhile to mention that Chrome now supports this behind a developer flag in 117 (https://chromestatus.com/feature/6186023867908096). Given the widespread industry support for Zstd at this point it seems very likely that they'll keep the feature in.

No longer depends on: 1871963
No longer depends on: 1884299
No longer depends on: 1884301
No longer depends on: 1884302
No longer depends on: 1884303
Severity: -- → N/A
Type: enhancement → task
Priority: -- → P2
Whiteboard: [necko-triaged]

@Mathew Hodson - yes, we've looked at that. Lack of compressor support is an issue for certificate compression. Also, it's someone's pet project, and likely would need additional work to be production-level, especially performance-wise, but in other ways too. We'll continue to monitor it, as we'd love to use a pure rust compressor/decompressor. We also have looked at leveraging rlbox for security reasons with zstd, but rlbox doesn't do well with zstd (200-300% slower). That might get fixed over time.

I think this can be closed? zstd support landed in v126: https://www.mozilla.org/en-US/firefox/126.0/releasenotes/. The devtools don't support it yet (responses just look like gibberish) but that should be fixed by #1891610

No longer depends on: 1884304
No longer blocks: 1477516
Keywords: meta
Summary: Implement support for Zstandard (zstd) → [meta] Implement support for Zstandard (zstd)
No longer blocks: 1879030
Depends on: 1879030
Depends on: 1884306

On Firefox 126.0.1 HTTPS connections send "Accept-Encoding: gzip, deflate, br, zstd" while HTTP connections only send Accept-Encoding: "gzip, deflate". Is it possible to support zstd for HTTP connections as well? Or is HTTP stuck with only gzip/deflate?

(In reply to Will Lentz from comment #72)

On Firefox 126.0.1 HTTPS connections send "Accept-Encoding: gzip, deflate, br, zstd" while HTTP connections only send Accept-Encoding: "gzip, deflate". Is it possible to support zstd for HTTP connections as well? Or is HTTP stuck with only gzip/deflate?

There are reasons that brotli and zstd aren't supported for http. Perhaps these reasons are no longer valid, though. See this for a description of why Firefox and Chrome both require https (and I assume Safari):
https://hacks.mozilla.org/2015/11/better-than-gzip-compression-with-brotli/#comment-19069

If you want to ask for that to be reconsidered, please file a new bug on that issue.
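
The behaviour described in this exchange can be sketched as follows. This only mirrors the header values reported above for Firefox 126.0.1; it is not Firefox's actual implementation, and `accept_encoding_for` is a hypothetical name.

```python
from urllib.parse import urlsplit


def accept_encoding_for(url: str) -> str:
    """Return the Accept-Encoding value to advertise for a given URL.

    Illustrative only: brotli and zstd are advertised for https, while
    plain http falls back to gzip/deflate.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme == "https":
        return "gzip, deflate, br, zstd"
    return "gzip, deflate"
```

For example, `accept_encoding_for("http://example.org/")` yields "gzip, deflate", matching what was observed for HTTP connections.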

Flags: needinfo?(will_lentz)
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Depends on: 1904754