It would be nice for Mozilla to have the ability (or at least the option) to use
bzip2 as a content transfer encoding, similar to how gzip is used now. As bzip2
compression tends to result in smaller files, it would be particularly useful to
people over dial-up.
w3m appears to have support for bzip2 transfer, but I couldn't find anything on
bzip2 and mozilla.
nice suggestion; no time to work on this now -> future
I'm prepared to try working on this, but I've not done any development on Moz
before. Any decent guides I should be looking at to get me up to speed? Give
us a clue ;-)
sure, take a look at:
you'd want to either add code to this "stream converter" to handle bzip2, or
you'd want to create a similar stream converter object for BZIP2
compression/decompression. then you'd just have to update the pref in
that defines our Accept-Encoding header. i believe the HTTP code, when it sees
a Content-Encoding header, will just query the stream converter service for the
also, mozilla doesn't currently support Transfer-Encoding: gzip, so adding bzip2
support for the Transfer-Encoding header is a much bigger task.
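As a rough illustration of the lookup described above (Python rather than Mozilla's C++, and not the real stream converter API), the idea could be sketched as a table keyed by the Content-Encoding token. The names DECODERS, accept_encoding_header and decode_body are hypothetical, and the bare "bzip2" token is only an assumption for the example.

```python
import bz2
import gzip
import zlib

# Hypothetical registry: Content-Encoding token -> function decoding a whole body.
# This stands in for the stream converter lookup; it is not Mozilla's real API.
DECODERS = {
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,
    "bzip2": bz2.decompress,
}


def accept_encoding_header():
    """Build the Accept-Encoding value from the codings we can decode."""
    return ",".join(DECODERS)


def decode_body(content_encoding, body):
    """Decode body according to the Content-Encoding header value, if any."""
    if not content_encoding:
        return body  # identity: nothing to undo
    try:
        decoder = DECODERS[content_encoding.strip().lower()]
    except KeyError:
        raise ValueError("unsupported Content-Encoding: %r" % content_encoding)
    return decoder(body)
```

Adding a new coding in this sketch means adding one table entry, which then also shows up in the advertised Accept-Encoding value.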
Sorry my mistake, I wanted bzip support for Content-Encoding, not
Transfer-Encoding. Watch this space.
Created attachment 102322
bzip support patch
The attached rather inelegant patch works for me. Two major issues with it:
1. Requires bzip libraries. Really configure ought to guess this, but I don't
know enough about autoconf (yet).
2. I'm sure I've inadvertently tweaked a few too many things.
yeah, it's going to take someone's time to clean this patch up so it can be
landed in mozilla. i unfortunately don't have much free time to do that right now.
Apart from being diffed against a slightly oldish tree (which is not a big deal
really), I have a few comments/questions:
1) Does the bzip lib handle both bzip1 and bzip2?
2) Do people really send bzip2 files as application/bzip? What do they do for
3) + if (!PL_strncasecmp (fromStr, HTTP_BZIP_TYPE , strlen
(HTTP_COMPRESS_TYPE)) -- use HTTP_BZIP_TYPE as the arg to strlen, perhaps?
and HTTP_BZIP2_TYPE for the next comparison
4) Don't do the /* LCR */ comment thing. That's what the "Contributors"
section at the top of the file is for. ;)
5) Perhaps rename mGzipStreamInitialized to something that makes sense for both
stream types, like "mDecompressStreamInitialized" or something?
6) I'd sort of prefer to keep HTTP_COMPRESS_IDENTITY as the last element of the
enum... shouldn't affect any other code.
7) You also need to make some changes to nsExternalHelperAppService (see the
ApplyDecodingForType and ApplyDecodingForExtension functions and the arrays
8) There are some whitespace issues in the code (tabs? If so, please convert
9) There are some subtle differences between the z_stream and bz_stream
structs. For example, the former uses uInt while the latter uses "unsigned
int". I would assume these are the same thing, but for clarity and
consistency, I'd rather cast to the type the struct expects (too bad, really
-- there are so many parallels between the two codepaths that I almost wish
we could have a single codepath for the two decompressions, with some
conditional dispatch of functions).
10) mGzipStreamEnded could use renaming too, in line with the widened usage.
11) Darin, is it correct to just pass through aSourceOffset unchanged here?
This applies to the gzip code too.
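To make point 3 concrete, here is a small Python sketch that mimics a length-limited, case-insensitive comparison. The helper strncasecmp_equal is hypothetical, and the string values of the two constants are assumed for illustration; the patch's actual definitions may differ.

```python
def strncasecmp_equal(a, b, n):
    """Mimic C's `strncasecmp(a, b, n) == 0`: compare at most n chars, ignoring case."""
    return a[:n].lower() == b[:n].lower()


# Assumed values for illustration; the patch's real definitions may differ.
HTTP_COMPRESS_TYPE = "compress"
HTTP_BZIP_TYPE = "bzip"


def matches_bzip_mismatched(from_str):
    # Length taken from an unrelated constant (8 here), as in the patch.
    return strncasecmp_equal(from_str, HTTP_BZIP_TYPE, len(HTTP_COMPRESS_TYPE))


def matches_bzip_matched(from_str):
    # Length taken from the constant actually being compared (4 here).
    return strncasecmp_equal(from_str, HTTP_BZIP_TYPE, len(HTTP_BZIP_TYPE))
```

With the mismatched length the check happens to behave like an exact match for short strings, which is probably why the existing gzip code works. With the matching length it becomes a prefix test, so under these assumed values "bzip2" would also match the "bzip" check, and the order of the comparisons starts to matter.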
Ta for the feedback. As for the old tree, CVS access isn't easy here ;-)
bzip1 appears to have run into trouble with patent issues, which bzip2 doesn't
appear to suffer. Quite a few people use bzip to mean bzip2, confusingly.
Probably the best MIME type is application/x-bzip2.
Regarding the use of strlen(HTTP_COMPRESS_TYPE), I did wonder about that, but
the GZIP code does it the same way.
The /* LCR */ comments snuck through along with some printfs. They were various
markers and debug bits for me, and I didn't intend for them to go into the patch.
The other issues all sound sensible to me.
I'm currently working on a second version of the patch, with a --with-bzip2
switch to configure (via autoconf) to turn support on (and to incorporate some
of the suggestions made).
> I did wonder about that, but the GZIP code does it the same way.
Wanna fix that too? ;)
Luke, any luck with that updated patch?
Sorry, I've got distracted by work of late, and also because the fall-out of
Java gcc-2.95 vs gcc-3.2 RH8 has meant I haven't upgraded for a while (policy
here is that things must come as RPM!).
I've just grabbed the 1.4 source and I'll try and edit the patch suitably. I
plan to wrap the new bits in #ifdef BZIP2 so they can be turned on and off
easily enough and also fix up the other issues raised, but my knowledge doesn't
stretch as far as autoconf - any autoconf experts around? :-)
Hmm.... any good reason to make this disableable? It's not that much code....
If we decide we want to do that, I can probably try to figure out the autoconf end.
The bzip2 libraries would have to be bundled in if it were always on, and
judging by the issues over libmng it seems code space is at a premium? Does bzip2
compile everywhere Mozilla does?
Ah, ok. The library issue is a good reason to have a compile-time flag...
Another thought on MIME types. As regards full types, I'd have thought
application/x-bzip2 should be right and fits in with what people seem to be
using for it.
However we also need a short name for the Accept/Content-Encoding, and as far as
I can see we send out Accept-Encoding gzip and accept a Content-Encoding of
either gzip or x-gzip, as per RFC2616. However bzip2 isn't mentioned, nor
does it say how new ones should be named, other than "New content-coding value
tokens SHOULD be registered". Should we send out in Accept-Encoding bzip2 or
x-bzip2? Should we accept both as synonymous on Content-Encoding? My personal
preference is for just "bzip2" to be sent out.
I agree with comment 8 point 9 that it would be nice if the bzip2/gzip streams
could be merged as they are so similar, but I couldn't think of a neat way of
doing so. Damn compile-time member lookup!
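For what it's worth, in a language with run-time dispatch the merged code path is easy to sketch. The Python below is only an illustration of the shape, not a proposal for the C++ converter; make_decompressor and StreamDecoder are hypothetical names, and the lazily created decompressor plays the role of the "stream initialized" and "stream ended" flags.

```python
import bz2
import zlib


def make_decompressor(encoding):
    """Return an object with decompress(data) and eof for the given coding."""
    if encoding == "gzip":
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompressobj(16 + zlib.MAX_WBITS)
    if encoding == "bzip2":
        return bz2.BZ2Decompressor()
    raise ValueError("unsupported encoding: %r" % encoding)


class StreamDecoder:
    """One code path for both codings; only construction differs."""

    def __init__(self, encoding):
        self._encoding = encoding
        self._decompressor = None  # created lazily on the first chunk

    @property
    def stream_ended(self):
        return self._decompressor is not None and self._decompressor.eof

    def on_data_available(self, chunk):
        if self._decompressor is None:
            self._decompressor = make_decompressor(self._encoding)
        if self.stream_ended:
            return b""  # ignore anything after the end of the compressed stream
        return self._decompressor.decompress(chunk)


def decode_in_chunks(encoding, blob, chunk_size):
    """Feed blob to a StreamDecoder in small pieces, as a network stream would."""
    decoder = StreamDecoder(encoding)
    pieces = []
    for start in range(0, len(blob), chunk_size):
        pieces.append(decoder.on_data_available(blob[start:start + chunk_size]))
    return b"".join(pieces)
```

In C++ the equivalent would need function pointers or a small wrapper interface around the two structs, which is the part that is awkward to do neatly.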
Also, I forgot to mention I came across the following during a google:
8. Using bzip2 with Netscape under XWindows
I also found a way to get Linux Netscape to use bzip2 for Content-
Encoding just as it uses gzip. Add this to $HOME/.Xdefaults or
I use the -s option because I would rather trade some decompressing
speed for RAM usage. You can leave the option out if you want to.
x-compress : : .Z : uncompress -c \n\
compress : : .Z : uncompress -c \n\
x-gzip : : .z,.gz : gzip -cdq \n\
gzip : : .z,.gz : gzip -cdq \n\
x-bzip2 : : .bz2 : bzip2 -ds \n
Was this supported under Communicator? And does Mozilla support it? Not that
it's much of a substitute as it won't work on Windows etc.
mozilla does not read preferences from .Xdefaults like 4x communicator did.
until there is an official "Accept-Encoding: bzip2", we should prefix what we
send with "x-", but we should accept both forms as equivalent.
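A minimal sketch of that policy, in Python for illustration only: advertise the unregistered coding with an "x-" prefix, but treat the prefixed and bare forms as the same thing on receipt. All names here are hypothetical, and stripping "x-" from every token (not just bzip2 and gzip) is an assumption of the example.

```python
def normalize_coding(token):
    """Map a Content-Encoding token to a canonical name; "x-" forms are synonyms."""
    name = token.strip().lower()
    if name.startswith("x-"):
        name = name[2:]
    return name


# Codings this sketch pretends to support (canonical names); assumed for the example.
SUPPORTED = ("bzip2", "gzip", "deflate", "compress")

# Codings without an officially registered token are advertised with "x-".
UNREGISTERED = {"bzip2"}


def accept_encoding_value():
    """Build the Accept-Encoding value, prefixing unregistered codings with "x-"."""
    return ",".join("x-" + c if c in UNREGISTERED else c for c in SUPPORTED)


def is_supported(token):
    """Accept both "bzip2" and "x-bzip2" (and likewise for the other codings)."""
    return normalize_coding(token) in SUPPORTED
```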
Created attachment 127784
Revised patch for bzip2 support
Hopefully this addresses most of the previous issues raised, and is diffed for
Mozilla 1.4 release source. By default it won't enable bzip2 support because
of the library issues mentioned previously - for that to happen the updated
files need to be compiled with -DBZIP2, and libbz2.so needs linking in. To test
I changed DEFINES in config/config.mk and OS_LIBS in config/autoconf.mk but I'm
sure there are better places to do this. Also, in all.js,
network.http.accept-encoding needs to be changed to
"x-bzip2,gzip,deflate,compress" if bzip2 support is compiled in (I'm not sure
if configure can do this).
Comment on attachment 127784
Revised patch for bzip2 support
i have a concern with this patch. i think that this patch will make it so that
users will need to have the bzip2 library installed on their system. it seems
like it might be better to explicitly load the bzip2 library (using
PR_LoadLibrary). or, we could move this into an extension library. however,
given that the amount of code is small, i prefer the PR_LoadLibrary option.
another option might be to just statically link libbz2.a ... i think we'd need
to verify that the license is compatible before we can do that though.
here's the license from bzlib.h from my RedHat 9 box:
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
3. Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
4. The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Julian Seward, Cambridge, UK.
bzip2/libbzip2 version 1.0 of 21 March 2000
This program is based on (at least) the work of:
Ian H. Witten
Jon L. Bentley
For more information on these sources, see the manual.
Initially I'd had in mind that configure would check for the presence of -lbz2
and decide whether to build it in compile time, but I hadn't successfully been
able to modify the autoconf setup to do this (I know little about autoconf).
However the PR_LoadLibrary sounds interesting - I'll take a look into it.
bzip2 claims to be BSD-style license, I don't know how that fits in with the MPL.
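The run-time loading idea can be illustrated with Python's ctypes as a stand-in for PR_LoadLibrary: probe for libbz2 when needed and carry on without bzip2 support if it is missing. This is a sketch of the approach, not the NSPR API; load_bzip2 and bzip2_version are hypothetical helpers, while BZ2_bzlibVersion is a function exported by libbz2.

```python
import ctypes
import ctypes.util


def load_bzip2():
    """Try to load libbz2 at run time; return the handle, or None if unavailable."""
    name = ctypes.util.find_library("bz2")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def bzip2_version():
    """Return libbz2's version string, or None when the library is not installed."""
    lib = load_bzip2()
    if lib is None:
        return None
    # BZ2_bzlibVersion takes no arguments and returns a C string.
    lib.BZ2_bzlibVersion.restype = ctypes.c_char_p
    return lib.BZ2_bzlibVersion().decode("ascii")
```

A caller would advertise bzip2 in Accept-Encoding only when the probe succeeds, so a missing library degrades to today's behaviour instead of becoming a hard dependency.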
(In reply to comment #22)
> Initially I'd had in mind that configure would check for the presence of -lbz2
> and decide whether to build it in compile time, but I hadn't successfully been
> able to modify the autoconf setup to do this (I know little about autoconf).
> However the PR_LoadLibrary sounds interesting - I'll take a look into it.
configure checking is not sufficient unless we statically link to libbz2.a
because otherwise the mozilla builds from ftp.mozilla.org would require the user
to have installed libbz2.so. that would be a new dependency. sure, most linux
distros include it, but we'd have to decide if we want to add that as a required
dependency. given that bzip2 isn't in common use as a content-encoding, i think
it'd be a hard sell.
> bzip2 claims to be BSD-style license, I don't know how that fits in with the MPL.
BSD-style license should mean that we're ok statically linking to the code, but
i'm not an expert on such things. i'd need someone like mitchell or gerv to
The license does not impose any additional requirements over and above the MPL,
and so we could check it into the main tree just as we have libjpeg and libpng,
if necessary. We could also include it in nightly builds if we wanted.
Does that answer your question?
Gervase Markham wrote:
> The license does not impose any additional requirements over and above the
> MPL, and so we could check it into the main tree just as we have libjpeg and
> libpng, if necessary.
Erm... correct me if I am wrong: The MIT/X.org license does not impose any
additional requirements over/above MPL either - but requests to include such
code into the main tree were DENIED multiple times in the past. Was the policy
for such checkins changed or am I missing something here?
Roland: please email me with more details of whatever you are referring to.
Firefox now includes bzip2 for partial updates (because bsdiff works well with bzip2), so it might be easier to add bzip2 as a transfer-encoding now.
-> default owner
See also bug 366559, which asks for LZMA (7-zip) transfer-encoding support.
To All -
It'd be great to revisit this issue at this time.
Are there still licensing issues around bzip2 code inclusion?
bzip2 transfer-encoding would be sweet. Save even more bytes on the wire :-).
*** Bug 479323 has been marked as a duplicate of this bug. ***
Due to its better decoding speed, brotli seems to be the right direction. That's being pursued in bug 366559 so I will close this.