Closed Bug 296303 Opened 19 years ago Closed 19 years ago

Provide simple archive file format for update service

Categories

(Firefox Build System :: General, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.8beta3

People

(Reporter: darin.moz, Assigned: darin.moz)

References

()

Details

Attachments

(2 files, 2 obsolete files)

Provide simple archive file format for update service

Given that bsdiff is designed to compress really well with bzip2, it is not
ideal to use ZIP as the archive format for the update service.  I choose to
write a very simple archive format that is basically a concatenation of all of
the bzip2 compressed files with an index table tacked on to the end.  The
archive format itself does not know about compression.  The amount of code
required to read the archive format is very small, so it helps minimize the
codesize of the standalone updater executable.

I've included code for a command line utility that makes it easy to perform the
usual operations: create, extract, and test.

This code is known to work on Linux, Windows, and Mac OSX.
Attached patch v1 patch (obsolete) — Splinter Review
Attachment #185087 - Flags: superreview?(brendan)
Attachment #185087 - Flags: review?(benjamin)
Blocks: 292021
It's often better to use wide-spread standards instead of custom file formats. 
I think that cpio format will do the job.

/* A cpio archive consists of a sequence of files.
   Each file has a 76 byte header,
   a variable length, NUL terminated filename,
   and variable length file data.
   A header for a filename "TRAILER!!!" indicates the end of the archive.  */

-- http://savannah.gnu.org/cgi-bin/viewcvs/cpio/cpio/src/cpio.h
You don't have to compress with zip, though, and it's a pretty simple format as
well (and effectively standard).

Do we really need to invent our own format here?  How much does this save over
just using zip or cpio on the bz2-compressed files?  (In terms of code and
download size.)

I really don't think we want a custom format here, and I'm not looking forward
to someone having to check all the manual pointer math so we can avoid the
overflow issues that have affected pretty much every other such extraction
utility.  But I bet that the process that led to this code was more involved
than just the quick summary in the description, so maybe there's something
really compelling yet to hear!
Yes, I don't insist on cpio but I just can't find reason for inventing custom
format too.  I guess that Mozilla already have zip/jar extraction code.  If code
size matters, cpio is just one of the simplest possible archive formats and only
extraction needs to be implemented -- when updates are prepared, providing
external cpio is easy.
Originally, I used libjar-standalone.  But, I wanted to reduce codesize since
zlib wasn't needed for this application, so I spent a couple hours and wrote
something really simple.  Perhaps my time would have been better spent finding a
way to build libjar-standalone without zlib (some more #ifdefs to nsZipArchive
probably).

Here's a rough comparison of codesize:

15512  nsZipArchive.obj
 3617  mar_read.obj

I understand the arguments in favor of using a standard file format.  I went
that way initially, but as I said I was trying to ensure that the updater
executable remained small.  This system doesn't seem to need a standard file
format as the update archive is already very specific to this update system.

That said, I guess someone might like to take the full update packages and
simply extract it manually.  That would be similar to installing our current ZIP
and TAR.GZ packages.

I don't know about cpio.  Given that each file has only a 76 byte header, I'd
imagine that it has limits on the length of file names (and paths), so that
probably isn't a good option.

If people feel strongly enough that this relatively closed system needs an open
file format, then my preference would be to build a special version of
libjar-standalone such as how I proposed above.
I think people are going to want to pull apart and reassemble these things, such
as for generating an update set that goes between two internally-customized
builds at an enterprise or ISP installation.  I guess they're likely going to
need our bsdiff anyway, so we'll need some sort of SDK for that.

I don't really think saving 10K uncompressed is worth inventing another new
format (which you're going to document, of course, and write tests for =) ), and
I'm a little worried about us only finding out that we have some overflow case
or whatever else in the field, so I think I'd _rather_ we reuse our libjar stuff.

I'm not going to throw a hissy-fit if we don't, though.  (I'll just pout a
little.  =) )
The 76-byte header doesn't include pathname.  Just take a look at
http://savannah.gnu.org/cgi-bin/viewcvs/cpio/cpio/src/cpio.h?rev=1.3&content-type=text/vnd.viewcvs-markup

cpio archive = file + file + ...
file = 76-byte header + nul-terminated file name (pathname) + file content

Last file is special and has name "TRAILER!!!" -- it just marks end of archive.
OK, it seems that cpio is not completely perfect too :-)  There are multiple
formats and what I described is the POSIX format that is understood by everyone
and is produced by GNU cpio by using "--format odc" (or "-H odc") -- this is not
the GNU cpio default. Nice description can be found in
http://www.mkssoftware.com/docs/man4/cpio.4.asp  Precise description can be
found in Single Unix Specification v3 under pax(1).  Good introduction of
FreeBSD cpio is in
http://www.onlamp.com/pub/a/bsd/2002/07/11/FreeBSD_Basics.html (cpio(1) is
somewhat weird in its use)
Another issue with libjar is that the API requires me to extract items to a
temporary file, and then I have read that temporary file into memory to run it
through the bzip2 decompressor.

I would probably want to hack libjar to add an API similar to mar_read, but then
I would only want to implement that for items stored in the ZIP without
compression.  nsZipArchive.cpp is also not very nice code to modify.  It is very
complex, already heavily #ifdef'd, types are redefined as macros for other
types, etc.

Another complication is that nsZipArchive.cpp uses crc32 from zlib.  That will
be more difficult to factor out.

In constrast, mar_read.c is exactly what I need and no more (despite being a
different format).
Yeah, I probably wouldn't do that either.  Bleh.

I wonder if writing your own uncompressed-zip-extractor would be bad, though. 
It would be no worse than mar (same new-code smell!) but we wouldn't have to
invent the format and it'd work with standard tools.

I'm losing steam here on my objections, I have to say...
I posted some details about the MAR file format here:
http://wiki.mozilla.org/Software_Update:MAR
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.8beta3
Comment on attachment 185087 [details] [diff] [review]
v1 patch

>Index: modules/libmar/Makefile.in

>+# The contents of this file are subject to the Netscape Public

MPL

>Index: modules/libmar/src/Makefile.in

>+# The contents of this file are subject to the Netscape Public

MPL

>+LIBXUL_LIBRARY	= 1

Put FORCE_STATIC_LIB = 1 up here next to the other vars that name the library,
but do not include LIBXUL_LIBRARY (not linked into libxul, therefore doesn't
need FORCE_USE_PIC).

I don't understand why the source files are arranged all in src/ but then built
from tool/...  seems to me that mar.c has no use in src/ and should only be
tool/

>Index: modules/libmar/src/mar.h

>+typedef struct MarItem_ {
>+  struct MarItem_ *next;
>+  unsigned int offset;  /* offset into archive */
>+  unsigned int length;  /* length of data in bytes */
>+  unsigned int flags;   /* contains file mode bits */
>+  char name[1];
>+} MarItem;

Any particular reason you used "unsined int" instead of a type that's
guaranteed to be 32 bits?

>Index: modules/libmar/src/mar_create.c

>+  free(stack.head);

I know that free is supposed to be null-safe, but we still have some libc's
that don't comply: I would prefer that this had a null-check.

>+/* Ensure that the directory containing this file exists */
>+static int mar_ensure_parent_dir(const char *path)

>+#ifdef XP_WIN
>+    _mkdir(path);

This is using forward-slashes: does windows mkdir accept forward slashes?

>Index: modules/libmar/src/mar_private.h

>+typedef unsigned int uint32;

Is this safe? Why can't we #incude "prtypes.h" and use PRUint32:
http://lxr.mozilla.org/mozilla/source/nsprpub/pr/include/prtypes.h#329
Comment on attachment 185087 [details] [diff] [review]
v1 patch

Clearing review request pending patch with at least NSPR type-love.
Attachment #185087 - Flags: superreview?(brendan)
Attachment #185087 - Flags: review?(benjamin)
I actually wonder if it might make sense to build the tool in "src" and the
"library that is just mar_read.c" in the other directory.
Subdirs suck ;-).  What breaks if you put everthing in one directory?  Like
js/src or modules/libjar or ....

/be
I believe that our build system requires me to have two separate makefiles since
I have two separate build targets:

  libmar.a  (just includes code to read a mar file)
  mar       (standalone tool used to create, test, and extract mar files)

So, I could put the source in modules/libmar/ and then create a single subdir
for one of those build targets.  Suggestions?
A final option would be to simply add mar_read.c to the list of files compiled
under mozilla/toolkit/mozapps/update/src/updater.  Then, I would make
modules/libmar only build "mar" the command line tool.  In that case, calling
the directory libmar is probably wrong, and it should just be modules/mar. 
Thoughts?
Once upon a time, the gmake-include-based pre-autoconf Makefiles I hacked, which
included config.mk and rules.mk (so the DNA is still evident), allowed PROGRAM
and LIBRARY to be built from one Makefile.  If that's not the case, then too bad.

The "lib" prefix comes from AT&T-ish flavors of Unix source, and was imitated in
the Mozilla Classic source tree.  Under "modules" it strikes me as redundant,
but hey, tradition and all.  Except that if the primary product here is a tool,
losing the "lib" is the least you might do.  Consider mozilla/tools as the "cmd"
to mozilla/modules' "lib".

/be
You can still build a library & program from the same makefile but that's about
it.  You'll have to set PROGOBJS manually.  See
http://lxr.mozilla.org/seamonkey/source/modules/libreg/src/Makefile.in for an
example.

Attached patch v2 patchSplinter Review
This patch includes the latest version from the software update branch.

I made the following changes:

  o  mar_test.c is now folded into mar.c

  o  libmar.a now includes mar_create.c, mar_extract.c, and mar_read.c.  When 
     linking against libmar.a, modern linkers only include the modules or
object
     code that they need, so there's no reason not to link all three pieces
into
     libmar.a.	This avoids needing to recompile the source files when building

     the tool.

  o  The mar executable is now output into dist/host/bin since it is something
     that will be used by the build system and should not be included in our
     release of Firefox and other products.

  o  libmar/tool is not built if CROSS_COMPILE is defined.

I looked at the PROGOBJS option and considered that.  But, I think that I
prefer the separate directories for separate build targets.  The main build
target is the library, and that is what lives under modules/libmar/src.  The
tool is optional, so using an ifdef around a DIRS declaration seems nice to me.
Attachment #185087 - Attachment is obsolete: true
Attachment #185509 - Flags: review?(benjamin)
In addition to the changes listed above, I also added more documentation to mar.h.
Comment on attachment 185509 [details] [diff] [review]
v2 patch

in libmar/src/Makefile.in the comment about "reading only" is incorrect.

I think it's a little odd that we're using HOST_PROGRAM and HOST_CSRCS in the
tool directory, when it won't possibly work in a cross-compiling environment:
why not just use CSRCS and PROGRAM and set NO_DIST_INSTALL? Either way, r=me
Attachment #185509 - Flags: review?(benjamin) → review+
I thought it wise to install the mar executable into some place under dist/, so
that other tools would have an easy time locating the resulting executable.
Comment on attachment 185509 [details] [diff] [review]
v2 patch

> why not just use CSRCS and PROGRAM and set NO_DIST_INSTALL? Either way, r=me

Can it use CSRCS and PROGRAM but leave NO_DIST_INSTALL unset?  I'm thinking we
don't want to further abuse the HOST_* cross-compiling support (although it may
be badly broken elsewhere).  Be good if cls could bless this.

/be
This is a program that is designed to be run on the host system, not the target
system.  So, it is far more correct to build with the HOST_ prefixes.  The only
problem is that support for building dependent host libraries is lacking AFAICT.

If we avoid the HOST_ prefixes, and do the NO_DIST_INSTALL stuff, then that will
work too.  I was just trying to use the HOST_ prefixes since that would seem to
be more correct.
> Can it use CSRCS and PROGRAM but leave NO_DIST_INSTALL unset?

Doing that would result in mar being included in .tar.gz builds.  It isn't our
goal to distribute mar that way, right?
For the moment, I'm fine with the current patch since that |ifndef
CROSS_COMPILE| prevents host & target objs from being linked together and this
tool is, presumably, only used in the m.o release process.

Is there any reason to think that the cross-compile case wouldn't work if you
built libmar as a host library as well?  You can build both the host & target
versions of a library from a single makefile.  We already handle this particular
case in the libxpt/xpidl makefiles.
Attachment #185509 - Flags: approval1.8b3?
Attachment #185509 - Flags: approval-aviary1.1a2?
(In reply to comment #27)
> Is there any reason to think that the cross-compile case wouldn't work if you
> built libmar as a host library as well?  You can build both the host & target
> versions of a library from a single makefile.  We already handle this
> particular case in the libxpt/xpidl makefiles.

Cool, that's good to know.  Just so my ignorant question bouncing of bsmedberg's
comment doesn't leave the wrong impression: I'm all in favor of fixing our cross
compilation support, and building on it.  Maybe we should start a project for
this.  I suspect the MontaVista guys, if they are building any of our code for
small device targets, have had to blaze this trail already.

/be
> Is there any reason to think that the cross-compile case wouldn't work if you
> built libmar as a host library as well?  You can build both the host & target
> versions of a library from a single makefile.  We already handle this particular
> case in the libxpt/xpidl makefiles.

Good idea.  But, I don't have a cross-compiling environment, so I would rather
not check in changes to enable building this in a cross-compiling environment
until I have a way to actually test such an environment.
v2 patch has been checked in on the trunk.  we can followup with improvements to
support cross-compiling later.

marking FIXED
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
FYI, I just tested a linux->mingw32 cross-compile build and with the exception
of a couple of minor bugs recently introduced by the msys landing (bug 294122),
it works fine. YMMV.
Urrrr .... not compatible with VS8 Beta2?



/cygdrive/c/mozilla/trunk/mozilla/build/cygwin-wrapper link -NOLOGO -OUT:mar.ex
 -PDB:mar.pdb host_mar.o /NODEFAULTLIB:LIBC ../../../dist/lib/mar.lib Ws2_32.li

host_mar.o : non-native module found; restarting link with /LTCG; add /LTCG to
he link command line to improve linker performance
MSVCRT.lib(MSVCR80.dll) : error LNK2005: _free already defined in LIBCMT.lib(fr
e.obj)
MSVCRT.lib(MSVCR80.dll) : error LNK2005: ___iob_func already defined in LIBCMT.
ib(_file.obj)
MSVCRT.lib(MSVCR80.dll) : error LNK2005: _realloc already defined in LIBCMT.lib
realloc.obj)
MSVCRT.lib(MSVCR80.dll) : error LNK2005: _fclose already defined in LIBCMT.lib(
close.obj)
MSVCRT.lib(MSVCR80.dll) : error LNK2005: _malloc already defined in LIBCMT.lib(
alloc.obj)
LINK : warning LNK4098: defaultlib 'MSVCRT' conflicts with use of other libs; u
e /NODEFAULTLIB:library
mar.exe : fatal error LNK1169: one or more multiply defined symbols found
make[4]: *** [mar.exe] Error 145
make[4]: Leaving directory `/cygdrive/d/mozilla/built/mmx/modules/libmar/tool'
make[3]: *** [libs] Error 2
make[3]: Leaving directory `/cygdrive/d/mozilla/built/mmx/modules/libmar'
make[2]: *** [tier_1] Error 2
make[2]: Leaving directory `/cygdrive/d/mozilla/built/mmx'
make[1]: *** [default] Error 2
make[1]: Leaving directory `/cygdrive/d/mozilla/built/mmx'
make: *** [build] Error 2
See bug 297124.
Attachment #185509 - Flags: approval1.8b3?
Attachment #185509 - Flags: approval1.8b3+
Attachment #185509 - Flags: approval-aviary1.1a2?
Attachment #185509 - Flags: approval-aviary1.1a2+
This breaks my Win32/MinGW/cygwin build as follows :-

/cygdrive/e/mozilla/source/mozilla/build/cygwin-wrapper gcc -mtune=athlon-xp -ma
rch=athlon-xp -mmmx -msse -m3dnow -o mar.exe -mno-cygwin -DXP_WIN32 -DXP_WIN -DW
IN32 -D_WIN32 -DNO_X11  host_mar.o ../../../dist/lib/libmar.a
../../../dist/lib/libmar.a(mar_create.o)(.text+0x96): In function `mar_push':
e:/mozilla/source/mozilla/modules/libmar/src/mar_create.c:87: undefined referenc
e to `htonl@4'
The MingW port will need some Makefile.in love to make the executable link
against WS2_32.lib.  I don't know what is required to do that; patches welcome.
Attached patch mingw + cross-compile fixes v1 (obsolete) — Splinter Review
This patch needs to be tested on a native win32 build.
- Moves HOST_AR_FLAGS into configure.in
- Adds HOST_OS_ARCH to better handle the native build case
- Exposes autoconf standard host_cpu, host_vendor, host_os variables
- Makes every HOST_PROGRAM link against HOST_EXTRA_LIBS instead of EXTRA_LIBS
- Always build native version of libmar & mar tool
- Add -D_WIN32_IE=0x300 define needed for mingw build of updater
Attachment #185789 - Flags: review?(benjamin)
I've tried Chris Seawood's mingw compiler fixes (I build on Mingw WinXP)

I get still a build error:
r ../../../../../dist/sdk/include -o module.res /cygdrive/c/mozilla/mozilla/tool
kit/mozapps/update/src/updater/module.rc
c:/mozilla/mozilla/toolkit/mozapps/update/src/updater/module.rc:55:20: afxres.h:
 No such file or directory
c:\mozilla\mingw\bin\windres.exe: c:\mozilla\mingw\bin\gcc exited with status 1
make[7]: *** [module.res] Error 1
make[7]: Leaving directory `/cygdrive/c/mozilla/mozilla/toolkit/mozapps/update/s
rc/updater'
make[6]: *** [libs] Error 2
make[6]: Leaving directory `/cygdrive/c/mozilla/mozilla/toolkit/mozapps/update/s
rc'
make[5]: *** [libs] Error 2
make[5]: Leaving directory `/cygdrive/c/mozilla/mozilla/toolkit/mozapps/update'
make[4]: *** [libs] Error 2
make[4]: Leaving directory `/cygdrive/c/mozilla/mozilla/toolkit/mozapps'
make[3]: *** [libs] Error 2
make[3]: Leaving directory `/cygdrive/c/mozilla/mozilla/toolkit'
make[2]: *** [tier_50] Error 2
make[2]: Leaving directory `/cygdrive/c/mozilla/mozilla'
make[1]: *** [alldep] Error 2
make[1]: Leaving directory `/cygdrive/c/mozilla/mozilla'
make: *** [alldep] Error 2

But I think this could be more a result from fixing bug 292021, not?
Hmmm, interesting error (with new patch applied)....

rm -f libhostmar.a
ar @HOST_AR_FLAGS@ host_mar_create.o host_mar_extract.o host_mar_read.o
e:\gnu\mingw\bin\ar.exe: illegal option -- @
Usage: e:\gnu\mingw\bin\ar.exe [emulation options] [-]{dmpqrstx}[abcfilNoPsSuvV]
 [member-name] [count] archive-file file...
       e:\gnu\mingw\bin\ar.exe -M [<mri-script]
 commands:
  d            - delete file(s) from the archive
  m[ab]        - move file(s) in the archive
  p            - print file(s) found in the archive
  q[f]         - quick append file(s) to the archive
  r[ab][f][u]  - replace existing or insert new file(s) into the archive
  t            - display contents of archive
  x[o]         - extract file(s) from the archive
 command specific modifiers:
  [a]          - put file(s) after [member-name]
  [b]          - put file(s) before [member-name] (same as [i])
  [N]          - use instance [count] of name
  [f]          - truncate inserted file names
  [P]          - use full path names when matching
  [o]          - preserve original dates
  [u]          - only replace files that are newer than current archive contents

 generic modifiers:
  [c]          - do not warn if the library had to be created
  [s]          - create an archive index (cf. ranlib)
  [S]          - do not build a symbol table
  [v]          - be verbose
  [V]          - display the version number
 emulation options:
  No emulation specific options
e:\gnu\mingw\bin\ar.exe: supported targets: pe-i386 pei-i386 elf32-i386 elf32-li
ttle elf32-big srec symbolsrec tekhex binary ihex
make[4]: *** [libhostmar.a] Error 1

The patch contains a change to configure.in.  You have to run autoconf to
regenerate configure.
Attachment #185789 - Flags: review?(benjamin)
I ran into an extra problem with the native win32 build.  gcc (on the behalf of
ISO C++) complains about taking the address of ::main() in WinMain().  This
patch fixes that problem.  My VC build runs into bug 297124 but otherwise
builds fine.
Attachment #185789 - Attachment is obsolete: true
Attachment #185989 - Flags: review?(benjamin)
Attachment #185989 - Flags: review?(benjamin) → review+
Attachment #185989 - Flags: approval1.8b3?
With the "mingw + cross-compile fixes v2" patch, my build fails with this:

rm -f libhostmar.a
ar  host_mar_create.o host_mar_extract.o host_mar_read.o
c:\mozilla\mingw\bin\ar.exe: illegal option -- h
Usage: c:\mozilla\mingw\bin\ar.exe [emulation options] [-]{dmpqrstx}[abcfilNoPsS
uvV] [member-name] [count] archive-file file...
       c:\mozilla\mingw\bin\ar.exe -M [<mri-script]
 commands:
  d            - delete file(s) from the archive
  m[ab]        - move file(s) in the archive
  p            - print file(s) found in the archive
  q[f]         - quick append file(s) to the archive
  r[ab][f][u]  - replace existing or insert new file(s) into the archive
  t            - display contents of archive
  x[o]         - extract file(s) from the archive
 command specific modifiers:
  [a]          - put file(s) after [member-name]
  [b]          - put file(s) before [member-name] (same as [i])
  [N]          - use instance [count] of name
  [f]          - truncate inserted file names
  [P]          - use full path names when matching
  [o]          - preserve original dates
  [u]          - only replace files that are newer than current archive contents

 generic modifiers:
  [c]          - do not warn if the library had to be created
  [s]          - create an archive index (cf. ranlib)
  [S]          - do not build a symbol table
  [v]          - be verbose
  [V]          - display the version number
 emulation options:
  No emulation specific options
c:\mozilla\mingw\bin\ar.exe: supported targets: pe-i386 pei-i386 elf32-i386 elf3
2-little elf32-big srec symbolsrec tekhex binary ihex
make[4]: *** [libhostmar.a] Error 1
make[4]: Leaving directory `/cygdrive/c/mozilla/mozilla/modules/libmar/src'
make[3]: *** [libs] Error 2


Strange, because building with the "mingw + cross-compile fixes v1" seems to
work fine (I've used set RUN_AUTOCONF_LOCALLY=1)
rm -f libhostmar.a
ar  host_mar_create.o host_mar_extract.o host_mar_read.o

looks like the line probably wants to be:
ar libhostmar.a host_mar_create.o host_mar_extract.o host_mar_read.o
Doesn't ar need a command switch/letter before the list, like tar?  (That's
where tar got that horrible syntax, I think.)
Comment on attachment 185989 [details] [diff] [review]
mingw + cross-compile fixes v2

Yes, it does.  It looks like I botched the AR_FLAGS setting when I was messing
with VC.  I'll have to reboot to XP and retest everything. :-/
Attachment #185989 - Flags: approval1.8b3?
Reopening based on continued work, and possible dup bug #297386 (which I will
set as being blocked by this one for now).
Blocks: 297386
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This bug is fixed. Please file new bugs on new issues.
Status: REOPENED → RESOLVED
Closed: 19 years ago19 years ago
Resolution: --- → FIXED
Blocks: 298417
Depends on: 301643
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.