Open Bug 504624 Opened 15 years ago Updated 27 days ago

Consider using the Courgette algorithm in product updates

Categories

(Toolkit :: Application Update, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: gkw, Unassigned)

References

(Blocks 1 open bug, )

Details

Attachments

(1 file)

Courgette is supposedly an open source diff'ing algorithm by Google. This enhancement is to consider using it over whatever we're using now, to see if it's better than whatever we have.

http://blog.chromium.org/2009/07/smaller-is-faster-and-safer-too.html

http://dev.chromium.org/developers/design-documents/software-updates-courgette
Assignee: morgamic → nobody
Component: General → Application Update
Product: AUS → Toolkit
QA Contact: general → application.update
Version: 3.0 → unspecified
See 'Test MAR generation with Courgette?' at
http://groups.google.com/group/mozilla.dev.platform/browse_thread/thread/2d19b1795e6e0fc0
Flags: wanted1.9.2?
OS: All → Windows 2000
Hardware: All → x86
Summary: Consider using the Courgette algorithm in nightly/product updates → Consider using the (Windows) Courgette algorithm in nightly/product updates
Version: unspecified → Trunk
Note that there are allegations of patent infringement relating to Courgette, including a patent lawsuit brought against Google. (See Red Bend Ltd. and Red Bend Software Inc. v. Google Inc., Civ. Action No. 09-cv-11813 (E.D. Ma.)).
cross-posted from Bug 662102

Making updates to Aurora smaller size - update compression algorithm - Courgette 

"Rather than push put a whole new 10MB update, we send out a diff that takes the previous version of Google Chrome and generates the new version." ... 

Right now every day there is a 17MB update in Aurora.

"We want smaller updates because it narrows the window of vulnerability. If the update is a tenth of the size, we can push ten times as many per unit of bandwidth," 

Google has open-sourced the Courgette code, so now Firefox can use it.

See: http://www.theregister.co.uk/2009/10/28/red_bend_sues_google/
Nick Thomas [:nthomas] wrote:

"FWIW, Mozilla apps already support differential updates, which we call a partial update. So if you keep up to date then there is an update of a few MB, rather than 17 or so. eg in http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2011/06/2011-06-03-04-mozilla-aurora/:
firefox-6.0a2.en-US.win32.complete.mar                                    18M  
firefox-6.0a2.en-US.win32.partial.20110602042006-20110603042008.mar      2.5M  

It's certainly possible that courgette can make smaller diffs, particularly for PGO rearranging libxul.dll, but that's for bug 504624 to look at."

--------------

In Aurora Channel, when I click "About Firefox", every day it pushes the whole 17 MB update. Or maybe once every 2 days. I am sure there are not that many fixes landed, but I have to download the whole 17 MB every 1-2 days.
Flags: wanted1.9.2?
Nick's note was to tell you that we already use binary diffs for partial updates. The fact that you're getting full updates is a separate issue. It's possible that using Courgette could make our partial updates somewhat smaller, but it's not going to have a huge impact.
A couple of points:

1. We only do binary deltas to the last release. With Aurora building daily, if you miss one you will likely get the full update. So, unless you use Aurora and install an update every day you'll get the full. And (I haven't checked this) there is likely the potential that the update check + install nag timers are skewed in just the right way with the build times so some people always are > 1 release behind and always get the full.

We are tracking building patch packages for more than the last release in bug 575317.

2. As Reed says in comment 2, Google is currently being sued for using it (see http://www.computerworlduk.com/news/it-business/17350/google-faces-chrome-legal-battle/ as well). We should wait to see how that turns out or at the very least need our own lawyers to do a legal review.
I've filed bug 662208 for the 'always get a complete' issue, since it's not related to Courgette.
(In reply to comment #7)
> Google is currently being sued for using it
> [...]. We should wait to see how that turns out [...]

Since it's hard to find any news more recent than 2009, and some might be left wondering what happened, here's the info I found.
In short, the litigation is still going on, and is not looking very rosy for Google.

Google failed last year to get the patent office to invalidate the patent (despite finding some potential prior art)
Red Bend failed to get an injonction to stop Google from publishing the source.
This will certainly go on for quite some time more.
Join the Firefox Aurora channel! Download 20 MB updates nightly and click through the intrusive installer every time. Why not update when the user isn't using?

http://www.codinghorror.com/blog/2011/05/the-infinite-version.html
That is completely unrelated to this bug.
I'd be interested to see if this actually does a better job than bsdiff on our PGO'ed builds. Note that Google doesn't build Chrome with PGO, so it's possible it has the same issues. (Our nightly updates used to be a few hundred KB before we turned on PGO, then they jumped up into the 2MB range.)
Oh, the link in comment 1 has some tests run by rhelmer a few years ago that seem to show a pretty big improvement.
Yeah. This could be a significant win. Let's do it :-)
(In reply to Asa Dotzler [:asa] from comment #14)
> Yeah. This could be a significant win. Let's do it :-)
Agreed!

This will need to be prioritized with the existing planned work.
(In reply to Robert Strong [:rstrong] (do not email) from comment #15)
> (In reply to Asa Dotzler [:asa] from comment #14)
> > Yeah. This could be a significant win. Let's do it :-)
> Agreed!
> 
> This will need to be prioritized with the existing planned work.

Setting tracking-firefox15? but I'm not sure if it's the correct flag, so pls feel free to change if necessary.
Not going to happen for Firefox 15. We'll evaluate for what release this will land when we assign resources to implement it.
I just built the latest version of courgette and ran some numbers with firefox 12.0 -> 13.0.1.

The existing bsdiff based partial mar is 10,244,590 bytes. Using courgette to do the diffs instead of bsdiff yields a mar which is 8,414,954 bytes - a 17% reduction. In particular, xul.dll.patch shrinks from 4,358,810 bytes to 2,925,995 bytes.

courgette doesn't help with omni.ja at all - we save about 16k vs the original omni.ja.patch. omni.ja.patch is now the largest file in the partial mar taking up around 40% of the size.
It would be interesting to know if bug 772841 also caused the inflation of the omni.ja patch since it would have changed the ordering of the files within omni.ja.
according to 'unzip -l', the ordering between omni.ja in 12.0 and 13.0.1 seems to be the same, modulo new/deleted files.

Also, bsdiff should be able to handle files with the same content that change position within the archive since individual files are compressed in zip rather than the entire archive.
(In reply to comment #18)
> courgette doesn't help with omni.ja at all - we save about 16k vs the original
> omni.ja.patch. omni.ja.patch is now the largest file in the partial mar taking
> up around 40% of the size.

That's not surprising, courgette actually attempts to disassemble executable files and diff the disassembled version while handling things like relative offsets.  I think we should consider using bsdiff for non-executable files when we switch to courgette.
(In reply to Ehsan Akhgari [:ehsan] from comment #21)
> (In reply to comment #18)
> > courgette doesn't help with omni.ja at all - we save about 16k vs the original
> > omni.ja.patch. omni.ja.patch is now the largest file in the partial mar taking
> > up around 40% of the size.
> 
> That's not surprising, courgette actually attempts to disassemble executable
> files and diff the disassembled version while handling things like relative
> offsets.  I think we should consider using bsdiff for non-executable files
> when we switch to courgette.
or evaluating other tools to handle patching zip files. There has been a request to zip up the files during install for example though I am loathe to add that type of complexity especially since we will want to order the files inside of the compressed file.
(In reply to comment #22)
> (In reply to Ehsan Akhgari [:ehsan] from comment #21)
> > (In reply to comment #18)
> > > courgette doesn't help with omni.ja at all - we save about 16k vs the original
> > > omni.ja.patch. omni.ja.patch is now the largest file in the partial mar taking
> > > up around 40% of the size.
> > 
> > That's not surprising, courgette actually attempts to disassemble executable
> > files and diff the disassembled version while handling things like relative
> > offsets.  I think we should consider using bsdiff for non-executable files
> > when we switch to courgette.
> or evaluating other tools to handle patching zip files. There has been a
> request to zip up the files during install for example though I am loathe to
> add that type of complexity especially since we will want to order the files
> inside of the compressed file.

Do you know of any such tools?  Compressing zip files well is pretty tricky...
No, just saying that it should be evaluated though not as part of this bug.
For reference, bug 772868 covers handling zip files better in updates.
This better algorithm might save us a factor of 10x on update size.  In addition to bug 772868, that would be a Big Deal.
blocking-basecamp: --- → ?
(In reply to comment #26)
> This better algorithm might save us a factor of 10x on update size.  In
> addition to bug 772868, that would be a Big Deal.

Last I checked (which is, right now!), courgette only supports Elf and PE x86.  So, how is this going to help b2g?
Yeah, we just refreshed ourselves on this too :).
blocking-basecamp: ? → ---
Keywords: sec-want
Assignee: nobody → areinald
OS: Windows 2000 → All
Hardware: x86 → All
Summary: Consider using the (Windows) Courgette algorithm in nightly/product updates → Consider using the Courgette algorithm in product updates
I was attempting to follow this bug and bug 772868, together with some experiments of my own, and I sort of follow what the current state is but not completely. So let me try to summarize my flawed understanding, and hopefully people can correct me:

1. Aurora gets updated just about every night, deltas are only based on the previous version, and people are seeing frequent (constant?) full downloads instead of delta updates. This might be because of a timing issue. [bug 504624 comment 7]. No bug seems to be open on this.

2. We have the capability to generate partials from multiple base versions, but it doesn't apply to Nightly or Aurora channels. [bug 575317 comment 18]

3. Our partials are currently made using bsdiff on an uncompressed archive that contains a compressed archive member, omni.ja [bug 772868 comment 0]. omni.ja is compressed using zip, which stores individual files separately, so bsdiff is still able to produce savings for unmodified files within omni.ja even though it can't do much of anything for modified files (because they are compressed.) [bug 504624 comment 20].

Background {

Generic compression works mostly by finding identical repeated substrings within one version.

Differential compression works by finding identical substrings between two versions, so that only the changed portions are stored/transmitted.

bsdiff is differential compression that additionally allows small mismatches within the matching substrings (in-place modifications only, no insertions or deletions), and subtracts the substring from its parent to reduce the cost when all of the mismatches differ by a constant amount (common in recompiled object files.) Thus, bsdiff is a general binary delta algorithm that will do particularly well with machine code produced by recompiling relatively small source modifications.

Courgette is even more specialized for machine code. It decompiles machine code and computes the delta based on that, which exposes many more compression opportunities when it knows how to handle the input. (Both bsdiff and courgette will fall back to standard differential compression on data that they aren't specialized for, so there is a small but not significant compression cost to using them on input they are not optimized for.)

}

4. Courgette achieves substantially better compression than bsdiff on the main machine code file (the "MAR", apparently?) [bug 504624 comment 18 and https://groups.google.com/forum/#!topic/mozilla.dev.platform/LRmxeV5uD8A]. bsdiff by itself saves a large amount over generic binary deltas for the machine code files, but also handles many of the non-machine code files (and machine code files for architectures that Courgette does not yet handle.)

5. I experimented with generating binary deltas using rsync's batch files and small difference block sizes. It is fully generic, can do far better than untuned binary deltas, and is nowhere near as good as bsdiff.

6. Courgette has big scary patent encumbrances. Patents are why we can't have nice things. Google is fighting the patent fight and may not be doing so well, and the comments in these bugs give no indication that we would have an easier time. [bug 504624 comment 9]

7. Courgette is naturally very limited in what formats it can handle, and b2g uses something that is not one of those formats [bug 504624 comment 27]. I think Courgette handles ELF x86 objects, and b2g uses ELF ARM objects? So it would need an ARM "port" to be useful there.

8. We build with pgo, which substantially reduces the effectiveness of bsdiff. Presumably, Courgette would not be affected nearly as much.

9. To reduce link-time memory overhead, Windows PGO is now restricted to a smaller subset of files [bug 871712]. I would guess that this should help bsdiff (as in, reduce the PGO-induced hit that bsdiff suffers from.)

10. bsdiff has some pathological behavior on certain inputs, where the initial suffix sorting phase takes insane amounts of time [http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=409664]. Solutions to this have been proposed and appear fairly straightforward. We are not seeing this in practice.

Anything I got wrong? Anything to add?
> 1. Aurora gets updated just about every night, deltas are only based on the
> previous version, and people are seeing frequent (constant?) full downloads
> instead of delta updates. This might be because of a timing issue. [bug
> 504624 comment 7]. No bug seems to be open on this.

People on Nightly and Aurora often get more than one build out of date and we're not currently optimizing for that case. I believe we shouldn't worry about that now. Betas and releases have less of a problem with users skipping builds and needing full updates because of the lower frequency.

> 2. We have the capability to generate partials from multiple base versions,
> but it doesn't apply to Nightly or Aurora channels. [bug 575317 comment 18]

More that we *don't* do it for nightly/aurora, not that we couldn't if it were deemed to be important.

> 4. Courgette achieves substantially better compression than bsdiff on the
> main machine code file

Yes. primarily xul.dll/libxul.so

 (the "MAR", apparently?)

No. The MAR is an archive of individual files or patches. The .bsdiff for xul.dll is just one file within the MAR.

> 7. Courgette is naturally very limited in what formats it can handle, and
> b2g uses something that is not one of those formats [bug 504624 comment 27].
> I think Courgette handles ELF x86 objects, and b2g uses ELF ARM objects? So
> it would need an ARM "port" to be useful there.

Getting updates to be smaller on Windows x86 (and in the future Windows x86-64) is a primary concern; it will reduce our release-day costs as well as significantly increase the number of users who are stay up to date.

> 8. We build with pgo, which substantially reduces the effectiveness of
> bsdiff. Presumably, Courgette would not be affected nearly as much.

This is probably true, but needs to be measured.
Here is what I can add to the discussion:

(In reply to Steve Fink [:sfink] from comment #30)
> 6. Courgette has big scary patent encumbrances. Patents are why we can't
> have nice things. Google is fighting the patent fight and may not be doing
> so well, and the comments in these bugs give no indication that we would
> have an easier time. [bug 504624 comment 9]

On 2012-11-16, a stackoverflow post mentions the legal status of the patent matter:
http://stackoverflow.com/questions/13415147/how-to-compute-differences-between-two-binaries-i-e-two-executables-in-linux
"Yeah, it looks like Red Bend failed to get a preliminary injunction against Google, but Google failed to get the patent invalidated, so the fight goes on."

> 7. Courgette is naturally very limited in what formats it can handle, and
> b2g uses something that is not one of those formats [bug 504624 comment 27].
> I think Courgette handles ELF x86 objects, and b2g uses ELF ARM objects? So
> it would need an ARM "port" to be useful there.

Having checked out and compiled courgette from the Chromium repository, it presently handles 3 executable formats:
- elf 32
  - arm
  - x86
- win 32
  - x86

And the architecture is thought as to ease implementation of additional executable formats / processor instructions set.
For our own usage it would make sense to add :
- win 64 executable format
- mach-o executable format (we can apparently discard Mac OS "fat binaries" as we only build x86_64 versions of Firefox Mac)
- x86_64 instructions

I am currently investigating if Google had some reasons not to implement those yet.

> 8. We build with pgo, which substantially reduces the effectiveness of
> bsdiff. Presumably, Courgette would not be affected nearly as much.

I can confirm that part of the algorithm is about "adjustment" which tries to find similar segments of code even if they have been moved around.

Finally, some topics have been discussed on our dev.platform mailing list:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/LRmxeV5uD8A
An idea to circumvent Courgette patent:

If we can leave functions/method names inside the binaries, or in some way provide those names inside a dictionary (like a .sym file), we could have almost the same functionality as Courgette's, but being different enough so to avoid patent claims. Indeed, most address changes are functions/method calls. Of course this only applies to open source projects, which we are.
(In reply to André Reinald from comment #33)
> An idea to circumvent Courgette patent:
> *** leave functions/method names inside the binaries, ***

The problem is not with Courgette, but with the Red Bend patent. 
The RB patent is strongly different from Courgette, and a much more generic principle of optimization that your idea will not escape from : http://www.google.com/patents/US6546552
Since it's very generic, Google had good hopes of finding prior art before the priority date of 1998, but they failed to convince and patentability has been confirmed in February 2011 : http://www.google.com/patents/US6546552#legal-events
However Red Bend also failed to get a preliminary injunction forbidding Google to publish the source code : http://www.gpo.gov/fdsys/pkg/USCOURTS-mad-1_09-cv-11813/content-detail.html

Apparently still nothing new happened since then. Google and Red Bend might have made an undisclosed back-room deal, and stopped actually pursuing the litigation.
There was talks of IBM acquiring Red Bend, but it apparently didn't work out.

Mozilla legal team needs to try to sort this out, AFAIC contacting Red Bend to get a clear idea of what their current position is might be a good idea.
(In reply to Jean-Marc Desperrier from comment #34)
> Mozilla legal team needs to try to sort this out, AFAIC contacting Red Bend
> to get a clear idea of what their current position is might be a good idea.

Seems they have sorted this out already. In short: we can use Courgette.
(In reply to Steve Fink [:sfink] from comment #30)
> 8. We build with pgo, which substantially reduces the effectiveness of
> bsdiff. Presumably, Courgette would not be affected nearly as much.

This is the most interesting thing, to me. Our Windows updates are much larger than they used to be before we shipped PGO binaries. If courgette performs much better on these builds then that would be a good win.
Assignee: areinald → nobody
Blocks: 1303172
I just tried using courgette to generate a patch for xul.dll between 43.0.1 and 48.0.2. It ran for 9 hours overnight at 100% cpu before I gave up and killed it. It does work on firefox.exe, although it generates a larger patch than bsdiff does!
Not sure what's going on with Chris's testing, but I do get better results. I ran courgette on my Windows machine with the 64-bit xul.dll 43.0.1 to 48.0.2, and I got these numbers:

           creation time   patch size   LZMA'd patch size    apply time
courgette    5m12.48s        26.8MB           12.4MB            4.18s
bsdiff         58.48s        69.3MB           15.7MB            0.29s

So that's pretty good. Even after applying LZMA we still have a 21% size reduction (remember this is just xul.dll, not an entire MAR). I don't know whether the creation time would be an issue; note that an entire CPU thread is consumed for that entire time, and it gradually works its way up to about 2 GB of memory usage.
I am a little concerned about the time required to apply the courgette patch. My machine has a pretty good CPU and an SSD and wasn't doing much else, so since it took that long under those conditions, I'm worried about that process being unacceptably taxing on a less powerful or heavily loaded system. I would want to test that before making any decisions.

I also wanted to try 48.0.2 -> 49.0, because that's something that we would more realistically generate a partial MAR for. The results are about the same:

           creation time   patch size   LZMA'd patch size    apply time
courgette    5m17.26s        20.6MB           11.1MB            4.83s
bsdiff       1m05.69s        69.2MB           14.4MB            0.33s
I've looked into the performance of Courgette patch application, I'm attaching a very rough patch which provides significant but probably still insufficient improvements.

With some recent Firefox releases (64-bit Windows xul.dll 56.0.2 to 57.0), here is what the compression and times look like on my Macbook. I'm including a row for Courgette's internal bsdiff implementation which is maybe also worth considering for small but fast gains.

                  creation time   xz'd patch size   apply time   apply time (patched)
courgette            7m31s           8.1M              4.656s       1.887s
courgette bsdiff       38s          11.3M              0.845s       0.402s
mbsdiff              1m11s          11.8M              0.312s

Courgette application is 2.5x faster with the current patch, but still 6x slower than our bspatch application. Without courgette's three checksum verifications this goes down to 5.2x slower, and I may see a way to get down to 4.7x if we can eliminate one final correction patch applied after the reassembly.

For one more comparison here's the most recent point release, 57.0.3 to 57.0.4:

                  creation time   xz'd patch size   apply time   apply time (patched)
courgette            7m28s           2.0M              4.682s       1.837s
courgette bsdiff       25s           4.9M              0.883s       0.356s
mbsdiff                51s           5.2M              0.237s

Changes in the patch are:
- Never write an EncodedProgram, writes are made directly from the disassembler to a SinkStreamSet via InstructionStreamWriteReceptor (required hacking SinkStream to hold headers for the varint number of elements in a stream)

- Special cases for guaranteed writes of 1 byte to a vector or stream to avoid calling into memcpy

- Batched writes in AssembleTo's COPY case

- Reordered comparisons in Rel32FinderX64 to check smaller conditions first

- Chunks are copied all at once in bsdiff apply, then adjustment is made on top of them, which avoids passing over unmodified data slowly (required hacking SinkStream for re-reading)
Priority: -- → P2
Priority: P2 → P3
Keywords: sec-want
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.