Consider bsopt or mbsdiff30 for smaller patch size

RESOLVED FIXED in Firefox 61

Status

()

enhancement
P1
normal
RESOLVED FIXED
Last year
Last year

People

(Reporter: agashlin, Assigned: agashlin)

Tracking

(Blocks 1 bug)

unspecified
mozilla61
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox61 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

tl;dr we could save 1.2 MB on a Windows 64 incremental update MAR with a new, but slow, diff generation tool, or 0.7 MB with a small patch to mbsdiff.

---

I’ve written a new diff generator called bsopt for the bsdiff format used by incremental updates, which produces about 8% smaller patches than mbsdiff after LZMA compression, and which is compatible with the bspatch we use now. It is written in Rust, though I am planning a C++ implementation for incorporation into Courgette.

The downside is it takes quite a while to run, in the vicinity of 10 minutes for xul.dll in the below comparison and 22 minutes for a whole incremental patch. Patch application times are not impacted.

I ported part of this back to mbsdiff, but it doesn’t do quite as well without the optimization pass, particularly on clang binaries where it gets half the savings of bsopt. It will be much easier to review the three line change to mbsdiff, though, and it is much faster, so that might still be preferred. It is listed below as mbsdiff30.

Here are measurements for a patch of Windows 64 xul.dll from 56.0.2 to 57.0, using xz --compress --x86 --lzma2 --format=xz --check=crc64 as in the current make_incremental_update.sh, run on a MacBook Pro. The x86 filter rarely leads to any improvement on these patches so I also ran most tests without that.

           creation time   LZMA’d patch size    LZMA without --x86
mbsdiff     1m40s          12.3M                12.1M
mbsdiff30   1m40s          11.6M (-0.7M, 5.7%)  11.5M (-0.6M, 5.0%)
bsopt       9m54s          11.2M (-1.1M, 8.9%)  11.2M (-0.9M, 7.4%)

Here’s a whole incremental update MAR (Windows 64, 56.0.2 -> 57.0): 

           creation time   LZMA’d patch size    LZMA without --x86
mbsdiff     3m15s          14.6M                14.4M
mbsdiff30   4m00s          13.9M (-0.7M, 4.8%)  13.8M (-0.6M, 4.2%)
bsopt      22m23s          13.4M (-1.2M, 8.2%)  13.4M (-1.0M, 6.9%)

It may only be worthwhile to run bsopt on xul.dll, since that represents 90% of the improvement but less than half the creation time.

Windows 32 doesn’t lose quite as much as shown in this incremental MAR (56.0.2 -> 57.0):

           creation time   LZMA’d patch size    LZMA without --x86
mbsdiff     2m54s          12.9M                12.7M
mbsdiff30   3m35s          12.7M (-0.2M, 1.6%)  12.6M (-0.1M, 0.7%)
bsopt      20m16s          12.4M (-0.5M, 3.9%)  12.4M (-0.3M, 2.4%)

Linux 64 libxul.so (56.0.2 -> 57.0):

           creation time   LZMA’d patch size    LZMA without --x86
mbsdiff     2m25s          24.9M                24.5M
mbsdiff30   2m34s          23.5M (-1.4M,  5.6%) 23.2M (-1.3M,  5.3%)
bsopt      10m09s          21.9M (-3.0M, 12.0%) 22.0M (-2.5M, 11.4%)

This is approximately the same savings we would get on Linux with Courgette.

With Mac XUL (56.0.2 -> 57.0):

           creation time   LZMA’d patch size    LZMA without --x86
mbsdiff     2m24s          11.8M                11.6M
mbsdiff30   2m18s          11.4M (-0.4M, 3.4%)  11.2M (-0.4M, 3.4%)
bsopt      12m00s          10.8M (-1.0M, 8.5%)  10.6M (-1.0M, 8.6%)


and for a different flavor of file, browser/omni.ja (from Mac, 56.0.2 -> 57.0):

           creation time   LZMA’d patch size    LZMA without --x86
mbsdiff      29s           1.0M                 1.0M
mbsdiff30    28s           983K (-17K, 1.7%)    978K (-22K, 2.2%)
bsopt      5m34s           930K (-70K, 7.0%)    926K (-74K, 7.4%)


Finally, a point release incremental MAR (Windows 64, 57.0 -> 57.0.1):

           creation time   LZMA’d patch size
mbsdiff     2m45s           7.1M
mbsdiff30   3m26s           6.8M (-0.3M, 4.2%)
bsopt      23m39s           6.8M (-0.3M, 4.2%)

---

I'm attaching the mbsdiff30 patch. I'm not sure what to do yet with bsopt as it could use some cleanup, and currently it uses a port of qsufsort from bsdiff so I think it would all have to be licensed under the BSD Protection License. I'm looking into porting divsufsort [1] or SAIS [2], which both have the MIT License, and divsufsort at least is considerably faster.

[1] https://github.com/y-256/libdivsufsort
[2] https://sites.google.com/site/yuta256/sais
agashlin, excellent report on your investigation!

catlee, what do you think about this... specifically the mar file size reduction vs. the additional time it takes to produce the mar files?
Flags: needinfo?(catlee)
Would this require any changes to the client? (aka: would we need to watershed for it?)
(In reply to Ben Hearsum (:bhearsum) from comment #2)
> Would this require any changes to the client? (aka: would we need to
> watershed for it?)

Sorry, just reread comment #0 and answered by own question (yes!).
This is good work! I'm not too concerned by the increased run-time: the patch file caching that was recently added and the fact that libxul/xul.dll aren't localised means that the en-US partials task can happily spend 20 minutes doing its thing and still be done before the localised partials start. The other tasks will then just hit the cache.
This should not require any changes to the client. I haven't run through a full updater test yet, but my standalone build of bspatch handles these files well with either change.

Sorry for dumping so much into the description!
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #1)
> agashlin, excellent report on your investigation!
> 
> catlee, what do you think about this... specifically the mar file size
> reduction vs. the additional time it takes to produce the mar files?

I'm not concerned about the time. As Simon says, we have pretty good caching now, so the impact to wall clock time is probably minimal.

If bsopt still needs some work, could we go with the mbsdiff30 patch for now?
Flags: needinfo?(catlee)
Quick update, I made a few changes that have improved compression in some cases and significantly increased speed. I ran tests on most platforms, here are the numbers for incremental MARs (for all runs I left --x86 on):

56.0.2 -> 57.0
win32:   12.2M (from 12.9M, -0.7M)
win64:   13.3M (from 14.5M, -1.2M)
linux64: 25.7M (from 28.1M, -2.4M)
mac:   13.9M (from 15.2M, -1.3M)

57.0 -> 57.0.1
win32:   6.1M (almost the same as the original)
win64:   6.7M (from 7.1M, -0.6M)
linux64: 6.0M (from 5.8M, +0.2M)
mac:   2.4M (original was 60K smaller)

Things are a lot less clear cut with smaller changes, even losing ground in some cases. Since this is all still the same format, it might make sense to run both diff programs and just pick the smaller output file!

Each run took about 7 minutes, longest was mac 56.0.2 -> 57.0 which took 8m 26s. And again this is on my MacBook which takes an hour to compile Firefox, so mileage will vary. That time includes everything, including an extra xz pass to estimate file compressibility.


I'll run these tests again with mbsdiff30 tomorrow, I expect it will have the same directions but smaller magnitudes, we'll see if it still makes sense then.
Blocks: 1303172
Assignee: nobody → agashlin
Priority: -- → P1
Apologies for the delay here. Did you have an updated diff with your most recent changes? 

I'll get the mbsdiff30 patch applied first.
I'm sorry about the delay, here's the numbers for mbsdiff30, in parentheses is (old size, change):

56.0.2 -> 57.0
linux32 30.3M (30.4M, -0.1M)
linux64 27.4M (28.1M, -0.7M)
mac     14.7M (15.1M, -0.4M)
win32   12.7M (12.9M, -0.2M)
win64   13.9M (14.6M, -0.7M)

57.0 -> 57.0.1
linux32 6.4M (6.4M, 0)
linux64 5.8M (5.8M, 0)
mac     2.4M (2.4M, 0)
win32   6.2M (6.1M, -0.1M)
win64   6.8M (7.1M, -0.3M)

I haven't updated anything since posting the above patch, but I'll revise it to apply cleanly to the full tree path.

I'm trying to get some testing in place for patch generation, sfraser is there anything run regularly on the releng side already for automatically test-applying generated patches?
Flags: needinfo?(sfraser)
I found the Courgette documentation of bsdiff [1] helpful in understanding this code.

The changes work as follows:

>                         if(((len==oldscore) && (len!=0)) || 
> -                               (len>oldscore+8)) break;
> +                               (len>oldscore+10)) break;

Stop scanning when the next match would be 10 bytes better than the current one we're extending now, instead of 8.

> -                               if(s*2-i>Sf*2-lenf) { Sf=s; lenf=i; };
> +                               if(s*3-i*2>Sf*3-lenf*2) { Sf=s; lenf=i; };

|s*2 - i| is a score derived from |i|, the number of bytes by which we will extend the match forward, and |s|, the number of matched bytes among those:

  matches * 2 - extend
= (extend - mismatches) * 2 - extend
= extend - mismatches * 2

So each mismatch contributes a penalty of -2. We change this weighting to -3:

  matches * 3 - extend * 2
= (extend - mismatches) * 3 - extend * 2
= extend - mismatches * 3

The loop below with lenb works similarly.

[1] https://chromium.googlesource.com/chromium/src/courgette/+/7a041c6f4f50e7c8004bb54fe964466228a63cca/third_party/bsdiff/bsdiff_create.cc#136
Attachment #8946966 - Attachment is obsolete: true
Attachment #8956667 - Flags: review?(mh+mozilla)
(In reply to Adam Gashlin [:agashlin] from comment #9)
> I'm sorry about the delay, here's the numbers for mbsdiff30, in parentheses
> is (old size, change):

Don't worry about any delay, you've already made a great contribution! I was just making sure I had the up to date patch.

> I'm trying to get some testing in place for patch generation, sfraser is
> there anything run regularly on the releng side already for automatically
> test-applying generated patches?

There's update verification for releases, but not for nightly. It's something we should add, perhaps even in the partials task.
Flags: needinfo?(sfraser)
Attachment #8956667 - Flags: review?(mh+mozilla) → review+
https://hg.mozilla.org/mozilla-central/rev/660d7b66d239
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
See Also: → 1450011
You need to log in before you can comment on or make changes to this bug.