Bug 1643211 Comment 0 Edit History

A BCJ (branch/call/jump) filter converts relative offsets in code into absolute positions, so that every reference to a given position is represented by the same bytes, which increases redundancy for the compressor. It assumes that the files are executables and performs a simple disassembly to locate the relative offsets. This disassembly is optimized for speed and doesn't have to be completely accurate, but inaccurately identified targets reduce compressibility. Also, some relative offsets are likely to be fairly common (e.g. short-distance branches), while the positions they refer to are each referenced more rarely, so converting them loses redundancy. Generally these costs are more than outweighed by the improvement in commonly referenced code.
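
As a rough illustration of the conversion described above, here is a minimal sketch that handles only the x86 `CALL rel32` opcode (`0xE8`). It is not the filter used in real compressors, which apply extra heuristics to avoid false positives. The names `bcj_encode`, `bcj_decode` and `make_call` are made up for this example.

```python
import struct

CALL_OPCODE = 0xE8  # x86 CALL with a 32-bit relative operand


def _transform(data: bytes, encode: bool) -> bytes:
    """Rewrite the operand of every 0xE8 byte found by a naive linear scan."""
    out = bytearray(data)
    i = 0
    while i + 5 <= len(out):
        if out[i] == CALL_OPCODE:
            (value,) = struct.unpack_from("<I", out, i + 1)
            next_insn = i + 5  # relative offsets count from the next instruction
            if encode:
                value = (value + next_insn) & 0xFFFFFFFF  # relative -> absolute
            else:
                value = (value - next_insn) & 0xFFFFFFFF  # absolute -> relative
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the operand so encoder and decoder stay in step
        else:
            i += 1
    return bytes(out)


def bcj_encode(data: bytes) -> bytes:
    return _transform(data, encode=True)


def bcj_decode(data: bytes) -> bytes:
    return _transform(data, encode=False)


def make_call(position: int, target: int) -> bytes:
    """Hypothetical helper: assemble a CALL at `position` that jumps to `target`."""
    rel = (target - (position + 5)) & 0xFFFFFFFF
    return bytes([CALL_OPCODE]) + struct.pack("<I", rel)


# Two calls to the same target from different positions, padded with NOPs.
blob = make_call(0, 0x1000) + b"\x90" * 11 + make_call(16, 0x1000)
encoded = bcj_encode(blob)

print(blob[1:5] == blob[17:21])        # operands differ before filtering
print(encoded[1:5] == encoded[17:21])  # operands match after filtering
print(bcj_decode(encoded) == blob)     # the transform is reversible
```

The scan treats any `0xE8` byte as a call, including bytes that are really data or part of another instruction. That is the "doesn't have to be completely accurate" trade-off: the transform stays reversible, but a false match replaces bytes with less compressible ones.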

Our update MARs (.mar files) apply this filter to each file in the update, whether it is a wholly new file or a BSDiff patch. For a patch the analysis is less valuable, because the patch contains pieces of new code at arbitrary positions, so a relative offset will not reliably translate to the same absolute position.
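
One way to check this per file is to compress the same bytes with and without a BCJ stage and compare the sizes. The sketch below uses Python's standard `lzma` module and assumes an xz/LZMA2-style compressor with the x86 BCJ filter; the comment does not say which compressor or settings the MAR tooling uses. The function name `compressed_sizes` and the preset are made up for this example.

```python
import lzma


def compressed_sizes(data: bytes, preset: int = 6) -> dict:
    """Return the xz-compressed size of `data` with and without the x86 BCJ filter."""
    lzma2 = {"id": lzma.FILTER_LZMA2, "preset": preset}
    with_bcj = lzma.compress(
        data,
        format=lzma.FORMAT_XZ,
        filters=[{"id": lzma.FILTER_X86}, lzma2],  # BCJ stage, then LZMA2
    )
    without_bcj = lzma.compress(
        data,
        format=lzma.FORMAT_XZ,
        filters=[lzma2],  # LZMA2 only
    )
    return {"bcj": len(with_bcj), "no_bcj": len(without_bcj)}
```

Running this on a BSDiff patch file and on a complete executable from the same update would show whether the filter helps or hurts each kind of input. A usage call might look like `compressed_sizes(open(path, "rb").read())`, where `path` is whichever file is being tested.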

The upshot is that using BCJ increases patch size. Here are the effects on recent partial updates of disabling BCJ:

update | arch | BCJ | no BCJ | decrease
--- | --- | --- | --- | ---
76.0-77.0 | win32 | 12,554,866 | 12,274,570 | 280,296
76.0-77.0 | win64 | 12,444,221 | 12,137,601 | 306,620
2020-06-01-09-38-12 - 2020-06-01-21-42-28 | win32 | 7,486,437 | 7,441,697 | 44,740
2020-06-01-09-38-12 - 2020-06-01-21-42-28 | win64 | 8,450,892 | 8,353,316 | 97,576

It's not a big difference, but it is simple to fix.
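
To put "not a big difference" in relative terms, the table's figures can be turned into percentages with a few lines of arithmetic. The numbers are copied from the table; the function name `savings` is made up for this example.

```python
# (update, arch, size with BCJ, size without BCJ), copied from the table above
ROWS = [
    ("76.0-77.0", "win32", 12_554_866, 12_274_570),
    ("76.0-77.0", "win64", 12_444_221, 12_137_601),
    ("2020-06-01-09-38-12 - 2020-06-01-21-42-28", "win32", 7_486_437, 7_441_697),
    ("2020-06-01-09-38-12 - 2020-06-01-21-42-28", "win64", 8_450_892, 8_353_316),
]


def savings(with_bcj: int, without_bcj: int) -> tuple:
    """Return (absolute decrease, decrease as a percentage of the BCJ size)."""
    decrease = with_bcj - without_bcj
    return decrease, 100.0 * decrease / with_bcj


for update, arch, with_bcj, without_bcj in ROWS:
    decrease, percent = savings(with_bcj, without_bcj)
    print(f"{update} {arch}: {decrease:,} smaller ({percent:.2f}%)")
```

On these four rows the saving works out to roughly 0.6% to 2.5% of the BCJ-compressed size.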
