x86 BCJ filter shouldn't be used on diffs in partial updates
Categories
(Release Engineering :: Release Automation: Updates, enhancement, P1)
Tracking
(firefox79 fixed)
People
(Reporter: agashlin, Assigned: agashlin)
Attachments
(1 file)
A BCJ filter works by converting relative offsets in code into absolute positions, which increases redundancy because every reference to a given position is then represented by the same bytes. This assumes that the files are executables and performs a simple disassembly to locate the relative offsets. The disassembly is optimized for speed and doesn't have to be completely accurate, but inaccurately identified targets will reduce compressibility. Also, some relative offsets are likely to be fairly common (e.g. short-distance branches), while the positions they refer to are individually referenced more rarely. Generally these issues are more than offset by the improvement in commonly referenced code.
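As a rough illustration of the conversion described above, here is a minimal sketch in Python. It is a simplification and not the filter used in the updater: it treats every 0xE8/0xE9 byte as the start of a CALL/JMP with a 32-bit relative operand, whereas real filters apply additional heuristics. The function and helper names are hypothetical.

```python
import struct

CALL_NEAR = 0xE8  # CALL rel32
JMP_NEAR = 0xE9   # JMP rel32


def bcj_x86_encode_sketch(code: bytes, start_offset: int = 0) -> bytes:
    """Rewrite the rel32 operand after each E8/E9 byte as an absolute target.

    Simplified for illustration: no heuristics to reject false positives.
    """
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] in (CALL_NEAR, JMP_NEAR):
            (rel,) = struct.unpack_from("<i", out, i + 1)
            # rel32 is relative to the address of the next instruction.
            absolute = (start_offset + i + 5 + rel) & 0xFFFFFFFF
            struct.pack_into("<I", out, i + 1, absolute)
            i += 5  # skip the operand we just rewrote
        else:
            i += 1
    return bytes(out)


def call_to(position: int, target: int) -> bytes:
    """Build a CALL rel32 located at `position` that targets `target`."""
    return bytes([CALL_NEAR]) + struct.pack("<i", target - (position + 5))


# Two calls to the same target (0x1000) from different positions,
# separated by three NOP bytes.
sample = call_to(0, 0x1000) + b"\x90" * 3 + call_to(8, 0x1000)
encoded = bcj_x86_encode_sketch(sample)

# Before filtering the two operands differ; afterwards they are identical.
print(sample[1:5].hex(), sample[9:13].hex())
print(encoded[1:5].hex(), encoded[9:13].hex())
```

After the transform, both calls carry the same four operand bytes, which is the extra redundancy a compressor can then exploit.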
Our update .mar files apply this filter to each file in the update, whether it is a wholly new file or a BSDiff patch. For a patch the analysis is less valuable: the patch contains pieces of new code at arbitrary positions, so relative offsets do not translate reliably to the same absolute position.
The upshot is that using BCJ increases patch size. Here are the effects of disabling BCJ on recent partial updates:
update | arch | BCJ | no BCJ | decrease |
---|---|---|---|---|
76.0-77.0 | win32 | 12,554,866 | 12,274,570 | 280,296 |
76.0-77.0 | win64 | 12,444,221 | 12,137,601 | 306,620 |
2020-06-01-09-38-12 - 2020-06-01-21-42-28 | win32 | 7,486,437 | 7,441,697 | 44,740 |
2020-06-01-09-38-12 - 2020-06-01-21-42-28 | win64 | 8,450,892 | 8,353,316 | 97,576 |
It's not a big difference, but it is simple to fix.
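To put the table in relative terms, the decrease can be expressed as a percentage of the BCJ-filtered size. The short Python sketch below does that arithmetic using the figures from the table; the helper name is hypothetical. The savings work out to roughly 0.6% to 2.5%.

```python
# (update, arch, size with BCJ, size without BCJ), copied from the table above.
rows = [
    ("76.0-77.0", "win32", 12_554_866, 12_274_570),
    ("76.0-77.0", "win64", 12_444_221, 12_137_601),
    ("2020-06-01-09-38-12 - 2020-06-01-21-42-28", "win32", 7_486_437, 7_441_697),
    ("2020-06-01-09-38-12 - 2020-06-01-21-42-28", "win64", 8_450_892, 8_353_316),
]


def percent_decrease(with_bcj: int, without_bcj: int) -> float:
    """Size saved by disabling BCJ, as a percentage of the BCJ size."""
    return 100.0 * (with_bcj - without_bcj) / with_bcj


for update, arch, with_bcj, without_bcj in rows:
    saved = with_bcj - without_bcj
    pct = percent_decrease(with_bcj, without_bcj)
    print(f"{update} {arch}: {saved:,} ({pct:.2f}%)")
```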
Comment 2•5 years ago
I wonder also about using the BCJ filter on non-executable data like the omni.ja files. Does disabling it there have any impact?
Assignee
Comment 3•5 years ago
Good question. On x86/x64 (the only arches we use BCJ on), the only opcodes the filter looks for are 0xE8 (CALL near) and 0xE9 (JMP near), and since omni.ja is mostly ASCII text (bytes < 0x80), these shouldn't come up too often. For 77.0:
file | BCJ | no BCJ | decrease |
---|---|---|---|
omni.ja | 4,816,116 | 4,803,712 | 12,404 |
browser/omni.ja | 11,836,084 | 11,837,088 | -1,004 |
It's weird that the size goes up slightly with browser/omni.ja, but at that level it's likely just noise.
I didn't want to include something like this in the patch because it's a little more complicated to check for these files, and adding a "non-exe" list would be a pain to maintain for little benefit.
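For anyone wanting to check how often the two opcode bytes mentioned above actually occur in a given file, a quick Python sketch follows. It only counts raw 0xE8/0xE9 bytes, which is an upper bound on what a filter might touch and not a measurement of what the real filter does. The function names are hypothetical.

```python
def count_bcj_candidates(data: bytes) -> int:
    """Count bytes equal to 0xE8 (CALL near) or 0xE9 (JMP near)."""
    return data.count(b"\xe8") + data.count(b"\xe9")


def candidate_density(data: bytes) -> float:
    """Fraction of bytes that are candidate opcodes (0.0 for empty input)."""
    return count_bcj_candidates(data) / len(data) if data else 0.0


# Pure ASCII text never contains either byte, since both are >= 0x80.
text = b"pref('browser.startup.page', 1);\n"
print(count_bcj_candidates(text), candidate_density(text))

# A small buffer containing one CALL and one JMP opcode byte.
code_like = bytes([0xE8, 0, 0, 0, 0, 0x90, 0xE9, 1, 2, 3, 4])
print(count_bcj_candidates(code_like), candidate_density(code_like))
```

To run it against a real file, read the file in binary mode (for example with `pathlib.Path(...).read_bytes()`) and pass the result to `candidate_density`.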
Comment 4•4 years ago
Weird, I have the opposite results with nightly:
file | BCJ | no BCJ | decrease |
---|---|---|---|
omni.ja | 5,004,308 | 4,991,108 | 13,200 |
browser/omni.ja | 12,909,728 | 12,905,312 | 4,416 |
In any case, it's a pretty small difference for these two files. There are a few other non-executable files in the updates, but those are definitely the largest.