0.1% installer size (Windows) regression on Tue December 17 2024
Categories
(Toolkit :: Blocklist Implementation, defect, P1)
Tracking
()
Version | Tracking | Status |
---|---|---|
firefox-esr128 | --- | fix-optional |
firefox133 | --- | wontfix |
firefox134 | --- | fix-optional |
firefox135 | --- | fix-optional |
People
(Reporter: intermittent-bug-filer, Assigned: robwu)
References
Details
(Keywords: perf-alert, regression, Whiteboard: [addons-jira])
Perfherder has detected a build_metrics performance regression from push 1938560610a1d0ab8716d0a337e5f4dcafc61e9f. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
0.10% | installer size | windows2012-64 | | 101,929,773.50 -> 102,028,444.25 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.
You can run all of these tests on try with ./mach try perf --alert 43157
The following documentation link provides more information about this command.
For more information on performance sheriffing please see our FAQ.
If you have any questions, please do not hesitate to reach out to afinder@mozilla.com.
Comment 2•27 days ago
(In reply to Treeherder Bug Filer from comment #0)
Perfherder has detected a build_metrics performance regression from push 1938560610a1d0ab8716d0a337e5f4dcafc61e9f.
That changeset was a test-only change, so couldn't have caused this unless something went really wrong.
The other changeset in that landing was 1033685b7f6dcca657f68af25a71a1f6cf2c67f5, which in its regular updates also has the new binary bloomfilter attachments. So this is probably a regression from bug 1922308.
Assignee | ||
Comment 3•27 days ago
This is not a regression of bug 1922308.
The main relevant change from that bug is softblocks-addons-mlbf.bin.meta.json, whose size changes from 4009 to 9413.
There is another change:
file: addons-mlbf.bin.meta.json
old size: 845025
new size: 931343
These increases are not minor. The same commit shows that the number of removed JSON entries (that get compressed and included in the bloomfilter) is relatively small. This increase in data size implies that either many additional items are being included (=add-ons being blocked) or there is a bug in the bloomfilter generation.
I'll raise this within my team.
Comment 4•23 days ago
(In reply to Rob Wu [:robwu] from comment #3)
This is not a regression of bug 1922308.
The main relevant thing from bug is softblocks-addons-mlbf.bin.meta.json, whose size changes from 4009 to 9413.
There is another change:
file: addons-mlbf.bin.meta.json
old size: 845025
new size: 931343
These increases are not minor. The same commit shows that the number of removed JSON entries (that get compressed and included in the bloomfilter) is relatively small. This increase in data size implies that there is either a lot of additional items being included (=add-ons being blocked) or a bug in the bloomfilter generation.
I'll raise this within my team.
Hi Rob!
Any updates on this regression? It looks like it has propagated downstream to mozilla-beta.
Thanks!
Assignee | ||
Comment 5•22 days ago
This is not a Firefox-side regression. The addons-mlbf.bin file is fetched from Remote Settings by periodic_file_updates.sh. This script is run on all branches, so the binary size increase will be observed on Nightly, Beta, DevEd, Release, ESR128 and ESR115. Only Firefox desktop is affected; the mobile browsers are not, because we do not package these remote settings dumps there out of size concerns.
The addons-mlbf.bin file is a multi-layered bloom filter (aka cascade filter) that compactly represents the set of blocked add-ons among the full set of signed add-ons (at the time of bloom filter generation).
This remote settings attachment is updated by AMO, with generation logic at:
- generate_mlbf: https://github.com/mozilla/addons-server/blob/ffd993045cef73fb7635e4a94ac95e1a3a9073ec/src/olympia/blocklist/mlbf.py#L41-L74
- generate_and_write_filter: https://github.com/mozilla/addons-server/blob/ffd993045cef73fb7635e4a94ac95e1a3a9073ec/src/olympia/blocklist/mlbf.py#L234-L280
- filtercascade library: https://github.com/mozilla/filter-cascade/blob/b5469a2d0efaf150e3ce4c04b0d21b4ba2306cf7/filtercascade/__init__.py
To verify the bloomfilter generation, the following inputs are needed:
- The include and exclude values, i.e. the set of all add-on versions, partitioned by hard blocked (=include) and not hard blocked (=exclude). All entries should be formatted as ${addon.guid}:${addon.version}.
- The salt used in the bloom filter. I can extract it from the start of the addons-mlbf.bin file (see filtercascade file format).
The history of past dumps can be found in the git log of addons-mlbf.bin. A human-readable overview of size changes is visible in the addons-mlbf.bin.meta.json file that is included with each update (git log of addons-mlbf.bin.meta.json).
Although I don't know the inputs at the time of the generation, I am able to parse the old and new bloom filters to run an analysis:
Property | new | old | older | oldest |
---|---|---|---|---|
date | 17 dec 2024 | 7 nov 2024 | 26 aug 2024 | 1 jun 2020 |
addons-mlbf.bin.meta.json | new meta.json | old meta.json | older meta.json | oldest meta.json |
addons-mlbf.bin | new addons-mlbf.bin | old addons-mlbf.bin | older addons-mlbf.bin | oldest addons-mlbf.bin |
size of addons-mlbf.bin | 931343 bytes | 845025 bytes | 841024 bytes | 787677 bytes |
size of first layer | 608252 bytes | 379564 bytes | 374473 bytes | 318293 bytes |
size of second layer | 66032 bytes | 142117 bytes | 142526 bytes | 145398 bytes |
size of third layer | 105102 bytes | 105040 bytes | 105327 bytes | 103894 bytes |
size of fourth layer | 31003 bytes | 66531 bytes | 66732 bytes | 67962 bytes |
number of hash functions at first layer | 3 | 2 | 2 | 2 |
number of hash functions at other layers | 1 | 1 | 1 | 1 |
number of layers | 24 | 25 | 24 | 24 |
The interesting observation is that the first layer of the latest bloom filter is much larger (608252 bytes, up from 379564 bytes) and that it now uses 3 hash functions instead of 2.
These sizes are directly derived from the input, with the source code at filtercascade's calc_n_hashes and calc_size functions. Although the size is dependent on the number of elements (i.e. from the "include" set) and the falsePositiveRate, the calc_n_hashes function takes only one input:
number of hash functions at first layer = ceil( log2(1 / falsePositiveRate) )
From this, my conclusion is that the falsePositiveRate dropped so much that the result of the computation went from 2 to 3. That implies that falsePositiveRate went below 2 ^ -2.5 = 0.1767766952966369. EDIT: -2, not -2.5 - see comment 7 below.
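As a quick check of the rounding behaviour, here is a minimal sketch of the formula quoted above. The function name n_hashes_first_layer is made up for illustration and is not filtercascade's own code; it only mirrors ceil( log2(1 / falsePositiveRate) ).

```python
import math


def n_hashes_first_layer(false_positive_rate: float) -> int:
    """Illustrative only: mirrors ceil(log2(1 / falsePositiveRate))."""
    return math.ceil(math.log2(1 / false_positive_rate))


# Sample rates on both sides of 0.25 to see where the result changes.
for rate in (0.5, 0.3, 0.25, 0.2, 0.125):
    print(rate, n_hashes_first_layer(rate))
```

Because the value is rounded up, the result stays at 2 for a rate of exactly 0.25 and becomes 3 for any rate just below it, which is consistent with the EDIT note pointing at comment 7.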
The falsePositiveRate is computed by set_crlite_error_rates, as:
falsePositiveRate at first layer = include_len / (sqrt(2) * exclude_len)
falsePositiveRate at other layers = 0.5 (implying number of hash functions at every other layer is 1, because log2(1 / 0.5) = 1)
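To make the first-layer formula concrete, the following sketch evaluates it for two made-up pairs of set sizes. The helper name and the counts are assumptions chosen for illustration; they are not production numbers and this is not the library's implementation.

```python
import math


def first_layer_false_positive_rate(include_len: int, exclude_len: int) -> float:
    """Illustrative only: include_len / (sqrt(2) * exclude_len)."""
    return include_len / (math.sqrt(2) * exclude_len)


# Hypothetical counts: 1 blocked version per 2 non-blocked, then 1 per 4.
for include_len, exclude_len in ((2_000, 4_000), (1_000, 4_000)):
    rate = first_layer_false_positive_rate(include_len, exclude_len)
    print(include_len, exclude_len, round(rate, 6))
```

The rate falls as the exclude set grows relative to the include set, so a growing number of non-blocked add-on versions pushes the first-layer rate down.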
Since we know that the falsePositiveRate crossed 2 ^ -2.5, we can therefore compute the threshold ratio include_len / exclude_len:
include_len / exclude_len
= sqrt(2) * falsePositiveRate
= sqrt(2) * 2^-2.5
= 0.25 (or lower in the new bloom filter)
Translated back to the original application, this means that for every blocked add-on (=include), there are 4 or more non-blocked add-ons (=exclude). Or equivalently, the relative number of blocked add-ons among all add-ons dropped below 20%. The most likely explanation is that the total number of add-ons increased past this threshold. For reference, in 2020 the number was closer to 33%, according to this comment in addons#7492.
Given this analysis, it looks like a way to immediately improve the space usage is to fix falsePositiveRate above 2^-2.5, e.g. to 0.176777, so that number of hash functions at first layer is fixed at 2. That can be achieved by removing set_crlite_error_rates(include_len=error_rates[0], exclude_len=error_rates[1]) in favor of cascade.set_error_rates([0.176777, 0.5]) at https://github.com/mozilla/addons-server/blob/0f718e347cde838085c9f8b2f5eec8fb45f125b4/src/olympia/blocklist/mlbf.py#L55-L57
The above magic number is specific to the current numbers on production. Given the benefits of smaller sizes, it may be worth trying to generate different bloom filters with different parameters, and taking the smallest result out of the attempts. I'll file a follow-up task for the addons-server repo.
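The "try several parameters and keep the smallest" idea could look roughly like the sketch below. Everything here is hypothetical: smallest_filter and fake_build are invented names, and fake_build returns placeholder byte strings with arbitrary sizes instead of generating a real filter, so the numbers carry no meaning beyond demonstrating the selection step.

```python
from typing import Callable, Iterable, Sequence, Tuple


def smallest_filter(
    build: Callable[[Sequence[float]], bytes],
    candidates: Iterable[Sequence[float]],
) -> Tuple[Sequence[float], bytes]:
    """Build one filter per candidate error-rate list and keep the smallest."""
    best = None
    for rates in candidates:
        data = build(rates)
        if best is None or len(data) < len(best[1]):
            best = (rates, data)
    if best is None:
        raise ValueError("no candidate error rates given")
    return best


def fake_build(rates: Sequence[float]) -> bytes:
    """Stand-in for real filter generation; the sizes are arbitrary."""
    fake_sizes = {0.2: 930, 0.25: 850, 0.3: 870}
    return bytes(fake_sizes[rates[0]])


rates, data = smallest_filter(fake_build, [[0.2, 0.5], [0.25, 0.5], [0.3, 0.5]])
print(rates, len(data))
```

In a real setup the build callable would wrap the actual generation step, and each candidate output would still need to be verified against the include and exclude sets before being published.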
Assignee | ||
Comment 6•22 days ago
I filed https://github.com/mozilla/addons/issues/15261
I'll keep this bug open until a new MLBF has been generated and published.
Assignee | ||
Comment 7•7 days ago
(In reply to Rob Wu [:robwu] from comment #5)
number of hash functions at first layer = ceil( log2(1 / falsePositiveRate) )
From this, my conclusion is that the falsePositiveRate dropped so much that the result of the computation went from 2 to 3. That implies that falsePositiveRate went below 2 ^ -2.5 = 0.1767766952966369.
Correction: since the number is rounded up instead of rounded to the nearest integer, the correct conclusion is that falsePositiveRate went below 2 ^ -2 (i.e. 0.25). The ratio therefore dropped below the following threshold, causing number of hash functions at first layer to grow from 2 to 3:
include_len / exclude_len = sqrt(2) * falsePositiveRate = sqrt(2) * 0.25 = 0.353553390593 (or lower in the new bloom filter)
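The corrected threshold can be reproduced with a few lines of arithmetic. The blocked_share value below is not stated in this comment; it is derived here by applying the same "blocked among all add-ons" reasoning used in comment 5 to the corrected ratio, so treat it as an illustration.

```python
import math

# Threshold rate at which ceil(log2(1 / rate)) moves from 2 to 3.
threshold_rate = 2 ** -2

# include_len / exclude_len at that threshold.
threshold_ratio = math.sqrt(2) * threshold_rate

# Share of blocked entries among all entries: include / (include + exclude).
blocked_share = threshold_ratio / (1 + threshold_ratio)

print(round(threshold_ratio, 6))  # 0.353553
print(round(blocked_share, 4))    # 0.2612
```

Under these assumptions the switch to 3 hash functions happens once there are more than about 2.83 non-blocked entries per blocked entry, i.e. once blocked entries make up less than roughly 26% of the total.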
Given this analysis, it looks like a way to immediately improve the space usage is to fix falsePositiveRate at 0.25, so that number of hash functions at first layer is fixed at 2. That can be achieved by removing set_crlite_error_rates(include_len=error_rates[0], exclude_len=error_rates[1]) in favor of cascade.set_error_rates([0.25, 0.5]) at https://github.com/mozilla/addons-server/blob/0f718e347cde838085c9f8b2f5eec8fb45f125b4/src/olympia/blocklist/mlbf.py#L55-L57
Assignee | ||
Comment 8•3 days ago
I have identified the cause of the unexpectedly increased file size, elaborated at https://github.com/mozilla/addons/issues/15261#issuecomment-2584947155 . In short, the generation process duplicated entries. While these do not affect the logical outcome of the data represented by the MLBF, they did result in a larger file size.
The next step here is to fix the error in the generation on the addons-server side.
If for some reason we want to fast-track a reduced file size ASAP, it is possible to replace the existing addons-mlbf.bin (and addons-mlbf.bin.meta.json) file with an equivalent file that has the optimal file size (931343 -> 847859).
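For illustration, removing duplicate entries before they are counted or fed into a filter can be sketched as follows. This is not the addons-server fix: dedupe_keys and the sample GUIDs are invented, and only the guid:version key format is taken from comment 5.

```python
from typing import Iterable, List, Tuple


def dedupe_keys(entries: Iterable[Tuple[str, str]]) -> List[str]:
    """Return unique "guid:version" keys, preserving first-seen order."""
    keys = (f"{guid}:{version}" for guid, version in entries)
    # dict.fromkeys keeps insertion order and drops repeated keys.
    return list(dict.fromkeys(keys))


# Hypothetical input with one repeated entry.
raw = [("a@example", "1.0"), ("a@example", "1.0"), ("b@example", "2.0")]
unique = dedupe_keys(raw)
print(len(raw), len(unique), unique)
```

Membership results are the same with or without the repeated entry, but the element counts that feed the size and error-rate calculations differ, which is one way duplicates could influence the generated file size.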