0.11 - 0.18% installer size (osx-shippable) regression on push bfbde1e7984a2c899776f6908eb0a849b3dc3b4d (Fri July 17 2020)
Categories
(Core :: Security: PSM, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox80 | --- | wontfix |
People
(Reporter: Bebe, Unassigned)
References
(Regression)
Details
(Keywords: perf-alert, regression)
Perfherder has detected a build_metrics performance regression from push bfbde1e7984a2c899776f6908eb0a849b3dc3b4d. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
0.18% installer size osx-shippable opt nightly 79,840,644.00 -> 79,986,351.50
0.11% installer size osx-shippable opt instrumented 112,014,892.54 -> 112,136,104.75
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.
For more information on performance sheriffing please see our FAQ.
Reporter | ||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 2•4 years ago
|
||
This patch includes high-performance implementations of the most commonly used elliptic curve algorithms for TLS. These implementations are more code, with fewer branches, so that it runs much faster and in constant-time. More details are above in Comment 1.
Comment 3•4 years ago
|
||
(In reply to J.C. Jones [:jcj] (he/him) [increased latency due to COVID-19] from comment #2)
This patch includes high-performance implementations of the most commonly used elliptic curve algorithms for TLS. These implementations are more code, with fewer branches, so that it runs much faster and in constant-time. More details are above in Comment 1.
This seems like a reasonable argument for accepting the regression, thank you for the explanation! Are the new implementations Mac-only? It seems odd that we'd see an increase on Mac and not on other platforms -- and a 140KB increase in the installer size for constant-time algorithms seems like...a bit much?
Comment 4•4 years ago
|
||
They're not Mac-only, so I would expect that the installers for the other platforms should increase somewhat as well. Honestly, 140kB does seem large to me, too, but I confirmed on my Mac the difference in size for libnss3.dylib between NSS 3.54 and the current beta, and the size increase is all in libfreebl3
, or rather where our actual crypto implementations live:
3.54 beta
607992 libfreebl3.dylib 780320
220 libfreebl3.dylib.TOC 220
1352328 libmozpkix-testlib.a 1352328
673480 libmozpkix.a 673480
350224 libnspr4.a 350224
235960 libnspr4.dylib 235960
794984 libnss3.dylib 795224
22970 libnss3.dylib.TOC 23000
529432 libnssckbi.dylib 529432
221 libnssckbi.dylib.TOC 221
222892 libnssutil3.dylib 222892
6782 libnssutil3.dylib.TOC 6782
21696 libplc4.a 21696
39184 libplc4.dylib 39184
10376 libplds4.a 10376
38184 libplds4.dylib 38184
478064 libsectool.a 478064
202032 libsmime3.dylib 202032
5710 libsmime3.dylib.TOC 5710
899 libsoftokn3.chk 899
360976 libsoftokn3.dylib 360976
413 libsoftokn3.dylib.TOC 413
435740 libssl3.dylib 435876
2994 libssl3.dylib.TOC 2994
That said, I would be quite surprised if we didn't see at least some measure of increase on Windows and Linux too.
Comment 5•4 years ago
|
||
Can we run bloaty (https://github.com/google/bloaty) on libfreebl3.dylib
to understand where the growth is? Sometimes things get surprisingly large.
bloaty -d symbols libfreebl3.dylib -- libfreebl3-old.dylib
should be informative.
Comment 6•4 years ago
|
||
Cool tool!
FILE SIZE VM SIZE
-------------- --------------
[NEW] +60.8Ki [NEW] +60.8Ki _lut_cmb
[NEW] +18.1Ki [NEW] +18.1Ki _point_add_proj
+402% +16.8Ki +402% +16.8Ki _point_add_mixed
+231% +12.3Ki +231% +12.3Ki _point_double
[NEW] +6.46Ki [NEW] +6.46Ki _point_mul_two_secp521r1
[NEW] +6.42Ki [NEW] +6.42Ki _point_mul_two_secp384r1
[NEW] +6.40Ki [NEW] +6.40Ki _point_mul_secp521r1
[NEW] +5.75Ki [NEW] +5.75Ki _point_mul_secp384r1
+22% +4.88Ki +16% +4.88Ki [21 Others]
[NEW] +4.48Ki [NEW] +4.48Ki _point_mul_g_secp384r1
[NEW] +4.37Ki [NEW] +4.37Ki _point_mul_g_secp521r1
[NEW] +3.21Ki [NEW] +3.21Ki _fiat_secp521r1_carry_mul
+4.7% +3.18Ki +4.2% +2.89Ki [__LINKEDIT]
[NEW] +2.66Ki [NEW] +2.66Ki _fiat_secp384r1_mul
[NEW] +2.65Ki [NEW] +2.65Ki _fiat_secp384r1_square
[NEW] +2.22Ki [NEW] +2.22Ki _fiat_secp384r1_to_montgomery
[NEW] +1.87Ki [NEW] +1.87Ki _fiat_secp521r1_carry_square
[NEW] +1.83Ki [NEW] +1.83Ki _fiat_secp384r1_inv
[NEW] +1.34Ki [NEW] +1.34Ki _fiat_secp384r1_from_montgomery
[NEW] +1.32Ki [NEW] +1.32Ki _fiat_secp521r1_to_bytes
[NEW] +1.29Ki [NEW] +1.29Ki _fiat_secp521r1_inv
+28% +168Ki +27% +168Ki TOTAL
So the majority is the scalar multiplication lookup table in the p384r1 curve: https://searchfox.org/nss/source/lib/freebl/ecl/ecp_secp384r1.c#3651
Similarly, the constant-time point math all ends up a bit larger.
This is valid for all platforms, and checking Linux binaries at least shows the same rough size increase.
Updated•4 years ago
|
Comment 7•4 years ago
|
||
Going by comment 3, it sounds like we're going to accept this size increase.
Updated•4 years ago
|
Updated•4 years ago
|
Description
•