Closed Bug 1653546 Opened 4 years ago Closed 4 years ago

0.11 - 0.18% installer size (osx-shippable) regression on push bfbde1e7984a2c899776f6908eb0a849b3dc3b4d (Fri July 17 2020)

Categories

(Core :: Security: PSM, defect)

80 Branch
defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox80 --- wontfix

People

(Reporter: Bebe, Unassigned)

References

(Regression)

Details

(Keywords: perf-alert, regression)

Perfherder has detected a build_metrics performance regression from push bfbde1e7984a2c899776f6908eb0a849b3dc3b4d. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

0.18% installer size osx-shippable opt nightly 79,840,644.00 -> 79,986,351.50
0.11% installer size osx-shippable opt instrumented 112,014,892.54 -> 112,136,104.75

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Component: Performance → Security: PSM
Flags: needinfo?(kjacobs.bugzilla)
Product: Testing → Core
Flags: needinfo?(kjacobs.bugzilla)

This patch includes high-performance implementations of the most commonly used elliptic curve algorithms for TLS. These implementations are more code, with fewer branches, so that it runs much faster and in constant-time. More details are above in Comment 1.

(In reply to J.C. Jones [:jcj] (he/him) [increased latency due to COVID-19] from comment #2)

This patch includes high-performance implementations of the most commonly used elliptic curve algorithms for TLS. These implementations are more code, with fewer branches, so that it runs much faster and in constant-time. More details are above in Comment 1.

This seems like a reasonable argument for accepting the regression, thank you for the explanation! Are the new implementations Mac-only? It seems odd that we'd see an increase on Mac and not on other platforms -- and a 140KB increase in the installer size for constant-time algorithms seems like...a bit much?

Flags: needinfo?(jjones)

They're not Mac-only, so I would expect that the installers for the other platforms should increase somewhat as well. Honestly, 140kB does seem large to me, too, but I confirmed on my Mac the difference in size for libnss3.dylib between NSS 3.54 and the current beta, and the size increase is all in libfreebl3, or rather where our actual crypto implementations live:

3.54                                  beta
  607992 libfreebl3.dylib           780320
     220 libfreebl3.dylib.TOC          220
 1352328 libmozpkix-testlib.a      1352328
  673480 libmozpkix.a               673480
  350224 libnspr4.a                 350224
  235960 libnspr4.dylib             235960
  794984 libnss3.dylib              795224
   22970 libnss3.dylib.TOC           23000
  529432 libnssckbi.dylib           529432
     221 libnssckbi.dylib.TOC          221
  222892 libnssutil3.dylib          222892
    6782 libnssutil3.dylib.TOC        6782
   21696 libplc4.a                   21696
   39184 libplc4.dylib               39184
   10376 libplds4.a                  10376
   38184 libplds4.dylib              38184
  478064 libsectool.a               478064
  202032 libsmime3.dylib            202032
    5710 libsmime3.dylib.TOC          5710
     899 libsoftokn3.chk               899
  360976 libsoftokn3.dylib          360976
     413 libsoftokn3.dylib.TOC         413
  435740 libssl3.dylib              435876
    2994 libssl3.dylib.TOC            2994

That said, I would be quite surprised if we didn't see at least some measure of increase on Windows and Linux too.

Flags: needinfo?(jjones)

Can we run bloaty (https://github.com/google/bloaty) on libfreebl3.dylib to understand where the growth is? Sometimes things get surprisingly large.

bloaty -d symbols libfreebl3.dylib -- libfreebl3-old.dylib should be informative.

Cool tool!

    FILE SIZE        VM SIZE
 --------------  --------------
  [NEW] +60.8Ki  [NEW] +60.8Ki    _lut_cmb
  [NEW] +18.1Ki  [NEW] +18.1Ki    _point_add_proj
  +402% +16.8Ki  +402% +16.8Ki    _point_add_mixed
  +231% +12.3Ki  +231% +12.3Ki    _point_double
  [NEW] +6.46Ki  [NEW] +6.46Ki    _point_mul_two_secp521r1
  [NEW] +6.42Ki  [NEW] +6.42Ki    _point_mul_two_secp384r1
  [NEW] +6.40Ki  [NEW] +6.40Ki    _point_mul_secp521r1
  [NEW] +5.75Ki  [NEW] +5.75Ki    _point_mul_secp384r1
   +22% +4.88Ki   +16% +4.88Ki    [21 Others]
  [NEW] +4.48Ki  [NEW] +4.48Ki    _point_mul_g_secp384r1
  [NEW] +4.37Ki  [NEW] +4.37Ki    _point_mul_g_secp521r1
  [NEW] +3.21Ki  [NEW] +3.21Ki    _fiat_secp521r1_carry_mul
  +4.7% +3.18Ki  +4.2% +2.89Ki    [__LINKEDIT]
  [NEW] +2.66Ki  [NEW] +2.66Ki    _fiat_secp384r1_mul
  [NEW] +2.65Ki  [NEW] +2.65Ki    _fiat_secp384r1_square
  [NEW] +2.22Ki  [NEW] +2.22Ki    _fiat_secp384r1_to_montgomery
  [NEW] +1.87Ki  [NEW] +1.87Ki    _fiat_secp521r1_carry_square
  [NEW] +1.83Ki  [NEW] +1.83Ki    _fiat_secp384r1_inv
  [NEW] +1.34Ki  [NEW] +1.34Ki    _fiat_secp384r1_from_montgomery
  [NEW] +1.32Ki  [NEW] +1.32Ki    _fiat_secp521r1_to_bytes
  [NEW] +1.29Ki  [NEW] +1.29Ki    _fiat_secp521r1_inv
   +28%  +168Ki   +27%  +168Ki    TOTAL

So the majority is the scalar multiplication lookup table in the p384r1 curve: https://searchfox.org/nss/source/lib/freebl/ecl/ecp_secp384r1.c#3651

Similarly, the constant-time point math all ends up a bit larger.

This is valid for all platforms, and checking Linux binaries at least shows the same rough size increase.

Going by comment 3, it sounds like we're going to accept this size increase.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.