Closed
Bug 1353216
Opened 4 years ago
Closed 4 years ago
certificate transparency signature verifications negatively impact TLS handshake performance
Categories
(Core :: Security: PSM, defect, P1)
Tracking
()
People
(Reporter: alice0775, Assigned: keeler)
References
(Blocks 1 open bug)
Details
(Keywords: perf, regression, Whiteboard: [STR in comment#6])
Attachments
(6 files, 2 obsolete files)
|
10.60 KB,
text/plain
|
Details | |
|
1.67 MB,
text/plain
|
Details | |
|
1.59 MB,
text/plain
|
Details | |
|
822.05 KB,
application/octet-stream
|
Details | |
|
949.26 KB,
application/octet-stream
|
Details | |
|
1.19 KB,
patch
|
jcj
:
review+
jcristau
:
approval-mozilla-aurora+
jcristau
:
approval-mozilla-beta+
jcristau
:
approval-mozilla-release+
jcristau
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
Disabling e10s does not solve in Nightly54.0a1. Reproducible: always Steps To Reproduce: 1. Start with clean profile, and disable Flash Player if any 2. Load https://www.yahoo.co.jp/ in several times Actual Results: In Nightly 55.0a1, page loading is x2 slower than Firefox 51.0.1 Elapsed time until tab-throbber stops: Firefox 51.0.1(without e10s) : 7-10sec Nightly 55.0a1(with/without e10s) : 20-25sec Expected Results: Not so. My environment: Windows10 Home 1607 64bit, Core2Quad 4@2.5GHz, Mem 8GB, Net ADSL 12(actual: down 9.5Mbps, up 800Kbps)
| Comment hidden (obsolete) |
| Comment hidden (obsolete) |
| Comment hidden (obsolete) |
| Comment hidden (obsolete) |
| Comment hidden (obsolete) |
| Reporter | ||
Comment 6•4 years ago
|
||
In Nightly 55.0a1, page loading is x2 slower than Firefox 51.0.1 Reproducible: almost always Steps To reproduce: 1. Disable e10s 2. Open https://tv.yahoo.co.jp/listings/fm/ (maybe Japan region only) 3. Wait to render page and settle CPU 4. Reload page (F5) --- Observe the time until tab-throbber stops spinning 5. Reload page (F5) --- Observe the time until tab-throbber stops spinning 6. Reload page (F5) --- Observe the time until tab-throbber stops spinning Actual Results: It takes about 4-6 seconds each Expected results: It is within 2 seconds each Regression window: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c651f7174f37747de99eaafd27b2d78fb10bead3&tochange=50143dbdcb47bf47c8827c8777b0e11e92e25418 Regressed by: 50143dbdcb47 Sergei Chernov — Bug 1293231 - Certificate Transparency - basic telemetry reports; r=Cykesiopka,keeler
Blocks: 1293231
Has Regression Range: --- → yes
Has STR: --- → yes
status-firefox52:
--- → wontfix
status-firefox53:
--- → wontfix
status-firefox54:
--- → fix-optional
Component: Untriaged → Security: PSM
Flags: needinfo?(sergei.cv)
Keywords: regression
Whiteboard: [STR in comment#6]
| Reporter | ||
Updated•4 years ago
|
Attachment #8854330 -
Attachment is obsolete: true
| Reporter | ||
Comment 7•4 years ago
|
||
Attachment #8854257 -
Attachment is obsolete: true
| Reporter | ||
Comment 8•4 years ago
|
||
And I confirmed that setting security.pki.certificate_transparency.mode = 0 fixes the problem
| Reporter | ||
Updated•4 years ago
|
Version: Trunk → 52 Branch
| Reporter | ||
Comment 9•4 years ago
|
||
[Tracking Requested - why for this release]: performance regression
tracking-firefox55:
--- → ?
Comment 10•4 years ago
|
||
I could not reproduce the issue on FF52 with browser.tabs.remote.autostart* = false and security.pki.certificate_transparency.mode = 0. https://tv.yahoo.co.jp/listings/fm/ always takes about 5 seconds to load. I've also tried on FF51 - same results. Could you please attach two HAR files from FF52 (release) for the page - with certificate_transparency off (loads quickly) and on (loads slowly)? With the difference in the loading times being so significant, I think we should be able to get a clue from there. Thanks!
Flags: needinfo?(sergei.cv) → needinfo?(alice0775)
| Reporter | ||
Comment 11•4 years ago
|
||
(In reply to Sergei Chernov from comment #10) > I could not reproduce the issue on FF52 with browser.tabs.remote.autostart* > = false and security.pki.certificate_transparency.mode = 0. > https://tv.yahoo.co.jp/listings/fm/ always takes about 5 seconds to load. > I've also tried on FF51 - same results. > > Could you please attach two HAR files from FF52 (release) for the page - > with certificate_transparency off (loads quickly) and on (loads slowly)? > With the difference in the loading times being so significant, I think we > should be able to get a clue from there. > > Thanks! What is HAR file? Where is it located? How do I obtain it?
Flags: needinfo?(alice0775) → needinfo?(sergei.cv)
| Reporter | ||
Comment 12•4 years ago
|
||
(In reply to Alice0775 White from comment #11) > (In reply to Sergei Chernov from comment #10) > > I could not reproduce the issue on FF52 with browser.tabs.remote.autostart* > > = false and security.pki.certificate_transparency.mode = 0. > > https://tv.yahoo.co.jp/listings/fm/ always takes about 5 seconds to load. > > I've also tried on FF51 - same results. > > > > Could you please attach two HAR files from FF52 (release) for the page - > > with certificate_transparency off (loads quickly) and on (loads slowly)? > > With the difference in the loading times being so significant, I think we > > should be able to get a clue from there. > > > > Thanks! > > What is HAR file? Where is it located? How do I obtain it? NM. https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor#CopySave_All_As_HAR I will do.
Flags: needinfo?(sergei.cv)
| Reporter | ||
Comment 13•4 years ago
|
||
| Reporter | ||
Comment 14•4 years ago
|
||
| Assignee | ||
Comment 15•4 years ago
|
||
I believe this is due to inefficient EC signature verification code. I took some traces using the new profiling tool at https://perf-html.io/ (I'll attach them shortly), and it shows that certificate verification is taking ~8ms on my machine without CT and ~192ms with CT (select the socket thread, filter for "verify", and keep opening the arrows until you get to the relevant functions). Drilling into that, about half of the difference is due to actually doing the signature verification (so, math) and the other half is "importing" the public key into the PKCS#11 softoken to operate on it. Since we know in advance what EC keys we're going to be using, we shouldn't waste time reimporting them every time (note that due to an architectural issue with the way keys are represented, though, VFY_VerifyDigestDirect (the NSS function we're using) will always cause the key to be reimported, so we'll have to work around that (we should be able to call PK11_Verify directly).
| Assignee | ||
Comment 16•4 years ago
|
||
| Assignee | ||
Comment 17•4 years ago
|
||
Comment 18•4 years ago
|
||
> the difference is due to actually doing the signature verification (so, math)
:( That's bad. P256 shouldn't be that slow. But it seems it is. This sounds like something we probably should fix on the NSS side if it's visibly slowing down things.
Comment 19•4 years ago
|
||
Hmmm, does this affect 53? We should probably flip the `certificate_transparency.mode` preference back to 0 for release builds until we have this performant.
| Assignee | ||
Comment 20•4 years ago
|
||
It looks like just flipping the pref is sufficient to disable this (no test changes needed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=181a7bb2472593a1ccac40058efdae6da0f25ecd )
Attachment #8857554 -
Flags: review?(jjones)
| Assignee | ||
Updated•4 years ago
|
Assignee: nobody → dkeeler
status-firefox53:
wontfix → ---
status-firefox54:
fix-optional → ---
tracking-firefox53:
--- → ?
tracking-firefox54:
--- → ?
Priority: -- → P1
Summary: In Nightly 55.0a1, page loading is x2 slower than Firefox 51.0.1 → certificate transparency signature verifications negatively impact TLS handshake performance
| Assignee | ||
Updated•4 years ago
|
status-firefox53:
--- → affected
status-firefox54:
--- → affected
Comment 21•4 years ago
|
||
Comment on attachment 8857554 [details] [diff] [review] 1353216-disable-ct.diff Review of attachment 8857554 [details] [diff] [review]: ----------------------------------------------------------------- OK, this is fine. We'll land this in 55 and -- well, does this need uplift to 54 / 53? I'll file a bug to turn this back on, but we shouldn't do that until it's not a noticeable performance hit.
Attachment #8857554 -
Flags: review?(jjones) → review+
Comment 22•4 years ago
|
||
[Tracking Requested - why for this release]: sounds like this might need to get fixed in 52esr
status-firefox-esr52:
--- → affected
tracking-firefox-esr52:
--- → ?
Comment 23•4 years ago
|
||
Pushed by dkeeler@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/dd72f3e3466f disable certificate transparency processing for performance concerns r=jcj
Comment 24•4 years ago
|
||
Tracking 54/55+. Probably too late for 53 at this point since are already at RC, but we should circle back regarding Comment 22.
I discussed this with keeler earlier today, if we land it quickly it can still make the 53 RC2 build. I am not super clear whether this should be a 53 release blocker, but I think not, since we've had the problem around for the entire 52 cycle.
| Assignee | ||
Comment 26•4 years ago
|
||
I filed bug 1355993 to investigate the underlying slowness of EC in NSS.
Flags: needinfo?(lhenry)
Comment 27•4 years ago
|
||
| bugherderuplift | ||
https://hg.mozilla.org/releases/mozilla-aurora/rev/cef7b5c139ef
Comment 28•4 years ago
|
||
| bugherderuplift | ||
https://hg.mozilla.org/releases/mozilla-beta/rev/2326a8ab1506 https://hg.mozilla.org/releases/mozilla-release/rev/2326a8ab1506
Comment 29•4 years ago
|
||
| bugherderuplift | ||
https://hg.mozilla.org/releases/mozilla-esr52/rev/aaa0bba23803
Comment 30•4 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/dd72f3e3466f
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 31•4 years ago
|
||
Comment on attachment 8857554 [details] [diff] [review] 1353216-disable-ct.diff post hoc a+ all the things.
Attachment #8857554 -
Flags: approval-mozilla-release+
Attachment #8857554 -
Flags: approval-mozilla-esr52+
Attachment #8857554 -
Flags: approval-mozilla-beta+
Attachment #8857554 -
Flags: approval-mozilla-aurora+
Updated•4 years ago
|
Comment 32•4 years ago
|
||
I concur with this disablement, but just in terms of scale, it seems like there must be a really unreasonable number of verifications to add up to 2 seconds. Before we re-enable this, someone needs to properly diagnose it. Keeler, I don't think it has to be you, but I think you are in charge of figuring out who it is.
Flags: needinfo?(dkeeler)
| Assignee | ||
Comment 33•4 years ago
|
||
Looking at https://perfht.ml/2od2JGj (this is Nightly with CT and HTTP/2 enabled), necko is spending a lot of time calling nsNSSSocketInfo::IsAcceptableForHost (via JoinConnection), which is what is doing these signature verifications. :mcmanus, is this expected? Looking at some debugging output, necko seems to be repeatedly asking if (e.g.) a.example.com can join example.com, and then b.example.com, etc. Would it be sound to cache the results so we don't have to rebuild a certificate chain and re-verify the CT information each time? (Also, fwiw, I was under the impression that ECDSA signature verification was supposed to be faster than RSA, which in retrospect wouldn't be the case particularly for small public RSA exponents, so blaming ECDSA specifically was probably a red herring here.)
Flags: needinfo?(dkeeler) → needinfo?(mcmanus)
Comment 34•4 years ago
|
||
there probably is an interaction here with the new coalescing code that went in with the origin frame code - that's 55 only. will add more thoughts later. necko might be the place to put a simpler filter.
Flags: needinfo?(mcmanus)
Comment 35•4 years ago
|
||
@keeler - can you make a different bug for comment 33 (as that one is a nightly 55 only issue, I'm sure and it seems like people are certain this bug has a strong 52 component). in particular the STR would matter a lot as to the state of the coalescing table. And we should also retest tomorrow's build where bug 1355591 has landed which probably improves things a bit.. you are likely right that a cache is needed.
| Assignee | ||
Comment 36•4 years ago
|
||
I filed bug 1356611 and re-ran the profiling. Bug 1355591 may have helped, but the underlying issue seems to still be there.
You need to log in
before you can comment on or make changes to this bug.
Description
•