Closed Bug 1355903 Opened 8 years ago Closed 1 year ago

Re-enable Certificate Transparency telemetry collection in Nightly

Categories

(Core :: Security: PSM, task, P1)

task

Tracking

()

RESOLVED FIXED
132 Branch
Tracking Status
firefox132 --- fixed

People

(Reporter: jcj, Assigned: keeler)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [psm-assigned])

Attachments

(1 file)

Bug 1353216 shut off certificate transparency telemetry collection due to a performance regression. This bug should turn it back on after performance is improved.
Depends on: 1355993
sorry for the drive-by partially informed comment - but given that this is just for reporting is this something that can be done asynchronously queued and processed at 'idle' time? (its been a while, but iirc firefox has an event for that) so the math doesn't impact responsiveness? or is there too much state tied up in the connection to practically copy/addref for queueing?
(In reply to Patrick McManus [:mcmanus] from comment #1) > sorry for the drive-by partially informed comment - but given that this is > just for reporting is this something that can be done asynchronously queued > and processed at 'idle' time? (its been a while, but iirc firefox has an > event for that) so the math doesn't impact responsiveness? or is there too > much state tied up in the connection to practically copy/addref for queueing? It's a good idea, but an eventual goal is to have CT enforcement, too -- which would need to be synchronous to be effective.
Keeler and I discussed more what is needed to re-enable our CT telemetry collection: == Better Tests == Before we get to performance testing, I think we need more significant tests in-tree, as specified in bug 1361177. == Speedups == It seems that most of the performance regression was due to unnecessary cert chain validations, fixed in Bug 1356611. The unnecessary validations were made much worse by the slower-than-desired verification algorithm; Keeler filed Bug 1355993 to address speeding that up. It should be possible to get some performance increase, but given the magnitude of the fix from Bug 1356611, we _might not need_ the algorithm speedup to proceed. We'd optimally like to see a Talos test for HTTPS that could show a regression during CT (Bug 1356073), but perhaps we can proceed with a plan that checks telemetry instead: == Speedup Test Plan == When we disabled CT, the SSL_SUCCESFUL_CERT_VALIDATION_TIME_MOZILLAPKIX telemetry metric dropped precipitously [1] (the drop corresponds to the disable date) [2]. In-code, this metric is firmly tied to the CT performance. I think it's OK to move on without the Talos test and verify whether our performance is OK using Nightly telemetry. If slows down more than a little, we'd want to wait on Bug 1355993 to proceed. == Recommendation Summary == 1. We need the tests in bug 1361177. 2. Then we can re-enable using this bug and monitor SSL_SUCCESFUL_CERT_VALIDATION_TIME_MOZILLAPKIX. 2.a. If performance slows down by an unacceptable amount, we back-out the enable commit. 2.b. If we back-out, then we block on Bug 1355993 to re-try. [1] https://i.have.insufficient.coffee/ct-validation-time-fix.png [2] https://mzl.la/2qKALHN
Depends on: 1361177
After further discussion, I am amending my recommendation that we only re-enable after we also have Talos tests, and marking this bug as being blocked on Bug 1356073: == Recommendation Summary (Revised) == 1. We need the tests in bug 1361177. 2. We also need the Talos test in bug 1356073 to monitor for HTTPS performance regressions. 3. Then we can re-enable using this bug and monitor SSL_SUCCESFUL_CERT_VALIDATION_TIME_MOZILLAPKIX in Nightly. 4.a. If performance slows down by an unacceptable amount, we back-out the enable commit. 4.b. If we back-out, then we block on Bug 1355993 before re-trying.
Depends on: 1356073
Priority: -- → P3
Whiteboard: [psm-blocked]
Severity: normal → S3
Assignee: nobody → dkeeler
Severity: S3 → N/A
Type: defect → task
Priority: P3 → P1
Summary: Re-enable Certificate Transparency telemetry collection → Re-enable Certificate Transparency telemetry collection in Nightly
Whiteboard: [psm-blocked] → [psm-assigned]
Pushed by dkeeler@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ad1041a28c3c enable certificate transparency telemetry by default in nightly r=jschanck

Backed out for causing mochitest failures on test_ext_webrequest_getSecurityInfo.html

[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO - TEST-PASS | toolkit/components/extensions/test/mochitest/test_ext_webrequest_getSecurityInfo.html | expected an array of certificates 
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO - Buffered messages finished
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO - TEST-UNEXPECTED-FAIL | toolkit/components/extensions/test/mochitest/test_ext_webrequest_getSecurityInfo.html | expected security info - Expected: {"state":"secure","isExtendedValidation":false,"certificateTransparencyStatus":"not_applicable","hsts":false,"hpkp":false,"usedEch":false,"usedDelegatedCredentials":false,"usedOcsp":false,"usedPrivateDns":false}, Actual: {"state":"secure","isExtendedValidation":false,"certificateTransparencyStatus":"policy_not_enough_scts","hsts":false,"hpkp":false,"usedEch":false,"usedDelegatedCredentials":false,"usedOcsp":false,"usedPrivateDns":false} 
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     SimpleTest.ok@SimpleTest/SimpleTest.js:426:16
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     testHandler@SimpleTest/ExtensionTestUtils.js:83:18
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     testResult@SimpleTest/ExtensionTestUtils.js:97:18
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     Async*ExtensionTestUtils.loadExtension@SimpleTest/ExtensionTestUtils.js:128:33
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     test_getSecurityInfo@toolkit/components/extensions/test/mochitest/test_ext_webrequest_getSecurityInfo.html:16:40
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     add_task/nextTick/<@SimpleTest/SimpleTest.js:2189:34
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     nextTick@SimpleTest/SimpleTest.js:2233:11
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     setTimeout handler*SimpleTest_setTimeoutShim@SimpleTest/SimpleTest.js:922:41
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     add_task@SimpleTest/SimpleTest.js:2137:17
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO -     @toolkit/components/extensions/test/mochitest/test_ext_webrequest_getSecurityInfo.html:15:9
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO - GECKO(2195) | [Parent 2195, Main Thread] WARNING: 'aOwner->IsDiscarded()', file /builds/worker/workspace/obj-build/dist/include/mozilla/dom/SyncedContextInlines.h:95
[task 2024-08-28T21:38:18.308Z] 21:38:18     INFO - TEST-PASS | toolkit/components/extensions/test/mochitest/test_ext_webrequest_getSecurityInfo.html | success 
Flags: needinfo?(dkeeler)
Flags: needinfo?(dkeeler)
Pushed by dkeeler@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e6660dd507f0 enable certificate transparency telemetry by default in nightly r=jschanck,robwu
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 132 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: