Closed Bug 1539415 Opened 8 months ago Closed 8 months ago

make nsICertStorage/cert_storage asynchronous when called from the main thread

Categories

(Core :: Security: PSM, defect, P1, major)

defect

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: florian, Assigned: keeler)

References

(Blocks 1 open bug)

Details

(Whiteboard: [psm-assigned])

Attachments

(1 file)

http://bit.ly/2JFZ441 shows 4s of jank happening soon after startup, on a startup profile captured on the 2018 quantum reference hardware.

The calls to nsNSSCertificateDB::AddCert triggered by RemoteSecuritySettings.jsm cause lots of mainthread I/O calls for the cert9.db* and key4.db* files.

When looking at BHR data using http://queze.net/bhr/, this bug represents about a third of all hangs reported for the nightly population.

It looks like this feature is currently Nightly-only (if I understand bug 1520278 correctly), and that there are plans to improve this in bug 1530545.

Mark: if helpful, I have one of the 2018 reference devices and can bring it along sometime -- though it's main thread IO everywhere so hopefully that (and the profile etc.) is sufficient info to debug/patch.

I'm morphing this to apply to nsICertStorage/cert_storage more generally (eventually cert_storage will be responsible for storing these certificates anyway).

Assignee: nobody → dkeeler
Priority: -- → P1
Summary: RemoteSecuritySettings.jsm's calls to nsNSSCertificateDB::AddCert causes hangs → make nsICertStorage/cert_storage asynchronous when called from the main thread
Whiteboard: [psm-assigned]

The Set* functions of nsICertStorage (SetRevocationByIssuerAndSerial,
SetRevocationBySubjectAndPubKey, SetEnrollment, and SetWhitelist) are called on
the main thread by the implementations that manage consuming remote security
information. We don't want to block the main thread, so this patch modifies
these functions to take a callback that will be called (on the original thread)
when the operation in question has been completed on a background thread.

The Get* functions of nsICertStorage (GetRevocationState, GetEnrollmentState,
and GetWhitelistState) are currently only called off the main thread (or on the
main thread in xpcshell tests). Currently it would not be beneficial to make
these functions asynchronous. If we ever need to call these functions on the
main thread in the product, we would need to extend this work to make those
functions asynchronous as well.

(In reply to Florian Quèze [:florian] from comment #0)

http://bit.ly/2JFZ441 shows 4s of jank happening soon after startup, on a startup profile captured on the 2018 quantum reference hardware.

I was lucky to have it take only 4s when I captured this profile. Here is another one https://perfht.ml/2VbqnV8 captured yesterday where there's 100+s of consecutive jank (including 24014 main thread I/O markers, lasting 72844ms). And today I saw Firefox being unusable again for the same reason: https://perfht.ml/2JWLbOY

Seeing that we do that many I/O calls makes me wonder if moving this off main thread is enough (it's definitely a good first step in any case). Would there be a way to group the I/O so that we touch the disk less?

Severity: normal → major

This is related to the work we're doing in Bug 1529044 - the imports are now done on background threads and will possibly migrate to imports on a single thread.

Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d1814ba5bb8e
make nsICertStorage (cert_storage) asynchronous for functions called from the main thread r=jcj,mgoodwin
Status: NEW → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68

I just captured this profile today on a Nightly from yesterday on the reference hardware: http://bit.ly/2LdWz9G It looks like this bug to me (nsNSSCertificateDB::AddCert hangs when called by resource://gre/modules/psm/RemoteSecuritySettings.jsm), but this bug is marked as FIXED. Could you please have a look at the profile? Thanks!

Flags: needinfo?(dkeeler)

Sorry - comment 2 wasn't clear. I meant that this bug would lay the groundwork for making intermediate preloading not block the main thread, but that bug 1530545 would actually remove that call to AddCert. As it happens, that bug just landed - can you confirm we don't have this issue any longer?

Flags: needinfo?(dkeeler)
You need to log in before you can comment on or make changes to this bug.