Open Bug 1370516 Opened 3 years ago Updated 3 months ago

NSS should be initialized off main thread

Categories

(Core :: Security: PSM, enhancement, P2)

Tracking

()

People

(Reporter: florian, Unassigned)

References

(Depends on 1 open bug, Blocks 6 open bugs)

Details

(Keywords: perf, Whiteboard: [qf:p2:responsiveness][psm-blocked][fxperf:p3])

In my profiles, I've observed NSS being initialized synchronously on the main thread the first time it is needed. From what I remember, it takes ~16ms in warm startup profiles, but in cold startup profiles it usually takes more than 100ms (300ms is not uncommon) due to main thread IO when loading libraries.

We initially discussed this in bug 1362364, but the bug got repurposed.

For now to improve startup we are working on avoiding NSS initialization before first paint (bug 1362364, bug 1359031, bug 1367450).

I think the real solution would be to start initializing NSS on a different thread as early as possible during startup, so that it's ready by the time we may need it.

We currently have 2 cases where we would like to use SSL before first paint:
- captive portal would like to do an async XHR asap so that we can know if we should avoid restoring the previous session.
- bug 808104 would like to speculatively connect to the home page so that we are ready to load it as soon as the XUL window is ready.

I think what we would like is to have a promise that's resolved as soon as the network is available, so that we can start both of these things asap, but without the risk of triggering main thread IO.
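
To make the "promise resolved when network is available" idea concrete, here is a minimal standalone sketch in plain C++ (not Gecko code). `ReadySignal`, `Then` and `Resolve` are hypothetical names invented for illustration. In this sketch, callbacks run on whichever thread resolves the signal; a real implementation would presumably dispatch them back to the main thread.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical one-shot readiness signal (illustration only, not a Gecko API).
class ReadySignal {
 public:
  // Runs `cb` right away if the signal is already resolved; otherwise queues
  // it until Resolve() is called. The caller never blocks.
  void Then(std::function<void()> cb) {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      if (!mResolved) {
        mCallbacks.push_back(std::move(cb));
        return;
      }
    }
    cb();
  }

  // Marks the signal as resolved and runs every queued callback once.
  // Later calls are no-ops.
  void Resolve() {
    std::vector<std::function<void()>> pending;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      if (mResolved) {
        return;
      }
      mResolved = true;
      pending.swap(mCallbacks);
    }
    // Run callbacks outside the lock so they may safely call Then() again.
    for (auto& cb : pending) {
      cb();
    }
  }

 private:
  std::mutex mMutex;
  bool mResolved = false;
  std::vector<std::function<void()>> mCallbacks;
};
```

With something like this, the captive portal check and the speculative connect would each register a callback early in startup, and the background initialization thread would call `Resolve()` once it finishes.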
Flags: needinfo?(mcmanus)
Meant to ask a question with the needinfo: Patrick, is this likely to be possible to do in the quantum/photon time frame (i.e. for 57)?
Blocks: 808104
Component: Networking → Security: PSM
Flags: needinfo?(mcmanus) → needinfo?(dkeeler)
I'm still investigating this, but I wanted to at least give a response. Unfortunately it's of the form "we don't know yet, but it would be difficult and not without risk". I will update this bug as I determine a more complete answer.
Flags: needinfo?(dkeeler)
Whiteboard: [qf] → [qf:p1]
Depends on: 1372656
Priority: -- → P2
Whiteboard: [qf:p1] → [qf:p1][psm-blocked]
This bug is not going to be fixed in time for 57. I am moving this to P2 for post 57 work.
Whiteboard: [qf:p1][psm-blocked] → [qf:p2][psm-blocked]
Keywords: perf
(In reply to David Keeler [:keeler] (use needinfo) from comment #2)
> I will update this bug as I determine a more complete
> answer.

Hey keeler, have you been able to determine a more complete answer?
Flags: needinfo?(dkeeler)
Whiteboard: [qf:p2][psm-blocked] → [qf:p1][psm-blocked]
This would depend on bug 1421084 (if we can even do it, which isn't certain). If/when that work is done, I think the way we would have to do this would be to move the majority of nsNSSComponent::InitializeNSS() inside LoadLoadableRootsTask::Run() (I imagine we would rename that). Then EnsureNSSInitializedChromeOrContent() would have to call BlockUntilLoadableRootsLoaded(). The last piece would be explicitly causing a PSM initialization early in startup after getting the user's profile directory.
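
The shape described above (do the slow work in a background task, and have the "ensure initialized" entry point block until it is done) can be sketched roughly as follows. This is standalone C++ for illustration, not PSM code: `BackgroundInitializer`, `Start`, `BlockUntilInitialized` and `DoSlowInit` are hypothetical names, and the sleep merely stands in for the real initialization work.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical sketch: run slow initialization on a background thread and
// let consumers block until it has finished.
class BackgroundInitializer {
 public:
  ~BackgroundInitializer() {
    if (mThread.joinable()) {
      mThread.join();
    }
  }

  // Kicks off initialization on another thread. Call once, early in startup.
  void Start() {
    mThread = std::thread([this] {
      bool ok = DoSlowInit();
      {
        std::lock_guard<std::mutex> lock(mMutex);
        mSucceeded = ok;
        mDone = true;
      }
      mCondVar.notify_all();
    });
  }

  // Blocks the calling thread until initialization has completed and returns
  // whether it succeeded. Returns immediately if it is already done.
  bool BlockUntilInitialized() {
    std::unique_lock<std::mutex> lock(mMutex);
    mCondVar.wait(lock, [this] { return mDone; });
    return mSucceeded;
  }

 private:
  // Stand-in for the slow, IO-heavy part of initialization (assumption).
  static bool DoSlowInit() {
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    return true;
  }

  std::mutex mMutex;
  std::condition_variable mCondVar;
  bool mDone = false;
  bool mSucceeded = false;
  std::thread mThread;
};
```

If `Start()` runs early enough, most later calls to `BlockUntilInitialized()` should return without waiting. A call that arrives too early still blocks the caller, so the benefit depends on how soon initialization can be kicked off.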
Depends on: 1421084
Flags: needinfo?(dkeeler)
Whiteboard: [qf:p1][psm-blocked] → [qf:i60][qf:p1][psm-blocked]
Whiteboard: [qf:i60][qf:p1][psm-blocked] → [qf:f60][qf:p1][psm-blocked]
Whiteboard: [qf:f60][qf:p1][psm-blocked] → [qf:f61][qf:p1][psm-blocked]
Whiteboard: [qf:f61][qf:p1][psm-blocked] → [qf:f61][qf:p1][psm-blocked] [fxperf]
Whiteboard: [qf:f61][qf:p1][psm-blocked] [fxperf] → [qf:f61][qf:p1][psm-blocked][fxperf:p3]
See Also: → 441355
Hey dkeeler, with bug 1421084 fixed, is fixing this bug still as you described in comment 6? Is this still something we can attempt?
Flags: needinfo?(dkeeler)
Whiteboard: [qf:f61][qf:p1][psm-blocked][fxperf:p3] → [qf:f64][qf:p1][psm-blocked][fxperf:p3]
Essentially, yes. The added code from bug 1427248 would have to be handled a bit differently, but that wouldn't be too hard. My main concern here is that currently consumers assume it's safe to call NSS functions if nsINSSComponent has been initialized, but this would no longer be the case. We'd have to sprinkle around calls to `BlockUntilLoadableRootsLoaded()` (or whatever it would be called) in a number of places, and I don't see a way of guaranteeing that new code will do the appropriate waiting. This could lead to a long litany of whack-a-mole crashes where we ship a version and then go "oh, whoops - forgot to wait here... forgot to wait here... forgot to wait here...".

I think before we go further with this bug, we need to figure out what exactly is slow about initializing NSS. With the loadable roots loading on another thread, I suspect the long pole is now the user's certificate and key databases. If so, we may be able to load those on the background thread as well (and if we ever got the waiting wrong all that would happen is e.g. some TLS connections may fail and the user can just reload the page, rather than having Firefox crash).
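
As a rough illustration of the "fail instead of crash" property mentioned above, here is a standalone C++ sketch in which a lookup made before the databases are loaded reports failure rather than touching uninitialized state. `CertDatabaseHandle`, `MarkLoaded` and `FindClientCert` are hypothetical names, and the returned string is a placeholder rather than real certificate data.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Hypothetical handle to databases that are loaded on a background thread.
class CertDatabaseHandle {
 public:
  // Called by the background thread once loading has finished.
  void MarkLoaded() { mLoaded.store(true, std::memory_order_release); }

  // Returns false (leaving *aOut untouched) if the databases are not loaded
  // yet, so the caller can fail the operation and retry later.
  bool FindClientCert(const std::string& aHost, std::string* aOut) const {
    if (!mLoaded.load(std::memory_order_acquire)) {
      return false;
    }
    *aOut = "cert-for-" + aHost;  // Placeholder result for illustration.
    return true;
  }

 private:
  std::atomic<bool> mLoaded{false};
};
```

A consumer that forgets to wait would then see a failed lookup (for example, a TLS connection that fails and can be retried) rather than a crash.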
Flags: needinfo?(dkeeler)
Whiteboard: [qf:f64][qf:p1][psm-blocked][fxperf:p3] → [qf:p1:f64][psm-blocked][fxperf:p3]
Blocks: 1445965
Whiteboard: [qf:p1:f64][psm-blocked][fxperf:p3] → [qf:p2:responsiveness][psm-blocked][fxperf]

Marking P2 for investigation because, per the last comment from keeler, moving forward requires more information about exactly what is slow about NSS initialization.

Whiteboard: [qf:p2:responsiveness][psm-blocked][fxperf] → [qf:p2:responsiveness][psm-blocked][fxperf:p3]

Here are two profiles showing what happens during NSS initialization: https://perfht.ml/2G4F0pl https://perfht.ml/2G4Mw3x

Blocks: 1541259
Depends on: 1557378
See Also: → 1596429
See Also: → 1596430