Closed Bug 383963 Opened 17 years ago Closed 17 years ago

Want ability to temporarily disallow OCSP for a single function call

Categories

(NSS :: Libraries, defect)

3.11.7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: KaiE, Assigned: KaiE)

Details

Attachments

(1 file, 3 obsolete files)

PSM needs to display information about a lot of certificates quickly.

PSM needs to obtain the list of cert's valid uses from NSS. It does so using the Cert Verify functions.

PSM can not afford to have OCSP enabled while obtaining that information. It would block UI.

PSM's solution in the past has been:
- disable OCSP
- obtain all the information
- reenable OCSP

In addition PSM displays information in the UI that the displayed information was not yet verified using OCSP, and in order to do so the user must use the "view cert details" UI.


But it is not acceptable to globally disable OCSP.
For example, a negative side effect could be: An SSL handshake running on a different thread would possibly also run with OCSP disabled.


Ideally, NSS would offer a flag in its verification APIs that PSM could use to indicate "do not use OCSP this time".

But I think we do NOT want to introduce additional cert verification APIs.


Therefore I'm making this different proposal:

Let's add a new API to pause OCSP on the current thread, only.


On the thread where PSM assembles data for UI display:
- PSM pauses OCSP
- PSM calls the cert verification functions
- the cert verification functions detect that OCSP is paused and skip it
- PSM un-pauses OCSP


I propose to add two new functions and use thread local storage to achieve the pausing.
Summary: Want ability to pause OCSP without disabling it → Want ability to pause OCSP on a thread without disabling it globally
Attached patch Patch v1 (obsolete) — Splinter Review
Attachment #267904 - Flags: superreview?(julien.pierre.boogz)
Attachment #267904 - Flags: review?(rrelyea)
Attached patch Patch v2 (obsolete) — Splinter Review
Better patch.

I decided to NOT modify the cert verify functions.
Instead I modified the (non-exported) CERT_CheckOCSPStatus function.

In addition I renamed the functions to contain the word "verify".
I want to make it clear the "pause ocsp" applies to the cert verification functions, only.
Assignee: nobody → kengert
Attachment #267904 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #267906 - Flags: superreview?
Attachment #267906 - Flags: review?(rrelyea)
Attachment #267904 - Flags: superreview?(julien.pierre.boogz)
Attachment #267904 - Flags: review?(rrelyea)
Attachment #267906 - Flags: superreview? → superreview?(julien.pierre.boogz)
Blocks: 383969
IMO, the word "pause" is the wrong word here.  You're not merely delaying
the performance of OCSP (as the word "pause" implies).  You're disabling
it for a time.  
Kai, Is the PSM architecture such that the code for the Cert Manager runs
on its own separate thread, and that thread is not also used for other
purposes, such as SSL?  

IINM, recent architectural changes to PSM cause all SSL work to be done on
a special SSL thread now.  Are we certain that the Cert Manager work does
not also happen on that thread?  
Comment on attachment 267906 [details] [diff] [review]
Patch v2

The issue with the word "pause" is serious enough that I must give the 
patch an r- for it.  I have not reviewed the rest of the patch.
It further seems to me that the API should not be seen as something 
that temporarily disables OCSP, but rather as a per-thread control 
over whether OCSP is enabled.  Consequently, the name of the functions
should reflect control of the positive (enabled) state of OCSP, not the 
negative (disabled) state.  This suggests new names, something like the
following:

CERT_IsVerifyOCSPPausedOnThread -> CERT_IsOCSPEnabledOnThread
ERT_PauseVerifyOCSPOnThread     -> CERT_EnableOCSPOnThread
Attachment #267906 - Flags: review-
(In reply to comment #3)
> IMO, the word "pause" is the wrong word here.  You're not merely delaying
> the performance of OCSP (as the word "pause" implies).  You're disabling
> it for a time.  

Yes, I'm disabling it for a time.

But the proposed API does not affect the global on/off switch.

In other words:
You are proposing to name the new function CERT_EnableOCSPOnThread.

But this is wrong according to my intention.

If OCSP was initially disabled, the sequence pause(true) and pause(false) would NOT switch OCSP to enabled.

If you don't like the term pause, think of it as a temporary override.

Using your proposed name CERT_EnableOCSPOnThread would suggest to the reader that OCSP would now be enabled.
(In reply to comment #4)
> Kai, Is the PSM architecture such that the code for the Cert Manager runs
> on its own separate thread, and that thread is not also used for other
> purposes, such as SSL?


Nelson, no, cert manager does not run on its own thread.

Most of Mozilla's and PSM's user interface and networking runs on a single thread: The main thread.

Yes, PSM delegates requests to do SSL (usually arriving on the main thread) to the separate SSL thread and later returns the results back to the main thread. But the SSL thread is irrelevant for this bug.

What matters is:
- the Mozilla framework will request PSM to provide text for the cells
  in the tables that make up the cert manager display
- the Mozilla framework will make this request from the main thread
- the Mozilla framework will wait and block until PSM returns 
  with the text

What matters is that all of the following will run on the same thread:
- Mozilla's request to provide a cell text (blocking)
- PSM's call into cert-verify to obtain usages (blocking)
- NSS' call to attempt OCSP


So, if PSM temporarily overrides OCSP to be disabled on a thread, it will be effective for the NSS code running on the same thread...



> IINM, recent architectural changes to PSM cause all SSL work to be done on
> a special SSL thread now.  Are we certain that the Cert Manager work does
> not also happen on that thread?  

Yes, Cert Manager does not happen on the SSL thread, because Cert Manager does not do SSL.
My new proposals for function names are:

CERT_PauseVerifyOCSPOnThread    ->   CERT_DisallowOCSPOnThread

CERT_IsVerifyOCSPPausedOnThread -> ! CERT_IsOCSPAllowedOnThread
I really got confused while coding using the "Disallow" meaning, so I'm proposing to call both in the positive way and recert the meaning of both functions.

CERT_PauseVerifyOCSPOnThread    -> ! CERT_AllowOCSPOnThread

CERT_IsVerifyOCSPPausedOnThread -> ! CERT_IsOCSPAllowedOnThread

(In reply to comment #5)
> It further seems to me that the API should not be seen as something 
> that temporarily disables OCSP, but rather as a per-thread control 
> over whether OCSP is enabled.

Nelson, NSS' has several global configuration properties that control the exact behaviour of OCSP.

I don't need a mechanism to individually set all the different propoerties per thread. I don't want to go that far, but keep it simple.

All I need is a mechanism to
  "call some NSS function without allowing OCSP".
Attached patch Patch v3 (obsolete) — Splinter Review
New patch using new function names.
Attachment #267906 - Attachment is obsolete: true
Attachment #267930 - Flags: superreview?(julien.pierre.boogz)
Attachment #267930 - Flags: review?(rrelyea)
Attachment #267906 - Flags: superreview?(julien.pierre.boogz)
Attachment #267906 - Flags: review?(rrelyea)
Summary: Want ability to pause OCSP on a thread without disabling it globally → Want ability to temporarily disallow OCSP for a single function call
Comment on attachment 267930 [details] [diff] [review]
Patch v3

Kai, Thanks for changing to a positive nomenclature.
Allowed is close enough to Enabled for my purposes.  

Here's an observation about the latest patch.

>Index: mozilla/security/nss/lib/certhigh/ocsp.c

>+static PRStatus
>+ensure_ocsp_allowed_index(void)
>+{
>+  if (0 == thread_local_index_ocsp_allowed) {
>+    return PR_CallOnce(&call_once_index_ocsp_allowed,
>+                       ocsp_allowed_index_function);
>+  }
>+
>+  return (thread_local_index_ocsp_allowed != 0) ? PR_SUCCESS : PR_FAILURE;

The last line above can never be reached when 
   (thread_local_index_ocsp_allowed == 0)
and therefore the conditional expression can only have one possible value,
namely PR_SUCCESS.

Rather than change the last line above, I would propose to fix this by
removing the word "return" from the call to PR_CallOnce, thereby making the 
last line reachable regardless of the value of thread_local_index_ocsp_allowed.
Attached patch Patch v4Splinter Review
Nelson, thanks for your suggestion, this is a new patch with your change included.
Attachment #267930 - Attachment is obsolete: true
Attachment #268086 - Flags: superreview?(rrelyea)
Attachment #268086 - Flags: review?(nelson)
Attachment #267930 - Flags: superreview?(julien.pierre.boogz)
Attachment #267930 - Flags: review?(rrelyea)
I really don't like this approach much. It is somewhat of a hack for PSM. I don't think there is any good reason not to show the proper revoked status in PSM. OCSP status is useful information that should be displayed to the user, even if it takes a little longer. PSM should find a way to retrieve it without blocking the UI thread completely during the time that it takes.

In libpkix, there is support for non-blocking CERT_VerifyCert with OCSP. PSM could take advantage of that feature to avoid blocking, and running several CERT_VerifyCert in parallel in a single thread.

For the time being, without the new pkix APIs, PSM could find a way to not block the main UI thread by spawning separate threads to do CERT_VerifyCert when OCSP is enabled.
(In reply to comment #14)
>  OCSP status is useful information that should be displayed to the user,
> even if it takes a little longer. PSM should find a way to retrieve it without
> blocking the UI thread completely during the time that it takes.

Even if it's 5 minutes?

I have several certs with OCSP URLs that are not reachable and it takes a while to reach the timeout for each of them !
(In reply to comment #14)
> I really don't like this approach much. It is somewhat of a hack for PSM.

Do you prefer that we introduce new variations of CertVerify-APIs that have an additional parameter "do not use OCSP while verifying this cert"?
Re: comment #15,

Isn't the OCSP timeout configurable ? If not, we should make it.
Anyway, even if it takes the whole timeout amount, the PSM UI could be properly
designed to update itself as verification results are obtained from each
CERT_VerifyCert, and never block. It could display some "in progress ..."
status during those 5 minutes but still have a responsive UI, just like the
browser does with the Moz icon. That would be doable with either the
nonblocking PKIX calls without any additional thread, or by running 1 or more
separate threads to dispatch the CERT_VerifyCert calls to.

Re: comment #16,

No, I don't prefer that at all. I have offered other solutions/workarounds.
We must be able to bring up the UI without potentially being blocked because we are waiting for server connections to complete, which potentially will be active until a timeout.

The approach to use multiple threads might improve that, but will be a guarantee for a fix.

If you disagree to do such a "obtain status and usage without OCSP", I'd rather do something else: Remove the current "purposes" column from PSM. Or alternative, when the dialog opens, that column could show "n/a" in all fields, and an OCSP thread in the background could loop through all the certs, and update the UI while information becomes available.
Kai,

I assume you meant that using multiple treads will "not" be a guarantee for a fix in comment 18.

I don't see why not . If you never run CERT_VerifyCert in the PSM UI thread, your UI should never be blocked by any server connections.

I like the suggestion of showing "n/a" in all fields until status is available. This is basically the same as I suggested with "in progress..." except with a different string ;).

And the background thread wouldn't be an "OCSP thread" but a CERT_VerifyThread really, since the OCSP is done as part of the CERT_VerifyCert and not separately. I would strongly recommend multiple threads be used however, since as you pointed out, some servers run into timeouts, while others complete quickly. So you don't want them to be all serialized by having only one such thread. Some reasonable number of verify threads should be chosen, based on the total number of certs to verify, and maybe also available memory and # of hardware threads/cores on the system. Of course if we go to the nonblocking PKIX calls then we only need one thread.
While I agree to this approach in theory, in practice this will take a while to implement, and given that cert manager is currently not usable, I need a quicker solution.

If the original "simple" workaround proposed is not acceptable, I would like to remove the "purposes" column until such a "perfect" solution gets implemented.
Well, I don't know if this is a perfect solution, but it seems to me the rest of mozilla does not run blocking I/O code within its UI thread. So it's the logical thing to do for PSM to conform to that model.
Note that with pkix, there will be other I/O done besides OCSP during the cert verification, such as HTTP or LDAP to fetch certs and CRLs. You would run into the same problem with that other I/O if it implemented underneath CERT_VerifCert. This is another reason why I am rejecting your proposal - it is not likely to be a long-term solution.
Comment on attachment 268086 [details] [diff] [review]
Patch v4

we decided not to do this. r-
Attachment #268086 - Flags: superreview?(rrelyea) → superreview-
Resolving as WONTFIX, removing dependency.
No longer blocks: 383969
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
(In reply to comment #16)
Kai asked:
> Do you prefer that we introduce new variations of CertVerify-APIs that have an
> additional parameter "do not use OCSP while verifying this cert"?

I prefer that.  I would have OKed such a proposal.  Since we are inventing 
a new cert verification API for 3.12, it seems that we should ensure that 
that new API has the ability to specify "no OCSP".

Attachment #268086 - Flags: review?(nelson)
(In reply to comment #25)
> I prefer that.  I would have OKed such a proposal.  Since we are inventing 
> a new cert verification API for 3.12, it seems that we should ensure that 
> that new API has the ability to specify "no OCSP".

Thanks, I'd appreciate it.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: