Open Bug 234130 Opened 21 years ago Updated 3 years ago

Extend PR_Poll() and PRIOMethods to support related PRFileDescs

Categories

(NSPR :: NSPR, enhancement)

enhancement

Tracking

(Not tracked)

People

(Reporter: jgmyers, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: PKIX)

Attachments

(1 file, 1 obsolete file)

In some PRIOMethods implementations (SSL with OCSP, for example), the readability or writeability of PRFileDesc A will depend on the readability or writeability of PRFileDesc B, where the relation between A and B is known to the PRIOMethods of A. This is an enhancemnt request to add a new method PRPoll2FN to PRIOMethods to permit an IO layer to export this knowledge and to modify PR_Poll() to use this knowledge.
An initial suggestion for an interface is: typedef PRInt16(PR_CALLBACK *PRPoll2FN)(PRFileDesc *fd, PRInt16 in_flags, PRInt16 *out_flags, PRInt16 *num_aux, PRFileDesc **aux_fd, PRInt16 **aux_flags); num_aux - filled in with a count of auxiliary fds aux_fd - filled in with ptr to array of auxiliary fds aux_flags - filled in with ptr to array of flags for auxiliary fds In addition to any information in the returned value (which applies to the input PRFileDesc), the input PRFileDesc will make progress if any of the auxiliary fds are pollable for their respective flags.
Blocks: 234135
Attached patch Proposed interface, version 2 (obsolete) — Splinter Review
The flat array wouldn't work in the case where a layer had multiple dependent fds, one of which in turn had multiple dependent fds. This proposal thus uses a recursive datastructure to return status. Stopped trying to make the revised interface similar to the old poll interface. Added the ability for a layer to export timeout information up to PR_Poll().
In pl_DefPoll2(), the line: (*out_status)->fd = fd; would be more cleanly written as: (*out_status)->fd = fd->lower; Not that it makes any practical difference. The ugly PRPollStatus structure and lifetime rules are unfortunately necessary to maintain compatibility with the old poll callback. If we didn't need to support having layers with the poll2 callback layered on top of layers which only implement the old poll callback, I'd replace the PRPollStatus argument with an opaque context that gets passed to all subordinate poll2 calls. The leaf layers would then use the context to store their OS-level sockets and flags in a data strucutre shared with the OS's implementation of PR_Poll().
Could I get some feedback on the interface? Is there any interest in doing this?
John, thank you for filing this bug and proposing a solution. Your proposed interface is difficult to understand. I will need more time to study it. The solution I have in mind is for the SSL fd to point its "lower" pointer to the OCSP socket for the duration of sending the OCSP request and receiving the OCSP response. This solution does not require a new NSPR I/O method. It is not as general as the solution you proposed, but should be sufficient for handling the case of doing OCSP during SSL.
The "swapping the lower fd" method has the following disadvantages: 1) It is not able to signal readiness caused by the application-level OCSP timeout. 2) It is not able to signal readiness when the SSL connection closes during the OCSP lookup. My proposed interface has the disadvantage of being difficult to understand and having error-prone returned-buffer lifetime requirements. I've been thinking of yet another interface which, instead of returning a PRPollStatus object, passes down an object pointer. That object would be internal to PR_Poll and the leaf layers. The leaf layers would use it to record the low-level poll information. The object would additionally have a method that would be called after any proxying to the old poll method in order to record the flags returned from that old poll method.
Any comments on the PRPollContext api? It should be a bit easier to understand.
John, There is interest now in this NSPR enhancement. As part of the libpkix work started at Sun to enhance PKI functionality in NSS 3.12, we would like to solve the OCSP blocking I/O problem . Also, libpkix may cause some LDAP and HTTP connections to occur as part of the cert verification called in the SSL handshake callback . This means that there may be multiple dependent sockets per SSL socket in some cases. For our initial implementation, the libpkix state machine will not attempt to do concurrent LDAP, HTTP and OCSP fetches for a given SSL socket. Ie. if there is any I/O that would block, the cert verification will return with a PRFileDesc to poll on - and socket will be from either OCSP, HTTP or LDAP connection . However, returning multiple PRFileDesc is an enhancement we are considering for the future (post NSS 3.12) if latency becomes a problem. Therefore, I would prefer that we implement an interface in NSPR that allows more than one dependent socket, even if we will only use one for the time being.
Julien, Please review the PRPollContext api proposal and provide feedback. Does this look like an API you'd like to implement against in NSS? Once I get positive feedback, I can make time to implement the API in NSPR.
> In some PRIOMethods implementations (SSL with OCSP, for example), the > readability or writeability of PRFileDesc A will depend on the readability or > writeability of PRFileDesc B, where the relation between A and B is known to > the PRIOMethods of A. I'm not sure that this is really sufficient to implement OCSP over HTTP proxies. For example, suppose the HTTP proxy requires NTLM authentication. That is an authentication protocol that requires multiple HTTP requests, and in many cases the first request results in a closed connection. The result, is that the HTTP client may need to open additional sockets to finally issue the HTTP request. Mozilla has an implementation of NTLM. I don't think we want to put an NTLM implementation in NSS, and if not, then it does not seem to me that socket A would have enough information about the other sockets on which it depends. I don't really have a good alternative solution for this problem, but I think it is a problem that we should not neglect. This problem is further complicated in Mozilla by the fact that Mozilla supports arbitrary authentication schemes over HTTP via XPCOM extensions (see nsIHttpAuthenticator).
Darin, Re: OCSP. We certainly don't want to put an NTLM proxy in NSS. But we do want to be allow for using one. Here is the way I envision the scenario could work. 1) application thread issues PR_Write on an SSL socket, which causes a handshake 2) the SSL stack calls the AuthCertificate callback 3) AuthCertificateHook callback calls CERT_VerifyCert (or its new libpkix replacement) 4) CERT_VerifyCert calls OCSP checker 5) OCSP checker calls proxy code provided by the application to connect to the server, ie. necko 6) proxy code starts non-blocking connection and returns wouldblock and the socket 7) OCSP checker returns file descriptor to CERT_VerifyCert 8) CERT_VerifyCert returns SECWouldBlock and a state that includes the OCSP file descriptor 9a) handshake callback sets the OCSP fd as a dependency of the SSL fd, and returns SECWouldblock . This is where the new NSPR proposal comes in . 9b) handshake callback also saves the CERT_VerifyCert state into the SSL layer secret data, so it can be resumed on 10) PR_Send returns PR_WOULD_BLOCK_ERROR to application 11) the application calls PR_Poll with an array containing the SSL socket 12) PR_Poll does a native poll not only on the SSL fd, but also on its dependent socket(s), in this case the OCSP socket 13) If the dependent OCSP socket made progress, PR_Poll returns that the SSL socket made progress. 14) The application tries to continue I/O on the SSL socket with another PR_Write 15) ssl_write notes from the SSL secret state that it is in the middle of an AuthCertificate operation. 16) the SSL stack calls back into the AuthCertificate callback with the previous CERT_VerifyCert state 17) CERT_VerifyCert calls the OCSP checker with the previously opened fd 18) OCSP checker proceeds with data from I/O etc. If the NTLM proxy code needs to open more than one HTTP socket concurrently, it should be fine . All that's required is that in steps 7, 8, and 9, more than one file descriptor can be passed and marked as dependency of the SSL socket. This is why I made comment #9 yesterday . Note that in step 9a, it might be desirable to not only mark the OCSP socket as dependency, but also as replacement, since no progress can be made on the SSL socket without completion from OCSP. The new API should have a way to note that the original socket is not to be polled on at all until the dependent socket makes progress. This is the equivalent of the "switching fd" proposal. However, this is not necessarily always the desired behavior for all NSPR layers and protocols, so that feature should be optional. And strictly speaking, "switching the original fd with the dependent fd" only works if there is a single dependent fd. If there is more than one dependent fd, the proposal is insufficient. So, I prefer to think of the NSPR enhancement in terms of adding the ability to a) add 1 or more new dependent sockets to be used in PR_Poll and cause it to return earlier if there is activity on dependent sockets b) marking the original socket as deadlocked on dependent socket(s), which causes subsequent PR_Poll to only poll the dependent sockets, but not on the original socket itself . This would be an on/off toggle
(In reply to comment #12) > Note that in step 9a, it might be desirable to not only mark the OCSP socket as > dependency, but also as replacement, since no progress can be made on the SSL > socket without completion from OCSP. This is not necessarily true. If you get an EOF or ECONNRESET on the SSL socket, it is possible to make progress without completion from OCSP. The SSL layer could return a connection failure without waiting for the OCSP response. The PRPollContext api proposal is sufficient to handle any case where API boundaries are expressed as NSPR IO layers. If NTLM HTTP proxy code can be invoked through an NSPR IO layer, then the proposal is sufficient. To remove the dependency on NSPR IO layers would effectively require developing some other nonblocking (or async) I/O framework and using it throughout. The situation where this sort of layered polling API works poorly is where you have a lot of top level layer instantiations which all share some lower level dependency. For example, consider the case where there are a lot of SSL sockets handshaking, each doing a lookup on the same shared OCSP socket. When one of the OCSP responses come in, all of the SSL sockets have to be woken up so they can each see if the data that came in was for them.
This enhancement or a similar one is required in order to be able to support non-blocking I/O with SSL sockets for OCSP and LDAP. The feature will be supported in NSS 3.12 . Marking P1 for NSPR 4.7 .
Blocks: 324867
Priority: -- → P1
Target Milestone: --- → 4.7
Attachment #141428 - Attachment is obsolete: true
To clarify how the PRPollContext api supports multiple dependent fds: The poll2 method of a layer which has more than one dependent fd would call the poll2 methods of each dependent fd in turn until one of them returns PR_TRUE. If any dependent fd's poll2 method returns PR_TRUE, the layer's poll2 method returns PR_TRUE. When calling a dependent fd's poll2 method, the layer would pass an in_flag indicating whether it wants to read or write. If it wants to do either, it makes two calls, one with each possible value of in_flag. The layer would pass down the same poll_context and out_timeout pointers as it was called with.
QA Contact: wtchang → nspr
Whiteboard: PKIX
This bug is still waiting for positive feedback on the API proposal.
No longer blocks: 324867

The bug assignee didn't login in Bugzilla in the last 7 months and this bug has priority 'P1'.
:KaiE, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: wtc → nobody
Flags: needinfo?(kaie)
Severity: normal → S4
Flags: needinfo?(kaie)
Priority: P1 → --
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: