Closed Bug 99303 Opened 24 years ago Closed 24 years ago

eliminate unnecessary I/O layer on top of SSL

Categories

(Core Graveyard :: Security: UI, defect, P1)

1.0 Branch
defect

Tracking

(Not tracked)

VERIFIED WONTFIX
Future

People

(Reporter: nelson, Assigned: ddrinan0264)

References

Details

Many applications, clients and servers, use NSS. Some use blocking sockets, others use non-blocking sockets. All of them but one access NSS/SSL through the NSPR interface that was intended (and documented) for that purpose, the PR_Read, PR_Write, PR_Connect, (etc.) calls. That interface is well documented and there exist numerous example programs that use it. PSM does not use that interface. Instead PSM has created yet another I/O layer that it places above SSL on the socket's protocol stack, which PSM then uses in place of the documented NSPR API. The use of this layer adds an very substantial amount of complexity to PSM, and as far as I know, is simply unnecessary. The added complexity adds significantly to the cost of maintaining and extending PSM, and makes it more difficult to ensure the correctness of PSM's use of SSL. This has been particularly apparent in recent attempts to change that layer to improve PSM's robustness in the presence of "TLS-intolerant" servers. So, this bug report requests that PSM be simplified through the elimination of the extra I/O layer, and a return to using NSPR's official and documented API.
cc all PSM developers for comments. Marking P1. Future. I believe that PSM will still need to deal with TLS intolerant sites and other issues for which PSM coded workaround into our SSL socket interface. These workarounds will probably have to be included in any new implementation. According to Nelson, the main benefits of this work will be: 1) A future ROI through easier PSM code maintenance. 2) Better confidence in PSM code correctness. PSM developers: 1) Please comment on the expected benefit outlined above. 2) How much work is it (include known list of workarounds that need to be migrated)? 3) Risk assessment. In principle I would prefer that PSM use NSS accepted practices and methods. Assigning to ddrinan as lead developer.
Assignee: ssaux → ddrinan
Priority: -- → P1
Target Milestone: --- → Future
QA Contact: bsharma → junruh
Changing my e-mail address.
Removing old e-mail address.
If we eliminate the layer, which mechanism could we use to hook into the socket communication? Or is this not possible anymore? Currently, the socket transport code is in a different component than the security component. If we remove the extra layer, is it necessary to move all security specific socket handling to the socket transport component?
Eliminating the socket layer would make it extremely hard to report SSL error messages that occur during SSL. I'm all for bringing down the size of code, but I think the extra layer does provide some useful functionality for us.
Javi, I'm not sure we're talking about the same thing here. I'm talking about the fact that PSM adds its own pushable layer which it pushes on top of the SSL layer in NSPR's stack of I/O layers on top of the socket. I'm talking about the fact that the code uses this layer instead of calling PR_Send, PR_Recv, etc, like all other programs in the world that use NSS/SSL. I don't believe that this pushable layer has any positive impact on your ability to report errors. It doesn't give you any new call backs that you wouldn't have if you used the NSPR interface. All the error codes are available either way.
I want to try to clarify what we are talking about. I think we talk about the following file: http://lxr.mozilla.org/mozilla/source/security/manager/ssl/src/nsNSSIOLayer.cpp Nelson, when you say: "PSM adds its own pushable layer which it pushes on top of the SSL layer in NSPR's stack of I/O layers on top of the socket" do you mean the call to PR_CreateIOLayerStub inside function nsSSLIOLayerAddToSocket? You also say: "the code uses this layer instead of calling PR_Send, PR_Recv, etc." By using the layer, the PSM code currently avoids it to use PR_Send and PR_Recv functions on its own. My understanding is, Mozilla still uses this functions, but they do not appear in the PSM component, but in Necko, the component responsible for transferring data over the network. I don't know which arguments have driven the initial implementation, and I might be wrong, but I guess one argument was to be able to isolate Mozilla from the SSL layer, and move all encryption specific application code to the PSM component, including error checks. Currently, Necko does not have to care for I/O errors caused by problems with SSL communication. Javi says: "Eliminating the socket layer would make it extremely hard to report SSL error messages that occur during SSL." Currently, as PSM does not care for transporting the application data over the network, but just doesn't care for this, as Necko does this job without knowing much about SSL, the PSM code does only care for the cases where transporting the data does not work. It hooks only into the network activity, and tries to detect some errors. When it detects the errors, it reports them to the user, or takes steps to make sure that the universal network I/O code in Necko retries the operation. The two cases I'm aware of are the Proxy case, and the "TLS intolerant server" situation. My understanding is: We could eliminate that extra layer. But this requires all SSL error checking, currently residing in PSM, to be moved to Necko. If we no longer use the extra layer, the error codes will show up at the place where the I/O read and write methods are called, and I believe this is in Necko. So the difficult thing is, that we have to extend Necko in some way or the other to eliminate the extra layer, and it might be difficult to have Necko still isolated from encryption. Note that it is possible to compile Mozilla without any encryption. I don't know better, but I guess people want it to stay that way to allow it to be used in countries with funny encryption laws.
I'm sorry if my previous comment was too long. I'd like to hear whether you think I'm right or wrong with my summary. If I'm right, and we need to change the network code, then we can't do the change alone but work together with the necko team. And I'd like to hear opinions, whether we should do it (anyway) or not. If my summary is wrong, I'd like to learn which parts are wrong and what we need to do instead. Thanks!
Whom should we cc from the necko team?
cc'ing darin from the necko team. Darin, what do you think? Do you agree that the SSL error checking (at least) would have to be integrated with necko, if we want to remove the extra layer?
NSPR's IO layer concept works very nicely with necko's concept of socket providers. of course, this was necessary back when NSS was not open source. can someone describe what impact eliminating IO layer support would have on nsSocketTransport.cpp?
I think the only sticking point is making necko aware of SSL error codes and dealing with the TLS intolerance logic the PSM layer currently implements.
i think the insulation between NSS and necko is a good thing. it simplifies the networking code tremendously. whatever solution comes out of this, i really don't want to see nsSocketTransport.cpp grow in complexity.
I fear Nelson's request for reduced complexity is not compatible with Darin's request to avoid added complexity in Necko. Does this mean, we must set this bug to WONTFIX? Or is there a way we could reach what Nelson suggests, without making necko more complex? Could we hook into Necko? I have the critical P1 bug 106188 assigned to me. From a chat with bryner I learned that the fix for bug 106188 would be completely different, depending on whether bug 99303 (this one) were implemented or not. I'm adding a dependency, as I would like to delay working on bug 106188, until a decision for this bug has been reached.
Blocks: 106188
Setting to WONTFIX. In a meeting we agreed, that it makes sense to keep the layer, because it is the only way to keep the SSL specific application code inside the security component, and not being required to put it into the general network component.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → WONTFIX
Verified per kaie's comment.
Status: RESOLVED → VERIFIED
Product: PSM → Core
Version: psm2.1 → 1.0 Branch
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.