Closed Bug 363455 Opened 18 years ago Closed 17 years ago

Enhance PSM's SSL handling on blocking sockets

Categories

(Core :: Security: PSM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: KaiE, Assigned: KaiE)

References

Details

(Keywords: fixed1.8.1.24)

Attachments

(3 files, 5 obsolete files)

In bug 355409 comment 38 and 39 Wan-Teh said:


I want to suggest an improvement.  You can either check in this patch
now
and work on a supplemental patch later, or prepare a new version of the patch.

For a non-blocking socket, PR_Read and PR_Recv(flags=0) are the same,
and PR_Write and PR_Send(flags=0) are the same.  (Note: there is no
flag defined for PR_Send.)

Therefore, you should only call

    fd->lower->methods->recv or fd->lower->methods->send

if the socket is *blocking*.  If the socket is non-blocking, you should
call

    nsSSLThread::requestRead or nsSSLThread::requestWrite

Call fd->lower->methods->getsocketoption to determine if the socket is
non-blocking.  The name of that socket option is PR_SockOpt_Nonblocking.


If you implement the improvement I suggested, make sure you
test it by changing Mozilla (Necko?) to call PR_Recv and PR_Send
instead of PR_Read and PR_Write.
Depends on: 355409
Wan-Teh, I attempted to implemented your proposal as attached to bug 355409 with supplemental patch v2 and changing Necko to use send/receive instead of read/write.

That revealed that the fix has to be smarter than that patch.

First, because of our trick to replace the lower fd with a pollable event, we must be careful before calling GetSockOpt. Calling that on the pollable event results in an assertion failure.

In order to find out the real socket fd, we must lock the thread mutex, because this fd could currently be accessed on the SSL thread.

I'm a bit unhappy that we'd unnecessarily require to aquire this mutex one additional time for non-blocking sockets.

In addition, I wonder why we should restrict this pass-through-to-lib-ssl-directly behaviour to the send/receive functions. In my understanding it made sense to do the same thing for read/write calls on a blocking socket.

So IMHO, the real fix is to make psmsend and psmrecv forward the call to the requestWrite/requestRead functions and have those handle it correctly - since that context acquires the mutex already.
Wan-Teh, this implements your proposal and addresses my comments in this bug.

Please also note the comment in the patch, where I explain an optimization. I would prefer it we don't have to call GetSockOpt on each read/write/send/recv call.
Attachment #248283 - Flags: review?(wtchang)
Attached patch Patch v1 (obsolete) — Splinter Review
Attached patch Necko change used for testing (obsolete) — Splinter Review
This patch was used for testing purposes only.
It changes Mozilla's https networking to call the send/receive functions.
Comment on attachment 248283 [details] [diff] [review]
Patch v1 - reviewer version - ignores whitespace

I suggest the following changes.

1. Reimplement nsSSLIOLayerRead to simply call PSMRecv(..., 0, PR_INTERVAL_NO_TIMEOUT)
and nsSSLIOLayerWrite to call PSMSend(..., 0, PR_INTERVAL_NO_TIMEOUT).  This way you
remove some redundant code, and you don't need to modify Necko to test the new code.

2. The timeout argument for nsSSLThread::requestRead and nsSSLThread::requestWrite
should be PRIntervalTime rather than a pointer to PRIntervalTime.

3. getRealFDIfBlockingSocket_locked should check the return value of
PR_GetSocketOption.

4. In nsSSLThread::requestRead and nsSSLThread::requestWrite, you should do the
if (blockingFD) case immediately after this:

>+    blockingFD = getRealFDIfBlockingSocket_locked(si);

Then you don't need to add if (!blockingFD) to the existing code, because the
if (blockingFD) case always returns.

In the if (blockingFD) case, you should be able to always call the recv/send
method, whether timeout is 0 or not.
(In reply to comment #5)
> 1. Reimplement nsSSLIOLayerRead to simply call PSMRecv(..., 0,
> PR_INTERVAL_NO_TIMEOUT)
> and nsSSLIOLayerWrite to call PSMSend(..., 0, PR_INTERVAL_NO_TIMEOUT).  This
> way you
> remove some redundant code, and you don't need to modify Necko to test the new
> code.
> 
> 2. The timeout argument for nsSSLThread::requestRead and
> nsSSLThread::requestWrite
> should be PRIntervalTime rather than a pointer to PRIntervalTime.


With these two changes, for a blocking socket, when the application calls PR_Read, this will result in a call to libssl's recv function with NO_TIMEOUT, is that what you want?

(... and a call to PR_Write will result in a call to libssl's send with NO_TIMEOUT.)


> 3. getRealFDIfBlockingSocket_locked should check the return value of
> PR_GetSocketOption.

What should we do if this call fails?
Should we
- assume blocking
- assume nonblocking
or
- abort the I/O call completely
?


> 4. In nsSSLThread::requestRead and nsSSLThread::requestWrite, you should do the
> if (blockingFD) case immediately after this:
> 
> >+    blockingFD = getRealFDIfBlockingSocket_locked(si);
> 
> Then you don't need to add if (!blockingFD) to the existing code, because the
> if (blockingFD) case always returns.


I think we should not do that, I'd prefer to leave my change as is.

The reason why I moved the if(blockingFD) case further down:

I want to unlock PSM's ssl_thread_singleton->mMutex before we call into libssl send/recv.

This is accomplished by leaving the scope that starts with nsAutoLock, before  executing the if(blockingFD) case.


> In the if (blockingFD) case, you should be able to always call the recv/send
> method, whether timeout is 0 or not.

This probably answers the question I asked for 1./2.
For a socket, PR_Read is equivalent to PR_Recv(..., 0, PR_INTERVAL_NO_TIMEOUT)
and PR_Write is equivalent to PR_Send(..., 0, PR_INTERVAL_NO_TIMEOUT, whether
the socket is in blocking or non-blocking mode.  If you are worried that
libssl's recv and send can't handle PR_INTERVAL_NO_TIMEOUT correctly, you
can only forward read/write calls on non-blocking sockets to recv/send.

If PR_GetSocketOption fails, you can abort the I/O call completely.
(In reply to comment #7)
> For a socket, PR_Read is equivalent to PR_Recv(..., 0, PR_INTERVAL_NO_TIMEOUT)
> and PR_Write is equivalent to PR_Send(..., 0, PR_INTERVAL_NO_TIMEOUT, whether
> the socket is in blocking or non-blocking mode.

Thanks for confirming.


> If PR_GetSocketOption fails, you can abort the I/O call completely.

Ok, will do.
Patch after addressing Wan-Teh's comments.
Attachment #248283 - Attachment is obsolete: true
Attachment #248284 - Attachment is obsolete: true
Attachment #248285 - Attachment is obsolete: true
Attachment #248525 - Flags: review?(wtchang)
Attachment #248283 - Flags: review?(wtchang)
Attached patch Patch v2 (obsolete) — Splinter Review
Comment on attachment 248525 [details] [diff] [review]
Patch v2 - reviewer version - ignores whitespace

Wan-Teh, I believe I have addressed all your requests, with the little exception that I explained in comment 6.

Could you please review the patch? Thanks a lot.
Comment on attachment 248525 [details] [diff] [review]
Patch v2 - reviewer version - ignores whitespace

r=wtc.

In nsSSLThread.cpp, method getRealFDIfBlockingSocket_locked, we have:

>+    if (sod.value.non_blocking)
>+    {
>+      si->mBlockingState = nsNSSSocketInfo::is_nonblocking_socket;
>+    }
>+    else
>+    {
>+      si->mBlockingState = nsNSSSocketInfo::is_blocking_socket;
>+    }

This can be replaced with an expression using the  ? : operator.

In the requestRead and requestWrite methods, it is a shame that we
have to use an if (!blockingFD) statement inside the scope protected
by ssl_thread_singleton->mMutex, followed by an if (blockingFD) statement
right after the scope so that we call blockingFD->methods->send without
locking ssl_thread_singleton->mMutex.  I hope we can come up with a
simpler method.  Right now it's not obvious what the consecutive
if (!blockingFD) and if (blockingFD) are for.  I did figure it out
after studying the code, but it takes some effort.  I'm worried that
people may add code between the if (!blockingFD) and if (blockingFD)
blocks in the future because they don't understand the purpose of those
two if tests.
Attachment #248525 - Flags: review?(wtchang) → review+
(In reply to comment #12)
> In nsSSLThread.cpp, method getRealFDIfBlockingSocket_locked, we have:
> This can be replaced with an expression using the  ? : operator.

ok, changed


> Right now it's not obvious what the consecutive
> if (!blockingFD) and if (blockingFD) are for.  I'm worried that
> people may add code between the if (!blockingFD) and if (blockingFD)
> blocks in the future because they don't understand the purpose of those
> two if tests.

In both places I've added the following comment after the first closing bracket:

  // leave this mutex protected scope before the blockingFD handling
v3 = v2 + Wan-Teh's requested change
carrying forward r=wtchang
Attachment #248525 - Attachment is obsolete: true
Attachment #248526 - Attachment is obsolete: true
Attachment #255167 - Flags: review+
Fixed on trunk.

Thanks for the review!
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
We never added this patch to Mozilla 1.8 branch for Thunderbird 2.

I backported patch v3 (only simple context changes).
Comment on attachment 365683 [details] [diff] [review]
Patch v3 for 1.8 branch

I think approval1.8.1.next is the right flag for thunderbird 2.
Attachment #365683 - Flags: approval1.8.1.next?
Dan, are you interested to drive this backported patch into thunderbird 2, as it seems to relax the situation from bug 390036?
I wish I had the bandwidth.  :-(
(In reply to comment #19)
> I wish I had the bandwidth.  :-(

I think the only thing that's needed is a branch approval.
Blocks: 390036
Attachment #365683 - Flags: approval1.8.1.next? → approval1.8.1.next+
Comment on attachment 365683 [details] [diff] [review]
Patch v3 for 1.8 branch

Approved for 1.8.1.24. a=ss
Merge approved patch to latest 1.8 branch, one simple context change.
I'll check this in once my build completes.
Checked in to 1.8 branch.

Checking in ssl/src/nsNSSIOLayer.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSIOLayer.cpp,v  <--  nsNSSIOLayer.cpp
new revision: 1.97.2.25; previous revision: 1.97.2.24
done
Checking in ssl/src/nsNSSIOLayer.h;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSIOLayer.h,v  <--  nsNSSIOLayer.h
new revision: 1.27.28.7; previous revision: 1.27.28.6
done
Checking in ssl/src/nsSSLThread.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsSSLThread.cpp,v  <--  nsSSLThread.cpp
new revision: 1.2.2.9; previous revision: 1.2.2.8
done
Checking in ssl/src/nsSSLThread.h;
/cvsroot/mozilla/security/manager/ssl/src/nsSSLThread.h,v  <--  nsSSLThread.h
new revision: 1.1.2.4; previous revision: 1.1.2.3
done
Keywords: fixed1.8.1.24
Is the best way to test this for TB 2.0.0.24 to see if it helps with bug 390036?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: