Last Comment Bug 363455 - Enhance PSM's SSL handling on blocking sockets
: Enhance PSM's SSL handling on blocking sockets
Status: RESOLVED FIXED
: fixed1.8.1.24
Product: Core
Classification: Components
Component: Security: PSM (show other bugs)
: Trunk
: All All
: -- normal (vote)
: ---
Assigned To: Kai Engert (:kaie)
:
Mentors:
Depends on: 355409
Blocks: 390036
  Show dependency treegraph
 
Reported: 2006-12-11 08:26 PST by Kai Engert (:kaie)
Modified: 2010-02-18 16:06 PST (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch v1 - reviewer version - ignores whitespace (8.85 KB, patch)
2006-12-11 10:59 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v1 (11.84 KB, patch)
2006-12-11 11:00 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Necko change used for testing (1.31 KB, patch)
2006-12-11 11:02 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v2 - reviewer version - ignores whitespace (9.85 KB, patch)
2006-12-13 09:49 PST, Kai Engert (:kaie)
wtc: review+
Details | Diff | Splinter Review
Patch v2 (12.95 KB, patch)
2006-12-13 09:49 PST, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review
Patch v3 as checked in (13.02 KB, patch)
2007-02-14 18:34 PST, Kai Engert (:kaie)
kaie: review+
Details | Diff | Splinter Review
Patch v3 for 1.8 branch (13.06 KB, patch)
2009-03-05 09:52 PST, Kai Engert (:kaie)
samuel.sidler+old: approval1.8.1.next+
Details | Diff | Splinter Review
Patch v3 for 1.8 branch, merged again (12.64 KB, patch)
2009-10-28 15:43 PDT, Kai Engert (:kaie)
no flags Details | Diff | Splinter Review

Description Kai Engert (:kaie) 2006-12-11 08:26:34 PST
In bug 355409 comment 38 and 39 Wan-Teh said:


I want to suggest an improvement.  You can either check in this patch
now
and work on a supplemental patch later, or prepare a new version of the patch.

For a non-blocking socket, PR_Read and PR_Recv(flags=0) are the same,
and PR_Write and PR_Send(flags=0) are the same.  (Note: there is no
flag defined for PR_Send.)

Therefore, you should only call

    fd->lower->methods->recv or fd->lower->methods->send

if the socket is *blocking*.  If the socket is non-blocking, you should
call

    nsSSLThread::requestRead or nsSSLThread::requestWrite

Call fd->lower->methods->getsocketoption to determine if the socket is
non-blocking.  The name of that socket option is PR_SockOpt_Nonblocking.


If you implement the improvement I suggested, make sure you
test it by changing Mozilla (Necko?) to call PR_Recv and PR_Send
instead of PR_Read and PR_Write.
Comment 1 Kai Engert (:kaie) 2006-12-11 08:38:14 PST
Wan-Teh, I attempted to implemented your proposal as attached to bug 355409 with supplemental patch v2 and changing Necko to use send/receive instead of read/write.

That revealed that the fix has to be smarter than that patch.

First, because of our trick to replace the lower fd with a pollable event, we must be careful before calling GetSockOpt. Calling that on the pollable event results in an assertion failure.

In order to find out the real socket fd, we must lock the thread mutex, because this fd could currently be accessed on the SSL thread.

I'm a bit unhappy that we'd unnecessarily require to aquire this mutex one additional time for non-blocking sockets.

In addition, I wonder why we should restrict this pass-through-to-lib-ssl-directly behaviour to the send/receive functions. In my understanding it made sense to do the same thing for read/write calls on a blocking socket.

So IMHO, the real fix is to make psmsend and psmrecv forward the call to the requestWrite/requestRead functions and have those handle it correctly - since that context acquires the mutex already.
Comment 2 Kai Engert (:kaie) 2006-12-11 10:59:47 PST
Created attachment 248283 [details] [diff] [review]
Patch v1 - reviewer version - ignores whitespace

Wan-Teh, this implements your proposal and addresses my comments in this bug.

Please also note the comment in the patch, where I explain an optimization. I would prefer it we don't have to call GetSockOpt on each read/write/send/recv call.
Comment 3 Kai Engert (:kaie) 2006-12-11 11:00:16 PST
Created attachment 248284 [details] [diff] [review]
Patch v1
Comment 4 Kai Engert (:kaie) 2006-12-11 11:02:32 PST
Created attachment 248285 [details] [diff] [review]
Necko change used for testing

This patch was used for testing purposes only.
It changes Mozilla's https networking to call the send/receive functions.
Comment 5 Wan-Teh Chang 2006-12-11 14:24:08 PST
Comment on attachment 248283 [details] [diff] [review]
Patch v1 - reviewer version - ignores whitespace

I suggest the following changes.

1. Reimplement nsSSLIOLayerRead to simply call PSMRecv(..., 0, PR_INTERVAL_NO_TIMEOUT)
and nsSSLIOLayerWrite to call PSMSend(..., 0, PR_INTERVAL_NO_TIMEOUT).  This way you
remove some redundant code, and you don't need to modify Necko to test the new code.

2. The timeout argument for nsSSLThread::requestRead and nsSSLThread::requestWrite
should be PRIntervalTime rather than a pointer to PRIntervalTime.

3. getRealFDIfBlockingSocket_locked should check the return value of
PR_GetSocketOption.

4. In nsSSLThread::requestRead and nsSSLThread::requestWrite, you should do the
if (blockingFD) case immediately after this:

>+    blockingFD = getRealFDIfBlockingSocket_locked(si);

Then you don't need to add if (!blockingFD) to the existing code, because the
if (blockingFD) case always returns.

In the if (blockingFD) case, you should be able to always call the recv/send
method, whether timeout is 0 or not.
Comment 6 Kai Engert (:kaie) 2006-12-12 13:18:04 PST
(In reply to comment #5)
> 1. Reimplement nsSSLIOLayerRead to simply call PSMRecv(..., 0,
> PR_INTERVAL_NO_TIMEOUT)
> and nsSSLIOLayerWrite to call PSMSend(..., 0, PR_INTERVAL_NO_TIMEOUT).  This
> way you
> remove some redundant code, and you don't need to modify Necko to test the new
> code.
> 
> 2. The timeout argument for nsSSLThread::requestRead and
> nsSSLThread::requestWrite
> should be PRIntervalTime rather than a pointer to PRIntervalTime.


With these two changes, for a blocking socket, when the application calls PR_Read, this will result in a call to libssl's recv function with NO_TIMEOUT, is that what you want?

(... and a call to PR_Write will result in a call to libssl's send with NO_TIMEOUT.)


> 3. getRealFDIfBlockingSocket_locked should check the return value of
> PR_GetSocketOption.

What should we do if this call fails?
Should we
- assume blocking
- assume nonblocking
or
- abort the I/O call completely
?


> 4. In nsSSLThread::requestRead and nsSSLThread::requestWrite, you should do the
> if (blockingFD) case immediately after this:
> 
> >+    blockingFD = getRealFDIfBlockingSocket_locked(si);
> 
> Then you don't need to add if (!blockingFD) to the existing code, because the
> if (blockingFD) case always returns.


I think we should not do that, I'd prefer to leave my change as is.

The reason why I moved the if(blockingFD) case further down:

I want to unlock PSM's ssl_thread_singleton->mMutex before we call into libssl send/recv.

This is accomplished by leaving the scope that starts with nsAutoLock, before  executing the if(blockingFD) case.


> In the if (blockingFD) case, you should be able to always call the recv/send
> method, whether timeout is 0 or not.

This probably answers the question I asked for 1./2.
Comment 7 Wan-Teh Chang 2006-12-12 18:39:30 PST
For a socket, PR_Read is equivalent to PR_Recv(..., 0, PR_INTERVAL_NO_TIMEOUT)
and PR_Write is equivalent to PR_Send(..., 0, PR_INTERVAL_NO_TIMEOUT, whether
the socket is in blocking or non-blocking mode.  If you are worried that
libssl's recv and send can't handle PR_INTERVAL_NO_TIMEOUT correctly, you
can only forward read/write calls on non-blocking sockets to recv/send.

If PR_GetSocketOption fails, you can abort the I/O call completely.
Comment 8 Kai Engert (:kaie) 2006-12-13 09:47:29 PST
(In reply to comment #7)
> For a socket, PR_Read is equivalent to PR_Recv(..., 0, PR_INTERVAL_NO_TIMEOUT)
> and PR_Write is equivalent to PR_Send(..., 0, PR_INTERVAL_NO_TIMEOUT, whether
> the socket is in blocking or non-blocking mode.

Thanks for confirming.


> If PR_GetSocketOption fails, you can abort the I/O call completely.

Ok, will do.
Comment 9 Kai Engert (:kaie) 2006-12-13 09:49:00 PST
Created attachment 248525 [details] [diff] [review]
Patch v2 - reviewer version - ignores whitespace

Patch after addressing Wan-Teh's comments.
Comment 10 Kai Engert (:kaie) 2006-12-13 09:49:28 PST
Created attachment 248526 [details] [diff] [review]
Patch v2
Comment 11 Kai Engert (:kaie) 2007-02-13 16:50:53 PST
Comment on attachment 248525 [details] [diff] [review]
Patch v2 - reviewer version - ignores whitespace

Wan-Teh, I believe I have addressed all your requests, with the little exception that I explained in comment 6.

Could you please review the patch? Thanks a lot.
Comment 12 Wan-Teh Chang 2007-02-14 16:26:51 PST
Comment on attachment 248525 [details] [diff] [review]
Patch v2 - reviewer version - ignores whitespace

r=wtc.

In nsSSLThread.cpp, method getRealFDIfBlockingSocket_locked, we have:

>+    if (sod.value.non_blocking)
>+    {
>+      si->mBlockingState = nsNSSSocketInfo::is_nonblocking_socket;
>+    }
>+    else
>+    {
>+      si->mBlockingState = nsNSSSocketInfo::is_blocking_socket;
>+    }

This can be replaced with an expression using the  ? : operator.

In the requestRead and requestWrite methods, it is a shame that we
have to use an if (!blockingFD) statement inside the scope protected
by ssl_thread_singleton->mMutex, followed by an if (blockingFD) statement
right after the scope so that we call blockingFD->methods->send without
locking ssl_thread_singleton->mMutex.  I hope we can come up with a
simpler method.  Right now it's not obvious what the consecutive
if (!blockingFD) and if (blockingFD) are for.  I did figure it out
after studying the code, but it takes some effort.  I'm worried that
people may add code between the if (!blockingFD) and if (blockingFD)
blocks in the future because they don't understand the purpose of those
two if tests.
Comment 13 Kai Engert (:kaie) 2007-02-14 18:30:41 PST
(In reply to comment #12)
> In nsSSLThread.cpp, method getRealFDIfBlockingSocket_locked, we have:
> This can be replaced with an expression using the  ? : operator.

ok, changed


> Right now it's not obvious what the consecutive
> if (!blockingFD) and if (blockingFD) are for.  I'm worried that
> people may add code between the if (!blockingFD) and if (blockingFD)
> blocks in the future because they don't understand the purpose of those
> two if tests.

In both places I've added the following comment after the first closing bracket:

  // leave this mutex protected scope before the blockingFD handling
Comment 14 Kai Engert (:kaie) 2007-02-14 18:34:16 PST
Created attachment 255167 [details] [diff] [review]
Patch v3 as checked in

v3 = v2 + Wan-Teh's requested change
carrying forward r=wtchang
Comment 15 Kai Engert (:kaie) 2007-02-14 18:47:10 PST
Fixed on trunk.

Thanks for the review!
Comment 16 Kai Engert (:kaie) 2009-03-05 09:52:36 PST
Created attachment 365683 [details] [diff] [review]
Patch v3 for 1.8 branch

We never added this patch to Mozilla 1.8 branch for Thunderbird 2.

I backported patch v3 (only simple context changes).
Comment 17 Kai Engert (:kaie) 2009-08-19 06:33:44 PDT
Comment on attachment 365683 [details] [diff] [review]
Patch v3 for 1.8 branch

I think approval1.8.1.next is the right flag for thunderbird 2.
Comment 18 Kai Engert (:kaie) 2009-08-19 06:35:05 PDT
Dan, are you interested to drive this backported patch into thunderbird 2, as it seems to relax the situation from bug 390036?
Comment 19 Dan Mosedale (:dmose) 2009-08-19 10:53:36 PDT
I wish I had the bandwidth.  :-(
Comment 20 Kai Engert (:kaie) 2009-08-19 10:57:38 PDT
(In reply to comment #19)
> I wish I had the bandwidth.  :-(

I think the only thing that's needed is a branch approval.
Comment 21 Samuel Sidler (old account; do not CC) 2009-10-28 15:20:06 PDT
Comment on attachment 365683 [details] [diff] [review]
Patch v3 for 1.8 branch

Approved for 1.8.1.24. a=ss
Comment 22 Kai Engert (:kaie) 2009-10-28 15:43:28 PDT
Created attachment 408959 [details] [diff] [review]
Patch v3 for 1.8 branch, merged again

Merge approved patch to latest 1.8 branch, one simple context change.
I'll check this in once my build completes.
Comment 23 Kai Engert (:kaie) 2009-10-29 15:16:48 PDT
Checked in to 1.8 branch.

Checking in ssl/src/nsNSSIOLayer.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSIOLayer.cpp,v  <--  nsNSSIOLayer.cpp
new revision: 1.97.2.25; previous revision: 1.97.2.24
done
Checking in ssl/src/nsNSSIOLayer.h;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSIOLayer.h,v  <--  nsNSSIOLayer.h
new revision: 1.27.28.7; previous revision: 1.27.28.6
done
Checking in ssl/src/nsSSLThread.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsSSLThread.cpp,v  <--  nsSSLThread.cpp
new revision: 1.2.2.9; previous revision: 1.2.2.8
done
Checking in ssl/src/nsSSLThread.h;
/cvsroot/mozilla/security/manager/ssl/src/nsSSLThread.h,v  <--  nsSSLThread.h
new revision: 1.1.2.4; previous revision: 1.1.2.3
done
Comment 24 Al Billings [:abillings] 2010-02-18 16:06:06 PST
Is the best way to test this for TB 2.0.0.24 to see if it helps with bug 390036?

Note You need to log in before you can comment on or make changes to this bug.