Closed
Bug 705755
Opened 13 years ago
Closed 7 years ago
Reintroduce handling of SSL short write after SSL thread removal
Categories
(Core :: Security: PSM, defect, P1)
Core
Security: PSM
Tracking
()
RESOLVED
FIXED
mozilla57
People
(Reporter: briansmith, Assigned: mayhemer)
References
Details
(Keywords: regression, Whiteboard: [psm-assigned])
Attachments
(1 file, 3 obsolete files)
7.81 KB,
patch
|
mayhemer
:
review+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #674147 +++
(In reply to Honza Bambas (:mayhemer) from Bug 674147 Comment 37)
> You are regressing bug 378629:
> https://bugzilla.mozilla.org/show_bug.cgi?id=356470#c54
> https://bugzilla.mozilla.org/show_bug.cgi?id=378629#c11
>
> PSMSend must watch for libssl sending just request - 1 amount of data (= a
> "short write"). In that case, the simplest solution is IMO that we have to
> limit next data send to just 1 byte, simply explained in a pseudo code:
>
> PSMSend(buffer, count)
> {
> static flag = false;
> if (flag)
> count = 1;
> flag = false;
>
> written = lower->send(buffer, count)
> if (written == count - 1)
> flag = true;
> }
>
> We can do this in a separate bug, but ASAP. The final solution is up to
> you, but I'd like to review and test it.
This is marked blocking bug 674147 because that is how we show "this bug is caused by that bug" but I still intend to land the patch in bug 674147 first.
Assignee | ||
Comment 1•13 years ago
|
||
I renamed the bug to be more descriptive and also added refs to other bugs that gets affected by bug 674147.
I agree to land that bug first, this one is not a critical issue.
Assignee | ||
Updated•13 years ago
|
Assignee | ||
Comment 2•13 years ago
|
||
-> me
Stealing this one.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•13 years ago
|
||
- implements what the SSL thread had been implementing before its removal
- adds also check for 16383 bytes of written data ; this often happens even for larger buffers like 32768, w/o it we jitter again with 2 bytes being later written separately
- the 16383 const is disputable
Added Nelson for feedback to get more info on how all ssl lib indicates the short write.
To have what we had before SSL thread removal we may just remove the 16383 const checking and we are done.
Attachment #582708 -
Flags: review?(kaie)
Attachment #582708 -
Flags: feedback?(nelson)
Comment 4•13 years ago
|
||
Aren't we short on time for reviews on these for ff11 ?
Comment 5•13 years ago
|
||
I agree with the patch in general, just some small comments:
+/* Secure write sometimes returns this amount of bytes written even for larger
+ sizes of the output buffer, consider this as indication of a short write */
+static const PRInt32 kShortWrite16k = 16383;
(a)
Please explain (in the comment) where this number comes from.
I had to research myself.
I believe this is based on libssl's constant
#define MAX_FRAGMENT_LENGTH 16384
Maybe add this comment:
// Value is based on libssl's maximum buffer size (in a private header):
// MAX_FRAGMENT_LENGTH - 1
// TODO: Query this value once at runtime (changes in NSS required)
(b)
Nit, optional: Given you only use this constant once,
why not move it closer to the place where you use it?
Makes the explanation easier to find when reading the code.
(c)
+ // Value of the last byte pending from the SSL short write that needs
+ // to be passed to subsequent calls to send to perform the flush.
+ char mShortWritePendingByte;
Please change that to "unsigned char", because that's what you use everywhere else.
(d)
+ if (socketInfo->IsShortWritePending()) {
+ // We got "SSL short write" last time, try to flush the pending byte.
+ buf = socketInfo->GetShortWritePendingByteRef();
+ amount = 1;
+ ###
+ PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("[%p] pushing 1 byte after SSL short write",
+ fd));
+ }
+
If I understand correctly, if we reach position ###,
the following must be true:
- the function was called with: amount >= 1
- this expression is true:
*(const unsigned char*) pending-ref == *buf-given-to-function[0]
I propose to add an assertion,
that way you don't need to change the "buf" pointer.
if (socketInfo->IsShortWritePending()) {
// We got "SSL short write" last time, try to flush the pending byte.
NS_ASSERTION(amount >= 1 &&
buf[0] == *(const unsigned char*)
socketInfo->GetShortWritePendingByteRef(),
"unexpected buffer after short write");
amount = 1;
PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("[%p] pushing 1 byte after SSL short write",
fd));
}
(e)
+ // SSL Short Write handling.
+ //
+ // True when SSL layer has indicated an "SSL short write", i.e. need
+ // to call on send one or more times to push all pending data to write.
+ bool mIsShortWritePending;
+ // These are only valid iff mIsShortWritePending is true.
+ //
+ // Value of the last byte pending from the SSL short write that needs
+ // to be passed to subsequent calls to send to perform the flush.
+ char mShortWritePendingByte;
+ // Original amount of data the upper layer has requested to write to
+ // return after the successful flush.
+ PRInt32 mShortWriteOriginalAmount;
In this block it's difficult to see which variables you refer to when saying
"These are only valid iff mIsShortWritePending is true."
I'd align the block clearer, maybe like this:
// True when SSL layer has indicated an "SSL short write", i.e. need
// to call on send one or more times to push all pending data to write.
bool mIsShortWritePending;
// Value of the last byte pending from the SSL short write that needs
// to be passed to subsequent calls to send to perform the flush.
// Only valid if mIsShortWritePending
unsigned char mShortWritePendingByte;
// Original amount of data the upper layer has requested to write to
// return after the successful flush.
// Only valid if mIsShortWritePending
PRInt32 mShortWriteOriginalAmount;
Comment 6•13 years ago
|
||
Please run some quick local testing with a debug build, after you do (d), and confirm you don't see the assertion.
If you address my proposals, r=kaie
No second round of review necessary, but please attach the new patch for reference.
Comment 7•13 years ago
|
||
Comment on attachment 582708 [details] [diff] [review]
v1
r-
If you address my small comments, you may mark r=kaie on your new patch.
Attachment #582708 -
Flags: review?(kaie) → review-
Comment 8•13 years ago
|
||
> r-
because of the char/unsigned char inconsistence.
Reporter | ||
Updated•13 years ago
|
Assignee | ||
Comment 9•11 years ago
|
||
We may want to check this is actually still needed.
Status: ASSIGNED → NEW
Assignee | ||
Comment 10•9 years ago
|
||
David, can you or someone from your team please verify this is still an issue and should be fixed?
Assignee: honzab.moz → nobody
Flags: needinfo?(dkeeler)
Updated•9 years ago
|
Whiteboard: [psm-backlog]
Updated•8 years ago
|
Attachment #582708 -
Flags: feedback?(nelson)
Updated•8 years ago
|
Priority: -- → P3
Comment 12•7 years ago
|
||
"perf" key word?
Assignee | ||
Updated•7 years ago
|
Assignee | ||
Comment 13•7 years ago
|
||
I addressed all the Kai's 5 years old comments where I believed it was necessary. I can't trigger the code path tho, so hard to say if this breaks something or actually fixes something.
Attachment #582708 -
Attachment is obsolete: true
Attachment #8900871 -
Flags: review?(dkeeler)
Assignee | ||
Updated•7 years ago
|
Whiteboard: [psm-backlog] → [psm-active]
Assignee | ||
Updated•7 years ago
|
Target Milestone: mozilla11 → ---
Comment 14•7 years ago
|
||
Comment on attachment 8900871 [details] [diff] [review]
v1 (merged)
Review of attachment 8900871 [details] [diff] [review]:
-----------------------------------------------------------------
Cool. This looks good. r=me with comments addressed (particularly the assertion failure and the documentation).
I managed to reproduce the short write situation by running a server locally and using tc to limit the bandwidth and set a delay to it. It looks like Firefox can get into an inefficient state where it will send a large packet followed by a small packet followed by a large packet, etc.. This patch addresses the issue and makes the packet size much more uniform (and large). Unfortunately, it seems this would be hard to write a test for. Maybe we could add this as a QA smoketest task? Or I think we have some talos tests that simulate network effects that should be able to get Firefox in this state...
Anyway, I think we should elaborate on the documentation in this patch. Maybe something like this:
NSS indicates that it can't write all requested data (due to network congestion, for example) by returning either one less than the amount of data requested or 16383, if the requested amount is greater than 16384. We refer to this as a "short write". If we simply returned the amount that NSS did write, the layer above us would then call PSMSend with a very small amount of data (often 1). This is inefficient and can lead to alternating between sending large packets and very small packets. To prevent this, we alert the layer calling us that the operation would block and that it should be retried later, with the same data. When it does, we tell NSS to write the remaining byte it didn't write in the previous call. We then return the total number of bytes written, which is the number that caused the short write plus the additional byte we just wrote out.
::: security/manager/ssl/nsNSSIOLayer.cpp
@@ +1483,5 @@
> #endif
>
> + if (socketInfo->IsShortWritePending() && amount > 0) {
> + // We got "SSL short write" last time, try to flush the pending byte.
> + MOZ_DIAGNOSTIC_ASSERT(*static_cast<unsigned char const*>(buf) == *socketInfo->GetShortWritePendingByteRef(),
This assertion fails for me. If I'm understanding correctly, the higher layer is retrying the original write, right? If so, buf wouldn't be the byte(s) that didn't get written - buf would be the original data, in which case buf[mShortWriteOriginalAmount - 1] should be mShortWritePendingByte, right? In fact, if we wanted to give ourselves more confidence that the right thing was happening, maybe we could save the original buf and compare it to the new buf (probably in debug-only builds, though).
@@ +1497,4 @@
> int32_t bytesWritten = fd->lower->methods->send(fd->lower, buf, amount,
> flags, timeout);
>
> + /* Secure write sometimes returns this amount of bytes written even for larger
nit: //-style comments, please
@@ +1501,5 @@
> + sizes of the output buffer, consider this as indication of a short write.
> +
> + Value is based on libssl's maximum buffer size (in a private header):
> + MAX_FRAGMENT_LENGTH - 1
> + TODO: Query this value once at runtime (changes in NSS required) */
Let's file a bug in NSS :: Libraries to make this API available and reference it here.
::: security/manager/ssl/nsNSSIOLayer.h
@@ +175,5 @@
> + // True when SSL layer has indicated an "SSL short write", i.e. need
> + // to call on send one or more times to push all pending data to write.
> + bool mIsShortWritePending;
> +
> + // These are only valid iff mIsShortWritePending is true.
nit: "only valid if and only if" is a bit redundant :)
Attachment #8900871 -
Flags: review?(dkeeler) → review+
Updated•7 years ago
|
Priority: P3 → P1
Whiteboard: [psm-active] → [psm-assigned]
Assignee | ||
Comment 15•7 years ago
|
||
Attachment #8900871 -
Attachment is obsolete: true
Attachment #8902272 -
Flags: review+
Assignee | ||
Comment 16•7 years ago
|
||
(updated ci message)
Attachment #8902272 -
Attachment is obsolete: true
Attachment #8902311 -
Flags: review+
Assignee | ||
Updated•7 years ago
|
Keywords: checkin-needed
Comment 17•7 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/21350cc2ff6b
Handle SSL short-write correctly to save CPU looping. r=keeler
Keywords: checkin-needed
Comment 18•7 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox57:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Updated•7 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•