Closed Bug 752648 Opened 8 years ago Closed 8 years ago

TLS intolerance detection and SSL fallback are broken


(Core :: Networking, defect, major)

Not set



Tracking Status
firefox14 --- fixed
firefox15 --- fixed


(Reporter: gcp, Assigned: mcmanus)




(Keywords: regression)


(2 files)

Split from bug 698203, starting from comment 19 and later

This will fail with a Connection Reset error on the first attempt. Subsequent atttemps will work.

one bug where we fallback too often, one where we don't fallback soon enough. And I didn't even change code in psm :) I need a goldilocks consult.

I'll look at this tomorrow.
Assignee: nobody → mcmanus
 	xul.dll!nsSSLIOLayerHelpers::addIntolerantSite(const nsCString & str={...})  Line 1270	C++
 	xul.dll!nsSSLIOLayerHelpers::rememberPossibleTLSProblemSite(nsNSSSocketInfo * socketInfo=0x095f66e8)  Line 681 + 0x9 bytes	C++
 	xul.dll!`anonymous namespace'::checkHandshake(int bytesTransfered=-1, bool wasReading=false, PRFileDesc * ssl_layer_fd=0x0d41c898, nsNSSSocketInfo * socketInfo=0x095f66e8)  Line 921 + 0x9 bytes	C++
 	xul.dll!PSMSend(PRFileDesc * fd=0x0d41c898, const void * buf=0x5a0c542f, int amount=0, int flags=0, unsigned int timeout=4294967295)  Line 1158 + 0x13 bytes	C++
 	xul.dll!nsSSLIOLayerWrite(PRFileDesc * fd=0x0d41c898, const void * buf=0x5a0c542f, int amount=0)  Line 1170 + 0x15 bytes	C++
 	nspr4.dll!PR_Write(PRFileDesc * fd=0x0d41c898, const void * buf=0x5a0c542f, int amount=0)  Line 146 + 0x16 bytes	C
>	xul.dll!nsSocketOutputStream::Write(const char * buf=0x5a0c542f, unsigned int count=0, unsigned int * countWritten=0x0723fa48)  Line 587 + 0x12 bytes	C++
 	xul.dll!nsHttpConnection::EnsureNPNComplete()  Line 310 + 0x33 bytes	C++
 	xul.dll!nsHttpConnection::OnSocketWritable()  Line 1230 + 0x8 bytes	C++
 	xul.dll!nsHttpConnection::OnOutputStreamReady(nsIAsyncOutputStream * out=0x095f5660)  Line 1530 + 0xb bytes	C++
 	xul.dll!nsSocketOutputStream::OnSocketReady(unsigned int condition=0)  Line 526	C++
 	xul.dll!nsSocketTransport::OnSocketReady(PRFileDesc * fd=0x0d41c898, short outFlags=2)  Line 1571	C++
 	xul.dll!nsSocketTransportService::DoPollIteration(bool wait=true)  Line 764 + 0x21 bytes	C++
 	xul.dll!nsSocketTransportService::Run()  Line 650	C++
 	xul.dll!nsThread::ProcessNextEvent(bool mayWait=true, bool * result=0x0723fc3f)  Line 656 + 0x19 bytes	C++

Error is SSL_ERROR_NO_CYPHER_OVERLAP.  When I remove breakpoint from that line on top of the stack, I can reproduce.  But with the breakpoint (just stay few secs in the debugger) I cannot, so it's some timing issue.
Here [1] mSentData is true.  The rest is false.  So we don't call Restart().  The "timing problem" is that simply connReused turns to true since the connection hangs for more then 950 ms.

> if (!mReceivedData && (!mSentData || connReused || mPipelinePosition))

Problem is that pipeline is prebuffering the data before it actually writes.

This is another reason to implement bug 715905, by the way.  We can detect TLS intolerance very soon and restart the connection it self and not to be dependent on complex and sensitive transaction restart logic.

Last good f8656bbf0818
First bad 3dd62d76cc6d
Triggered by3dd62d76cc6d	Patrick McManus — bug 447866 http pipelining is bursty r=honzab
Blocks: 447866
Attached patch pach 0Splinter Review
This moves the failed-ssl detection earlier, pretty much any point before the ssl handshake is complete, and avoids calling the transaction i/o functions before declaring failure. That makes restart work as expected.
Attachment #622000 - Flags: review?(honzab.moz)
Comment on attachment 622000 [details] [diff] [review]
pach 0

Review of attachment 622000 [details] [diff] [review]:

Attachment #622000 - Flags: review?(honzab.moz) → review+
Comment on attachment 622000 [details] [diff] [review]
pach 0

[Approval Request Comment]
Regression caused by (bug #): 447866 (ff14)

User impact if declined: Connection reset on first connection attempt to host that does not support TLSv1 (which is uncommon). Future attempts in the same session will work successfully with SSLv3 fallback. The fix makes the SSLv3 fallback work correctly on the first attempt too.

Testing completed (on m-c, etc.): on m-c. URL in bug report verified

Risk to taking this patch (and alternatives if risky):risk is low - its basically an error check at an earlier point in time. The code path that is changed is well exercised by all SSL connections, so if there is a problem it would likely be noticed quickly.

alternative would be to live with the ephemeral error with SSLv3 compatibility for FF14. "Try Again" works.

String changes made by this patch: none
Attachment #622000 - Flags: approval-mozilla-aurora?
Comment on attachment 622000 [details] [diff] [review]
pach 0

[Triage Comment]
Approved for Aurora 14.
Attachment #622000 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Closed: 8 years ago
Resolution: --- → FIXED
this has somehow created a race for the untrusted cert dialog screen. (with the patch sometimes you get the screen, sometimes you get a connection reset error). We do have tests for that, but it never seems to trigger the error on the CI.

I can see it consistently e.g. with (which should show the error).

I'll revert this from m-c.
Resolution: FIXED → ---
here's a new version that just addresses the problem rather than trying to centralize connection establishment like the last version.

it is simpler yet - it just uses transport SENDING_TO events as the nsHttpTransaction::SentData indication rather than completion of ReadSegments(). That lets us know when the http data really gets written out rather than just buffered in a higher layer which is what's really important for the idempotence safety check being done with the value.

I made sure that sending a CONNECT does not generate a SENDING_TO for the transaction. (if it did, the we wouldn't get the Restart() for TLS intolerance when tunneling through a proxy).
(In reply to Patrick McManus [:mcmanus] from comment #13)
> I made sure that sending a CONNECT does not generate a SENDING_TO for the
> transaction. (if it did, the we wouldn't get the Restart() for TLS
> intolerance when tunneling through a proxy).

unfortunately after further thinking and testing that approach was racy and insufficient.

To summarize:
 * we need to let the pipeline class buffer for efficient packet filling
 * its important to have the transaction i/o (as opposed to finding errors through ensureNPN()), in order to get the appropriate cert error dialogs. I'm sure this can be fixed in a bigger feature, but not in this bug which needs to be ported to ff14.
 * due to the vagaries of the way CONNECT is implemented, its hard to have transaction::mSentData truly reflect what we need for the restart logic

nsHttpTransaction can query nsHttpConnection to find out if the connection object has written any plain text bytes beyond the connect sequence. That will help it sort out the buffering issues at the beginning of the connection. (other situations aren't as important because we liberally restart on pconns anyhow). I've spent most of the day trying to avoid this conclusion, but it seems to be necessary.

As I had to add code to each nsAHttpConnection implementation I created a forwarding object macro.. that will actually let use remove (in a later patch) redundant code in those classes that just does forwarding and replace it with a macro in the header.
Comment on attachment 623007 [details] [diff] [review]
patch version 2 r0

Patrick, this patch looks reasonable to me.  Do you want to submit it for review?
Comment on attachment 623007 [details] [diff] [review]
patch version 2 r0

this looks good on try
Attachment #623007 - Flags: review?(honzab.moz)
Comment on attachment 623007 [details] [diff] [review]
patch version 2 r0

Review of attachment 623007 [details] [diff] [review]:


Good workaround.
Attachment #623007 - Flags: review?(honzab.moz) → review+
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.