If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Do not call ReclaimConnection during TFO, and make TFO handling clearer

RESOLVED FIXED in Firefox 55

Status

()

Core
Networking: HTTP
RESOLVED FIXED
4 months ago
4 months ago

People

(Reporter: dragana, Assigned: dragana)

Tracking

55 Branch
mozilla55
Points:
---

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [necko-active])

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

4 months ago
I decided to rewrite part of TCP Fast Open handling in nsHttpConnectionMgr, to remove calls to ReclaimConnection and to make handling of TCP FastOpen backup HalfOpenSock more clear.

ReclaimConnection cause bug 1362984, which is cause by antivirus program and not our code (I am 100% sure that our code could not cause that). I will remove calls to that function since they are unnecessary.

Description:

I have introduced mHalfOpenFastOpenBackups to ConnectionEntry that will hold RefPtr to nsHalfOpenSock that are use as a backup for TFO connection. They used to be in mHalfOpens. This will make some handling easier to understand.

If socketTransport can use TFO it will call nsHalfOpenSocket::StartFastOpen().

nsHalfOpenSocket::StartFastOpen():
 remove the HalfOpen from mHalfOpens
 count this socketTransport as connected
 remove HalfOpen from callbacks, the new connection will take them.
 try to make a new connection:
   1) if we have not succeeded to activate this connection, e.g. request has an error or if connection got canceled (it can happen if we have too many idle connection and we have not found request for this one)
     cancel and deref everything.
   2) on success:
     add HalfOpen to mHalfOpenFastOpenBackups
     start a backup timer.

after StartFastOpen a HalfOpen will have:
mFastOpenInProgress == true and mConnectionNegotiatingFastOpen != nullptr and HalfOpen will be in mHalfOpenFastOpenBackups (these all always go together)

On connection successfully connected or an error happen that we are not going to recover from SetFastOpenConnected is called
SetFastOpenConnected:
  reset mSocketTransport->SetFastOpenCallback(nullptr) and mConnectionNegotiatingFastOpen->SetFastOpen(false)
  cancel the backup timer
  if we have a backup socket transport already started, we can use it, therefore put HalfOpen into mHalfOpens. Otherwise call Abandon to clean up.
  remove HalfOpen from mHalfOpenFastOpenBackups
  set mFastOpenInProgress to false and deref mConnectionNegotiatingFastOpen


If we are going to recover from this error also call SetFastOpenConnected:
  reset mSocketTransport->SetFastOpenCallback(nullptr) and    mConnectionNegotiatingFastOpen->SetFastOpen(false)
  call CloseConnectionFastOpenTakesTooLongOrError to close nsHttpConnection, but do not close socketTransport
 put transaction into Pending queue
 (I also add transaction to this halfOpen, so that I am sure that the transaction gets this halfOpen, but we can maybe remove that to be on a save side)
 put HalfOpen into mHalfOpens
 restore callbacks
 remove connection from mActiveConns (we are not calling ReclaimConnection to do that for us)
 remove HalfOpen from mHalfOpenFastOpenBackups
 set mFastOpenInProgress to false and deref mConnectionNegotiatingFastOpen

If backup connection is connected faster than the TFO connection:
in nsHalfOpenSocket::OnOutputStreamReady:
  reset mSocketTransport->SetFastOpenCallback(nullptr) and    mConnectionNegotiatingFastOpen->SetFastOpen(false)
  call CloseConnectionFastOpenTakesTooLongOrError to close nsHttpConnection and socketTransport
  put transaction into Pending queue
  (I also add transaction to this halfOpen, so that I am sure that that transaction gets this halfOpen, but we can maybe remove that)
  remove HalfOpen from mHalfOpenFastOpenBackups
  remove connection from mActiveConns (we are not calling ReclaimConnection to do that for us)
  set mFastOpenInProgress to false and deref mConnectionNegotiatingFastOpen

UpdateCoalescingForNewConn():
will call DontReuse() for connections in tfo phase, but that will not closed them. We want to closed them. In old version we were closing them throuth their HalfOpen because they were in mHalfOpens now they are not. Now we iterate through mHalfOpenFastOpenBackups and call CancelFastOpenConnection:
  reset mSocketTransport->SetFastOpenCallback(nullptr) and    mConnectionNegotiatingFastOpen->SetFastOpen(false)
  put transaction into Pending queue
  remove HalfOpen from mHalfOpenFastOpenBackups
  remove connection from mActiveConns (we are not calling ReclaimConnection to do that for us)
  set mFastOpenInProgress to false and deref mConnectionNegotiatingFastOpen
  Abandon();

Similar happens in GetSpdyActiveConn (GetSpdyActiveConn will NOT be called for a socket in TFO. (We can optimized when we set a connection to be IsExperienced for TFO connection, a separate bug)

If nsHttpConnetionMgr decide to close connection (e.g. PruneDeadConnections) the connection will close socketTransport which will call SetFastOpenConnected with an error which will clean up halfOpen and maybe put it into mHalfOpens. OnMgrShutdown this will happen first and than we will call cleanup for mHalfOpens and everything will be gone!
(Assignee)

Updated

4 months ago
Blocks: 1188435
Whiteboard: [necko-active]
(Assignee)

Comment 1

4 months ago
Created attachment 8867101 [details] [diff] [review]
bug_1364371_v1.patch

This will make some stuff easier to debug. The old version worked as well, but this one is easier to understand.
Attachment #8867101 - Flags: review?(mcmanus)
(Assignee)

Comment 2

4 months ago
TFO turned off
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ff8a5765106c638b34c045a89e1dc09d11c354bd

TFO turned on
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5da2da094b940cc2c9b560d4c1638b37e74cb8cd:
(Assignee)

Updated

4 months ago
Duplicate of this bug: 1363825
(Assignee)

Comment 4

4 months ago
Created attachment 8867224 [details] [diff] [review]
bug_1364371_v1.patch

I like when I forget ".get()" in a log.
Attachment #8867101 - Attachment is obsolete: true
Attachment #8867101 - Flags: review?(mcmanus)
Attachment #8867224 - Flags: review?(mcmanus)
(Assignee)

Comment 5

4 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=299c28ea4df5b0dcff397009cdd083921b24d576
(Assignee)

Comment 6

4 months ago
Thinking  of NullHttpTransaction crashes, I think this patch will fix them. I think I was not properly holding reference to HalfOpen and looking at nsHttpConnectionMgr any code path is possible.
Comment on attachment 8867224 [details] [diff] [review]
bug_1364371_v1.patch

comment 5 try run says this isn't quite ready :).. the description makes sense to me tho
Attachment #8867224 - Flags: review?(mcmanus)
(Assignee)

Comment 8

4 months ago
I have went a bit further in removing flipping reference:

I removed places where I have changed nsHttpOpen::mTransaction.
(Assignee)

Comment 9

4 months ago
Throughout nsHttpConnectionMgr code we call ProcessPendingQ a lot which, in some cases, can cancel halfOpen connection. This patch makes sure that a halfOpen connection is not in any of the lists during CancelFastOpenConnection, OnOutputStreamReady, StartFastOpen and SetFastOpenConnected. Therefore halfOpen will be removed from mHalfOpenFastOpenBackups or mHalfOpens (depends in which list it is) at the beginning of these functions and only added to mHalfOpenFastOpenBackups or mHalfOpens at the end of these functions. Connections will stay in mActiveConn because only actions on mActiveConn connections are call to DontReuse and close which I think will not get nsHttpConnectionMgr is a strange state.
(Assignee)

Comment 10

4 months ago
Created attachment 8867659 [details] [diff] [review]
bug_1364371_v1.patch

https://treeherder.mozilla.org/#/jobs?repo=try&revision=15c5be07296290b2b5974ac769b0d3ab0bd82af3
Attachment #8867224 - Attachment is obsolete: true
Attachment #8867659 - Flags: review?(mcmanus)
(Assignee)

Comment 11

4 months ago
Created attachment 8870110 [details] [diff] [review]
bug_1364371_v2.patch

From the conversation today: a version that does not remove ReclaimConnection.
Attachment #8870110 - Flags: review?(mcmanus)
Comment hidden (obsolete)
(Assignee)

Comment 13

4 months ago
Comment on attachment 8870110 [details] [diff] [review]
bug_1364371_v2.patch

try does not look good.
Attachment #8870110 - Flags: review?(mcmanus)
(Assignee)

Comment 14

4 months ago
Comment on attachment 8870110 [details] [diff] [review]
bug_1364371_v2.patch

sorry for the noise, I have push an additional patch that I did not want to push.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7a5fddd86b9a6dacbd045630d7cbd06076ce9bab
Attachment #8870110 - Flags: review?(mcmanus)
(Assignee)

Updated

4 months ago
Flags: needinfo?(mcmanus)
Comment on attachment 8867659 [details] [diff] [review]
bug_1364371_v1.patch

Review of attachment 8867659 [details] [diff] [review]:
-----------------------------------------------------------------

let's go with v2
Attachment #8867659 - Flags: review?(mcmanus)
Comment on attachment 8870110 [details] [diff] [review]
bug_1364371_v2.patch

Review of attachment 8870110 [details] [diff] [review]:
-----------------------------------------------------------------

I think this one is much safer :)
Attachment #8870110 - Flags: review?(mcmanus) → review+
Flags: needinfo?(mcmanus)

Comment 17

4 months ago
Pushed by dd.mozilla@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/23606bed178e
Introduce mHalfOpenFastOpenBackups and make TFO handling clearer. r=mcmanus

Comment 18

4 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/23606bed178e
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.