Closed Bug 930805 Opened 11 years ago Closed 9 years ago

SPDY IP Pooling/Connection Coalescing uses more connections than expected

Categories

(Core :: Networking: HTTP, defect)

24 Branch
x86
All
defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: edward.ackroyd, Unassigned)

Details

(Whiteboard: [spdy])

Since Firefox 24 was released I have seen approximately double the number of connections on my SPDY servers (nginx). It seems that SPDY IP pooling/connection coalescing is not working correctly.

To reproduce, look at this example, which contains a few images served from multiple subdomains:

http://ackroyd.de/spdy-ip-pooling-test.html

All subdomains point to a single server IP with a valid wildcard certificate. I would expect IP pooling/connection coalescing to kick in and all images to be served over one SPDY connection. However, the nginx server logs show that multiple connections are used (the second column is the connection id):

ip1 311 [25/Oct/2013:00:40:32 +0200] https2 0.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 c.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 c.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 2.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 2.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 4.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 4.searchpreview.de "GET ..."
ip1 313 [25/Oct/2013:00:40:32 +0200] https2 4.searchpreview.de "GET ..."
ip1 315 [25/Oct/2013:00:40:32 +0200] https2 2.searchpreview.de "GET ..."
ip1 312 [25/Oct/2013:00:40:32 +0200] https2 1.searchpreview.de "GET ..."
ip1 316 [25/Oct/2013:00:40:32 +0200] https2 6.searchpreview.de "GET ..."
ip1 319 [25/Oct/2013:00:40:32 +0200] https2 c.searchpreview.de "GET ..."
ip1 314 [25/Oct/2013:00:40:32 +0200] https2 5.searchpreview.de "GET ..."
ip1 317 [25/Oct/2013:00:40:32 +0200] https2 7.searchpreview.de "GET ..."
ip1 318 [25/Oct/2013:00:40:32 +0200] https2 8.searchpreview.de "GET ..."
ip1 310 [25/Oct/2013:00:40:32 +0200] https2 3.searchpreview.de "GET ..."
ip1 321 [25/Oct/2013:00:40:32 +0200] https2 e.searchpreview.de "GET ..."
ip1 320 [25/Oct/2013:00:40:32 +0200] https2 9.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 9.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 9.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 8.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 8.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 5.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 5.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 3.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 3.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 1.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 1.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 e.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 e.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 7.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 7.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 0.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 0.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 6.searchpreview.de "GET ..."
ip1 311 [25/Oct/2013:00:40:32 +0200] https2 6.searchpreview.de "GET ..."

Here are the corresponding Firefox connection ids from the Firefox HTTP traffic log (nsHttpConnectionMgr::DispatchTransaction), which also shows several different connections for the requests:

[ci=.S..0.searchpreview.de:443 trans=1bf11000 caps=21 conn=44448900 priority=0]
[ci=.S..1.searchpreview.de:443 trans=1bf16000 caps=21 conn=44448800 priority=0]
[ci=.S..3.searchpreview.de:443 trans=44364800 caps=21 conn=44448b00 priority=0]
[ci=.S..4.searchpreview.de:443 trans=44364e00 caps=21 conn=44448c00 priority=0]
[ci=.S..5.searchpreview.de:443 trans=44365400 caps=21 conn=44448d00 priority=0]
[ci=.S..2.searchpreview.de:443 trans=1bf16600 caps=21 conn=44448e00 priority=0]
[ci=.S..7.searchpreview.de:443 trans=44366000 caps=21 conn=44448f00 priority=0]
[ci=.S..6.searchpreview.de:443 trans=44365a00 caps=21 conn=44449000 priority=0]
[ci=.S..8.searchpreview.de:443 trans=44366800 caps=21 conn=44449100 priority=0]
[ci=.S..9.searchpreview.de:443 trans=44366e00 caps=21 conn=44449200 priority=0]
[ci=.S..c.searchpreview.de:443 trans=4449b600 caps=21 conn=44449300 priority=0]
[ci=.S..e.searchpreview.de:443 trans=4449bc00 caps=21 conn=44449400 priority=0]
[ci=.S..c.searchpreview.de:443 trans=4449ba00 caps=21 conn=44448900 priority=0]
[ci=.S..c.searchpreview.de:443 trans=4449b800 caps=21 conn=44448900 priority=0]
[ci=.S..2.searchpreview.de:443 trans=44364000 caps=21 conn=44448900 priority=0]
[ci=.S..2.searchpreview.de:443 trans=44360c00 caps=21 conn=44448900 priority=0]
[ci=.S..4.searchpreview.de:443 trans=44365200 caps=21 conn=44448900 priority=0]
[ci=.S..4.searchpreview.de:443 trans=44365000 caps=21 conn=44448900 priority=0]
[ci=.S..9.searchpreview.de:443 trans=4449b400 caps=21 conn=44448900 priority=0]
[ci=.S..9.searchpreview.de:443 trans=4449b200 caps=21 conn=44448900 priority=0]
[ci=.S..8.searchpreview.de:443 trans=44366c00 caps=21 conn=44448900 priority=0]
[ci=.S..8.searchpreview.de:443 trans=44366a00 caps=21 conn=44448900 priority=0]
[ci=.S..5.searchpreview.de:443 trans=44365800 caps=21 conn=44448900 priority=0]
[ci=.S..5.searchpreview.de:443 trans=44365600 caps=21 conn=44448900 priority=0]
[ci=.S..3.searchpreview.de:443 trans=44364c00 caps=21 conn=44448900 priority=0]
[ci=.S..3.searchpreview.de:443 trans=44364a00 caps=21 conn=44448900 priority=0]
[ci=.S..1.searchpreview.de:443 trans=1bf16400 caps=21 conn=44448900 priority=0]
[ci=.S..1.searchpreview.de:443 trans=1bf16200 caps=21 conn=44448900 priority=0]
[ci=.S..e.searchpreview.de:443 trans=4449c000 caps=21 conn=44448900 priority=0]
[ci=.S..e.searchpreview.de:443 trans=4449be00 caps=21 conn=44448900 priority=0]
[ci=.S..7.searchpreview.de:443 trans=44366600 caps=21 conn=44448900 priority=0]
[ci=.S..7.searchpreview.de:443 trans=44366400 caps=21 conn=44448900 priority=0]
[ci=.S..0.searchpreview.de:443 trans=1bf14200 caps=21 conn=44448900 priority=0]
[ci=.S..0.searchpreview.de:443 trans=1bf14000 caps=21 conn=44448900 priority=0]
[ci=.S..6.searchpreview.de:443 trans=44365e00 caps=21 conn=44448900 priority=0]
[ci=.S..6.searchpreview.de:443 trans=44365c00 caps=21 conn=44448900 priority=0]

Here's what I see (thanks for the test case!):

1] connections to all 12 hosts - this is to be expected because without handshaking we can't determine if they are running spdy and have the appropriately overlapping certs. These handshakes try and happen in parallel.

2] Of the 36 transactions, 25 of them happen on one connection. These 25 transactions actually cover all 12 hostnames, so there is coalescing. The other 11 connections each carry exactly 1 transaction.

3] The 11 connections that carry just one transaction are shut down immediately by Firefox after that transaction completes. Future requests (or already queued requests) for that hostname are automatically done on the coalesced connection, which is left open.
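
The tally in point 2 can be reproduced from the DispatchTransaction lines with a short script. This is only a sketch: the regular expression assumes the `ci=` value starts with a four-character flag prefix (such as `.S..`) followed by `host:port`, which matches the lines quoted in this bug but is not a documented format. `SAMPLE` holds four of the 36 lines, and `tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# Four of the DispatchTransaction lines quoted in this bug (not the full log).
SAMPLE = """\
[ci=.S..0.searchpreview.de:443 trans=1bf11000 caps=21 conn=44448900 priority=0]
[ci=.S..1.searchpreview.de:443 trans=1bf16000 caps=21 conn=44448800 priority=0]
[ci=.S..c.searchpreview.de:443 trans=4449ba00 caps=21 conn=44448900 priority=0]
[ci=.S..1.searchpreview.de:443 trans=1bf16400 caps=21 conn=44448900 priority=0]
"""

# Assumption: four flag characters precede the hostname in the ci= field.
LINE_RE = re.compile(r"ci=[.A-Z]{4}([^:\s]+):\d+.*?conn=([0-9a-f]+)")


def tally(log_text):
    """Return (transactions per connection, hostnames seen per connection)."""
    per_conn = Counter()
    hosts = {}
    for match in LINE_RE.finditer(log_text):
        host, conn = match.groups()
        per_conn[conn] += 1
        hosts.setdefault(conn, set()).add(host)
    return per_conn, hosts


if __name__ == "__main__":
    per_conn, hosts = tally(SAMPLE)
    for conn, count in per_conn.most_common():
        print(conn, count, sorted(hosts[conn]))
```

Run against the full log, the connection carrying many hostnames is the coalesced one, and connections with a count of 1 are the ones that are closed after their single transaction.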

This is happening because connections are generally (but not always) tied to specific transactions when the SSL logic is being executed - so by the time we establish a coalescable relationship the horse is out of the barn, so to speak. We immediately stop using that connection though and move everything over to the primary one. That relationship is cached for the rest of the browser session (space allowing), as well.

So the connections can't go away - we need to make them to verify the relationship. The best that could happen here would be to perform 0 transactions on the 11 connections that currently carry 1. That would have a minor benefit to congestion control and prioritization, so it's probably worth doing, but it's not a huge deal.

It probably just involves more dependence on the nsHalfOpenSocket::OnOutputStreamReady() logic that uses a null http transaction; but I would be careful about not breaking SSL client auth.

(In reply to Patrick McManus [:mcmanus] from comment #1)
> Here's what I see - (thanks for the test case!):
> 
> 1] connections to all 12 hosts - this is to be expected because without
> handshaking we can't determine if they are running spdy and have the
> appropriately overlapping certs. These handshakes try and happen in parallel.

I did not express this thought correctly. We don't need to connect to all 12 hosts in a strictly technical sense: the IP address for all 12 of them plus the wildcard cert from the first one is sufficient. But because we don't yet know that any of them are SPDY, we do try to connect to them all in parallel, assuming they will be HTTP/1. All 12 hosts have their TCP connections started before the first one has completed the SSL handshake (which is required to establish the coalescing criteria) and supplied the verified cert.

Hope that helps more.

One way you can fix this on your site is to serve the base html page off of a domain that also satisfies the coalescing criteria. In that case the appropriate information would be established before the requests for the images are queued and no additional connections should be made.
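
The coalescing criteria described in this comment (a matching IP address plus a verified certificate that covers the new hostname, on a connection already known to speak SPDY) can be sketched roughly as follows. This is an illustration of the idea only, not Firefox's implementation: `OpenConnection`, `name_matches`, and `can_coalesce` are hypothetical names, the address `192.0.2.1` is a documentation placeholder, and the wildcard rule is simplified to "`*.` covers exactly one left-most label".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenConnection:
    ip: str            # address the existing connection is established to
    cert_names: tuple  # DNS names taken from the verified certificate
    is_spdy: bool      # known only after the SSL handshake has completed


def name_matches(pattern, hostname):
    """Simplified certificate name match: '*.' covers one left-most label."""
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return pattern == hostname


def can_coalesce(conn, hostname, resolved_ips):
    """True if a request for hostname could reuse the existing connection."""
    return (
        conn.is_spdy
        and conn.ip in resolved_ips
        and any(name_matches(p, hostname) for p in conn.cert_names)
    )


if __name__ == "__main__":
    primary = OpenConnection("192.0.2.1", ("*.searchpreview.de",), True)
    print(can_coalesce(primary, "4.searchpreview.de", {"192.0.2.1"}))
```

Every input to `can_coalesce` except the resolved IP addresses comes from a completed handshake, which is why requests that are queued before the first handshake finishes each get a connection of their own.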

Thanks for the explanation. I run a Firefox extension that inserts the images into other pages, so I can't really use the base HTML page workaround you suggest.

The Chrome browser uses 1 connection for all images in this test case (same scenario: New browser session, no spdy session open before starting the test).

The balance here is where we want it for now (presuming h1, so not holding back the handshake).
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Whiteboard: [spdy]