Closed Bug 1604286 Opened 5 years ago Closed 5 years ago

Http/2 connection reuse mix-up after DNS change

Categories

(Core :: Networking: DNS, defect)

71 Branch
defect
Not set
normal

RESOLVED DUPLICATE of bug 1420777
Tracking Status
firefox71 --- affected

People

(Reporter: pascal, Unassigned)

Details

Attachments

(1 file)

184.50 KB, application/gzip
Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:69.0) Gecko/20100101 Firefox/69.0

Steps to reproduce:

  • create https servers "rev-proxy.foo.org" and "rev-proxy2.foo.org" with certificates *.foo.org . The HTTP KeepAlive must be long enough to allow connection reuse
  • create CNAMEs aaa.foo.org and bbb.foo.org for "rev-proxy.foo.org"
  • with firefox, access https://bbb.foo.org/
  • modify CNAME bbb.foo.org : it is now a CNAME for "rev-proxy2.foo.org"
  • with firefox, access https://bbb.foo.org/ then https://aaa.foo.org/

Actual results:

The request to https://aaa.foo.org/ is sent to "rev-proxy2.foo.org"

NB : a force-reload on https://aaa.foo.org/ is sent to "rev-proxy.foo.org"

Expected results:

The request to https://aaa.foo.org/ should have been sent to "rev-proxy.foo.org"

Version: 69 Branch → 71 Branch

Reproduced on Firefox 69 and 71.
From the server logs, I can see that Firefox 67 and 68 have the same issue.

Hi Pascal,

I wasn't able to reproduce the bug but I've chosen a component for this bug in hope that someone with more expertise may look at it. We'll await their answer.

Regards, Flor.

Component: Untriaged → Networking: DNS
Product: Firefox → Core

Hello reporter,
Could you please gather an HTTP log so that we can move forward?
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

Comment 3. Thanks

Flags: needinfo?(pascal)
Attached file http log

Obtained with about:networking logging.

Tested with first:

193.55.96.7 sas-nginx.univ-paris1.fr dns4a.univ-paris1.fr dns4b.univ-paris1.fr

then

193.55.96.7 sas-nginx.univ-paris1.fr dns4a.univ-paris1.fr
193.55.96.24 front-test.univ-paris1.fr dns4b.univ-paris1.fr

in /etc/hosts on linux

Flags: needinfo?(pascal)

We do aggressive coalescing for H2 based on the result of DNS resolution. This behavior is intentional, and it explains your case.

Step 1:
Connect to dns4b.univ-paris1.fr => we have a connection Connection1 to dns4b.univ-paris1.fr with IP 193.55.96.7
Note that Connection1 stays connected for a long enough time.

Step 2:
Change the DNS CNAME.
Connect to dns4b.univ-paris1.fr again: we now have a connection Connection2 to dns4b.univ-paris1.fr with IP 193.55.96.24.
The reason is that we did not have an existing connection for that IP.

Step 3:
Connect to dns4a.univ-paris1.fr: we already have a connection, Connection1, with the same IP.
Therefore we coalesce and reuse Connection1.

Hope this helps.

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE

This is NOT what is happening.

I've done more tests and analysis of the logs & code.

  • Firefox on first vhost (ci) will create a nsConnectionEntry
  • it will then reuse this nsConnectionEntry for the same vhost (for how long? at least 12h when mostly idle)
  • it will assign a coalescing key to this nsConnectionEntry
  • the coalescing key is computed from current vhost IPs

Consequences:

  • when the nsConnectionEntry is reused, the coalescing key for a vhost is not re-computed, it is still using the old IP
  • if you access a vhost with a new IP, it will use the old IP as the "coalescing key":
    • if no current HTTP2 connection, it will open one with new IP,
    • if there is a current HTTP2 connection (*), it will use it

So two mix-ups :

  • a request to another vhost that is still on the old IP can be sent to the new IP (when the connection was created for a vhost on the new IP)
  • a request to a vhost on the new IP can be sent to the old IP (when the connection was created for a vhost on the old IP)
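
The two mix-ups can be reproduced with a toy model. This is only a sketch in Python of the behavior described above, not Firefox's actual C++ code: `ConnectionEntry`, `ToyConnectionManager` and their methods are hypothetical names, and the model assumes one coalescing key per entry that is set once and never recomputed.

```python
class ConnectionEntry:
    """Toy stand-in for nsConnectionEntry (hypothetical, heavily simplified)."""

    def __init__(self, host):
        self.host = host
        self.coalescing_key = None  # set on first use, never recomputed


class ToyConnectionManager:
    """Models only the key handling described in this comment."""

    def __init__(self):
        self.entries = {}     # vhost -> ConnectionEntry (long-lived)
        self.live_conns = {}  # coalescing key -> IP the live connection uses

    def close_idle_connections(self):
        # The connections go away, but the entries and their keys survive.
        self.live_conns.clear()

    def request(self, host, resolved_ip):
        """Return the IP the request is actually sent to."""
        entry = self.entries.setdefault(host, ConnectionEntry(host))
        if entry.coalescing_key is None:
            # Assumption: the key is derived from the IP seen the first time.
            entry.coalescing_key = resolved_ip
        key = entry.coalescing_key
        if key in self.live_conns:
            # A connection is registered under this key: coalesce onto it.
            return self.live_conns[key]
        # No live connection: open one to the freshly resolved IP, but
        # register it under the (possibly stale) key.
        self.live_conns[key] = resolved_ip
        return resolved_ip


OLD_IP, NEW_IP = "193.55.96.7", "193.55.96.24"

# Mix-up 1: a vhost still on the old IP ends up on the new IP.
mgr = ToyConnectionManager()
mgr.request("bbb.foo.org", OLD_IP)
mgr.close_idle_connections()
mgr.request("bbb.foo.org", NEW_IP)             # opened to NEW_IP, keyed by OLD_IP
mixup_1 = mgr.request("aaa.foo.org", OLD_IP)   # coalesces onto the NEW_IP connection

# Mix-up 2: a vhost moved to the new IP keeps using the old connection.
mgr2 = ToyConnectionManager()
mgr2.request("bbb.foo.org", OLD_IP)
mixup_2 = mgr2.request("bbb.foo.org", NEW_IP)  # still sent to OLD_IP
```

The first scenario follows the same sequence as the log extract: a new connection to the new IP is registered under a key built from the old IP, and the next vhost that resolves to the old IP joins it.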

Workarounds :

  • ensure unknown vhosts return HTTP 421 (hopefully you don't have an app which catches all other vhosts...). It will still be a mess, but Firefox will retry on a new connection to the valid IP.
  • lower http2_idle_timeout (or equivalent) or disable HTTP/2 before DNS changes. But for how long??

(*) nginx http2_idle_timeout defaults to 3 minutes
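
For the first workaround, the server-side check amounts to comparing the requested authority against the vhosts this server really serves. The snippet below is only a sketch of that decision in Python; `status_for_authority` and the `served_hosts` set are hypothetical names, and a real reverse proxy would express this in its own configuration instead.

```python
from http import HTTPStatus


def status_for_authority(authority, served_hosts):
    """Pick a status code for a request, given its Host / :authority value.

    Assumption: `served_hosts` holds the lowercase vhost names this server is
    really responsible for. IPv6 literals are not handled in this sketch.
    """
    host = authority.split(":")[0].lower()  # drop an optional ":port" suffix
    if host in served_hosts:
        return HTTPStatus.OK
    # 421 Misdirected Request: the request arrived on a connection that this
    # server is not authoritative for, so the client should retry elsewhere.
    return HTTPStatus.MISDIRECTED_REQUEST


served = {"bbb.foo.org"}
ok_status = status_for_authority("bbb.foo.org:443", served)
misdirected_status = status_for_authority("aaa.foo.org", served)
```

This only helps if no catch-all vhost answers first, which is the caveat mentioned in the workaround itself.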

Extract from logs :

# 2019-12-29 21:57:16
nsHttp uri=https://dns36b.univ-paris1.fr/
nsHostResolver CompleteLookup: dns36b.univ-paris1.fr has 193.55.96.7
#
nsConnectionEntry::nsConnectionEntry this=0x7fb0e59e9580 key=.S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443
#
nsHttpConnectionMgr::MakeNewConnection 0x7fb106f52760 ent=0x7fb0e59e9580 trans=0x7fb0e5aab400
nsHttpConnectionMgr::MakeNewConnection [ci = .S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443]
#
nsHttpConnectionMgr::nsHalfOpenSocket::OnTransportStatus STATUS_CONNECTING_TO Established New Coalescing Key # 0 for host dns36b.univ-paris1.fr [193.55.96.7~.:443/[]viaDNS]
#
nsSocketTransport   trying address: 193.55.96.7
UpdateCoalescingForNewConn() registering newConn 0x7fb0e59bc400 .S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443 under key 193.55.96.7~.:443/[]viaDNS
# 2019-12-30 09:38:39
nsHttp uri=https://dns36b.univ-paris1.fr/
nsHostResolver CompleteLookup: dns36b.univ-paris1.fr has 193.55.96.24
#
# reusing nsConnectionEntry since same "ci"
#
nsHttpConnectionMgr::MakeNewConnection 0x7fb106f52760 ent=0x7fb0e59e9580 trans=0x7fb0e71a3400
nsHttpConnectionMgr::MakeNewConnection [ci = .S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443]
#
# no "New Coalescing Key"
#
nsSocketTransport   trying address: 193.55.96.24
UpdateCoalescingForNewConn() registering newConn 0x7fb0e6838400 .S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443 under key 193.55.96.7~.:443/[]viaDNS
# 2019-12-30 09:38:46
nsHttp uri=https://dns36a.univ-paris1.fr/
nsHostResolver CompleteLookup: dns36a.univ-paris1.fr has 193.55.96.7
#
nsConnectionEntry::nsConnectionEntry this=0x7fb0e5c36c80 key=.S......[tlsflags0x00000000]dns36a.univ-paris1.fr:443
#
nsHttpConnectionMgr::MakeNewConnection 0x7fb106f52760 ent=0x7fb0e5c36c80 trans=0x7fb0e76e9800
nsHttpConnectionMgr::MakeNewConnection [ci = .S......[tlsflags0x00000000]dns36a.univ-paris1.fr:443]
#
nsHttpConnectionMgr::nsHalfOpenSocket::OnTransportStatus STATUS_CONNECTING_TO Established New Coalescing Key # 0 for host dns36a.univ-paris1.fr [193.55.96.7~.:443/[]viaDNS]
#
FindCoalescableConnectionByHashKey() found match conn=0x7fb0e6838400 key=193.55.96.7~.:443/[]viaDNS newCI=.S......[tlsflags0x00000000]dns36a.univ-paris1.fr:443 matchedCI=.S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443 join ok
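
The mismatch in the extract (a connection opened to one address but registered under a key built from another) can be searched for mechanically. The script below is a rough sketch in Python that assumes log lines shaped like the ones quoted above; the regular expressions and the `find_stale_keys` helper are my own, and the pairing of "trying address" with the next "registering newConn" line is a simplification that may not hold when several connections are opened concurrently.

```python
import re

# Patterns derived from the lines quoted in the extract above.
TRYING_RE = re.compile(r"trying address: (\d+\.\d+\.\d+\.\d+)")
REGISTER_RE = re.compile(
    r"registering newConn (\S+) .* under key (\d+\.\d+\.\d+\.\d+)~"
)


def find_stale_keys(lines):
    """Return (conn, connected_ip, key_ip) for each suspicious registration."""
    last_ip = None
    suspicious = []
    for line in lines:
        m = TRYING_RE.search(line)
        if m:
            last_ip = m.group(1)  # address the socket is about to connect to
            continue
        m = REGISTER_RE.search(line)
        if m and last_ip is not None and m.group(2) != last_ip:
            suspicious.append((m.group(1), last_ip, m.group(2)))
    return suspicious


sample = [
    "nsSocketTransport   trying address: 193.55.96.7",
    "UpdateCoalescingForNewConn() registering newConn 0x7fb0e59bc400 "
    ".S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443 "
    "under key 193.55.96.7~.:443/[]viaDNS",
    "nsSocketTransport   trying address: 193.55.96.24",
    "UpdateCoalescingForNewConn() registering newConn 0x7fb0e6838400 "
    ".S......[tlsflags0x00000000]dns36b.univ-paris1.fr:443 "
    "under key 193.55.96.7~.:443/[]viaDNS",
]
stale = find_stale_keys(sample)
```

On the sample lines, only the second registration is flagged, which is the connection that dns36a.univ-paris1.fr is later coalesced onto.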

Looking again at the code:

mCoalescingKeys values are initialized here: https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttpConnectionMgr.cpp#5130-5131 (in nsHttpConnectionMgr::nsHalfOpenSocket::OnTransportStatus)

That code should handle a DNS change. Alas, looking at the code, this looks somewhat difficult since we do not know whether DNS changed.

Another solution would be to expire mCoalescingKeys after a few minutes? At least that way the issue won't last too long...
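
To make the expiry idea concrete, here is a minimal sketch in Python, not a patch against nsHttpConnectionMgr. The class name `ExpiringCoalescingKey`, the 5-minute default and the injectable `clock` are assumptions chosen for illustration; the real mCoalescingKeys is a list of keys, not a single value.

```python
import time


class ExpiringCoalescingKey:
    """Hypothetical per-entry key that is recomputed once it is too old."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behavior can be tested
        self._key = None
        self._set_at = None

    def get(self, resolved_ip):
        """Return the key to use, refreshing it from `resolved_ip` if expired."""
        now = self._clock()
        if self._key is None or now - self._set_at >= self._ttl:
            self._key = resolved_ip
            self._set_at = now
        return self._key


# Usage with a fake clock: the stale key survives for at most `ttl_seconds`.
fake_now = [0.0]
key = ExpiringCoalescingKey(ttl_seconds=300.0, clock=lambda: fake_now[0])
first = key.get("193.55.96.7")
fake_now[0] = 60.0
still_old = key.get("193.55.96.24")   # within the TTL: old key is kept
fake_now[0] = 400.0
refreshed = key.get("193.55.96.24")   # past the TTL: key follows the new IP
```

This does not remove the mix-up, it only bounds how long it can last, which is the trade-off suggested above.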

Tested again: the bug is still there on firefox 98.

NB: it seems a DNS change is no longer taken into account during HTTP keep-alive (?). So the steps to reproduce become:

  • create https servers "rev-proxy.foo.org" and "rev-proxy2.foo.org" with certificates *.foo.org . The HTTP KeepAlive must be long enough to allow connection reuse
  • create CNAMEs aaa.foo.org and bbb.foo.org for "rev-proxy.foo.org"
  • with firefox, access https://bbb.foo.org/
  • modify CNAME bbb.foo.org : it is now a CNAME for "rev-proxy2.foo.org"
  • wait for connection not to be "Active" anymore (in "about:networking")
  • with firefox, access https://bbb.foo.org/ then https://aaa.foo.org/