Closed Bug 1047698 Opened 7 years ago Closed 7 years ago

ALPN advertisement order may be wrong

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

RESOLVED FIXED
mozilla34

People

(Reporter: u408661, Assigned: mcmanus)

Details

(Whiteboard: [sdpy][http2release])

Attachments

(1 file)

(This is on 3.17 beta 1, the version of NSS currently in mozilla-central, which I *assume* is trunk, since that's not listed above)

http://tools.ietf.org/html/rfc7301 section 3.1 says ALPN tokens are listed in descending order of preference. NPN appears to list them in ascending order of preference (this is how we have interop'd with others in the past, at least). However, when attempting to test our http/2 implementation against webtide.com, I could not get http2 negotiated. A packet capture shows that we are offering the following protocols (in this order) via ALPN:

spdy/3
spdy/3.1
h2-14
http/1.1

webtide.com (running jetty) selects spdy/3. This appears to be consistent with the spec as referenced above. However, gfe (on the same ALPN advertisement) selects h2-14. I believe that, with the exception of http/1.1, we are advertising things in the wrong order. I could be totally wrong.

Changing necko to send h2-14 first "works" against webtide.com (until gecko crashes somewhere random each time after starting to download data), though I suspect (but have not yet tested) that it will break NPN negotiations (as well as the ALPN negotiation with gfe).

We should figure this out (likely in coordination with Google, since they seem to behave like us right now) and fix it if necessary.
A correction: the ALPN advertisement that works against gfe is h2-13, not h2-14 (to my knowledge, gfe doesn't support h2-14 yet). However, the token ordering and all other relevant details remain the same.
Man, I feel like we've been down this road before.

references:
http://tools.ietf.org/html/draft-agl-tls-nextprotoneg-04
http://tools.ietf.org/html/rfc7301

alpn: client sends offer list in descending order of preference.
The server makes the selection, but it doesn't actually need to take client preference into account at all; the server works off its own preference list, which is never serialized onto the wire.

"   It is expected that a server will have a list of protocols that it
   supports, in preference order, and will only select a protocol if the
   client supports it.  In that case, the server SHOULD select the most
   highly preferred protocol that it supports and that is also
   advertised by the client"

So I think this is what is going on with google. A packet capture to a public google endpoint shows the client offer list as spdy/3, spdy/3.1, h2, h1, and the server hello selects spdy/3.1 from the middle of the list because that's the server preference. So I think webtide would also be totally justified in selecting whatever they wanted with alpn (server chooses), but we should certainly be getting the order right to help them out if they want to honor our preference.
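To make the two server behaviours concrete, here is a minimal sketch of ALPN selection. The function names and the server preference lists are hypothetical, chosen only to mirror the captures described above; this is not how gfe or jetty are actually implemented.

```python
from typing import Optional, Sequence


def select_by_server_preference(server_prefs: Sequence[str],
                                client_offer: Sequence[str]) -> Optional[str]:
    """RFC 7301 style: walk the server's own preference list and return
    the first protocol that the client also advertised."""
    offered = set(client_offer)
    for proto in server_prefs:
        if proto in offered:
            return proto
    return None  # no overlap


def select_by_client_preference(server_supported: Sequence[str],
                                client_offer: Sequence[str]) -> Optional[str]:
    """Alternative policy: honor the client's ordering and return the first
    offered protocol that the server supports."""
    supported = set(server_supported)
    for proto in client_offer:
        if proto in supported:
            return proto
    return None  # no overlap


# Client offer as seen on the wire before any fix.
offer = ["spdy/3", "spdy/3.1", "h2-13", "http/1.1"]

# Assumed preference list for a google-like endpoint without h2 support.
google_like = ["spdy/3.1", "spdy/3", "http/1.1"]

print(select_by_server_preference(google_like, offer))  # spdy/3.1
print(select_by_client_preference(google_like, offer))  # spdy/3
```

With the same offer, a server that follows its own preference lands on spdy/3.1, while a server that honors the client's ordering lands on spdy/3, which is why the order we advertise only matters for the second kind of server.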

npn: server sends offer list in descending order of preference and client preference list is never serialized on the wire. (This is a way in which npn is nicer.) But we still have an interface to nss to share our preference order. This is the same interface used for alpn, of course.

The same google endpoint (which has no h2 support) advertises an npn list of spdy/3.1, spdy/3, h1. And we select spdy/3.1 from that list. Which is the desired outcome but a little surprising (see below).

Let's talk gecko:

protocol array given to nsISSLSocketControl::SetNPNList() = {h1, spdy/3, spdy/3.1, h2}. That list is passed to nss in the same order via SSL_SetNextProtoNego().

This is the definition of that function: "If no matching protocol is found it
selects the first supported protocol.

Using this function also allows the client to transparently support ALPN.
The same set of protocols will be advertised via ALPN and, if the server
uses ALPN to select a protocol, SSL_GetNextProto will return
SSL_NEXT_PROTO_SELECTED as the state.

Since NPN uses the first protocol as the fallback protocol, when sending an
ALPN extension, the first protocol is moved to the end of the list. This
indicates that the fallback protocol is the least preferred. The other
protocols should be in preference order."

That says to me that our list should be {h1, h2, spdy/3.1, spdy/3}, where h1 is the fallback protocol that will be moved to the end of the list. That should generate an alpn client hello of h2, spdy/3.1, spdy/3, h1.
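The reordering described in the quoted comment can be sketched as below. This only models the "first protocol moves to the end" rule from the documentation quoted above; the function name is hypothetical and this is not the NSS code.

```python
from typing import List, Sequence


def alpn_offer_from_npn_list(protos: Sequence[str]) -> List[str]:
    """Model of the documented rule: the first entry is the NPN fallback,
    so it is moved to the end when the ALPN extension is built."""
    if not protos:
        return []
    return list(protos[1:]) + [protos[0]]


# The list gecko was passing down: fallback first, then ascending preference.
old_list = ["http/1.1", "spdy/3", "spdy/3.1", "h2-14"]
print(alpn_offer_from_npn_list(old_list))
# ['spdy/3', 'spdy/3.1', 'h2-14', 'http/1.1']

# The proposed list: fallback first, then descending preference.
new_list = ["http/1.1", "h2-14", "spdy/3.1", "spdy/3"]
print(alpn_offer_from_npn_list(new_list))
# ['h2-14', 'spdy/3.1', 'spdy/3', 'http/1.1']
```

The first result matches the order seen in the original packet capture, and the second is the descending-preference offer that RFC 7301 asks for.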

The surprising part here is that we select spdy/3.1 from google using npn even when it's in the middle of our preference list (between 3 and h1 which we both support). The reason is the nss code actually ignores the client ordering with npn, other than using that magic first token for the no-match fallback. The peer's preference is honored (as webtide is doing with alpn): https://mxr.mozilla.org/mozilla-central/source/security/nss/lib/ssl/sslsock.c#1416
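A rough model of that npn client-side behaviour, as I understand it from the description above: walk the server's list in the server's order, take the first protocol we support, and fall back to our first token if nothing matches. The function name and the returned state strings are hypothetical; this is a sketch, not a transcription of the nss code, and it assumes the client list is non-empty.

```python
from typing import Sequence, Tuple


def npn_client_select(server_list: Sequence[str],
                      client_list: Sequence[str]) -> Tuple[str, str]:
    """Pick a protocol the way the npn client path is described above.

    The client's ordering is ignored except for client_list[0], which is
    used as the fallback when there is no overlap at all.
    """
    supported = set(client_list)
    for proto in server_list:  # server (peer) preference order wins
        if proto in supported:
            return proto, "negotiated"
    return client_list[0], "no_overlap"  # the magic first token


# npn list advertised by the google endpoint, and the list gecko was using.
server = ["spdy/3.1", "spdy/3", "http/1.1"]
client = ["http/1.1", "spdy/3", "spdy/3.1", "h2-13"]

print(npn_client_select(server, client))
# ('spdy/3.1', 'negotiated')
```

Reordering everything after the first entry of `client` does not change the result, which is why fixing the alpn order should not disturb npn negotiation.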

The good news is this will be a simple change to netwerk/protocol/http.
Whiteboard: [sdpy][http2release]
some interop testing:

With the patch from comment 3 and alpn disabled: we still use NPN to select spdy/3.1 from google (correct) and we still speak h1 with apple (who does not use alpn or npn in the handshake).

same patch with h2 and alpn enabled: we send alpn(h2-13, spdy/3.1, spdy/3, h1) and goog selects spdy/3.1 via alpn (good). We use h1 with apple (good), and twitter.com ignores the alpn and offers an npn list of h2-13, spdy/3.1 and h1; we select h2-13 (good).
I did try the patch with webtide: h2-13, spdy/3.1, spdy/3, h1. It negotiated spdy/3 via ALPN. I assume that it doesn't support h2-13 or spdy/3.1. Should recheck with h2-14 applied.
I'll give this a shot against both webtide with h2-14 and gfe with h2-13 to see if anything breaks :)
Everything seems ok. Webtide doesn't even crash firefox now, though we do get a connection reset. That's debuggable outside this bug, though :)
Attachment #8467236 - Flags: review?(hurley)
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Attachment #8467236 - Flags: review?(hurley) → review+
Component: Libraries → Networking: HTTP
Product: NSS → Core
Version: trunk → Trunk
https://hg.mozilla.org/mozilla-central/rev/cbdd181e39a6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34