Open
Bug 1504337
Opened 7 years ago
Updated 1 month ago
Consider if it makes sense to use HTTP1.1 on cellular connections
Categories
(Core :: Performance: Navigation, enhancement, P5)
People
(Reporter: jesup, Unassigned, NeedInfo)
References
Details
(Keywords: perf, perf:pageload)
We should consider whether we want to use HTTP/1.1 instead of HTTP/2 on cellular connections.
When connected via cellular, there's evidence that HTTP/2 can be significantly slower than HTTP/1.1 for "heavy" sites, especially on 'fair' or 'poor' network connections.
Light sites *seem* not to have the same issue; almost everything is fast (in the paper below).
Likely this is due to TCP loss recovery / head-of-line blocking on packet loss; the larger number of connections used by HTTP/1.1 can mostly sidestep the problem. QUIC should eventually resolve it for poor connections.
See https://twitter.com/patmeenan/status/1058067408634699777?s=21 and https://www.researchgate.net/publication/324514529_HTTP2_Prioritization_and_its_Impact_on_Web_Performance
(particularly the figures and tables from the second link)
Note that dynamically characterizing the network quality is tough, though not impossible; the suggestion here is for a simpler, more deterministic test.
We'd want some of our own figures to validate this and track it, if possible.
Part of the argument *against* doing this is that HTTP/2 (and hopefully QUIC+HTTP/2) moves the web forward, so short-term optimizations like this may slow that down. It's not clear that our doing so would affect the larger HTTP/2 goal, and HTTP/2 definitely seems to be at least something of a win over 1.1 on cable-quality connections.
There are several considerations here to balance (and verify).
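To make the head-of-line-blocking point concrete, here's a toy back-of-envelope sketch (invented numbers; not a model of real TCP behavior, and not anything Necko actually does). It only counts how much in-flight resource delivery gets stuck behind a single retransmission stall when everything is multiplexed on one connection versus spread across six:

```python
# Toy illustration of head-of-line blocking under packet loss: how much
# in-flight resource delivery does one TCP retransmission stall hold up?
# Purely illustrative numbers; this is not a model of real TCP behavior.

IN_FLIGHT = 12         # resources outstanding when the loss happens
RTO_STALL_MS = 600     # hypothetical recovery stall for one lost packet (ms)
H1_CONNECTIONS = 6     # typical HTTP/1.1 per-host connection limit

def delayed_resource_ms(connections):
    # A loss stalls only the connection it occurred on, but every resource
    # multiplexed on that connection waits behind the retransmission.
    resources_on_stalled_conn = IN_FLIGHT / connections
    return resources_on_stalled_conn * RTO_STALL_MS

print("HTTP/2, 1 connection   :", delayed_resource_ms(1), "resource-ms delayed")
print("HTTP/1.1, 6 connections:", delayed_resource_ms(H1_CONNECTIONS), "resource-ms delayed")
```

With one connection every outstanding resource sits behind the stall; with six, only about a sixth of them do, which is the "sidestep" referred to above.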
Thanks for CCing me here, Randell. I spent some time reading the paper and the twitter thread yesterday, as well as looking at the raw data provided by the authors, and I largely agree with your conclusions - it's quite likely that the reason we see slowdowns on larger sites and poor networks is because of packet loss. That is for sure an issue that was foreseen in the HTTP/2 design process, and was left up to the implementer to decide what to do (though with a strong preference for fewer connections in general).
One thing that really made me happy to read in the paper was on page 11, where the authors state that "Firefox is as a whole profoundly faster than Chrome". Granted, that's just one paper, and even if largely true it's not guaranteed to be because of our HTTP/2 priority implementation, though it does mesh with anecdotal information I've heard about our pageload speed with HTTP/2 relative to Chrome.
One thing this paper made me wonder about is actually not related to our prioritization scheme at all, but rather to our coalescing scheme. If I remember correctly, we are by far the most aggressive browser when it comes to coalescing requests to different origins onto the same HTTP/2 session. In doing so, for the lower-quality cell network case, we can actually hurt ourselves even more than other browsers, as a packet loss would potentially impact a larger number of requests than in other browsers, which would have more parallelism because they are less likely to coalesce connections.
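(For anyone not familiar with the coalescing rules: the eligibility check is roughly the RFC 7540 section 9.1.1 rule - reuse an existing session for a different origin when the server is authoritative for it - plus, if I remember correctly, an IP-address match in our case. The sketch below is a rough paraphrase with invented names and data structures, not Necko's actual logic.)

```python
# Rough sketch of HTTP/2 connection coalescing eligibility, paraphrasing
# RFC 7540 section 9.1.1 plus an IP match. Names and data structures are
# invented for illustration; this is not Necko code.

from dataclasses import dataclass

@dataclass
class H2Session:
    cert_subject_names: set[str]   # DNS names the server's certificate covers
    remote_ips: set[str]           # addresses this session is connected to

def can_coalesce(session: H2Session, new_host: str, new_host_ips: set[str]) -> bool:
    # Reuse the existing session for new_host if the server is authoritative
    # for it (certificate covers the name) and the new host resolves to an
    # address we are already talking to.
    cert_ok = new_host in session.cert_subject_names   # wildcards ignored here
    ip_ok = bool(session.remote_ips & new_host_ips)
    return cert_ok and ip_ok

# Example: a session to cdn.example.com whose cert also covers img.example.com
session = H2Session({"cdn.example.com", "img.example.com"}, {"192.0.2.10"})
print(can_coalesce(session, "img.example.com", {"192.0.2.10"}))    # True  -> coalesce
print(can_coalesce(session, "other.example.net", {"192.0.2.10"}))  # False -> new connection
```

The more aggressive that check is in practice, the more origins share the fate of a single lossy TCP connection, which is the concern above.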
I wonder if we can't try for a "best-of-both-worlds" approach - continue using HTTP/2 (so when things are going well, they continue to go exceedingly well), but open more connections if we detect that we're on a potentially bad network (so, on a cell network of any sort, in this case). How many connections to open is up for debate (my gut says 3, but just because that's about halfway between the 1 of pure HTTP/2 and the 6 of pure HTTP/1.1).
There would also be the question of how to distribute requests across those connections - would a simple round-robin be good enough, or do we want to continue our pattern with HTTP/2 and try to be a bit smarter (by trying to ensure an even distribution of requests at various priority levels across the different connections)? I can see a simple round-robin degrading to just as bad as our current HTTP/2 situation (or nearly) if all the important requests just happen to end up on the same connection, and that connection experiences packet loss.
Of course, doing this could make things worse on really good cell networks (like the one I have when I'm sitting at my desk).
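To make the distribution question concrete, here's a rough sketch of the two options (placeholder request list, priority labels, and connection count - not a proposal for actual Necko behavior):

```python
# Sketch of two ways to spread requests over k HTTP/2 connections.
# Hypothetical request list and priority labels; not Necko code.

from collections import defaultdict
from itertools import cycle

# Hypothetical arrival order; the high-priority requests happen to arrive
# every third, which is exactly the case where plain round-robin piles them
# all onto the same connection.
requests = [("main.html", "high"), ("thumb1.jpg", "low"), ("thumb2.jpg", "low"),
            ("app.js", "high"), ("ad.js", "low"), ("thumb3.jpg", "low"),
            ("style.css", "high"), ("analytics.js", "low"), ("thumb4.jpg", "low")]

def round_robin(reqs, k):
    # Rotate by arrival order only; priorities are ignored, so the important
    # requests can all end up behind the same (possibly lossy) connection.
    conns = defaultdict(list)
    for i, (url, _prio) in enumerate(reqs):
        conns[i % k].append(url)
    return dict(conns)

def priority_spread(reqs, k):
    # Round-robin separately within each priority level, so a loss on any one
    # connection stalls at most ~1/k of the high-priority requests.
    conns = defaultdict(list)
    next_conn = defaultdict(lambda: cycle(range(k)))
    for url, prio in reqs:
        conns[next(next_conn[prio])].append(url)
    return dict(conns)

print("round robin    :", round_robin(requests, 3))
print("priority spread:", priority_spread(requests, 3))
```

With this particular arrival order, plain round-robin puts all three high-priority requests on the same connection, while the priority-aware spread puts one on each - which is exactly the degradation scenario described above.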
This is also complicated a bit by the fact that there largely isn't a lot of parallelism in HTTP/2 - about 2/3 of all HTTP/2 sessions that carry data only ever carry one stream at a time. Now, it's certainly possible that the numbers look this way because on fast networks we don't get a whole lot of opportunities to parallelize in HTTP/2, but there's no way of telling from the telemetry data (HTTP/2 parallelism telemetry isn't broken down by network type or any other secondary axis).
Just some more food for thought on what we should/could do here, and what other data we may need to make a better-informed decision.
Reporter
Comment 2•7 years ago
+ mayhemer since Nick has left
Comment 3•7 years ago
It is not easy to detect whether we are on a bad network: is it the network, or is the server stalling? We could add some heuristics, but I am not sure it is worth it.
If we have a server that always fills the TCP congestion window (and there is no background traffic), 1 TCP connection gets about 75% of the network and 3 TCP connections get around 95%. These are theoretical numbers. With background traffic, more connections will get proportionally more bandwidth, but they can also cause more loss and compete with each other. You can construct an ideal setup for each case, but deciding this from the traffic visible at the application layer is not easy. HTTP/1.1 also depends on whether the server can fill the TCP congestion window, and if it cannot, that 95% can be much smaller.
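One common back-of-envelope way to arrive at numbers in this ballpark is the TCP AIMD sawtooth: a single flow's congestion window oscillates between W/2 and W, so it averages about 75% of the peak, while several desynchronized flows back off one at a time, so the aggregate rate dips much less. The figures above may come from a different model; this sketch only shows the shape of the argument:

```python
# Back-of-envelope AIMD utilization model (one common way to get numbers in
# this ballpark; the exact figures quoted above may come from a different
# model). Assumption: N long-lived flows share a bottleneck of capacity C,
# losses hit one flow at a time, and each loss halves only that flow's
# congestion window, so the aggregate rate sawtooths between C*(1 - 1/(2N))
# and C.

def aggregate_utilization(n_flows: int) -> float:
    low = 1.0 - 1.0 / (2 * n_flows)   # aggregate just after one flow backs off
    high = 1.0                        # aggregate just before the next loss
    return (low + high) / 2           # time-average of a linear sawtooth

for n in (1, 2, 3, 6):
    print(f"{n} connection(s): ~{aggregate_utilization(n):.0%} of bottleneck capacity")
```

This prints roughly 75% for one connection and ~92% for three, in the same ballpark as the theoretical numbers quoted above.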
Updated•6 years ago
Priority: -- → P5
Reporter
Updated•6 years ago
Whiteboard: [qf]
Comment 4•6 years ago
Calling this [qf:p3:pageload] - Randell, please adjust if that seems off-base to you. (This might really be "qf:investigate" but I'm not sure we still use that)
Whiteboard: [qf] → [qf:p3:pageload]
Updated•3 years ago
Severity: normal → S3
Updated•4 months ago
Component: Performance: General → Performance: Navigation
Comment 5•2 months ago
We will be racing TCP and QUIC as part of the Happy Eyeballs V3 project, bug 1914201.
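For context, a Happy Eyeballs-style race just starts both transport attempts (usually with a small stagger) and keeps whichever connects first, cancelling the loser. A minimal asyncio sketch of the pattern, with placeholder connect coroutines and made-up timings - not the actual Happy Eyeballs V3 implementation tracked in bug 1914201:

```python
# Minimal sketch of racing two transport attempts, Happy Eyeballs style.
# Connect functions, timings, and stagger are placeholders; this is not the
# Happy Eyeballs V3 implementation.

import asyncio

async def connect_quic(host: str) -> str:
    await asyncio.sleep(0.05)          # pretend the QUIC handshake takes 50 ms
    return f"QUIC connection to {host}"

async def connect_tcp_tls(host: str) -> str:
    await asyncio.sleep(0.12)          # pretend TCP+TLS setup takes 120 ms
    return f"TCP+TLS connection to {host}"

async def race(host: str, stagger_s: float = 0.01) -> str:
    # Start QUIC first, give it a small head start, then start TCP;
    # whichever attempt completes first wins and the loser is cancelled.
    quic = asyncio.create_task(connect_quic(host))
    await asyncio.sleep(stagger_s)
    tcp = asyncio.create_task(connect_tcp_tls(host))
    done, pending = await asyncio.wait({quic, tcp}, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()

print(asyncio.run(race("example.com")))
```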
And I'm seeing time-to-request-start about twice as fast over HTTP/3 compared to HTTP/2 on Fenix overall, which would make sense given the reduction in round trips (Pageload Event telemetry).
Randell, are you OK with closing this older bug? I don't think this is something we'd consider anymore.
Flags: needinfo?(rjesup)
See Also: → 1914201