Closed Bug 1538978 Opened 5 years ago Closed 5 years ago

Firefox periodically doesn't download CSS files from nginx server over HTTP2 with SSL

Categories

(Core :: Networking: HTTP, defect)

66 Branch
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1486046

People

(Reporter: noir04, Unassigned)

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0

Steps to reproduce:

  • Fresh Debian 9 on GCP (also tested on CentOS)
  • nginx 1.10.3 (it also happens on the latest stable and mainline)
  • HTTPS/SSL enabled (tested with self-signed, Let's Encrypt, and commercial SSL certificates)
  • HTTP2 enabled
  • Firefox 66 (also tested with Firefox 65)
  • I also tested other browsers, but the problem only happens in Firefox, which is why I'm leaning towards it being a Firefox problem rather than an nginx problem
  • Went to my test site: https://34.73.172.170/ (self-signed SSL certificate, please ignore the warning)

Actual results:

After 30 seconds to two minutes of refreshes (which happen automatically), several of the CSS files eventually fail to load, returning status = 0
I've tried to catch the moment the problem occurs with a bit of JavaScript inlined in the HTML file

Test package here if you guys want to reproduce on your own nginx server:
https://34.73.172.170/http2-test.zip

Expected results:

All the CSS files should load, on each reload
Failing to do so should provide some kind of error other than status = 0

I've also posted this on Server Fault (https://serverfault.com/questions/959516/firefox-periodically-doesnt-download-css-files-from-nginx-server-over-http2-wit) but it got kind of derailed because of the number of CSS files
The number of CSS files is high because it makes the problem occur sooner
But regardless of the number of CSS files (unless maybe we're in the thousands), this shouldn't happen, right?

Component: Untriaged → Networking: HTTP
Product: Firefox → Core
Has STR: --- → yes

Thanks for the perfect test case!

The nginx server just resets the connection w/o any prior warning. We treat this as a sudden connection drop and don't produce any errors. This is by the spec, but I can see this behavior from time to time and it really is very confusing. Duplicate of bug 1486046.

I could see peak concurrency on the session at 90 about 30 ms before the reset. It dropped to 61 just before the reset. There was no RecvGoAway and there was still some activity before the reset.

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE

Adding Dragana, in case she may think of something more to diagnose here. noir04, please keep the test case up for some time, if possible. Thanks.

Thanks for taking a look Honza
I'll keep the test case open for as long as you need :)
Should it ever crash, let me know and I'll set it up again

Thanks to the bugs that this is a duplicate of (sorry for making a duplicate), I found a comment from u408661 (https://bugzilla.mozilla.org/show_bug.cgi?id=1499307#c15), which said that the "cause" is the low number of http2_max_requests that nginx has as default, 1000
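
To illustrate why a page with many CSS files reaches that limit so quickly, here is a rough back-of-the-envelope sketch. It assumes every reload re-requests the HTML page plus each CSS file exactly once over the same HTTP/2 connection (no caching, no other requests); the function name and the other_requests parameter are made up for this illustration and are not part of nginx or Firefox.

```python
import math


def reload_hitting_cap(css_files, max_requests=1000, other_requests=1):
    """Return the 1-based reload during which the per-connection request
    count reaches max_requests.

    Assumption: each reload issues css_files + other_requests requests on
    one connection, and nothing else is sent on that connection.
    """
    per_reload = css_files + other_requests
    if per_reload <= 0:
        raise ValueError("a reload must issue at least one request")
    return math.ceil(max_requests / per_reload)


if __name__ == "__main__":
    # 100 CSS files with the default cap of 1000 vs. a raised cap of 50000.
    print(reload_hitting_cap(100))
    print(reload_hitting_cap(100, max_requests=50000))
```

Under these assumptions, a page with 100 CSS files crosses the default limit within roughly ten reloads, which would be in line with the problem showing up after a short period of automatic refreshing.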

I spun up a duplicate of the test case, increased http2_max_requests to 50000, pointed firefox towards the site, and hey presto, it's alive.... sort of
Once I reached 50000 connection requests something weird happened
I had added additional logging to the access log, which now also records $connection and $connection_requests (inspired by this ticket from the nginx tracker: https://trac.nginx.org/nginx/ticket/1102). I've uploaded the access log from the time when I hit 50000 connection requests: https://34.73.172.170/accesslog_50000_firefox.txt
The number to the far right is $connection_requests and to the left of that is $connection
Notice how the order of files isn't 1 to 100, but instead seemingly random
Also, 57.css and 75.css are both tried twice
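
For anyone who wants to check a log like this without reading it by eye, the following is a minimal sketch of how the duplicates and the per-connection order could be extracted. It assumes each line contains a quoted request such as "GET /57.css HTTP/2.0" and ends with $connection followed by $connection_requests, as described above; the sample lines are made up for illustration and are not taken from the uploaded log.

```python
import re
from collections import Counter, defaultdict

# Assumption: the request line is quoted, e.g. "GET /57.css HTTP/2.0".
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')


def parse_line(line):
    """Return (connection, connection_requests, path), or None if the line
    does not match the assumed layout."""
    match = REQUEST_RE.search(line)
    fields = line.split()
    if match is None or len(fields) < 2:
        return None
    try:
        connection = int(fields[-2])           # second-to-last field
        connection_requests = int(fields[-1])  # last field
    except ValueError:
        return None
    return connection, connection_requests, match.group(1)


def repeated_paths(lines):
    """Map each path that was requested more than once to its count."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[2]] += 1
    return {path: n for path, n in counts.items() if n > 1}


def paths_by_connection(lines):
    """Group paths by connection, ordered by $connection_requests."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            connection, seq, path = parsed
            grouped[connection].append((seq, path))
    return {conn: [p for _, p in sorted(entries)]
            for conn, entries in grouped.items()}


# Hypothetical sample lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [26/Mar/2019:10:00:00 +0000] "GET /57.css HTTP/2.0" 200 120 "-" "ua" 7 49999',
    '203.0.113.5 - - [26/Mar/2019:10:00:00 +0000] "GET /75.css HTTP/2.0" 200 120 "-" "ua" 7 50000',
    '203.0.113.5 - - [26/Mar/2019:10:00:01 +0000] "GET /57.css HTTP/2.0" 200 120 "-" "ua" 8 1',
]

if __name__ == "__main__":
    print(repeated_paths(SAMPLE))
    print(paths_by_connection(SAMPLE))
```

Grouping by $connection makes it easier to see whether a file that shows up twice was retried on a new connection after the old one reached its request limit.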

Furthermore, I've been able to observe that Chrome also acts a little "wonky"
Setting Chrome loose on https://34.73.172.170/ with cache disabled (important), once the 1000th connection request is reached, Chrome takes an awfully long time to get the next files (awfully long being around 5 seconds)
However, Chrome never seems to trigger my "debug" JavaScript, suggesting that it always ends up getting all the files

Do you guys still need me to keep the test case online? :)

Flags: needinfo?(dd.mozilla)

(In reply to noir04 from comment #4)

Thanks to the bugs that this is a duplicate of (sorry for making a duplicate), I found a comment from u408661 (https://bugzilla.mozilla.org/show_bug.cgi?id=1499307#c15), which said that the "cause" is the low number of http2_max_requests that nginx has as default, 1000

I spun up a duplicate of the test case, increased http2_max_requests to 50000, pointed firefox towards the site, and hey presto, it's alive.... sort of
Once I reached 50000 connection requests something weird happened

do you remember what was happening?

Flags: needinfo?(dd.mozilla) → needinfo?(noir04)

The weird part is what I'm trying to explain in the next couple of sentences :)

Flags: needinfo?(noir04)