Created attachment 8665856: zero.cap

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:40.0) Gecko/20100101 Firefox/40.0
Build ID: 20150826023504

Steps to reproduce:

Generate a simple response like this one:

    HTTP/1.1 200 OK
    Date: Fri, 25 Sep 2015 09:03:03 GMT
    Server: Apache/2.4.10 (Debian)
    Vary: Accept-Encoding
    Transfer-Encoding: chunked
    Content-Type: text/html

    <!DOCTYPE html>
    <html>
    <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    </head>
    <body>Hi ! 922
    </body>
    </html>

with the start of <meta ...> placed after byte 1120 (counting from 0 at the start of the HTTP header). You can fill the preamble with whatever character you want (0x00, 0x20, ...), and you can place the padding either before <!DOCTYPE or after it.

Actual results:

First side effect: Firefox immediately launches a new request for the same resource (see the enclosed network capture).
Secondary side effects, not fully identified: if the page contains links to other files (JS, CSS, ...), the caching policy is violated.

Expected results:

Firefox should not launch a second request.
Firefox should respect the caching policy.
@Reporter - have you tried this in the latest released version of Firefox? Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:42.0) Gecko/20100101 Firefox/42.0
Per spec, the initial prescan for a <meta> specifying a charset only considers the first 1024 bytes of the file. See <https://html.spec.whatwg.org/#determining-the-character-encoding:prescan-a-byte-stream-to-determine-its-encoding>. In this case, the initial prescan doesn't find the <meta> because it is too far into the file, so the parse starts with the default encoding, which I expect is "ISO-8859-1" in your case. During the parse we see the <meta>, which triggers <https://html.spec.whatwg.org/#parsing-main-inhead:change-the-encoding>. That goes to <https://html.spec.whatwg.org/#change-the-encoding>, which in step 6 restarts the navigation, this time forcing the new character encoding. We do fulfill this second request from cache if we can, but if the document is not yet fully in cache for whatever reason when this reload starts, that won't be possible.
Oh, and the point is that, per spec, a valid HTML document has to have the <meta> specifying the charset within the first 1024 bytes. See the third bullet point at https://html.spec.whatwg.org/#charset. The rest of the stuff discussed in comment 2 is basically error recovery for invalid documents.
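
For anyone who wants to check a page against the 1024-byte limit described above, here is a rough Python sketch. It is not the spec's prescan algorithm (which is a hand-written state machine with more rules) and it is not what Gecko runs; it only approximates the idea with a regex over the first 1024 bytes. The names prescan_charset and build_document are made up for this illustration.

```python
import re

# Number of leading bytes the initial prescan looks at, per the HTML spec.
PRESCAN_LIMIT = 1024

# Assumption: a simplified stand-in for the real prescan rules. It matches a
# complete <meta ...> tag that contains charset=<label>.
META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_-]+)[^>]*>",
    re.IGNORECASE,
)


def prescan_charset(document: bytes):
    """Return the charset label found in the first 1024 bytes, or None."""
    match = META_CHARSET.search(document[:PRESCAN_LIMIT])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()


def build_document(padding: int) -> bytes:
    """Hypothetical test page with `padding` spaces before the <meta> tag."""
    return (
        b"<!DOCTYPE html>\n<html>\n<head>\n"
        + b" " * padding
        + b'<meta http-equiv="content-type" '
        + b'content="text/html; charset=utf-8" />\n'
        + b"</head>\n<body>Hi !</body>\n</html>\n"
    )


if __name__ == "__main__":
    # <meta> near the top: found by the prescan.
    print(prescan_charset(build_document(0)))
    # <meta> pushed past the window: not found, so a parser would start with
    # its default encoding and change encoding once it reaches the <meta>.
    print(prescan_charset(build_document(1100)))
```

Note that the offsets here count from the start of the document body, not from the start of the HTTP header as in comment 0.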