Closed Bug 366023 Opened 13 years ago Closed 4 years ago

Chunked Encoding with HTTP/1.0

Categories

(Core :: Networking: HTTP, defect)

Platform: x86 Windows XP
Priority: Not set

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla46
Tracking Status: firefox46 --- fixed

People

(Reporter: bugzilla, Assigned: mcmanus)


Details

(Whiteboard: [necko-active])

Attachments

(7 files, 2 obsolete files)

I'm getting the Content Encoding Error when I visit:
http://sharpcast.supportportal.com/Portal/Home.aspx

I thought that bug 357958 fixed it.

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20061218 Minefield/3.0a1 ID:2006121804 [cairo]
WFM with Mozilla/5.0 (Windows; U; Windows NT 5.1; sv-SE; rv:1.9a2pre) Gecko/20070101 Minefield/3.0a2pre

The fix for bug 357958 was checked in 2006-12-19 07:03, so a build from 12-18 will not have that fix.
I was using a proxy when I got this error. If I turn off the proxy, the problem goes away. Is this still a Mozilla issue then?
I can reproduce the Content Encoding Error every time I turn on the proxy.
Attached file http log
The response looks like this:

3732[11a5368]:   HTTP/1.0 200 OK
3732[11a5368]:   Cache-Control: private
3732[11a5368]:   Date: Mon, 08 Jan 2007 11:19:12 GMT
3732[11a5368]:   Content-Type: text/html; charset=utf-8
3732[11a5368]:   Server: Microsoft-IIS/6.0
3732[11a5368]:   MicrosoftOfficeWebServer: 5.0_Pub
3732[11a5368]:   X-Powered-By: ASP.NET
3732[11a5368]:   X-AspNet-Version: 1.1.4322
3732[11a5368]:   Content-Encoding: gzip
3732[11a5368]:   Vary: Accept-Encoding
3732[11a5368]:   Transfer-Encoding: chunked
3732[11a5368]:   X-Cache: MISS from u04.opasia.dk
3732[11a5368]:   Proxy-Connection: close

Note that unlike bug 357958 this is a _supported_ encoding.

Also note the lack of a Content-Length header, because the response is chunked.

Then your log shows:

0[1024368]: nsHttpChannel::OnDataAvailable [this=1e32188 request=2391310 offset=0 count=1383]

for that channel and no more data.

My log for loading this page without a proxy shows:

-1209119872[92de548]: nsHttpChannel::OnDataAvailable [this=9c6cc60 request=9a436c8 offset=0 count=2734]

Note the difference in size -- 2734 bytes vs 1383.

So the proxy is rearranging things _somehow_.  You'd have to look at the exact bytes on the wire to tell how, but I'd guess that what the proxy is giving us is either not all the data, or an already-gunzipped version of the data, or something similar.

So this is almost certainly invalid.
I can see the same "Content Encoding Error" when trying to open a KB article from Microsoft: http://support.microsoft.com/?kbid=316503

The HTTP response header looks similar to the one given above, but I'm not sure if this is really a proxy-only error. Running IE in parallel, I'm able to open the page without a problem; only Firefox is affected. I'll attach my HTTP log. The proxy software we are using is squid/2.5.STABLE3.
Attachment #320885 - Attachment mime type: application/octet-stream → text/plain
Does IE send Accept-Encoding when using a proxy?
(In reply to comment #8)
> Does IE send Accept-Encoding when using a proxy?

Do I have to use Wireshark, or are there other tools for Windows that let me fetch this information?
I have no idea, to be honest...
Michal can you help us get a better idea of what's going on here?
Assignee: nobody → michal
Is this INVALID/WFM? Seems to be related to specific proxy servers and maybe this one MSFT site?
(In reply to comment #2)
> I was using a proxy when I got this error. If I turn off the proxy, the problem
> goes away. Is this still a Mozilla issue then?
> I can reproduce the Content Encoding Error every time I turn on the proxy.
> 

What proxy server do you use? I tried it with squid-2.6.STABLE19-1.fc8, but I can't reproduce it.
Michal, please see my comment 6.
Oops, I've overlooked this. I'll try it with the same version...
The problem seems to be in squid. The response is:

HTTP/1.0 200 OK
Cache-Control: private
Date: Sat, 17 May 2008 07:44:06 GMT
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/6.0
MicrosoftOfficeWebServer: 5.0_Pub
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Content-Encoding: gzip
Vary: Accept-Encoding
Transfer-Encoding: chunked
X-Cache: MISS from aurora.local
Proxy-Connection: close

The Transfer-Encoding header shouldn't be used in HTTP/1.0. In this case Firefox ignores the chunked encoding: http://mxr.mozilla.org/mozilla/source/netwerk/protocol/http/src/nsHttpTransaction.cpp#856

And of course the gzip decoder then fails on the invalid data:
http://mxr.mozilla.org/mozilla/source/netwerk/streamconv/converters/nsHTTPCompressConv.cpp#415

If this is a common problem, we should probably be more tolerant and accept transfer encoding in HTTP/1.0 too. E.g. curl can handle such a response.
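
To illustrate the failure mode, here is a rough sketch; ResponseHead and usesChunkedDecoder are made-up names for illustration, not the actual Gecko types:

#include <map>
#include <string>

// Hypothetical model of the response head; header names are assumed
// to be stored lower-cased.
struct ResponseHead {
    int major = 1;
    int minor = 0;  // 0 for an HTTP/1.0 response
    std::map<std::string, std::string> headers;

    bool isChunked() const {
        auto it = headers.find("transfer-encoding");
        return it != headers.end() && it->second == "chunked";
    }
};

// Current behavior: chunked framing is honored only for HTTP/1.1 and
// later, so an HTTP/1.0 response carrying Transfer-Encoding: chunked is
// read as an unframed body, and the raw chunk-size lines end up in
// front of the gzip stream, which the decoder rejects as invalid data.
bool usesChunkedDecoder(const ResponseHead &head) {
    if (!head.isChunked())
        return false;
    return head.major > 1 || (head.major == 1 && head.minor >= 1);
}
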
(In reply to comment #16)
> If this is a common problem, we should probably be more tolerant and accept
> transfer encoding in HTTP/1.0 too. E.g. curl can handle such a response.

Can you work up a patch to tolerate this on our side? Should we file a bug against squid?


I believe this is the following bug in squid: http://www.squid-cache.org/bugs/show_bug.cgi?id=418
Flags: blocking1.9?
That patch will reintroduce bug 330214, no?  I think that would be worse than this problem, especially given that squid can fix the problem.
We should add a regression test for bug 330214.  Can our HTTP server handle that?
(In reply to comment #20)
> That patch will reintroduce bug 330214, no?  I think that would be worse than
> this problem, especially given that squid can fix the problem.

You are right, thanks for the info! Test https://bugzilla.mozilla.org/attachment.cgi?id=217164 passes successfully with the new patch.
Attachment #321829 - Attachment is obsolete: true
Attachment #321838 - Flags: review?(cbiesinger)
Attachment #321829 - Flags: review?(cbiesinger)
(In reply to comment #21)
> We should add a regression test for bug 330214.

As far as I can tell from reading the bug, I don't think so.  First, the server sends Connection: close with every response, which probably interferes with this.  Second, it prebuffers all the response data and sends it only when the full response has been written, setting the Content-Length header to however much data was written.  (Don't try writing a handler to send a 5GB response; you'll OOM far before then.  :-) )  I *think* the two of those together prevent using the server to test this.

The server was written to guarantee correctness, not to allow incorrectness or deep twiddling with the properties of responses.  Bug 396226 will be a big part of addressing this, but even ignoring that, it might be possible to add some sort of escape mechanism to handle this in cases where you really do want to play with fire.  It's possible the latter requires the former; I haven't thought about it very hard.
IIS and a couple of other web servers have a bug in that they respond with chunked encoding in certain situations even if the request was an HTTP/1.0 request. In the case of IIS this happens if dynamic compression is enabled. This means that clients behind HTTP/1.0 proxies without support for chunked encoding may get HTTP/1.0 responses with chunked encoding, even though chunked encoding doesn't exist in HTTP/1.0.

Squid-2.6 and 3.1 and later (but not 3.0) have support for chunked encoding and will properly decode such responses before forwarding them to the requesting client to work around these broken servers, even though Squid still advertises itself as HTTP/1.0.

This support was added to Squid-2.6 in squid-2.6.STABLE10 (4 Mar 2007).
Henrik, thank you very much; this is extremely helpful information.

Given that this requires IIS plus a specific squid version, and has been worked around in squid for over a year, I'm going to recommend we not touch our HTTP handling code at this stage of the release (e.g. force an RC2).  We should explore this further for the next major Firefox release.

Flags: wanted-next+
Flags: blocking1.9?
Flags: blocking1.9-
The same is true for the potential security issue from this protocol version mismatch referenced earlier.

In either case my recommendation is to blindly accept the chunked encoding if there is no Content-Length header, and to accept chunked + Content-Length (by ignoring Content-Length) only if the message is an HTTP/1.1 message.

You MAY also remove chunking in HTTP/1.0 + chunked if Content-Length matches exactly the message length (including chunking). Such a Content-Length header is quite likely added by a proxy on a cache hit.

Most origin servers sending both chunked + Content-Length send the original length before chunking, and such responses may get truncated in HTTP/1.0 proxies. And I am not aware of any servers being broken in both ways at the same time (both sending a wrong Content-Length and chunked in response to HTTP/1.0).
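
As a sketch of that decision table (Framing and pickFraming are made-up names, building on the hypothetical ResponseHead from the earlier sketch):

enum class Framing { Chunked, ContentLength, UntilClose };

// Chunked is accepted blindly when there is no Content-Length; when
// both are present, chunked wins only for HTTP/1.1 messages. (The
// optional HTTP/1.0 de-chunking case, where Content-Length matches the
// full chunked message length, is not shown.)
Framing pickFraming(const ResponseHead &head, bool hasContentLength) {
    bool http11 = head.major == 1 && head.minor >= 1;
    if (head.isChunked() && !hasContentLength)
        return Framing::Chunked;
    if (head.isChunked() && http11)
        return Framing::Chunked;        // ignore Content-Length entirely
    if (hasContentLength)
        return Framing::ContentLength;  // HTTP/1.0 + both: trust Content-Length
    return Framing::UntilClose;
}
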

How about when the server reports HTTP/1.x?

I get the "The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression." error after the following transmission (from Live HTTP headers extension):

http://web.archive.org/web/20070927200805/http://www.sljfaq.org/w/Small_ke

GET /web/20070927200805/http://www.sljfaq.org/w/Small_ke HTTP/1.1
Host: web.archive.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.x 503 Service Temporarily Unavailable
Date: Tue, 19 Aug 2008 11:30:29 GMT
Server: Apache/2.2.4 (Ubuntu) PHP/5.2.3-1ubuntu6 mod_perl/2.0.2 Perl/v5.8.8
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Type: text/html; charset=UTF-8
Connection: close
Transfer-Encoding: chunked
----------------------------------------------------------
I do not have any proxy configured.
That has nothing to do with this bug.  In that case, the server is just claiming "Content-Encoding: gzip" but not sending gzipped data (which you could have verified trivially by looking at what it does send).
Attachment #321838 - Attachment is obsolete: true
Attachment #346276 - Flags: review?(cbiesinger)
Attachment #321838 - Flags: review?(cbiesinger)
(In reply to comment #26)
> You MAY also remove chunking in HTTP/1.0 + chunked if Content-Length matches
> exactly the message length (including chunking). Such Content-Length header is
> quite likely added by a proxy on a cache hit.

I've implemented it, but I'm not sure if we want this behaviour. What should we do when the Content-Length header doesn't match the real length? I'm now returning NS_ERROR_INVALID_CONTENT_ENCODING, which is obviously wrong, but of the existing error pages it is the closest one.
Attachment #346279 - Flags: review?(cbiesinger)
Good question on what to do when there is both chunked and Content-Length.

Probably best to simply follow the RFC and ignore Content-Length when there is chunked encoding, regardless of the received HTTP version, and instead flag the connection as broken, forcing it to be closed immediately after this response has been read (and discarding any already-read data after the current response, if any).
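
In sketch form, continuing with the made-up types from the earlier sketches (ConnectionState is likewise hypothetical):

struct ConnectionState {
    bool reusable = true;
};

// Chunked framing always wins, regardless of the advertised HTTP
// version. If a Content-Length was also present, the framing is
// ambiguous, so the connection is marked non-reusable and closed once
// this response has been drained; any bytes after it are discarded.
Framing resolveFraming(const ResponseHead &head, bool hasContentLength,
                       ConnectionState &conn) {
    if (head.isChunked()) {
        if (hasContentLength)
            conn.reusable = false;
        return Framing::Chunked;
    }
    return hasContentLength ? Framing::ContentLength : Framing::UntilClose;
}
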
For what it's worth, on my parents' computer FF gives a content encoding error, but we don't use any proxy servers between us and the net. Interestingly, we both seem to have the same version of FF, but it works fine for me and not for them. The site is http://www.qantas.com.au. An HTTP log can be produced if required.
(In reply to comment #32)
> The site is http://www.qantas.com.au. An HTTP log can be produced if required.

It would be great if you could provide an HTTP log and also a TCP dump from Wireshark.
Browsing directly to the Qantas website by starting Firefox with http://www.qantas.com.au on the command line does _not_ show the Content Encoding Error.
Browsing to the Qantas website via a Google search result does show the Content Encoding Error.
I have attached two HTTP logs related to the Qantas website.  Browsing directly to the website via the command line shows it correctly.  Browsing to it via a Google search result shows the Content Encoding Error.  Pressing Ctrl+F5 then shows the website correctly.

I believe this behaviour started for me with the update to Firefox 3.0.8 on Windows Vista 32-bit, as I had used the Qantas website via a Google search result to book airline tickets on March 10 using Firefox 3.0.7.  The Firefox 3.0.8 update was installed on March 31.
Steve, those logs seem to have nothing to do with this bug.   It looks like bug 247334 to me: the first log shows a 200 response with "Content-Encoding: gzip", while the second log shows a 206 response for the same page but without the Content-Encoding header.  That's a bug in the server, basically.
I get this problem when using Firefox through a proxy at work. It doesn't happen at all in Internet Explorer, and it doesn't happen with FF at home where I'm not going through a proxy.

There needs to be some kind of workaround for this problem (even if it is server-related), as many popular sites are not working in Firefox but are working in IE.

These sites definitely produce the problem (and all work with IE):
- bing.com search results
  http://www.bing.com/search?q=test&go=&form=QBLH&filt=all&qs=n
- Lonely Planet Thorntree forums
  http://www.lonelyplanet.com/thorntree/thread.jspa?threadID=1577008&tstart=105
- O'Reilly Safari books viewer
  http://my.safaribooksonline.com/0130130567
Interestingly, when I look up those three sites (bing/LP/Safari) on http://www.netcraft.com/ they're all served via Akamai, so maybe that's introducing some issue that affects Firefox?

Also, Ctrl+F5 doesn't resolve the problem for me.

I don't get the problem with Qantas or http://support.microsoft.com/?kbid=316503 (mentioned above) -- perhaps they've changed their web servers since then.
I have recently encountered the Content Encoding Error with FF 3.6 in the following way:

Under normal conditions the page can be loaded correctly 4-5 times (simply by pressing the reload button) before the error occurs. Then it gives a Content Encoding Error saying: "The page you are trying to view cannot be shown because it uses an invalid or unsupported form of compression." If I then empty the Firefox cache via Tools -> Clear Recent History and press reload, the page loads correctly again many times. Thus, even if the problem is not encountered on a particular page at a particular time, it may still persist.
Comment on attachment 346279 [details] [diff] [review]
new patch that allows HTTP/1.0 with Content-Length header

Going to assume these are no longer relevant; please re-request if I'm wrong.
Attachment #346279 - Flags: review?(cbiesinger)
This still looks relevant to me...
We still occasionally hear of this error, and for web compat I think I want this change (to allow chunked with 1.0 responses). For C-L/chunked mismatches we'll treat it as suggested in comment 31.
Summary: Content Encoding Error → Chunked Encoding with HTTP/1.0
Whiteboard: [necko-active]
Attachment #8709576 - Flags: review?(daniel)
Assignee: michal.novotny → mcmanus
Status: NEW → ASSIGNED
Comment on attachment 8709576 [details] [diff] [review]
Allow h/1.0 chunked encodings

Review of attachment 8709576 [details] [diff] [review]:
-----------------------------------------------------------------

Apart from the little nit in the comment, it looks straightforward and neat.

There's a minor change in behavior then for HTTP/1.1 sites that send both C-L and chunked encoding, as they're now marked for no reuse, but I imagine that's very rare and very unlikely to cause any harm.

::: netwerk/protocol/http/nsHttpTransaction.cpp
@@ +1627,5 @@
>              // handle chunked encoding here, so we'll know immediately when
>              // we're done with the socket.  please note that _all_ other
>              // decoding is done when the channel receives the content data
>              // so as not to block the socket transport thread too much.
>              // ignore chunked responses from HTTP/1.0 servers and proxies.

I figure this line in the comment should be removed or modified to reflect the new logic.
Attachment #8709576 - Flags: review?(daniel) → review+
(In reply to Daniel Stenberg [:bagder] from comment #46)

> 
> There's a minor change in behavior then for HTTP/1.1 sites that send both
> C-L and chunked encoding as they're now marked for no-reuse, but I imagine
> that's very rare and very unlikely to cause any harm.

that's an intentional change. Not reusing the connection is a typical mitigation when we are forced to do "liberal in what you receive" web-compat things, and that applies to HTTP/1.1 also.

thanks!
https://hg.mozilla.org/mozilla-central/rev/4f5d63ec3097
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
I've just noticed this change today while tracking down a difference in behavior between Chrome and Firefox with F-Secure Internet Gatekeeper. It looks like this software changes the HTTP version on chunked content to HTTP/1.0 while still keeping the Transfer-Encoding header and keeping the content chunked. According to my research (http://noxxi.de/research/http-evader-explained-3-chunked.html), this change makes Firefox, together with Safari, the only browsers that behave this (wrong) way, while all the others (Edge, IE, Chrome, Opera) properly interpret the content as not chunked. While I could understand Firefox working around broken systems the same way the others do, I don't think being almost the only browser to work around it this way is a good idea.