User Agent: Mozilla/5.0 (Windows NT 5.1; rv:12.0a1) Gecko/20120111 Firefox/12.0a1
Build ID: 20120111031049

Steps to reproduce:
I opened this link: http://labs.google.co.in/smschannels/browse

Actual results:
Firefox displays some weird characters instead of the web page.

Expected results:
I could view the web page in other browsers like Midori and Chrome, but Firefox failed to render it.
Confirming on FF 9.0.1, Fedora16.
1476[170f780]: http response [
1476[170f780]: HTTP/1.1 200 OK
1476[170f780]: Content-Length: 13152
1476[170f780]: Content-Encoding: gzip,gzip
1476[170f780]: Expires: Fri, 01 Jan 1990 00:00:00 GMT
1476[170f780]: Vary: Accept-Encoding
1476[170f780]: Pragma: no-cache
1476[170f780]: Cache-Control: no-cache, must-revalidate
1476[170f780]: Date: Thu, 12 Jan 2012 15:32:33 GMT
1476[170f780]: Content-Type: text/html; charset=UTF-8
1476[170f780]: Set-Cookie: PREF=ID=ea367be4b644d81b:TM=1326382352:LM=1326382353:S=XXbCmH-J20YsDI04; expires=Sat, 11-Jan-2014 15:32:33 GMT; path=/; domain=.google.co.in
1476[170f780]: X-Content-Type-Options: nosniff
1476[170f780]: Server: mic_server
1476[170f780]: X-XSS-Protection: 1; mode=block
1476[170f780]: X-Frame-Options: SAMEORIGIN
1476[170f780]: ]

The server sends "Content-Encoding: gzip,gzip". If the client doesn't accept the gzip content-encoding, the server still sends "Content-Encoding: gzip".

This is a Tech Evangelism bug, but Opera and Chrome seem to handle this; IE8 fails as well. Moving to networking:http for a decision from the networking developers.
Is the server actually gzipping twice when it sends "Content-Encoding: gzip,gzip"?
Related to Bug 205156?
wget --header="accept-encoding: gzip" --user-agent="Mozilla/5.0 (Windows NT 6.1; rv:12.0a1) Gecko/20120111 Firefox/12.0a1 SeaMonkey/2.9a1" -S http://labs.google.co.in/smschannels/browse

The result is double-gzipped content.
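One way to check a saved response body for double compression is to peel off gzip layers while the data still begins with the gzip magic bytes. The following is a minimal Python sketch, not what wget or Firefox does; the helper name `count_gzip_layers` and the sample payload are made up for illustration.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def count_gzip_layers(body, max_layers=16):
    """Count how many nested gzip layers wrap `body` (hypothetical helper)."""
    layers = 0
    while layers < max_layers and body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)  # strip one layer
        layers += 1
    return layers


# Simulate a server that compresses the same page twice.
html = b"<html><body>hello</body></html>"
double = gzip.compress(gzip.compress(html))
layer_count = count_gzip_layers(double)
```

For a body saved from the wget command above, a count of 2 would match the "gzip,gzip" header.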
Yeah, we should probably just handle that... It's not hard to hook up two chained gzip converters on a technical level; the question is whether there are sites sending "gzip,gzip" but only gzipped once.
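To illustrate what chaining two gzip converters means, here is a rough Python sketch that pipes each network chunk through two streaming decompressors in sequence. It is only an analogy for the Necko stream converters, not their implementation; `make_gzip_stage` and `feed_chain` are hypothetical names.

```python
import gzip
import zlib


def make_gzip_stage():
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    return zlib.decompressobj(16 + zlib.MAX_WBITS)


def feed_chain(stages, chunk):
    """Push one chunk through every stage; the output of one feeds the next."""
    for stage in stages:
        chunk = stage.decompress(chunk)
        if not chunk:
            break  # this stage has no output yet, so nothing to pass on
    return chunk


original = b"<html>hello</html>" * 100
wire = gzip.compress(gzip.compress(original))  # double-gzipped body

stages = [make_gzip_stage(), make_gzip_stage()]  # outermost layer first
decoded = b"".join(
    feed_chain(stages, wire[i:i + 7]) for i in range(0, len(wire), 7)
)
```

If a site sent "gzip,gzip" but compressed only once, the second stage in this sketch would raise `zlib.error` on the plain HTML, which is the failure mode the question above is about.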
(In reply to Boris Zbarsky (:bz) from comment #3)
> Is the server actually gzipping twice when it sends "Content-Encoding:
> gzip,gzip"?

I like to look at httparchive.org for crud like this. It has a DB of 4.5 million HTTP transactions. About 1 million of them have a Content-Encoding at all, and exactly 1 of them has more than 1 encoding, and that's "identity, identity" from some cgi (http://www.toggo.de/fcgi-bin/fcgi_application).

So I am content this is a dumb and rare header value. Barring any other data, I think the standards-compliant interpretation ought to win, which in this case is the double gzip.
Sounds good. Let's do that.
Created attachment 595189
patch 0

Process the list of Content-Encodings as per RFC 2616 (though I capped it at 16 to avoid true silliness). Added a test case for "gzip" and "gzip, gzip", and manually confirmed that this makes http://labs.google.co.in/smschannels/browse work. Sent off to try as well.
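For readers unfamiliar with the RFC 2616 rule: the header lists codings in the order they were applied, so a receiver undoes them in reverse order. The Python sketch below shows that interpretation; it is not the patch itself. The names `parse_content_encoding` and `decode_body` are hypothetical, and rejecting a header with more than 16 codings is an assumption, since the comment above only says the list is capped at 16.

```python
import gzip
import zlib
from typing import List

MAX_ENCODINGS = 16  # cap mentioned in the patch description


def parse_content_encoding(value: str) -> List[str]:
    """Split a Content-Encoding value into lowercase coding tokens."""
    return [tok.strip().lower() for tok in value.split(",") if tok.strip()]


def decode_body(body: bytes, header_value: str) -> bytes:
    codings = parse_content_encoding(header_value)
    if len(codings) > MAX_ENCODINGS:
        # Assumption: treat an over-long list as an error.
        raise ValueError("too many content codings")
    # Codings are listed in the order applied, so undo them last-first.
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "deflate":
            body = zlib.decompress(body)
        elif coding == "identity":
            continue
        else:
            raise ValueError("unknown encoding: " + coding)
    return body


page = b"<html>sms channels</html>"
once = decode_body(gzip.compress(page), "gzip")
twice = decode_body(gzip.compress(gzip.compress(page)), "gzip, gzip")
```

Both calls return the original page, mirroring the two test cases ("gzip" and "gzip, gzip") mentioned above.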
Comment on attachment 595189
patch 0

Why change the return value from AsyncConvertData to NS_ERROR_UNEXPECTED on error? Just return rv there. Also, keep the logging for "unknown encoding"?

r=me with that.
Note that this probably makes zip-bombing (http://en.wikipedia.org/wiki/Zip_bomb) easier but I don't see that as a significant issue given that even a single layer of gzip can cause OOM for us.
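To give a feel for the expansion involved, the small Python experiment below compresses a block of zero bytes once and then a second time. It only illustrates the size ratios; the 10 MiB payload is an arbitrary choice and nothing here reflects Firefox internals.

```python
import gzip

raw = bytes(10 * 1024 * 1024)  # 10 MiB of zero bytes
once = gzip.compress(raw)      # single gzip layer
twice = gzip.compress(once)    # second layer over the first

print("raw bytes:        ", len(raw))
print("one gzip layer:   ", len(once))
print("two gzip layers:  ", len(twice))
```

The single layer is already orders of magnitude smaller than the data it expands to, and the second layer shrinks the wire size further.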