User Agent: Mozilla/5.0 (Windows NT 5.1; rv:12.0a1) Gecko/20120111 Firefox/12.0a1
Build ID: 20120111031049
Steps to reproduce:
I opened this link http://labs.google.co.in/smschannels/browse
Firefox displays some weird characters instead of the web page.
I could view the web page in other browsers like Midori and Chrome, but Firefox failed to render it.
Confirming on FF 9.0.1, Fedora 16.
1476[170f780]: http response [
1476[170f780]: HTTP/1.1 200 OK
1476[170f780]: Content-Length: 13152
1476[170f780]: Content-Encoding: gzip,gzip
1476[170f780]: Expires: Fri, 01 Jan 1990 00:00:00 GMT
1476[170f780]: Vary: Accept-Encoding
1476[170f780]: Pragma: no-cache
1476[170f780]: Cache-Control: no-cache, must-revalidate
1476[170f780]: Date: Thu, 12 Jan 2012 15:32:33 GMT
1476[170f780]: Content-Type: text/html; charset=UTF-8
1476[170f780]: Set-Cookie: PREF=ID=ea367be4b644d81b:TM=1326382352:LM=1326382353:S=XXbCmH-J20YsDI04; expires=Sat, 11-Jan-2014 15:32:33 GMT; path=/; domain=.google.co.in
1476[170f780]: X-Content-Type-Options: nosniff
1476[170f780]: Server: mic_server
1476[170f780]: X-XSS-Protection: 1; mode=block
1476[170f780]: X-Frame-Options: SAMEORIGIN
The server sends "Content-Encoding: gzip,gzip"
Even if the client doesn't advertise gzip in Accept-Encoding, the server still sends "Content-Encoding: gzip".
This is a Tech Evangelism bug, but Opera and Chrome seem to handle this; IE8 fails as well.
Moving to networking:http for a decision from the networking developers.
Is the server actually gzipping twice when it sends "Content-Encoding: gzip,gzip"?
Related to Bug 205156?
wget --header="accept-encoding: gzip" --user-agent="Mozilla/5.0 (Windows NT 6.1; rv:12.0a1) Gecko/20120111 Firefox/12.0a1 SeaMonkey/2.9a1" -S http://labs.google.co.in/smschannels/browse
The result is double-gzipped content.
Yeah, we should probably just handle that...
It's not hard to hook up two chained gzip converters on a technical level; the question is whether there are sites sending "gzip,gzip" but only gzipped once.
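One way to answer the "gzipped once or twice?" question for a saved response body is to peel gzip layers until the data no longer starts with the gzip magic bytes. This is only a rough sketch in Python, not anything from Necko; the function name count_gzip_layers and the max_layers guard are made up for illustration.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def count_gzip_layers(body: bytes, max_layers: int = 16) -> int:
    """Return how many nested gzip layers wrap `body` (hypothetical helper).

    max_layers is an arbitrary guard so a hostile body cannot loop forever.
    """
    layers = 0
    while layers < max_layers and body.startswith(GZIP_MAGIC):
        try:
            body = gzip.decompress(body)
        except (OSError, EOFError, zlib.error):
            # Looked like gzip but did not decode; stop counting here.
            break
        layers += 1
    return layers


once = gzip.compress(b"<html>hello</html>")
twice = gzip.compress(once)
print(count_gzip_layers(once), count_gzip_layers(twice))
```

A body that reports 1 layer while the header says "gzip,gzip" would be the problematic case asked about above.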
(In reply to Boris Zbarsky (:bz) from comment #3)
> Is the server actually gzipping twice when it sends "Content-Encoding:
I like to look at httparchive.org for crud like this.
It has a DB of 4.5 million HTTP transactions.
About 1 million of them have a content-encoding at all, and exactly 1 of them has more than one encoding, and that's "identity, identity" from some CGI (http://www.toggo.de/fcgi-bin/fcgi_application)
So I am content this is a dumb and rare header value. Barring any other data, I think the standards-compliant interpretation ought to win, which in this case is the double gzip.
Sounds good. Let's do that.
Created attachment 595189 [details] [diff] [review]
Process the list of Content-Encodings as per RFC 2616 (though I capped it at 16 to avoid true silliness). Adds a test case for "gzip" and "gzip, gzip"; I manually confirmed this makes http://labs.google.co.in/smschannels/browse work.
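For readers following along, the RFC 2616 rule is that Content-Encoding lists the codings in the order they were applied, so a client undoes them in reverse order. The following is a minimal Python sketch of that idea, not the attached patch: decode_body and MAX_ENCODINGS are illustrative names, and rejecting a header with more than 16 entries is an assumption about what the cap could mean.

```python
import gzip
import zlib

MAX_ENCODINGS = 16  # mirrors the cap mentioned above; the behavior past it is assumed


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Undo every coding named in a Content-Encoding header value."""
    encodings = [e.strip().lower() for e in content_encoding.split(",") if e.strip()]
    if len(encodings) > MAX_ENCODINGS:
        raise ValueError("too many content encodings")
    # Codings are listed in the order applied, so decode last-to-first.
    for enc in reversed(encodings):
        if enc in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif enc == "deflate":
            body = zlib.decompress(body)
        elif enc == "identity":
            continue
        else:
            raise ValueError(f"unknown encoding: {enc}")
    return body


page = b"<html>sms channels</html>"
double = gzip.compress(gzip.compress(page))
print(decode_body(double, "gzip,gzip") == page)
```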
sent off to try as well.
*** Bug 205156 has been marked as a duplicate of this bug. ***
Comment on attachment 595189 [details] [diff] [review]
Why change the return value from AsyncConvertData to NS_ERROR_UNEXPECTED on error? Just return rv there.
Also, keep the logging for "unknown encoding"?
r=me with that
Note that this probably makes zip-bombing (http://en.wikipedia.org/wiki/Zip_bomb) easier but I don't see that as a significant issue given that even a single layer of gzip can cause OOM for us.
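To illustrate the OOM concern in general terms: a decoder can bound how much output it is willing to produce per layer. This is just a Python sketch of that idea using zlib's max_length argument; gunzip_limited and the limit values are hypothetical, and nothing here describes what the patch or Necko actually does.

```python
import gzip
import zlib


def gunzip_limited(data: bytes, max_output: int) -> bytes:
    """Decompress one gzip layer, refusing to produce more than max_output bytes."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    # Ask for one byte more than allowed so an overflow can be detected.
    out = decomp.decompress(data, max_output + 1)
    if len(out) > max_output:
        raise ValueError("decompressed size exceeds limit")
    return out


# A small stand-in for a bomb: 1 MB of zeros compresses to roughly 1 KB.
bomb = gzip.compress(b"\0" * 1_000_000)
try:
    gunzip_limited(bomb, 64 * 1024)
except ValueError as err:
    print("rejected:", err)
```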
*** Bug 479266 has been marked as a duplicate of this bug. ***