Closed Bug 205156 Opened 21 years ago Closed 12 years ago

"Content-Encoding: gzip, gzip" not handled well

Categories

(Core :: Networking, defect)

x86
Windows 2000
defect
Not set
normal

RESOLVED DUPLICATE of bug 717524

People

(Reporter: kaleida, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3) Gecko/20030312
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3) Gecko/20030312

For many Web pages including the specified URL, auto-detect logic fails.
Internet Explorer 6.0 automatically selects the correct encoding for the 
specified URL - Western European (Windows). I was not able to manually 
select the correct encoding for Mozilla, since the menu layout is very difficult to 
understand, and no help is available. I have tried all available Western codings, 
but none of them worked.

Reproducible: Always

Steps to Reproduce:
1. Open the specified URL using Internet Explorer 6.0, and the encoding will be
correct.
2. Open the specified URL using Mozilla, and the text will be unreadable due to
an incorrect character coding.

Actual Results:  
The page was rendered incorrectly

Expected Results:  
Mozilla character coding logic should be improved. Moreover, the character 
coding menu should be revised to make it more logical, and it should be 
documented either in Help or online.
Confirmed on build 2003050908 on Mac OS X 10.2.6: I wasn't able to find an
encoding that works. Switching the universal detector on or off didn't help
either. But it works ok in Safari and Internet Exploder.

When I examined the raw data sent by the webserver (with
http://webtools.mozilla.org/web-sniffer/), I saw that it was regular ISO-Latin-1
text (or a similar encoding). And when I downloaded the page to my local disk, I
could see the page without any problems.

Could it be that the webserver is inserting some bogus content-encoding header,
but only for the Mozilla user-agent ?
Summary: Character Coding Auto-Detect Does Not Work → Character Coding Auto-Detect Does Not Work
I confirm the problem with 
Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.4a) Gecko/20030401
but I do not see the reason for the problem clearly.

Some other things I saw:
That Page is not html 3.2:
<http://validator.w3.org/check?uri=http%3A%2F%2Fpcwin.101main.net%2Ftips-tricks%2Ftips-ms-dos-command.html&doctype=%28detect+automatically%29&charset=windows-1250+%28Central+Europe%29>
Page looks fine with NN4.73, IE6, OPERA6

Page-Source-view looks like page itself (more or less)

If I try to edit the page in Composer, it looks as garbled as in the
browser window.

If I open the page with IE6 and click "Work page with MOZILLA", IE6 opens a
Composer window for the page and everything looks quite normal, but I cannot
save the page!

Could some kind of "source code protection" have been applied to the file?

If I save the page with mozilla-browser Notepad shows the page 
<html><head></head><body>‹      œXmsÚÊþÎŒÿùd’¶·¼9/·I´€Ä5Pâë^g  
and so on

If I save the page with IE6 and reopen it with Notepad, I see a normal
html-source and a charset- and other information:

<META http-equiv=Content-Type content="text/html; charset=iso-8859-8"><LINK 
href="_test_PCWin Resource Center - Windows 95-98, CE &amp; NT Tips and Tricks -
MS-DOS Command Prompt-Dateien/default.css" 
rel=stylesheet><!--#include virtual="/metatag.inc" -->
<META content="MSHTML 6.00.2600.0" name=GENERATOR></HEAD>

The charset-information could not be seen in the normal IE6 Source-view!


Rainer
Status: UNCONFIRMED → NEW
Ever confirmed: true
I tried encoding "iso-8859-8", but it did not work either!
When I examine the raw-data with the web-sniffer (
http://webtools.mozilla.org/web-sniffer/view.cgi?url=http%3A%2F%2Fpcwin.101main.net%2Ftips-tricks%2Ftips-ms-dos-command.html
), then I can't see any Content-Encoding header, either at the HTTP level or in
a META tag. So it's the webserver that is playing tricks!

iso-8859-8 is Visual Hebrew, right ?
The site sends back:

1026[812e8d0]: http response [
1026[812e8d0]:   HTTP/1.1 200 OK
1026[812e8d0]:   Server: Microsoft-IIS/5.0
1026[812e8d0]:   Date: Sat, 10 May 2003 17:00:39 GMT
1026[812e8d0]:   X-Powered-By: ASP.NET
1026[812e8d0]:   Connection: close
1026[812e8d0]:   Content-Encoding: gzip, gzip
1026[812e8d0]:   Content-Type: text/html
1026[812e8d0]:   Expires: Wed, 01 Jan 1997 12:00:00 GMT
1026[812e8d0]:   Cache-Control: max-age=86400
1026[812e8d0]:   Vary: Accept-Encoding
1026[812e8d0]: ]

I severely doubt that the content is actually doubly gzipped, so I suspect this
is a server bug....

Note that we do not currently support multiple encodings, so we treat the
encoding as "gzip, gzip", which is not a value we support; then we never gunzip
it even the one time it's needed. We could probably handle this more gracefully.
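
To illustrate the difference described above, here is a minimal Python sketch (not Mozilla's actual code). It contrasts treating the whole header value as one coding name with splitting it into a comma-separated list first. The function names and the SUPPORTED set are hypothetical and chosen only for this example.

```python
# Hypothetical set of coding names a client knows how to decode.
SUPPORTED = {"gzip", "x-gzip", "deflate"}


def parse_content_codings(header_value):
    """Split a Content-Encoding value into a list of lowercase coding names."""
    return [tok.strip().lower() for tok in header_value.split(",") if tok.strip()]


def whole_value_is_supported(header_value):
    # Treats the entire value as a single coding name, so "gzip, gzip"
    # is looked up as one (unknown) name and rejected.
    return header_value.strip().lower() in SUPPORTED


def each_coding_is_supported(header_value):
    # Checks every listed coding separately.
    codings = parse_content_codings(header_value)
    return bool(codings) and all(c in SUPPORTED for c in codings)


print(whole_value_is_supported("gzip, gzip"))
print(each_coding_is_supported("gzip, gzip"))
```

With the whole-value check the response body is never handed to a gzip decoder, which matches the symptom in this bug (raw compressed bytes shown as text).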
Assignee: asa → darin
Component: Browser-General → Networking: HTTP
QA Contact: asa → httpqa
Summary: Character Coding Auto-Detect Does Not Work → "Content-Encoding: gzip, gzip" not handled well
And for future reference, Mozilla's built-in NSPR logging is much more reliable
for HTTP headers than using web-sniffer.
Whiteboard: DUPEME
Related to or DUP of bug 176222 ?
Rainer: no, that bug is unrelated.
Target Milestone: --- → Future
*** Bug 220537 has been marked as a duplicate of this bug. ***
the charset problem sounds like bug 162061, duplicate?
This is biting me as well. I see it on www.alitalia.com (which is a reasonably
high-profile site).

Any solutions other than using about:config to disable all content encodings?
(In reply to comment #5)
> The site sends back:
> 
> 1026[812e8d0]: http response [
> 1026[812e8d0]:   HTTP/1.1 200 OK
> [...]
> 1026[812e8d0]:   Content-Encoding: gzip, gzip
> 
> I severely doubt that the content is actually doubly gzipped, so I suspect  this
> is a server bug....
> 
> Note that we do not currently support multiple encodings, so we treat the
> encoding as "gzip, gzip", which is not a value we support; then we never gunzip
> it even the one time it's needed. We could probably handle this more gracefully.

Actually, what I see in Ethereal is not "Content-Encoding: gzip, gzip" but two
"Content-Encoding: gzip" lines, which are probably legitimate:

HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Fri, 15 Oct 2004 01:41:46 GMT
X-Powered-By: ASP.NET
Connection: close
Content-Encoding: gzip
Content-Encoding: gzip
Content-Type: text/html
Expires: Wed, 01 Jan 1997 12:00:00 GMT
Cache-Control: max-age=0
Vary: Accept-Encoding

so is this still a server bug? Even if it is, it's probably going to become more
serious as ASP.NET becomes more common...
>Actually, what I see in Ethereal is not "Content-Encoding: gzip, gzip" but two
>"Content-Encoding: gzip" lines, which are probably legitimate:

those two are equivalent, according to RFC 2616.
so, if the content is not double encoded, then our decoder should be able to
detect that and it should then probably just pass the data on through
unmodified.  that might resolve this bug.

that said, i don't think we actually invoke the content decoder more than once,
so perhaps it is true that the content is double compressed.

when multiple content encodings are present we are supposed to create a stack of
decoders.  we can do that fairly easily given how stream converters work.
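
The "stack of decoders" idea, combined with passing data through when it is not actually encoded again, could look roughly like the Python sketch below. This is an illustration under stated assumptions, not how Necko's stream converters are implemented: decode_body and decode_gzip_tolerant are hypothetical names, and the check for the gzip magic bytes (0x1f 0x8b) is a heuristic that would misfire on an unencoded body that happens to start with those two bytes.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_gzip_tolerant(data):
    """Gunzip data, or return it unchanged if it does not look like gzip."""
    if not data.startswith(GZIP_MAGIC):
        return data  # header claimed gzip, but this layer is not compressed
    return gzip.decompress(data)


def decode_body(body, content_encoding):
    """Undo each listed content coding, last-applied first."""
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in the order they were applied,
    # so they are removed in reverse order.
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = decode_gzip_tolerant(body)
        elif coding == "identity":
            continue
        else:
            raise ValueError("unsupported content coding: " + coding)
    return body


page = b"<html><body>hello</body></html>"
once = gzip.compress(page)
twice = gzip.compress(once)

# Header says "gzip, gzip" but the body was compressed only once.
print(decode_body(once, "gzip, gzip") == page)
# Header says "gzip, gzip" and the body really was compressed twice.
print(decode_body(twice, "gzip, gzip") == page)
```

Both cases end with the original page, so a server that sends the header twice by mistake and a server that really double-compresses are handled by the same loop.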
Workaround: set network.http.accept-encoding to null in prefs.js.
 (The default is "gzip,deflate", which makes Mozilla send "Accept-Encoding: gzip,deflate".)
*** Bug 319564 has been marked as a duplicate of this bug. ***
Hi, I reported Bug 319564 (duplicate of this). 
IMHO Firefox should just ignore duplicate headers before parsing the content like other programs do. For example IE, curl, wget...

In the tests I ran, the content was gzipped and the Web server was passing the same header "Content-Encoding: gzip" twice. That caused Firefox not to display the content at all.

When I used Firefox Live HTTP headers, I could see:
Content-Encoding: gzip, gzip

When I did the same with other programs (curl, IE) it showed like:
Content-Encoding: gzip
Content-Encoding: gzip
those two ways are equivalent... (see RFC 2616)
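
As a rough illustration of that equivalence, the Python sketch below folds repeated header lines into a single comma-separated value, which is why one tool can show "gzip, gzip" while another shows two separate lines. fold_headers is a hypothetical helper written for this example. RFC 2616 only allows this folding for fields defined as comma-separated lists, so a real client would need exceptions (Set-Cookie being the usual one).

```python
def fold_headers(header_lines):
    """Combine repeated header fields into one comma-separated value.

    Field names are compared case-insensitively and values keep the
    order in which the lines were received.
    """
    folded = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()
        value = value.strip()
        if key in folded:
            folded[key] = folded[key] + ", " + value
        else:
            folded[key] = value
    return folded


received = [
    "Content-Encoding: gzip",
    "Content-Encoding: gzip",
    "Content-Type: text/html",
]
print(fold_headers(received)["content-encoding"])
```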
*** Bug 288518 has been marked as a duplicate of this bug. ***
-> default owner
Assignee: darin → nobody
Component: Networking: HTTP → Networking
QA Contact: networking.http → networking
Target Milestone: Future → ---
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Whiteboard: DUPEME