Closed Bug 92140 Opened 24 years ago Closed 24 years ago

Download dialog of "file of type: text/html, text/html from <URL>"

Categories

(Core :: Networking: HTTP, defect, P2)

x86
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla0.9.2

People

(Reporter: vickeryj, Assigned: darin.moz)

References

()

Details

(Keywords: topembed, Whiteboard: r=gagan, sr=dougt, verified-on-trunk)

Attachments

(3 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.2+) Gecko/20010716
BuildID: 2001071608

This bug has shown up in, I believe, all post-Mozilla 0.8 builds. It only happens when using an HTTP proxy. Once certain pages have been cached (http://www.math.grin.edu is one), attempting to visit them again results in a popup window that reads:

    You have chosen to download a file of type: text/html, text/html from http://www.math.grin.edu/
    What should Mozilla do with this file?

It then gives the standard prompt for unregistered MIME types. Clearing the browser cache allows the page to load normally. I have only verified this bug with a Perl proxy written in house, so it is possible that it is due to our bug; however, we don't get this error with any other browser.

Reproducible: Always

Steps to Reproduce:
1. Turn on an HTTP proxy
2. Visit http://www.math.grin.edu/
3. Hit reload

Actual Results: The popup window described above appears.

Expected Results: The page should load.
Could you attach the actual HTTP response from your proxy?
I modified the proxy server to print the HTTP::Response object that it returns to the web server as a string. The above attachment is a log of this. However, the proxy server does not send any response when the page has been cached by the browser, which is the only case when I get the error. In order to get the response to print I had to clear my cache.

josh
Hmm:

    Content-Length: 10132
    Content-Type: text/html
    Content-Type: text/html; charset="ISO-8859-1"

Could that be causing us to get confused as to what the real type is? Is there a way to get the proxy to only send one content-type header to see whether that helps?
Mozilla is doing the Right Thing; this is the proxy's problem. The last paragraph of RFC 2616, Section 4.2, details how the two Content-Type headers are being combined into a comma-separated list. Because the BNF definition of Content-Type in RFC 2616, Section 14.17 does not allow multiple values for Content-Type, Mozilla tries to parse the entire string 'text/html, text/html; charset="ISO-8859-1"' as a single MIME type and, of course, does not recognize it. In short, the proxy should only ever send one Content-Type header per document.
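To illustrate the comment above, here is a minimal Python sketch of that coalescing step and of why the merged value no longer parses as one media type. It is not Mozilla's code: the function names `coalesce` and `is_single_media_type` are hypothetical, and the regular expression is a simplified reading of the media-type grammar, not a complete one.

```python
import re

def coalesce(header_lines):
    """Combine repeated header fields into one comma-separated value,
    in the spirit of RFC 2616, Section 4.2 (hypothetical helper)."""
    merged = {}
    for name, value in header_lines:
        key = name.lower()  # header field names are case-insensitive
        if key in merged:
            merged[key] = merged[key] + ", " + value
        else:
            merged[key] = value
    return merged

# Simplified media-type grammar: type "/" subtype *( ";" attribute "=" value )
TOKEN = r"[!#$%&'*+.^_`|~0-9A-Za-z-]+"
MEDIA_TYPE = re.compile(
    rf'^{TOKEN}/{TOKEN}(\s*;\s*{TOKEN}=(?:{TOKEN}|"[^"]*"))*$'
)

def is_single_media_type(value):
    """True if the value looks like exactly one media type."""
    return MEDIA_TYPE.match(value) is not None

# The three header lines quoted earlier in this bug.
raw = [
    ("Content-Length", "10132"),
    ("Content-Type", "text/html"),
    ("Content-Type", 'text/html; charset="ISO-8859-1"'),
]
merged = coalesce(raw)
print(merged["content-type"])
print(is_single_media_type(merged["content-type"]))
```

Each Content-Type value is acceptable on its own; only the comma-joined result fails the single-media-type check, which matches the type shown in the download dialog.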
hmm, I tried to get the proxy to stop sending two Content-Type headers, however I couldn't seem to do it (I didn't write the proxy), but then I got the proxy to give me a dump of the response it was getting from the web server when it passed on the request it got from Mozilla:

    Remote Server Response:
    HTTP/1.1 200 OK
    Connection: close
    Date: Tue, 24 Jul 2001 21:42:05 GMT
    Accept-Ranges: bytes
    Server: Apache/1.3.19 (Unix) (Red-Hat/Linux) mod_ssl/2.8.1 OpenSSL/0.9.6 DAV/1.0.2 PHP/4.0.4pl1 mod_perl/1.24_01
    Content-Length: 8640
    Content-Type: text/html
    Content-Type: text/html; charset="ISO-8859-1"

So it looks like Apache is returning two Content-Types, though perhaps I am mistaken. Would this then be an Apache bug/misconfiguration? I am working on a simple perl script to illustrate this, which I will attach soon.

josh
It seems that Perl's HTTP and LWP modules may be at fault here. I grabbed the HTTP request from Mozilla and, using telnet, sent it to the web server. Here is the partial response:

    HTTP/1.1 200 OK
    Date: Tue, 24 Jul 2001 22:19:45 GMT
    Server: Apache/1.3.19 (Unix) (Red-Hat/Linux) mod_ssl/2.8.1 OpenSSL/0.9.6 DAV/1.0.2 PHP/4.0.4pl1 mod_perl/1.24_01
    Last-Modified: Tue, 19 Jun 2001 21:10:34 GMT
    ETag: "fe56a-21c0-3b2fbfca"
    Accept-Ranges: bytes
    Content-Length: 8640
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!-- XEmacs: This is an -*- XML -*- metadocument. -->

Just one content type! It looks like Perl is pulling information from the meta tags and placing it in the headers of the response object. Still, it seems that Mozilla is able to handle this double content type when it is not retrieving documents from the cache. Can someone advise me on what I should do now?

josh
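As a rough illustration of the guess above, the following Python sketch shows how a client library that promotes meta tags into response headers could end up with two Content-Type entries. It is not the Perl modules' actual code: the class `MetaHeaderCollector`, the function `response_headers`, and the sample page body are all hypothetical, and the `X-Meta-` prefix is only modeled on the header names reported elsewhere in this bug.

```python
from html.parser import HTMLParser

class MetaHeaderCollector(HTMLParser):
    """Collects <meta> tags as if they were HTTP headers (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.extra_headers = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        content = attributes.get("content")
        if content is None:
            return
        if attributes.get("http-equiv"):
            # <meta http-equiv="X" content="Y"> becomes header "X: Y"
            self.extra_headers.append((attributes["http-equiv"], content))
        elif attributes.get("name"):
            # <meta name="x" content="Y"> becomes header "X-Meta-X: Y"
            header = "X-Meta-" + attributes["name"].capitalize()
            self.extra_headers.append((header, content))

def response_headers(server_headers, body):
    """Return the server's headers followed by headers derived from the body."""
    collector = MetaHeaderCollector()
    collector.feed(body)
    return list(server_headers) + collector.extra_headers

# What the server actually sent (one Content-Type), plus an assumed page body.
server_headers = [("Content-Length", "8640"), ("Content-Type", "text/html")]
body = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
    '<meta name="description" content="front-door page">'
    "</head><body></body></html>"
)

for name, value in response_headers(server_headers, body):
    print(f"{name}: {value}")
```

A proxy that forwards such a header list unchanged would hand the browser two Content-Type lines even though the origin server sent only one.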
Interesting... the headers that script gets are nothing like the headers that wget or telnet get for the same site... The headers below are not received by wget or telnet:

    Content-Type: text/html; charset="ISO-8859-1"
    ETag: "fe56a-21c0-3b2fbfca"
    Connection: close
    Client-Date: Tue, 24 Jul 2001 22:26:07 GMT
    Client-Peer: 132.161.33.160:80
    Title: Department of Mathematics and Computer Science, Grinnell College
    X-Meta-Description: front-door page for the Department of Mathematics and Computer Science at Grinnell College
    X-Meta-Keywords: Grinnell, mathematics, computer science, department, front door

looks like interesting useragent sniffing on the server end...
I just love 166 character summaries...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: qawanted
Ignore what I just said. Josh has the right guess about what's going on. Looks like our cached copy of the document stores a parsed version of the HTTP headers. That's why this problem appears. If we stored the actual headers, things would be OK. setting status to NEW, adding qawanted keyword for cache people to decide whether they want to deal with this... In the meantime, I would contact the perl developers who implemented those packages and tell them that their code is creating invalid headers.
Darin, this looks like it's related to how HTTP stores its headers for cached responses.
Assignee: gordon → neeti
Component: Networking: Cache → Networking: HTTP
QA Contact: tever → benc
MacOS 2001-07-27-21.0.9.2

I got this same dialog while playing with a section of the networking functional test we are running for the next Netscape release.

STEPS:
1. Go to: http://bubblegum.mcom.com:4321/re-direct
2. View page, then go forward and back.

When you go back, it displays the same dialog. Apologies to people not on the Netscape network; I don't know much about this server and don't have time to port a test case right now...

Oddly, if I go to about:cache, I can find the cache entry, but cannot get information on it by clicking on the link.
Another URL that exhibits this behaviour is http://www.planetdreamcast.com/psoworld/. Either click on an article or go to another page entirely, and then attempt to go back. Same type of dialog.
Changing subject to generalize the problem more. Until we know the cause, then we might need to break off the cases...
Keywords: qawanted
Summary: Mozilla does not know what to do with a "file of type: text/html, text/html from http://www.math.grin.edu/" once it is in the cache while running with a Proxy Server. → Download dialog of "file of type: text/html, text/html from <URL>"
Keywords: qawanted
Summary: Download dialog of "file of type: text/html, text/html from <URL>" → Mozilla does not know what to do with a "file of type: text/html, text/html from http://www.math.grin.edu/" once it is in the cache while running with a Proxy Server.
Another site showing the problem. Visit any of the links, then attempt to go back: http://www.aerostich.com/
moz caching strikes again - repairing the damage...
Keywords: qawanted
Summary: Mozilla does not know what to do with a "file of type: text/html, text/html from http://www.math.grin.edu/" once it is in the cache while running with a Proxy Server. → Download dialog of "file of type: text/html, text/html from <URL>"
-> darin
-> attempting for moz 0.9.4
Priority: -- → P3
Target Milestone: --- → mozilla0.9.4
-> me (for real this time)
Assignee: neeti → darin
to solve this "bug" we could simply not coalesce multiple Content-Type headers. even though the spec says that we should coalesce Content-Type headers, perhaps we have no choice but not to... time to see what IE and NS4x do. upping priority and adding 4xp keyword.
Status: NEW → ASSIGNED
Keywords: 4xp
Priority: P3 → P2
IE and NS4x both properly detect the content-type as text/html. Attaching a patch which will make us honor only the last content-type header.
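The behavior described above can be sketched in Python as follows. This is only an illustration of the "last Content-Type wins" idea, not the attached patch: the names `SINGLETON_HEADERS` and `merge_headers` are hypothetical, and the Cache-Control lines in the sample input are invented to show that other repeated headers are still coalesced.

```python
# Headers for which a later value replaces an earlier one instead of being
# appended. Only Content-Type is named in this bug; the set is an assumption.
SINGLETON_HEADERS = {"content-type"}

def merge_headers(header_lines):
    """Coalesce repeated headers into comma-separated values, except that
    for singleton headers only the last value is kept (hypothetical helper)."""
    merged = {}
    for name, value in header_lines:
        key = name.lower()  # header field names are case-insensitive
        if key in merged and key not in SINGLETON_HEADERS:
            merged[key] += ", " + value
        else:
            merged[key] = value
    return merged

raw = [
    ("Content-Type", "text/html"),
    ("Cache-Control", "no-cache"),   # invented for illustration
    ("Cache-Control", "no-store"),   # invented for illustration
    ("Content-Type", 'text/html; charset="ISO-8859-1"'),
]
merged = merge_headers(raw)
print(merged["content-type"])
print(merged["cache-control"])
```

With this rule the stored Content-Type is a single, parseable media type, so a cached copy and a freshly fetched copy would resolve to the same type.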
Attached patch v1.0 simple fix
Keywords: patch, topembed
r=gagan
Whiteboard: r=gagan, sr=?
sr=dougt
fixed-on-trunk
Whiteboard: r=gagan, sr=? → r=gagan, sr=dougt, fixed-on-trunk
Target Milestone: mozilla0.9.4 → ---
bubblegum is down, so I've asked someone to fix that so I can test.
bubblegum is up. benc@netscape.com could you verify this?
Target Milestone: --- → mozilla0.9.2
Easily reproducible on the planetdreamcast site using 7/25 builds. Works on latest trunk builds.

verified on trunk:
Win NT4 2001081303
Mac os9 2001081408
Linux rh6 2001081514
Whiteboard: r=gagan, sr=dougt, fixed-on-trunk → r=gagan, sr=dougt, verified-on-trunk
Hmph. I'm all for flogging the web server and proxy guys. This shouldn't be patched. We definitely don't want to go the route of IE and make any HTML page look good, even if it isn't compliant. (Or in this case, HTTP headers.) At the very least, try to figure out the reason for allowing multiple Content-Types, as specified in the RFC. We conform to RFC standards, not broken web sites/servers/proxies.
fixed-on-branch
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED