Open Bug 864851 Opened 12 years ago Updated 2 years ago

Treat certain resources served "text/plain" as if they had no Content-Type header

Categories

(Firefox :: File Handling, defect)

Tracking

()

People

(Reporter: dosergio, Unassigned)

References

(Blocks 1 open bug, )

Details

(Whiteboard: DUPEME)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130409194949
Steps to reproduce: I was going to download an .apk file (Android package).
Actual results: Firefox treated the file as text/plain.
Expected results: Firefox should offer to download/save the file instead. I think download/save MUST be the behaviour Firefox users expect when the MIME type is unknown and the file is binary. If the file is plain text, OK, but if it is binary it is a disaster to see Firefox showing its bytes on the screen.
The extension on the URL doesn't matter; what matters is what MIME type the server sets in its Content-Type header. Do you have a link to the file, or can you check what Content-Type the server is sending?
Flags: needinfo?(dosergio)
I saw that the Content-Type Firefox shows is text/plain, but the question is: why do Internet Explorer and Chrome not show the file bytes and offer to download it, while Firefox shows the bytes? Is it a server problem? If so, why do IE and Chrome behave differently?
Flags: needinfo?(dosergio)
Here is a link to test. My workaround was to create a new php page only to serve it to firefox - non mobile - users... http://sitesbr.net/arquivos/download/contentexplorer.apk
In the new page, I use the "Content-disposition: attachment ..." header, so I had to do this to make it work for Firefox users: http://software.sitesbr.net/apkdown/contentexplorer.apk
I have an add-on that shows HTTP headers; here is the response report:
Response: HTTP/1.1 200 OK
Date: Tue, 23 Apr 2013 18:29:58 GMT
Server: Apache
Last-Modified: Fri, 28 Dec 2012 23:46:13 GMT
Etag: "12658b-2cfe4-4d1f244c69b40"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Keep-Alive: timeout=5, max=500
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain
The server isn't sending a content-type, so you get the default HTTP content type of Content-Type: text/plain. The server needs to be configured to set the correct MIME type for apk [1] or the MIME type for binary data [2]. The attachment disposition isn't needed as long as you give it the correct MIME type.
[1] application/vnd.android.package-archive
[2] application/octet-stream
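The server-side fix described above amounts to mapping the file extension to a specific MIME type and falling back to the generic binary type rather than text/plain. A minimal sketch in Python, assuming a hypothetical lookup table and helper name (`MIME_BY_EXTENSION` and `content_type_for` are illustrative, not part of any server's API):

```python
import os

# Hypothetical extension table; a real server would read this from its
# own configuration rather than hard-coding it.
MIME_BY_EXTENSION = {
    ".apk": "application/vnd.android.package-archive",
    ".txt": "text/plain",
}


def content_type_for(filename: str) -> str:
    """Pick a Content-Type value for a file based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    # Unknown extensions fall back to the generic binary type, so that
    # browsers offer a download instead of rendering bytes as text.
    return MIME_BY_EXTENSION.get(extension, "application/octet-stream")
```

With this mapping, `content_type_for("contentexplorer.apk")` yields the APK type from [1], and an unrecognized extension yields the binary type from [2].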
Status: UNCONFIRMED → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID
And the reason some other browsers don't show it inline is that they do content sniffing for magic bits, and trigger different modes if they find things they "take as not text". This causes problems in other situations, though.
I disagree, Mr. Cork. "Downloading a txt file is LESS dumb than showing binary data in a browser screen." A text file downloaded as an attachment can still be read after the download. But a binary file on the screen serves no purpose and leaves you disappointed with the browser.
The server says that it sends a text file; firefox supports showing text, so it renders it. The problem is the server configuration, not firefox.
I know you are right, but for the average internet users, this behaviour makes firefox "less smart" than the other brands that do content sniffing as you said.
I insist that this issue should be examined more deeply because although the server "says" it is text/plain, the content is clearly binary... Why can't Firefox be "trained", as Thunderbird is to recognize spam, to identify when a text/plain is NOT real text/plain and offer an INTELLIGENT solution to the problem? Even better: Firefox could offer the user a dialog to download or view the corrupt text/plain file...
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
It does seem a bit dumb of Firefox to display on screen what is obviously a binary file. I think it would be better to check first that a 'text/plain' file really is a text file...
Component: Untriaged → File Handling
Product: Firefox → Core
Summary: firefox reads file with unknow mime types → firefox reads file with unknown mime types
Whiteboard: DUPEME
According to the MIME Sniffing Standard, .apk files should be sniffed as "application/zip" if the Content-Type header is absent.
Blocks: mimesniff
Status: UNCONFIRMED → NEW
Ever confirmed: true
PS: My server admin has corrected the problem, so the link I gave at the top of this thread will NOT show the problem anymore.
This sounds like it might stem from an old version of Apache setting a Content-Type header with "text/plain" if it doesn't recognize the MIME type of a file (which the server probably decided based on the file extension). According to the MIME Sniffing Standard, the browser should treat this situation as if the type were unknown, and eventually determine this to be an "application/zip" file.
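The "application/zip" outcome mentioned above follows from the fact that an .apk file is a ZIP archive and starts with the ZIP local file header signature. A simplified sketch in Python of that single check; the function name is hypothetical, and the real algorithm in the MIME Sniffing Standard tests many more patterns and can also fall through to text/plain:

```python
# ZIP local file header signature: the bytes 50 4B 03 04 ("PK\x03\x04").
ZIP_SIGNATURE = b"PK\x03\x04"


def sniff_unknown_type(resource_header: bytes) -> str:
    """Simplified stand-in for sniffing a resource whose type is unknown.

    Only the ZIP signature is checked here; everything else is reported
    as generic binary data, which is an assumption of this sketch.
    """
    if resource_header.startswith(ZIP_SIGNATURE):
        return "application/zip"
    return "application/octet-stream"
```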
OS: Windows 7 → All
Hardware: x86_64 → All
Summary: firefox reads file with unknown mime types → Treat certain resources served "text/plain" as if they had no Content-Type header
Product: Core → Firefox
Version: 20 Branch → unspecified
Was this a change made to the MIME Sniffing standard? I thought we implemented all logic in it. I'm not sure we should be sniffing more resources than we currently do. The more we can trust Content-Type the better.
(In reply to Anne (:annevk) from comment #16)
> Was this a change made to the MIME Sniffing standard? I thought we implemented all logic in it. I'm not sure we should be sniffing more resources than we currently do. The more we can trust Content-Type the better.
I'm not sure what you're responding to, but both the Apache bug workaround and the sniffing for application/zip appeared in the MIME Sniffing standard while it was still an IETF draft written by Adam Barth (based off Google Chrome's behavior). Also, from my modern reading of this bug, it seems this scenario is one in which there is no Content-Type header to trust.

We received a webcompat report about a site serving an image file with content-type: text/plain; charset=utf-8. This response renders as plain text (i.e. you see a lot of binary output) in Firefox, but Chrome renders the image. During research, I stumbled upon this issue.

From the original comment:

Firefox should offer to download/save the file instead.

In the MIME Sniffing spec, I found two potentially relevant paragraphs: § 7.2. Sniffing a mislabeled binary resource, and § 8.2. Sniffing in an image context. I am not sure if the second paragraph applies, but given 7.2, I don't think we should be displaying the content; we should instead treat it as application/octet-stream. The response in question starts with 0xFF 0xD8 0xFF, which is none of the plain-text-"approved" resource headers but is the byte pattern for a JPEG image, which is even listed in the mimesniff spec.
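For reference, the text-or-binary rules from § 7.2 can be sketched as follows in Python. The byte ranges and byte order marks are taken from the MIME Sniffing Standard; the function and constant names are illustrative only, and this is not how any browser implements it:

```python
# "Binary data bytes" per the MIME Sniffing Standard:
# 0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F.
BINARY_DATA_BYTES = frozenset(
    list(range(0x00, 0x09))
    + [0x0B]
    + list(range(0x0E, 0x1B))
    + list(range(0x1C, 0x20))
)

# UTF-16BE, UTF-16LE and UTF-8 byte order marks.
BYTE_ORDER_MARKS = (b"\xFE\xFF", b"\xFF\xFE", b"\xEF\xBB\xBF")


def text_or_binary(resource_header: bytes) -> str:
    """Classify a resource labelled text/plain as text or binary."""
    # A leading byte order mark means the resource is treated as text.
    if resource_header.startswith(BYTE_ORDER_MARKS):
        return "text/plain"
    # Any binary data byte in the header marks the resource as binary.
    if any(byte in BINARY_DATA_BYTES for byte in resource_header):
        return "application/octet-stream"
    return "text/plain"
```

Note that the three bytes 0xFF 0xD8 0xFF are not themselves binary data bytes under this definition; a JPEG is classified as binary because of the bytes that follow, such as the 0x00 in a typical JFIF header.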

Is this me misunderstanding the spec or Firefox doing something wrong? :)

text/plain; charset=utf-8 is one of the MIME types which should nonetheless be sniffed due to the Apache bug. bz, do you know why that's not implemented?

Flags: needinfo?(bzbarsky)

Added the URL from the mentioned WebCompat report for easier reproducing.

Annevk, when you say "Apache bug" I assume you mean "the issue that Apache httpd used to have a content-type default"? Wasn't that using an ISO-8859-1 encoding?

EDIT: OK, I see that this goes back to the first versions of draft-abarth-mime-sniff - I'm still puzzled why this was needed though.

Different versions of Apache had different behavior there; some had "ISO-8859-1" and some had "iso-8859-1" and some had "UTF-8". We have logic for all of those to treat as binary. See https://searchfox.org/mozilla-central/rev/60c4067b1cbb0f94d7dc2d7cdfa27ed579817fee/netwerk/streamconv/converters/nsUnknownDecoder.cpp#851-853

The type mentioned in comment 19 has "utf-8", which is not a thing Apache ever used to do, so we don't have special-casing for it.

Past that, what we implement has pretty much nothing to do with the mimesniff standard in general in all sorts of ways, and the same for other browsers. But in this specific instance, the relevant part of mimesniff is https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm step 2.2 which sets the "check-for-apache-bug flag" in exactly the cases our current code sets it in, and in particular not for "text/plain; charset=utf-8".
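To make the case sensitivity concrete, here is a small Python sketch of the exact-match test behind the check-for-apache-bug flag. The four values are the ones listed in the MIME Sniffing Standard; the function name is hypothetical and this is not Firefox's code:

```python
# Content-Type values that old Apache versions sent by default. The
# comparison is byte-for-byte, so letter case and spacing both matter.
APACHE_BUG_CONTENT_TYPES = frozenset({
    b"text/plain",
    b"text/plain; charset=ISO-8859-1",
    b"text/plain; charset=iso-8859-1",
    b"text/plain; charset=UTF-8",
})


def check_for_apache_bug(content_type_value: bytes) -> bool:
    """Return True if the header value exactly matches a known default."""
    return content_type_value in APACHE_BUG_CONTENT_TYPES
```

Under this test, `b"text/plain; charset=UTF-8"` sets the flag, while the lowercase `b"text/plain; charset=utf-8"` from the WebCompat report does not.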

Past all that, the issue with the headers from comment 5 is the "Content-Encoding: gzip". Obviously if the data is gzipped, trying to detect whether that (gzipped) data is binary or not is pointless; it's binary. So our code doesn't do that. See https://searchfox.org/mozilla-central/rev/60c4067b1cbb0f94d7dc2d7cdfa27ed579817fee/netwerk/streamconv/converters/nsUnknownDecoder.cpp#857-866. Since the buggy Apache configuration did not include sending Content-Encoding, this wasn't an issue for the Apache workaround. I can't tell whether https://mimesniff.spec.whatwg.org/#rules-for-text-or-binary is operating after undoing content-encodings or not, or whether the issue was even considered.
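The gating described above can be sketched in Python as follows. This is a hypothetical illustration of the decision, not a transcription of nsUnknownDecoder.cpp; the function name and the header-dictionary shape are assumptions:

```python
from typing import Mapping

# The Content-Type values treated as possible Apache defaults.
APACHE_DEFAULT_TYPES = frozenset({
    "text/plain",
    "text/plain; charset=ISO-8859-1",
    "text/plain; charset=iso-8859-1",
    "text/plain; charset=UTF-8",
})


def may_sniff_as_binary(headers: Mapping[str, str]) -> bool:
    """Decide whether a text/plain response is eligible for binary sniffing."""
    # Header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get("content-type") not in APACHE_DEFAULT_TYPES:
        return False
    # Encoded (for example gzipped) bodies always look binary, so the
    # check is skipped whenever a Content-Encoding header is present.
    return not normalized.get("content-encoding")
```

For the headers in comment 5, the presence of "Content-Encoding: gzip" makes this return False, so the response is rendered as text.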

At this point, this bug is probably fairly useless, since it's "covering" at least two completely separate issues: (1) what to do with "text/plain" when Content-Encoding is set, (2) what to do with "text/plain; charset=utf-8".

Flags: needinfo?(bzbarsky)
See Also: → 1527955
Severity: normal → S3