864851 - Treat certain resources served "text/plain" as if they had no Content-Type header

Reporter

Description

12 years ago

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0
Build ID: 20130409194949

Steps to reproduce:
I was going to download an .apk file (Android package).

Actual results:
Firefox treated the file as text/plain.

Expected results:
Firefox should offer to download/save the file instead. I think download/save file MUST be the behaviour Firefox users expect when the MIME type is unknown and the file is binary. If the file is plain text, OK, but if it is binary it is a disaster to see Firefox showing its bytes on the screen.

Cork

Comment 1

12 years ago

The extension on the URL doesn't matter; what matters is what MIME type the server sets in its Content-Type header. Do you have a link to the file, or can you check what Content-Type the server is sending?

Flags: needinfo?(dosergio)

Sergio

Reporter

Comment 2

12 years ago

I saw that the content-type Firefox shows is text/plain, but the question is: why do Internet Explorer and Chrome not show the file bytes and instead offer to download the file, while Firefox shows the bytes? Is it a server problem? So, why do IE and Chrome show different behaviour?

Flags: needinfo?(dosergio)

Sergio

Reporter

Comment 3

12 years ago

Here is a link to test. My workaround was to create a new PHP page only to serve it to Firefox (non-mobile) users... http://sitesbr.net/arquivos/download/contentexplorer.apk

Sergio

Reporter

Comment 4

12 years ago

In the new page, I use the "Content-disposition: attachment ..." header, so I had to do this to work with firefox users: http://software.sitesbr.net/apkdown/contentexplorer.apk

Sergio

Reporter

Comment 5

12 years ago

I have an add-on that shows HTTP headers; here is the response report:

Response:
HTTP/1.1 200 OK
Date: Tue, 23 Apr 2013 18:29:58 GMT
Server: Apache
Last-Modified: Fri, 28 Dec 2012 23:46:13 GMT
Etag: "12658b-2cfe4-4d1f244c69b40"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Keep-Alive: timeout=5, max=500
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/plain

Cork

Comment 6

12 years ago

The server isn't sending a content-type, so you get the default HTTP content type of Content-Type: text/plain. The server needs to be configured to set the correct mimetype for apk[1] or the mimetype for binary data[2]. attachment isn't needed as long as you give it the correct mimetype.

[1] application/vnd.android.package-archive
[2] application/octet-stream
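To illustrate the server-side fix suggested in this comment, here is a minimal Python sketch of how a download handler might choose a Content-Type from the file extension and fall back to the generic binary type. The function name `content_type_for` and the lookup table are hypothetical and not taken from any real server; only the two MIME types come from the comment above.

```python
import os

# Hypothetical extension table for illustration only.
# The .apk entry and the binary fallback are the types named in the comment.
MIME_BY_EXTENSION = {
    ".apk": "application/vnd.android.package-archive",
    ".txt": "text/plain",
}

def content_type_for(filename: str) -> str:
    """Pick a Content-Type by extension, defaulting to generic binary data."""
    _, ext = os.path.splitext(filename.lower())
    return MIME_BY_EXTENSION.get(ext, "application/octet-stream")
```

With a mapping like this, an unrecognised binary file is labelled application/octet-stream rather than text/plain, so the browser offers to save it instead of rendering it.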

Status: UNCONFIRMED → RESOLVED

Closed: 12 years ago

Resolution: --- → INVALID

Cork

Comment 7

12 years ago

And the reason some other browsers don't show it inline is because they do content sniffing for magic bits, and trigger different modes if they find things they "take as not text". This causes problems in other situations instead, though.
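The kind of content sniffing described here can be sketched roughly as follows. This is an assumption-laden illustration based on one reading of the text-or-binary rules in the MIME Sniffing Standard, not the code of any browser; the function names are hypothetical.

```python
def is_binary_data_byte(b: int) -> bool:
    """Byte ranges the MIME Sniffing Standard classifies as binary data bytes."""
    return (0x00 <= b <= 0x08 or b == 0x0B
            or 0x0E <= b <= 0x1A or 0x1C <= b <= 0x1F)

def sniff_text_or_binary(body: bytes) -> str:
    """Classify a text/plain-labelled body as text or generic binary."""
    header = body[:1445]  # the standard only inspects a short resource header
    # A UTF-16 or UTF-8 byte order mark is taken as a sign of real text.
    if header.startswith((b"\xfe\xff", b"\xff\xfe", b"\xef\xbb\xbf")):
        return "text/plain"
    # No control bytes of the "binary" kind: keep treating it as text.
    if not any(is_binary_data_byte(b) for b in header):
        return "text/plain"
    return "application/octet-stream"
```

Under this sketch, a body that begins with the ZIP signature would be classified as application/octet-stream because it contains the bytes 0x03 and 0x04, while ordinary ASCII text stays text/plain.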

Sergio

Reporter

Comment 8

12 years ago

I disagree, Mr. Cork. "Downloading a txt file is LESS dumb than showing binary data on a browser screen." A text file downloaded as an attachment can still be read after download. But a binary file on the screen serves no purpose and only leaves you disappointed with the browser.

Cork

Comment 9

12 years ago

The server says that it sends a text file; firefox supports showing text, so it renders it. The problem is the server configuration, not firefox.

Sergio

Reporter

Comment 10

12 years ago

I know you are right, but for the average internet users, this behaviour makes firefox "less smart" than the other brands that do content sniffing as you said.

Sergio

Reporter

Comment 11

12 years ago

I insist that this issue should be looked at more deeply because, although the server "says" it is text/plain, the content is clearly binary... Why can't Firefox be "trained", as Thunderbird is to recognize spam, to identify when a text/plain is NOT real text/plain and offer an INTELLIGENT solution to the problem? Even better: Firefox could offer a dialog letting the user download or view the corrupt text/plain file...

Status: RESOLVED → UNCONFIRMED

Resolution: INVALID → ---

mjh563

Comment 12

12 years ago

It does seem a bit dumb of Firefox to display on screen what is obviously a binary file. I think it would be better to check first that a 'text/plain' file really is a text file...

Liz Henry (:lizzard) (relman/hg->git project)

Updated

12 years ago

Component: Untriaged → File Handling

Product: Firefox → Core

Summary: firefox reads file with unknow mime types → firefox reads file with unknown mime types

Whiteboard: DUPEME

Masatoshi Kimura [:emk]

Comment 13

12 years ago

According to the MIME Sniffing Standard, .apk files should be sniffed as "application/zip" if the Content-Type header is absent.
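An .apk file is a ZIP archive, which is why sniffing would arrive at "application/zip". A minimal sketch of that signature check, assuming the standard ZIP local file header magic bytes; the function name `sniff_archive` is hypothetical:

```python
from typing import Optional

# "PK\x03\x04" is the ZIP local file header signature.
ZIP_SIGNATURE = b"PK\x03\x04"

def sniff_archive(header: bytes) -> Optional[str]:
    """Return "application/zip" if the bytes start with the ZIP signature."""
    if header.startswith(ZIP_SIGNATURE):
        return "application/zip"
    return None  # no archive type recognised by this sketch
```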

Blocks: mimesniff

Status: UNCONFIRMED → NEW

Ever confirmed: true

Sergio

Reporter

Comment 14

12 years ago

PS: My server admin has corrected the problem, so the link I gave at the top of this thread will NOT show the problem anymore.

Gordon P. Hemsley [:GPHemsley]

Comment 15

12 years ago

This sounds like it might stem from an old version of Apache setting a Content-Type header with "text/plain" if it doesn't recognize the MIME type of a file (which the server probably decided based on the file extension). According to the MIME Sniffing Standard, the browser should treat this situation as if the type were unknown, and eventually determine this to be an "application/zip" file.

Gordon P. Hemsley [:GPHemsley]

Updated

12 years ago

OS: Windows 7 → All

Hardware: x86_64 → All

Summary: firefox reads file with unknown mime types → Treat certain resources served "text/plain" as if they had no Content-Type header

Benjamin Smedberg

Updated

9 years ago

Product: Core → Firefox

Version: 20 Branch → unspecified

Anne (:annevk)

Comment 16

7 years ago

Was this a change made to the MIME Sniffing standard? I thought we implemented all logic in it. I'm not sure we should be sniffing more resources than we currently do. The more we can trust Content-Type the better.

Gordon P. Hemsley [:GPHemsley]

Comment 17

7 years ago

(In reply to Anne (:annevk) from comment #16)
> Was this a change made to the MIME Sniffing standard? I thought we
> implemented all logic in it. I'm not sure we should be sniffing more
> resources than we currently do. The more we can trust Content-Type the
> better.

I'm not sure what you're responding to, but both the Apache bug workaround and the sniffing for application/zip appeared in the MIME Sniffing standard while it was still an IETF draft written by Adam Barth (based off Google Chrome's behavior).

Also, from my modern reading of this bug, it seems this scenario is one in which there is no Content-Type header to trust.

Dennis Schubert [:denschub]

Comment 18

6 years ago

We received a webcompat report about a site serving an image file with content-type: text/plain; charset=utf-8. This response renders as plain text (i.e. you see a lot of binary output) in Firefox, but Chrome renders the image. During research, I stumbled upon this issue.

From the original comment:

Firefox should offer to download/save the file instead.

In the MIME Sniffing spec, I found two potentially relevant paragraphs: § 7.2. Sniffing a mislabeled binary resource, and § 8.2. Sniffing in an image context. I am not sure if the second paragraph applies, but given 7.2, I don't think we should be displaying the content; instead, we should treat it as application/octet-stream. The response in question starts with 0xFF 0xD8 0xFF, which is none of the plain text-"approved" resource headers, but is the byte pattern for a JPEG image, which is even listed in the mimesniff spec.

Is this me misunderstanding the spec or Firefox doing something wrong? :)
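For reference, the JPEG byte pattern mentioned above can be checked with a few lines of Python. This is only a sketch of signature matching in an image context; the function name `sniff_image` is hypothetical, and the PNG entry is added from the well-known PNG signature rather than from this report.

```python
from typing import Optional

# (signature, MIME type) pairs; the JPEG bytes are the ones quoted in the comment.
IMAGE_SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),  # assumption: added for illustration
]

def sniff_image(header: bytes) -> Optional[str]:
    """Return an image MIME type if the bytes start with a known signature."""
    for signature, mime_type in IMAGE_SIGNATURES:
        if header.startswith(signature):
            return mime_type
    return None
```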

See Also: → https://github.com/webcompat/web-bugs/issues/24812

Anne (:annevk)

Comment 19

6 years ago

text/plain; charset=utf-8 is one of the MIME types which should nonetheless be sniffed due to the Apache bug. bz, do you know why that's not implemented?

Flags: needinfo?(bzbarsky)

Dennis Schubert [:denschub]

Comment 20

6 years ago

Added the URL from the mentioned WebCompat report for easier reproducing.

URL: https://mst.trrrending.today/560ff47d...

Julian Reschke

Comment 21

6 years ago

Edited

Annevk, when you say "Apache bug" I assume you mean "the issue that Apache httpd used to have a content-type default"? Wasn't that using an ISO-8859-1 encoding?

EDIT: OK, I see that this goes back to the first versions of draft-abarth-mime-sniff - I'm still puzzled why this was needed though.

Boris Zbarsky [:bzbarsky]

Comment 22

6 years ago

Different versions of Apache had different behavior there; some had "ISO-8859-1" and some had "iso-8859-1" and some had "UTF-8". We have logic for all of those to treat as binary. See https://searchfox.org/mozilla-central/rev/60c4067b1cbb0f94d7dc2d7cdfa27ed579817fee/netwerk/streamconv/converters/nsUnknownDecoder.cpp#851-853

The type mentioned in comment 19 has "utf-8", which is not a thing Apache ever used to do, so we don't have special-casing for it.

Past that, what we implement has pretty much nothing to do with the mimesniff standard in general in all sorts of ways, and the same for other browsers. But in this specific instance, the relevant part of mimesniff is https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm step 2.2 which sets the "check-for-apache-bug flag" in exactly the cases our current code sets it in, and in particular not for "text/plain; charset=utf-8".
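The exact-match nature of that flag can be sketched as below. This follows one reading of the mimesniff step linked above and the charset variants listed in this comment; it is not the nsUnknownDecoder code, and the function name is hypothetical.

```python
# Content-Type values that, as exact byte sequences, are attributed to the
# old Apache default. The comparison is deliberately case-sensitive.
APACHE_BUG_TYPES = frozenset({
    b"text/plain",
    b"text/plain; charset=ISO-8859-1",
    b"text/plain; charset=iso-8859-1",
    b"text/plain; charset=UTF-8",
})

def check_for_apache_bug(supplied_type: bytes) -> bool:
    """True only if the header value matches one of the listed values exactly."""
    return supplied_type in APACHE_BUG_TYPES
```

Because the match is exact, `b"text/plain; charset=utf-8"` (lowercase) does not set the flag, which is the case from comment 19.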

Past all that, the issue with the headers from comment 5 is the "Content-Encoding: gzip". Obviously if the data is gzipped, trying to detect whether that (gzipped) data is binary or not is pointless; it's binary. So our code doesn't do that. See https://searchfox.org/mozilla-central/rev/60c4067b1cbb0f94d7dc2d7cdfa27ed579817fee/netwerk/streamconv/converters/nsUnknownDecoder.cpp#857-866. Since the buggy Apache configuration did not include sending Content-Encoding, this wasn't an issue for the Apache workaround. I can't tell whether https://mimesniff.spec.whatwg.org/#rules-for-text-or-binary is operating after undoing content-encodings or not, or whether the issue was even considered.
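A short Python demonstration of why sniffing the still-compressed bytes is pointless: even ordinary text, once gzipped, begins with the gzip magic bytes 0x1F 0x8B, and 0x1F falls in the range the MIME Sniffing Standard treats as binary. The helper name is hypothetical.

```python
import gzip

def contains_binary_data_byte(data: bytes) -> bool:
    """True if any byte is in the ranges mimesniff classifies as binary."""
    return any(
        b <= 0x08 or b == 0x0B or 0x0E <= b <= 0x1A or 0x1C <= b <= 0x1F
        for b in data
    )

plain = b"Just some ordinary text.\n"
compressed = gzip.compress(plain)

# The decoded text has no binary data bytes, but the gzipped form always
# does, because every gzip stream starts with 0x1F 0x8B.
plain_looks_binary = contains_binary_data_byte(plain)
compressed_looks_binary = contains_binary_data_byte(compressed)
```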

At this point, this bug is probably fairly useless, since it's "covering" at least two completely separate issues: (1) what to do with "text/plain" when Content-Encoding is set, (2) what to do with "text/plain; charset=utf-8".

Flags: needinfo?(bzbarsky)

Anne (:annevk)

Updated

5 years ago

Updated

2 years ago

Severity: normal → S3