Closed Bug 67940 Opened 23 years ago Closed 20 years ago
For application/octet-stream, set MIME type from extension/data
Sometimes a webserver gives the MIME type "application/octet-stream" for a file that has a defined MIME type (for example, I often get octet-stream for PDF files). It would be nice if Mozilla would, for octet-stream MIME types, try to guess the actual MIME type from the file extension.
This would also be useful in mail/news because certain email clients do not attach the correct MIME type to a file. For example, when I receive PNG files I often have to save them to disk and then open them again with Mozilla, because mail will not display them inline if they have the application/octet-stream MIME type.
This bug is very annoying for me. I use a fax->mail gateway that sends me
    Content-Type: application/octet-stream; name="990087634_C3G01F01.TIF"
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename="990087634_C3G01F01.TIF"
Even if I configure TIFF in Helper Apps, I still get a normal download dialog, without even the TIFF helper app prefilled. I guess they use octet-stream because some other app would otherwise choke; I don't know. neeti, any hints on how I could fix this?
Changing SUMMARY: s/guess/set/, because we know the extensions of the mimetypes - you can set them in the helper app dialog.
Summary: For application/octet-stream, guess MIME type from extension → For application/octet-stream, set MIME type from extension
Open Networking bugs, qa=tever -> qa to me.
QA Contact: tever → benc
A more general description of this bug is in bug 66677 (for example, HTML attachments that arrive as text/plain).
I received mail with the following relevant headers (according to show message source):
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="----_=_NextPart_000_01C12F4F.C6F96700"
    Content-Length: 3705
    This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible.
    ------_=_NextPart_000_01C12F4F.C6F96700
    Content-Type: text/plain; charset="iso-8859-1"
    ... text message...
    ------_=_NextPart_000_01C12F4F.C6F96700
    Content-Type: application/octet-stream; name="DAVIS-T33549-PASCALET.htm"
    Content-Disposition: attachment; filename="DAVIS-T33549-PASCALET.htm"
    ... HTML source...
Quite reasonably, Mozilla doesn't display the HTML page inline, as its disposition is attachment. If I select the attachment and choose Open, I am presented with a dialog box which proclaims
    You have chosen to download a file of type "Hyper Text Markup Language" [text/html] from imap...
    What should Mozilla do with this file?
    Open using <no application specified>
    Save this file to Disk
It appears to have interpreted the file type from the extension, but doesn't offer the option of displaying the attachment internally (or doing this automatically, since I did say I wanted it opened).
Scott: you're being bitten by bug 78943 and its byblows
Here's a preliminary patch to stimulate discussion. If people feel this is the right approach, I'd like to reimplement it a bit more cleanly and fix some other code that wants access to the Content-Disposition filename to go through the new interface it creates. This patch ignores a "Content-Type: application/octet-stream" sent by the server on the grounds that such a header is as useless as sending no "Content-Type" header whatsoever. When faced with such unknown content (i.e., a missing "Content-Type" header or a "stupid" content type like "*/*" or "application/octet-stream"), it *first* tries a filename given in a "Content-Disposition" header. If that gives an extension that maps to a useful MIME type, it uses that. Otherwise, it falls back to trying to derive a MIME type from a file extension in the URI. This is under-tested, but with this patch Mozilla now works as expected when accessing attachments from Microsoft Outlook Web Access. In this case, the attachment is sent by the server with "Content-Type: application/octet-stream" but with an appropriate filename in a "Content-Disposition" header. With this patch, if there's a matching extension in the MIME types for that content disposition filename, Mozilla will get it right. (Hooray!) Comments? Also, let me note that bug 164996 is really a duplicate of this, but they're both assigned to different people, so I'm hesitant to change anything.
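For readers who want the fallback order from the patch description in concrete form, here is a minimal sketch in Python rather than Mozilla's C++. It is not the attached patch: the function names `effective_content_type` and `type_from_name`, and the small `EXTENSION_TYPES` table standing in for the helper-app extension mappings, are all assumptions made for illustration.

```python
import posixpath
from email.message import Message
from urllib.parse import urlsplit

# Content types the patch description treats as carrying no information.
USELESS_TYPES = {"", "*/*", "application/octet-stream"}

# Hypothetical stand-in for the extension -> MIME type mappings.
EXTENSION_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".tif": "image/tiff",
    ".htm": "text/html",
    ".html": "text/html",
}


def type_from_name(name):
    """Map a file name or URL path to a MIME type by extension, or None."""
    if not name:
        return None
    ext = posixpath.splitext(name)[1].lower()
    return EXTENSION_TYPES.get(ext)


def effective_content_type(headers, url):
    """Apply the order: Content-Type, Content-Disposition filename, URI."""
    msg = Message()
    for key, value in headers.items():
        msg[key] = value

    # 1. Trust the declared type unless it is one of the useless values.
    declared = (msg.get("Content-Type") or "").split(";")[0].strip().lower()
    if declared not in USELESS_TYPES:
        return declared

    # 2. Try the Content-Disposition filename. Note that get_filename()
    #    also falls back to the "name" parameter of Content-Type.
    from_disposition = type_from_name(msg.get_filename())
    if from_disposition:
        return from_disposition

    # 3. Fall back to the extension in the URI path.
    return type_from_name(urlsplit(url).path) or "application/octet-stream"
```

With these assumptions, an Outlook Web Access style response (octet-stream plus a `filename="report.pdf"` disposition) would resolve to `application/pdf`, while a response with a meaningful Content-Type is returned unchanged.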
hmm... i do like the idea of adding Content-Disposition to the list of HTTP atoms, since that'll avoid growing the atom table when that header is encountered, but as for the rest, can it move into uriloader/exthandler along with the rest of the content-disposition code?
Oh, you probably want the patch that *doesn't* crash. Here's a second try. I wouldn't be entirely surprised if there were more bugs lurking in there, though.
Attachment #99252 - Attachment is obsolete: true
Darin: It's not clear to me how to move it into uriloader. The problem is that, for HTTP channels, the MIME type should be decided by the channel's idea of the content type (from a Content-Type header) unless it's a "stupid" value like "application/octet-stream", then by the Content-Disposition filename extension, and finally by an extension calculated via GetTypeFromURI or similar. The first and the last case are already handled by nsHTTPChannel::GetContentType. We'd have to split them up if we wanted to check the content disposition in the middle there. My thought was just the opposite---the content disposition parsing code could be moved *out* of exthandler and into nsHttpResponseHead. There would be nsIHttpChannel and nsIMultiPartChannel methods that exthandler could use to fetch the relevant *parsed* Content-Disposition information it needed. However, I don't really understand the role of exthandler very well. Is it involved in every bit of content or *only* when the content has been recognized as something to be handled externally?
Kevin, the code Darin refers to lives in http://lxr.mozilla.org/seamonkey/source/uriloader/exthandler/nsExternalHelperAppService.cpp#248 (nsExternalHelperAppService::DoContent). It does the type lookup, then does the extension lookup if there's nothing registered for the type. The reason it's not a good idea to change the http channel as you did is that sites will often send HTTP content as octet-stream expecting it to be saved (that's what NS4 and IE do, no?). So they send the content with a Refresh header that redirects to a "done downloading" page or something. Handling such content inline would just mean the user has no way of getting to it.... This problem is inherent in any fix to this bug, including one via the uriloader (the helper app service would need to set the type on the channel and kick the load back to the originating window, basically.... but that encounters the same issues.)
Okay, I think I understand, but I'll have to ponder this a bit. It seems like there should be an extra channel interface method, so we have a pair like nsHTTPChannel::GetContentType and nsHTTPChannel::GuessExternalContentType. The idea would be that, at least for the HTTP channel, GetContentType would always deliver the Content-Type header, even if it was "application/octet-stream" (indicating "download suggested"). If there was no Content-Type header, it could fall back on a channel-dependent algorithm to guess the (inline) content type. For the HTTP channel, it would probably *only* check the extension from the URI and try to map that for a MIME type, as it does now. Then, GuessExternalContentType would be used from the uriloader module as a channel-dependent way to make a better guess when the content type is still unknown or something useless like "application/octet-stream". We could either do this in nsExternalAppHandler::OnStartRequest so we bring up a "what do you want to do with this content?" dialogue that has an appropriate MIME type based on the Content-Disposition filename. Or, we could do it in nsDocumentOpenInfo::DispatchContent just before we're about to pass the request off to the helperAppService. I don't think it makes a difference. Anyway, I'll think about it this week and try testing some things, including the "redirect to `download complete' page" scenario you mentioned.
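The proposed split between the two methods can be sketched as follows, again in Python and purely as an illustration. `SketchChannel`, its constructor arguments, and the `EXTENSION_TYPES` table are hypothetical; they are not Mozilla interfaces, and the real proposal would live on the channel and in the uriloader.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical extension table standing in for the MIME service.
EXTENSION_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".htm": "text/html",
}

UNHELPFUL = {None, "", "application/octet-stream"}


def _type_for(name):
    """Look up a MIME type by file extension, or return None."""
    if not name:
        return None
    return EXTENSION_TYPES.get(posixpath.splitext(name)[1].lower())


class SketchChannel:
    """Illustrative stand-in for a channel; not a real Mozilla class."""

    def __init__(self, url, content_type=None, disposition_filename=None):
        self.url = url
        self.content_type = content_type
        self.disposition_filename = disposition_filename

    def get_content_type(self):
        """Return the server's type verbatim, octet-stream included.

        Only when no type was sent at all does it guess from the URI.
        """
        if self.content_type:
            return self.content_type
        return _type_for(urlsplit(self.url).path)

    def guess_external_content_type(self):
        """Better guess for the helper-app path, using the disposition name."""
        declared = self.get_content_type()
        if declared not in UNHELPFUL:
            return declared
        return _type_for(self.disposition_filename) or declared
```

The point of keeping two entry points is that inline handling still sees `application/octet-stream` ("download suggested"), while the download dialog can show a more useful type.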
kevin: you can QI to nsIHttpChannel and call GetResponseHeader("content-type") to access the raw server specified MIME type. GetContentType on the other hand returns the guessed MIME type. so, we basically have the functionality that you're looking for... perhaps the uriloader code just needs to ask for things differently??
Whoa. Let's keep knowledge of the content-type header out of the loader ok? That's _so_ an implementation detail of http... Not to mention that other channels have the same issues as http with application/octet-stream (multipart channels come to mind).
well, my point was that it is possible to infer these things from the channel. of course, if we can find a protocol-agnostic way to do the same, then that's always better.
Not sure if it's relevant or already considered: some filesystems store the mimetype in a special field. ext2, XFS and (I think) Mac OS have this ability. Presumably, it would be a good idea to set it for downloads as well. However, I am not sure who is responsible for figuring out a good mimetype if the provider didn't give one (should we guess and set it, or leave it blank and leave it to the OS/apps to figure it out?). Also, did you consider using |file| on Unix to figure out the correct mimetype? It often has a much better guess than a filename extension lookup.
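To illustrate the idea of guessing from the data rather than the extension, here is a small Python sketch that checks a few well-known leading byte signatures. It is a simplified stand-in, not the `file` utility or its magic database, and the function names `sniff_type` and `resolve_type` are made up for this example.

```python
# A handful of well-known file signatures (leading "magic" bytes).
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
]


def sniff_type(data):
    """Return a MIME type if the data starts with a known signature."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def resolve_type(declared, data):
    """Sniff only when the declared type is missing or octet-stream."""
    if declared and declared != "application/octet-stream":
        return declared
    return sniff_type(data) or "application/octet-stream"
```

A content-based guess like this does not depend on the sender choosing a sensible file name, which is the advantage the comment points at.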
Ben, your suggestion would be a great improvement to the file channel... Please file a bug on that; cc me.
*** Bug 164996 has been marked as a duplicate of this bug. ***
moving neeti's futured bugs for triaging.
Assignee: neeti → new-network-bugs
Hello -- I often receive messages with application/octet-stream for PDF, EXE, TIF, GIF, EFX (fax). I need some resolution from Mozilla on determining applications for each file suffix/extension. Mail/News is _very_ frustrating to use when viewing a non-zero # of emails with attachments (of various type). This may be a trivial "perceived" functionality -- but difficult in implementation?? I don't doubt there's some issue with lack of standards support -- but end-user functionality should remain high on priority list. I'd be surprised to learn the Netscape7 release has this same problem. Thanks -GA
Start being surprised -- it has the same "problem". One issue is that on the Web authors rely on being able to set application/octet-stream to force "save as" behavior (that is currently the only reliable method). For mail, of course, this is not an issue... The mail channel impls could certainly be fixed to deal. I'll look into doing that.
Summary: For application/octet-stream, set MIME type from extension → For application/octet-stream, set MIME type from extension/data
Referring to comment #20 by Garretta: I've got the same problem. I had a text file (.txt) with the famous MIME type "application/octet-stream". Connected it to Notepad. Now every downloaded .exe file (and others) is saved as ".exe.txt". Now I checked the settings for helper applications: editing the entry for "application/octet-stream" there, I noticed that the MIME type must be specified. My suggestion: if I define a helper application for files with a certain extension and the MIME type "application/octet-stream", ignore the MIME type and just check the extension. This includes several options: a) If there's another MIME type with this extension connected to a helper application, use this type. b) Several helper applications can be defined for the type "application/octet-stream". c) Alternatively you might introduce a "dummy" MIME type for this or the option "every type". I prefer option a). Thanks for listening to a generally satisfied Mozilla user. RK
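Option a) above could look roughly like this. The sketch is in Python and entirely hypothetical: the `HELPER_APPS` table, its field names, and `pick_handler` are invented for illustration and do not reflect how Mozilla stores helper application settings.

```python
import posixpath

# Hypothetical helper-application table: one entry per MIME type.
HELPER_APPS = [
    {"type": "text/plain", "extensions": {"txt"}, "handler": "notepad"},
    {"type": "application/pdf", "extensions": {"pdf"}, "handler": "pdf-viewer"},
]


def pick_handler(content_type, filename, helpers=HELPER_APPS):
    """Return a handler name, or None meaning "save to disk".

    For application/octet-stream the type is ignored and the file
    extension is matched against entries registered for other types.
    """
    ext = posixpath.splitext(filename)[1].lstrip(".").lower()

    if content_type != "application/octet-stream":
        for entry in helpers:
            if entry["type"] == content_type:
                return entry["handler"]
        return None

    for entry in helpers:
        if entry["type"] != "application/octet-stream" and ext in entry["extensions"]:
            return entry["handler"]
    return None
```

Under this scheme a `.txt` attachment sent as octet-stream would go to the text handler, while a `.exe` sent the same way would simply be saved, avoiding the ".exe.txt" effect described above.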
*** Bug 191730 has been marked as a duplicate of this bug. ***
This bug is fairly important for the average user. For example, my mother often gets emails containing images that have the incorrect content-type assigned to them. A quote from her: "Previous to downloading Netscape 7.0, it would have automatically opened up the picture. Hope you have some ideas how to fix this." (She previously used netscape 4.7 as her mail client.) Certainly sounds like a 4xp issue to me
Now that I think about it, every channel that wants this just needs to stick an nsUnknownDecoder stream converter into the data stream before calling OnStartRequest (as the http channel does for HTTP/0.9 responses, eg). That's it. No changes needed to the uriloader, channels can make their own decisions (eg HTTP should _not_ do this, in my opinion. But for mail, it would make a good deal of sense to do it). Thoughts? (Note that all comments but comment zero talk about this in the context of mail, and all the issues I've raised are only problems in the context of HTTP.)
bz: HTTP doesn't push a nsUnknownDecoder... the uriloader does so on behalf of any channel that cannot provide a specific content-type. the OnStartRequest is delayed by nsDocumentOpenInfo (or whatever the class name is) until nsUnknownDecoder does its thing.
Darin, see nsHttpChannel::CallOnStartRequest (you reviewed that patch, man!)
*oh yeah* :-)
*** Bug 195978 has been marked as a duplicate of this bug. ***
From a QA point, maybe.... This is a Necko bug at heart, but this particular part of necko is classified as "file handling" in the component descriptions.
*** Bug 173236 has been marked as a duplicate of this bug. ***
*** Bug 158050 has been marked as a duplicate of this bug. ***
Assignee: new-network-bugs → darin
Assignee: darin → law
Component: Networking → File Handling
QA Contact: benc → cpetersen0953
So are we back to Kevin's suggestion in comment #12 ?
Opera has a feature similar to what this is proposing. Their option calls it "determine file type based on extension for unreliable MIME types" or something, and I think that's a pretty good description of what this should be. I would like to see an option to allow filetypes to be defined based on extension (priority over MIME type) since many servers do not properly define many MIME types.
Re: Comment 25, my take is that this is a user-empowerment issue. I'm a pretty savvy developer (though not so much in the web arena), and my first attempt to "do something" about this implementation was to try changing my "helper applications" preference to handle "application/octet-stream" files with extensions of "jpg" (which were the ones that were annoying me) to be handled internally. Hilarity ensued. As for the web sites that want to "force" a download... what if I don't want to be forced? Where am I going with this, you might ask? Why does the GUI for configuring MIME type handlers allow only one entry for each MIME type? It seems to me that this could be handled by allowing users to specify their *own* actions for different extensions for these kinds of meaningless and widely misused MIME types. Then we could have arguments about what the defaults for these should be :-).
Try http://www.testitonline.no/. It tries to download the html pages. Is this problem within the scope of this "bug"?
comment#39: The server is broken and they should send the correct mime/type for the document (text/html) but this is a workaround for this misconfigured server.
But comment#39 is a good example of a web site that is displayed in M$IE even when you think forced downloading should be triggered (as it is in virtually all other browsers). But this behavior probably depends on a file type/content that IE knows how to display internally (i.e. text/html/image).
Why is this an enhancement, and not a bug? Is the email client really doing anything wrong? What mime-type should it use for unknown file types? It just gets files to attach, and often knows nothing about them except their extensions. How should it be able to figure out the mime-type of all possible file types? I think fixing this should be a high priority, as I assume many users are going to experience this problem and no good workaround exists.
>Why is this an enhancement, and not a bug?
interesting that you ask, personally I wonder "why is this an enhancement and not WONTFIX" hm... comment 25 suggests to make this mail-only, in which case it may make sense, but why is this in the browser product then?
>Is the email client really doing anything wrong? What mime-type should it use
>for unknown file types? It just gets files to attach, and often knows nothing
>about them except their extensions. How should it be able to figure out the
>mime-type of all possible file types?
That's indeed a quite good question. Why do you think Mozilla can do a better job at extension->type mapping than your mail client can?
> hm... comment 25 suggests to make this mail-only, in which case it may make
> sense, but why is this in the browser product then?
I suppose the same problem exists for web also, or is it less because of less M$-percentage of servers than for email clients? I also agree that the issue is different for web. When you put something on a http server, you should know what type it is, that is, you are not displaying user content.
>> Is the email client really doing anything wrong? What mime-type should it use
>> for unknown file types? It just gets files to attach, and often knows nothing
>> about them except their extensions. How should it be able to figure out the
>> mime-type of all possible file types?
> That's indeed a quite good question. Why do you think Mozilla can do a better
> job at extension->type mapping than your mail client can?
Think of this scenario:
- I have acroread/acrobat creator installed on my PC, create a PDF file, and place it on a file server.
- B, who has no PDF applications installed on his PC, attaches the file from that file server to an e-mail and sends it to you.
- You have PDF applications installed, know that .PDF files (usually) are PDFs, know a signature for PDF contents and the MIME-type for PDF, and get the attachment as application/octet-stream.
Here the MIME-type is lost because of the storage on a hard disk, with a file system which doesn't save the MIME-type. Do you think B should install a mapping of filename extensions to mime-types, or be forced to install some PDF applications? If the sender doesn't know the mime type and the receiver knows, I think the receiver should guess. Is there a better mime type for unknown than octet-stream? What should I tell people that say the exact same thing is working as expected in other mail clients (i.e. outlook)?
>I suppose the same problem exists for web also, or is it less because of less
>M$-percentage of servers than for email clients?
Well, see comment 11.
>Do you think B should install a mapping of filename extensions to mime-types,
>or be forced to install some PDF applications?
That was a good example. Yeah, doing it for mail would be ok for me. But then this bug should be in the mail product.
> Is there a better mime type for unknown than octet-stream?
No, afaik octet-stream means "I don't know what this is, but it's binary"
What should I tell people that say the exact same thing is working as expected in other mail clients (i.e. outlook)?
comment#11: "that's what NS4 and IE do, no?" No, IE doesn't use the content-type: http://www.mversen.de/mozilla/octet/ The server should send an attachment header if the client should download the file. (And I saw a few servers send content-types like "application/x-download" to force a download.) It would make sense to use the unknown-content decoder (not the extension) for application/octet-stream because application/octet-stream means "I don't know what this is". I think that the client should handle this file because the server doesn't know what this file is. I don't know if this is true but I think you can't tell Apache to send an unknown file without content-type. The mime-config only allows you to send a general mime-type for unknown files and the mime-list in Apache only contains IANA MIME types (no mime-types for many known files like .rar or .ace). The xitami web-server does the same (but the default config is a broken "*/*" mime-type for unknown files). And this would make us compatible with Opera or Safari.
If this is a mail-only bug, it's a trivial matter of hooking the unknown content decoder into streamlistener chain for mailnews channels that have application/octet-stream for the type. The http channel does this already for cases when the server sends no content-type header. Note that: 1) Comment 0 is not about mail 2) We have a separate bug on the mail behavior, filed in the mailnews product, as far as I can recall.
Comment#43 asked "Why do you think Mozilla can do a better job at extension->type mapping than your mail client can?" The correct question should be "why do you think your Mozilla can do a better job at extension->type mapping than someone else's mail client or web server can?" A fix to this bug will allow Mozilla users to have control over how they handle content received from poorly configured mail senders or web servers.
The mail-bug (or at least one of the relevant bugs) is bug 59631 : M$ Virus Outbreak (= Outlook) sends vcards as application/octet-stream.
*** Bug 189391 has been marked as a duplicate of this bug. ***
WONTFIX based on comments. Please file a bug on the mail components if this is to be done only in our mail clients, assuming such a bug is not already filed.
Status: NEW → RESOLVED
Closed: 20 years ago
QA Contact: chrispetersen → ian
Resolution: --- → WONTFIX
that's bug 59631 as mentioned in Comment 49. so please do not file duplicates of that bug. vrfy wont
Status: RESOLVED → VERIFIED
This isn't only for the mail component. Why was this closed WONTFIX "based on comments"?
Because for non-mail cases, this is actually invalid per the specs, as described in comments above.
Exactly. The wontfix is for the non-mail part of this bug. I concur, and so does darin last I checked.
What enhancement bug then contains a feature similar to the opera option mentioned in comment #37?
None that I know of. And I don't think we have any plans to implement something like that (I would say that the wontfix in this bug applies to that idea too).
Seeing as how this is a huge pain for real users, I think something should be done. However, it is true that the specs say that application/octet-stream is for unknown binary files that should be saved to disk. Given that, isn't it a violation of the specs for Mozilla to even *allow* setting a handler for this MIME type (that's what gets most people into trouble)? You can't have it both ways. Personally, I suggested something in comment 38 that I think would be an ideal solution, and that would actually be useful in other situations as well (allowing multiple extensions/handlers per MIME type). Yes, it's a bit of a pain, and something that only advanced users would get any benefit from... but it's not a violation of this spec at least...
> to even *allow* setting a handler for this MIME type
At this point, you can't do that accidentally. The only way to do it is to purposefully open up prefs and set the handler for that type, typing in the type. As in, the helper app dialog no longer auto-saves the handler when the type is application/octet-stream. If I want to set the handler to something like a hex editor, no reason why I shouldn't. ;) Please do file a _separate_ clear bug on your suggestion. It's something that we would take if someone implements it (neither biesi nor I are likely to have time to work on it anytime soon, and we seem to be the only people willing to touch this code....)
My Workaround for Linux/KDE: This just passes the mail attachment to KDE/Konqueror to run the default app. You can probably do something similar on Windows with Explorer. I don't know exactly about Gnome.
Add the following line to /etc/mailcap:
    application/octet-stream; /usr/local/bin/attachment.sh "%s"
Put the following script in /usr/local/bin/attachment.sh:
    #!/bin/bash
    kfmclient exec "$1"
Make it executable:
    chmod ugo+x /usr/local/bin/attachment.sh
In Edit:Preferences:Navigator:Helper Applications, Add a New Application:
    For MIME Type, use 'application/octet-stream'
    Add some file extensions: 'wpd doc pdf'
    Select "Open it using the default application"
    Uncheck "Always ask me before handling files of this type"
As long as the KDE file manager is set up with the right application, this will work. With my system (Fedora Core 1) Mozilla copies the attachment to "/tmp" then the scripts pass the URL of the attachment to kfm-client which opens it with the correct application. It takes a couple of seconds on a slow system, but it works.
Wouldn't that also open shell scripts and exes (using wine) etc.? If so, that's extremely dangerous.
> then the scripts pass the URL of the attachment to kfm-client
surely the filename of the /tmp file, not a url?
> surely the filename of the /tmp file, not a url?
Sorry: of course, the filename. I was playing around with passing URLs for a while so I was still in that mindset when I wrote that comment. I couldn't figure out whether Konq supports the mailbox:// URL and I couldn't really find a good reference for the % variables, so it just saves the attachment and passes the filename.
> Wouldn't that also open shell scripts... ?
Afaik, not unless Moz saves the attachment with the execute bit, in which case, yes, that would be a huge problem. Although it would be a problem with Moz, not my script. You might be able to force the (wrong) behaviour through Konqueror by associating scripts with their interpreter, but I don't think Konq is set up to do that automatically. On my system at least, I don't see any scripts or Wine executables that are associated. Even so, I would hope that Konq would respect the filesystem permissions, but who knows.
*** Bug 222132 has been marked as a duplicate of this bug. ***