Closed Bug 28452 Opened 25 years ago Closed 25 years ago

Need sniffer to determine type for files without native type

Component: MailNews Core :: Internationalization
Type: defect
Priority: P3
Platform: x86, Linux
Tracking: Not tracked
Status: VERIFIED INVALID
Reporter: ji
Assigned: rhp
Attachments: 1 file

Build: Linux 2000021808 

A text attachment file without a .txt extension can't be displayed
inline. The 4.x Unix version can display that type of attachment inline.

Steps to reproduce:
1. Launch mozilla.
2. Select Tasks | Mail to bring up the messenger.
3. Compose a mail, attach a text file without .txt extension to the mail.
4. Send the mail to the testing account itself.
5. After the mail is received, open the mail.
   You'll see the attachment is not displayed inline.
6. Compose a mail, attach a text file WITH .txt extension to the mail.
7. Send the mail to the testing account itself.
8. After the mail is received, open the mail.
   You'll see the attachment is displayed inline.
Reassign to rhp. I think we have content-type-guessing code lying around 
somewhere...
Assignee: phil → rhp
Summary: Text attachment file without .txt extension can't be displayed inline → Text attachment file without .txt extension can't be displayed inline
The extension has nothing to do with it... it's the content type that is tagged 
on the attachment when it is sent. I just tested this and it works just fine 
for me, and I'm pretty sure about this one because I fixed this bug a long time 
ago. 

I have a feeling that the file you were attaching wasn't all plain text and 
that will prevent it from being displayed inline.

- rhp
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → WORKSFORME
ji - can you try again with another text file perhaps?
QA Contact: lchiang → ji
The text file I used is a Japanese EUC text file.
I will create an attachment later.
Reopened the bug. Changed the component to i18n and changed the summary.
Component: Back End → Internationalization
Summary: Text attachment file without .txt extension can't be displayed inline → Ja text attachment file without .txt extension can't be displayed inline
Reopen..
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Ok, now I understand the problem here. Part of this issue is that we don't 
really have the native code hooked up that will actually give us the 
information about what a particular file is. There is just a hack in place 
that looks at extensions (this functionality lives in Necko)... and if that 
fails, we do some sniffing of the file to see what we can find, hence the 
problem you are seeing.

I need to check the code to see if we are making any other assumptions, but if 
not, this is probably a dup of the bug assigned to getting the native code 
calls hooked up to this Necko service.

- rhp
Status: REOPENED → ASSIGNED
Target Milestone: M15
Ok, after testing this on Linux, I see that we need the MIME service hooked up 
to the native code that can tell what a file really is. Will change the summary 
and reassign.

- rhp
Summary: Ja text attachment file without .txt extension can't be displayed inline → nsIMIMEService needs hooked up to native code for mime type determination
Hi Judson,
Not sure whether we have a bug on this issue or not. If so, we can mark this as a dup.

Thanks.

- rhp
Assignee: rhp → valeski
Status: ASSIGNED → NEW
over to davidm.
Assignee: valeski → davidm
I am really unclear what this bug is about. Is it that we should use native 
services (IC/Registry) for determining MIME types, or that we have to write file 
sniffers (one word: ick) to determine the type of files regardless of 
extensions?
I think it might be worthwhile reviewing what the spec was
like for 4.x for attaching extension-less files on Unix.
As far as I can see, the following is what we are doing
with 4.x:

1. When the user goes to attach a file, there are 2 options:

   Auto
   Binary

2. If you choose Auto, and if the file has an extension, 4.x seems to 
   honor the extension. (No further sniffing -- just trust the extension.)

3. If you choose Auto and if the file has no extension, 4.x is normally
   able to tell if a file is text/plain or text/html. Otherwise, 
   it seems to settle on "binary".

4. If the user chooses "binary", then the binary interpretation is
   forced on the file. 

What we are seeing currently in Mozilla is that under condition 3 above,
sniffing is not good enough. It seems to be able to sniff ASCII text files
but 8-bit text files like the test file attached to this bug report
or other 8-bit text files like Chinese Big 5 text files without extension
get tagged as: "application/octet-stream".

I would like to see Mozilla implement a good way to tell if an
extensionless file is text/html or text/plain whether or not the
file contains ASCII or non-ASCII data. Also, if a file's MIME type
cannot be determined perfectly, there should be some way for the user to
choose a file type, something like what we have in 4.x on Unix. 

Similar considerations should apply to Mac. I don't believe Mac 4.x
has a "binary" attachment option.
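The four 4.x rules listed above, together with the requested handling of 8-bit text, can be sketched roughly as follows. This is only an illustration in Python, not Mozilla or 4.x code: the names guess_attachment_type, sniff_text_type, EXTENSION_TYPES and TEXT_CONTROLS are hypothetical, the extension table is a stand-in for a real MIME lookup, and the rule "control bytes mean binary, high-bit bytes do not" is an assumption about what a sniffer that accepts EUC or Big5 text might do.

```python
import os

# Illustrative extension table only; a real client would ask a MIME service.
EXTENSION_TYPES = {".txt": "text/plain", ".html": "text/html", ".htm": "text/html"}

# Control bytes that commonly appear in text: tab, LF, FF, CR, and ESC
# (ESC shows up in ISO-2022-JP escape sequences).
TEXT_CONTROLS = {0x09, 0x0A, 0x0C, 0x0D, 0x1B}


def sniff_text_type(head):
    """Classify a byte prefix without penalizing 8-bit (non-ASCII) text."""
    # Any other control byte (NUL, etc.) is treated as a sign of binary data.
    if any(b < 0x20 and b not in TEXT_CONTROLS for b in head):
        return "application/octet-stream"
    start = head.lstrip().lower()
    if start.startswith((b"<html", b"<!doctype html")):
        return "text/html"
    return "text/plain"


def guess_attachment_type(path, mode="auto"):
    """Hypothetical version of the Auto/Binary rules described for 4.x."""
    if mode == "binary":
        # Rule 4: the user forces the binary interpretation.
        return "application/octet-stream"
    ext = os.path.splitext(path)[1].lower()
    if ext:
        # Rule 2: trust the extension, no further sniffing.
        return EXTENSION_TYPES.get(ext, "application/octet-stream")
    # Rule 3: no extension, so look at the first 4K of the file.
    with open(path, "rb") as f:
        return sniff_text_type(f.read(4096))
```

With this sketch, an extensionless EUC-JP file comes out as text/plain in Auto mode, and the same file comes out as application/octet-stream when the mode is "binary".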
Reassign to rhp. In the case of an attachment where the MIME service can't tell 
you what it is (native stuff isn't going to help here), you either have to 
assume it's binary (current behaviour) or sniff the file. I thought msgLib had 
some code to do this (I think it looks at the first 4K and if x% of the chars 
are 8 bit it is considered binary).
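A rough Python sketch of the kind of heuristic described above (first 4K, some percentage of 8-bit characters) shows why it misfires on Japanese EUC text. The function name looks_binary_by_high_bit and the 10% threshold are assumptions for illustration; the actual msgLib code and its value of x are not given in this bug.

```python
def looks_binary_by_high_bit(data, threshold=0.10, window=4096):
    """Return True if the share of 8-bit bytes in the first `window`
    bytes exceeds `threshold` (both values are illustrative guesses)."""
    head = data[:window]
    if not head:
        return False
    high = sum(1 for b in head if b >= 0x80)
    return high / len(head) > threshold


ascii_text = b"plain ASCII text\n"
euc_text = "日本語のテキスト\n".encode("euc_jp")

# ASCII has no 8-bit bytes, so it passes as text.
print(looks_binary_by_high_bit(ascii_text))
# Nearly every byte of EUC-JP text has the high bit set, so a
# percentage-based check classifies it as binary.
print(looks_binary_by_high_bit(euc_text))
```

Any threshold low enough to catch real binary files this way would also reject most 8-bit text encodings, which matches the application/octet-stream tagging reported here.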
Assignee: davidm → rhp
Summary: nsIMIMEService needs hooked up to native code for mime type determination → Need sniffer to determine type for files without native type
We can only do so much "sniffing" for a file....binary data is binary data and 
we do this already. Currently, we do everything that 4.x does as far as 
identifying the file type for attachments. 

The problem I have experienced the most is that the nsIMIMEService in Necko is 
just a hack that looks at file extensions. We need something that will dig into 
the platform-specific MIME associations, and it will fix most of the problems 
we've seen.

- rhp
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → INVALID
The following statement is not true:

"Currently, we do everything that 4.x does as far as 
identifying the file type for attachments."

4.x can certainly sniff the 8-bit plain text file type without
an extension and that is why this bug was filed in the first place.
ji's original point was exactly that, and my point #3 on 2/21/2000
says the same thing. 

Are you saying that we lack some code somewhere, which prevents
us from achieving #3, which is available in 4.x?

Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Please look at the following link. This IS what we used to do in 4.x:

http://lxr.mozilla.org/mozilla/source/mailnews/compose/src/nsMsgAttachmentHandler.cpp

What I am saying is that the service/code that asks the operating system what 
type of file something is does not work in Seamonkey. Look, try this....go on a 
Linux box and find an HTML file. Do:

    file my.html

you will see "HTML Document" as the output. Now, rename my.html to X and do a

    file X

you will STILL get "HTML Document". This is because the "sniffing" being done 
is done by the operating system, NOT mail.
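
For illustration only, the same OS-level check can be driven from a script. The Python sketch below assumes a file(1) that supports the --brief and --mime-type options (common on current Linux systems); the helper name os_sniffed_type is hypothetical, and this is not how nsIMIMEService is or would be implemented.

```python
import shutil
import subprocess


def os_sniffed_type(path):
    """Ask the system `file` utility for a MIME type, ignoring the name.

    Returns None when `file` is not installed or reports an error.
    """
    exe = shutil.which("file")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--brief", "--mime-type", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Calling os_sniffed_type("X") on the renamed HTML file from the example above should give the same answer as for my.html, since the result comes from the file contents rather than the extension.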

Again, when the nsIMIMEService.idl is really implemented, this will all work as 
you expect, but the "magic" is not in mail/news, but rather this Necko code.

http://lxr.mozilla.org/mozilla/source/netwerk/mime/public/nsIMIMEService.idl

- rhp
Status: REOPENED → RESOLVED
Closed: 25 years ago
Resolution: --- → INVALID
Rich, thanks for the explanation.
Is there a need to file a bug against Necko?
Can this bug be used for that? Or is this
one of the features planned for Necko anyway?
Hmm...not sure. David, is there a bug on the system for hooking up 
nsIMIMEService to real OS calls? Thanks.

- rhp
That file stuff might work on Unix but it sure is not going to work on Mac 
(although if the file/creator is set it might) or Windows, where the registry 
does lookups by extension. If Unix has a MIME mapping call that does sniffing, 
then they should get this behavior when they write a native MIME mapper. Is 
this the expected behavior?
I think Unix's "file" command does some sniffing via magic numbers, etc...but 
this is all platform specific and has never been inside the mail back end. I do 
know that window managers on Linux have what looks like a mime type registry. 
Not sure if we can hook in there or not.

- rhp
Checked with Linux 2000092008 build. The EUC text attachment file w/o extension
is actually processed properly.
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core