Closed Bug 52428 Opened 24 years ago Closed 23 years ago

intl chars in attachment filename get encoded wrong

Categories

(MailNews Core :: MIME, defect, P3)

x86
Windows 2000

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.6

People

(Reporter: bugzilla, Assigned: bugzilla)

References

Details

Attachments

(2 files)

If you attach a file called:
Použijte.txt
with Mozilla and send it, Mozilla wrongly encodes the attachment filename in the
ISO-8859-1 charset, even though the filename contains ISO-8859-2 chars!

So you'll get the following headers:
Content-Type: text/plain; name="=?ISO-8859-1?Q?Pouz=28ijte=2Etxt?="
Content-Disposition: inline; filename="=?ISO-8859-1?Q?Pouz=28ijte=2Etxt?="

this is WRONG!
Similar (not identical) to bug 43689
"Attachment filename in Ja is displayed as raw utf-8 string on Open/Save
attachment dialog window."

Should component be internationalization?
It's not identical. This bug is about the encoding of the filename in
the outgoing mail; the other bug is simply about showing the incoming attachment
filename correctly in a dialog.
If you need testing from IQA, please send it to marina.

gemal, what language version/locale of Windows are you using?
Mozilla normally assumes file names to be in the
same charset as the filesystem encoding of the operating
system. So unless you are using a Latin-2 language version of Windows,
or have set the locale to one of the Latin-2 languages, this
is not going to work. There is no way to know what encoding
the file names are in unless we have UI for it. Otherwise,
we need to make the most reasonable assumption.

I'm running an English Windows localized to Danish.

I'm developing a webmail application where we have the same problem. What we do
is this:
We check whether the filename (in this case it contains ISO-8859-2 chars) can be
represented in the current char coding (ISO-8859-1). If it can't be represented,
we encode it with UTF-8.

Can't we do the same thing in Mozilla?
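
A minimal sketch of the fallback described above, written in Python rather than taken from Mozilla or the webmail application. The helper name `encode_attachment_name` and its parameters are hypothetical; `email.header.Header` comes from the Python standard library and produces the RFC 2047 encoded word.

```python
from email.header import Header


def encode_attachment_name(filename, preferred_charset):
    """Return (charset_used, encoded_word) for an attachment filename.

    Hypothetical helper: try the preferred (current) charset first and
    fall back to UTF-8 when the name cannot be represented in it.
    """
    try:
        filename.encode(preferred_charset)
        charset = preferred_charset
    except UnicodeEncodeError:
        # At least one character is missing from the preferred charset.
        charset = "utf-8"
    return charset, Header(filename, charset).encode()


charset, encoded = encode_attachment_name("Použijte.txt", "iso-8859-1")
print(charset)   # utf-8, because "ž" does not exist in ISO-8859-1
print(encoded)   # the RFC 2047 encoded word for the filename
```

The exact encoded word (Q or B encoding) is left to the library here; only the charset choice illustrates the idea in this comment.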
Qa assigning to Marina since it involves international char and product
QA Contact: esther → marina
Status: NEW → ASSIGNED
Target Milestone: --- → Future
Sorry.
*** Bug 66105 has been marked as a duplicate of this bug. ***
reassigning to ducarroz
Assignee: rhp → ducarroz
Status: ASSIGNED → NEW
I don't know if this is still a problem!
Status: NEW → ASSIGNED
Target Milestone: Future → mozilla0.9.6
Marina, is this still happening?  If it is, Jean-Francois, will this be included
as part of the attachments work?
It's not directly related, but the rewrite of the way we handle attachments in
nsMsgCompFields should take care of this issue!
Depends on: 86089
still a problem in mozilla trunk 20011010
This is weird. I am using the 10-10 branch build on Czech WinNT, so I am able to
save the file in the 8859-2 charset, and the filename in the source is encoded
correctly as ISO-8859-2, though the meta charset in the doctype says
8859-1 (!?). I am attaching the page source.
Here is how I reproduced the bug:
Setup: English Win2k localized via control panel to Danish

1) renamed a text file called "test.txt" to "Použijte.txt" via Explorer
2) composed a new msg using mozilla and attached the "Použijte.txt" file
3) sent it to myself
4) when I view the source of the resulting mail I get:
Content-Type: text/plain;
 name="=?ISO-8859-1?Q?Pouz=28ijte=2Etxt?="
Henrik, as far as I understand, you have a Danish system (which is Latin-1)
and you are saving a file with a Central European name. This is not
supposed to work, because the file name is saved in the system charset. When I
use Czech Windows it seems to work correctly, except for the strange meta tag in
the doctype (I attached the message source).
I'm not saving anything. This bug is not about saving. It's about encoding an
iso-latin-2 filename into iso-latin-1, which can't be done!
Henrik,
those are your steps:
1) renamed a text file called "test.txt" to "Použijte.txt" via Explorer (this
is saving into a charset different from the system one)
2) composed a new msg using mozilla and attached the "Použijte.txt" file
3) sent it to myself
I've done all this on a Latin-2 system and you can see the results in the
attachment I created on 2001-10-11 08:30.
In Explorer it's not really saving, just renaming.

Perhaps it's bugzilla that's causing the problems?

When I compose the mail in Mozilla, my View -> Char Coding is ISO-8859-1.
Is yours the same?
Yes, on a Latin-1 system the default charset is ISO-8859-1 when I compose
mail, so if the file you are attaching is Latin-2 you have to manually change the
encoding to Latin-2. On my Czech system, View in the composition window
would be set to Latin-2. When you rename the file in
Explorer you are saving it with a different name, and this time the name is
Latin-2 while the system encoding is Latin-1. I guess the discrepancy is there.
Cc'ing Naoki
Hmm, now that I look at the fact that you're on Win2k and you added the
locale, it shouldn't matter for Explorer (being Unicode on Win2k), and the
file should show up (and be renamed) correctly. So I guess it IS Mozilla's fault
for not being able to encode the filename correctly.
Gemal, I think we need to spec this out carefully.

As a starting point, we can think of something like this:

1. On OS's which support both system encoding and Unicode for filenames,
   e.g. Win2000, WinXP, and maybe Win NT4, 
   
   a) Check if the filename encoding can be supported in the chosen 
      mail-send encoding. If so, encode the filename with that encoding.
   b) Otherwise, fall back to UTF-8.

2. On OS's which do not support Unicode in filenames, 

   a) encode in the system encoding if we can.
   b) If the filename contains non-system encoding characters,
      then most probably such names will not be displayed
      correctly anyway. So we can send it as is with the system
      encoding.

Something like the above. J-F, are you doing something similar in
the nsMsgCompose re-write? Because WinNT4/2000/XP can handle
filenames in different scripts, we should be able to deal with
names in different language scripts.
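
The two-branch proposal above can be sketched roughly as follows. This is illustrative Python, not the Mozilla implementation; the function names, the parameters, and the boolean flag `os_has_unicode_filenames` are hypothetical stand-ins for however the real code would detect these conditions.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except (UnicodeEncodeError, LookupError):
        return False


def choose_filename_charset(filename, mail_charset, system_charset,
                            os_has_unicode_filenames):
    """Pick the charset for the attachment filename (hypothetical helper)."""
    if os_has_unicode_filenames:
        # 1a) use the mail-send encoding when it can represent the name
        if can_encode(filename, mail_charset):
            return mail_charset
        # 1b) otherwise fall back to UTF-8
        return "utf-8"
    # 2a) / 2b) no Unicode filenames: send in the system encoding either way
    return system_charset


print(choose_filename_charset("Použijte.txt", "iso-8859-1", "windows-1252", True))
print(choose_filename_charset("Použijte.txt", "iso-8859-1", "windows-1252", False))
```

Under these assumptions the first call yields `utf-8` (branch 1b) and the second yields `windows-1252` (branch 2).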
The current code uses the composition charset for encoding the file name. I'll try to
apply momoi's proposal...
With this change, the mail charset and file name charset may be different (e.g. for
Japanese, this is always the case).
One concern I have is that there may be MUAs which cannot handle multiple
charsets in one message, although Mozilla should be able to handle that case.
Whiteboard: have fix
fixed and checked in
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Whiteboard: have fix
After discussing the issues surrounding the fix just checked in, i.e.
my proposal #2, with nhotta, we agreed that it would be best to 
also go ahead with proposal #1 to deal with multilingual OS's that are
becoming prevalent these days. This will be filed in a separate
bug since it seems to require other changes relating to filename
handling in multilingual OSs. This latter fix will probably complete
the original request by Gemal in this bug. (Until then, please
be patient.)

To summarize, the fix here assumes that all file attachments
use file names in system encoding and send them in the same
encoding regardless of the mail encoding.  There are clearly 
some advantages in doing so but be aware that there are also 
some disadvantages. 
Summary: int chars in attachment filename get encoded wrong → intl chars in attachment filename get encoded wrong
Because a separate bug will take care of Kat's proposal #1, this bug is
fixed (with limitations and some disadvantages: the file name will be encoded
in the system encoding regardless of the mail encoding). Henrik, do you agree?
It seems to work for ASCII headers, but because of recent regressions (bugs
11985, 117956) I'll wait before I mark this bug verified.
When I mentioned the bug numbers I meant to say bug #111985.
filename is now encoded with windows-1252:
Content-Type: text/plain;
 name="=?windows-1252?Q?Slu=9Eby=2Etxt?="

20020131
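
For reference, the encoded words quoted in this bug can be decoded with Python's standard library to see which filename a receiving client would reconstruct. The helper name `decode_encoded_word` is hypothetical; `email.header.decode_header` is the standard-library call.

```python
from email.header import decode_header


def decode_encoded_word(value):
    """Decode an RFC 2047 header value into a plain string."""
    parts = []
    for raw, charset in decode_header(value):
        if isinstance(raw, bytes):
            # Encoded words come back as bytes plus the declared charset.
            parts.append(raw.decode(charset or "ascii"))
        else:
            # Unencoded text comes back as str.
            parts.append(raw)
    return "".join(parts)


# Header from the verification comment: decodes to the intended name.
print(decode_encoded_word("=?windows-1252?Q?Slu=9Eby=2Etxt?="))   # Služby.txt
# Header from the original report: the "ž" has been lost.
print(decode_encoded_word("=?ISO-8859-1?Q?Pouz=28ijte=2Etxt?="))  # Pouz(ijte.txt
```

Byte 0x9E is "ž" in windows-1252, so the fixed header round-trips, while the header from the original report decodes to a name containing "z(" instead of "ž".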

Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core