Attachment mechanism incompatible with Unicode

RESOLVED FIXED in mozilla1.3alpha


17 years ago
11 years ago


(Reporter: lapsap7+mz, Assigned: tetsuroy)



Windows 2000



(6 attachments, 2 obsolete attachments)



17 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2b) Gecko/20020930
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.2b) Gecko/20020930

If a filename contains Unicode characters outside the system locale (e.g. a
Chinese filename on a system with a Western European locale) and the file is
attached to an email, every unrecognised character is replaced by a '?'.  As a
result, Mozilla can't access the file: try to send or save the mail and you'll
see what I mean.

Reproducible: Always

Steps to Reproduce:

Comment 1

17 years ago
This worksforme all the time with Russian mail (in a Western locale).... could
you attach a message showing the problem to this bug?

Comment 2

17 years ago
Created attachment 101983 [details]
RAR file containing a text file with Chinese name

Comment 3

17 years ago
Created attachment 101984 [details]
What I got when I tried to attach the Chinese-named file in Mailer

Comment 4

17 years ago
I'd rather attach a test file.  It's archived inside a RAR file because that's
the only way to attach a file whose name isn't in Latin-1 while preserving its
filename.  You can save the archive under any .rar name (e.g. test.rar).

With this test file, you should be able to try it yourself.  Attach it to a mail
and you should get the same thing as in the attached image.  The test file has
two Chinese characters, and you can see that they are changed to ??.txt.

To recap: once the file is attached, it's not possible to save or send
the email.

Comment 5

17 years ago
So how do I get that file out of the rar archive?  Is this being done on an
English Win2k or a Chinese Win2k (system language, I mean)?

ccing yokoyama in case this has anything to do with the recent "Make Mozilla
Unicode app on windows" changes.

Comment 6

17 years ago
RAR's site is
The file and the archive were created under a Western European system locale,
but I don't think that matters because WinRAR has used Unicode since 2.x.
That's why I used RAR rather than Zip to preserve the name.

Comment 7

17 years ago
See, the problem is that we may be assuming the filename is in the system
locale.  Which in this case is Western Europe, then?

Comment 8

17 years ago
You're using W2k, right?  If so, the file can be extracted without any problem
and its filename should be displayed properly (assuming you've added East
Asian language support beforehand).  It seems you didn't understand what I meant.
The Chinese filename is preserved WITHIN the RAR archive.  That means when you
download the attached RAR file, you can name it whatever you want, e.g. test.rar.
The archive filename doesn't matter.

Extracting the file seems to be posing some problems, though.  An easy way is to
use the context menu on the RAR file and pick WinRAR's menu item.  A more
complicated way is to switch the system locale to Big5, extract the file, and
switch back to your initial system locale.

Comment 9

17 years ago
No, _you_ misunderstand.  You have a Win2k in Western locale with a file with a
Chinese name.  We may be assuming the name is in ISO-8859-1 or something like
that because the OS as a whole is in the Western locale.  This is about the
original bug, not the archive (and no, I am in fact not on Win2k, or even Windows).

Comment 10

17 years ago
I haven't tried this, but I think this problem can be shown the other way round:
1) With your system locale set to Western European, create a file whose name
contains some non-Big5 characters.  An example is "français.txt"  --  'ç' isn't
in Big5.
2) Switch your system locale to Traditional Chinese (Big5).
3) Now, try to attach "français.txt" to an email.

You should see that the file becomes "fran?ais.txt".  Then try to save or send
your email; you shouldn't be able to.

IMO, I think that's because Mozilla uses an ANSI function instead of W function
to read the filename.
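
The suspected loss can be reproduced outside Mozilla. Below is a minimal Python sketch, assuming that a narrow (ANSI) call behaves like a conversion to the system code page with '?' as the substitute character. The helper `to_code_page` and the sample name "中文.txt" are illustrative assumptions; they are not taken from the bug or from Mozilla's code.

```python
def to_code_page(filename: str, code_page: str) -> str:
    """Round-trip a filename through one legacy code page.

    Characters the code page cannot represent come back as '?',
    which is the symptom described in this bug.
    """
    narrow_bytes = filename.encode(code_page, errors="replace")
    return narrow_bytes.decode(code_page)


# 'ç' has no Big5 encoding, so it is lost under a Big5 system locale.
big5_view = to_code_page("français.txt", "big5")

# Chinese characters have no cp1252 (Western European) encoding.
western_view = to_code_page("中文.txt", "cp1252")

# A name that fits the code page survives unchanged.
safe_view = to_code_page("français.txt", "cp1252")

print(big5_view)     # fran?ais.txt
print(western_view)  # ??.txt
```

Once the name has been flattened this way the original characters cannot be recovered, which would explain why the attached file can no longer be found.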

Comment 11

17 years ago
It should be "français" -- that is, ç instead of Ã§.
Sorry, my browser was in UTF-8!
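
This kind of garbling typically appears when UTF-8 bytes are read as a single-byte Western encoding. A small Python sketch, assuming Latin-1 as the misapplied encoding (the comment does not say which one was actually involved):

```python
original = "français"

# Encode as UTF-8, then wrongly decode the bytes as Latin-1.
misread = original.encode("utf-8").decode("latin-1")

# The damage is reversible as long as no byte was dropped on the way.
repaired = misread.encode("latin-1").decode("utf-8")

print(ascii(misread))        # 'fran\xc3\xa7ais'
print(repaired == original)  # True
```

Unlike the '?' substitution in the attachment dialog, this mix-up loses no information, so it can be undone.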

Comment 12

17 years ago
> I think that's because Mozilla uses an ANSI function instead of W function to
> read the filename.
Correct; however, even after changing the commdlg calls to W functions,
the problem still occurs.  GetMessage(), PeekMessage() and
DispatchMessage() need to be changed as well.  If they are not changed
to GetMessageW(), ..., the filename returned from CommonDlg
includes '?'.

Please wait until 104934 gets fixed. I'll make sure this gets
fixed as well.
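
The point that every call in the chain has to be wide can be modelled in a few lines. This is only a Python model under stated assumptions: `wide`, `narrow` and `run` are hypothetical stand-ins for W and A style functions, not Windows APIs, and cp1252 stands in for a Western system code page.

```python
from typing import Callable, List

Stage = Callable[[str], str]


def wide(name: str) -> str:
    """Stand-in for a W function: passes the Unicode string through intact."""
    return name


def narrow(name: str, code_page: str = "cp1252") -> str:
    """Stand-in for an A function: squeezes the string through one code page."""
    return name.encode(code_page, errors="replace").decode(code_page)


def run(stages: List[Stage], name: str) -> str:
    """Pass a filename through each stage in order."""
    for stage in stages:
        name = stage(name)
    return name


name = "中文.txt"  # sample name, an assumption for illustration

# Wide dialog call, but a narrow message pump behind it.
dialog_only = run([wide, narrow], name)

# Every stage converted to its wide variant.
all_wide = run([wide, wide], name)

print(dialog_only)       # ??.txt
print(all_wide == name)  # True
```

A single narrow stage anywhere in the path is enough to produce the '?' characters, which matches the observation that fixing the dialog calls alone did not help.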

Depends on: 104934

Comment 13

17 years ago
Re comment #9:
Ahem!  Boris, please note that this bug was marked as specific to Win2k (and
WinXP too), so your comments didn't apply.  But thanks for CCing Yokoyama.

Yokoyama, so you get the same problem too, right?  Could you confirm this bug?

Should I change "Component" of this bug from Attachments to I18N?  Furthermore,
isn't it better that we add this bug to bug 104934 too?

Comment 14

17 years ago
Sure. Confirming and taking this bug.
> we add this bug to bug 104934 too?
Making 172337 depend on 104934 links the two bugs.
(In other words, it's already been added to 104934's blocks list.)
Assignee: mscott → yokoyama
Component: Attachments → Internationalization
Ever confirmed: true
Target Milestone: --- → mozilla1.2beta


17 years ago
Keywords: intl


17 years ago
QA Contact: trix → kasumi


Comment 15

16 years ago
Created attachment 105398 [details] [diff] [review]
Need to call new PR_OpenFileUCS2()

Comment 16

16 years ago
Created attachment 105399 [details] [diff] [review]
file URL is now in UTF-8 with MOZ_UNICODE flag

With this and the previous patch, we can attach a file with a non-locale name
to an email.  I tested sending and receiving an attachment.
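
For illustration, here is what a file URL carrying a UTF-8 encoded name looks like. This is a Python sketch, not the C++ from the patch. The helpers `to_file_url` and `from_file_url` and the sample path are hypothetical.

```python
from urllib.parse import quote, unquote

PREFIX = "file:///"


def to_file_url(path: str) -> str:
    """Percent-encode the path's UTF-8 bytes, keeping '/' and ':' readable."""
    return PREFIX + quote(path, safe="/:")


def from_file_url(url: str) -> str:
    """Decode the percent-escapes back to the original Unicode path."""
    return unquote(url[len(PREFIX):])


path = "C:/docs/中文.txt"  # sample path, an assumption for illustration
url = to_file_url(path)

print(url)                         # file:///C:/docs/%E4%B8%AD%E6%96%87.txt
print(from_file_url(url) == path)  # True
```

Because UTF-8 can represent every character, the URL no longer depends on the system locale, whereas a URL built from code-page bytes would only round-trip on a machine with the same locale.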

Comment 17

16 years ago
Created attachment 105467 [details] [diff] [review]
File URL is now in UTF-8 with MOZ_UNICODE flag

Oops, wrong attachment. Trying again.
nhotta: can you review this patch?
Attachment #105399 - Attachment is obsolete: true


16 years ago
Blocks: 107941

Comment 18

16 years ago
*** Bug 107941 has been marked as a duplicate of this bug. ***

Comment 19

16 years ago
Comment on attachment 105467 [details] [diff] [review]
File URL is now in UTF-8 with MOZ_UNICODE flag


you can move tempStr inside #else
Attachment #105467 - Flags: review+

Comment 20

16 years ago
Created attachment 105639 [details] [diff] [review]
as per suggestion
Attachment #105467 - Attachment is obsolete: true

Comment 21

16 years ago
david: can you super review?

Comment 22

16 years ago
Comment on attachment 105639 [details] [diff] [review]
as per suggestion

carry his stamp
Attachment #105639 - Flags: review+


16 years ago
Target Milestone: mozilla1.2beta → mozilla1.3alpha

Comment 23

16 years ago
Comment on attachment 105639 [details] [diff] [review]
as per suggestion

Attachment #105639 - Flags: superreview+


16 years ago
Attachment #101983 - Attachment mime type: application/octet-stream → application/x-rar-compressed

Comment 24

16 years ago
This problem is also evident on Mac OS X (FizzillaCFM/2002110808). Is this patch XP?
Summary: Attachment mechanism uncompatible to Unicode → Attachment mechanism incompatible with Unicode

Comment 25

16 years ago
Sorry, it's not XP.  It's Windows only
(see the patch: +ifeq ($(OS_ARCH),WINNT)).
On Windows, we are moving to UTF-8 in file URLs, but I can't speak for
Mac.  (I have zero knowledge...)


Comment 26

16 years ago
Adding a couple more dependencies.
Depends on: 162358, 162361

Comment 27

16 years ago
Fixed.  For verification, please wait until 162358 and 162361 get fixed.
Last Resolved: 16 years ago
Resolution: --- → FIXED

Comment 28

15 years ago
Hi Roy, I'm using the Mozilla 1.6 stable build.
I'm testing again to see the "state" of this bug.

Now, if I attach a file with non-system-locale characters in its filename, every
non-system-locale character is replaced by '_', instead of '?' as it was a year
ago.  However, the file is still inaccessible and the mail can't be sent.

I understand that this bug still depends on bug 162361.  So, why was this bug
marked as resolved?

PS: Is there any abbreviation for "non-system-locale characters"?  It's quite long
to type :)  Could it be called "NSL characters"?

Comment 29

15 years ago
Oh, by the way, there's a similar bug about Unicode filename I/O.  Actually, I
would call that the "reverse process" of this bug.

I understand that you would like a separate bug.  So, here it is.  It is bug 234681.

Comment 30

15 years ago
Roy, after two years I still don't understand why you marked this bug as
fixed when in fact it isn't!?  I really don't understand what "for
verification" meant to you.

I'm now using TB 0.7.3, and the problem presents differently.

1) If I drag 'n drop a file whose name isn't in system locale in the attachment
zone, the filename disappears (I'll attach an image)

2) If I use the "Attach" button and choose the file using the dialog, every
Unicode character becomes an underscore.

In either case, I can't send out the mail because TB thinks the file doesn't
exist since the filename is different.

I know, I know, we're waiting for bug 162361 :(
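
Why a changed name makes the file unreachable can be shown with a short Python sketch. It assumes a filesystem that accepts Unicode names. The helper `mangle` is a hypothetical model of the underscore substitution reported above, not Thunderbird's actual code.

```python
import tempfile
from pathlib import Path


def mangle(name: str, code_page: str = "cp1252") -> str:
    """Replace every character the code page cannot encode with '_'."""
    result = []
    for ch in name:
        try:
            ch.encode(code_page)
            result.append(ch)
        except UnicodeEncodeError:
            result.append("_")
    return "".join(result)


with tempfile.TemporaryDirectory() as tmp:
    # Create a file with a Chinese name (sample name, an assumption).
    real = Path(tmp) / "中文.txt"
    real.write_text("hello", encoding="utf-8")

    # The path the mail client ends up with after the substitution.
    shown = Path(tmp) / mangle(real.name)

    real_exists = real.exists()
    shown_exists = shown.exists()

print(mangle("中文.txt"))         # __.txt
print(real_exists, shown_exists)  # True False
```

The file is still on disk under its real name, but the mangled name points at nothing, so saving or sending fails.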

Comment 31

15 years ago
Created attachment 159788 [details]
Name disappears when Unicode-named file is drag 'n dropped

Comment 32

15 years ago
Created attachment 159789 [details]
Every Unicode character in filename changed to underscore when using Attach button

Comment 33

> I actually don't understand what "for verification" meant for you.

"Verification" is when you test to make sure the bug is fixed.  Roy's comment
meant that this code-level bug is fixed, but that can't be noticed until those
other two bugs are fixed as well...

Product: MailNews → Core
Product: Core → MailNews Core