Closed Bug 229872 Opened 21 years ago Closed 21 years ago

The filename with non-ascii characters doesn't show up in the attachment window.

Categories

(MailNews Core :: Composition, defect)

x86
Windows XP
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.7alpha

People

(Reporter: charlie.chang, Assigned: jshin1987)

References

Details

(Keywords: intl)

Attachments

(4 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7a) Gecko/20040101
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7a) Gecko/20040101

When I write an email and attach a filename with Chinese-Big5 character (the
name of the file), the filename doesn't show up in the attachment window. It
appears to be blank.

Reproducible: Always

Steps to Reproduce:
1. Compose a new email.
2. Try to attach a file with Chinese-Big5 characters.
3. The filename doesn't show up.

Actual Results:  
As step 3.

Expected Results:  
Show the correct filename....
Reporter, if problem occurs not only in composing panel but also in draft
display/edit panel, mail source helps developer to analyze the problem.

(1) Delete(or move) all mails in Drafts folder
(2) Compose a mail and attach your file of filename with Chinese-Big5 character
(3) Save as draft
(4) "Compact this folder" to Drafts folder
(5) Attach file named "Drafts" (instead of "Drafts.msf") in your mail directry
to this bug through "Create a New Attachment" link.
Mail headers for attacment in your mail are as follows.
> Content-Type: application/msword;
>   name="(N*(8bit-code))empty.doc"

Non ascii data should be encoded in mail header but is not encoded in your mail.

Following is an example of ISO-2022-JP/Base 64 encoding for attachment.
>  name="=?ISO-2022-JP?B?GyRCMlAkTkQ7GyhCLkpQRw==?="
Following is Big5/Base 64 encoding for same file by 2004010308-trunk/Win-Me.
> name="=?Big5?B?pPXHVbO+LkpQRw==?="

Mozilla 2004010308-trunk succesfully encoded with Big5 when I choosed Big5 as
character set for mail composition.

What is your character set choice for mail composition?
See Preferences/Mail&Newsgroup/Composition/Composing Messages/Character Coding.
*** Bug 230453 has been marked as a duplicate of this bug. ***
This bug is also present in mozilla 1.6 branch.
To Isaac Hwak Han, same question as Comment #3 :
What is your character set choice for mail composition?
See Preferences/Mail&Newsgroup/Composition/Composing Messages/Character Coding.

Are you reuesting enhancement?
Re: comment 6

As stated in bug 230453, I tested both EUC-KR and UTF-8 for composition
encoding. I'm not requesting enhancement. This is a regression bug. (probably
due to fix of bug 206252) I wish jshin could look into this bug, but he seems
busy :-)

Because not all users seems to be able to reproduce this bug, -- it's not even
confirmed -- I'm building my debug build to see what's under the hood.
I ran the debugger. This statement:

http://lxr.mozilla.org/seamonkey/source/mailnews/compose/src/nsMsgCompose.cpp#3629

eventually calls |nsFileSpecHelpers::Canonify()| where the loss (or damage) of
filename happens at: (I'm on Korean WinXP)

http://lxr.mozilla.org/seamonkey/source/xpcom/obsolete/nsFileSpecWin.cpp#89

Here, C run-time function _fullpath() is called:

     char* canonicalPath = _fullpath(buffer, ioPath, _MAX_PATH);

Input parameter |ioPath| is UTF8-encoded (unescaped) absolute filename (e.g.
"D:\가을.jpg"), while output parameter |buffer| shows up as "D:\가??jpg" after
invocation. I don't think _fullpath() is UTF-8 friendly -- it only supports
ASCII and MBCS in Windows.

I checked Unix codepath, and there the logic seems somewhat different.

http://lxr.mozilla.org/seamonkey/source/xpcom/obsolete/nsFileSpecUnix.cpp#98

It first checks if |ioPath| is absolute path. If yes, returns simply, else,
calls getcwd() to compose fully-qualified path. Hmm.. it may be locale-specific,
too.

To put it short, nsFileURL/nsFilePath/nsFileSpec don't like UTF-8. We should use
it before converting URL to UTF-8.
Attached patch Proposed patch (obsolete) — Splinter Review
Separate file leaf name first, and then convert to UTF-8 using
ConvertStringToUTF8(). Unescaping happens in nsFileURL constructor.

Tested in current CVS build on Korean WinXP. BTW, I'm not so sure about third
argument |skipCheck| of ConvertStringToUTF8().
confirmed this bug with Japanese (ISO-2022-JP) in Mozilla 1.6.
The file name of attachment in sent message is also broken.
This is critical issue for users using non-ascii character.

nominating blocking1.7a and proposing1.6.1!
Flags: blocking1.7a?
I could not recreate the problem in attachment of Japanese file name on NTFS or
VFAT32, with 2004011908-trunk on Japanese Win-2k.
When characater coding is iso-2022-jp or Big5 or utf-8, Japanese file name was
encoded in iso-2022-jp,Big5,utf-8 respectedly.
And Japanese filename was encoded in Shift_JIS even when character
Coding=iso-8859-1 and both subject and body contain us-ascii characters only.

Han-san, EMURA-san, what is the file system? NTFS? VFAT32? FAT16?
Or Win-XP only problem?
Me too :-)

Mozilla 1.6/Win XP Pro SP1/NTFS

Win XP's compatibility mode(? I don't know exact name)
"Windows 2000" has no effect.
Han san, I could re-create this bug at last with Mozilla-trunk on Win-2K(SP4).

On attachemnt of newly created or renamed Japanese name file on Win-2K, both
blank attachment name problem and garbage in attachment name problem occurred.
The reason why I could not recreate this bug was that I tested with Japanese
name file which was created on Win-Me and XCOPY'ed to NTFS drive of Win-2K.
So character code of file name in file system seems to relate to this bug -
unicode on Win-2K and Shift_JIS(probably) on Win-Me.

When file name is long (longer than 8.3 format), attachment name became blank,
and when saved as draft, attachemnt file name became binary data with no
encoding as you described.
After saving draft, attachement name was displayed as garbage string - "?"
surrounded by diamond shape. 

When file name is 8.3 format, attachment was displayed with partialy
garbage/partialy proper Japanese character file name, and when saved as draft,
file name was encoded with utf-8.
Format of encoding was proper but encoded data seemed to be invalid, and a part
of attachment name became garbage.
Not all file names were destroyed in 8.3 format file name cases.
For example(X,Y,Z denotes Japanese character in following sample) ;
 XYZ-1.txt -> XY was displayed properly. Z-1 portion became garbage.
 XYXY.txt  -> XYXY was displayed properly with Japanese chars.
 Seems to depend on character code in unicode.

First one is "not encoded" problem.
Second one is "encoded but invalid" problem.

Han san, will your patch resolve both problems?
Wada-san, I can't dare to say my patch is safe for all the cases you mentioned.
It may or may not. What the patch did is simply reverting the code flow back to
pre-bug 206252 state.

Jungshik, I think you can answer all of these questions. Would you review the
patch, or take this bug if you don't like it? (I saw horde of bugs related with
nsFile* and it may be better this bug will be fixed on a par with other bugs.)
This bug is quite easily noticable to CJK users and should be fixed ASAP.
Keywords: intl
jshin, do you have thoughts on the proposed patch to fix this? Looks like it
involves reverting back some previous changes.
Assignee: sspitzer → mscott
Severity: normal → major
Target Milestone: --- → mozilla1.7alpha
The patch seems to be fine. For an unknown reason, I haven't received any bug
mail from this bug. Without mscott's comment in bug 206252, I wouldn't have
known about this bug. It's very unfortunate because I could have fixed this in
time for 1.6 if I had known earlier. I was misled to believe GetLeafName returns
filename in UTF-8   . (I use UTF-8 locale so that I overlooked that it returns
leafname in the filesystem encoding, which is not UTF-8 on Windows). Anyway, I'm
sorry for the mistake.
Attachment #138829 - Flags: superreview+
Attachment #138829 - Flags: review?(jshin)
Attached patch updated patchSplinter Review
basically the same as attachment 138829 [details] [diff] [review] except for a couple of changes in the
name of a variable and the 4th argument to ConvertToUTF8String (skipCheck
should be PR_FALSE because 7bit non-ASCII character encodings like ISO-2022-JP
is NOT used on file system). BTW, this bug is specific to Windows. Unix doesn't
care as Issac suspected. MacOSX is not affected, either.
Assignee: mscott → jshin
Attachment #138829 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #140435 - Flags: superreview+
Blocks: 230700
Comment on attachment 140435 [details] [diff] [review]
updated patch

asking Neil for r because he's more likely to be up than David :-)

I've just tested it on Win2k (not that Issac's test is not enough but that I
just wanna be sure because I changed his original patch a little)
Attachment #140435 - Flags: review?(neil.parkwaycc.co.uk)
Re: comment #3
> Following is an example of ISO-2022-JP/Base 64 encoding for attachment.
>  name="=?ISO-2022-JP?B?GyRCMlAkTkQ7GyhCLkpQRw==?="

Actually, for non-ascii parameter values, you can't use RFC 2047-style encoding
but you have to use RFC 2231-style encoding. See bug 193439. However, most mail
clients don't understand RFC 2231 (Mozilla/TB, Pine and Mutt do, but I suspect
MS OE and most web mail programs can't). So, we may have to compromise.. Anyway,
it's off topic here.
Comment on attachment 140435 [details] [diff] [review]
updated patch

Sorry, but I don't understand this well enough.
This is what I'm expecting the code to do (but, in JS...)
function AttachmemntPrettyName(urlstr, charset){
  var ioService = GetIOService();
  var uri = ioService.newURI(urlstr, charset, null);
  if (uri.schemeIs("file"))
    return uri.QueryInterface(nsIURL).fileName;
  if (uri.schemeIs("http"))
    return uri.spec.substr(7);
  return uri.spec;
}
Surely it can't be that difficult to do this in C++?
Attachment #140435 - Flags: review?(neil.parkwaycc.co.uk)
That's what it's doing although it's using deprecated nsFileSpec. I can rewrite
the wholething with nsIFile, but Scott may prefer a quick fix. Besides, we need
to fix this on 1.6branch, for which a minimal patch may be better.
I checked a version of this patch into the M4 branch for thunderbird 0.5
This is an update with nsIFile, the first step toward the total elimination of
nsFileSpec in mailnews/.:-)

Thanks, mscott. I was about to ask you to check in attachment 140435 [details] [diff] [review] for TB 0.5
when I got a bug mail that TB 0.5 had been closed in another bug. Somehow, I'm
not getting bug mails from this bug. For 1.6.1, I guess we can check in
attachment 140435 [details] [diff] [review] later.
Comment on attachment 140940 [details] [diff] [review]
update for the trunk

asking for r/sr.

>Index: mailnews/compose/src/nsMsgCompose.cpp

>@@ -1001,47 +998,46 @@ NS_IMETHODIMP nsMsgCompose::SendMsg(MSG_

This chunk is for another bug and has to be ignored.
Attachment #140940 - Flags: superreview?(mscott)
Attachment #140940 - Flags: review?(bienvenu)
Attachment #140940 - Flags: review?(bienvenu) → review+
Attachment #140940 - Flags: superreview?(mscott) → superreview+
Attachment #138829 - Flags: review?(jshin)
the latest patch checked in for this bug (on 2/9) caused a regression for
attaching files. Now normal files don't always attach normally. At a minimum,
the first file attached does not work. Sometimes attach, delete, attach works.
Either way, the latest patch applied for this bug is a regression
New regression mentioned in comment 26 is bug 233785. Adding dependency.
Depends on: 233785
Attachment names are no longer showing up in thunderbird and apparently in
seamonkey too (compose window). I think this checkin may be the cause. Can you
help look into this jshin?
jshin, this patch changes the string returned by AttachmentPrettyName from UCS2
to UTF8, but none of the JS callers were updated. Hence the empty file names I
think.
I backed out this checkin to fix the regressions it caused. Feel free to check
it back in again when the JS callers are updated to expect utf8 instead of ucs2.

The typo is esc_SkipControl || esc_AlwaysCopy
It needs to be bitwise or, not logical or!
mscott, js callers see no difference between AString and AUTF8String
getting late for 1.7a  - lets try for this in 1.7b
Flags: blocking1.7a? → blocking1.7a-
I'm very sorry for the typo. 

Yes, Neil is right on. As cbie wrote, js caller doesn't care whether it's 
AUTF8String or AString. Xpconnect does the right thing for both cases.

I'll check this in again once 1.7beta cycle begins with the typo fixed.

Anyway, something very strange is going on with bugmail. I have never received 
an email from this bug and bug 233785. 
I'm the assignee and also on the CC list. That may have triggerred an bug(?) in 
bugzilla. I'm now removing my name from the CC. 
fix checked into the trunk. thanks all.

I'll ask for  1.6 approval before 1.6.1 is released.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
*** Bug 232253 has been marked as a duplicate of this bug. ***
*** Bug 240002 has been marked as a duplicate of this bug. ***
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: