Last Comment Bug 274264 - Japanese attachment file name garbled, after fix for Bug 243504 and Bug 263462
: Japanese attachment file name garbled, after fix for Bug 243504 and Bug 263462
Mozilla Japan wants to fix on 1.8 branch
: fixed-aviary1.0, fixed1.7.5, fixed1.8, intl, regression
Product: MailNews Core
Classification: Components
Component: Composition (show other bugs)
: Trunk
: x86 Windows XP
: P1 major with 2 votes (vote)
: mozilla1.8rc1
Assigned To: Masayuki Nakano [:masayuki] (Mozilla Japan)
: 273109 (view as bug list)
Depends on:
Blocks: 263462 278727 285073 309335
  Show dependency treegraph
Reported: 2004-12-12 01:59 PST by Hidehiro Kozawa
Modified: 2008-07-31 01:22 PDT (History)
19 users (show)
asa: blocking1.7.5+
asa: blocking1.8a6+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---

patch (2.08 KB, patch)
2005-01-09 17:21 PST, Jungshik Shin
jshin1987: review+
darin.moz: superreview-
Details | Diff | Splinter Review
updated patch (1.44 KB, patch)
2005-01-11 00:35 PST, Jungshik Shin
darin.moz: superreview+
asa: approval1.8a6+
Details | Diff | Splinter Review
Patch rv1.0 (3.66 KB, patch)
2005-10-09 10:38 PDT, Masayuki Nakano [:masayuki] (Mozilla Japan)
masayuki: review-
Details | Diff | Splinter Review
Patch rv1.1 (4.02 KB, patch)
2005-10-14 20:53 PDT, Masayuki Nakano [:masayuki] (Mozilla Japan)
masayuki: review-
Details | Diff | Splinter Review
Patch rv1.1 (3.53 KB, patch)
2005-10-14 20:56 PDT, Masayuki Nakano [:masayuki] (Mozilla Japan)
jshin1987: review+
mozilla: superreview+
asa: approval1.8rc1+
Details | Diff | Splinter Review

Description Hidehiro Kozawa 2004-12-12 01:59:15 PST
Bug 273109 for Suite.
Japanese Localization team think that this bug is blocker for Japanese 1.7.5.
Comment 1 Hidehiro Kozawa 2004-12-13 16:55:24 PST
If this bug cannot stop 1.7.5 thread 'Boycott Thunderbird 1.0' become waste of time.

Is to backout not easy?

All Thunderbird, Trunk and 1.7branch has same patch.
Why do the bug make blocker only Thunderbird?
Why nobody wonder whether this bug cause or not?
too late to jerry.

Comment 2 WADA 2004-12-14 07:23:27 PST
Add to list in Bug 274366 (regressions meta-bug)
Comment 3 :Gavin Sharp [email:] 2004-12-14 14:25:56 PST

Bug 274366 is for regressions due to the upgrade, it has
nothing to do with any of the products also on b.m.o.
Comment 4 Asa Dotzler [:asa] 2004-12-17 10:52:16 PST
Hidehiro Kozawa, the patch was backed out last night so this should be fixed in
today's builds. Can you verify?
Comment 5 Asa Dotzler [:asa] 2004-12-17 10:53:53 PST
oops, reopening for trunk fixing too. this has been fixed on the 1.7 branch (we
believe). Please verify there. Thanks.
Comment 6 Hidehiro Kozawa 2004-12-17 17:10:16 PST
17branch becomes better in Mozilla/5.0 (Windows; U; Windows NT 5.1; ja-JP;
rv:1.7.5) Gecko/20041217.
not trunk.

not all fixed in 17branch yet. see detail bug 273109.
17branch is same status with thunderbird now.
Comment 7 Asa Dotzler [:asa] 2005-01-04 09:40:39 PST
jshin, can you look at this?
Comment 8 Asa Dotzler [:asa] 2005-01-08 14:47:27 PST
Oops, fixing assignee. jshin, we're getting very close to the 1.8a6 release. Can
you look into quickly?
Comment 9 Jungshik Shin 2005-01-09 16:28:51 PST
Sorry I'm looking at this now. A patch is available in bug 273109
Comment 10 Jungshik Shin 2005-01-09 17:21:49 PST
Created attachment 170776 [details] [diff] [review]

This is the patcy Ahn Dal-Soo in bug 273109 (with a little modification by me)

asking for a1.8a6 as well to expedite things.
Comment 11 Asa Dotzler [:asa] 2005-01-09 17:27:43 PST
Comment on attachment 170776 [details] [diff] [review]

a=asa pending reviews.
Comment 12 Darin Fisher 2005-01-10 14:42:59 PST
Comment on attachment 170776 [details] [diff] [review]

>Index: intl/uconv/src/nsUTF8ConverterService.cpp


did you mean to add these extra newlines?  it is usually not a good
idea to add random newlines to source files.  or was this intentional?

>   // NS_UnescapeURL does not fill up unescapedSpec unless there's at least 
>   // one character to unescape.
>+  NS_UnescapeURL(PromiseFlatCString(aSpec).get(), aSpec.Length(), 
>+                 esc_AlwaysCopy, unescapedSpec);

The comment is now incorrect since you are passing esc_AlwaysCopy.  You
can also more efficiently write this code like so:

    NS_UnescapeURL(PromiseFlatCString(aSpec), esc_AlwaysCopy, unescapedSpec);

>+  // convert to UTF-8 unless ASCII only or UTF-8
>+  if (!IsASCII(unescapedSpec) && !IsUTF8(unescapedSpec)) {

It seems to me that you only need to check |if (!IsUTF8(...))| since that
covers the not ascii case as well.

>+    nsCAutoString UTF8Spec; 
>+    nsresult rv = ToUTF8(unescapedSpec, aCharset, UTF8Spec);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    unescapedSpec = UTF8Spec;
>   }

From a quick inspection of the source code for ToUTF8, which is defined
in this very same file, it seems that you can pass |unescapedSpec| as
both the input and output arguments to this function.  The point is that
that allows you to perhaps reuse the same buffer, which could help.  So,
you could write those 4 lines as:

      nsresult rv = ToUTF8(unescapedSpec, aCharset, unescapedSpec);
      NS_ENSURE_SUCCESS(rv, rv);

>+  NS_EscapeURL(unescapedSpec.get(), unescapedSpec.Length(),
>+               esc_FilePath | esc_OnlyASCII | esc_AlwaysCopy, aUTF8Spec);

This can also be written like this:

    NS_EscapeURL(unescapedSpec, esc_FilePath | esc_OnlyASCII | esc_AlwaysCopy,

Please submit a revised patch, thanks!
Comment 13 Jungshik Shin 2005-01-11 00:35:47 PST
Created attachment 170900 [details] [diff] [review]
updated patch

thanks. I should  have been more 'vigilant'...
Comment 14 Darin Fisher 2005-01-11 10:45:08 PST
Comment on attachment 170900 [details] [diff] [review]
updated patch

Comment 15 Asa Dotzler [:asa] 2005-01-11 10:47:33 PST
Comment on attachment 170900 [details] [diff] [review]
updated patch

a=asa for checkin to 1.8a6. time is short so please land quickly. Thanks.
Comment 16 Jungshik Shin 2005-01-11 15:12:18 PST
fix checked in
Comment 17 Jungshik Shin 2005-01-12 02:06:45 PST
Comment on attachment 170900 [details] [diff] [review]
updated patch

risk : pretty low
platforms : all
test : done on Windows and Linux

The patch for bug 263462 was backed out from TB 1.0 and suite 1.7.x branches
because of this bug. To land that patch again, we need to fix this on branches.
Comment 18 Jungshik Shin 2005-01-13 02:36:42 PST
*** Bug 273109 has been marked as a duplicate of this bug. ***
Comment 19 Scott MacGregor 2005-01-13 11:38:21 PST
for grins i put this on the thunderbuird aviary branch too.
Comment 20 Hiro 2005-01-14 18:34:26 PST
(In reply to comment #19)
> for grins i put this on the thunderbuird aviary branch too.

Because the patch had been reflected in branch, it tried. 

The attachment Japanese file name is correct.
But, when the semicolon(;) is included in the file name, the file name is cut. 
However, the problem is not in numbersign(#). 

Mac OS X 10.3.7
version 1.0 (20050114)
Comment 21 Ahn Dal-Soo 2005-01-14 19:55:05 PST
In EscapeChars array:
EscapeChars[0x24] for "#" is 512 which is 1000000000 in binary and represents
EscapeChars[0x3B] for ";" is 912 which is 1110010000 in binary and represents
esc_Ref | esc_Query | esc_Param | esc_Directory.

When we pass esc_SkipControl | esc_AlwaysCopy | esc_FilePath to NS_EscapeURL(),
it does not escape ";".
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2005-01-21 13:06:53 PST
This probably broke mailto links.  See bug 278727.
Comment 23 Ahn Dal-Soo 2005-01-21 21:52:36 PST
Wrote a comment in bug 278727 and requesting backing out this patch. I think we
have to write, to solve this issue, some code pretty specific to ShiftJIS.
Comment 24 Jungshik Shin 2005-01-21 22:18:59 PST
Well, it's not just SJIS(Windows-932) that has  the problem but also
Big5(Windows-950) , UHC(Windows-949) and GBK(Windows-936)/GB18030 have the same
problem. So, if we have to write an encoding-specific hack, we have to take care
of all of them. GB18030 is worst because it's up to 4bytes long. It's possible....
Ack.. I hate all these encodings that violate ISO 2022 and 'infringe upon' the
ASCII range. 

Perhaps, we have to add another flag to indicate whether the encoding is
multibyte and non-ASCII-preserving and special case it in NS_UnescapeUR when
esc_NonASCII is on. 
It's getting uglier...
Comment 25 Ahn Dal-Soo 2005-01-21 22:33:31 PST
Yep, writing encoding-specific patches is not a quick hack...
Comment 26 Boris Zbarsky [:bz] (still a bit busy) 2005-01-23 15:14:52 PST
I've backed this out from the trunk to fix bug 278727.  mscott, you may want to
do the same with aviary branch.
Comment 27 Ahn Dal-Soo 2005-10-08 15:08:53 PDT
I noticed that Thunderbird 1.5 Beta 2 has this issue.
Comment 28 WADA 2005-10-09 03:20:59 PDT
(1) "blocking1.7.5" was approved by Asa, but 1.7.5 to 1.7.12 were already released.
(2) "blocking1.7.13" is not set yet.
Does it mean Thunderbird 1.5 will be released without fixing of this bug,
ignoring "blocking1.8a6" approved by Asa?
Comment 29 WADA 2005-10-09 04:01:27 PDT
Jshin, can you fix this bug easily before release of Thunderbird 1.5, without
breaking bug 278727("?" in mailto:)/bug 243504("#" in filename of attacment),
with satisfaying your comment #24?
If not, many many Japanese Tb 1.5.0 users will complain at many many forums,
once Japanese Tb 1.5.0 will be released...
Comment 30 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-09 10:38:38 PDT
Created attachment 199012 [details] [diff] [review]
Patch rv1.0

I think that we should not use nsIURI for getting the leaf name. We should use
nsIFile instaed of one. This way is very simple.
Comment 31 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-09 10:42:07 PDT
*** Bug 273109 has been marked as a duplicate of this bug. ***
Comment 32 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-09 16:30:57 PDT
On Linux, this patch works fine too. I have tested on Windows. Of course, it
worked fine on Windows.
Comment 33 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-10 07:02:38 PDT
A Japanese tester said, this patch works fine on Mac.
Comment 34 Asa Dotzler [:asa] 2005-10-10 16:56:42 PDT
jshin, david, can you tell us what you think of this patch and any risk in
taking it? Thank.
Comment 35 David :Bienvenu 2005-10-10 17:36:33 PDT
cc'ing bz - I'll try the patch locally to see if it regresses anything obvious.
Comment 36 Boris Zbarsky [:bz] (still a bit busy) 2005-10-10 18:31:29 PDT
So.. where is this string coming from, and what encoding is it in?
Comment 37 Boris Zbarsky [:bz] (still a bit busy) 2005-10-10 18:32:52 PDT
I guess the answer to that last one is "ascii, somehow", given that it's
|string| in the IDL.  Which means percent encoding from some charset.  Where do
we actually get the string from, though?
Comment 38 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-10 21:00:58 PDT
I think that file:// URL should be encoded by local system charset that is used
by |net_GetURLSpecFromFile|.
Comment 39 Scott MacGregor 2005-10-12 11:22:21 PDT
time is about out for us on this. 
Comment 40 David :Bienvenu 2005-10-12 11:36:06 PDT
I tried this patch and it didn't regress anything obvious on windows, and
handled non-7-bit ascii characters fine, and characters that need to be escaped,
etc. But jshin is really the expert here.
Comment 41 Jungshik Shin 2005-10-13 06:20:12 PDT
Masayuki seems to have come up with a clever alternative. Thanks ! 
I'll add 'r' flag after making sure that what his code expects to get is what it
actually gets (bz's question). 
Comment 42 Scott MacGregor 2005-10-13 10:27:44 PDT
jshin, it'd be great if you can verify this today so we can get it into the
trunk. That will improve the chances that we may take this for tbird 1.5.
Comment 43 Scott MacGregor 2005-10-14 09:17:34 PDT
jshin, this bug just isn't going to happen for 1.5 if we can't get some traction
on this patch ASAP. It needs bake time on the trunk and we don't have much time
Comment 44 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-14 10:21:41 PDT
Now, we cannot attach the file that has non-ASCII name(Trunk only). Because bug
309335 is fixed. We cannot use UTF-8 URL for file://. But my patch may fix this
problem too.
Comment 45 Scott MacGregor 2005-10-14 10:29:32 PDT
Masayuki, can you verify for us that your patch does indeed allow you to attach
file names with non ascii characters with Bug 309335?
Comment 46 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-14 10:34:19 PDT
I'm building the Tb with my patch. Please wait.
Comment 47 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-14 11:52:05 PDT
O.K. This patch fixes the regression of bug 309335.
Comment 48 Christian :Biesinger (don't email me, ping me on IRC) 2005-10-14 13:26:49 PDT
why not use NS_GetFileFromURLSpec?
Comment 49 Darin Fisher 2005-10-14 17:15:45 PDT
> why not use NS_GetFileFromURLSpec?

Yes, prefer NS_GetFileFromURLSpec as it helps to simplify the code.
Comment 50 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-14 20:53:37 PDT
Created attachment 199626 [details] [diff] [review]
Patch rv1.1

Yeah, you're right. Thanks!
Comment 51 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-14 20:56:35 PDT
Created attachment 199627 [details] [diff] [review]
Patch rv1.1
Comment 52 Jungshik Shin 2005-10-16 07:59:11 PDT
Comment on attachment 199627 [details] [diff] [review]
Patch rv1.1

sorry for the delay. here goes my r. I tracked down its callers and concluded
that it should be safe.
Comment 53 Jungshik Shin 2005-10-16 08:00:31 PDT
Comment on attachment 199627 [details] [diff] [review]
Patch rv1.1

asking for a to 1.8rc1. 
This is a simple enough and safe patch and non-English users would be for sure
happy with this patch.
Comment 54 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-16 08:04:30 PDT

checked-in to trunk.
Comment 55 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-16 08:06:28 PDT
Comment 56 Masayuki Nakano [:masayuki] (Mozilla Japan) 2005-10-16 10:52:14 PDT
checked-in to 1.8 Branch too.

Note You need to log in before you can comment on or make changes to this bug.