Closed Bug 264071 Opened 16 years ago Closed 16 years ago

M18a4/TB08 cannot handle mail folders with non-Latin1 (non-ASCII) characters

Categories

(MailNews Core :: Backend, defect)

x86
All
defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: World, Assigned: jshin1987)

References

Details

(Keywords: fixed-aviary1.0, intl, regression)

Attachments

(14 files, 2 obsolete files)

42.08 KB, image/png
1.20 KB, text/plain
19.70 KB, image/gif
576 bytes, text/plain
2.85 KB, image/png
9.00 KB, patch (mscott: review+, Bienvenu: superreview+)
7.63 KB, patch
9.64 KB, patch
1.02 KB, text/plain
9.68 KB, patch
12.93 KB, patch (jshin1987: review+, Bienvenu: superreview+)
7.40 KB, patch
3.66 KB, text/plain
78.50 KB, image/jpeg
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a5) Gecko/20041009
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8a5) Gecko/20041009

When I created a mail folder with a Japanese name on Mozilla 1.8a4 and
Thunderbird 0.8 on Win-2K (NTFS drive),
the following three problems occurred.

(Problem-1) Duplicate folder appears after restart.
(Problem-2) Deletion of parent folder fails with error of "File already exist".
            This is probably problem of Thunderbird Bug 259321.
(Problem-3) Re-definition of same folder name failed with error of "File already
exist".
            This is probably problem of Thunderbird Bug 259317
            
These problems occur if and only if the Japanese folder name does not contain the
byte code of an illegal file name character (for example 0x5C) in the second byte
of a Japanese character.
This bug did not occur on Mozilla 1.8a3 and Thunderbird 0.7.3.

Test Scenario is as follows :

(Case-1) No illegal file name character byte
(1-1) Create 3 folders named "TEST-1", "TEST-2" and "TEST-3".
(1-2) Create a folder with a Japanese character(0x82A0 in Shift_JIS,
      first HIRAGANA character, pronounced "er")
      under "TEST-1", "TEST-2" and "TEST-3" folder.
      The following 3 files were created
      under the TEST-1.sbd, TEST-2.sbd and TEST-3.sbd directories.
        bd0bf789
        bd0bf789.msf
        <0x82A0> 
(1-3) Delete the folder under "TEST-1"
      The following 4 files were created under the Trash.sbd directory.
        bd0bf789
        bd0bf789.msf
        <0x82A0>
        <0x82A0>.msf <= Newly Created  
(1-4) Check the TEST-1.sbd directory
      The following file still remained in the TEST-1.sbd directory
        <0x82A0>
(1-5) Try to define same folder name under "TEST-1" folder
      => Definition failure ("File already exist" error)
(1-6) Empty Trash => Trash was cleared successfully.
(1-7) Try to delete "TEST-2" folder
      => Deletion failure ("File already exist" error)
(1-8) Shutdown and Restart Mozilla
      The following 4 files were found under the TEST-3.sbd directory.
        bd0bf789
        bd0bf789.msf
        <0x82A0>
        <0x82A0>.msf <= Newly Created by .msf recreation on restart
      Next 2 folders appear under "TEST-3" folder
        <0x82A0> (Folder location shows file of bd0bf789) 
                 Set of bd0bf789 and bd0bf789.msf
        <0x82A0> (Folder location shows garbage file name => Japanese file)
                 Set of <0x82A0> and <0x82A0>.msf

Status of (1-2) causes duplicate folders named <0x82A0> after restart ; 
 - a set of  bd0bf789 and bd0bf789.msf (Japanese folder name is saved in .msf)
 - a set of  <0x82A0> and <0x82A0>.msf(recreated on restart) 
Status of (1-4) causes "already exists" on re-define.
 
(Case-2) Illegal file name character at second byte
(2-1) Create a folder named "TEST-4"
(2-2) Create a folder with a Japanese character(0x835C in Shift_JIS,
      15-th KATAKANA character, pronounced "so")  under "TEST-4" folder
      The following 2 files were created under the TEST-4.sbd directory.
        3db4214a
        3db4214a.msf
(2-3) Delete this folder
      The following 2 files were created under the Trash.sbd directory.
        3db4214a
        3db4214a.msf
(2-4) Check TEST-4.sbd
      No file was found
 
(Case-2) is the normal result when an illegal file name character (0x5C, \) is the
second byte of a double-byte character. (See Bug 117385)
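
To illustrate why the second byte matters, here is a minimal sketch (not taken from the Mozilla source) comparing a naive byte scan with a scan that knows about Shift_JIS lead bytes. The function names are hypothetical, and the lead-byte ranges assume the Windows code page 932 variant of Shift_JIS.

```cpp
#include <cstddef>
#include <string>

// True if the byte can start a two-byte character
// (assumption: CP932 lead-byte ranges 0x81-0x9F and 0xE0-0xFC).
bool isSjisLeadByte(unsigned char b) {
  return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Naive scan: every 0x5C byte is treated as a backslash.
bool naiveHasBackslash(const std::string& bytes) {
  return bytes.find('\\') != std::string::npos;
}

// Lead-byte-aware scan: the trail byte of each two-byte character is
// skipped, so a 0x5C trail byte is not mistaken for a path separator.
bool sjisHasBackslash(const std::string& bytes) {
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    unsigned char b = static_cast<unsigned char>(bytes[i]);
    if (isSjisLeadByte(b)) {
      ++i;  // skip the trail byte of this character
    } else if (b == 0x5C) {
      return true;
    }
  }
  return false;
}
```

With the folder name 0x835C from (Case-2), the naive scan reports a backslash while the lead-byte-aware scan does not; for 0x82A0 from (Case-1) neither scan reports one.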

Main cause of problems is creation of <0x82A0> file(file name of Japanese
character) at step (1-2) of (Case-1).
Manual deletion of <0x82A0> file just after folder creation resolved all
following problems(duped folders after restart/undeletable folder/redefinition
failure).
This result indicates that set of bd0bf789/bd0bf789.msf itself is a valid folder
file set.
(Japanese folder name is saved in .msf. Same as illegal file name character case)  

(Problem-4) Difficulty in manual folder recovery.
When no illegal file character byte is contained in Shift_JIS code of Japanese
character,
former Mozilla created file set of JAPANESE-chars and JAPANESE-chars.msf for
Japanese folder name.
However, current Mozilla created "hexa-string" and "hexa-string".msf files even
though no illegal character is included.
This means that the latest Mozilla treats all Japanese characters as illegal file
name characters.
"Treating all Japanese characters as illegal file name characters"
is very inconvenient for all Japanese people.
It is too difficult to know "Japanese mail folder name" from hexa-string mail
folder files.
This causes difficulty in manual recovery of mail folder files.
Please note that my currently used Japanese-named folders are sets of
JAPANESE-chars and JAPANESE-chars.msf files, and are accessed with no error
even by the latest Mozilla.

Easiest workaround - "Do not use non-ASCII" on folder creation.
 (1) Create folder with non-ASCII characters only.
 (2) Rename the folder from ASCII name to non-ASCII name.

Problem did not exist on Mozilla Suite 1.8.a3 Release build.
(Build ID of Win-32 ZIP build = 2004-08-17-07)
So this problem was produced after 2004/8/17.

The following is Bonsai's report of changes from 2004/08/17 00:00:00 to now 
 on mozilla/mailnews/local/src/nsLocalMailFolder.cpp

2004-09-15 16:04	bienvenu%nventure.com 	1.477 	10/1 	
	fix downloading partial pop3 messages for offline
	when they've been filtered into other folders, sr=mscott 259649
2004-08-30 09:57	bienvenu%nventure.com 	1.476 	8/0  	
	fix 257104, empty local trash doesn't set counts to zero, sr=mscott
2004-08-23 10:55	bienvenu%nventure.com 	1.475 	1/1  	
	fix 256332 compact breaks rss folders, also fix runtime warnings
	in account manager sr=mscott
2004-08-18 16:46	timeless%mozdev.org 	1.474 	1/1 	
	Bug 41929 - movemail. fixing build bustage
2004-08-18 16:11	neil%parkwaycc.co.uk 	1.473 	34/53  	
	Bug 41929 Allow multiple accounts with the same server and username
	if they have different port numbers
	p=kteuscher@myrealbox.com r=bienvenu sr=me
2004-08-18 14:54	scott%scott-macgregor.org 	1.472 	4/2	
	Bug #219586-->Eudora, Outlook mailboxes with "/" in name fail to import.
	Unable to create local folders with forward slashes
	in the name on Windows. sr=bienvenu

I suspect a regression from the patch for Bug 219586. 

This bug can be a blocker of Thunderbird 1.0.

Reproducible: Always
Steps to Reproduce:
Correction of workaround(sorry for spam).

Easiest workaround - "Do not use non-ASCII" on folder creation.
 (1) Create folder with ASCII characters only.
 (2) Rename the folder from ASCII name to non-ASCII name.

(Invalid)
> Easiest workaround - "Do not use non-ASCI" on folder creation.
> (1) Create folder with *non-*ASCII characters only.
> (2) Rename the folder from ASCII name to non-ASCII name.
Mozilla/5.0 (Windows; U; Windows NT 5.1; cs-CZ; rv:1.8a4) Gecko/20040927

I confirm this bug. Czech version has same problem.

> When I created a mail folder with Japanese name on Moziial 1.8.a4 and
> Thunderbird 0.8 on Win-2K(NTFS drive),
> following three problems occurred.
> 
> (Problem-1) Duplicate folder appears after restart.

This is a problem not only for Japanese characters, but probably for all non-ASCII
characters.

Can confirm with the Czech TB0.8.

But Mozilla 1.7.3
Mozilla/5.0 (Windows; U; Windows NT 5.0; cs-CZ; rv:1.7.3) Gecko/20040910 seems
unaffected.

Maybe it is regression.
Flags: blocking-aviary1.0?
Marking NEW per previous comments.

Bienvenu, could you please look at Wada's analysis in comment #0?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: intl, regression
This problem occurs not only when the mailbox is created, but every time the
problematic folder is copied to another folder. Each time it is copied, it
is duplicated once more.
I believe that Scott's patch for Bug 219586 has completely resolved problem of
Bug 140212 ("/" in folder name), if the folder is newly created, although Scott
has not set it "FIXED".
  For folder name of "/" :
   Thunderbird 0.8 created 00000045 and 00000045.msf for folder of "/".
   (Excess file, file named "/", was not created.)
   This indicates that the folder name was correctly recognized as
   containing an illegal file name character.
   Using this folder has no problem on Thunderbird 0.8.  
   (But renaming to "/" seems to still have some problems. Bug 264467)
I absolutely agree with Scott's idea for resolving the "/" problem.
Therefore, I think best way to resolve this bug is ;
 (1) Backout patch for Bug 219586 => This will easily resolve this bug.
 (2) Enhance Scott's logic of Bug 219586 for other special characters.
     "/" , "#" , "?" (URL delimiters) and "."(first dot and last dot only).
     (See meta Bug 124287)  
 (3) Restrict this new logic only on folder name which contains
     above special characters.
I believe this will resolve many problems listed in Bug 124287.   

However, if the illegal (real or pseudo) file character check is based on Shift_JIS
or something else, special care is required for double-byte characters;
 - the second byte of a double-byte character is not a real "illegal character".

In addition, please re-design the real file name for folder names with illegal file
characters (real illegal or pseudo illegal).
For folder name of "A/B", Thunderbird 0.8 created file set of e439c4f2 /
e439c4f2.msf (on Win-2K/NTFS).
This can be A00000045B/A00000045B.msf or A_00000045_B/A_00000045_B.msf.
(For "/", 00000045/00000045.msf are used by Thunderbird 0.8.)
If only the illegal (or pseudo illegal) character is converted to a HEXA string,
manual recovery of folder name from file name is not so difficult.
I believe that this will also reduce bugs which says "wrong file name for folder
was created".
Another idea for solution of both this bug and problems on special characters :
(1) Consider any folder name to be "Illegal file name"
(2) Change conversion logic from folder name to file name
    - Keep original characters in folder name as many as possible
      including character position.
(3) Introduce a <filename>.xxx file which contains the folder name in UTF-8 

Because Scott's idea completely resolved "/" problem, and because set of HEXA
string files itself in (Case-1) of Comment #0 worked very well, and because
(Case-2) of Comment #0 has no problem,
I think considering all folder name as "includes illegal file name character"
will work very well.
This will resolve this bug's problem and many special character problems.
This can be applicable to any character coding.
As a result of this change, the ".msf" file will always contain the folder name.

The problem with this approach is - "file name becomes HEXA string".
But this problem can be easily avoided by changing the "folder name->real file
name" conversion algorithm.
 - Keep original characters in folder name as many as possible.
If the conversion algorithm is appropriate, almost all file names will be the same
as the folder name.

However, if converted file name becomes HEXA string only, for example folder
name of "...///###???///###???///...", it is too difficult to know folder name
from file name.
The folder name in ".msf" is not human-readable.
This difficulty can be easily minimized by introducing a text file,
<filename>.xxx, which contains the folder name in UTF-8.
If it is UTF-8 text, many text editors can display it in readable glyphs.

Tweaking bug summary from:
  If non-ASCII(Japanese) folder name, Duplicate mail folder after  
  restart/Undeletable folder/Folder re-definition failure problems occur( Mozilla
  1.8a4/Thunderbird 0.8)
to:
  M18a4/TB08 cannot handle mail folders with non-ASCII characters
Summary: If non-ASCII(Japanese) folder name, Duplicate mail folder after restart/Undeletable folder/Folder re-definition failure problems occur( Mozilla 1.8a4/Thunderbird 0.8) → M18a4/TB08 cannot handle mail folders with non-ASCII characters
Firefox/Thunderbird 1.0 are going to be off the aviary branch based off of 1.7,
not 1.8
Flags: blocking-aviary1.0?
(In reply to comment #9)
> Firefox/Thunderbird 1.0 are going to be off the aviary branch based off of 1.7,
> not 1.8

Robert Parenton: This bug/regression is present in Thunderbird 0.8, which is on
Aviary branch.
Re-requesting blocking Aviary1.0.
Flags: blocking-aviary1.0?
was this problem present in 0.7 or was it new in .8?
Assignee: sspitzer → mscott
(In reply to comment #11)
> was this problem present in 0.7 or was it new in .8?

Read my comment #0.
This problem did not occur on the Mozilla 1.8a3 release build or the Thunderbird
0.7.3 release build.
But the problem occurred on the Mozilla 1.8a4 release build and the Thunderbird 0.8
release build.
putting on the plus list for now. This seems like a pretty bad regression if I
am properly understanding mwada. Can one of you post a screen shot showing what
a folder looks like using 0.7.x next to 0.8 so I can see them side by side?
Flags: blocking-aviary1.0? → blocking-aviary1.0+
> Can one of you post a screen shot showing what
> a folder looks like using 0.7.x next to 0.8 so I can see them side by side?

Only a quick look, which I posted to bug #253807 before I found it is a
separate problem. 

See the folder "Weblog*"
https://bugzilla.mozilla.org/attachment.cgi?id=159073&action=view
https://bugzilla.mozilla.org/attachment.cgi?id=159074&action=view

Working on the new screenshots.
Folder under "0.7.3" is created on Thunderbird 0.7.3.
Folder under "0.8" is created on Thunderbird 0.8.
This screen shot was taken after restart of Thunderbird 0.8.
DIR listing (in UTF-8, no BOM)
(In reply to comment #13)

Attachment 163395 and attachment 163396 are the screen shot and directory listing of
(Case-1) in my comment #0 (Folder name of first Japanese HIRAGANA character).

Since the directory listing is saved in UTF-8, you can probably recreate this bug by
"copy&paste" of the Japanese file name (non HEXA string) into the Folder Name field
when creating a mail folder, even on English MS Windows.
Here is a cumulative screenshot. Created folder angliètina (that is,
angličtina)

TB0.7.3 - dir listing OK, folders created - created files: angliètina,
angliètina.msf
TB0.8 - dir listing immediately after folder creation - created files:
53510e89, angliètina and 53510e89.msf
TB0.8 - dir listing after TB restart -> created additional angliètina.msf

Both folders have the name "angliètina" in TB (see screenshot). But both
folders angliètina and 53510e89 are correctly usable in TB, so the problem is
probably only during creating a new file (or copying it to another folder).
I just wanted to make sure we were on the same page for one thing.

Starting in 0.8 and higher, new folders you create with funky characters (non
ascii) now get converted into a general string like: abjkskwl, abjkskwl.msf ON
DISK. That is, the actual file names will no longer correspond to the actual
"pretty name" thunderbird shows in the UI. If that's the complaint, then that
isn't a bug.  There is a difference there.

But 0.8 should be showing you the right "pretty name" in the actual folder pane
and not the generic disk name string (abjkskwl). 

As I read Met's comments, this part seems wrong:
TB0.8 - dir listing immediately after folder creation - created files:
53510e89, angliètina and 53510e89.msf. In this case there should only be two new
files: 53510e89 and 53510e89.msf on disk. 

Using today's automated nightly (10/26), here is what I see:

1) Create a folder named: angliètina by pasting this text into the new folder
dialog.

2) Watch the folder get properly created in the folder pane with the right name.

3) Looked on disk and saw that the two files created for this folder were:
angli?tina and angli?tina.msf which looks correct.

4) Quit and restart and noticed that the folder still looks correct in the
folder pane. I didn't see an extra file get created with the wrong name.

Hey, any chance this bug was really just a dupe of Bug #264467? Were you guys
renaming folders to have non ascii characters? If it was renaming (versus
creating as a new folder) then I saw weird things such as an extra file getting
created before I checked in a fix for 264467. Wouldn't that be nice if this bug
was already fixed. 
Oh my God!
Mscott, you've killed my last workaround, renaming!
Mozilla Suite 2004102604 trunk build
 (Win-2K, ZIP build, your patch for bug 264467 is applied)
created both set of HEXA-string/HEXA-string.msf and
JAPANESE-Char/JAPANESE-Char.msf   
 when I renamed a folder "A" to Japanese-HIRAGANA character.
(In reply to comment #20)
> Using todays automated nightly (10/26), here is what I see:
> 
> 1) Create a folder named: angliètina by pasting this text into the new folder
> dialog.
> 
> 2) Watch the folder get properly created in the folder pane with the right name.
> 
> 3) Looked on disk and saw that the two files created for this folder were:
> angli?tina and angli?tina.msf which looks correct.
> 
> 4) Quit and restart and noticed that the folder still looks correct in the
> folder pane. I didn't see an extra file get created with the wrong name.

Mscott, the binary of "èt" was probably interpreted as US-ASCII (Code page 437, 850
etc.) by your OS.
The default character encoding of the OS is based on the locale setting.
For example, Shift_JIS (Code page 932 etc.) is used on Japanese MS Windows.
This can be avoided by using Unicode. 
Could you test the following scenario?
 (1) Open my attachment with character coding of "UTF-8"
     by a UTF-8 capable text editor.
     (Notepad of Win-2K/XP has capability to handle UTF-8)
 (2) "Copy&paste" the Japanese-HIRAGANA character in my directory listing.

> Hey, any chance this bug was really just a dupe of Bug #264467?
Apparently "NO".
(In reply to comment #19)

> Starting in 0.8 and higher, new foldes you create with funky characters (non
> ascii) not get converted into a general string like: abjkskwl, abjkskwl.msf ON
> DISK. That is the actual file names will no longer correspond to the actual
> "pretty name" thunderbird shows in the UI. If that's the complaint, then that
> isn't a bug.  There is a difference there.
> 
> But 0.8 should be showing you the right "pretty name" in the actual folder pane
> and not the generic disk name string (abjkskwl). 

I know that excess file on creation is the cause of duplicate folder problem and
HEXA-string file name itself is "SPEC" when illegal file name character is
included in folder name.
Therefore, I separated (Problem-1/2/3) and (Problem-4) in my comment #0 , and
discussed (Problem-4) separately in my comment #6 and comment #7 .
But I believe both (Problem-1/2/3) and (Problem-4) are regression produced by
same cause.
I think your patch forced HEXA-string file name even when no illegal file name
character is used in folder name, and this problem is also a regression.
This is the reason why I discussed both (Problem-1/2/3) and (Problem-4) in this bug.

However, 2 ways are possible to avoid (Problem-4), I think ;
 (1) Call safeFolderName.AssignWithConversion(newDiskName) 
     if and only if dangerous special characters are included (/#? and . etc). 
 (2) Change design of safeFolderName.AssignWithConversion.
I expected solution (1) initially(see my comment #6),
but if you say that the cause is fault of safeFolderName.AssignWithConversion
instead of fault of caller of this module,
I'll open separate bug(See my comment #7).
I used mwada's utf-8 file to copy and paste some characters into a new folder
dialog. As you can see from the screen shot, it created the folder with the
correct name. I restarted and still saw just the one folder with that name.

I then looked on disk and verified that only two files were created for this
folder:

d7f3c2be
and 
d7f3c2be.msf

So everything worked out okay with that particular test case. 

(thanks for your patience in helping me try to reproduce this)
ah i just got it to happen by trying to rename the folder instead of creating
the folder as a new folder! 
I take it back. I lied about seeing the problem. What I saw was the fact that my
trash folder already had a folder in it with the name of the folder I was
renaming. So we failed to actually remove the original folder name + .msf file
during the renaming process. That's why I saw two sets of files there. With an
empty trash, I'm back to not being able to see this. 
(In reply to comment #24)
> Created an attachment (id=163521)
> screen shot after creating the folder and restarting. Note just the one folder
> exists

> I used mwada's utf-8 file to copy and paste some characters into a new folder
> dialog. As you can see from the screen shot, it created the folder with the
> correct name. I restarted and still saw just the one folder with that name.

> I then looked on disk and verified that only two files were created for this
> folder:
> d7f3c2be
> and 
> d7f3c2be.msf

MScott, 

Did you save my directory listing to your local hard disk?
Did you open the saved file with a UTF-8 capable text editor?
Did you open the saved file with character encoding of UTF-8?

Your test is done on US-ASCII(Code page 437 or 850 base).
Do NOT view/copy&paste my attachment using Browser when re-creation test.
You have to do test using UTF-8.

When I saw the file with UTF-8 and Courier font by text editor, single Japanese
character portion(3bytes long) was displayed as "1200 ,  .msf"
  - 1200 is file size
  - The following single space is the field delimiter of the directory listing
  - 3bytes data of "," and following 2 spaces(Shown as if space but not 0x20)
    is a Japanese character
  - ".msf" is the extension

If you open file with character coding of UTF-8, and if you select "," portion
on text editor, "," and following 2 spaces(3bytes long, just before ".msf") will
be highlighted.
Then "Copy&Paste" these 3 bytes to the folder name field on creation or renaming.

If you do not have Japanese font and unviewable Japanese character causes
difficulty in test, please install "Arial Unicode MS".
"Arial Unicode MS" can be obtained from Microsoft's site with no charge if you
have licence of MS Windows.
This is very large font(20MB) but convenient because this font has many many
characters - Arabic, Chinese, Japanese,  Korean, ... etc.
>Using todays automated nightly (10/26), here is what I see:

I have just tested
http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-0.9/thunderbird-win32.zip
"26-Oct-2004 11:41". But there is still the same problem.
At least I can reproduce the problem 1 in comment #0. 

The root cause of this and related problems is the improper use of
'AssignWithConversion' in patches for bug 264467 and bug 257986. Using
'AssignWithConversion' to round-trip Unicode strings between Mozilla's nsString
and nsCString does NOT work for any character above U+0100. A ('the') proper fix
would be to fix 'NS_MsgHashIfNecessary' to accept nsString instead of nsCString.
By working with Unicode strings instead of C strings (of many different
encodings), we can avoid many problems (among which is mistaking the second
byte of a double-byte character for 'slash'). 

http://lxr.mozilla.org/seamonkey/ident?i=NS_MsgHashIfNecessary
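
The round-trip failure described above can be sketched as follows. This is a simplified model, not the actual Mozilla string code: it assumes the lossy conversion keeps only the low 8 bits of each 16-bit unit, and all function names are hypothetical.

```cpp
#include <string>

// Simplified model of a lossy 16-bit to 8-bit conversion: keep only the
// low byte of each UTF-16 code unit (assumption for illustration).
std::string narrowLossy(const std::u16string& wide) {
  std::string narrow;
  for (char16_t unit : wide) {
    narrow.push_back(static_cast<char>(unit & 0xFF));
  }
  return narrow;
}

// Widen each byte back to a 16-bit unit by zero extension.
std::u16string widen(const std::string& narrow) {
  std::u16string wide;
  for (char c : narrow) {
    wide.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
  }
  return wide;
}

// True if the name is unchanged after narrowing and widening again.
bool survivesRoundTrip(const std::u16string& name) {
  return widen(narrowLossy(name)) == name;
}
```

Under this model, ASCII and Latin-1 names survive the round trip, while names containing HIRAGANA (U+3042) or the Czech letter U+010D do not, which matches the symptoms reported for Japanese and Czech folder names.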
Sorry for bug spam. Change the platform to all
Assignee: mscott → jshin
OS: Windows 2000 → All
Summary: M18a4/TB08 cannot handle mail folders with non-ASCII characters → M18a4/TB08 cannot handle mail folders with non-Latin1 (non-ASCII) characters
Attached patch: patch (obsolete)
This fixes bug 257689, bug 264467 and bug 21586 in such a way to make it work
for non-Latin-1 characters as well.
(In reply to comment #31)

> This fixes bug 257689, bug 264467 and bug 21586 in such a way to make it work

Oops. It's bug 219586 (not bug 21586)
Status: NEW → ASSIGNED
Comment on attachment 163571
patch

>+    if (str.Length() > MAX_LEN) 
>+    {
>+      name.Truncate(MAX_LEN - 8); 
>+      PR_snprintf(hashedname + MAX_LEN - 8, 9, "%08lx",
>+                (unsigned long) StringHash(str));

The above line should read:

PR_snprintf(hashedname, 9, "%08lx", (unsigned long) StringHash(str));
Attachment #163571 - Flags: superreview?(bienvenu)
Attachment #163571 - Flags: review?(mscott)
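
For readers without the patch at hand, the "hash if necessary" idea on Unicode strings might look roughly like the sketch below. It is not the patch itself: the hash is a stand-in (an FNV-1a-style hash over UTF-16 units), and the length limit and illegal-character set are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

const std::size_t kMaxLen = 55;  // hypothetical limit for illustration

// Stand-in hash; the StringHash used by the real patch is not reproduced.
std::uint32_t demoHash(const std::u16string& s) {
  std::uint32_t h = 2166136261u;
  for (char16_t unit : s) {
    h ^= static_cast<std::uint32_t>(unit);
    h *= 16777619u;
  }
  return h;
}

// Format the hash as exactly 8 lowercase hex digits.
std::u16string hashHex(const std::u16string& s) {
  char buf[9];
  std::snprintf(buf, sizeof buf, "%08x", static_cast<unsigned>(demoHash(s)));
  std::u16string out;
  for (int i = 0; i < 8; ++i) out.push_back(static_cast<char16_t>(buf[i]));
  return out;
}

// Replace the name by a hash only when it has an illegal character;
// shorten over-long names to a prefix plus hash; otherwise keep it.
std::u16string hashIfNecessary(const std::u16string& name) {
  const std::u16string illegal = u"/\\:*?\"<>|";  // assumed Windows-style set
  if (name.find_first_of(illegal) != std::u16string::npos) {
    return hashHex(name);
  }
  if (name.size() > kMaxLen) {
    return name.substr(0, kMaxLen - 8) + hashHex(name);
  }
  return name;
}
```

Because the check runs on UTF-16 units rather than on bytes of a legacy encoding, a name such as U+30BD (0x835C in Shift_JIS) is left untouched, while "A/B" is still replaced by an 8-digit hash.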
Attached patch: patch v2
I just shuffled things around a little. Basically the same as attachment 163571.
Attachment #163571 - Attachment is obsolete: true
Attachment #163571 - Flags: superreview?(bienvenu)
Attachment #163571 - Flags: review?(mscott)
Comment on attachment 163592
patch v2

asking for r/sr
Attachment #163592 - Flags: superreview?(bienvenu)
Attachment #163592 - Flags: review?(mscott)
Comment on attachment 163592
patch v2

thx very much for looking at this!

Just a note - I believe local folders ignore the mailbox name (the one set by
SetMailboxName) - only imap pays attention.
Attachment #163592 - Flags: superreview?(bienvenu) → superreview+
for historical purposes here's the aviary version of the fix since the string
APIs aren't as sophisticated on the branch.

One note, in order to build this on windows, I had to change: 

NS_NAMED_LITERAL_STRING (illegalChars, 
			  FILE_PATH_SEPARATOR FILE_ILLEGAL_CHARACTERS);
to directly calling

PRInt32 illegalCharacterIndex = name.FindCharInSet(FILE_PATH_SEPARATOR
FILE_ILLEGAL_CHARACTERS);

otherwise I got a compiler error. I don't know if that's just a branch only
thing and the trunk may be fine.

I did the following tests:

1) Renaming and creating new folders with slashes in them still did the right
thing, we still used a hashed name.

2) Created and renamed folders with the utf-8 characters mwada sent me and
verified that things were still behaving correctly (as far as I could tell)

I'll be checking this ported patch into the branch.
Keywords: fixed-aviary1.0
Attachment #163592 - Flags: review?(mscott) → review+
thanks for r/sr.
Also, thanks for porting it to the branch. It turned out that the trunk had the
same problem with 'FindCharInSet' (for nsTString) so that I had to make the
same change as you did for the branch.
In addition, I missed one place where I should have used 'safeName' in place of
'safeFolderName' because my tree was not up-to-date. (around line 1170). Your
aviary patch has that so that there's no worry. 
This patch includes those two changes and I'm uploading it to help code
archaeologists :-)
mark as fixed
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: blocking1.7.x?
Resolution: --- → FIXED
awesome. Thanks again jshin! You rock!
Comment on attachment 163612
patch ported to aviary

asking for a to 1.7branch. the patch has been checked into aviary-1.0.
Attachment #163612 - Flags: approval1.7.x?
jshin I suspect this patch doesn't need to go into 1.7 because my original
changes to make forward slashes work for new folder and renaming a folder never
went into 1.7...
Comment on attachment 163612
patch ported to aviary

Thanks for the note. 

Shouldn't we fix the forward slash issue in 1.7 branch anyway? If so, I'll make
a new patch against 1.7branch.
Attachment #163612 - Flags: approval1.7.x?
a=chofmann for 1.7 branch
Flags: blocking1.7.x? → blocking1.7.x+
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Re-opened due to new problem.

I tested with Mozilla suite 2004102805-trunk/Win-2K.
Problem in (Case-1) in my comment #0 has been resolved and file name of
Japanese character is used if 2nd byte(in Shift_JIS) is NOT illegal file name
character.
Jshin and Mscott, thanks a lot.

However, new problem occurred on (Case-2) in my comment #0, 2nd byte(in
Shift_JIS) is 0x5C, folder name of 0x835C in my test.

(New-Problem-1)
When I created a folder named 0x835c in Shift_JIS,
file of JAPANESE-char was created in addition to converted HEXA-string
files("3db4214a" and "3db4214a.msf")
"Copy folder location" says that "3db4214a" is file for this folder.

(New-Problem-2)
When I renamed folder "A" to 0x835c in Shift_JIS,
"A"/"A.msf" was renamed to "JAPANESE-Char"/"JAPANESE-Char.msf",
 (See file size and time stamp in attachment)
and converted HEXA-string files("3db4214a"/"3db4214a.msf") were newly created.
"Copy folder location" says that "3db4214a" is file for this folder.
Folder name at folder list pane was changed to JAPANESE-char but mail count
became 0 (I copied a mail when folder name was "A").
This means mail loss, although lost "JAPANESE-Char/JAPANESE-Char.msf" will
appear again after restart of Mozilla.
Change severity to critical because of mail loss.
Severity: major → critical
Thanks for testing. 
I see what's going on. I should have fixed all call-sites of
'NS_MsgHashIfNecessary' instead of just call sites in nsLocalMailFolder.cpp.
I'll do that later today.

Btw, these new problems would affect only those whose local file system encoding
is multibyte in which the second or later byte uses the byte range for ASCII 
(Shift_JIS, Big5, GB18030, UHC). In case of GB18030 (zh-CN) and UHC (ko), the
byte range for ASCII is only used for rarely used characters so that in practice
only Japanese and Taiwanese users are affected by these new problems. 

Severity: critical → major
Attached patch: patch (additional) (obsolete)
On Linux, I tested under zh_TW.big5 (one of a few locales on Linux whose char.
encoding is NOT ASCII-safe). I made several folders with 'α' (in Big5, its
second byte is '0x5c') and the real '/'. I also made folders with 'ASCII-safe'
Chinese characters. Creating, renaming and deleting work well except for a
couple of mysterious cases in which 'on-disk name' was updated but 'prettyName'
is not updated while MsgDB was updated properly. I couldn't reliably reproduce
this and most of the time it works fine.
(In reply to comment #49)

> except for a couple of mysterious cases in which 'on-disk name' was updated
> but 'prettyName' is not updated while MsgDB was updated properly.
 
This mystery is probably old bug 65303.
Restart of Mozilla is required before renaming to previously used folder name.
> Bug 65303
> new folder created with same name as previously renamed folder contains
phantom mail - crashes when read

A circumvention of this bug in folder rename test :
 (1) Create many folders for test - "Test-1" to "Test-N".
 (2) Create a folder as a subfolder of "Test-1".
 (3) Rename the folder
 (4) Move the folder to next "Test-N", then rename the folder
 (5) Repeat (4)
Thanks for the pointer. I think that's it. I only have troubles when I rename a
folder to the name of a folder that used to exist. So, I think my patch fixes
all the problems discovered by you. 
*** Bug 266834 has been marked as a duplicate of this bug. ***
for historical purposes. Let's hope i didn't break anything when back porting
this for the aviary branch!
Comment on attachment 163892
patch (additional)

The only thing I wasn't sure about was wrapping several method calls with
NS_ASSERTION. Are we sure the NS_ASSERTION macro will always execute even in
optimized builds? I know some macros don't do anything in opt builds so if you
put a method call you care about it inside it, it won't get executed...

NS_ASSERTION(NS_SUCCEEDED(NS_CopyNativeToUnicode(oldCPath, oldPath)),
             "can never happen");
Attachment #163892 - Flags: superreview?(bienvenu)
Attachment #163892 - Flags: review+
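
The concern raised in this review can be illustrated with a generic sketch. The macro below is a hypothetical stand-in that models an assertion macro expanding to nothing in an optimized build; it is not the real NS_ASSERTION definition, and the helper names are invented for the example.

```cpp
#include <string>

// Hypothetical stand-in for an assertion macro in an optimized build:
// the expression is discarded without being evaluated.
#define DEMO_RELEASE_ASSERTION(expr, msg) ((void)0)

int gCalls = 0;  // counts how often the conversion actually runs

// Hypothetical conversion routine whose side effect the caller needs.
bool convertPath(const std::string& in, std::string& out) {
  ++gCalls;
  out = in;
  return true;
}

// Risky: the call lives inside the macro, so it vanishes with the macro.
std::string riskyCopy(const std::string& in) {
  std::string out;
  DEMO_RELEASE_ASSERTION(convertPath(in, out), "can never happen");
  return out;  // stays empty when the macro expands to nothing
}

// Safer: perform the call first, then assert on the stored result.
std::string safeCopy(const std::string& in) {
  std::string out;
  bool ok = convertPath(in, out);
  DEMO_RELEASE_ASSERTION(ok, "can never happen");
  (void)ok;  // silence unused-variable warnings when the macro is empty
  return out;
}
```

In this model riskyCopy never performs the conversion, while safeCopy always does, which is why moving the call out of the assertion is the usual remedy.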
mwada, I just checked in this latest patch into the aviary 1.0 branch in the
hopes that you could try it out in the 10/31 builds when they come out to see if
things are better. 
This is an aviary patch against the up-to-date tree with attachment 164063 applied
(Scott's port of my previous patch). This shows the difference between
attachment 163892 and attachment 164073. Basically, I got rid of my stupid
mistake of converting UTF-8 to UTF-16 to native and then back to UTF-16
(instead of just going from UTF-8 to UTF-16)  at the beginning of
NS_MsgCreatePathStringFromFolderURI. Scott, can you check this into aviary-1.0
for Wada to test? I ran out of the disk space on Linux (where I can test with
zh_TW.big5 locale) so that I compiled this on Mac OS X where I can't test
because UTF-8 is the only encoding I can use there.
For the trunk build, I tested with zh_TW.big5 locale on Linux, though (with
attachment 163892)
hmm.bugzilla is behaving strangely.. My comment accompanying attachment 164079
was not added. Here it is:

This is an aviary patch against the up-to-date tree with attachment 164063 applied
(Scott's port of my previous patch). This shows the difference between
attachment 163892 and attachment 164073. Basically, I got rid of my stupid
mistake of converting UTF-8 to UTF-16 to native and then back to UTF-16 (instead
of just going from UTF-8 to UTF-16)  at the beginning of
NS_MsgCreatePathStringFromFolderURI. 

Scott, can you check this into aviary-1.0 for Wada to test? I ran out of the
disk space on Linux (where I can test with zh_TW.big5 locale) so that I compiled
this on Mac OS X where I can't test because UTF-8 is the only encoding I can use
there.
For the trunk build, I tested with zh_TW.big5 locale on Linux, though (with
attachment 163892)

As for Wada's question, aviary-1.0 branch and mozilla 1.7 branch are distinct.
At the moment, mozilla 1.7 branch seems to have a lower priority. I'll take care
of it later when all these things are settled out.
This is identical to attachment 163892 except for
NS_MsgCreatePathStringFromFolderURI in nsMsgUtils.cpp where the previous patch
has an unnecessary conversion between 'native' encoding and UTF-16. 
It's taken with '-w' option so that '-l' option needs to be used to apply.

I also have an aviary version which is identical to Scott's attachment 164063
except for NS_MsgCreatePathStringFromFolderURI.
Attachment #163892 - Attachment is obsolete: true
(In reply to comment #55 and comment #56)

I'd like to clarify term of "aviary 1.0 branch" in this bug.

(Mscott)
> mwada, I just checked in this latest patch into the aviary 1.0 branch
(Jshin)
> I also have a aviary version

Does it mean all of bug 264467 and bug 257986 and this bug will be resolved on
both Thunderbird and 1.7?
Or still Thunderbird only?
(ie. bug 264467 and bug 257986 are not fixed on 1.7, so this bug can not occur)
Comment on attachment 164073
patch updated ('-w') 

carrying over Scott's r flag
Attachment #164073 - Flags: superreview?(bienvenu)
Attachment #164073 - Flags: review+
Attachment #163892 - Flags: superreview?(bienvenu)
latest patch has been checked into the aviary branch jshin (2004-10-31 01:17PDT)
as I understand it, NS_ASSERTION will evaluate to nothing in release builds, so
code inside of it will not execute in release builds - hopefully, that wasn't
checked into the aviary branch...
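The concern above can be illustrated with a minimal sketch. SKETCH_ASSERTION, SKETCH_DEBUG and ConvertPath are hypothetical stand-ins, not Mozilla's actual NS_ASSERTION definition or NS_CopyNativeToUnicode; the only assumption is that the macro expands to nothing outside debug builds, as described in the comment above.

```cpp
#include <cstdio>

// Hypothetical assertion macro that is compiled out unless SKETCH_DEBUG is
// defined. This is an illustration, not the real NS_ASSERTION.
#ifdef SKETCH_DEBUG
#define SKETCH_ASSERTION(expr, msg) \
  do { if (!(expr)) std::fprintf(stderr, "ASSERTION: %s\n", msg); } while (0)
#else
#define SKETCH_ASSERTION(expr, msg) do { } while (0)
#endif

static int gCalls = 0;

// Hypothetical stand-in for a conversion call whose side effect we need.
bool ConvertPath() { ++gCalls; return true; }

// Risky pattern: the call lives inside the macro argument, so it vanishes
// together with the macro in a non-debug build.
int CallsWhenWrapped() {
  gCalls = 0;
  SKETCH_ASSERTION(ConvertPath(), "can never happen");
  return gCalls;
}

// Safer pattern: always perform the call, then assert on the saved result.
int CallsWhenSeparated() {
  gCalls = 0;
  bool ok = ConvertPath();
  SKETCH_ASSERTION(ok, "conversion failed");
  (void)ok;  // avoid an unused-variable warning when the macro is empty
  return gCalls;
}
```

Built without SKETCH_DEBUG, CallsWhenWrapped() never runs the conversion, while CallsWhenSeparated() runs it in every build type.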
Comment on attachment 164073
patch updated ('-w') 

one nit - if I'm understanding correctly what data is being passed around,
IMAP doesn't use pure utf-7 - it uses what's called IMAP Modified UTF-7, which
is slightly different, though the main thrust of your comment is true; it is 7
bit ascii.
Attachment #164073 - Flags: superreview?(bienvenu) → superreview+
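For readers unfamiliar with the distinction, here is a rough sketch of the mailbox-name encoding defined in RFC 3501 section 5.1.3 (IMAP Modified UTF-7). EncodeModifiedUtf7 is a hypothetical helper written for this illustration, not code from the patch; it assumes the input is already valid UTF-16.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Sketch of an IMAP Modified UTF-7 encoder. Printable ASCII passes through,
// '&' becomes "&-", and everything else is emitted as modified base64 of the
// UTF-16BE code units (',' replaces '/', no '=' padding) between '&' and '-'.
std::string EncodeModifiedUtf7(const std::u16string& in) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,";
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    char16_t c = in[i];
    if (c >= 0x20 && c <= 0x7e) {
      out += static_cast<char>(c);
      if (c == u'&') out += '-';  // a literal ampersand is written as "&-"
      ++i;
      continue;
    }
    out += '&';                   // shift into base64 mode
    std::uint32_t bits = 0;       // bit accumulator
    int nbits = 0;                // number of pending bits in the accumulator
    while (i < in.size() && (in[i] < 0x20 || in[i] > 0x7e)) {
      bits = (bits << 16) | in[i];
      nbits += 16;
      while (nbits >= 6) {
        nbits -= 6;
        out += kAlphabet[(bits >> nbits) & 0x3f];
      }
      ++i;
    }
    if (nbits > 0) {              // flush leftover bits, zero-padded on the right
      out += kAlphabet[(bits << (6 - nbits)) & 0x3f];
    }
    out += '-';                   // shift back to ASCII
  }
  return out;
}
```

The output is always 7-bit ASCII, which matches the main point of the patch comment, but it differs from plain UTF-7 in the shift character and the base64 alphabet.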
yeah I thought that was the case and removed the NS_ASSERTION around the method
calls before I checked into aviary.
Thanks for r/sr and catching my really silly blunder with NS_ASSERTION. I've
just checked in attachment 164073 (that doesn't have those stupid method calls
wrapped by NS_ASSERTION). Before landing it, I prepended 'UTF-7' with
'modified'. David is absolutely right about the char. encoding used by IMAP. 
I tested with Thunderbird latest-0.9 2004-11-01 08:14 build (Win32, ZIP),
on the following three Japanese characters (see attachment for details).
 (1) 0x82A0 (2nd byte is not special in Shift_JIS, Case-1)
 (2) 0x835C (2nd byte is \ in Shift_JIS,           Case-2)
 (3) 0x837C (2nd byte is | in Shift_JIS,           Same as Case-2)
Problems of this bug have been completely resolved.
Congratulations!
In addition to the bug fix, the annoyance of file names converted to hex strings
for 0x835C and 0x837C has vanished.
Thanks a lot, Jshin and Mscott. 
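Why 0x835C and 0x837C are the interesting test cases can be sketched as follows. The function names are hypothetical and this is not the code from the patch; it only shows how a byte-wise scan of a Shift_JIS string mistakes the trail byte of a two-byte character for '\' or '|', while a scan that skips trail bytes does not. The lead-byte ranges used (0x81-0x9F and 0xE0-0xFC) are the usual Shift_JIS ones.

```cpp
#include <cstddef>
#include <string>

// Naive scan: reports '\' (0x5C) or '|' (0x7C) anywhere in the byte string,
// including inside the second byte of a two-byte Shift_JIS character.
bool HasSpecialByteNaive(const std::string& bytes) {
  for (unsigned char b : bytes) {
    if (b == 0x5C || b == 0x7C) return true;
  }
  return false;
}

// Lead bytes of two-byte Shift_JIS characters.
bool IsShiftJisLeadByte(unsigned char b) {
  return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Encoding-aware scan: skips the trail byte after each lead byte, so only
// genuine single-byte '\' and '|' characters are reported.
bool HasSpecialCharShiftJis(const std::string& bytes) {
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    unsigned char b = static_cast<unsigned char>(bytes[i]);
    if (IsShiftJisLeadByte(b)) {
      ++i;  // skip the trail byte of this two-byte character
      continue;
    }
    if (b == 0x5C || b == 0x7C) return true;
  }
  return false;
}
```

For the byte pair 0x83 0x5C the naive scan reports a backslash and the aware scan does not; for 0x82 0xA0 neither scan reports anything.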

However, the "#" and "~" problem (disappearing on restart) still exists, although
the "?" and ";" problems seem to be fixed in addition to the "/" problem.
 (1) "/" problem has been resolved by Mscott.
     (Many bugs in dependency tree for Bug 124287)
 (2) "?" problem did not occur too. (Bug 41944)
     I think this problem has also been resolved by Mscott.
 (3) ";" problem did not occur. (Bug 232001) 
     Resolved by Mscott?
 (4) "#" problem still occurs. (Bug 94124)
 (5) "~" problem still occurs.
     I could not find this problem in Bug 124287 but I saw bug(s) for this.

Mscott, have you fixed "?" and ";" problem too?

I think applying the fix for Bug 257986/Bug 264467 by Mscott to "#~" will quickly
resolve these long-lived bugs.
Further, I think Bug 229522 and Bug 117840 will easily be resolved
if your fix is also applied to a leading "." (dot) and a trailing "." (dot).
Mscott, is it difficult?
Screen Shot on next test.
 (1) Create folders under a parent folder and copy a mail to each folder
 (2) Restart Thunderbird (=> "#" and "~" disappeared)
 (3) Create folders under another parent folder and copy a mail to each folder
     "#" and "~" is accessible.
 (4) ALT + Print Screen
";" problem (Bug 232001) seems to be a different problem (may be bug 65303, a rename
problem, not character related), since no problem occurred for ";" on Thunderbird
0.7.3.
Please ignore my description on ";".
Sorry for spam.
I believe we can mark this bug as fixed if I'm reading everyone's comments
correctly. The remaining issues sound like they are old issues covered in other
bugs. The regression itself has been fixed. 

Lemme know if you disagree jshin. Thanks again for helping out with this. 
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
I also believe this bug has been fixed although test reports from other
countries have not come yet.
Thanks again for your effort, Jshin, Mscott and David.

Mscott and Jshin, I'd like to report my test result in my Comment #66 ("/?"
resolved,  "#~" not yet) to Bug 124287 after the fix is applied on trunk.
Jshin, when will your patch for trunk be checked-in?
It's already in (see comment #65)
Product: MailNews → Core
Product: Core → MailNews Core