rename folder fails for Unicode folder names

RESOLVED FIXED in mozilla1.8.1

Status

MailNews Core
Backend
RESOLVED FIXED
13 years ago
10 years ago

People

(Reporter: Dimitri Sverdlov, Assigned: Jungshik Shin)

Tracking

({fixed1.8.1, intl})

Trunk
mozilla1.8.1
fixed1.8.1, intl

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

13 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5
Build Identifier: Thunderbird version 1.5 RC1 (20051025)

Renaming folders fails if the old and/or the new folder name use Unicode characters.

Reproducible: Always

Steps to Reproduce:
1. Create a folder "test"
2. Try to rename it to "тест"


Actual Results:  
Alert: "The folder could not be renamed. Perhaps the folder is being reparsed, or the new name in not a valid folder name."

Note that creating a *new* folder named "тест" works fine. An other bug here is that a newly created folder "тест" can't be deleted.

Other examples:
works:
* renaming "test тест" to "test" 

doesn't work:
* renaming "test тест" to "тест" 
* renaming "test тест" to "testtest тест" 
* renaming "тест" то "test тест"
* renaming "тест" то "тесттест"
etc.
(Reporter)

Updated

13 years ago
Component: RSS → General
Version: unspecified → 1.5
(Reporter)

Comment 1

13 years ago
P.S. - Sorry for the HTML-encoded characters, i didn't know bugzilla would convert them to that. But I hope you get the idea.

Comment 2

13 years ago
this is a local folder you're trying to rename?

Comment 3

13 years ago
if this is true in general (and I suspect it is), it would be bad...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking1.8.1?
(Reporter)

Comment 4

13 years ago
Yes, that's true in general - local folders, POP3 folders, RSS folders.

Comment 5

13 years ago
This is failing in PR_Rename - nsMsgI18NCopyUTF16ToNative is giving us a string back with '?' in it, and I suspect that PR_Rename is not happy with that. It calls MoveFile, which I guess is a win32api? The weird thing is that creating a folder with the same name works fine...
(Assignee)

Comment 6

13 years ago
Dimitri, what's your OS locale? Are you running a Russian Windows XP? Next time you write a comment with any character outside the US-ASCII (e.g. Cyrillic letters), please, set View | Character Encoding to UTF-8 *before* filling in the textarea for 'additional comments'.

Hopefully, when I begin to work on bug 162361 (this weekend), I'll be able to come up with a rough idea to fix this as well.
(Assignee)

Comment 7

13 years ago
Apparently, when you create a new folder with character outside the current locale (ANSI-based locale), those characters are replaced by '_' *on* the disk (this replacement is done by Windows 'A' APIs). When removing or renaming, we go through UTF16->Native converter which replaces those characters with '?'. Therefore, we can't rename or remove it while we can create it with '_' in its on-disk filename.

BTW, I realized that that part of mailnews still relies on nsFileSpec... That means fixing bug 162361 will not automatically get 'propagated' there...

We need to fix this problem, but if we can't soon, we'd better block the creation of a folder with characters not representable on the current file system (well, NTFS uses Unicode, but our code doesn't yet fully exploit that)

(Reporter)

Comment 8

13 years ago
(In reply to comment #6)
> Dimitri, what's your OS locale? Are you running a Russian Windows XP?

No, I'm running a German XP with a German locale, but I just often type in Russian.
So that should be the problem, as you wrote.

Comment 9

13 years ago
Jungshik, c7 explains why we wouldn't be able to rename a folder that already had unicode characters, but what I also saw was that we couldn't rename an ascii folder name to a name that had unicode characters. Does that same explanation hold?

We're working on getting rid of uses of nsFileSpec but I'm pretty sure nsLocalMailFolder will be the last user to go.
(Assignee)

Comment 10

13 years ago
(In reply to comment #9)
> Jungshik, c7 explains why we wouldn't be able to rename a folder that already
> had unicode characters, but what I also saw was that we couldn't rename an
> ascii folder name to a name that had unicode characters. Does that same
> explanation hold?

I took a closer look and found the cause. We use 'nsMsgI18NCopyUTF16ToNative' to convert a new name to the native file system name which replaces unconvertible characters with '?'. However, '?' is invalid on Windows file system. We do check for invalid characters, but we do it *before* the conversion so that '?' passes through to lead to an error from a Windows API. A very quick fix would be to use CopyUnicodeToNative that takes care of invalid characters. A better fix would be to check if there's any unconvertible character in a folder name, in which case we can use a hashed name. 

> We're working on getting rid of uses of nsFileSpec but I'm pretty sure
> nsLocalMailFolder will be the last user to go.

This way, we can allow local folders whose names contain characters not covered by the file system character encoding even before we fix bug 162361 or convert nsLocalMailFolder to nsILocalFile.

One problem with this approach is that there may be a migration issue when we eventually support the full Unicode range on Windows. This is not a concern on Mac OS X and modern Linux (where UTF-8 is the default). 

David, what do you think? 



 

(Assignee)

Comment 11

13 years ago
Created attachment 207970 [details] [diff] [review]
patch

I've just tested this patch on Linux with a non-UTF-8 locale. It seems to work well. I'll make further tests on Windows. With this patch, any Unicode character can be used for local folder names. If any character cannot be covered by the file system character encoding, its hash value is calculated and used instead.
Assignee: mscott → jshin1987
Status: NEW → ASSIGNED
(Assignee)

Comment 12

13 years ago
Comment on attachment 207970 [details] [diff] [review]
patch

I've also tested this on Windows and it worked as intended. 
As for  the migration issue (when we can use the full Unicode-repertoire on Windows), it appears that there shouldn't be any as long as the mapping between human-readable fold name (shown in the folder list) and on-disk name (hashed if necessary) is well-kept.
Attachment #207970 - Flags: superreview?(mscott)
Attachment #207970 - Flags: review?(bienvenu)

Comment 13

13 years ago
Comment on attachment 207970 [details] [diff] [review]
patch

This patch changes the name of the .msf file on disk, for a given folder with intl characters, right? For existing folders, I think that means the old .msf file will get orphaned, and we'll generate a new one when you click on the folder - have you tried that?
(Assignee)

Comment 14

13 years ago
> file will get orphaned, and we'll generate a new one when you click on the
> folder - have you tried that? 

This patch only affects characters NOT covered by the file system encoding (e.g. Cyrillic on English Windows, Arabic on Japanese windows). That is, Kanji on Japanese Windows, Arabic on Arabic Windows, O-umlaut on English windows etc are NOT affected. 

Currently, characters not covered by the current file system encoding are converted to '_' when a folder is created on disk. However, our internal folder DB(?) keeps its original name *along with* its on-disk name(both folder itself and the corresponding msf file), right? As long as the name with uncovered characters replaced by '_'  is *unique*, this is fine, I think. When a user upgrades to a new version with my patch, this should continue to work if there is a separate folder DB in which our internal folder name (which is also shown to a user in the folder list)  is mapped to its on-disk name. 

Is the last 'if ...' the case or not? I thought that's the case because otherwise 'hashed' on-disk names wouldn't have worked, but your question and my memory of some 'msf' file mix-up in the past make me less certain. However, my test with 1.5RC2 and a build with my patch applied indicates that it's the case (i.e. we don't rely on on-disk names to get the list of folders, but we do have a separate folder DB). What I did was:

1. create a folder with two Devanagari characters followed by 'ab' ( सवab ) with 1.5 RC2
2. add some messages to it
3. go to the folder and read a couple of messages I just copied there
4. quit 1.5RC2
5. Check the on-disk name. It's '__ab' and '__ab.msf' because Devanagari characters are not covered by my current file system encoding.
6. launch my build with the patch applied
7. go to the folder (सवab) and make certain that the folder name (सवab) is intact
8. add a couple of more messages to the folder
9. see if there's any change in the names of on-disk folder and msf file
10. quit my build and check the on-disk file names once more.
11. launch 1.5RC2 and repeat 6 though 9
12. rename 'सवab' to 'usedtobehindi'
13. check the on-disk name to make sure that there's any stale *msf file ('__ab.msf') 


(Assignee)

Comment 15

13 years ago
David, do you have any test scenario other than what I described in the previous comment? 
Component: General → MailNews: Backend
Flags: review?(bienvenu)
Keywords: intl
OS: Windows XP → All
Product: Thunderbird → Core
Hardware: PC → All
Target Milestone: --- → mozilla1.8.1
Version: 1.5 → Trunk

Comment 16

13 years ago
Junshik, terribly sorry, I didn't see your last comment - that test scenario looks fine.

Updated

13 years ago
Attachment #207970 - Flags: superreview?(mscott) → superreview+
(Assignee)

Comment 17

13 years ago
Comment on attachment 207970 [details] [diff] [review]
patch

David, thank you. No need to be sorry because it's me who forgot to add you to Cc when adding the description of my test.

BTW, do I have to ask for r separately from Scott? Or would you double as reviewer as well? I don't recall canceling 'review request', but somehow it's cancelled.

Comment 18

13 years ago
you can put me down as r/sr
(Assignee)

Comment 19

13 years ago
thanks, David. patch landed on the trunk. i'll let it bake for a while on the trunk before asking for a branch approval. 
Status: ASSIGNED → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → FIXED

Comment 20

13 years ago
sorry to spam this bug, but is it related to this display problem in newsgroups name, see bug 327037

Comment 21

12 years ago
Comment on attachment 207970 [details] [diff] [review]
patch

has this baked long enough for branch consideration?
Attachment #207970 - Flags: approval-branch-1.8.1?

Comment 22

12 years ago
Comment on attachment 207970 [details] [diff] [review]
patch

jshin, I think this has baked long enough and is ready for the branch. lemme know if you feel differently. Thanks!
Attachment #207970 - Flags: approval-branch-1.8.1? → approval-branch-1.8.1+
(Assignee)

Comment 23

12 years ago
(In reply to comment #22)
> (From update of attachment 207970 [details] [diff] [review] [edit])
> jshin, I think this has baked long enough and is ready for the branch. lemme
> know if you feel differently. Thanks!

I agree that this should go in. I've just landed it on branch. 
Keywords: fixed1.8.1
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.