Open Bug 927827 Opened 11 years ago Updated 2 years ago

Saved searches duplicity on Mac (with special non-ASCII characters in name)

Categories

(MailNews Core :: Backend, defect)

x86
macOS
defect

Tracking

(Not tracked)

People

(Reporter: tkl, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [Mac OS X uses NFD (Normalization Form D) for filenames])

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:24.0) Gecko/20100101 Firefox/24.0 (Beta/Release)
Build ID: 20130910160258

Steps to reproduce:

1) Create a new Saved searches folder with any criteria and a name that contains a Czech national character such as ě,š,č,ř,ž,ý,á,í,é,ú, or ů.
2) Restart Thunderbird.
3) After the restart, the folder is duplicated.
4) Delete one of the two folders (either one) and restart Thunderbird.
5) The search criteria of the remaining folder are reset.


Actual results:

OS: Mac OS X 10.8.5 Czech (UTF-8)
Thunderbird 24.0.1 Czech (UTF-8)

If I create a new Saved searches folder whose name contains a non-ASCII character, the folder is duplicated after TB restarts. At that point both Ss folders (original and duplicate) work correctly.
If I delete one of them (it does not matter which one), the remaining Ss folder stops working after the next TB restart (all search criteria are reset).
If I create an Ss folder named with ASCII characters only, the folder works correctly even after a TB restart and no duplicates appear.


Expected results:

Since it is possible to create an ordinary folder with non-ASCII characters in its name and such folders work correctly, Saved searches folders should work the same way.
The "Duplicate Search Folder" phenomenon is a dup of bug 669144 and is not a Mac-only problem. The problem still occurs in Tb 24 on Win-XP.

(1) Create Search Folder named "ě,š,č,ř,ž,ý,á,í,é,ú,ů"(no quot)
(2) Following are created.
(2-1) virtualFolders.dat entry.
      Because non-ascii is used, UTF-8 binary of folder name is escaped.
> uri=mailbox://nobody@Local%20Folders/%C4%9B%2C%C5%A1%2C%C4%8D%2C%C5%99%2C%C5%BE%2C%C3%BD%2C%C3%A1%2C%C3%AD%2C%C3%A9%2C%C3%BA%2C%C5%AF
(2-2) msf file : 1da6dcfe.msf
      Search target folders,search conditions are held in .msf file.
      Because non-ascii is used, file name is hashed.
      Because Saved Search folder(not identical with local mail folder),
      or because used character is not "illegal file name character",
      actual folder name is not held in 1da6dcfe.msf
(2-3) directory named "1da6dcfe"(no quot)
(3) Restart Tb => two "1da6dcfe" are shown at folder pane.
    one     : UTF-8 binary in virtualFolders.dat entry => hashed name of "1da6dcfe"
              => create DIRECTORY named "1da6dcfe" => reach 1da6dcfe.msf
    another : directory named "1da6dcfe" => 1da6dcfe.msf (same route as local mail folder)
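
The escaping described in (2-1) can be reproduced with a short Python sketch. It assumes the folder name is encoded as UTF-8 and then percent-escaped, which matches the quoted uri= entry; it is only an illustration, not Thunderbird's own code.

```python
from urllib.parse import quote

# Folder name from step (1), written with escapes so the code points are
# unambiguous (each letter is a single precomposed code point).
name = "\u011b,\u0161,\u010d,\u0159,\u017e,\u00fd,\u00e1,\u00ed,\u00e9,\u00fa,\u016f"

# quote() encodes to UTF-8 and percent-escapes every byte outside its
# unreserved set, so "," becomes %2C as in the quoted entry.
escaped = quote(name)
uri = "mailbox://nobody@Local%20Folders/" + escaped
print(uri)
```

The printed line can be compared directly with the uri= entry quoted in (2-1).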

If Tb is restarted after deleting the directory named "1da6dcfe" and deleting panacea.dat, only one "1da6dcfe" is shown in the folder pane, but the directory named "1da6dcfe" is created again. So, after the next restart of Tb, the duplicate Search Folders appear again.

Interesting phenomenon was observed.
If Search Folder name = "/ě,š,č,ř,ž,ý,á,í,é,ú,ů" is requested when "ě,š,č,ř,ž,ý,á,í,é,ú,ů" exists, following is added to virtualFolders.dat.
> uri=mailbox://nobody@Local%20Folders//%C4%9B%2C%C5%A1%2C%C4%8D%2C%C5%99%2C%C5%BE%2C%C3%BD%2C%C3%A1%2C%C3%AD%2C%C3%A9%2C%C3%BA%2C%C5%AF
But no file/directory is created. 
Instead, 1da6dcfe.msf is used, and following is added to 1da6dcfe.msf, because "illegal file name character" is used in folder name.
> @$${58{@
> <(A1=/$C4$9B,$C5$A1,$C4$8D,$C5$99,$C5$BE,$C3$BD,$C3$A1,$C3$AD,$C3$A9,$C3$BA,$C5\
> $AF)>[1:^9F(^AE^A1)]
> @$$}58}@
So, after restart of Tb, no "ě,š,č,ř,ž,ý,á,í,é,ú,ů" is shown, and two "/ě,š,č,ř,ž,ý,á,í,é,ú,ů" are shown.
i.e. "ě,š,č,ř,ž,ý,á,í,é,ú,ů" is hijacked by "/ě,š,č,ř,ž,ý,á,í,é,ú,ů".
The following entry is removed from virtualFolders.dat, because the corresponding .msf file doesn't exist.
> uri=mailbox://nobody@Local%20Folders//%C4%9B%2C%C5%A1%2C%C4%8D%2C%C5%99%2C%C5%BE%2C%C3%BD%2C%C3%A1%2C%C3%AD%2C%C3%A9%2C%C3%BA%2C%C5%AF
This is perhaps special case of bug 286523.
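
For reading the .msf excerpt above: the sketch below turns the `$XX` sequences back into the folder name. It assumes that `$XX` is a hexadecimal byte escape, that a trailing backslash continues the value on the next line, and that the bytes are UTF-8. `decode_msf_value` is a hypothetical helper written for this comment, not a Thunderbird function.

```python
import re

def decode_msf_value(raw: str) -> str:
    """Hypothetical helper: undo $XX byte escapes in a value copied from an .msf file."""
    # Assumption: backslash + newline is a line continuation inside the value.
    joined = raw.replace("\\\n", "")
    data = joined.encode("ascii")
    # Replace each $XX with the single byte it names, then decode as UTF-8.
    unescaped = re.sub(
        rb"\$([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )
    return unescaped.decode("utf-8")

# Value copied from the quoted excerpt (without the "> " quote prefix).
raw_value = (
    "/$C4$9B,$C5$A1,$C4$8D,$C5$99,$C5$BE,$C3$BD,$C3$A1,$C3$AD,$C3$A9,$C3$BA,$C5\\\n"
    "$AF"
)
print(decode_msf_value(raw_value))
```

Under those assumptions the result is the folder name with the leading "/", i.e. the name that ends up shown twice after restart.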

Similar thing is seen, when "Japanese character" is used in Search Folder name.
  Search Folder name = あああ
  msf : あああ.msf
  directory named あああ is created.
  virtualFolders.dat : uri=mailbox://nobody@Local%20Folders/%E3%81%82%E3%81%82%E3%81%82
However, "duplicate Search Folder" doesn't occur.
This is perhaps because file name is not hashed.

Similar thing is seen, when "illegal file name character" is used in Search Folder name.
  Search Folder name = AAA???BBB
  msf : AAAd5f1b959.msf (all of the string after ? is hashed)
  directory named AAAd5f1b959.msf is created.
  virtualFolders.dat : uri=mailbox://nobody@Local%20Folders/AAAd5f1b959
However, "duplicate Search Folder" doesn't occur.
This is perhaps because "file name part in virtualFolders.dat" is 7bits ascii only and is not escaped.

It looks like both "escape in uri of virtualFolders.dat" and "hashed file name" are needed for "duplicated Search Folder after restart".
It seems that either one of them is sufficient for "excess DIRECTORY without file extension".

By the way, if "#" is used in Search Folder name, the following alert is shown, and the folder is not created.
> The folder could not be created because the folder name you specified contains
> an unrecognized character. Please enter a different name and try again.
Status: UNCONFIRMED → NEW
Component: Folder and Message Lists → Backend
Ever confirmed: true
OS: Mac OS X → All
Product: Thunderbird → MailNews Core
Problem of bug 669144 comment #0 (== bug 669144 comment #1) looks already resolved.
It looks that problem of bug 669144 comment #2 ==  bug 694933 comment #3(closed as dup of bug 669144) == phenomenon you saw, still remains.

As seen in bug 669144 comment #2, when Search Folder name = "ě,š,č,ř,ž,ý,á,í,é,ú,ů"(no quot), following two entries are generated by Tb.
> uri=mailbox://nobody@Local%20Folders/1da6dcfe
> uri=mailbox://nobody@Local%20Folders/%C4%9B%2C%C5%A1%2C%C4%8D%2C%C5%99%2C%C5%BE%2C%C3%BD%2C%C3%A1%2C%C3%AD%2C%C3%A9%2C%C3%BA%2C%C5%AF

If "escaped version" is deleted from virtualFolders.dat while "/ě,š,č,ř,ž,ý,á,í,é,ú,ů" is recorded in 1da6dcfe.msf, only one "/ě,š,č,ř,ž,ý,á,í,é,ú,ů" is shown after restart of Tb.
(1da6dcfe in virtualFolders.dat => associated folder name is obtained panacea.dat or somewhere else => "/ě,š,č,ř,ž,ý,á,í,é,ú,ů" is shown at folder pane)

If a Search folder is renamed to "ě,š,č,ř,ž,ý,á,í,é,ú,ů"(no quote), following only is generated(hashed name version only), and "Duplicate Search Folder" doesn't occur,
> uri=mailbox://nobody@Local%20Folders/1da6dcfe
although excess DIRECTORY of "1da6dcfe" is created.
i.e. WORKAROUND = Rename Search Folder, if special character is used
"Hashed file name of 1da6dcfe.msf for Search Folder name=ě,š,č,ř,ž,ý,á,í,é,ú,ů" may be my Japanese MS Win-XP only(System charset=Shift_JIS. NTFS is used and NTFS supports unicode file name. But NTFS supports Shift_JIS file name too.)

Tomáš Klos(bug opener), what file(and directory) is created and used by Tb on Mac OS for Search Folder with non-ASCII characters in name?
What is set in uri= line of virtualFolders.dat?
If I create Ss folder named "ěščřžýáíéúů" in TB, the folder named "ěščřžýáíéúů" and file named "ěščřžýáíéúů.msf" are created in profile_folder/Mail/Local Folders. In the virtualFolders.dat this record with folder's uri is created:

uri=mailbox://nobody@Local%20Folders/%C4%9B%C5%A1%C4%8D%C5%99%C5%BE%C3%BD%C3%A1%C3%AD%C3%A9%C3%BA%C5%AF

After TB restart, no change in profile_folder/Mail/Local Folders and virtualFolders.dat, but in TB the folder is duplicated.
If I delete one, the folder ěščřžýáíéúů and file ěščřžýáíéúů.msf are deleted too, but in virtualFolders.dat the record still exists.
After TB restart the folder ěščřžýáíéúů and file ěščřžýáíéúů.msf are re-created but the record in virtualFolders.dat file stay without change and the Ss folder in TB is duplicated again.

I hope this information helps.
I forgot: if I delete both folders in TB, the folder "ěščřžýáíéúů", "ěščřžýáíéúů.msf" file and relevant records in virtualFolders.dat are all deleted.
You perhaps pasted file name of "ěščřžýáíéúů" / "ěščřžýáíéúů.msf" on Mac OS X to this bug.
Why "ěščřžýáíéúů" / "ěščřžýáíéúů.msf" / ěščřžýáíéúů and / ěščřžýáíéúů.msf?

ů" / ů. / ů  / is perhaps by Unicode Normalization Form or Canonical Composition/Decomposition.
> http://unicode.org/reports/tr15/
This may depend on "where file name is obtained". from Finder, or from file system.

The bytes escaped in virtualFolders.dat are UTF-8.
> uri=mailbox://nobody@Local%20Folders/%C4%9B%C5%A1%C4%8D%C5%99%C5%BE%C3%BD%C3%A1%C3%AD%C3%A9%C3%BA%C5%AF
Last %C5%AF==0xC5AF is UTF-8 binary for U+016F==ů 
> http://www.fileformat.info/info/unicode/char/16f/index.htm
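
A quick way to check this is to decode the uri= line in Python. The sketch assumes only that the last path segment is the percent-escaped UTF-8 folder name, as quoted above.

```python
import unicodedata
from urllib.parse import unquote

uri = ("mailbox://nobody@Local%20Folders/"
       "%C4%9B%C5%A1%C4%8D%C5%99%C5%BE%C3%BD%C3%A1%C3%AD%C3%A9%C3%BA%C5%AF")

# Take the last path segment and undo the percent-escaping (UTF-8 by default).
name = unquote(uri.rsplit("/", 1)[1])

# List the code points; the last one is U+016F.
print([f"U+{ord(ch):04X}" for ch in name])

# True when the stored name is already in composed form (NFC).
print(name == unicodedata.normalize("NFC", name))
```

Each letter decodes to a single precomposed code point, so the name stored in virtualFolders.dat is in composed form (NFC).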

"Duplicate folder at folder pane" in your case may be difference between "ěščřžýáíéúů / ěščřžýáíéúů.msf from virtualFolders.dat" and "ěščřžýáíéúů" / ěščřžýáíéúů.msf from file name passed by OS/Finder".
".msf file name != folder name in uri of virtualFolders.dat" is common to this bug and bug 694933. However, "why != occurs" is completely different.
Not duping. Setting dependency to bug 694933 for ease of tracking and problem analysis.
Depends on: 694933
OS: All → Mac OS X
FYI.
> U+016F==ů / UTF-8=0xC5AF
> Decomposition : LATIN SMALL LETTER U (U+0075) COMBINING RING ABOVE (U+030A)
Funny character in this bug's comment is perhaps by "Composition of U+030A and next character" instead of "Composition of U+0075 and U+030A".
> LATIN SMALL LETTER U (U+0075) + COMBINING RING ABOVE (U+030A) + " in Unicode
> LATIN SMALL LETTER U (U+0075) + COMBINING RING ABOVE (U+030A) + . in Unicode
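
The decomposition can be checked with Python's unicodedata module. The sketch below also shows why a composed name (as in virtualFolders.dat) and a decomposed name (as Mac OS X is said to return for file names) compare unequal unless both are normalized first. `same_folder_name` is a hypothetical helper for illustration; whether Thunderbird does or should compare names this way is an assumption, not something confirmed in this bug.

```python
import unicodedata

composed = "\u016f"                                   # ů as one code point (NFC)
decomposed = unicodedata.normalize("NFD", composed)   # u + combining ring above

print([f"U+{ord(ch):04X}" for ch in decomposed])      # ['U+0075', 'U+030A']
print(composed.encode("utf-8").hex())                 # c5af
print(decomposed.encode("utf-8").hex())               # 75cc8a

def same_folder_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(composed == decomposed)                   # False: raw comparison differs
print(same_folder_name(composed, decomposed))   # True: equal after normalization
```

If the folder pane is built from both sources without normalization, the two spellings would look like two different folders, which fits the duplicate seen on Mac.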
Summary: Saved searches duplicity on Mac (with non-ASCII characters in name) → Saved searches duplicity on Mac (with special non-ASCII characters in name)
Whiteboard: [Mac OS X uses NFD (Normalization Form D) for filenames]
FYI. I've opened bug 928661 for # case in my comment #1.
Severity: normal → S3