Open Bug 1219084 Opened 9 years ago Updated 2 years ago

On macOS, filenames on disk differ from mailbox folder url in feeds.rdf, breaking feed url to folder relationship (use of composed/decomposed forms some utf8 chars)

Categories

(MailNews Core :: Feed Reader, defect)

Unspecified
macOS
defect

Tracking

(Not tracked)

People

(Reporter: ttbn.gm, Unassigned)

References

(Blocks 1 open bug)

Details

User Story

[Simplest/easiest Steps to reproduce problem]
1. On Mac OS X, rename any Feed folder in the Folder Pane to a
   folder name which contains a Japanese letter such as "プ" or "ド",
   a letter with an umlaut, etc.
   [ Unicode character whose NFD (decomposed form) != NFC (composed form) ]
   For example, "Ü".
     "Ü" = U+00DC , utf-8=0xC39C
     Decomposition LATIN CAPITAL LETTER U (U+0055)
                   COMBINING DIAERESIS (U+0308)
     U+0055 , utf-8=0x55
     U+0308 , utf-8=0xCC88
   mailbox URI (obtained via nsIMsgFolder.URI) is saved in
   feeds.rdf, msgFilterRules.dat, virtualFolders.dat.
     mailbox URL with NFC(composed form)
       mailbox://<Feeds or abc%40x.y.z@x.y.z>/%C3%9C
     mailbox URL with NFD(decomposed form)
       mailbox://<Feeds or abc%40x.y.z@x.y.z>/%55%CC%88 
   Absolute(full) file path in HFS+(==NFD) is written in panacea.dat always.
2. Restart Thunderbird.
3. (a) mailbox URL written in fz:feed of feeds.rdf is NFC(composed form).
   (b) after restart, search key(mailbox URL) for fz:feed entry is
       msgFolder.URI which is NFD(decomposed form).
    => mismatch between (a) and (b) causes "URL of the feed is lost".
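A minimal sketch of the mismatch in step 3, assuming only standard JavaScript string APIs (the variable names are illustrative, not Thunderbird code). Note that encodeURIComponent leaves the ASCII letter U unescaped, so the decomposed segment prints as U%CC%88 rather than the %55%CC%88 spelling used above; both denote the same bytes.

```javascript
// Illustrative only: the same visible character "Ü" yields two different
// percent-encoded path segments depending on its normalization form.
const composed = "\u00DC";                     // Ü in NFC (one code point)
const decomposed = composed.normalize("NFD");  // U + COMBINING DIAERESIS

const nfcSegment = encodeURIComponent(composed);
const nfdSegment = encodeURIComponent(decomposed);

console.log(nfcSegment);                 // segment as written at subscription
console.log(nfdSegment);                 // segment as derived from an NFD file name
console.log(nfcSegment === nfdSegment);  // the two keys do not match
```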

[Workarounds]
(i) Legitimate workaround :
- Upon subscription of a new feed on Mac OS X,
  if a character whose "composed form" != "decomposed form" is used in the Title
  (for example, "プ" in "アップル", "ド" in "スラド", "ブ" / "グ" in "ブログ"),
  change the Title so that it uses only characters whose "composed form" === "decomposed form".
- If already subscribed and the problem has occurred on Mac OS X,
  delete the folder for the feed in the Folder Pane,
  then subscribe to the feed again with a changed Title.

(ii) Corner cutting workaround :
- Subscribe to the feed.
- Before the problem starts to occur (i.e. before restarting Tb),
  rename the folder for the feed to a safe name
  ("composed form" === "decomposed form").
  The file name is changed, and the mailbox URL for the new name is written in feeds.rdf.

Attachments

(5 files, 4 obsolete files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:41.0) Gecko/20100101 Firefox/41.0
Build ID: 20151014143721
Steps to reproduce: On Mac OS X, for example, subscribe to "http://www.apple.com/jp/main/rss/hotnews/hotnews.rss" in Feeds. Quit Thunderbird. Restart Thunderbird.
Actual results: The icon of the subscribed item changes to a plain folder icon, and the feed is never updated again. After deleting the feed item, the same feed cannot be subscribed to again.
Expected results: This happens with several feed items. They should behave like the other, working items.
Sorry, I'm not good at English.
This feed works fine for me (linux, win) and there isn't anything about feeds processing special to mac. Please read the guide[1], and turn on debugging. Without error logs or more precise exact steps to reproduce, this bug can only be closed incomplete. [1] https://support.mozilla.org/en-US/kb/how-subscribe-news-feeds-and-blogs
[A STR reported to a forum in Japan]
1. Create a new Tb profile and define a dummy news account in order to create the Local Folders account.
2. Create a Feeds account.
3. Subscribe to http://www.apple.com/jp/main/rss/hotnews/hotnews.rss
4. Restart Tb.
=> The URL in Feed Subscription is null, and the icon is lost.
=> Already fetched feed items can be viewed, and Repair Folder works, because Folder.msf/Folder is healthy.
The problem appears to be the order of definitions in feeds.rdf created by Tb 38 on Mac OS X. Due to the unexpected order, Tb fails to find the fz:feed entry. "fz:feed after fz:feeds" may be a cause. If the feeds.rdf is copied to Win, the same problem occurs. There is no problem with feeds.rdf created by Tb 38 on Win.
OS: Unspecified → Mac OS X
Attached file winfeeds.rdf
Subscribe http://www.apple.com/jp/main/rss/hotnews/hotnews.rss feeds.rdf on Windows8.1+Thunderbird38.3.0
Attached file macfeeds.rdf
Subscribe http://www.apple.com/jp/main/rss/hotnews/hotnews.rss feeds.rdf on Mac OS X 10.10.5+Thunderbird38.3.0
Attached file diff.txt
diff winfeeds.rdf macfeeds.rdf
To alta88: please look at the feeds.rdf content. What's wrong with feeds.rdf on Mac OS X?
Flags: needinfo?(alta88)
I don't have a mac so this is difficult to diagnose. There is nothing wrong with the macfeeds.rdf (on linux), once a folder named "アップル - ホットニュース" is created and macfeeds.rdf is copied to /Mail/Feeds as feeds.rdf and restart, the icon appears and feeds download, across restarts. Wada, please try some things:
1. Is this the only feed that doesn't work or are there other ja feeds that show the same problem? Change pref Feeds.logging.console to debug and restart.
2. In a fresh feeds account, first create a folder named "アップル - ホットニュース" (the feed's title), then subscribe the feed url to that folder via Subscribe Dialog, rather than subscribing to the account (which autocreates the subfolder based on title). Results?
3. Create an ascii folder name like "Apple" and subscribe the feed url to that folder. Results?
Is there anything suspicious in the error console?
Flags: needinfo?(alta88)
(In reply to alta88 from comment #7)
> There is nothing wrong with the macfeeds.rdf (on linux),
> once a folder named "アップル - ホットニュース" is created
> and macfeeds.rdf is copied to /Mail/Feeds as feeds.rdf and restart,
> the icon appears and feeds download, across restarts.
When did you copy macfeeds.rdf to Linux? After termination of Tb, before restart of Tb? I also don't have Mac OS X. When I copied another feeds.rdf (created on Mac OS X) to Win (after termination of Tb, before restart of Tb), the problem was reproduced, and the problem disappeared when I manually changed the definition order in feeds.rdf on Win.
Tb is closed, macfeeds.rdf copied to feeds.rdf. It won't work to copy macfeeds.rdf to win since folder names there are hashed (when you create the folder matching the title) and won't match the utf8 encoding in the original macfeeds.rdf file. So I'm not sure what worked for you.. The order of tags in feeds.rdf doesn't matter and I doubt there's been any change in rdf or in anything lower level on mac to make it matter. Someone with a mac needs to do the steps in comment 7 to verify this isn't something specific to the OP's setup or actions. For example, if the OP renamed a folder on disk with Tb closed, this bug is invalid.
(In reply to alta88 from comment #9)
I see. When I tested, I deleted panacea.dat while testing. That may be relevant to the use of the copied feeds.rdf (for example, with no panacea.dat entry, data is re-created from scratch using feeds.rdf).
Two feeds on which the problem was reported in the forum in Japan:
<fz:feed RDF:about="http://rss.rssad.jp/rss/slashdot/slashdot.rss" dc:title="スラド" </fz:feed>
<fz:feed RDF:about="http://www.apple.com/jp/main/rss/hotnews/hotnews.rss" dc:title="アップル - ホットニュース" </fz:feed>
"ド" and "プ" are combined characters in Unicode, so each has a composed form and a decomposed form. IIRC, Mac Finder used the decomposed form and the file system used the composed form. Can such a thing be relevant to the problem?
FYI. The same problem in Tb 31 was reported to the forum in Japan by a duplication test with Tb 31 for the following feed, although the phenomenon seems slightly different. (feeds written in comment #0)
<fz:feed RDF:about="http://www.apple.com/jp/main/rss/hotnews/hotnews.rss" dc:title="アップル - ホットニュース" </fz:feed>
So, this is perhaps not a regression in Tb. The sites might recently have changed their titles from "Slash Dot", "Apple" to "スラド", "アップル" respectively (these are Slash Dot/Apple in Japanese).
(In reply to WADA from comment #10)
> (In reply to alta88 from comment #9)
>
> I see.
> When I tested, I deleted panacea.dat while testing. It may be relevant to
> copied feeds.rdf use(for example, no panacea.dat entry, so data is
> re-created from scratch using feeds.rdf).
>
> Two feeds on which problem was reported in forum in Japan.
>
> <fz:feed RDF:about="http://rss.rssad.jp/rss/slashdot/slashdot.rss"
> dc:title="スラド"
> </fz:feed>
> <fz:feed RDF:about="http://www.apple.com/jp/main/rss/hotnews/hotnews.rss"
> dc:title="アップル - ホットニュース"
> </fz:feed>
>
> "ド" and "プ" is combined character in Unicode, so it has composed form and
> decomposed form.
> IIRC, Mac Finder used decomposed form and file system used composed form.
>
> Can such thing be relevant to problem?
The folder name stored in feeds.rdf at subscription time must match the disk folder name (hashed on win; pretty name in panacea.dat, which adds an extra confusion/failure point). Any change in folder name must be done in Tb so the folder<->url relationship is maintained.
(In reply to WADA from comment #11)
> FYI.
> Same problem in Tb 31 was reported to forum in Japan by duplication test
> with Tb 31 for following, although phenomenon seems slightly different.
> (feeds written in comment #0)
> <fz:feed RDF:about="http://www.apple.com/jp/main/rss/hotnews/hotnews.rss"
> dc:title="アップル - ホットニュース"
> </fz:feed>
> So, this is perhaps not regression in Tb.
> Site might recently have changed from "Slash Dot", "Apple" to "スラド", "アップル"
> respectively(These are Slash Dot/Apple in Japanese.
The feed title only matters on first subscription to get a relevant folder name. After that it can change, and isn't updated in feeds.rdf. The user can also change the title in Subscribe to whatever they want. If the feed url changes, and there's no notice to the user (bad site behavior) then there's nothing for Tb to do about it.
In fact, Tb can autodetect and update a feed url change so it's transparent to the user (see the guide on how, Bug 304917). But the publisher has to care.
Thanks for your explanation. I asked for a test with file name=Apple and SlashDot at the forum in Japan. I'll check the escaped utf-8 binary for the file name in feeds.rdf.
Another question on "order in feeds.rdf". The following is the template of feeds.rdf (spaces for indentation are by me).
http://mxr.mozilla.org/comm-central/source/mailnews/extensions/newsblog/content/FeedUtils.jsm#1057
> 1057 FEEDS_TEMPLATE: '<?xml version="1.0"?>\n' +
> 1058 '<RDF:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"\n' +
> 1059 ' xmlns:fz="urn:forumzilla:"\n' +
> 1060 ' xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n' +
> 1061 ' <RDF:Description about="urn:forumzilla:root">\n' +
> 1062 ' <fz:feeds>\n' +
> 1063 ' <RDF:Seq>\n' +
> 1064 ' </RDF:Seq>\n' +
> 1065 ' </fz:feeds>\n' +
> 1066 ' </RDF:Description>\n' +
> 1067 '</RDF:RDF>\n',
Where is <fz:feed></fz:feed> for each Feed inserted? (Tb's design and actual implementation)
From the perspective of RDF, I think the following is valid.
> <?xml version="1.0"?>
> <RDF:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
>          xmlns:fz="urn:forumzilla:"
>          xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>   <RDF:Description about="urn:forumzilla:root">
>   </RDF:Description>
>   <RDF:Seq>
>   </RDF:Seq>
>   <fz:feeds>
>   </fz:feeds>
> </RDF:RDF>
In this case, where should <fz:feed></fz:feed> for each Feed be inserted? Can it be inserted anywhere?
It's determined by the core rdf api (ie the feed system callers of the api don't determine this) and doesn't matter. The rdf structure is in memory and changes are flushed to disk at the end of a biff cycle and on Subscribe changes. I think this bug is invalid/worksforme but will leave it to you to decide Wada, if you can investigate..
(In reply to alta88 from comment #14)
> I think this bug is invalid/worksforme but will leave it to you to decide Wada, if you can investigate..
But the problem actually occurs for some Mac OS X/Thunderbird 38 users in Japan who subscribe to some feeds in Japan... Why can it be INVALID? Fault of the publisher? Why can it be WORKSFORME merely because "Win/Linux users can't reproduce the problem", even though it is a "problem on Mac OS X"?
(In reply to alta88 from comment #12)
> (hashed on win, pretty name in panacea.dat which adds extra confusion/failure point).
IIUC, panacea.dat holds the absolute path in the file system with escaping on Win and Linux, but a PersistentID on Mac OS X. This kind of difference is a cause of our confusion, in addition to "decomposed form in Finder and composed form in file system on Mac OS X".
(In reply to alta88 from comment #14) > It's determined by the core rdf api (ie the feed system callers of the api > don't determine this) and doesn't matter. Does it mean that RDF(XML) file content(structure/order) depends on OS's RDF/XML processor? If so, difference of feeds.rdf content(structure/order) between Mac OS X and Win can be explained, because I couldn't find difference of Tb's code among OSes(I couldn't find widget like code for each OS).
FYI. Quick summary of the report at the forum in Japan.
> (a) feeds.rdf created on Mac
>                              Mac OS X          Linux                       Win
>   feeds.rdf created on Mac : correctly parse   correctly parse             (*) fails to parse
>                                                creates same order as Mac   creates different order
> (b) Find fz:feed for
>   "アップル", "スラド"     : failed to find    no problem                  no problem
What I saw in my tests was (*) of (a). What happened on Mac OS X was (b). (b) seems to work as follows:
(i) On subscription, when the file is created from the title of the feed, the composed form is used for the file name (via the file system API), and the escaped mailbox URL is written in feeds.rdf. The PersistentID is saved in panacea.dat on Mac OS X.
(ii) On restart, the folder file is accessed via the PersistentID saved in panacea.dat. Because the file is accessed normally, the feed items in it are shown normally. The file name is obtained via the PersistentID, and is perhaps in decomposed form (via Finder). So, the mailbox URI generated from the "decomposed form" is different from the mailbox URI in feeds.rdf. Then, Tb fails to find the fz:feed entry for the feed.
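The failed lookup in (b) can be sketched as below. This is a toy model under stated assumptions: a plain Map stands in for the fz:feed entries of feeds.rdf, and the "Feeds" account name and all variable names are hypothetical. It is not how Thunderbird's RDF datasource is implemented.

```javascript
// Toy model of feeds.rdf: destFolder mailbox URL -> feed URL.
const feedsByFolder = new Map();
const accountBase = "mailbox://nobody@Feeds/";

// (i) At subscription the key is built from the feed title (composed form).
const title = "\u30B9\u30E9\u30C9"; // スラド in NFC
feedsByFolder.set(
  accountBase + encodeURIComponent(title),
  "http://rss.rssad.jp/rss/slashdot/slashdot.rss"
);

// (ii) After restart the folder name comes back decomposed (assumed here to
// model the name reported for an HFS+ volume).
const nameFromDisk = title.normalize("NFD");
const lookupKey = accountBase + encodeURIComponent(nameFromDisk);

// The lookup key differs from the stored key, so the feed URL is "lost".
const found = feedsByFolder.get(lookupKey);
console.log(lookupKey);
console.log(found);
```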
Thanks Wada. Can you verify that if the folder name アップル - ホットニュース is created first, and then the feed is subscribed to that folder, that it works? Just so I understand, in that utf8 ja title, the char プ has two forms, a composed and decomposed, and that on mac the disk file name uses the composed form when created, while the mailbox url foldername stored in feeds.rdf uses the decomposed form? And that the name in panacea.dat is using the composed form? On linux, can you verify that the filename on disk as well as the mailbox folder url are both using composed form? What is the composed/decomposed version of プ ?
Status: UNCONFIRMED → NEW
Component: Untriaged → Feed Reader
Ever confirmed: true
Product: Thunderbird → MailNews Core
Summary: Some feeds in Thunderbird isn't updated in mac. → On macOS, filenames on disk differ from mailbox folder url in feeds.rdf, breaking feed url to folder relationship (use of composed/decomposed forms some utf8 chars)
(In reply to alta88 from comment #18)
> Thanks Wada. Can you verify that if the folder name アップル - ホットニュース is
> created first, and then the feed is subscribed to that folder, that it works?
The following in macfeeds.rdf (and in winfeeds.rdf too) is the escaped utf-8 of the folder name (== title of feed).
mailbox://nobody@Feeds-2/%E3%82%A2%E3%83%83%E3%83%97%E3%83%AB%20-%20%E3%83%9B%E3%83%83%E3%83%88%E3%83%8B%E3%83%A5%E3%83%BC%E3%82%B9
After subscribing and folder creation, then terminating and restarting Tb, already fetched feed items can be viewed normally. However, the URL of the feed is cleared in the Subscribe Feed panel.
> Just so I understand, in that utf8 ja title, the char プ has two forms, a
> composed and decomposed, and that on mac the disk file name uses the
> composed form when created, while the mailbox url foldername stored in
> feeds.rdf uses the decomposed form?
No. The mailbox URL in feeds.rdf, which is set upon subscription, is in composed form.
> And that the name in panacea.dat is using the composed form?
No. IIRC, it's a PersistentID in panacea.dat on Mac OS. We are trying to check with ascii file names such as Apple and SlashDot in the forum in Japan, as you requested. If a message filter is defined and its Copy/Move target is set to a "Feed folder", the current mailbox URI is set in msgFilterRules.dat. We'll check it in the forum.
FYI. 3rd example presented at the forum. "ブ" and "グ" each have both a composed form and a decomposed form.
> <fz:feed RDF:about="http://www.mozilla.jp/blog/feed/" ...
> dc:title="Mozilla Japan ブログ"
> dc:identifier="http://www.mozilla.jp/blog/feed/">
> <fz:destFolder RDF:resource="mailbox://nobody@Feeds/Mozilla%20Japan%20%E3%83%96%E3%83%AD%E3%82%B0"/>
> </fz:feed>
(In reply to WADA from comment #19)
> (In reply to alta88 from comment #18)
> > Thanks Wada. Can you verify that if the folder name アップル - ホットニュース is
> > created first, and then the feed is subscribed to that folder, that it works?
>
> Following in macfeeds.rdf(and in winfeeds.rdf too) is escaped utf-8 of the
> folder name(==title of feed).
> mailbox://nobody@Feeds-2/%E3%82%A2%E3%83%83%E3%83%97%E3%83%AB%20-
> %20%E3%83%9B%E3%83%83%E3%83%88%E3%83%8B%E3%83%A5%E3%83%BC%E3%82%B9
> After subscribe and folder creation, terminate and restart Tb, already
> fetched feed items can be viewed normally.
> However, URL of the feed is cleared in Subscribe Feed panel.
>
Right, but what I mean is, in a clean profile/new feed account, first create the folder name. Then restart. Then subscribe the feed url to that folder. Most subscribes are to the account folder, thus forcing a subfolder creation and slightly different steps. It may also mean that on restart/reread of os file, the name internal to Tb is now composed, so a new subscribe would match (and not use the title in the feed at all). I think the problem is the title in the publisher's feed file is decomposed, and that is the folder name requested for a new folder, even while the mailbox url is correctly composed/encoded. But it may be possible to normalize such combinatorial chars so only composed forms are used for the name when creating a folder.
Attached patch normalize.patch (obsolete)
It would be great if someone with a mac could apply/test this patch.
(In reply to alta88 from comment #22)
> normalize.patch
mailbox URI in feeds.rdf:
(1) File is created from dc:title="スラド" by Tb.
    mailbox://nobody@Feeds-9/%E3%82%B9%E3%83%A9%E3%83%89
    Problem occurs.
(2) File of "スラド" is manually created, Tb is restarted, then "スラド" is subscribed.
    mailbox://nobody@Feeds-3/%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99
    Problem doesn't occur.
(3) File of "SlashDot" is manually created, Tb is restarted, then "スラド" is subscribed.
    mailbox://nobody@Feeds-3/SlashDot
    Problem doesn't occur.
So, the comment in your patch is slightly inaccurate. The file name created has "composed form" (in the file system), unless decomposed form is intentionally requested. In almost all cases, "composed form" is used. This is true of feed titles provided by publishers, as it is of HTML on Web sites. I think this is one of the phenomena/issues due to "Finder returns decomposed form".
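To see what the two escaped names in cases (1) and (2) actually contain, they can be decoded and listed code point by code point. A minimal sketch using only standard JavaScript; the helper name codePoints is made up for this illustration.

```javascript
// Illustrative helper: list the code points of a string as U+XXXX labels.
const codePoints = (s) =>
  Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));

// Folder-name segments taken from the mailbox URLs of cases (1) and (2).
const case1 = decodeURIComponent("%E3%82%B9%E3%83%A9%E3%83%89");
const case2 = decodeURIComponent("%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99");

console.log(codePoints(case1)); // 3 code points, last one is ド (U+30C9)
console.log(codePoints(case2)); // 4 code points, ト (U+30C8) + U+3099

// Case (1) is the NFC form and case (2) the NFD form of the same name.
console.log(case1 === case2.normalize("NFC"));
console.log(case2 === case1.normalize("NFD"));
```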
Our test results.
- filenames on disk (== in file system) == mailbox folder url in feeds.rdf (composed form)
  => problem occurs
- filenames on disk (== in file system) != mailbox folder url in feeds.rdf (decomposed form), i.e. filenames from Finder (decomposed form) == mailbox folder url in feeds.rdf (decomposed form)
  => no problem
FYI. getSanitizedFolderName is defined/used at:
http://mxr.mozilla.org/comm-central/search?string=getSanitizedFolderName&find=&findi=&filter=^[^\0]*%24&hitlimit=&tree=comm-central
getSanitizedFolderName is called here.
http://mxr.mozilla.org/comm-central/source/mailnews/extensions/newsblog/content/Feed.js#86
> 86 get folderName()
> 87 {
> 88   if (this.mFolderName)
> 89     return this.mFolderName;
> 90
> 91   // Get a unique sanitized name. Use title or description as a base;
> 92   // these are mandatory by spec. Length of 80 is plenty.
> 93   let folderName = (this.title || this.description || "").substr(0,80);
> 94   let defaultName = FeedUtils.strings.GetStringFromName("ImportFeedsNew");
> 95   return this.mFolderName = FeedUtils.getSanitizedFolderName(this.server.rootMsgFolder,
> 96                                                              folderName,
> 97                                                              defaultName,
> 98                                                              true);
> 99 },
(i) If the file is already created and Tb is restarted before feed subscription, the "already-known file name" is used.
    => mailbox URI in feeds.rdf is decomposed form. This indicates the "already-known file name" is decomposed form.
(ii) If the file doesn't exist, the file name is generated from dc:title, and is created via the file system api.
    => mailbox URI in feeds.rdf is composed form. This indicates the "file name created by file system" is composed form.
getSanitizedFolderName is called here too.
http://mxr.mozilla.org/comm-central/source/mailnews/extensions/newsblog/content/feed-subscriptions.js#2525
This is code in
2389 importOPMLOutlines: function(aBody, aRSSServer, aCallback)
I think "normalize for existent file" is better executed at an earlier stage, for example at "folder file open" or "during folder re-discovery".
Do these comments after comment 22 apply to tests with the patch applied? The filename requested will always be composed form, there should be no mismatch. Unless osx is decomposing, which would be extremely stupid. In the case of creating the folder name manually, it can be composed or decomposed, but will not matter to feeds, as the mailbox url will be used as given by the os in feeds.rdf.
For info:
var puC = プ = "\u30d7"
var puD = プ = "\u30d5\u309A" = フ + ゚
var norm = puD.normalize("NFKC");
norm.charCodeAt(0).toString(16) = 30d7
It seems gecko automatically normalizes for presentation if "\u30d5\u309A" is given (there must be a standard here), so it's really necessary to look at the bits in a stream (like the feeds xml file). You can't tell what form the publisher used just by looking at the source of the feed in a browser.
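The "For info" notes above can be written as a runnable check. This is only a restatement of those notes in standard JavaScript (String.prototype.normalize), reusing the names puC and puD from the comment; nothing here is Thunderbird-specific.

```javascript
// The composed and decomposed forms of プ render identically but are
// different strings until they are normalized to a common form.
const puC = "\u30D7";        // composed: KATAKANA LETTER PU
const puD = "\u30D5\u309A";  // decomposed: フ + combining semi-voiced mark

const equalRaw = puC === puD;                    // compared as stored
const equalNFC = puD.normalize("NFC") === puC;   // compared after composing
const equalNFD = puC.normalize("NFD") === puD;   // compared after decomposing
const normHex = puD.normalize("NFKC").charCodeAt(0).toString(16);

console.log(equalRaw, equalNFC, equalNFD, normHex);
```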
Attached patch normalize2.patch (obsolete)
So it appears that unlike everyone else, osx is in fact decomposing before storing, regardless of requested form. Wada, could you request testing of this patch? It decomposes in compatibility mode NFKD, which is broadest, instead of canonical mode, since that mode includes issues of full/half width katakana. I didn't see anything more specifically helpful in apple dev docs.
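The difference between canonical (NFD) and compatibility (NFKD) decomposition mentioned above can be seen on half-width katakana. A small sketch assuming only the standard String.prototype.normalize; the hex helper is made up for the illustration, and whether the obsolete patch behaved exactly like this is not verified here.

```javascript
// Illustrative helper: space-separated hex code points of a string.
const hex = (s) =>
  Array.from(s, (c) => c.codePointAt(0).toString(16)).join(" ");

const fullWidthPu = "\u30D7";        // プ (full-width, composed)
const halfWidthPu = "\uFF8C\uFF9F";  // half-width フ + half-width semi-voiced mark

// Canonical decomposition splits プ but leaves the half-width pair untouched.
const nfdFull = hex(fullWidthPu.normalize("NFD"));
const nfdHalf = hex(halfWidthPu.normalize("NFD"));

// Compatibility decomposition also maps the half-width pair to the
// full-width base letter plus combining mark.
const nfkdHalf = hex(halfWidthPu.normalize("NFKD"));

console.log(nfdFull, "|", nfdHalf, "|", nfkdHalf);
```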
Wada, this patch isn't a complete case solution, nevermind. I'll have a new one soon. The folder url key in feeds.rdf will need to be normalized, as well as accesses based on folder name, and will have to consider folder name changes, moves, etc.
FYI. The following is the content of msgFilterRules.dat and virtualFolders.dat which was reported at the forum in Japan.
- Manually created folder named "スラド" for the feed by SlashDot Japan.
- Copy/Move target of a message filter.
- Search target folder of a saved search folder.
msgFilterRules.dat, actionValue:
> mailbox://nobody@Feeds-3/%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99
virtualFolders.dat, scope(path of Search Folder himself):
> mailbox://nobody@Feeds-3/%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99/test
virtualFolders.dat, scope(search target folder):
> mailbox://nobody@Feeds-3/%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99
Content of panacea.dat:
> (185
> =/Users/meeyar/Library/Thunderbird/Profiles/1jfb3cyz.feed/Mail/Feeds-3\
> /Trash.msf)(1AA=1446374947)(186
> =/Users/meeyar/Library/Thunderbird/Profiles/1jfb3cyz.feed/Mail/Feeds-3)
> (188
> =/Users/meeyar/Library/Thunderbird/Profiles/1jfb3cyz.feed/Mail/Feeds-3\
> /$E3$82$B9$E3$83$A9$E3$83$88$E3$82$99.msf)(1BC=10)(1D8=141a8)(1DE
> =1446375429)(189
> (1A9=1446374920)(1DA
> =/Users/meeyar/Library/Thunderbird/Profiles/1jfb3cyz.feed/Mail/Feeds-3\
> /$E3$82$B9$E3$83$A9$E3$83$88$E3$82$99.sbd/test.msf)
Although the name is "3 chars in Unicode", 4 * "3 bytes utf-8 binary" are used. As for the "folder path in Tb on Mac OS X", it seems the proper form is "decomposed form". I.e. what's bad in this bug is: when a folder is newly created from dc:title by Thunderbird upon subscription, FeedUtils doesn't convert it to the proper "decomposed form", even though the proper form is "decomposed form" in Tb on Mac OS X.
=> Your patch should be "decompose()" instead of "normalize()" :-)
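The byte counts above can be checked directly. A minimal sketch assuming a runtime with the standard TextEncoder; the dollarHex helper only imitates the $XX notation visible in the panacea.dat excerpt and is not Thunderbird's actual encoder.

```javascript
// Hypothetical helper: render UTF-8 bytes in the $XX style seen above.
const dollarHex = (s) =>
  Array.from(new TextEncoder().encode(s), (b) =>
    "$" + b.toString(16).toUpperCase().padStart(2, "0")).join("");

const folderName = "\u30B9\u30E9\u30C9"; // スラド, 3 code points in NFC

const nfcBytes = new TextEncoder().encode(folderName.normalize("NFC")).length;
const nfdBytes = new TextEncoder().encode(folderName.normalize("NFD")).length;
const nfdDollar = dollarHex(folderName.normalize("NFD"));

console.log(nfcBytes);   // 3 characters x 3 bytes
console.log(nfdBytes);   // 4 code points x 3 bytes
console.log(nfdDollar);  // same byte sequence as in the excerpt
```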
The above is a reason for "incompatibility of profile directory (message filter/Virtual Folder)" between "Mac OS X" and "Win/Linux" in special cases.
Sorry, typo. virtualFolders.dat, scope(path of Search Folder himself) => virtualFolders.dat, uri(path of Search Folder himself)
(In reply to alta88 from comment #27)
> Do these comments after 22 apply to tests with the patch applied??
No. I can't build modules, and the purpose of our tests was a "problem duplication test with current Thunderbird".
> The filename requested will always be composed form, there should be no mismatch.
> Unless osx is decomposing, which would be extremely stupid.
"Presenting as decomposed form" is done by Mac OS X for uniqueness in comparison, for consistency in sorting, and so on. It is an already known phenomenon. When a file name is copied in Finder, the string pasted into an application's input field is in "decomposed form". So, it differs from what the application expects, which is usually composed form. IIRC, it might have been resolved by "convert to composed form on the Copy side".
Sorry, my descriptions were wrong. On HFS+, "decomposed form" is used by the file system.
http://wiki.bazaar.canonical.com/UnicodeSupport
> Mac OS Extended (HFS+) - the default and recommended file system - uses canonically decomposed Unicode 3.2 in UTF-16 format.
A possible clever solution is the following. After creating the new folder from dc:title upon feed subscription, close it, re-open it as a "mail folder in Tb", get the actual file name in Mac OS X as is done for an existing file/folder, then generate the mailbox url in feeds.rdf.
(In reply to alta88 from comment #28)
> Created attachment 8681707
> normalize2.patch
I think it's a correct resolution, but I don't think "explicit normalization there, merely to bypass this bug caused by a special spec in Mac OS X" is the best way as a proper/generic solution...
Please note that "normalization there" may not work if the user uses "composed form in HFS+". As it is stated to be the "default", it seems that "decomposed form" or "composed form" is optional in HFS+.
The folder for a new feed appears to be created here. http://mxr.mozilla.org/comm-central/source/mailnews/extensions/newsblog/content/feed-subscriptions.js#1299 Is "Close folder, re-open folder, get actual file name from Mac OS X" possible here?
User Story: (updated)
FYI. Wikipedia for HFS+. https://en.wikipedia.org/wiki/HFS_Plus Ruby is also bothered by it. https://bugs.ruby-lang.org/issues/7267 In HFS+, some ranges of Unicode are kept in "composed form" instead of "decomposed form" for interoperability with old Mac. No need to pay attention to it?
Comment on attachment 8681707 normalize2.patch
Review of attachment 8681707:
(1) The added code from "+ let str = "", j = 0;" to "+this.log.debug("getSanitizedFolderName: decomposed unicode - " +str);" looks to be for Mac OS X only. If so, the entire block is better placed in an "if (Application.platformIsMac)" block.
(2) I think the added code is better defined as a function, even if it's used here only, for ease of understanding the overall logic.
(3) Does this code work when the user sets the option of "composed form for HFS+" instead of "decomposed form (default, recommended)"?
(4) How about other file systems on Mac OS X?
That patch was for testing and is obsolete, thus the 'nevermind'..
The solution here is to always make the destFolder key a canonically composed normalization of the nsIMsgFolder.URI string, thus removing any platform quirk or portability issues.
Assignee: nobody → alta88
Attachment #8681645 - Attachment is obsolete: true
Attachment #8681707 - Attachment is obsolete: true
Attachment #8682065 - Flags: review?(mkmelin+mozilla)
(In reply to alta88 from comment #42)
> normalizeFolderURI.patch
> The solution here is to always make the destFolder key a canonically composed normalization
> of the nsIMsgFolder.URI string, thus removing any platform quirk or portability issues.
It sounds like a pretty good solution to me, because the mailbox URL in feeds.rdf is merely a key to access fz:feed in feeds.rdf. For such a key, "decomposed form" is better, as currently done in the file system of Mac OS X. feeds.rdf can now be shared by Triple-Boot Mac/Linux/Win and Mac/Linux/Win Virtual Machines :-)
(concern-1) If a feed like "スラド" is already subscribed on Win/Linux, the same problem as this bug occurs on Win/Linux after the patch. I think code like the following is needed: if fz:feed is not found by a search with the "decomposed form" key, search with the "composed form" key. If possible, update the mailbox URL in feeds.rdf to the official "decomposed form" key (not mandatory; for backward compatibility, it's better kept as-is in feeds.rdf). With this, a Mac OS X user who is currently experiencing this bug is cured merely by upgrading to the new Tb release.
(concern-2) How about backward compatibility of feeds.rdf when a feed like "スラド" is newly subscribed by the new Tb and the feeds.rdf is then used by an old Tb?
(question) The "mailbox URL in feeds.rdf" is merely a key to search fz:feed from a mail folder. After the patch, "mailbox URL of actual folder" != "mailbox URL in feeds.rdf" can occur on Win/Linux. It's confusing for users and QA people if a problem in Feeds is reported to B.M.O for a special feed like "スラド". Is there any plan to change it to something like "feeds-mailbox:// ...", to avoid future confusion on Win/Linux?
Note: This kind of change surely produces a "backward compatibility problem".
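The fallback lookup proposed in (concern-1) could look roughly like the sketch below. It is a hypothetical illustration: findFeedForFolder and the Map standing in for feeds.rdf are invented for this example and are not part of FeedUtils or of any attached patch.

```javascript
// Hypothetical helper: look up a feed by folder URI, trying the decomposed
// key first and then the composed key, as suggested in (concern-1).
function findFeedForFolder(feedsByFolder, folderURI) {
  const decoded = decodeURI(folderURI);
  for (const form of ["NFD", "NFC"]) {
    const key = encodeURI(decoded.normalize(form));
    if (feedsByFolder.has(key)) {
      return { key, feedURL: feedsByFolder.get(key) };
    }
  }
  return null; // no entry under either form
}

// Existing entry written by an older Tb with the composed (NFC) key.
const store = new Map([
  ["mailbox://nobody@Feeds-9/%E3%82%B9%E3%83%A9%E3%83%89",
   "http://rss.rssad.jp/rss/slashdot/slashdot.rss"],
]);

// The folder URI as reported after restart on Mac OS X (decomposed).
const hit = findFeedForFolder(
  store, "mailbox://nobody@Feeds-9/%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99");
console.log(hit);
```

Trying both forms keeps existing feeds.rdf files readable without rewriting their keys, which matches the "better kept as-is" remark above.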
Comment on attachment 8682065 normalizeFolderURI.patch
The assumption was that title names were composed and that osx decomposed, if that's not always the case and there are feeds with decomposed file names, then this won't work on those existing keys. The best solution may be to get the real osx folder uri, before storing the feeds.rdf key, if osx 'sometimes does this, sometimes that' to combinatorial forms. But without a mac, writing this patch is guessing. Probably using getChildNamed() 2x after folder creation with both forms of the name will find the real decomposed (or not) handle, etc etc.
Attachment #8682065 - Flags: review?(mkmelin+mozilla)
(In reply to alta88 from comment #44)
> normalizeFolderURI.patch
>
> The best solution may be to get the real osx folder uri, before storing the
> feeds.rdf key, if osx 'sometimes does this, sometimes that' to combinatorial
> forms. But without a mac, writing this patch is guessing. Probably using
> getChildNamed() 2x after folder creation with both forms of the name will
> find the real decomposed (or not) handle, etc etc.
My idea in comment #35 is similar to it. If such an action is not so easy or so clean, I think the concept of your current patch, "use normalize("NFC") or normalize("NFD") for the key to access fz:feed in feeds.rdf, always, on any OS", is a reasonable and valid solution. Even if a backward compatibility issue is involved, and even if this bug may occur for Win/Linux users who have already subscribed to problematic feeds, the number of affected users and affected RSS feeds is pretty limited. I believe this is what is usually called an "edge case". If many users/many feeds were involved, many complaints about this bug would have been posted at many forums. However, AFAIK, the problem actually occurred on only 3 feeds, for several Mac OS X users in Japan. All data in this bug which is relevant to this bug on Mac OS X was provided through intentional duplication tests by a Mac OS X user who had not experienced this bug himself. He did the tests because the problem was reported by one Mac OS X user at one forum in Japan. I also did duplication tests, on Win only, and I also had not subscribed to the 3 problematic feeds in Japan.
Comment on attachment 8682065 normalizeFolderURI.patch
Review of attachment 8682065:
Oh, sorry, I confused it with the old patch. It is "use normalize("NFC") as the key to search fz:feed in feeds.rdf, always", on any OS.
+ getNormalizedFolderURI: function(aFolderURI) {
+   return this.rdf.GetResource(encodeURI(decodeURI(aFolderURI).normalize("NFC")));
+ },
On Win/Linux: the file name is in composed form, so there is no problem, including both backward and forward compatibility.
On Mac OS X: this bug, which occurs when the "feed folder is created by Tb from the title of the feed", is resolved.
Outstanding problem: if a user has already applied the workaround and has NFD (decomposed form) in feeds.rdf, the new Tb with the patch fails to find fz:feed for that feed folder. I believe this is not a problem, because "did workaround" means "knows this bug".
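The string part of the patch's expression can be tried outside Thunderbird. In this sketch normalizeFolderURI is a stand-alone stand-in that omits the rdf.GetResource() call, so it only shows how the key string is normalized, using URLs quoted earlier in this bug.

```javascript
// Stand-alone stand-in for the string handling in getNormalizedFolderURI
// (the RDF resource lookup from the patch is intentionally left out).
function normalizeFolderURI(folderURI) {
  return encodeURI(decodeURI(folderURI).normalize("NFC"));
}

// Decomposed URI as seen on Mac OS X, and the composed URI for the same name.
const fromDisk = "mailbox://nobody@Feeds-3/%E3%82%B9%E3%83%A9%E3%83%88%E3%82%99";
const fromTitle = "mailbox://nobody@Feeds-3/%E3%82%B9%E3%83%A9%E3%83%89";

// A URI that is already composed (and contains %20) passes through unchanged.
const blog = "mailbox://nobody@Feeds/Mozilla%20Japan%20%E3%83%96%E3%83%AD%E3%82%B0";

console.log(normalizeFolderURI(fromDisk) === fromTitle);
console.log(normalizeFolderURI(fromTitle) === fromTitle);
console.log(normalizeFolderURI(blog) === blog);
```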
(In reply to WADA from comment #45) > (In reply to alta88 from comment #44) > > normalizeFolderURI.patch > > > > The best solution may be to get the real osx folder uri, before storing the > > feeds.rdf key, if osx 'sometimes does this, sometimes that' to combinatorial > > forms. But without a mac, writing this patch is guessing. Probably using > > getChildNamed() 2x after folder creation with both forms of the name will > > find the real decomposed (or not) handle, etc etc. > > My idea in comment #35 is similar to it. > And it's the best solution. Trying to normalize keys now is risky, compatibility is not a huge concern but no reason to break it if unnecessary.
Attached patch getRealFolder.patch (obsolete) — Splinter Review
This is the most risk free solution, to requery for the real folder uri for new or renamed folders rather than rely on createLocalSubfolder(). There's even a hint in the idl for addSubFolder(), http://mxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolder.idl#180
Attachment #8683313 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8683313 [details] [diff] [review]
getRealFolder.patch

Review of attachment 8683313 [details] [diff] [review]:
-----------------------------------------------------------------

> + this.server.rootMsgFolder.QueryInterface(Ci.nsIMsgLocalMailFolder)
> +   .createLocalSubfolder(this.folderName);
> + try {
> +   // Get the new folder explicitly. Some filesystems will change the uri
> +   // out from under us (looking at osx HFS). If the newly created folder
> +   // isn't found, this will throw.
> +   this.folder = this.server.rootMsgFolder.getChildNamed(this.folderName);
> + }

Why not code like the following?
> this.folder = this.server.rootMsgFolder
>   .QueryInterface(Ci.nsIMsgLocalMailFolder)
>   .createLocalSubfolder(this.folderName);

http://mxr.mozilla.org/comm-central/source/mail/base/content/folderPane.js#1208
> 1208 let newFolder;
> 1209 try {
> 1210 if (parentFolder instanceof(Components.interfaces.nsIMsgLocalMailFolder))
> 1211 newFolder = parentFolder.createLocalSubfolder(newName);
> 1212 else
> 1213 newFolder = parentFolder.addSubfolder(newName);
If the following holds, I think this is a bug in createLocalSubfolder, which doesn't correctly set msgFolder.URI on Mac OS X.
- After msgFolder = createLocalSubfolder, msgFolder.URI is NFC (composed form).
- Upon folder open after restart or folder re-open, msgFolder.URI is NFD (decomposed form).
And "forcing folder re-discovery" is a good circumvention in the caller of createLocalSubfolder.
User Story: (updated)
Comment on attachment 8683313 [details] [diff] [review]
getRealFolder.patch

Review of attachment 8683313 [details] [diff] [review]:
-----------------------------------------------------------------

> + this.server.rootMsgFolder.QueryInterface(Ci.nsIMsgLocalMailFolder)
> +   .createLocalSubfolder(this.folderName);
> + try {
> +   // Get the new folder explicitly. Some filesystems will change the uri
> +   // out from under us (looking at osx HFS). If the newly created folder
> +   // isn't found, this will throw.
> +   this.folder = this.server.rootMsgFolder.getChildNamed(this.folderName);
> + }

Comment for addSubfolder in nsIMsgFolder.idl:
http://mxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolder.idl#175
  Unless you know exactly what you're doing, you should be using createSubfolder + getChildNamed or createLocalSubfolder.

As shown by .QueryInterface(Ci.nsIMsgLocalMailFolder).createLocalSubfolder(this.folderName), createLocalSubfolder is not defined by nsIMsgFolder.idl; it belongs to nsIMsgLocalMailFolder only. It is implemented at:
http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsLocalMailFolder.cpp#154
154 NS_IMETHODIMP nsMsgLocalMailFolder::CreateLocalSubfolder(const nsAString &aFolderName,
155 nsIMsgFolder **aChild)

I think the following, which is what the idl comment recommends, is better.
  this.server.rootMsgFolder.createSubfolder(this.folderName); // msgFolder is not returned by createSubfolder
  this.folder = this.server.rootMsgFolder.getChildNamed(NFC version of this.folderName);
  if (not found) try this.folder = this.server.rootMsgFolder.getChildNamed(NFD version of this.folderName);

I believe the "NFC format msgFolder.URI from msgFolder=CreateLocalSubfolder(this.folderName)" can be called a bug in CreateLocalSubfolder on Mac OS X. But there is a fault in the feed code too:
- It saves msgFolder.URI of msgFolder=CreateLocalSubfolder(this.folderName) in feeds.rdf,
- and tries to use it forever as the key of fz:feed in feeds.rdf, without verifying that the actual file name is really what msgFolder.URI returns, even on Mac OS X.
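The "try NFC, then NFD" lookup suggested above could look roughly like the sketch below. It is untested against a real Thunderbird build: findChildInEitherForm and makeFakeParent are hypothetical names, and the fake parent only imitates the one behaviour taken from the patch comment, namely that getChildNamed() throws when the folder isn't found.

```javascript
// Hypothetical helper: ask the parent for the child under both normalization
// forms and return whichever one exists, or null if neither does.
function findChildInEitherForm(parent, name) {
  for (const form of ["NFC", "NFD"]) {
    try {
      const child = parent.getChildNamed(name.normalize(form));
      if (child) {
        return child;
      }
    } catch (e) {
      // Not found under this form; fall through and try the next one.
    }
  }
  return null;
}

// Fake stand-in for a root folder, so the sketch runs outside Thunderbird.
// namesOnDisk plays the role of what the filesystem actually stored.
function makeFakeParent(namesOnDisk) {
  return {
    getChildNamed(name) {
      if (!namesOnDisk.includes(name)) {
        throw new Error("folder not found: " + name);
      }
      return { name, URI: "mailbox://nobody@Feeds/" + encodeURI(name) };
    },
  };
}

// The disk holds the decomposed name, the caller asks with the composed one.
const fakeRoot = makeFakeParent(["\u30d5\u309a"]);
const found = findChildInEitherForm(fakeRoot, "\u30d7");
console.log(found ? found.URI : "not found");
```

The point of returning the folder object rather than a boolean is that the caller can then store found.URI as the feeds.rdf key, i.e. the form the filesystem really uses.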
Attached patch getRealFolder.patch (obsolete) — Splinter Review
Handle additional edge cases.
Attachment #8683313 - Attachment is obsolete: true
Attachment #8683313 - Flags: review?(mkmelin+mozilla)
Attachment #8684700 - Flags: review?(mkmelin+mozilla)
FYI.

(A) Creation of a new local mail folder at Folder Pane.
> http://mxr.mozilla.org/comm-central/source/mail/base/content/mail3PaneWindowCommands.js#988
> 988 case "cmd_newFolder":
> 989 gFolderTreeController.newFolder();
> 990 break;
>
> http://mxr.mozilla.org/comm-central/source/mail/base/content/folderPane.js#2345
> 2345 newFolder: function ftc_newFolder(aParent) {
> 2367 if (aName)
> 2368 aFolder.createSubfolder(aName, msgWindow);
>
> createSubfolder in idl.
> http://mxr.mozilla.org/comm-central/source/mailnews/base/public/nsIMsgFolder.idl#163
> 173 void createSubfolder(in AString folderName, in nsIMsgWindow msgWindow);
>
> http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsLocalMailFolder.cpp#533
> 533 nsMsgLocalMailFolder::CreateSubfolder(const ...
> 536 nsresult rv = CreateSubfolderInternal(folderName, msgWindow, getter_AddRefs(newFolder));

(B) Creation of a new feed folder (identical to a local mail folder) upon feed subscription.
> http://mxr.mozilla.org/comm-central/source/mailnews/extensions/newsblog/content/Feed.js#485
> 485 createFolder: function()
> 486 {
> 487 if (this.folder)
> 488 return;
> 490 try {
> 491 this.folder = this.server.rootMsgFolder
> 492 .QueryInterface(Ci.nsIMsgLocalMailFolder)
> 493 .createLocalSubfolder(this.folderName);
> 494 }
>
> createLocalSubfolder in idl.
> http://mxr.mozilla.org/comm-central/source/mailnews/local/public/nsIMsgLocalMailFolder.idl#60
> 65 nsIMsgFolder createLocalSubfolder(in AString aFolderName);
>
> http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsLocalMailFolder.cpp#154
> 154 NS_IMETHODIMP nsMsgLocalMailFolder::CreateLocalSubfolder(const nsAString &aFolderName, ...
> 157 NS_ENSURE_ARG_POINTER(aChild);
> 158 nsresult rv = CreateSubfolderInternal(aFolderName, nullptr, aChild);

(C) The common CreateSubfolderInternal, and the code executed by CreateSubfolderInternal.
> http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsLocalMailFolder.cpp#547
> 547 nsMsgLocalMailFolder::CreateSubfolderInternal(const nsAString& folderName, ...
>
> 555 nsCOMPtr<nsIMsgPluggableStore> msgStore;
> 556 rv = GetMsgStore(getter_AddRefs(msgStore));
> 557 NS_ENSURE_SUCCESS(rv, rv);
> 558 rv = msgStore->CreateFolder(this, folderName, aNewFolder);
>
> http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsMsgBrkMBoxStore.cpp#69
> (Note: a different module is used for MaildirMsgStore)
> 69 NS_IMETHODIMP nsMsgBrkMBoxStore::CreateFolder(nsIMsgFolder *aParent, ...
> // creation of file for folder
> 80 nsresult rv = aParent->GetFilePath(getter_AddRefs(path));
> 84 rv = CreateDirectoryForFolder(path);
> 91 NS_MsgHashIfNecessary(safeFolderName);
> 93 path->Append(safeFolderName);
> 99 path->Create(nsIFile::NORMAL_FILE_TYPE, 0600);
>
> // creation of msgDatabase
> 103 rv = aParent->AddSubfolder(safeFolderName, getter_AddRefs(child));
>
> // open msgDatabase
> 115 rv = msgDBService->OpenFolderDB(child, true, getter_AddRefs(unusedDB));
> 117 rv = msgDBService->CreateNewDB(child, getter_AddRefs(unusedDB));

(D) Creation of the msgDatabase after creation of the file for the local message folder.
> http://mxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgDBFolder.cpp#3689
> 3689 NS_IMETHODIMP nsMsgDBFolder::AddSubfolder(const nsAString& name, ...
>
> In this function, the mailbox URI is generated from the requested folder name,
> and the special folder flag of a special folder is set (Trash, Drafts, Sent, Archive etc.)

(E) A similar problem to this bug on Mac OS X in message filters and search folders, and a possible solution for all of the feed problem, the message filter problem, and the search folder problem.
If a file name with NFC (composed form) != NFD (decomposed form) is used for the "copy/move target folder of a message filter" and/or the "search target folder of a Search Folder", a similar problem occurs after restart of Tb, because Tb fails to find the folder which is pointed to by the mailbox URI in msgFilterRules.dat or virtualFolders.dat.
- The message filter fails to find its copy/move target after restart.
- The search target is silently removed after restart, if it's not the last search target folder.
- The Search Folder is silently deleted after restart, if it's the last search target folder.
A bug for this is not opened yet.
If the mailbox URI is generated by nsMsgDBFolder::AddSubfolder from the actual file name in HFS+ on Mac OS X, I think this bug (and the message filter copy/move case and the search folder case too) will be resolved.
alta88, what do you think?
(In addition to comment #53)
The following is a comment in nsMsgDBFolder::AddSubfolder.
> 3708 // fix for #192780
> 3709 // if this is the root folder
> 3710 // make sure the the special folders
> 3711 // have the right uri.
> 3712 // on disk, host\INBOX should be a folder with the uri mailbox://user@host/Inbox"
> 3713 // as mailbox://user@host/Inbox != mailbox://user@host/INBOX
> 3714 nsCOMPtr<nsIMsgFolde

I think that both "Inbox != INBOX in all environments" and "NFC (composed form) folder name != NFD (decomposed form) file name in HFS+ on Mac OS X" are better handled at the same place in the same module.
Yes, it certainly would be theoretically better if nsIMsgFolder always contained the osx disk uri and name etc. But I don't know and have no way to test whether this really affects other situations like move/copy or filters or anything else; are there any suspicious bugs? Since for those cases the problem may only happen in an immediately newly created folder (without restart) it may be rarely encountered. Someone with a mac will have to take an interest in this issue. The feeds case, of using a folder uri as a permanent key, is more clear (and common: http://forums.mozillazine.org/viewtopic.php?f=29&t=2969123).
Henri, do you know if there is a c++ api equivalent to .normalize() in ES6 to convert to/from composed/decomposed combinatorial utf8 chars? https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
Flags: needinfo?(hsivonen)
FYI.
http://mxr.mozilla.org/comm-central/source/mozilla/intl/unicharutil/nsUnicodeNormalizer.cpp#332
332 mdn_normalize(bool do_composition, bool compat,
333 const nsAString& aSrcStr, nsAString& aToStr)
254 mdn__unicode_compose(uint32_t c1, uint32_t c2, uint32_t *compp)
compose() and decompose() are also defined.
FYI. Bug 227547 was an improvement on PPC Mac OS X. Bug 703161 looks like a successor of it, for tolerance of "NFD in HFS+ on Mac OS X".
FYI. Quick history of nsLocalFile*.cpp.
1. bug 227547 :
   (a) Convert attached file name from NFD to NFC in mail composition.
   (b) Introduced nsLocalFileOSX.cpp for Mac OS X.
2. bug 571193 : make Mac OS X share UNIX filesystem code.
   nsLocalFileOSX.h/nsLocalFileOSX.cpp were removed and consolidated into nsLocalFileUnix.cpp.
3. Then bug 703161 was opened.

Currently used nsLocalFile* .h and .cpp files:
  /mozilla/xpcom/io/nsLocalFileUnix.cpp
  /mozilla/xpcom/io/nsLocalFileWin.cpp
  /mozilla/xpcom/io/nsLocalFile.h
  /mozilla/xpcom/io/nsLocalFileWin.h
  /mozilla/xpcom/io/nsLocalFileCommon.cpp
  /mozilla/xpcom/io/nsLocalFileUnix.h

Is the current nsLocalFileUnix.cpp code the cause of the following?
  After creation of a local message folder : nsIMsgFolder.URI == mailbox URI with the NFC name, which is the requested file name
  After restart of Thunderbird : nsIMsgFolder.URI == mailbox URI with the NFD actual file name, which is returned from HFS+
Note: The full path written in panacea.dat (Folder Cache) is always "NFD which is returned from HFS+".
To Makoto Kato san, Masatoshi Kimura san. What is the cause of the "inconsistent nsIMsgFolder.URI" on Mac OS X? (NFC after local mail folder creation, but NFD after restart of Tb) Whose fault is it? nsLocalFileUnix.cpp? Its caller? nsMsgDBFolder::AddSubfolder? Its caller? Is the fault at local mail folder creation, or at restart of Tb? What is the proper solution on Mac OS X?
Flags: needinfo?(m_kato)
Flags: needinfo?(VYV03354)
User Story: (updated)
Flags: needinfo?(hsivonen)
What happens if we normalize the URI and the feed contains a CJK Compatibility Ideograph?
Flags: needinfo?(VYV03354)
(In reply to Masatoshi Kimura [:emk] from comment #62)
> What happens if we normalize the URI and the feed contains a CJK Compatibility Ideograph?

"Normalize the URI" where, in what module, when?

As written in the "User Story" of this bug, if dc:title (in NFC) of a feed has a char with NFD!=NFC, an NFC mailbox URL is written in "fz:destFolder RDF:resource=" in feeds.rdf (please see attachment 8681221 [details] macfeeds.rdf, which is attached to comment #4), and the feed URL is lost after restart of Tb. After the loss of the feed URL by restart of Tb, the feed is recovered by manually putting the NFD mailbox URL in "fz:destFolder RDF:resource=".

This is the same for message filters and search folders.
(1) Create a folder whose name has a char with NFD!=NFC, for example, "Ü".
(1-a) Before restart of Tb, set "Ü" as the copy/move target folder of a message filter.
      => NFC mailbox URI is written in msgFilterRules.dat
(1-b) Before restart of Tb, set "Ü" as the search target folder of a search folder.
      => NFC mailbox URI is written in virtualFolders.dat
(3) Restart Tb. The message filter fails to copy/move mail. The search folder is deleted because the "last search target" is not found.
(4) Create a filter/saved search after the restart.
(4-a) Set "Ü" as the copy/move target folder of a message filter.
      => NFD mailbox URI is written in msgFilterRules.dat
(4-b) Set "Ü" as the search target folder of a search folder.
      => NFD mailbox URI is written in virtualFolders.dat

In any case,
(a) After folder creation, before restart of Tb :
    The NFC mailbox URI is used by Tb and is written in feeds.rdf/msgFilterRules.dat/virtualFolders.dat.
    If the mailbox URI is manually replaced by the NFD mailbox URI in feeds.rdf/msgFilterRules.dat/virtualFolders.dat, the feed/filter/search folder perhaps fails to find the folder (the folder is perhaps looked up via folder URI) because nsIMsgFolder.URI of the msgFolder object is the NFC mailbox URI.
(b) After folder creation and restart of Tb :
    The NFD mailbox URI (obtained from the actual file name in HFS+) is used for the mail folder.
    So, the feed/filter/search folder fails to find a target mail folder which is represented by an NFC mailbox URI.
    If the feed/filter/search folder is created after "folder creation followed by restart of Tb", the NFD mailbox URI is written in feeds.rdf/msgFilterRules.dat/virtualFolders.dat.

Please see the following topics of the MozillaZine Japan forum.
http://forums.mozillazine.jp/viewtopic.php?f=3&t=15721 (for feed)
http://forums.mozillazine.jp/viewtopic.php?f=3&t=15766 (for message filter)
FYI.
- When nsIMsgFolder.URI is set to the NFC mailbox URI (after folder creation, before restart of Tb), nsIMsgFolder.prettiestName (folder name) is also the NFC name which was requested in file creation by Tb.
- When nsIMsgFolder.URI is set to the NFD mailbox URI (after folder creation and after restart of Tb), nsIMsgFolder.prettiestName (folder name) is also the NFD name which is obtained from HFS+.
Sorry for not being clear. My concern is that HFS+ filename composition/decomposition is not equivalent to NFC/NFD. The former will convert Compatibility ideographs to Unified ideographs, but the latter will not. So if we use NFC/NFD and the feed contains a Compatibility ideographs character, it will cause a similar problem of this bug.
(In reply to Masatoshi Kimura [:emk] from comment #65)
> My concern is that HFS+ filename composition/decomposition is not equivalent to NFC/NFD.
> The former will convert Compatibility ideographs to Unified ideographs, but the latter will not.
> So if we use NFC/NFD and the feed contains a Compatibility ideographs character, it will cause a similar problem of this bug.

What do you mean by your "Compatibility ideographs character"?
A "char in the unicode ranges for which Mac OS X/HFS+ uses NFC UTF-16 instead of NFD as the file name in the HFS+ file system"?
If so, and if such a char of those specific unicode ranges is used, this bug won't occur in current Thunderbird on Mac OS X.
- Upon file creation, a file name in NFC is requested, and NFC is stored in nsIMsgFolder.prettiestName/URI.
- Upon restart of Thunderbird, NFC is returned from HFS+, then NFC is stored in nsIMsgFolder.prettiestName/URI.
This bug occurs only when NFD!=NFC and the NFD utf-16 binary is stored as the file name in the HFS+ file system.

Or, are "プ", "ド", "ブ”、 "グ", "Ü"(capital U umlaut), etc. the "Compatibility ideographs character"?
When such a "Compatibility ideographs character" is used in a folder name/file name, is the "NFC file name in the HFS+ file system" wrongly converted by Thunderbird to an "NFD string" upon file open/folder open after restart of Tb?
Compatibility ideographs will be converted to Unified ideographs in ALL normalization form (both NFC/NFD). HFS+ does not convert those characters at all. So the problem will happen regardless of the normalization form.
(In reply to WADA from comment #66) > What do you mean by your "Compatibility ideographs character"? > A "char in unicode ranges for which Mac OS X/HFS+ uses NFC UTF-16 instead of > NFD as file name in HFS+ file system? Compatibility ideographs characters will be converted to corresponding Unified Ideographs in *all* normalization form (both NFC/NFD). It does not make sense to differentiate NFC/NFD here.
> On Win/Linux: > file name is composed form, so no problem This is wrong. Win/Linux do not care about the Unicode normalization. It happens to be composed form in many case. If the input string is decomposed, the decomposed form will be used as-is.
(In reply to Masatoshi Kimura [:emk] from comment #68) > Compatibility ideographs characters will be converted to corresponding > Unified Ideographs in *all* normalization form (both NFC/NFD). It does not > make sense to differentiate NFC/NFD here. This bug is for "プ", "ド", "ブ”、 "グ", "Ü"(capital U umlaut), etc. ("アップル", "スラド", "ブログ", A latin character + diacritical mark. char of NFD != NFC) CJK_Compatibility_Ideographs : https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs http://www.unicode.org/charts/PDF/UF900.pdf Even if problem due to the "Compatibility ideographs characters" actually exists in Mac OS X and it may produce similar or same phenomenon, why the "Compatibility ideographs characters" is relevant to this bug for "プ", "ド", "ブ”、 "グ", "Ü"(capital U umlaut), etc. on Mac OS X? If problem due to the "Compatibility ideographs characters" will be resolved, any problem on file name/folder name/folderURI on Mac OS X due to "NFD!=NFC && HFS+" will be entirely resolved?
(In reply to WADA from comment #70) > (In reply to Masatoshi Kimura [:emk] from comment #68) > > Compatibility ideographs characters will be converted to corresponding > > Unified Ideographs in *all* normalization form (both NFC/NFD). It does not > > make sense to differentiate NFC/NFD here. > > This bug is for "プ", "ド", "ブ”、 "グ", "Ü"(capital U umlaut), etc. > ("アップル", "スラド", "ブログ", A latin character + diacritical mark. char of NFD != > NFC) *Currently* this bug is about "プ", "ド", etc. *If* we use NFC/NFD to "fix" this bug, we will have a similar bug about CJK Compatibility Ideographs. We should use compatible composing/decomposing algorithms with HFS+.
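To make the concern above concrete, here is a small sketch using String.prototype.normalize(). It only shows the Unicode side of the argument, i.e. that a CJK Compatibility Ideograph is replaced by its unified counterpart under both NFC and NFD; it says nothing about what HFS+ itself stores, which is exactly the part that needs a mac to verify. The helper name bothForms is made up for the example.

```javascript
// Hypothetical helper: return a string in both canonical normalization forms.
function bothForms(s) {
  return { nfc: s.normalize("NFC"), nfd: s.normalize("NFD") };
}

// Render a string as a list of code points, to make the result visible.
function toCodePoints(s) {
  return Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

// U+FA19 is a CJK Compatibility Ideograph whose canonical mapping is U+795E.
const compat = "\uFA19";
const result = bothForms(compat);
console.log(toCodePoints(compat));      // the character as typed in a title
console.log(toCodePoints(result.nfc));  // after NFC
console.log(toCodePoints(result.nfd));  // after NFD
```

So a key built with normalize() never contains U+FA19, whichever form is chosen. If the filesystem keeps U+FA19 as-is, the normalized key and the on-disk name would disagree in the same way the composed and decomposed kana do today.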
(In reply to Masatoshi Kimura [:emk] from comment #71) > (In reply to WADA from comment #70) > > (In reply to Masatoshi Kimura [:emk] from comment #68) > > > Compatibility ideographs characters will be converted to corresponding > > > Unified Ideographs in *all* normalization form (both NFC/NFD). It does not > > > make sense to differentiate NFC/NFD here. > > > > This bug is for "プ", "ド", "ブ”、 "グ", "Ü"(capital U umlaut), etc. > > ("アップル", "スラド", "ブログ", A latin character + diacritical mark. char of NFD != > > NFC) > > *Currently* this bug is about "プ", "ド", etc. *If* we use NFC/NFD to "fix" > this bug, we will have a similar bug about CJK Compatibility Ideographs. We > should use compatible composing/decomposing algorithms with HFS+. emk, this theory is easy to test, if you have a mac. 1. Apply the patch, make sure to have a Tb feed account with a feed subscription; the folderpane favicon should be the feed's favicon, indicating the folder to feed url relationship is valid. 2. Rename the folder to something with a decomposed char; the favicon should still be there. 3. Rename the folder to something using a compatibility char; ibid. 4. Rename back to using only composed chars; ibid. 5. Repeat 2-4 with a restart each time. It is unclear what osx does with compatibility char names on disk; I think osx uses only canonical decomposed forms. Do you know for sure?
Comment on attachment 8684700 [details] [diff] [review] getRealFolder.patch Review of attachment 8684700 [details] [diff] [review]: ----------------------------------------------------------------- I'm not sure I'm a good reviewer for this. Basically looks ok but I really don't know much about NFC/NFD or what mac does those on the filesystem... I guess I can rubberstamp it if you get someone who does to f+ the approach ::: mailnews/extensions/newsblog/content/FeedUtils.jsm @@ +989,5 @@ > + * name form. > + * > + * @param nsIMsgFolder aFolder - the folder > + * @return nsIMsgFolder - the folder on disk, or original > + */ nit: the whole block should be indented by 2 spaces.
Attachment #8684700 - Flags: review?(mkmelin+mozilla)
aleth, can you help with this mac issue? testing requires the steps in comment 72, using the 2 chars below as folder names (select/copying them retains the form). composed: プ = "\u30d7" decomposed: プ = "\u30d5\u309A" = フ + ゚
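Since copy/paste can silently change the form, a tester may want to confirm which form a folder name is really in before renaming. A minimal sketch for that, runnable in any JS console; describeForm is a hypothetical helper and not something in the patch.

```javascript
// Hypothetical helper: report the code points of a name and whether it is
// already in NFC and/or NFD.
function describeForm(name) {
  const codePoints = Array.from(name, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
  return {
    codePoints,
    isNFC: name === name.normalize("NFC"),
    isNFD: name === name.normalize("NFD"),
  };
}

// The two test strings from this comment, written as escapes so that the
// form cannot be altered by an editor or the clipboard.
const composedName = "\u30d7";         // プ as a single code point
const decomposedName = "\u30d5\u309a"; // フ followed by the combining mark

console.log(describeForm(composedName));
console.log(describeForm(decomposedName));
```

A name made only of characters whose composed and decomposed forms are identical (plain ASCII, for instance) reports true for both flags, which matches the "legitimate workaround" in the User Story.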
Attachment #8693232 - Flags: feedback?(aleth)
Attachment #8684700 - Attachment is obsolete: true
With this patch, the favicon disappears after a restart with the folder name = "プ" (composed char). Decomposed chars seem fine. I also get JavaScript strict warning: resource:///modules/activity/moveCopy.js, line 285: ReferenceError: reference to undefined property this.lastFolder.URI which might be unrelated, but would be good to fix along the way.
Attachment #8693232 - Flags: feedback?(aleth) → feedback-
jorgk might know something about this after all his CJK bug fixes...
Flags: needinfo?(mozilla)
Sorry, no, I was venturing into the M-C serialiser and mail message compose and send territory. I learned a bit about encodings, but that won't help here.
Flags: needinfo?(mozilla)
(In reply to aleth [:aleth] from comment #75) > With this patch, the favicon disappears after a restart with the folder name > = "プ" (composed char). Decomposed chars seem fine. > > I also get > JavaScript strict warning: resource:///modules/activity/moveCopy.js, line > 285: ReferenceError: reference to undefined property this.lastFolder.URI > which might be unrelated, but would be good to fix along the way. That error has been there forever on movecopy. Well, a mac dev will have to address this. The best solution, but with more widespread effect, is to ensure the post osx decompose uri is returned in the folder object. It may not matter for other (than feed) operations or there are subtle strange random errors on osx only for folder operations in non ascii locales. For feeds, the somewhat blind attempt here was to get the right uri at key points to maintain the uri to url relationship. Thanks for the testing, aleth.
Assignee: alta88 → nobody
I'm an OS X user and Japanese, so I tried the steps in comment 72, renaming the folder to NFC, NFD, CJK Compatibility Ideographs and CJK Unified Ideographs. Only in the NFC case (both the folder pane's folder name and the fz:destFolder RDF:resource written in feeds.rdf) does the favicon disappear after restarting Tb. NFD, CJK Compatibility Ideographs and CJK Unified Ideographs are all fine. The CJK characters that I tried, copied from https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs : "神"(U+FA19) and "﨟"(U+FA1F)
(In reply to meeyar from comment #79)
I forgot to write:
> I'm OS X and Japanese user, so I tried Comment72, rename folder NFC, NFD,
> CJK Compatibility Ideographs and CJK unified ideographs.

I tried this after applying attachment 8693232 [details] [diff] [review] to Tb.
Flags: needinfo?(m_kato)
Severity: normal → S3