Closed Bug 533640 Opened 15 years ago Closed 14 years ago

Thunderbird 2.0 to 3.0 upgrade deleted contents of local trash subfolders, colons (:), slash(/) in folder names on Mac OS X

Categories

(Thunderbird :: Folder and Message Lists, defect)

x86
macOS
defect
Not set
critical

Tracking

(blocking-thunderbird3.0 .2+, thunderbird3.0 .2-fixed)

RESOLVED FIXED
Thunderbird 3.1b1
Tracking Status
blocking-thunderbird3.0 --- .2+
thunderbird3.0 --- .2-fixed

People

(Reporter: thaisriv, Assigned: Bienvenu)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: dataloss, regression, Whiteboard: [223 Migration][blocks a major upgrade][needs branch approval])

Attachments

(6 files, 5 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0

Upgrading from Thunderbird 2.0 to 3.0 deleted all the emails in local folders contained within the Trash folder of Local Folders. The folders themselves remained visible inside the Trash folder of Local Folders but lacked any content.

Inside ~/Library/Thunderbird/Profiles/v8o8mg25.default/Mail/Local\ Folders there are .msf files corresponding to the folders, but the extensionless content files were missing. For example, "Stanford Backup/ Received.msf" is there, but not "Stanford Backup/ Received".

There are files in Trash.sbd inside Local Folders (on the disk) called "Stanford Backup/ Received.msf" and "Stanford Backup/ Received" (the latter does contain all of the missing messages).


Reproducible: Didn't try
Version: unspecified → 3.0
I believe the cause of this bug may be the fact that the folders in question contained a colon in their names (e.g., "Stanford Backup: Received") in Thunderbird 2.0. Even though these names are displayed properly inside the Trash folder of Local Folders in Thunderbird 3.0 (e.g., "Stanford Backup: Received"), in Finder the folders have been renamed (e.g., "Stanford Backup/ Received").
I can confirm this behavior and thaisriv@stanford.edu's analysis.

To reproduce:

1) Within Thunderbird 2.0, create two subfolders of "Local Folders" and two of "Local Folders/Trash", with one of each pair containing a colon and one not.  Put a message in each of the newly created folders.  In my case, I created messages with distinct subjects (Test1-4) and placed them in folders, resulting in the following hierarchy:

Local Folders
|-Trash
|	|-TrashTest
|	|	|- (Test3)
|	|
|	|-TrashTest: With a colon
|		|- (Test4)
|
|-Test
|	|- (Test1)
|
|-Test: With a colon
	|- (Test2)

2) Upgrade to Thunderbird 3.0 in the usual manner.

3) Observe results, noting that folders/subfolders containing colons are now empty.  In my test case:

Local Folders
|-Trash
|	|-TrashTest
|	|	|- (Test3)
|	|
|	|-TrashTest: With a colon
|
|-Test
|	|- (Test1)
|
|-Test: With a colon


Not sure whether Installer is the right component for this, but severity is correct in light of apparent data loss (and not just from Trash), even though the data evidently persists on disk.  I will be attaching my test notes (including directory contents listings) momentarily.
Attached file Test notes
Added summary of before/after test scenario, including overview of file structure on disk for the relevant directory.
Since the nature of the problem was clarified, a better search turned up Bug 275770 as being possibly related.

However, it (and related bugs) seem to be concerned only with the inconvenience of storing hashed folder names, not data loss on upgrade.
datalossy.  sticking in folders for visibility. confirming based on multiple reports. feel free to look for a duplicate
Status: UNCONFIRMED → NEW
Component: Installer → Folder and Message Lists
Depends on: 275770
Ever confirmed: true
Keywords: dataloss, regression
QA Contact: installer → folders-message-lists
Summary: Thunderbird 2.0 to 3.0 upgrade deleted contents of local trash subfolders → Thunderbird 2.0 to 3.0 upgrade deleted contents of local trash subfolders, colons (:) in folder names
Whiteboard: [223 Migration]
Conrad (a bit early in the morning for me to be thinking much) are you saying the problem exists in 2.0?
See also bug #360961, though it does not seem to cause data loss.
Wayne, messages stored in colon-containing folders in either 2.0 or 3.0 persist without loss in their respective versions.

Data loss results during the 2.0 -> 3.0 upgrade process; all (local) folders containing colons are emptied.  Messages apparently persist on disk, but disappear from within Thunderbird.

(As an aside - and I should probably open another bug on it - I just discovered that I can't delete a local folder containing a colon in 3.0 either.  Delete key or drag to Trash has no effect at all.)

Calling it a night for now...
To recover data, it is necessary to rename the file on disk so that Thunderbird will recognize it. Files with colon-containing names are renamed with / symbols. For example, "Stanford Backup: Received" becomes "Stanford Backup/ Received". It's not until the file is renamed "Stanford Backup Received" that Thunderbird will recognize it and properly import the data.
Notably, this bug also affects RSS feeds, which I discovered the hard way (thankfully I had a backup!).  My profile had at least 50 RSS feeds with years worth of stories in them that I had flagged and tagged, and after the upgrade a whole bunch of the feeds were emptied!  Further experimentation with a test profile helped determine that all of the feeds that suffered data loss had colons in their title (and therefore the folder name).  Reading through this bug and the attachment, I see that the symptoms are exactly the same, including the fact that the missing messages still seem to be on disk, just no longer accessible within Thunderbird.  However, after launching Thunderbird 3 a second time, I also found that all of the folders with colons in their names were seemingly duplicated (two listings for each folder, though they both seem to point to the same place since changes in one affect the duplicate).

This bug is very serious, and along with a few other annoyances (such as bug 520034) it has kept me back on Thunderbird 2.  Like the original reporter, I'm seeing this bug on Mac OS X 10.6 with Thunderbird 3.0.
blocking-thunderbird3.0: --- → ?
Keywords: qawanted
(In reply to comment #2)
> the following hierarchy:
> Local Folders
> |-Trash
> |    |-TrashTest
> |    |-TrashTest: With a colon
> |-Test
> |-Test: With a colon
>
> -rw-------@ 1 shultzc  staff  1274 Dec  9 03:11 Test: With a colon
> -rw-r--r--@ 1 shultzc  staff  2295 Dec  9 03:14 Test: With a colon.msf
> ./Trash.sbd:
> -rw-------@ 1 shultzc  staff  1276 Dec  9 03:11 TrashTest: With a colon
> -rw-r--r--@ 1 shultzc  staff  2331 Dec  9 03:11 TrashTest: With a colon.msf
> (It looks like ":" is a legal file name character for the Mac OS X file system.)
> (Does the Mac OS X Finder permit ":" in file names?)

The original problem in this bug (comment #0 by the bug opener) is the following phenomenon.
  (a) Under the "Local Folders" directory:
      - The file "Stanford Backup/ Received.msf" is there,
      - but not the file "Stanford Backup/ Received".
  (b) Under the "Local Folders" directory:
      - The "Trash.sbd" directory
        Under the "Trash.sbd" directory:
        - The file "Stanford Backup/ Received.msf" is there.
        - The file "Stanford Backup/ Received" is there. It contains the mail data.
(a) looks like this:
    The internal folder "/Stanford Backup/ Received" is treated as the root-level
    folder "Stanford Backup/ Received", so "...msf" was created under
    the "Local Folders" directory.
(It looks like "/" is a legal file name character for the Finder of Mac OS X.)
(Does the file system of Mac OS X permit "/" in file names?)
If so, the problem may be in the escaping/unescaping of "/" in the URL-like internal path for mail folders (a mailbox: URI in this case).

Your case looks like one of the following.
  - An issue in Tb3 related to escaping/unescaping of ":" in the internal mailbox:
    URL for a local mail folder.
  - ":" in the file name is not escaped as required.
     (Tb3 on MS Win generated Test8a6020e5 / Test8a6020e5.msf for Test:)
     See Bug 303729 for a trailing-space related issue on MS Win.
     See Bug 379101 for its bad effect on Mac OS X when upgrading to Tb2.

As "/" and ":" have different characteristics, the "/" case and the ":" case should be carefully distinguished:
 "/" : path delimiter in the internal mailbox: URL => escaping/unescaping is done
       Legal character for the file system of Mac OS X?
 ":" : delimiter of the protocol, so not usable in a URL
       => escaping/unescaping is probably done
       Legal character for the Finder of Mac OS X?
       See the following code for characters that are legal but treated specially in file names.
> http://mxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgUtils.cpp#96
> http://mxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgUtils.cpp#297

To Conrad Shultz:

I think that all local mail folders with ":" are affected.
I also think that all local mail folders with "/" are affected.
Does the problem occur only when upgrading to Tb3?
When a folder named "Test:" or "ABC / DEF" is created by Tb3 on Mac OS X, what file name does Tb3 use?
Adding "/" in bug summary for original report, for ease of search.
Summary: Thunderbird 2.0 to 3.0 upgrade deleted contents of local trash subfolders, colons (:) in folder names → Thunderbird 2.0 to 3.0 upgrade deleted contents of local trash subfolders, colons (:), slash(/) in folder names on Mac OS X
FYI.
> http://www.xvsxp.com/files/forbidden.php
> Finder of OS X or HFS+ prohibits ":" in file names.
> Finder of OS X prohibits a leading dot (".") in file names. (Tb2 and later hash it.)
> Script Editor doesn't permit saving to a file with a forward slash ("/") in its name.

Finder uses the "Display Name" instead of the "file name in file system" in its UI.
But applications usually use the "file name in file system".
> https://developer.apple.com/mac/library/documentation/MacOSX/Conceptual/BPFileSystem/Articles/DisplayNames.html#//apple_ref/doc/uid/20002298-CJBHIHFF
> You cannot use display names to manipulate actual files and directories in
> the file system. You use display names only as read-only strings in your
> application’s user interface.
>(snip)
> You would typically not use display names in an editable text field,
> especially if the user could modify the text and save the changes.
>(snip) 
> Display names should not be considered persistent, that is, assume they can
> change from one call to the next. You should never write the display name of
> a file to your application preferences or store that name in your internal
> data structures.
> If you need to refer to a file, store a copy of the actual file name instead.
> 
> Mac OS X uses display names in the Finder and in its Open and Save dialogs.
>(snip)
> However, if you are writing a command-line application, you should not use
> display names. Mac OS X does not support display names in the Darwin and
> Classic environments.
> Applications in those environments must operate on the actual file-system names.

I think special characters in folder names are better hashed in the file name on every OS, in order to avoid:
  - Confusion in Tb itself, because such characters are prohibited in URLs (escaping is needed).
  - Confusion for Mac OS users, because of the difference between the "Display Name"
    that Finder uses in its UI and the "file name in file system".
Mac OS X may replace "/" in the "Display Name" with "_" in the file system.
The following change would force hashing of ":" and "/" on all OSes.
> http://mxr.mozilla.org/comm-central/source/mailnews/base/util/nsMsgUtils.cpp#96
>    96 #define ILLEGAL_FOLDER_CHARS ";#"
> => 96 #define ILLEGAL_FOLDER_CHARS ";#:/"
>    // #:/ Special characters in mailbox: URL
>    // ;   I don't know why this is hashed. What kind of problem occurred?
Note:
  It will produce a compatibility issue between Tb releases/builds on Mac (and Linux?).
  So problems may occur when upgrading Tb on Mac OS X (and on Linux because of ":"?).
  ("/" is already an illegal file name character on Win/Linux.)
  (":" is already an illegal file name character on Win.)
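
For readers following along, here is a rough sketch of what "hash if necessary" means with the proposed character set. It is not the nsMsgUtils.cpp code: HashIfNecessary, SketchHash and kIllegalFolderChars are made-up names, and the FNV-1a hash is only a stand-in, so the hex digits will not match what Thunderbird writes to disk. It only shows the shape seen above (Test: turning into Test plus 8 hex digits).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Characters that force a folder name to be hashed in this sketch. The set
// mirrors the ";#:/" proposal above; it is not copied from nsMsgUtils.cpp.
static const char kIllegalFolderChars[] = ";#:/";

// Stand-in 32-bit hash (FNV-1a). Thunderbird uses its own string hash, so the
// hex digits produced here will not match real on-disk names.
static std::uint32_t SketchHash(const std::string& s) {
  std::uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Hypothetical helper: returns the name unchanged when it is safe, otherwise
// keeps the text before the first illegal character and appends 8 hex digits.
std::string HashIfNecessary(const std::string& name) {
  const std::size_t pos = name.find_first_of(kIllegalFolderChars);
  if (pos == std::string::npos) return name;
  char hex[9];
  std::snprintf(hex, sizeof hex, "%08x",
                static_cast<unsigned>(SketchHash(name)));
  return name.substr(0, pos) + hex;
}
```

With this, "TrashTest" stays as it is, while "Test: With a colon" becomes "Test" followed by 8 hex digits, so the on-disk name no longer contains any character that is special in a mailbox: URL.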
Keywords: qawanted
This sounds like something we need to look at and see if there is a fix for 3.0.x that we can do.
Assignee: nobody → bienvenu
blocking-thunderbird3.0: ? → needed
Flags: blocking-thunderbird3.1+
Whiteboard: [223 Migration] → [223 Migration][blocks a major upgrade]
WADA is right in his analysis, from what I can tell. The good news is that the messages are still there; the bad news is that we've created empty folders with the hashed name, and that's the one we show the user. Unwinding this is going to be fun.
Status: NEW → ASSIGNED
I think the change that happened was that we started using the illegal chars as defined by nsCRT.h instead of nsMsgUtils.h, or vice versa (the definitions are duplicated, and it somewhat depends on include order).
After digging around with the debugger, I think the bug occurred when we got rid of nsFileSpec - the mac nsFileSpec code replaced ':' with '/' internally, so the client code never knew about it. ':' is an illegal character in the mac file system.  nsILocalFile does not do this. So 3.0 sees a local file with a display name of "a:b" but on disk it's named "a/b".

If I change nsMsgUtil's NS_MsgHashIfNecessary to not hash folder names with ':' in them, we can see the old folders, but we can't delete the duplicate, because we try to create a sub-folder of the trash called "a:b", and nsLocalFileMac fails.

Perhaps I could change nsMsgHashIfNecessary to swap the ':' with '/' instead of doing the full-fledged hashing. This would probably fix the duplication that happens when you upgrade from 2 to 3.
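
A minimal sketch of the swap idea, for illustration only. SwapColonAndSlash is a hypothetical helper, not something in the tree, and it assumes the mapping is a plain one-for-one exchange of the two characters within a single leaf name, which is how the old Mac nsFileSpec behaviour is described above.

```cpp
#include <string>

// Hypothetical helper: exchanges ':' and '/' in a single leaf name, the kind
// of mapping the old Mac nsFileSpec code is described as doing internally.
std::string SwapColonAndSlash(std::string leaf) {
  for (char& c : leaf) {
    if (c == ':') {
      c = '/';
    } else if (c == '/') {
      c = ':';
    }
  }
  return leaf;
}
```

Because the swap is its own inverse, applying it twice gives back the original name, which is what would let one layer map the name in both directions without the client code noticing.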
blocking-thunderbird3.0: needed → .2+
After some more very unsatisfying debugging with Xcode, and some unsuccessful attempts at fixing this, I think the way out probably involves distinguishing between folder discovery of an existing profile and adding new folders. Both go through AddSubfolder, but in the former case, we don't want to be doing any name munging, because we already know what the folder is and should be called on disk.

Or, we might try to repair the folder and .msf file name during folder discovery. If we see a folder name with '/' in the leaf (which is what 2.0 generates), we could rename it and the .msf file to something "safe", e.g., escaping the '/'. The folder should still be displayed with the right name to the user since that's stored in the .msf file. This would be a safer fix, in that it would be limited to the Mac, and folder names with : in them.
Repairing the folder names didn't quite work out as I hoped - the escaped name was still visible to the user. I also noticed that when I have a file with the name a/b, displayed as such in Mac Finder, when I see the nsILocalFile leaf name, it has been converted to a:b. Which makes me wonder if not escaping ':' might still be the way out...
this was an attempt to add existing local folders differently. It also failed, but failed later than the previous approach. Folder discovery builds up the right data structures in memory, but when we try to load a message, nsLocalURI2Path calls NS_MsgCreatePathStringFromFolderURI, which tries to hash the leaf name in the uri, and then resulting path doesn't match the folder path on disk, which is unhashed.

I'm a little surprised that NS_MsgCreatePathStringFromFolderURI is calling the hash function, since folder uri's point at the already existing hashed folder name, e.g., a592822fc instead of a:b. But it's rather scary to think about changing that at this point.
this patch actually works, in the sense that 2.0 folders with ':' in the name can be read by 3.0, and deleted (moved to the trash). And newly created folders with ':' in the name can be created, have their names hashed, etc. It may turn out that this patch can be simplified quite a bit, though the scary part of making NS_MsgCreatePathStringFromFolderURI not escape the pieces is still required. And it may turn out that is called for IMAP Uri's as well.
Attached patch possible fix (obsolete) — Splinter Review
This actually fixes the problem very simply. I tried creating a few imap folders with ':' in the name, and didn't have an issue, because the ':' was escaped in the uri by the time we got here.

From my testing, it appears that in 2.0, the uri's for hashed folder names actually refer to the unhashed name, but in 3.0 (even w/o this patch), the uri points to the hashed folder name. 

This patch could be made safer by only not hashing the piece name if it's a local folder on the mac, which would certainly limit the possible regressions. I'll try to whip up a patch that does that, and try to write a unit test that tests upgrade of a 2.0 profile  with folders with ':' or '/' in the name.
Attached patch more targetted fix (obsolete) — Splinter Review
this makes the change only affect local folders on the mac, which reduces the chances of regressions.
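
To make the idea concrete, here is a rough, self-contained sketch of the approach, not the patch itself. PathFromFolderUri and SketchHashIfNecessary are hypothetical names, the hash is a stand-in FNV-1a rather than Thunderbird's, and the onMac flag is an assumption made so the sketch can be exercised anywhere (the real code would presumably decide that at compile time). The scheme value "none" for local folders is taken from the patch discussed in this bug.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for the real hashing routine: replaces a leaf containing ':' with
// its prefix plus 8 hex digits of an FNV-1a hash (not Thunderbird's hash).
static std::string SketchHashIfNecessary(const std::string& leaf) {
  const std::size_t pos = leaf.find(':');
  if (pos == std::string::npos) return leaf;
  std::uint32_t h = 2166136261u;
  for (unsigned char c : leaf) {
    h ^= c;
    h *= 16777619u;
  }
  char hex[9];
  std::snprintf(hex, sizeof hex, "%08x", static_cast<unsigned>(h));
  return leaf.substr(0, pos) + hex;
}

// Hypothetical analogue of NS_MsgCreatePathStringFromFolderURI: splits the
// URI path on '/', joins the pieces with ".sbd/", and skips hashing when the
// folder is local (scheme "none") and the caller says it runs on a Mac.
std::string PathFromFolderUri(const std::string& uriPath,
                              const std::string& scheme, bool onMac) {
  const bool keepNames = onMac && scheme == "none";
  std::string path;
  std::size_t start = 0;
  while (start <= uriPath.size()) {
    std::size_t end = uriPath.find('/', start);
    if (end == std::string::npos) end = uriPath.size();
    if (end > start) {  // ignore empty pieces such as a leading '/'
      const std::string piece = uriPath.substr(start, end - start);
      if (!path.empty()) path += ".sbd/";
      path += keepNames ? piece : SketchHashIfNecessary(piece);
    }
    start = end + 1;
  }
  return path;
}
```

In this sketch a local Mac folder "Trash/TrashTest: With a colon" maps to "Trash.sbd/TrashTest: With a colon", matching the unhashed 2.0 file on disk, while any other scheme or platform still gets the hashed leaf.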
I've requested a 3.02 pre + this patch try server build which should show up in a couple hours - http://tinderbox.mozilla.org/showbuilds.cgi?tree=ThunderbirdTry If anyone on the cc list wants to see if this fixes the problem for them, that would be helpful.
Attached patch fix with unit test (obsolete) — Splinter Review
this unit test succeeds with the fix applied, and fails without. I need to try this test on Windows to make sure it works. I'm a little worried about line endings and file sizes on Windows.
Attachment #425049 - Attachment is obsolete: true
Attachment #425076 - Attachment is obsolete: true
Attachment #425124 - Flags: superreview?(bugzilla)
Attachment #425124 - Flags: review?
Whiteboard: [223 Migration][blocks a major upgrade] → [223 Migration][blocks a major upgrade][has patch, needs review,sr]
OK, that test fails on Windows, probably because ':' is an illegal char for a file name on Windows. So this test probably should only run on the mac, or at least, that part of the test, for when we add more tests to that file. Standard8, which would you prefer?
(In reply to comment #28)
> OK, that test fails on Windows, probably because ':' is an illegal char for a
> file name on Windows. So this test probably should only run on the mac, or at
> least, that part of the test, for when we add more tests to that file.
> Standard8, which would you prefer?

I think if it's a test that can be extended to other cases, then making the specific part mac-only would be the best way to go.

Just looking at the fix, this seems fairly safe - ':' is the only illegal char on Mac.

How will this affect 3.0 and 3.0.1 users who are potentially using the '/' version of the file (if I read this bug right)?
(In reply to comment #30)
 
> I think if its a test that can be extended to other cases, then making the
> specific part mac-only would be the best way to go.

I think the whole bug is mac-specific. I think the illegal characters only changed on the Mac, and the change from nsIFileSpec to nsIFile only affected the mac, because only the nsFileSpec code did the magical mapping (I'm not actually sure if that was just working around the strange things the mac file system did).

> 
> Just looking at the fix, this seems fairly safe - ':' is the only illegal char
> on Mac.
> 
> How will this affect 3.0 and 3.0.1 users how potentially are using the '/'
> version of the file (if I read this bug right)?

On the Mac, if you name a folder with a ':' in it (in 2.0), it creates a file with ':' in the name, but the folder is displayed in the Finder with '/' in the name instead of ':'. In the code, we see it with ':' in the name. In 2.0, if you create a folder with '/' in the name, we hash that immediately, so 3.0 handles it fine.
I need to try this on Windows, but it should be OK.
Attachment #425124 - Attachment is obsolete: true
Attachment #425822 - Flags: superreview?(bugzilla)
Attachment #425822 - Flags: review?
Attachment #425124 - Flags: superreview?(bugzilla)
Attachment #425124 - Flags: review?
Attachment #425822 - Flags: review? → review?(bugzilla)
Attachment #426072 - Flags: superreview?(bugzilla)
Attachment #426072 - Flags: review?
Comment on attachment 426072 [details] [diff] [review]
fix that ignores timestamp issues

Ok, this looks fine. Worst case a user who's created the folder in 2.x, used it on 3.0 or 3.0.1 and then upgrades will get a "double" folder, but can safely delete one having sorted the mail out.

>+      NS_MsgCreatePathStringFromFolderURI(urlPath.get(), newPath, scheme,
>+                                         isNewsFolder);

nit: isNewsFolder should be one more space across.

>diff --git a/mailnews/local/src/nsLocalUtils.cpp b/mailnews/local/src/nsLocalUtils.cpp
...
>-      NS_MsgCreatePathStringFromFolderURI(unescapedStr.get(), newPath);
>+      NS_MsgCreatePathStringFromFolderURI(unescapedStr.get(), newPath, nsDependentCString("none"));
>     } else
>-      NS_MsgCreatePathStringFromFolderURI(curPos, newPath);
>+      NS_MsgCreatePathStringFromFolderURI(curPos, newPath, nsDependentCString("none"));

These should be NS_LITERAL_CSTRING("none")

r+sr=Standard8 with those fixed.
Attachment #426072 - Flags: superreview?(bugzilla)
Attachment #426072 - Flags: superreview+
Attachment #426072 - Flags: review?
Attachment #426072 - Flags: review+
Attachment #425822 - Attachment is obsolete: true
Attachment #425822 - Flags: superreview?(bugzilla)
Attachment #425822 - Flags: review?(bugzilla)
this is what I'll land
Attachment #426072 - Attachment is obsolete: true
Attachment #426094 - Flags: superreview+
Attachment #426094 - Flags: review+
fixed on trunk
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [223 Migration][blocks a major upgrade][has patch, needs review,sr] → [223 Migration][blocks a major upgrade][needs branch approval]
Target Milestone: --- → Thunderbird 3.1b1
I changed the test file names to bugmail-1 and copied them to bugmail/1 and bugmail/1.msf, which I think TB will see as bugmail:1 and bugmail:1.msf (though I'm not sure about that). I could not copy them directly to bugmail:1 because that failed.
Attachment #426113 - Flags: review?
I'm going to land this to stop the mac tinderboxes from going yellow...
Attachment #426113 - Flags: review? → review+
Attachment #426094 - Flags: approval-thunderbird3.0.2+
Comment on attachment 426094 [details] [diff] [review]
fix with nits addressed

I'm assuming we still want to land this in 3.0.2, therefore a=Standard8 for 3.0.2
fixed for 3.02
needs to be reopened...

only looking in local folders ( comment 25 ) doesn't work. i have a pop account, and its folders are also on the local disk.

upgrading my backed-up 2.0 profile directly to 3.0.3pre, folders with colons in their names in the pop account are still duplicated, renamed, and emptied.

would this also happen with synchronized imap accounts?
I'm going to clone a new bug for pop3 accounts and mark it blocking 3.03
Blocks: 547564
Problem still exists on Windows XP with Thunderbird 3.0.3 upgraded from 2.0.x, but strangely enough has only affected a single folder in local folders with a slash in its name.  Other folders with a slash in their names have not been affected. Nothing seems to be unique about this folder name either.  It's of the form "foo / bar", with spaces surrounding the slash, just like another folder that worked properly.

Digging around in the profile reveals that an mbox file is created whose name is the text portion following the "/" ("bar") under the folder whose name is the text portion preceding the "/" ("foo").  With the problem folder, this mbox file contains emails.  With the folders that do not exhibit this problem, this "bar" file exists but is empty.

How do I correct this?  Can I merely concatenate "foo" and "bar" into "foo" and resolve my problem?  I'll try this and report back; however, I have no idea how this ended up in this sorry state to begin with.
You can rename the mbox file name on disk to something without a slash, as long as that's a unique name. The .msf file will get regenerated automatically. I don't know why this '/' folder name would be different, but the mac file system does something weird with '/'s...
This is not on a Mac.  It's on a Windows XP box.  This seemed to be a Mac thread to begin with, but someone above interjected the Windows case, so I thought it appropriate to add to this bug report.  If I'm wrong, I'll gladly start a new bug thread.

To clarify the technical issue, what I have now is two mbox files, one named "foo" and a second named "bar" in a subfolder of "foo" called "bar".  Since the upgrade, any emails moved into the "foo/bar" local folder end up residing in the "foo" mbox file.  The older ones, those not visible, are in the "bar" mbox file in the "bar" subfolder.

Funny thing is that a global search will find emails in the "bar" mbox file.  But they don't show in the local folder "foo/bar" by directly accessing the folder.

I believe to remedy my current problem that I should concatenate the "foo" and "bar" mbox files as "foo", zero out "bar", delete the associated .msf files and reindex.  Am I nuts?  I'll try this tomorrow.
(In reply to comment #43)

Tom Wood, you are looking at one of the following bugs rather than this bug, which is Mac OS X only, aren't you? Please note that "/" in a directory name or file name is impossible on MS Win, so this bug can't occur on MS Win.
> If saved search folder,
> bug 286523 Unable to create saved search folder(virtual folder) properly when "/"(slash) in name
>   (and after restart, problems such as garbages of folders, loss of mail folder etc. occur)
> If real mail folder(search folder too), 
> bug 436032 Trailing slash("/") or preceding slash("/") in subfolder name
>   causes mail data loss upon folder rename of parent folder,
>   and when subfolder name of single "/" at mid(e.g. abc/xyz),
>   rename of parent folder creates garbage of abc.sbd directory

If you had a mail folder file whose name can cause incompatibility on MS Win (as in the following bug), a rename-like action can occur when Tb is upgraded. If such an action happens on a parent folder, the problem of bug 436032 occurs.
> bug 379101 Mail data and sub-folders are lost,
> if folder has ending period(dot) or trailing space in its name,
> when upgrade to Tb 2.0 from Tb 1.5 or former(Linux/Mac)
This bug has not been fixed for RSS feeds (see comment 10).  I think that the fix needs to be extended to the News & Blogs account type like it was for POP3 accounts.  I've created another clone of this bug (bug 553942) to address the issue.
Blocks: 553942