compact of local folders > 4GB fails

VERIFIED FIXED in Thunderbird 3.3a2

Status

MailNews Core
Backend
--
major
VERIFIED FIXED
7 years ago
2 years ago

People

(Reporter: Bienvenu, Assigned: Bienvenu)

Tracking

unspecified
Thunderbird 3.3a2
x86
Windows 7
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(blocking-thunderbird3.1 .7+, thunderbird3.1 .7-fixed)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Assignee)

Description

7 years ago
If a folder gets > 4GB (e.g., the sent folder), then compact doesn't seem to work.
(Assignee)

Comment 1

7 years ago
I suspect the underlying cause of this is that parsing a > 4GB mail folder doesn't result in a valid db, because we only use 32 bits to store the folder size, which means the stored folder size doesn't match the folder size as reported by the OS. This is trivially fixable, by changing the interface, but that's not technically kosher for 3.1.x. I personally think changing nsIDBFolderInfo.idl on the branch is not a big issue since the chance of anyone depending on binary compatibility with that interface is exceedingly slim.
(Assignee)

Comment 2

7 years ago
Created attachment 487472 [details] [diff] [review]
this fixes the parsing and hence the compaction

this fixes the issue, so that the user can actually compact the oversize folder and get their profile back into a usable state. Two remaining considerations - unit test for this fix, and is it possible to recover messages that were appended to the sent folder after it was over 4GB? We could do something like notice that the folder is greater than 4GB, and copy the part > 4GB to the temp file, and reparse. I believe that the messages actually are appended, but they don't show up in the db (though I haven't verified that, and it's a bit tricky to do with files of this size).
(Assignee)

Comment 3

7 years ago
Writing a unit test for this is proving problematic. sparse files result in what look like very long lines to the mailbox parser, which eventually cause allocations to fail, because we try to store the whole line in one block of memory. I could break up the file into smaller sparse chunks (say, of 100MB each) which would increase the disk usage quite a bit, or I could make the mailbox parser essentially not buffer blocks of all '\0' bytes, which is a bit like teaching to the test...
(Assignee)

Comment 4

7 years ago
ignoring blocks of nulls is going to break message offsets during message parsing, so I'll need to be less sparse.
(Assignee)

Comment 5

7 years ago
Created attachment 489499 [details] [diff] [review]
fix for trunk with unit test

I'll work on the branch patch next since this is badly wanted for 3.1.7
Attachment #487472 - Attachment is obsolete: true
Attachment #489499 - Flags: review?(bugzilla)
(Assignee)

Comment 6

7 years ago
Created attachment 489688 [details] [diff] [review]
3.1 version of fix

I couldn't get the unit test working for 3.1, I think because the sparse file code works quite a bit differently on the trunk, but I tested the parsing and compaction of folders > 4gb, and it does work.
Attachment #489688 - Flags: review?(bugzilla)
Comment on attachment 489499 [details] [diff] [review]
fix for trunk with unit test

>-  GetInt32PropertyWithToken(m_folderSizeColumnToken, m_folderSize);
>+  GetUint64PropertyWithToken(m_folderSizeColumnToken, &m_folderSize);

How does mork cope with this change in a backwards compatibility fashion?
(Assignee)

Comment 8

7 years ago
(In reply to comment #7)
> Comment on attachment 489499 [details] [diff] [review]
> fix for trunk with unit test
> 
> >-  GetInt32PropertyWithToken(m_folderSizeColumnToken, m_folderSize);
> >+  GetUint64PropertyWithToken(m_folderSizeColumnToken, &m_folderSize);
> 
> How does mork cope with this change in a backwards compatibility fashion?

As long as the folder size is less than 4GB, this change doesn't affect the actual string written to the mork database, since they're written as text. For folder sizes > 4GB, the old code wasn't working anyway, and I think would fail in similar ways.
blocking-thunderbird3.1: --- → .7+
Comment on attachment 489499 [details] [diff] [review]
fix for trunk with unit test

As discussed on irc, r=Standard8 if you either remove the try/catch statements in the unit test or change the dump(ex) calls to do_throw(ex) versions.
Attachment #489499 - Flags: review?(bugzilla) → review+
(Assignee)

Comment 10

7 years ago
fixed on trunk - http://hg.mozilla.org/comm-central/rev/b6ac948ef71b
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment on attachment 489688 [details] [diff] [review]
3.1 version of fix

>diff --git a/mailnews/base/util/nsMsgDBFolder.cpp b/mailnews/base/util/nsMsgDBFolder.cpp

> NS_IMETHODIMP nsMsgDBFolder::ThrowAlertMsg(const char * msgName, nsIMsgWindow *msgWindow)
> {
>   nsString alertString;
>   nsresult rv = GetStringWithFolderNameFromBundle(msgName, alertString);
>   if (NS_SUCCEEDED(rv) && !alertString.IsEmpty() && msgWindow)
>   {
>-    nsCOMPtr <nsIDocShell> docShell;
>-    msgWindow->GetRootDocShell(getter_AddRefs(docShell));
>-    if (docShell)
>-    {
>-      nsCOMPtr<nsIPrompt> dialog(do_GetInterface(docShell));
>-      if (dialog && !alertString.IsEmpty())
>-        dialog->Alert(nsnull, alertString.get());
>-    }
>+    nsCOMPtr<nsIPrompt> dialog;
>+    msgWindow->GetPromptDialog(getter_AddRefs(dialog));
>+    if (dialog)
>+      dialog->Alert(nsnull, alertString.get());

I'm not sure we need this change if we're not doing the test...

>diff --git a/mailnews/db/msgdb/public/nsDBFolderInfo.h b/mailnews/db/msgdb/public/nsDBFolderInfo.h

>-  PRInt32   m_folderSize;
>+  PRUint32  m_folderSize;
....
>+nsresult nsDBFolderInfo::SetFolderSize64(PRUint64 size)
>+{
>+  m_folderSize = (PRUint32) size;
>+  return SetUint64Property(kFolderSizeColumnName, m_folderSize);
> }

This looks a bit strange. You're storing it locally as a PRUint32, but then you're also putting the 32 bit value into the database in a 64 bit form. Did you actually want to pass 'size' on that set line? Or maybe just store m_folderSize as a PRUint64 ?
(Assignee)

Comment 12

7 years ago
(In reply to comment #11)
> This looks a bit strange. You're storing it locally as a PRUint32, but then
> you're also putting the 32 bit value into the database in a 64 bit form. Did
> you actually want to pass 'size' on that set line? Or maybe just store
> m_folderSize as a PRUint64 ?

I think just pass size, though either should work...I'll do a new patch in a bit.
(Assignee)

Comment 13

7 years ago
Created attachment 493268 [details] [diff] [review]
3.1 fix addressing comments.

I've got to do some holiday things now, but these are the requested changes...
Attachment #489688 - Attachment is obsolete: true
Attachment #493268 - Flags: review?(bugzilla)
Attachment #489688 - Flags: review?(bugzilla)
Comment on attachment 493268 [details] [diff] [review]
3.1 fix addressing comments.

r+a=Standard8
Attachment #493268 - Flags: review?(bugzilla)
Attachment #493268 - Flags: review+
Attachment #493268 - Flags: approval-thunderbird3.1.7+
Checked into branch: http://hg.mozilla.org/releases/comm-1.9.2/rev/940d1d7a14b7
status-thunderbird3.1: --- → .7-fixed
Target Milestone: --- → Thunderbird 3.3a2
Severity: normal → major
Two comments:
1. 0x1100000000 == 73,014,444,032. Seems like there's either an extra 1 or an extra 0 there.
2. If we mark most of a file as sparse I can't imagine we'll actually need 4 gigs of hard drive space.
Depends on: 616559
Depends on: 619358

Updated

7 years ago
Duplicate of this bug: 636521
Duplicate of this bug: 637776
Re-opening to know next is "incomplete fix" or "still current restriction on file size".

Following is Tb trunk's behaviour on Win when file size exceeded 4GB limit.
(1. mail folder file, start with ".msf file is deleted" status.)
    This is emulation of bug 598104.
                    +-- 4GB boundary
                    V 
  -----------------> <-------------------------------------------------------
      <--------------------><--------------------><-------------------->...
                  mail(N)               mail(N+1)             mail(N+2)
(2. after rebuld-index)
Rebuild-index normally ends on Win, but mail(N+1), mail(N+2), ... are not shown at thread pane. Over-warp of offset-doesn't occur on Win. Mails over 4GB is simply ignored. 
(3. delete mails before mail(N), and Compact folder)
Compact normally ends, and mails till mail(N) were compacted. However, mail(N+1), ..., are not recovered. Mail data of mail(N+1), ... are lost in mail data file after compact.
As offset is 32bits unsigned integer, mails over 4GB looks ignored in any occasion on Win.

Do we need Bug 462665 for problem of "loss of mail date over 4GB by Compact"?

Note:
As "mail of offset=over 4GB" looks to be able to happen on Sent folder only, and it occurs only in Tb without fix of bug 598104(fixed by Tb 3.1.7, and Tb 3.1.8 is alredy shipped), I think "mail of offset=over 4GB, mail(N+1)" won't happen any more, even though "mail data over 4GB, mail(N), file size over 4GB" can happen.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 20

7 years ago
You can file a new bug for the loss of mail > 4GB for compacting a folder > 4GB, but this bug is fixed. The user wasn't seeing the mail > 4GB before the compact fix, and I thought it was much more important to fix the core problem with compact so the users' folders would work again.
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago7 years ago
Resolution: --- → FIXED
I see.
"mail of offset=over 4GB" occurrs only on Sent folder only in Tb 3.1.6 or before, because bug 598104 was already fixed by Tb 3.1.7.
VERIFIED, per my above test result for other bug with crafted but valid 4.2GB folder(13MB mail*322 mails) on Win.
Status: RESOLVED → VERIFIED
FYI.
Next was observed during test for bug 634544, with Tb 3.1.9.
(1) mail folder file, start with ".msf file is deleted" status.
    This is emulation of bug 598104. Size of all mail is 13MB.
               +-- 4GB boundary
               V 
  ------------> <-------------------------------------------------------
      <--------------------><--------------------><-------------------->...
          mail-316(13MB)        mail-317(13MB)        mail-318(13MB)
      <-------> <---------->
         1MB        12MB(this 12MB data over 4GB is lost by compact)
Rebuild-Index normally ends. mail-1 to mail-316 is shown at thread pane. mail-317 and after is not shown at tread pane as I wrote before.
Mail length of mail-316 is 13MB and whole mail data of mail-316 is shown.

(2) Compact Folder, using Tb 3.1.9.
    Delete mail-1, and Compact Folder.
    After compact, mail data length of mail-316 becomes 1MB.
    Observed problem is;
      Mail at offset<4GB is kept by compact,
      but mail data over 4GB is lost by compact of mail file over 4GB.
I'll open separate bug for it.
(Correction of previous comment #22)
> Mail length of mail-316 is 13MB and whole mail data of mail-316 is shown.

It was wrong.
Mail size of mail-316 shown at thread pane after rebuild-index was 1MB, and view/source showed the 1MB only.
i.e. "loss of mail data over 4GB" was never produced by Compact.
It was merely a phenomenon of "current folder file size limitation", and fix of Bug 462665 will be needed to resolve this kind of problems.
Sorry for my confusion.


> 
> (2) Compact Folder, using Tb 3.1.9.
>     Delete mail-1, and Compact Folder.
>     After compact, mail data length of mail-316 becomes 1MB.
>     Observed problem is;
>       Mail at offset<4GB is kept by compact,
>       but mail data over 4GB is lost by compact of mail file over 4GB.
> I'll open separate bug for it.
Depends on: 794303
If "file size after compact"(==file size of nstmp) is less than 2GB, Compact is normally executed by fix of this bug even when original foldr size is greater than 4GB.
However, if "file size after compact"(==file size of nstmp) is greater than 2GB, Compact fails even when original foldr size is less than 4GB.
Bug 794303 is for this issue.
Depends on: 854791
Depends on: 854798

Updated

3 years ago
No longer depends on: 854791
Depends on: 1144020
No longer depends on: 1144020
You need to log in before you can comment on or make changes to this bug.