Closed Bug 471682 Opened 16 years ago Closed 15 years ago

Messages in corresponding "Sent" folder inaccessible after sending due to .msf index file corruption

Categories

(MailNews Core :: Database, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.0b2

People

(Reporter: fehe, Assigned: rkent)

References

(Blocks 1 open bug)

Details

(Keywords: dogfood, regression, Whiteboard: [crash case needs testcase - see comment 93])

Attachments

(2 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20081231 Lightning/1.0pre

After sending an email, messages in the "Sent" folder for the corresponding account become inaccessible due to .msf index file corruption.  This issue was first discussed in Bug 471307 Comment #2 and Comment #9.

Once the corruption occurs, the only remedy is to delete the corresponding Sent.msf file.

The regression range should be the same as in Bug 471307:

Regression Range:
Works: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre)
Gecko/20081219 Shredder/3.0b2pre

Broken: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre)
Gecko/20081220 Shredder/3.0b2pre


Reproducible: Always

Steps to Reproduce:
1. Send an email
2. Click a folder other than the corresponding "Sent" folder, and then click the "Sent" folder and attempt to view any of the sent messages.  Notice that no message content can be displayed.
3. Shut down Shredder and delete the corresponding Sent.msf file.
4. Start Shredder and notice that you can now view the message content.
Flags: blocking-thunderbird3?
Keywords: regression
Version: unspecified → Trunk
I was able to reproduce this in yesterday's build, but not in today's.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20081231 Lightning/1.0pre Shredder/3.0b2pre ID:20081231031214
Seems to work fine for me.
Nope.  Still there.  It seems you have to click through folders for other mail accounts then come back to the Sent folder in the account from which you sent the email, before you see the problem.  So my Step #2 is not quite right.

When I get a chance, I will investigate further and provide better instructions, but this most definitely is not fixed in today's build.
OK, I was finally able to get a sent folder that was inaccessible.
But I'm not quite sure how.

One thing I remember:
I had a corrupted sent folder for my gmail account, (possibly from a previous use)
After attempting to read a message there, my local folders sent folder now failed.

It's as if something in the code is confusing one sent folder with another.
Status: UNCONFIRMED → NEW
Ever confirmed: true
So to restore the folders I deleted the .msf files.
Local Folders sent.msf was 1099 KB; after deleting it and restarting, it was 1048 KB.
Gmail sent.msf was 30 KB; after deleting it and restarting, it was 26 KB.


So something is getting into the msf that does not belong there.

See you next year.
Happy New Year
Happy New Year.

Well, after taking another look, I am finding this bug a bit difficult to pin down.  Sometimes it happens; sometimes not.

One thing I've noticed is that the index file created on startup, when no index file exists (e.g. where the index file was deleted) is significantly different than the one created when manually doing "Rebuild Index" on the folder.  Maybe that's why the corruption is occurring.

The first few lines of the index file headers (if that's what they are called), for the exact same folder, created by either mechanism, differ as follows (copied line-by-line from the beginning of each file):

------------- Index file created at Shredder startup --------------------------
// <!-- <mdb:mork:z v="1.4"/> -->
< <(a=c)> // (f=iso-8859-1)
  (B8=account)(B9=folderName)(BA=sortType)(BB=sortOrder)(BC=viewFlags)
  (BD=viewType)(BE=MRUTime)(80=ns:msg:db:row:scope:msgs:all)(81=subject)
  (82=sender)(83=message-id)(84=references)(85=recipients)(86=date)
  (87=size)(88=flags)(89=priority)(8A=label)(8B=statusOfset)(8C=numLines)
  (8D=ccList)(8E=msgThreadId)(8F=threadId)(90=threadFlags)
  (91=threadNewestMsgDate)(92=children)(93=unreadChildren)
  (94=threadSubject)(95=numRefs)(96=msgCharSet)
------------------------------------------------------------------------------

------------- Index file created by "Rebuild Index" feature ------------------
// <!-- <mdb:mork:z v="1.4"/> -->
< <(a=c)> // (f=iso-8859-1)
  (B8=account)(B9=folderName)(BA=sortType)(BB=sortOrder)(BC=viewFlags)
  (BD=viewType)(BE=MRUTime)(BF=sortColumns)(C0=customSortCol)
  (C1=current-view-tag)(C2=current-view)(C3=keywords)(C4=imageSize)
  (C5=junkscore)(C6=charSet)(C7=notAPhishMessage)(C8=charSetOverride)
  (80=ns:msg:db:row:scope:msgs:all)(81=subject)(82=sender)(83=message-id)
  (84=references)(85=recipients)(86=date)(87=size)(88=flags)(89=priority)
  (8A=label)(8B=statusOfset)(8C=numLines)(8D=ccList)(8E=msgThreadId)
  (8F=threadId)(90=threadFlags)(91=threadNewestMsgDate)(92=children)
  (93=unreadChildren)(94=threadSubject)(95=numRefs)(96=msgCharSet)
------------------------------------------------------------------------------

The remaining header lines below the above excerpts are basically identical between the two files.  So it could be that it is those "(XX=some_string)" definitions missing from the on-startup created index files that's leading to the corruption.

So maybe replacing the on-startup code with the on-"Rebuild Index" code will fix this issue.
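The difference between the two excerpts above can be listed mechanically. The following is only a rough sketch: the regular expression and the helper names (`column_names`, `missing_columns`) are made up for illustration and this is not a real Mork parser; it merely pulls the "(XX=some_string)" pairs out of the two pasted excerpts and reports the names that appear only in the "Rebuild Index" version.

```python
import re

# Matches dictionary entries such as (B8=account) or (80=ns:msg:db:row:scope:msgs:all).
# Hypothetical pattern for these excerpts only; it is not a general Mork grammar.
ENTRY = re.compile(r"\(([0-9A-F]+)=([^()]*)\)")

STARTUP = """
(B8=account)(B9=folderName)(BA=sortType)(BB=sortOrder)(BC=viewFlags)
(BD=viewType)(BE=MRUTime)(80=ns:msg:db:row:scope:msgs:all)(81=subject)
(82=sender)(83=message-id)(84=references)(85=recipients)(86=date)
(87=size)(88=flags)(89=priority)(8A=label)(8B=statusOfset)(8C=numLines)
(8D=ccList)(8E=msgThreadId)(8F=threadId)(90=threadFlags)
(91=threadNewestMsgDate)(92=children)(93=unreadChildren)
(94=threadSubject)(95=numRefs)(96=msgCharSet)
"""

REBUILD = """
(B8=account)(B9=folderName)(BA=sortType)(BB=sortOrder)(BC=viewFlags)
(BD=viewType)(BE=MRUTime)(BF=sortColumns)(C0=customSortCol)
(C1=current-view-tag)(C2=current-view)(C3=keywords)(C4=imageSize)
(C5=junkscore)(C6=charSet)(C7=notAPhishMessage)(C8=charSetOverride)
(80=ns:msg:db:row:scope:msgs:all)(81=subject)(82=sender)(83=message-id)
(84=references)(85=recipients)(86=date)(87=size)(88=flags)(89=priority)
(8A=label)(8B=statusOfset)(8C=numLines)(8D=ccList)(8E=msgThreadId)
(8F=threadId)(90=threadFlags)(91=threadNewestMsgDate)(92=children)
(93=unreadChildren)(94=threadSubject)(95=numRefs)(96=msgCharSet)
"""

def column_names(header):
    """Map each hex id in a header excerpt to its column name, in file order."""
    return {hex_id: name for hex_id, name in ENTRY.findall(header)}

def missing_columns(smaller, larger):
    """Names defined in `larger` but absent from `smaller`."""
    present = set(column_names(smaller).values())
    return [name for name in column_names(larger).values() if name not in present]

only_in_rebuild = missing_columns(STARTUP, REBUILD)
only_in_startup = missing_columns(REBUILD, STARTUP)
print(only_in_rebuild)
print(only_in_startup)
```

On these excerpts the startup-created header defines nothing that the rebuilt one lacks; the rebuilt one adds the ten definitions BF through C8.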
I should note that, since Bug 471307 is fixed, it is now possible to fix the corruption with the "Rebuild Index" feature.  No more need to manually delete the .msf index file.
(In reply to comment #7)
> I should note that, since Bug 471307 is fixed, it is now possible to fix the
> corruption with the "Rebuild Index" feature.  No more need to manually delete
> the .msf index file.

I'm afraid "Rebuild Index" does not work for me.
Right-clicking on my local folders "sent" file which got corrupted again gives:
Error: Component returned failure code: 0x80550005 [nsIMsgFolder.getMsgDatabase] = <unknown>
Source file: chrome://messenger/content/mailWidgets.xml
Line: 1929

Rebuild index just does nothing, presumably because it does not know what
folder was selected.
I don't get that message and "Rebuild Index" is working for me on inaccessible "Sent" folders.  What type of account is that (i.e. POP3, IMAP, etc.)?  What about extensions?
I have multiple newsgroups accounts which all use the same Local Folders "sent"
That is where I most often see the corrupted msf.

The last failure happened when I posted to a different account from the previous
2 posts. (at least that's when I noticed it)

"Rebuild Index" executes properly on a good msf (does not destroy a good index)
But does nothing with an msf that needs rebuilding :)
I did a lot of leg work in following malformed summary files on importing.  This may be related to this bug.

Someone try and apply my patch and see if it fixes this.

Bug 471130
Following is test result with Tb trunk 2009/1/05 build, with manual corruption of ".msf" file. 
(1) Terminate Tb trunk.
(2) Copy "Junk.msf" to "Sent.msf". (manual corruption of "Sent.msf")
(3) Restart Tb. Never touch "Sent" folder.
    => File size, timestamp of "Sent.msf" was not changed.
(4) Copy a mail to Sent(not Drag&Drop, via Menu)
    => File size of "Sent" increased.
       "Sent.msf" was deleted. (not re-created)
       After a while, the following error message was displayed in the Error Console.
> Error: uncaught exception: [Exception... "Component returned failure code: 0x80550006 [nsIMsgFolder.getMsgDatabase]"
>  nsresult: "0x80550006 (<unknown>)"  location: "JS frame ::
> chrome://messenger/content/mailWidgets.xml :: parseFolder :: line 1929"  data: no]  
(5) Copy a mail to Sent again(not Drag&Drop, via Menu)
    => File size of "Sent" increases.  
       "Sent.msf" is still not re-created.
       No error message was issued.
(6) Click "Sent" folder(open "Sent" folder)
    => "Sent.msf" was re-created. Copied mails were not lost. 

Not all issues in this bug are explained by the above, but some of them seem to be this phenomenon.
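Step (2) of the test above (the manual corruption) can be scripted. This is a minimal sketch, assuming the application is not running and that the folder's summary files sit side by side in one mail directory; `corrupt_summary` and the ".bak" backup name are hypothetical, and the demo runs on a throwaway directory rather than a real profile.

```python
import shutil
import tempfile
from pathlib import Path

def corrupt_summary(mail_dir, donor="Junk", victim="Sent"):
    """Overwrite <victim>.msf with <donor>.msf; return the path of the backup copy."""
    victim_msf = Path(mail_dir) / f"{victim}.msf"
    donor_msf = Path(mail_dir) / f"{donor}.msf"
    backup = victim_msf.with_name(victim_msf.name + ".bak")
    shutil.copy2(victim_msf, backup)        # keep the good summary so it can be restored
    shutil.copyfile(donor_msf, victim_msf)  # Sent.msf now describes the wrong folder
    return backup

# Demo on a temporary directory standing in for a real mail directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Sent.msf").write_text("sent summary")
    (Path(tmp) / "Junk.msf").write_text("junk summary")
    backup = corrupt_summary(tmp)
    sent_after = (Path(tmp) / "Sent.msf").read_text()
    backup_text = backup.read_text()

print(sent_after)
print(backup_text)
```

Keeping the backup makes it easy to put the original summary back between test runs.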

To IU(bug opener):
Can you reproduce the problem in the following case?
 - Click "Sent" folder(open "Sent" folder via UI), then send mail several times.
Oh sorry, above error message was reported by Joe Sabash.
Joe Sabash, can "manual open of Sent folder before mail send" be a workaround?
With my patch from bug 471130 I can install a bad *.msf file and it will rebuild it.  I'm not sure if that is what the question is.

To add to my comment 11: the *.js files mostly end up calling GetDatabaseWOReparse, which returns a db even if the summary file is bad. In TB2 it returned a null db for a bad summary file. UpdateFolder only calls GetDatabaseWithReparse when it is given a null db. TB2 of course had that behavior and TB3 doesn't.

In my patch the null db is passed back to *.js which sends it to UpdateFolder and it gets reparsed.
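The control flow described in this comment can be modelled roughly as follows. This is a toy model only: the Python functions are hypothetical stand-ins named after GetDatabaseWOReparse, GetDatabaseWithReparse and UpdateFolder, and the `Folder` class and string "databases" are invented for illustration; none of it is the actual mailnews code.

```python
class Folder:
    def __init__(self, name, summary_ok):
        self.name = name
        self.summary_ok = summary_ok
        self.reparsed = False

def get_database_wo_reparse_tb2(folder):
    """Behaviour described for TB2: a bad summary yields no database (None)."""
    return "db:" + folder.name if folder.summary_ok else None

def get_database_wo_reparse_tb3(folder):
    """Behaviour described for TB3 before the patch: a db comes back even if the summary is bad."""
    return "db:" + folder.name

def get_database_with_reparse(folder):
    """Rebuild the summary from the mailbox file, then hand back a database."""
    folder.summary_ok = True
    folder.reparsed = True
    return "db:" + folder.name

def update_folder(folder, db):
    """Only a missing (None) database triggers the reparse."""
    if db is None:
        db = get_database_with_reparse(folder)
    return db

bad_tb2 = Folder("Sent", summary_ok=False)
update_folder(bad_tb2, get_database_wo_reparse_tb2(bad_tb2))

bad_tb3 = Folder("Sent", summary_ok=False)
update_folder(bad_tb3, get_database_wo_reparse_tb3(bad_tb3))

print(bad_tb2.reparsed, bad_tb3.reparsed)  # True False
```

In this model the bad summary is only repaired when the null result is allowed to reach `update_folder`, which is the behaviour the patch is described as restoring.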

Someone ought to sr my patch and get it in.  All it does is correct an omission in Bug 437886.
(In reply to comment #12)
> Following is test result with Tb trunk 2009/1/05 build, with manual corruption
> of ".msf" file. 
> (1) Terminate Tb trunk.
> (2) Copy "Junk.msf" to "Sent.msf". (manual corruption of "Sent.msf")

I used a different "sent.msf" rather than "junk.msf"

> (3) Restart Tb. Never touch "Sent" folder.
>     => File size, timestamp of "Sent.msf" was not changed.

I did not verify this.

> (4) Copy a mail to Sent(not Drag&Drop, via Menu)
>     => File size of "Sent" increased.
>        "Sent.msf" was deleted. (not re-created)

Yes, after the copy, the sent.msf was gone.

>        After a while, the following error message was displayed in the Error Console.
> > Error: uncaught exception: [Exception... "Component returned failure code: 0x80550006 [nsIMsgFolder.getMsgDatabase]"
> >  nsresult: "0x80550006 (<unknown>)"  location: "JS frame ::
> > chrome://messenger/content/mailWidgets.xml :: parseFolder :: line 1929"  data: no]

Somewhere I got this:
Error: Component returned failure code: 0x80550006 [nsIMsgFolder.getMsgDatabase] = <unknown>
Source file: chrome://messenger/content/mailWidgets.xml
Line: 1929

  
> (5) Copy a mail to Sent again(not Drag&Drop, via Menu)
>     => File size of "Sent" increases.  
>        "Sent.msf" is still not re-created.
>        No error message was issued.

Yes

> (6) Click "Sent" folder(open "Sent" folder)
>     => "Sent.msf" was re-created. Copied mails were not lost.

Yes, the msf was re-created in this case, I guess because it was deleted in the
copy process. The copied messages were not lost.
But the "bad" msf remains in the native failure, so it must be manually deleted. 
I guess what you are getting at, is that the msf should be deleted in the 
course of "normal" additions to the sent folder as a result of a "normal" send,
if the msf was found to be bad.
@Phil: Maybe you could comment on my Comment #5.  Why is the header (or preamble or whatever) of the .msf file created on Shredder start-up, when no .msf file exists (e.g. file was deleted), different and significantly lacking compared to that of the .msf file created via "Rebuild Index".  Should these not be the same, and isn't this possibly a contributor to why this bug may be happening?

I notice that after I perform "Rebuild Index" on a folder, it becomes resilient to this bug--at least as far as I can tell.
This affects SeaMonkey Trunk too, so change the Product and Component to MailNews Core, Database.
Component: General → Database
Product: Thunderbird → MailNews Core
QA Contact: general → database
Target Milestone: --- → mozilla1.9.1b3
IU, re comment 16: maybe it is corrupted the same way as on import of mail, at least of oexpress mail.  I was going to look at that at some point.  But since my patch corrects the bad msf I may not get to it too soon.
(In reply to comment #19)
> IU comment 16. Maybe same way it is corrupted on import of mail at least of
> oexpress mail.

I think there may be a misunderstanding here.  Neither of the two .msf file excerpts in Comment #5 is from a corrupt .msf file.  They are both from working .msf files for the exact same folder, but created using two different mechanisms: (1) on Shredder startup (the shorter file); and (2) on "Rebuild Index" (the longer file).  These were created about a minute apart.

My concern is that these should be the same and yet they are not, and this may be why this corruption is occurring--that is, the .msf file created on startup is lacking information sufficient for all of Shredder's operations, thus resulting in a broken index file, when certain, but normal operations are performed on it.

Thus, I am assuming that, if Shredder is made to generate index files from startup, similar to how it generates them from "Rebuild Index", the problem may be solved.

Try this for yourself:

1. Delete an .msf file and launch Shredder
2. Copy the newly created .msf file to a new name
3. Open the context menu for the folder associated with the index file in Steps 1 and 2 and choose: "Properties... --> General Information --> Rebuild Index"
4. Copy the new .msf file to another name
5. Compare the file from Step 2 with that from Step 4.  Notice how different the header sections (where all those codes are defined) are.
This bug already has blocking-thunderbird3?. Requesting blocking-seamonkey2.0a3? too because it also affects SeaMonkey (see comment #17).
Flags: blocking-seamonkey2.0a3?
Blocks: 437886
Tried comment 20.
I can't do step 3. I rt click sent and rebuild index but nothing appears on disk. click 'ok' and nothing happens.  I have to close box.
next I open sent folder by left clicking.  All is well. rt click folder and click rebuild. status bar = 'Done' 'ok' button now works.

I will investigate.

I noticed this before.
A bad or I guess missing msf file doesn't get rebuilt on my machine until I left click the folder to open it.  It didn't even do that until I applied my other patch.
BTW. just checked same in TB2. On open, msf file is empty.
only on left click does it reappear and is different if I then do a rebuild index.

also I note my copies of msf were zeroed out on restart even though they did not have a db to go with.
(In reply to comment #22)
> Tried comment 20.
> I can't do step 3. I rt click sent and rebuild index but nothing appears on
> disk. click 'ok' and nothing happens.  I have to close box.

Step 2 was to *copy* the .msf file to a new name--not rename the original. :-)

But anyhow, I suppose the fact that "Rebuild Index" does not create a missing index file should probably be a bug.
The concern with the .msf index file being different is more to do with the header section.  The fact that it is significantly different, depending on how it's created, might be a contributing factor and why Shredder is writing data it cannot itself interpret.
if you try 'rebuilding index' in folder property pane (rtclick) with bad summary file both tb2 and tb3 don't handle this function well (tb2 in widgetglue.js and tb3 in folderpane.js)

getMsgDatabase(msgWindow)

Don't know why ok button is broke for those events.  Are you getting the same problem with rebuild index per comment 24.
open any folder. rtclick other unfocused folder. click rebuild index and try to click ok.
Maybe you tried this already.
(In reply to comment #26)
> if you try 'rebuilding index' in folder property pane (rtclick) with bad
> summary file both tb2 and tb3 don't handle this function well (tb2 in
> widgetglue.js and tb3 in folderpane.js)
> 
> getMsgDatabase(msgWindow)
> 
> Don't know why ok button is broke for those events.  Are you getting the same
> problem with rebuild index per comment 24.
> open any folder. rtclick other unfocused folder. click rebuild index and try to
> click ok.
> Maybe you tried this already.

I'm not getting that, but I'm getting a worse problem if I do that.  If I select a folder and then right-click and choose "Rebuild Index" on a different folder (one containing a corrupt index), it looks like it succeeded and I can click OK; however, when I then switch to the folder I just performed the rebuild operation on, there are no messages listed.  Everything for that folder is blank.  If I right-click the folder and choose Properties..., I see "Default Character Encoding" is now "Arabic (IBM-864)" and I also have the following in Error Console:

Error: Component returned failure code: 0x80550005 [nsIMsgFolder.charset] = <unknown>
Source file: chrome://messenger/content/folderProps.js
Line: 237
 ----------
Error: Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIMsgDBView.hdrForFirstSelectedMessage]
Source file: chrome://messenger/content/mailWindowOverlay.js
Line: 1651

If I open the messages file, I see that my messages are still there, but neither redoing "Rebuild Index" nor deleting the .msf file will allow the messages to be displayed.  It's irrecoverably messed up.

This is getting irritating.  This problem just keeps getting worse and worse.

I'm probably going to have to file a new bug for that.
Keywords: mail4
Just filed: Bug 472446.  I don't have time now to do regression hunting--maybe on the weekend.
I mentioned in bug 471130 I also frequently can't access any messages of a local folder that receives mail filtered from imap account. What's interesting is I have several local folders that get filtered mail, this is the only one that gives me trouble. Sometimes reindex resolves the issue, sometimes not. I think this also happened once for an imap folder.
Keywords: dogfood
perhaps this bug and its friends might be able to resolve bug 321371?
I would dearly love to investigate this issue, but I have not been able to
reproduce it. If anyone has a clear set of steps that can reproduce this I
would be grateful.

I certainly see some related bugs. On my personal system, my junk folder
database is blown away almost daily, and it has similar circumstances to Sent
of copies that occur when the folder is not open. But that has been happening
for months, so is not a recent regression.

Phil's fix in bug 471130, which I am now pushing to be accepted, may fix this.
I'll also be working on my Junk folder issues as well as some other related
problems I have with the way that the databases are opened. So maybe this will
go away with those other fixes.
(In reply to comment #33)
> I would dearly love to investigate this issue, but I have not been able to
> reproduce it. If anyone has a clear set of steps that can reproduce this I
> would be grateful.
> 
What I have found is that messages in the "Sent" folder do not immediately become inaccessible after sending.  You have to navigate through several other folders and look at other messages for a while before returning to the "Sent" folder.  It is then that you will notice the messages are inaccessible.
Here is an exact sequence of events that caused local folders "sent"
to become inaccessible.
There were 3 accounts involved:
Main pop3 account
Gmail pop
Newsgroups

Checked all sent folders OK
Sent a newsgroup message (set to go to local folders "sent")
Checked local folders sent OK
Checked main pop3 sent OK
Checked gmail sent OK
Checked local folders sent...inaccessible

This in the error console, although I'm not sure if it is relevant.
Error: Component returned failure code: 0x80004001 (NS_ERROR_NOT_IMPLEMENTED) [nsIRequest.name]
Source file: file:///C:/myprofilefolder/thunderbird/components/nsLoginManager.js
Line: 315

I'll see if it is really reproducible.
Not able to reproduce, with the same sequence of actions.

And no errors in the console.
The following problem still remains around "folder access without folder open via UI".
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090119 Shredder/3.0b2pre

(1) Terminate Tb
(2) Add a mail data in a local mail folder (mail-4 is added to mail-1 to 3).
    (emulate crash/failure during adding of mail data)
    Don't touch .msf.
    => Modification timestamp of "xxx" > Modification timestamp of "xxx.msf"
       But difference is less than mail.db_timestamp_leeway value.
    => File size of "xxx" is greater than one held in "xxx.msf".  
(3) Restart Tb
(4) Move cursor on "xxx". Following error in Error Console.
> Error: uncaught exception: [Exception... "Component returned failure code: 0x80550005 [nsIMsgFolder.getMsgDatabase]" nsresult: "0x80550005 (<unknown>)"
> location: "JS frame :: chrome://messenger/content/mailWidgets.xml :: parseFolder :: line 1929"  data: no]
(5) Don't open "xxx".
(6) Copy two mails(mail-5 and mail-6) to "xxx".
(7) Open folder "xxx" => mail-1, mail-2, mail-3, mail-5, mail-6
    (mail-4 doesn't appear)
(8) Rebuild Index of "xxx" => All mails are displayed.

The internal index rebuild seems to fail in the situation at step (4) above.
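The state set up in step (2) can be pictured with a small model. This is a sketch under assumptions: it supposes that the summary remembers the mailbox file's size and modification time and that both are compared when the folder is accessed, with only the timestamp getting a leeway. `SummaryInfo`, `summary_is_current` and all numbers are illustrative, not taken from the real code or from the pref's default.

```python
from dataclasses import dataclass

@dataclass
class SummaryInfo:
    """What a summary file is assumed to remember about its mailbox file."""
    folder_size: int  # bytes
    folder_date: int  # modification time, in seconds

def summary_is_current(info, actual_size, actual_mtime, leeway):
    """True when the mailbox file still matches what the summary recorded."""
    date_ok = abs(actual_mtime - info.folder_date) <= leeway
    size_ok = actual_size == info.folder_size
    return date_ok and size_ok

# Summary written while the folder held mail-1 to mail-3.
info = SummaryInfo(folder_size=3000, folder_date=1_000_000)
leeway = 4000  # illustrative value only

# Untouched mailbox, timestamp slightly newer: still considered current.
untouched = summary_is_current(info, actual_size=3000, actual_mtime=1_000_010, leeway=leeway)

# mail-4 appended behind the summary's back: timestamp within leeway, size larger.
after_append = summary_is_current(info, actual_size=4200, actual_mtime=1_000_010, leeway=leeway)

print(untouched, after_append)  # True False
```

Under these assumptions the size mismatch alone is enough to mark the summary as out of date, even though the timestamps differ by less than the leeway.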
Wada, what you are describing in comment 37 is interesting, and I could see how it could happen - but I'm not sure it is the same issue as this bug. But I'll have to check it out. We don't expect reindexing at step 4, only when you open the folder, so the noted failure is by design. The confusion is occurring later, when the database is tricked into writing into an out-of-date database, and then convinces itself that the db is now up to date. Great detective work, by the way!

The patch for bug 471130 landed on Jan 19. I'm interested to see if that patch has any impact on this bug. So any comments on whether the bug is still seen on later builds would be useful.
(In reply to comment #38)
> The patch for bug 471130 landed on Jan 19. I'm interested to see if that patch
> has any impact on this bug. So any comments on whether the bug is still seen on
> later builds would be useful.

You might be right.  It does *seem* fixed.  I will keep an eye on it for another day, if I don't run into the problem again, I'll resolve as WFM.  Or should I instead mark as dupe of Bug 471130?
blocking plus in the meantime, but please do mark it WFM if it's working for you, thx!
Assignee: nobody → bienvenu
Flags: blocking-thunderbird3? → blocking-thunderbird3+
Ok then.  I'm pretty confident this issue is now resolved.  I haven't been able to trigger it all day, so for now it's WFM.

Thanks
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME
(In addition to comment #37)
Tested with 2009/1/21 build. 
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090121 Shredder/3.0b2pre

Step (1) to (5) : Same as comment #37.
(6-a) Copy mail-5 to "xxx" => "xxx.msf" was deleted. mail-5 was added to "xxx"
(6-b) Copy mail-6 to "xxx" => mail-6 was added to "xxx"
(7) Place mouse cursor on folder "xxx" => same error as step (4)
(8) Rebuild Index of "xxx" => "msf" was recreated and all mails are displayed.

Even if problems (6-a) (.msf is deleted) and (6-b) (data is appended to xxx without xxx.msf being created) exist, the next "folder open via UI" or "manual rebuild index" will probably recreate the .msf correctly, because the "old .msf" has already been deleted. So, even if a problem like comment #37 was involved in this bug, the main/critical problem of this bug (the problem with the Sent folder) doesn't seem to occur after the patch for bug 471130 landed on Jan 19.

I don't know what will happen if (6-a) above is invoked by a filter (message/junk). I don't know what will happen if (6-b) above occurs on a mail folder simultaneously via message filters. Is there no problem when data is appended to "xxx" simultaneously without a write-lock on the ".msf"?

By the way, the "delete of .msf" at (6-a) resolves the critical problems. However, it means that Bug 392704 will never work if the "delete of .msf" at (6-a) occurs after crash & restart. Would Bug 392704 be better limited to IMAP? For local mail folders, loss of tags is a rare case. I think loss of Junk status is far better than a problem like this bug.
Resolution: WORKSFORME → FIXED
Testing with:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090121 Lightning/1.0pre Shredder/3.0b2pre ID:20090121031845

The index corruption that I'm seeing in my local folders sent msf is still there
The only difference is that the file is now rebuilt automatically after this error message:

Error: Component returned failure code: 0x80550005 [nsIMsgFolder.getMsgDatabase] = <unknown>
Source file: chrome://messenger/content/mailWidgets.xml
Line: 1929

So while the patch in bug 471130 dramatically lessens the impact, I think we should only consider this a "bandaid" to the real problem.

IMO this bug should be re-opened.
@Joe: I don't see the console message you indicate, but there is definitely something still seriously wrong.

I just sent several test messages, using different accounts, and none of the messages showed up in any of the sent folders.

Reopening this.  It needs more investigation.

Joe: Can you reproduce your issue using a new profile?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Ok I now see the Console Message you mentioned.  I wonder if a new bug shouldn't be filed for that, as it doesn't always happen for me.

My comment about not seeing my sent messages turned out to be a false alarm.  I had accidentally clicked that icon that places Shredder in offline mode, but Shredder provided no feedback about the issue when sending the messages.

I will leave this open for now until a decision is reached as to whether to file a new bug for the Error Console message, rather than flip-flopping the status.
(In reply to comment #44)
> @Joe: I don't see the console message you indicate, but there is definitely
> something still seriously wrong.

That error only occurs when the sent folder is found to be inaccessible. I discovered it by leaving the error console open and minimized (I use the console2 extension, if that matters); that way, any error pops up the console.

> I just sent several test messages, using different accounts, and none of the
> messages showed up in any of the sent folders.
> 
> Reopening this.  It needs more investigation.
> 
> Joe: Can you reproduce your issue using a new profile?

New profile would probably not show the problem, but that would be counter-productive. My profile and contents is pretty big, and most probably is contributing to the manifestation of this bug.

I will be setting up a new profile for B2 testing though when that happens.
The console message is just the indication of the out-of-date sent folder. That is by design, and is not likely related to whatever is causing the initial problem.
Ok thanks, Kent.  So I guess there's nothing else left for this bug.  Is this out-of-date indication a reflection of another potential issue?
Status: REOPENED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
The out-of-date indication is a marker of the original bug, so if this continues to be an issue then this bug is not fixed, and needs to be reopened.
I have a similar problem with other folders. I'm not sure if it is this bug or something else. The problem occurs intermittently.

When deleting messages sometimes Shredder stops deleting messages and the trash folder becomes inaccessible.  (shows empty folder) 
Same for moving messages between folders, messages can't be dragged to the destination folder.
When I try to open the destination folder I get the following alert message:

Unable to open the folder DestinationFolderName because it is in use by some other operation. Please wait for that operation to finish and then select the folder again.

The error console shows:

Error: Component returned failure code: 0x80550006 [nsIMsgFolder.msgDatabase] = <unknown>
Source file: chrome://messenger/content/mailWidgets.xml
Line: 1928

The destination folder shows empty

After a program restart everything works again.

I'm using a POP3 account with

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090122 Shredder/3.0b2pre ID:20090122031141
I have been tracking an issue for months, that on about a daily basis my junk folder loses its .msf file, and has to be rebuilt. I notice this because I keep bayes metadata in the .msf file, and it won't persist. This is really the same issue as this bug - and I can confirm it has been happening for a long time. Most people don't see this, as they would just notice that the folder is reindexing, and perhaps think this is normal.

It is possible therefore that the report of this bug is actually not really a regression from bug 437886, but rather that bug simply killed the ability to recover from the error, and therefore made people suddenly notice that their databases needed reindexing. Now that the reindex problem is fixed, we are still noticing the need to reindex.

Since I cannot reliably make any of the bugs appear, I can't really test this theory. But I am determined to solve my earlier bug, and perhaps I will hijack this bug for that purpose.

I am currently instrumenting the db code with PR_LOG messages so that I can see the sequence of events that lead up to the db invalidity. I caught it once today, and learned a little but also that I need to add more log output. Anyway, I think the sequence of events is:

1) Autocompact folders
2) Junk folder gets a db from cache (not sure if this is before, during, or after the folder is compacted)
3) Normal junk open (with leaveInvalid false) detects an invalid database, and deletes the .msf file.

I'll know more about this each time it happens, as I can add additional log messages that are closer to the action.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Sorry I haven't contributed here, as it's mostly happening on a computer I don't use as often. 

My bug report was combined with this one, though I'm not sure it's quite the same. There are no files on my computer called "shredder."

The main change is that now only one email account is affected, now any message listed in the sent mailbox is viewable in the preview window, and now no matter how many times I try to rebuild the summary file, only the emails sent since January 19 are listed. If I open the Sent file, all the emails are still there. If I delete the sent.msf and let it be recreated, nothing changes.

Recently some daily builds were disasters (crash to desktop every time you try to open Seamonkey) and I had to install older Seamonkey builds until the daily was fixed. Could that be related? It could explain all of us having different results, and me having different results from different computers and even different accounts on one computer.
(In reply to comment #52)
[...]
> My bug report was combined with this one, though I'm not sure it's quite the
> same. There are no files on my computer called "shredder."
[...]

Shredder is the code name for nightly and hourly builds of Thunderbird 3.0b2pre. They are extremely similar to the Mail&News component of SeaMonkey 2.0a3pre nightlies and hourlies built at the same date & time.
(In reply to comment #51)
> I have been tracking an issue for months, that on about a daily basis my junk
> folder loses its .msf file, and has to be rebuilt. I notice this because I keep
> bayes metadata in the .msf file, and it won't persist. This is really the same
> issue as this bug - and I can confirm it has been happening for a long time.
> Most people don't see this, as they would just notice that the folder is
> reindexing, and perhaps think this is normal.
> 
> It is possible therefore that the report of this bug is actually not really a
> regression from bug 437886, but rather that bug simply killed the ability to
> recover from the error, and therefore made people suddenly notice that their
> databases needed reindexing. Now that the reindex problem is fixed, we are
> still noticing the need to reindex.
> 

I first noticed frequent re-indexing on 2008-11-17 see:
http://forums.mozillazine.org/viewtopic.php?p=5010275#p5010275
Just prior to that date bug 414038 landed and also a fix for a database crashing issue bug 457751
I can't get any more specific on the regression date.
(In reply to comment #52)
> 
> If I delete the sent.msf and let it be recreated, nothing changes.
> 

Due to the problem in bug 472446, whose fix was checked in on 2009-01-18, in some cases reindexes actually deleted the files. Perhaps that is what you are seeing.

My junk folder msf file died again, and with more logging I can see that it is not related to autocompact as I previously suggested, but simply a case of the program detecting too much delay time between the writing of the folder file and its related .msf file, thus it is declared "invalid" and then deleted. (When declared invalid, the delay was 6064 (Milliseconds?) with a maximum allowed of 4000.) Boring, I can fix that with a preference. (The deletion I believe occurs in GetDBFolderInfoAndDB, which for some reason has parameters set to delete and recreate invalid files without trying to recover their data - but that is for another bug.)

Next step, I'm going to log delay calculations for db opening and closing and see if I can also see some possible issues in the Sent folder, whose corruption for me is too rare to be traceable.

Joe's timing in comment 54 points away from this being a regression from bug 437886. If I can trace this problem to the delta between the db and folder file times, then perhaps I'll have something that I can both fix and use to determine whether it is a regression or not.
Now I can see there is clearly a problem with the setting of the folder time that is used for testing db integrity, and it is affecting at least Sent, Trash, and Junk (all folders that have messages copied into them). The time delta is in seconds, not milliseconds, so the db has to be out of date by over an hour by default before the folder is declared corrupt - a long time. Plus the db has to close and get out of the cache - which also can be unpredictable. That's why it is hard to reproduce.
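
As a rough illustration of the check described above, the following self-contained C++ sketch compares the mailbox file's modification time with the time recorded in the summary, and treats the summary as stale once the gap exceeds the leeway. The function name and parameters are hypothetical and are not taken from the MailNews source; only the 4000 default and the seconds unit come from this discussion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the actual MailNews code.
// Both times are in seconds; leewaySec mirrors the 4000 default mentioned above.
bool SummaryLooksValid(std::int64_t folderMTimeSec,
                       std::int64_t dbRecordedTimeSec,
                       std::int64_t leewaySec = 4000) {
  assert(leewaySec >= 0);
  std::int64_t delta = folderMTimeSec - dbRecordedTimeSec;
  if (delta < 0) delta = -delta;  // only the size of the gap matters here
  return delta <= leewaySec;
}
```

With the default leeway, a gap of 6064 seconds (the value from the log above) fails this check, while anything up to 4000 seconds passes, which is why a stale timestamp can go unnoticed for over an hour.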

Anyway I can see this reliably now in my debug system with a test profile, so I should be able to resolve it. I don't think Bienvenu will object if I officially take this from him now.
Assignee: bienvenu → kent
The story is pretty clear now. At the end of a local copy, we update the mailbox file using:

seekableStream->Seek(nsISeekableStream::NS_SEEK_CUR, 0); // seeking causes a flush, w/o syncing

But if I then look at the time on the folder file before and after that seek, it has not changed from the time before the copy began. To get a changed time, I need to do two things:

1) Replace the seek with a flush
2) Use a clone of the folder file, which is a trick done in nsMailDatabase to get the size of the file, but not its time.

So I'm pretty sure I can fix this now by adding all of that in. But the issue is that the flush was changed to a seek in bug 183560 for performance reasons. I'll need feedback from Bienvenu on the consequences of reversing that, or ideas for alternative strategies.
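
The two steps can be sketched outside of Mozilla code with the C++17 standard library. This is only an analogy under stated assumptions: AppendAndFlush is a hypothetical name, std::ofstream stands in for the mailbox output stream, and a fresh std::filesystem query stands in for the "clone the file object" trick used to avoid reading stale cached file information.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: append one message, flush it, then query the file afresh.
// Returns the on-disk size seen after the flush.
std::uintmax_t AppendAndFlush(const fs::path& mailbox, const std::string& message) {
  std::ofstream out(mailbox, std::ios::binary | std::ios::app);
  if (!out) throw std::runtime_error("cannot open mailbox file");
  out << message;
  out.flush();  // step 1: push buffered bytes to the OS so size and mtime reflect the write
  return fs::file_size(mailbox);  // step 2: fresh query by path, not a cached value
}
```

The same fresh-query idea applies to fs::last_write_time(mailbox) when the value of interest is the modification time rather than the size.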
Is this issue also somehow affecting the spam detection?  None of the spam I am receiving is detected anymore.
No, it is not affecting spam detection.

But I have also seen a decline in spam detection efficiency in the last few days. What this bug does do is affect my ability to analyze spam performance, as I need the metadata in the junk folder (which is clobbered by this bug) to see if it was put there automatically, or manually by me, and what the junk percent values are on the messages.
Wada is suggesting that bug 474717 is related to this. I'm not too sure, but I'm posting this here just in case it's relevant.
(In reply to comment #57)

> So I'm pretty sure I can fix this now by adding all of that in. But the issue
> is that the flush was changed to a seek in bug 183560 for performance reasons.
> I'll need feedback from Bienvenu of the consequences of reversing that, or
> ideas for alternative strategies.

Interesting, thx for doing the digging. One thing that I don't quite understand is that the fix for bug 183560 was made somewhere between three and six years ago. I'll think about alternatives. I'm pretty sure you're on the right track, though. It could have become an issue when combined with the switch to nsIFile from nsFileSpec.
Here's the patch and a test case that fails with the current trunk.

I have not done performance testing to see if this affects the copy speed, as suggested in bug 183560. I could try something if you like.
Attachment #358658 - Flags: superreview?(bienvenu)
Attachment #358658 - Flags: review?(bienvenu)
Severity: critical → normal
Target Milestone: mozilla1.9.1b3 → Thunderbird 3.0b2
Well, I thought I had a clear STR
Send a message from main pop account
Wait a time  (in my case 1 hr 25mins)
Attempt to open the message in the sent folder

Result: The folder is re-indexed.

Anytime prior to that the sent message was accessible without a re-index.
(maybe looking at a cached copy)

I think Kent's mod here will fix the immediate problem, but there may be other problems lurking about.

Testing with:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b2pre) Gecko/20081112 Lightning/1.0pre Shredder/3.0b1pre ID:20081112031406

And looking at the timestamps for the msf files, there were no incidents of re-indexing or other phenomena of the sent folders being inaccessible. 

As to when the timestamps were actually updated, your guess is as good as mine. The only thing I can say is that when testing in the above version, updating seemed to happen when it should (sometimes sent folders were updated when the associated inbox was accessed).

Bug 414038 comes to mind.
(In reply to comment #63)
> Wait a time (in my case 1 hr 25mins)
Set mail.db_timestamp_leeway(default=4000, seconds) to smaller value for ease of test.
(In reply to comment #64)
>
> Set mail.db_timestamp_leeway(default=4000, seconds) to smaller value for ease
> of test.

And "smaller value" can even be zero. I didn't try that approach myself, because once I discovered this was the issue I added logging in the code to show me the values directly. But the test case that I presented fails with an error of 2 seconds.

In addition to this delay though, you also have to get the folder database you are watching to close, as otherwise it gets the database from the cache which does not do the validity test. I was having trouble getting the database to close in some of my manual tests. The only reliable test I could do was to actually restart the program. I would think that a timestamp leeway of zero, along with restarting the program, would make the error show up every time.
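
The interaction between the cache and the validity test can be sketched as follows. This is a simplified model under stated assumptions: SummaryCache, OpenResult, and all member names are hypothetical, and the real database cache is considerably more involved. The point is only that an already-open database is returned without re-checking timestamps.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

enum class OpenResult { FromCache, Validated, NeedsReindex };

// Hypothetical model of a summary-database cache (names are illustrative only).
class SummaryCache {
 public:
  // Record the folder time stored in the on-disk summary, in seconds.
  void SetRecordedTime(const std::string& folder, std::int64_t t) { recorded_[folder] = t; }

  // Closing drops the folder from the cache, so the next Open() re-validates.
  void Close(const std::string& folder) { open_.erase(folder); }

  OpenResult Open(const std::string& folder, std::int64_t folderMTimeSec,
                  std::int64_t leewaySec) {
    if (open_.count(folder)) return OpenResult::FromCache;  // no validity test on this path
    auto it = recorded_.find(folder);
    if (it == recorded_.end()) return OpenResult::NeedsReindex;  // no summary at all
    std::int64_t delta = folderMTimeSec - it->second;
    if (delta < 0) delta = -delta;
    if (delta > leewaySec) return OpenResult::NeedsReindex;
    open_.insert(folder);
    return OpenResult::Validated;
  }

 private:
  std::map<std::string, std::int64_t> recorded_;
  std::set<std::string> open_;
};
```

In this model, a 2-second mismatch with a leeway of zero goes unnoticed while the folder stays cached and only surfaces after Close(), which corresponds to restarting the program in the manual test.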
Kent, can you try this? If we flush on every msg in a multiple message move/copy, that really hinders performance. This makes it so we only flush on the last message, which should help with performance while keeping the db timestamp correct.
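
A minimal sketch of the "only flush on the last message" idea, assuming a hypothetical CountingSink in place of the real mailbox stream (none of these names come from the patch):

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the destination mailbox stream; it counts flushes.
struct CountingSink {
  std::ostringstream data;
  int flushes = 0;
  void Write(const std::string& s) { data << s; }
  void Flush() { data.flush(); ++flushes; }
};

// Copy every message, but pay the flush cost only once, after the last one.
void CopyMessages(const std::vector<std::string>& messages, CountingSink& sink) {
  for (std::size_t i = 0; i < messages.size(); ++i) {
    sink.Write(messages[i]);
    const bool isLast = (i + 1 == messages.size());
    if (isLast) sink.Flush();
  }
}
```

Copying three messages this way flushes once instead of three times, while the destination is still flushed before the copy is considered finished.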

Thx for working on this. I'll be running with the tweaked version of your patch and checking it out...
Comment on attachment 358863 [details] [diff] [review]
only flush on last message

>+        seekableStream->Seek(nsISeekableStream::NS_SEEK_CUR, 0); // seeking causes a flush, w/o syncing

Nit: while there, would you move the comment onto its own line?
I added Bienvenu's approach to my larger patch, which also has a test case plus the file cloning hack for timestamps.
Attachment #358658 - Attachment is obsolete: true
Attachment #358868 - Flags: superreview?(bienvenu)
Attachment #358868 - Flags: review?(bienvenu)
Attachment #358658 - Flags: superreview?(bienvenu)
Attachment #358658 - Flags: review?(bienvenu)
I'm having some similar problems that may be related:
Sometimes when I try to delete items from my inbox or move them to other folders, there is no response. They just stay in the inbox. I may be able to initially move some items, but once it starts, I can't move anything out of the inbox until I restart Seamonkey.

Also, I occasionally send email and the progress bar hangs, even though the mail has been sent successfully. So I can literally get a response before Seamonkey has finished sending the email. The only way to get rid of the progress bar is to completely close Seamonkey. That has been going on for many months longer than the rest of the problems.
(In reply to comment #69)
> I'm having some similar problems that may be related:
> Sometimes when I try to delete items from my inbox or move them to other
> folders, there is no response. They just stay in the inbox. I may be able to
> initially move some items, but once it starts, I can't move anything out of the
> inbox until I restart Seamonkey.
> 

I'm seeing the same problem with Shredder.
(In reply to comment #69)
> but once it starts, I can't move anything out of the
> inbox until I restart Seamonkey.

My first thoughts are that what you describe is not likely to be this bug, because this bug affects the destination folder, not the source folder (at least according to my analysis.) But it is hard to know precisely.

The best thing that you could do would be to file a new bug (or search for an existing bug), and then do anything possible to generate a reproducible test case for the bug. Reproducible bugs can be solved.
The problem email deletion bug appears to be Bug 343196.
Blocks: 465794
(In reply to comment #69)
> Sometimes when I try to delete items from my inbox or move them to other
> folders, there is no response. They just stay in the inbox. I may be able to
> initially move some items, but once it starts, I can't move anything out of 
> inbox until I restart Seamonkey.

I can confirm this, and I have noticed that right after a failed deletion in the Inbox (or another folder), the Junk status resets for most of the mail in the Junk folder. So maybe the 'deletion problem' is somehow related to the current bug.
(In reply to comment #59)
> But I have also seen a decline in spam detection efficiency in the last few
> days.

Can this be related to https://bugzilla.mozilla.org/show_bug.cgi?id=471885 ?
Comment on attachment 358868 [details] [diff] [review]
flush-on-last-message plus tests and clones

fix checked in with a couple minor tweaks (temp assignment to rv not needed in a couple places). Thx for fixing this, Kent!
Attachment #358868 - Flags: superreview?(bienvenu)
Attachment #358868 - Flags: superreview+
Attachment #358868 - Flags: review?(bienvenu)
Attachment #358868 - Flags: review+
fix checked in.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 15 years ago
Resolution: --- → FIXED
I've just backed this out: in today's nightly I was crashing whenever filters were copying mails to local folders, or when I was manually copying to local folders.

Crash stack:

http://crash-stats.mozilla.com/report/index/eaaf1e92-6bf5-4598-b551-226c92090128
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
fix upcoming...we're not setting seekablestream in one code path now...
Standard8, can you verify that this fixes the crash? It does for me, and restores the setting of seekablestream to the way it was before the patch.
Attachment #358868 - Attachment is obsolete: true
Attachment #359296 - Flags: superreview?(bugzilla)
Attachment #359296 - Flags: review?(bugzilla)
Could this bug be what is causing the crashes I'm seeing on OSX in Bug 475205? The reason I suspect this work is that the crash happens repeatably AFTER fetching all mail, and the mail is fetched successfully (as confirmed by going back to a working version of TB), so I suspect it's in indexing, junk filtering, or message filters.

If it's not bug 471682, is there anyone who could check what patches were committed to the nightlies between the last known working version
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090123 Shredder/3.0b2pre
and the first known buggy version
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090125 Shredder/3.0b2pre
so that it can be reported on that bug?
I've narrowed it down further: the bug is in 20090124, so it's one of the patches that went into that version.
No, I doubt it very much - this fix was backed out anyway, and the stack traces aren't remotely similar. Nightlies from 1/28 and 1/29 shouldn't have this patch in them, since I believe Standard8 respun 1/28.
Comment on attachment 359296 [details] [diff] [review]
fix crash in last patch

r/sr=me on the diffs between this patch and the last.
Attachment #359296 - Flags: superreview?(bugzilla)
Attachment #359296 - Flags: superreview+
Attachment #359296 - Flags: review?(bugzilla)
Attachment #359296 - Flags: review+
(In reply to comment #85)
> (From update of attachment 359296 [details] [diff] [review])
> r/sr=me on the diffs between this patch and the last.

And I forgot to say, is it possible to extend the unit test (or add a new one) for this crash case?
I think we would add this to an existing imap test case, if possible...
Flags: in-testsuite-
fix re-landed.
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
Flags: in-testsuite-
Flags: blocking-seamonkey2.0a3?
Is this bug supposed to have been fixed? I still can't get my sent messages before the date this happened. I even copied the entire sent file as a new file in the ...\Mail\ folder, and it still won't index before that message. Any suggestions of how I can fix this?
(In reply to comment #90)
This bug is about needing to reindex the folder. What you are describing isn't this bug - it's bug 472446, which combined with this bug made a very nasty combination. See the discussion there - but the short answer is that your sent messages were probably deleted and irrecoverable.
@Eileen: as Kent says, the emails are probably irrecoverable; however, you could try the following steps (at link below) to see if they are recoverable or not.  The instructions were for the inbox, but just substitute the word "sent" in place of "inbox" and the instructions are the same:

http://forums.mozillazine.org/viewtopic.php?p=5531385#p5531385
Whiteboard: [crash case needs testcase]
The main changes in this bug have a unit test, but we could do with a unit test (or ensuring we have one) to cover the crash mentioned in comment 79:

> I've just backed this out: in today's nightly I was crashing whenever filters
> were copying mails to local folders, or when I was manually copying to local
> folders.
> 
> Crash stack:
> 
> http://crash-stats.mozilla.com/report/index/eaaf1e92-6bf5-4598-b551-226c92090128
Whiteboard: [crash case needs testcase] → [crash case needs testcase - see comment 93]
Flags: in-testsuite?