Bug 730947 - local (POP3) move filter ends up corrupting messages, even after fix of bug 736539
Status: RESOLVED FIXED
[By bug 736539, original problem of c...
Keywords: dataloss, dogfood, qawanted, regression
Product: MailNews Core
Classification: Components
Component: Backend
Version: Trunk
Platform: All All
Importance: -- blocker
Target Milestone: Thunderbird 14.0
Assigned To: David :Bienvenu
Mentors:
Duplicates: 740374 744706
Depends on:
Blocks:
Reported: 2012-02-27 13:03 PST by Robert Kaiser
Modified: 2015-10-07 18:38 PDT (History)
CC List: 24 users
bugzillamozillaorg_serge_20140323: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
zip (mailbox+msf) of corrupted folder (43.65 KB, application/octet-stream)
2012-02-27 13:03 PST, Robert Kaiser
no flags
screen shot, showing good headers and bad body (201.73 KB, image/png)
2012-02-27 13:08 PST, Robert Kaiser
no flags
KaiRo's filter rules file (12.15 KB, text/plain)
2012-03-02 05:43 PST, Robert Kaiser
no flags
Mail folder file with corrupted mail data after filter move, delete, compact, then copy folder n (12.68 KB, text/plain)
2012-04-08 02:34 PDT, WADA
no flags
fix compact handling of msg offset - checked in (787 bytes, patch)
2012-04-09 13:34 PDT, David :Bienvenu
mconley: review+
mozilla: approval‑comm‑aurora+
mozilla: approval‑comm‑beta+
add test for offsets to compact test - checked in (2.36 KB, patch)
2012-04-10 06:10 PDT, David :Bienvenu
mconley: review+

Description Robert Kaiser 2012-02-27 13:03:52 PST
Created attachment 601025
zip (mailbox+msf) of corrupted folder

I have a long list of filters set up to move messages into different folders under my POP3 inbox; one of those moves all mail from bugzilla-daemon@mozilla.org to a "BugMail" folder.

Somewhere in Mozilla 12 trunk (SeaMonkey 2.9a1) I started noticing that messages in that folder had partial headers from another message appended and a lot of messages appearing to have a body belonging to another message. This way, I couldn't read the bugmail correctly any more, and after repairing and compacting the folder, I lost a ton of email that I had received, thankfully mostly bugmail, so at least the data is somewhere, even though I might not have read important work-related information.

I switched back to a build from the Mozilla 11 tree I still had and used it without problems until trying again in the last days with Mozilla 13 trunk (SeaMonkey 2.10a1).

Today I had this same problem again. I'm attaching the resulting corrupted BugMail folder as a zip with both the mailbox and MSF files.

Interestingly, it looks like the thread pane somehow correctly gets the subject, etc. info (I guess from the MSF), even when the body / message source is corrupt and doesn't even contain that info correctly any more (so the mailbox file is corrupted).
Comment 1 Robert Kaiser 2012-02-27 13:08:32 PST
Created attachment 601028
screen shot, showing good headers and bad body

Here's a screen shot (don't worry about my theme choice, it's the data that is important), showing that the thread pane shows perfectly good data, but the body of the selected message is busted (doesn't match the selected header with anything, actually, other than the sender, which is probably just because all messages there have the same one).
Comment 2 Robert Kaiser 2012-02-27 13:11:39 PST
CCing bienvenu as I know he looked into those filters recently, and setting some flags to get this tracked for the relevant releases as it looks pretty bad, esp. if it's not just in my case (and as I said, a SeaMonkey 2.8 / Mozilla 11 build works flawlessly).

Sorry, but given the dataloss and that it doesn't always show immediately apparently, I'm reluctant to search for detailed regression ranges...
Comment 3 Justin Wood (:Callek) 2012-02-27 13:18:43 PST
I want to track this for SeaMonkey -- it is very very bad UX. But I know the fix likely resides in the Thunderbird team's hands.
Comment 4 David :Bienvenu 2012-02-27 13:24:27 PST
is all that your filters do move mail? do you use the global inbox or a separate inbox per server? You're the only person reporting any issues, so I suspect it has to do with your setup and/or filters.
Comment 5 David :Bienvenu 2012-02-27 13:36:04 PST
also, do you have anti-virus quarantining turned on (tools, options/preferences, security, anti-virus, checkbox)

are all your filters move filters, or do you have any copy filters?
Comment 6 Robert Kaiser 2012-02-27 13:43:09 PST
(In reply to David :Bienvenu from comment #4)
> is all that your filters do move mail? do you use the global inbox or a
> separate inbox per server? You're the only person reporting any issues, so I
> suspect it has to do with your setup and/or filters.

I have a ton of filters, and almost all of them exclusively move mail. The bugmail one is pretty high up in the list. Very low/late in the list there are a couple that also mark some mail as junk or delete email from certain recipients, but the emails moved to this folder never get to those, of course.
I'm using separate inbox per server, this inbox and bugmail folder are only ever hit by the POP3/local stuff.

My setup is in itself pretty old, from ancient suite times, but it went through a SeaMonkey 1.x/2.0 migration, which should have cleaned up quite some cruft in the 1.9.1 timeframe.
I'm a heavy user and probably not so many people are hitting trunk or aurora channel that heavily with POP3 filters, so not sure if I'm alone. It's suspicious though that an 11 build fares fine all the time, and as soon as I go to 12 or 13, I hit this at least after a bit.

(In reply to David :Bienvenu from comment #5)
> also, do you have anti-virus quarantining turned on (tools,
> options/preferences, security, anti-virus, checkbox)

I've had it on now when this happened, but the same happened when it was off earlier on when I ran SM2.9/Moz12 trunk and had this issue. I tried turning it on when I heard it had an influence on another bug, but it didn't fix the problem.

> are all your filters move filters, or do you have any copy filters?

No copy filters.
Comment 7 David :Bienvenu 2012-02-27 14:18:20 PST
are any of your filters message body filters?
Comment 8 Robert Kaiser 2012-02-28 05:08:56 PST
(In reply to David :Bienvenu from comment #7)
> are any of your filters message body filters?

Yes, I have a couple of those low on the list. I can attach the msgFilterRules.dat from this account as well if that helps you.
Comment 9 David :Bienvenu 2012-02-28 07:23:57 PST
What might help is to turn on filter logging and attach or e-mail me a filter log when it happens again to see if it's possible that the filters with body criteria might have been evaluated right before the corruption happened. I realize you don't want to run newer builds, but that's what would be helpful. I assume since you're running on linux there's no virus checker or file locking by some external app like backup in play.
Comment 10 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-03-02 03:23:31 PST
I have hit this same bug at least 4 or 5 times in the last 2 months on Earlybird. The disk starts churning while I'm reading or deleting emails and suddenly all bodies are replaced with a single one, quite annoying when you lose 400 bugmails :) Though while trying to figure out the breakage, I removed all my filters, so I can't help with the log.  I will make new filters and enable logging. I'm sure that among the old filters I had one doing 6 or 7 ORed checks on the body to remove bugmails for automatically closed bugs.
Comment 11 Wayne Mery (:wsmwk, NI for questions) 2012-03-02 04:46:46 PST
Robert, were you getting updates every day?
Comment 12 Robert Kaiser 2012-03-02 05:41:33 PST
(In reply to Wayne Mery (:wsmwk) from comment #11)
> Robert, were you getting updates every day?

Not sure what you mean, but I'm building daily from current "trunk" source myself and using those builds (or I have done so, as now I'm back on a Mozilla-11-based SeaMonkey 2.8 build that doesn't expose that problem).

Unfortunately, the only thing I remember is that before Christmas I didn't have that problem on those trunk builds, and then in January I had it for the first time. The Mozilla 11 train is OK, the bug exists in the Mozilla 12 train, so the regression must be somewhere in the time when 12 was trunk - which is unfortunately quite some time.
Comment 13 Robert Kaiser 2012-03-02 05:43:31 PST
Created attachment 602328
KaiRo's filter rules file

Just for bienvenu's debugging efforts, here's my filter rules file.
Comment 14 Robert Kaiser 2012-03-06 10:59:33 PST
We're uplifting 12 to Beta in a week, which is the train I've first seen this issue with, and I'm also not the only one seeing this apparently.

bienvenu, any ideas on this problem? Do my attached files give you any clue as to what could be happening?
Comment 15 David :Bienvenu 2012-03-06 12:14:10 PST
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #14)
> We're uplifting 12 to Beta in a week, which is the train I've first seen
> this issue with, and I'm also not the only one seeing this apparently.
> 
> bienvenu, any ideas on this problem? Do my attached files give you any clue
> as to what could be happening?

I haven't been able to reproduce it. The other person who was having issues had them go away when he changed his anti-virus settings.
Comment 16 Dan Kies 2012-03-07 15:41:08 PST
I am seeing the same problem on SM trunk (going back to what is now SM 2.9).  Messages are corrupted by moving messages either by a user defined filter or by manually dragging and dropping messages from one message folder to another.
Comment 17 H. Hofer 2012-03-19 10:10:51 PDT
Could this be related to bug 736539 ?
Comment 18 David :Bienvenu 2012-03-19 11:18:10 PDT
Perhaps, if the information in this bug is incomplete. Basically, you have to do two moves of the same messages to cause the corruption in bug 736539 - there's no indication here that two moves are required, and I think Robert said that a pop3 incoming filter moved messages into a folder, and the messages were corrupt in that folder, which is *not* bug 736539. But I would love to be wrong, and have this bug be the same as bug 736539.
Comment 19 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-03-20 03:30:59 PDT
fwiw, I see this issue again (just reproduced in yesterday's Earlybird 20120319030032), I will let you know if bug 736539 makes any difference from next build on
Comment 20 doug2 2012-03-27 07:30:27 PDT
Seeing corruption in inbox last two days, 3/26-27, but not before.  Lost mail.  Tried normal fixes (compress, delete inbox index) but no success.  One filter for junk (non-standard).
Comment 21 David :Bienvenu 2012-03-27 07:38:19 PDT
(In reply to doug2 from comment #20)
> Seeing corruption in inbox last two days, 3/26-27, but not before.  Lost
> mail.  Tried normal fixes (compress, delete inbox index) but no success. 
> One filter for junk (non-standard).

What build are you running? What does "non-standard" junk filter mean?
Comment 22 doug2 2012-03-27 07:59:12 PDT
Running 13.0a2 now, but updated after the corruption problem this AM.  
Have 2 mail filters created through interface, but not the "standard" profile junk filter.
Have turned off both filters because of problem above.
Comment 23 David :Bienvenu 2012-03-27 08:05:01 PDT
(In reply to doug2 from comment #22)
> Running 13.0a2 now, but updated after the corruption problem this AM.  
> Have 2 mail filters created through interface, but not the "standard"
> profile junk filter.
> Have turned off both filters because of problem above.

were you running nightly builds before?
Comment 24 doug2 2012-03-27 08:16:14 PDT
Yes.  I have accepted every nightly build update.  This morning, I did not see a notice until I started having problems and then checked the version and saw an update waiting.
Comment 25 Honza Bambas (:mayhemer) 2012-03-27 13:47:44 PDT
Just a stupid question: when the breakage happens, does Repair Folder on Inbox fix it or is the damage irreversible?
Comment 26 David :Bienvenu 2012-03-27 13:52:39 PDT
(In reply to Honza Bambas (:mayhemer) from comment #25)
> Just a stupid question: when the breakage happens, does Repair Folder on
> Inbox fix it or is the damage irreversible?

As I said on IRC, if the folder itself is corrupted, then no, it does not. If the .msf file has either the wrong offset into the folder, or the wrong message size for one or more messages, then repairing the folder (which only really repairs the db) will fix the db.
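
The distinction above (bad offsets or sizes in the index versus a damaged folder file) can be illustrated with a minimal sketch. It assumes a hypothetical in-memory index of (offset, size) pairs standing in for what the .msf stores; the real .msf format is not parsed here, and `check_offsets` is an illustrative name, not a Thunderbird function.

```python
def check_offsets(mbox_bytes, index):
    """Report index entries that do not point at a message start.

    index is a list of (offset, size) pairs; this is a hypothetical
    stand-in for the per-message data kept in the .msf database.
    """
    problems = []
    for n, (offset, size) in enumerate(index):
        if offset < 0 or offset + size > len(mbox_bytes):
            problems.append((n, "range outside folder file"))
        elif not mbox_bytes.startswith(b"From ", offset):
            # In an mbox file every message begins with a "From " separator line.
            problems.append((n, "offset does not point at a 'From ' separator"))
    return problems


# Two tiny sample messages (made-up content, for illustration only).
msg_a = b"From - Mon Feb 27 13:03:52 2012\nSubject: a\n\nbody a\n\n"
msg_b = b"From - Mon Feb 27 13:08:32 2012\nSubject: b\n\nbody b\n\n"
folder = msg_a + msg_b

good_index = [(0, len(msg_a)), (len(msg_a), len(msg_b))]
bad_index = [(0, len(msg_a)), (1, len(msg_b))]  # second offset is wrong

print(check_offsets(folder, good_index))
print(check_offsets(folder, bad_index))
```

If only the index is wrong, rebuilding it from the folder file is enough. If the folder file itself no longer has intact messages at any offset, no index rebuild can bring them back.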
Comment 27 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-03-27 14:57:13 PDT
all my corruptions have been irreversible, neither repair nor compact nor both. I just had to delete all the mails and proceed.
Comment 28 David :Bienvenu 2012-03-27 14:59:54 PDT
(In reply to Marco Bonardo [:mak] from comment #27)
> all my corruptions have been irreversible, neither repair nor compact nor
> both. I just had to delete all the mails and proceed.

Are they still happening? Any idea what steps lead to the corruption? Is it possible that auto-compact is happening right before the corruption? Do you have the ask me before compact pref set?
Comment 29 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-03-27 15:21:54 PDT
(In reply to David :Bienvenu from comment #28)
> Are they still happening?
> Any idea what steps lead to the corruption?

So, I don't see that kind of corruption (replaced bodies) from when I updated to the latest Earlybird, but it was intermittent so I can't tell for sure.
Something strange happened some days ago, that looks similar. I was going through bugmail, I usually get some hundreds of those, read and delete from the bottom till I've gone through the first one (at that point the mailbox is empty).  In the past this was causing the intermittent corruption.
Instead this time, I still had about 80 bugmails above me, and all of them suddenly disappeared.  They were not in junk or bin, and neither repair nor compact helped recover them.

> Is it
> possible that auto-compact is happening right before the corruption?

Right before the corruption I always hear the disk churning. No idea if that's due to compacting. I thought it was the AV and disabled the option that allows it to quarantine mails, but that didn't make a difference.

> Do you
> have the ask me before compact pref set?

If you mean mail.purge.ask, it's set to false.
Comment 30 doug2 2012-03-27 17:38:29 PDT
Running 13.0a2, all POP
Just started up EB and received 22 pcs mail.  All showed in list.  Read 1st one, went to 2nd one ('f', I think) and all unread disappeared. For a short time, the unread count in the side bar showed 22, but then that count disappeared.  Tried deleting the index (Inbox.msf).  Restarted EB - no new mail.  They do not appear anywhere in any file as far as I can tell.  No filters enabled.
Comment 31 doug2 2012-03-27 18:23:45 PDT
To clarify, I had two mail filters on Inbox, but I had disabled them earlier in the day when Inbox corruption began.  A really bad day here.
Comment 32 WADA 2012-03-27 18:56:11 PDT
(In reply to doug2 from comment #30)
> Running 13.0a2, all POP
> Just started up EB and received 22 pcs mail.  All showed in list.  Read 1st
> one, went to 2nd one ('f', I think) and all unread disappeared. For a short
> time, the unread count in the side bar showed 22, but then that count disappeared.

What kind of "corruption of mail data" happened in this test?
Can you show us some *corrupted* mail data by attaching mail data saved as .eml file?

> Tried deleting the index (Inbox.msf).
> Restarted EB - no new mail.
> They do not appear anywhere in any file as far as I can tell.
> No filters enabled.

What do you mean by "anywhere in any file"?
Is there no mail data in the file named Inbox? (file size of Inbox = 0)
Or does mail data exist in the file named Inbox, but the messages have X-Mozilla-Status: 0008 or 0009? (Expunge bit = On, marked as deleted mail)
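
A quick way to answer the last question is to scan the folder file for X-Mozilla-Status headers and test the expunge bit. The following is only a sketch: the 0x0008 value comes from the comment above, while the name `count_expunged` and the sample data are made up for illustration. It also matches the header anywhere in the file, so a header-like line inside a message body would be counted too.

```python
import re

EXPUNGED = 0x0008  # the "Expunge bit" named in the comment above

# Matches "X-Mozilla-Status: hhhh" but not "X-Mozilla-Status2: ...".
STATUS_RE = re.compile(rb"^X-Mozilla-Status: ([0-9A-Fa-f]{4})\s*$", re.MULTILINE)


def status_values(mbox_bytes):
    """Return every X-Mozilla-Status value found, as integers."""
    return [int(m.group(1), 16) for m in STATUS_RE.finditer(mbox_bytes)]


def count_expunged(mbox_bytes):
    """Count messages whose status has the expunge bit set."""
    return sum(1 for value in status_values(mbox_bytes) if value & EXPUNGED)


# Made-up sample folder content with three messages.
sample = (
    b"From - x\nX-Mozilla-Status: 0009\nX-Mozilla-Status2: 00000000\n"
    b"Subject: a\n\nbody\n\n"
    b"From - y\nX-Mozilla-Status: 0001\nSubject: b\n\nbody\n\n"
    b"From - z\nX-Mozilla-Status: 0008\nSubject: c\n\nbody\n\n"
)

print(status_values(sample))
print(count_expunged(sample))
```

Run it against a copy of the folder file rather than the live one, and only while the mail client is closed.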
Comment 33 doug2 2012-03-28 05:09:14 PDT
I searched all folders for any mail dated after noon on 3/27 and found none.  All 22 pcs of new mail disappeared.  They are not in the Inbox file or trash or any folder.

Also, some mail received before noon (marked read) disappeared and was not found.

No "corruption" in this test.  Corruption to mail in EB on 3/26 and early 3/27.

I will search for some examples later.
Comment 34 David :Bienvenu 2012-03-28 06:54:21 PDT
If you leave messages on the server, you can always delete popstate.dat to cause a redownload of all the messages on the server.

You haven't changed mail.server.default.dup_action using the config editor (tools, options, advanced, general, config editor), have you? The default is 0.
Comment 35 doug2 2012-03-28 07:25:32 PDT
Mail.server.default.dup_action = 0

I've been going through the Inbox and Trash files with a text editor looking for the missing mail.  Found a few from 3/27 with X-Mozilla-Status: 0009 that do not show up in Trash using either search or visual review through T-Bird.  Tried changing the 0009 to 0001, but it still does not show up in T-Bird.
Comment 36 David :Bienvenu 2012-03-28 07:28:57 PDT
(In reply to doug2 from comment #35)
> Mail.server.default.dup_action = 0
> 
> I've been going through the Inbox and Trash files with a text editor looking
> for the missing mail.  Found a few from 3/27 with X-Mozilla-Status: 0009
> that do not show up in Trash using either search or visual review through
> T-Bird.  Tried changing the 0009 to 0001, but it still does not show up in
> T-Bird.

if you do this while thunderbird is not running, and change the size of the mail folder by even one byte (e.g., add a space somewhere in a message), when you restart, the deleted messages will get shown. Or you could just repair the folder.

So the messages were marked as deleted? And you have no mail filters running? Only the built-in spam filter should delete messages that way, then (or you as the user).
Comment 37 doug2 2012-03-28 07:51:37 PDT
Did a repair and the msg I was using as a test case does now show up.  Thanks
However, the folder (Trash) now shows 498 messages unread and a quick visual check showed them to be duplicates of messages already there.  I marked the folder read.

The test msg above was one that I had read and may have deleted, but was looking for. 
I "repaired" several other folders (but not all) and still have not found the mail from 3/27 PM.
Comment 38 Robert Roessler 2012-04-01 11:59:24 PDT
Not to muddy the waters, but I am seeing corruption of my mbox file.

I am on Windows 7 x64, formerly running SM 2.8, just switched to SM 2.9 b1/b2.

As soon as this bug popped up on the radar, I had removed the last action (moving the identified and marked messages to the trash) from my various filters - even though the platform is "x86_64 Linux".

I saw no problems at that time.  When I saw the above claims about bug 736539, I re-added the move actions... and also moved to SM 2.9 b1.

I then started seeing corruption of my actual mbox file, which has persisted into SM 2.9 b2.

By "corruption of the actual mbox file", I mean lost messages, fragmented messages, and merged messages - all of which are not affected by MSF rebuilding, and are actually "there" (or not there in the "missing" case) in the mbox text.

I think platform should be adjusted to "All All" at this time.
Comment 39 David :Bienvenu 2012-04-01 15:25:30 PDT
(In reply to Robert Roessler from comment #38)
> Not to muddy the waters, but I am seeing corruption of my mbox file.
> 
> I am on Windows 7 x64, formerly running SM 2.8, just switched to SM 2.9
> b1/b2.
> 
> As soon as this bug popped up on the radar, I had removed the last action
> (moving the identified and marked messages to the trash) from my various
> filters - even though the platform is "x86_64 Linux".

Robert, what were the other actions in your filters?  Were they marking messages read, or as junk, or "starring" them? And do you have message quarantining turned on (tools, options, security, anti-virus, quarantining checkbox)?
Comment 40 Robert Roessler 2012-04-01 15:39:56 PDT
These are just several variants on spam catchers - each has a different set of patterns, but they all Set Junk Status -> Junk, and then each sets a different message tag [color], so I can see at a glance which of the spam patterns was matched.

These 3 filters currently stop there while the corruption problem exists, but they all *had* a final Move to Trash step.

I use no "message quarantining" or A/V tools or settings.
Comment 41 WADA 2012-04-01 15:46:09 PDT
(In reply to Robert Roessler from comment #38)
> I am on Windows 7 x64, formerly running SM 2.8, just switched to SM 2.9 b1/b2.

Patch landing on Sm's beta may be different from Tb.
Do you see the problem in the latest Sm trunk? (As of today that is Sm 2.11a1; that bug's patch landed on 3/19.)
Comment 42 Robert Roessler 2012-04-01 16:55:34 PDT
First off, I disagree strongly with your reversion of the platform association of this bug... it IS a data loss failure that IS happening on the released and beta versions of SeaMonkey ON Win 7, and the misleading settings of "x86_64/Linux" can induce a false sense of security.

Responding to your question, I quit running nightly builds when the new release scheme was adopted.

When I can determine (although I am not sure how at this point) that the possible fix for this data loss problem has been included in a SM 2.x beta build, I will be pleased to try it at that time.
Comment 43 David :Bienvenu 2012-04-01 17:07:09 PDT
Robert, it's quite possible that bugzilla accidentally reverted the platform changes I made - it does it quite frequently.
Comment 44 Robert Roessler 2012-04-01 17:14:34 PDT
Thanks for the clarification and re-setting the platform [again]. :)

(I may have sufficient privileges here to reset it myself - not sure, but was reluctant to kick off a "platform war".)
Comment 45 David :Bienvenu 2012-04-01 17:47:20 PDT
> I use no "message quarantining" or A/V tools or settings.

Thunderbird's message quarantining is on by default, in the options/preferences ui that I mentioned before - security tab, anti-virus sub-tab, quarantining checkbox
Comment 46 Robert Roessler 2012-04-01 17:56:40 PDT
Thanks, David - *Thunderbird's* message quarantining may well be "on" by default... but since I use SM, I can neither comment on nor adjust these settings. ;)
Comment 47 Wayne Mery (:wsmwk, NI for questions) 2012-04-01 19:01:15 PDT
(In reply to Robert Roessler from comment #46)
> Thanks, David - *Thunderbird's* message quarantining may well be "on" by
> default... but since I use SM, I can neither comment on nor adjust these
> settings. ;)

au contraire

preferences | mail | junk | suspect mail | allow ... scan
Comment 48 WADA 2012-04-01 19:14:46 PDT
As for "simple move by filter" case caused by bug 736539, problem is already fixed.
- No quarantine option,
- Filter : If subject doesn't contain ???!!!, Move to FolderX (move all mails)
- Sm trunk 3/01 build, Tb 3/01 build : problem occurs.
  - "Order Received" column value after move by filter :
      first moved mail : offset of the mail
      N-th moved mail  : offset of first mail + (N-1) i.e. incremented by 1
    At this stage, mail is shown correctly, View/Source shows correct data
  - Delete some mails, and execute Compact
    => mail data in local mail folder file is corrupted.
       offset value is changed from "incremented by 1 value" to ordinal value. 
- Sm trunk 4/01 build, Tb trunk : problem doesn't occur. 
  - "Order Received" column value after move by filter :
      first moved mail : offset of the mail
      N-th moved mail  : offset of first mail + (N-1) i.e. incremented by 1
    At this stage, mail is shown correctly, View/Source shows correct data
  - Delete some mails, and execute Compact
    => Mail data in local mail folder file is not corrupted.
       Mail is shown correctly, View/Source shows correct data.
  Internally used offset value looks correct after fix of bug 736539.
  Remaining problem is "Order Received" column value only.
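
The 3/01 behaviour described above can be modelled with a small simulation. It is a sketch under a stated assumption: that Compact copied each kept message using the stored "first offset + (N-1)" value, which is what the observations suggest. The helpers `build_folder` and `compact` are hypothetical and are not Thunderbird code.

```python
def build_folder(messages):
    """Concatenate messages and return (folder_bytes, index of (offset, size))."""
    folder = b""
    index = []
    for msg in messages:
        index.append((len(folder), len(msg)))
        folder += msg
    return folder, index


def compact(folder, index, keep):
    """Write the kept messages to a new folder, trusting the index offsets."""
    out = b""
    for n in keep:
        offset, size = index[n]
        out += folder[offset:offset + size]
    return out


# Four made-up messages of equal length.
messages = [b"From - %d\nSubject: msg %d\n\nbody %d\n\n" % (i, i, i) for i in range(4)]
folder, good_index = build_folder(messages)

# Assumption: the faulty index holds "offset of first mail + (N-1)".
first = good_index[0][0]
bad_index = [(first + n, size) for n, (_, size) in enumerate(good_index)]

keep = [0, 2, 3]  # message 1 was deleted before Compact
good_result = compact(folder, good_index, keep)
bad_result = compact(folder, bad_index, keep)

print(good_result == messages[0] + messages[2] + messages[3])
print(bad_result)
```

With the correct index the compacted folder holds exactly the kept messages. With the incremented offsets every copy after the first starts a few bytes into message 0, so the new folder contains shifted fragments of the wrong message, which matches the "corrupted after Compact" observation.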

When "Copy then Move in single filter rule" case, following problem occurred in both 3/01 build and 4/01 build.
- No quarantine option,
- Filter : If subject doesn't contain a string,
           Copy to FolderX, then Move to FolderY
- Sm trunk 3/01 build & 4/01 build, Tb 3/01 build & 4/01 build : problem occurs.
  - Copy is executed as expected.
  - When multiple mails hits, only last mail is moved from Inbox to FolderY.

This is almost the same phenomenon as bug 448337, which is already closed as FIXED (but not marked as VERIFIED yet...).
IIRC, in the duplication test of that bug, mail data was not written to FolderY even though the mail was shown in the thread pane of FolderY, and the mail data in Inbox was not marked as deleted (the expunged bit in X-Mozilla-Status: was not set). So, if Compact was executed on FolderY, the mail disappeared. Likewise, if Compact is executed after other mails are copied/moved to the folder, mail data may be broken and wrong data may be shown.

If this kind of problem still remains or occurs again, data in the mail folder file may be corrupted. And, if auto-compact is enabled but "prompt before start of auto-compact" is disabled, auto-compact will be silently invoked sooner or later when "move mails by message filter" is executed.
Please do the following in order to see when mail data is corrupted (upon filter move, upon compact, or upon something else).
- Set mail.purge.ask = true and restart Tb (a restart of Tb is mandatory after the change)
- Reply Cancel or OK to the prompt before the start of auto-compact
- Never disable "prompt before start of auto-compact" again at the prompt
And please be sure to check with a build to which the patch of bug 736539 is applied.
Comment 49 WADA 2012-04-01 20:20:10 PDT
(In reply to Robert Roessler from comment #42)
> First off, I disagree strongly with your reversion of the platform association of this bug...

It was not intentional. When I tried to post the comment, a mid-air collision happened and I did Back & Reload. The above change is also a result of my Reload.
Comment 50 WADA 2012-04-01 21:04:43 PDT
FYI.
I checked bug 448337 again with Tb 11.0.1 on Win-XP, and the fix of that bug was verified again with Tb 11.0.1. A regression of bug 448337 appears to have happened in trunk. In a recent nightly, "no tag data in X-Mozilla-Keys: header" was seen when an "Add tag" action was added to the filter rule. This is also a regression.
Comment 51 WADA 2012-04-01 21:30:03 PDT
FYI.
Executed same tests as comment #48 with Sm 2.9b2 on Win-XP.
> http://ftp.mozilla.org/pub/mozilla.org/seamonkey/nightly/2.9b2-candidates/build1/win32/en-US/seamonkey-2.9b2.zip
> Last modified : 29-Mar-2012 02:34 	
Same result as comment #48 for the Sm trunk 4/01 build and Tb trunk 4/01 build, except that with Sm 2.9b2 nothing was shown in the thread pane of the move target folder in the "Copy&Move by single filter rule" case.
Comment 52 WADA 2012-04-01 21:33:29 PDT
To David :Bienvenu, should I open separate bug for following?
(a) "Increment by 1" of "Order Received" column value by filter move.
(b) Regression of bug 448337 in trunk.
Comment 53 Robert Roessler 2012-04-01 23:59:48 PDT
(In reply to Wayne Mery (:wsmwk) from comment #47)
> (In reply to Robert Roessler from comment #46)
> > Thanks, David - *Thunderbird's* message quarantining may well be "on" by
> > default... but since I use SM, I can neither comment on nor adjust these
> > settings. ;)
> 
> au contraire
> 
> preferences | mail | junk | suspect mail | allow ... scan

Ummm, there is indeed an option "Allow anti-virus clients to scan incoming messages more easily" in the SM "Junk & Suspect Mail" panel - but that is as far as it goes, and probably doesn't qualify as a "quarantining" setting (unless it is really badly mislabeled).

Plus, it is off anyway. :)
Comment 54 Robert Kaiser 2012-04-02 03:25:50 PDT
WADA:
We who saw this corruption never saw it in Thunderbird 11.0* or SeaMonkey 2.8*; the bug was introduced into Thunderbird 12 and SeaMonkey 2.9 when they were on trunk, and the bug is now on Beta. It sounds to me like your analysis is for a different bug than this one.

(In reply to Robert Roessler from comment #53)
> Ummm, there is indeed an option "Allow anti-virus clients to scan incoming
> messages more easily" in the SM "Junk & Suspect Mail" panel - but that is as
> far as it goes, and probably doesn't qualify as a "quarantining" setting
> (unless it is really badly mislabeled).

It does. This option does store messages as single files for the time when anti-virus software is first scanning it and therefore allows them to be quarantined by that software without damaging other messages. That's why the SeaMonkey preference text and a naming of "quarantining setting" are indeed logically the same thing. :)
Comment 55 WADA 2012-04-02 05:20:34 PDT
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #54)
> WADA:
> We who saw this corruption did never see this in Thunderbird 11.0* or
> SeaMonkey 2.8*, the bug has been introduced into Thunderbird 12 and
> SeaMonkey 2.9 when it was on trunk and the bug is now on Beta.
> It sounds to me like your analysis is for a different bug than this.

I am not saying that the "regression of bug 448337 in trunk" which I saw is the cause of the corruption in this bug.
I am merely saying:
>                  Bug 736539                                  Regression of "copy&move in single filter" case in trunk
>                  (simple move of multiple mails by filter)   (same phenomenon as bug 448337 in same test with trunk build)
> (1) Tb 11        bug 736539 doesn't exist                    bug 448337 was fixed by Tb 3.0b4
>                                                              bug 448337 was verified by Tb 11.0.1 too
> (2) a trunk      problem of bug 736539 started to occur      when regression started is unknown
> (3) some trunks  this bug was filed on 2/27
> (4) trunk 3/01   problem of bug 736539 was reproduced        regression was observed
>                  corrupted after Compact due to bug 736539
> (5) 3/16         bug 736539 was filed on 3/16
> (6) 3/19         bug 736539 was fixed on 3/19
> (7) trunk 4/01   problem of bug 736539 was not reproduced    regression was observed
>                    except "increment by 1"
>                    of "Order Received" value

Because comments on phenomena with builds both before and after 3/19 were posted in this bug, it's impossible to know whether the problem of bug 736539 was actually involved in your case or not.
So, I'm asking the problem reporters in this bug to sort out the tested builds and phenomena, to give a detailed description of the "corruption" and of the message filter rules used, and to rule out known phenomena/problems.
Please note that "mail data is somehow corrupted when a message filter is used" is not always the same problem.

By the way, the "quarantine option" in Tb, or "junk | suspect mail | allow ... scan" in Sm, corresponds to the mailnews.downloadToTempFile setting in prefs.js (I call it On/enabled if true, Off/disabled if false).
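
For reporters who want to state their settings precisely, here is a small sketch that pulls the prefs named in this bug out of the text of a prefs.js file. The pref names are the ones mentioned in the comments; `read_watched_prefs` is a made-up helper, and it assumes the usual one-line `user_pref("name", value);` layout. Prefs that are still at their default are normally not written to prefs.js, so a missing entry means "default", not "off".

```python
import re

# Pref names taken from the comments in this bug.
WATCHED = (
    "mail.purge.ask",
    "mailnews.downloadToTempFile",
    "mail.server.default.dup_action",
)

PREF_RE = re.compile(r'^user_pref\("([^"]+)",\s*(.+)\);\s*$')


def read_watched_prefs(prefs_text):
    """Return {pref name: raw value string} for the watched prefs."""
    found = {}
    for line in prefs_text.splitlines():
        match = PREF_RE.match(line.strip())
        if match and match.group(1) in WATCHED:
            found[match.group(1)] = match.group(2)
    return found


# Made-up sample content; a real check would read prefs.js from the profile.
sample_prefs = (
    'user_pref("browser.startup.page", 0);\n'
    'user_pref("mail.purge.ask", true);\n'
    'user_pref("mailnews.downloadToTempFile", false);\n'
)

print(read_watched_prefs(sample_prefs))
```

Values are returned as the raw text from the file (for example "true"), so nothing is guessed about their type.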
Comment 56 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-04-03 12:10:55 PDT
fwiw, automatic compaction just made my unread emails disappear again. I can tell because the disk started churning, and after the mail disappeared I read "done compacting" in the status bar.  Since I am not sure this has anything to do with filters, I can file a separate bug if you wish (though I've read somewhere that there may be a known corruption bug about receiving new mail while the folder is compacting?).
The folder file in the profile folder is 0 bytes, and the msf doesn't contain interesting stuff afaict.
Comment 57 WADA 2012-04-03 19:30:33 PDT
(In reply to Marco Bonardo [:mak] from comment #56)
> automatic compaction just made my unread emails disappear again, (snip)
> The folder file in the profile folder is 0 bytes, msf doesn't contain interesting stuff afaict.

Bug 498814 is an issue which may produce such a phenomenon. See dup'ed bugs for the actual phenomenon of file size=0 after Compact, where interference by other software is suspected. See also the dependency tree of meta bug 498274 for issues around Compact.

Was your mail data corruption (loss of all mail data in the local mail folder file after Compact) caused by a message-filter-related problem in Tb?
Is there no possibility of a problem like bug 498274? Did your problem start to occur after you started to use a trunk/aurora/beta build after 3/19 (when bug 736539 was fixed)?
Comment 58 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-04-04 01:46:32 PDT
(In reply to WADA from comment #57)
> phenomenon of file size=0 after Compact where interference by other
> software is suspected.

I have no idea which other software may interfere, I also disabled the AV in Thunderbird options and have no alien software doing crazy stuff.

> See also dependency tree for meta bug 498274 for
> issues around Compact.

Ugh lots of stuff there, though my issue began recently. I never had a problem until I upgraded to Earlybird12 and (not sure if related) MS Security Essentials (was using Avast before).

> Was your mail data corruption(loss of all mail data in local mail folder
> file after Compact) caused by message filter relevant problem in Tb?

Before 3/19 I was suffering mail bodies corruption. After bug 736539 I have no more mails corruptions, but I suffer dataloss on compact.
Comment 59 WADA 2012-04-04 17:51:57 PDT
(In reply to Marco Bonardo [:mak] from comment #58)
> Ugh lots of stuff there, though my issue began recently. I never had a
> problem until I upgraded to Earlybird12 and (not sure if related) MS
> Security Essentials (was using Avast before).

Avast! and MS Security Essentials are different software, so they have different defaults and their behaviour is also different.
If the file Tb uses for a local mail folder is quarantined by anti-virus software, file size = ZERO can occur. mailnews.downloadToTempFile=true is the Tb-side protection against this case, i.e. protection from loss of all mail data. By mailnews.downloadToTempFile=true, loss of mail by quarantine of anti-virus software is limited to single mail only.
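
As a small sketch of how that setting could be pinned, assuming a user.js file in the profile directory is used for the override (the choice of file is an assumption here; only the pref name comes from this bug):

```javascript
// user.js in the Thunderbird/SeaMonkey profile directory (assumed location).
// true  = "quarantine" option on: each new mail is first downloaded to a temp file.
// false = option off: new mail is written straight into the folder file.
user_pref("mailnews.downloadToTempFile", true);
```

The same value could equally be changed through the config editor; the file-based form is shown only because it is easy to copy between test profiles.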

Please surely rule out above case first.
Is there no incident log for quarantine in the log of the anti-virus software?

Please surely rule out problem like bug 498274 second.
Is automatic virus scan of your antivirus software surely disabled for files which are used as "file for mail folder" by Tb?
Please note that Tb's file for a mail folder doesn't have a file extension. So, excluding Tb's mail folder from the scan target cannot be requested via a file-extension-based setting. Explicit excluding by "Tb's profile directory in exclude list" is usually needed.

Please rule out problem in Compact of Tb, third.
0. Execute File/Compact Folders, to invoke Compact of all folders once.
1. Stop permitting Tb's silent auto-compact execution,
   by enabling "Dialog before start of auto-compact".
2. Reply Cancel when the dialog is shown.
3. If mail data corruption is seen even if Compact is not invoked,
   it's perhaps problem of this bug even after fix of bug 736539.
4. If you saw dialog and replied Cancel at step 2,
   execute File/Compact Folders when you have time, 
   and check mail data to know whether following problem happened or not.
   - Similar problem to bug 736539 : problem of bug 736539 was as follows.
     Mail data is corrupted by Compact due to wrong data in .msf by bug 736539.
   - Problem like bug 498274 :
     Interference with Compact by other software such as anti-virus software.
Comment 60 WADA 2012-04-04 18:22:27 PDT
(In addition to comment #59)
If bug 736539 happened on FolderX.msf file of FolderX folder in the past, and if no mail is deleted from FolderX, Compact is not actually executed because no need to execute Compact yet.
If mail is deleted from FolderX after transfer to build of "bug 736539 is already fixed", "mail data corruption upon Compact due to bug 736539" occurs even with build of "bug 736539 is already fixed".

A phenomenon of bug 736539 I observed is "increment by 1 in Order Received column value". This looks to happen when consecutive mails are moved to same mail folder by message filter upon new mail download. And, this is observed even after problem of bug 736539 is fixed.

Please show "Order Received" column and sort by "Order Received" column.
Is "increment by 1 in Order Received column value" seen?
If seen, do next at the mail folder.
(1) Delete mail of smallest "Order Received" column value, then Compact.
(2-a) If "wrong data in .msf due to bug 736539" exists,
    "Order Received" column value is changed to ordinal value,
    and mail data of at least one mail should be corrupted.
(2-b) If "wrong data in .msf due to bug 736539" doesn't exist,
    "Order Received" column value is changed to ordinal value,
    and mail data should NOT be corrupted.
Comment 61 WADA 2012-04-04 22:57:52 PDT
(Correction of a request in comment #59)
> Please surely rule out problem like bug 498274 second.

As I wrote in that bug, it was found that File size=0 by Compact in bug 498274 can't occur in official Tb 3.0 and later. So, ignore above request, please.
Sorry for my confusion.
Comment 62 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-04-05 03:04:55 PDT
(In reply to WADA from comment #59)
> By mailnews.downloadToTempFile=true, loss of mail by
> quarantine of anti-virus software is limited to single mail only.

I had to disable that cause it causes bug 720161 (I think this is a specific problem with MS Security Essentials).

> No incident log for quarantine in log of anti-virus software?

nope.

> Explicit excluding by "Tb's profile directory in
> exclude list" is usually needed.

Will do that.

> Please rule out problem in Compact of Tb, third.

I have already changed it so that it notifies me of automatic compaction and I refuse, and so far no corruptions.  Though this means it would be hard to tell you if the AV was involved, so I'll have to reset it and try to corrupt my bugmail after excluding the TB profile folder in the AV.

(In reply to WADA from comment #60)
> A phenomenon of bug 736539 I observed is "increment by 1 in Order Received
> column value". This looks to happen when consecutive mails are moved to same
> mail folder by message filter upon new mail download.

This is very similar to what I have here, all the (hundreds) mails coming from bugzilla are moved by a filter to a Bugmail folder and this happens almost continuously, even while I'm reading or deleting (and thus I suppose even just before or during a compact).

> Please show "Order Received" column and sort by "Order Received" column.
> Is "increment by 1 in Order Received column value" seen?

yes, I tried your experiment, I don't see any corrupt mail after compact.
Comment 63 Robert Kaiser 2012-04-05 03:32:49 PDT
In any case, note that this bug as filed by me has *nothing* to do with compaction. I have my client ask me before doing any compaction that I don't initiate myself, and any such thing was only involved in the cleanup process when the folder was already corrupt. It was during that cleanup process where I finally lost email, but it already wasn't accessible correctly before, and I never lost the whole folder - at most I lost *some* messages that were good before the mess along with the corrupt ones, but never all in a folder. The AV stuff talked about in the recent messages is surely something else than the bug I reported here.
Comment 64 Marco Bonardo [::mak] (Away 6-20 Aug) 2012-04-05 03:39:09 PDT
That was already clear, see the whiteboard.  I explicitly asked if I should file a new bug too, I'm fine with resolving this, I had the same issue and don't have anymore.

Btw, the dataloss just happened again, though now I have antivirus ignoring all the thunderbird profile folders. So it looks unrelated to the AV afaict.
Comment 65 WADA 2012-04-05 05:10:25 PDT
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #63)
> In any case, note that this bug as filed by me has *nothing* to do with compaction.

Since initial of this bug till build just before 3/19(bug 736539 was fixed)?
Or in builds after 3/19? Or in any builds you used?

IIRC, when I saw some reports of POP3 mail data corruption after mail move by filter, I checked it and I saw mail data corruption by simple download of multiple mails and simple move to a folder by filter (consecutive mails are moved to a folder by filter) without Compact operation.
However, as I wrote in comment #55, in 3/01 build, I could observe "mail data corruption with simple download/simple move by filter" only after Compact.
And, I couldn't see bug 736539 with builds after 3/19 by "simple download of multiple mails and simple move to a folder by filter" even after Compact.

Because your comments on "actually corrupted by filter" were mainly in 2012/02, the phenomenon I saw without Compact may be the same as the problem you saw in 2012/02 or before.
If you still see "corruption by filter move without Compact" in builds after 3/19, it's perhaps different problem with different regression window from bug 736539, like regression in "Copy and Move filter" which I saw.
(note: "Copy&Move in single rule" was not mandatory. "Copy and Move of a mail by different rule" had problem too. And, added Tag  by filter is not written in X-Mozilla-Keys:, increment by 1 of Order Received column value, are still observed)

By the way, many of the problem reports in this bug don't have information about the build used, and have no description of the "corruption" and no description of the relevant filter rules. So, tracking the problem and understanding the phenomenon/problem is not so easy...
Comment 66 Robert Kaiser 2012-04-05 07:40:50 PDT
(In reply to WADA from comment #65)
> Since initial of this bug till build just before 3/19(bug 736539 was fixed)?

I'm talking about the original bug reported here. If bug 736539 broke something else, that belongs in a different bug. And if compacting didn't do corruption in the original bug reported here, then any discussion of it doing it now belongs into a different bug that deals with that. Let's keep one bug to one issue. This one is about corruption caused by filters.
Comment 67 WADA 2012-04-06 09:28:50 PDT
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #66)
> (In reply to WADA from comment #65)
> > Since initial of this bug till build just before 3/19(bug 736539 was fixed)?
> I'm talking about the original bug reported here.
> If bug 736539 broke something else, that belongs in a different bug.

Your filter has "Move To" action, and you actually used build before 3/19. So,  if consecutive mails are moved to a move target folder by message filter rule's "Move To" action upon new mail download, bug 736539 surely occurs in your environment before fix of bug 736539. And, in my duplication test with 3/01 build(bug 736539 is not fixed yet), I could observe mail data corruption upon Compact execution only.

If bug 736539 occurred in a .msf file (call it FolderX.msf) with a build before 3/19 in your environment, and if you didn't execute "Repair Folder" for FolderX yet and didn't delete FolderX after bug 736539 occurred, broken data from bug 736539 remains in FolderX.msf.
If Compact is invoked and compaction is actually executed on FolderX(deleted mail in FolderX is needed), mail data corruption due to bug 736539 occurs even after fix of bug 736539.

You say that Compact is never invoked on any mail folder in your environment. So, exposure of problem of bug 736539 by "mail data corruption upon Compact" won't occur in your environment.

I don't know whether the following problem exists in your environment or not.
(a) Bad data generated by bug 736539 remains in folderX.msf.
(b) When bad data remains in folderX.msf, something wrong may happen in build after 3/19 by filter move or by something except Compact.
To surely rule out problems like (b) due to (a) from this bug, I asked other problem reporters to surely exclude problems like (b) due to (a).

What is evidence that bug 736539 never happened in your environment or bad data in .msf by bug 736539 never exist in .msf file of your mail folder?

A particularity in your case is following in your msgFilterRules.dat.
  - special character is used in local mail folder name, and is hashed by Tb. 
> actionValue="mailbox://KaiRo@server/Inbox/Linux72d4e46f"
> actionValue="mailbox://KaiRo@server/Inbox/Mozilla/Buildb62996ab"
> actionValue="mailbox://KaiRo@server/Inbox/Mozilla/SeaMonkey/Build61997a97"
> actionValue="mailbox://KaiRo@server/Inbox/Mozilla/comm-central%20Org"
> actionValue="mailbox://KaiRo@server/Inbox/Mozilla/insights%20771e3e0c"
What is the special character used in the mail folder name in your case?

By the way, I'm trying to rule out following from problem reports in this bug by other than you.
- bug 736539, and mail data corruption by bug 736539(corruption looks to occur
  upon Compact after problem of bug happened) 
- Problem due to Compact, including mail/mail folder corruption when interference
  by other software occurs.
- Problem due to anti-virus (quarantine by anti-virus software)
I'm trying to understand the still-unclear problem which you have been experiencing since before the fix of bug 736539 (for which phenomenon/problem/cause/solution is clear) and even after that fix, and which still occurs even when Compact is never invoked.
Comment 68 Robert Kaiser 2012-04-07 10:09:01 PDT
(In reply to WADA from comment #67)
> You say that Compact is never invoked on any mail folder in your
> environment.

Only when I deliberately do or allow it. What I said is that it never happens automatically without me noticing.

> What is evidence that bug 736539 never happened in your environment or bad
> data in .msf by bug 736539 never exist in .msf file of your mail folder?

I surely have no evidence. What I know is that quarantine has no connection with my original bug, as 1) switching the setting didn't change the bug happening and 2) I have no anti-virus software because I don't need it on Linux.

> A particularity in your case is following in your msgFilterRules.dat.
>   - special character is used in local mail folder name, and is hashed by
> Tb. 
> > actionValue="mailbox://KaiRo@server/Inbox/Linux72d4e46f"
> > actionValue="mailbox://KaiRo@server/Inbox/Mozilla/Buildb62996ab"
> > actionValue="mailbox://KaiRo@server/Inbox/Mozilla/SeaMonkey/Build61997a97"
> > actionValue="mailbox://KaiRo@server/Inbox/Mozilla/comm-central%20Org"
> > actionValue="mailbox://KaiRo@server/Inbox/Mozilla/insights%20771e3e0c"
> What is special character used in mail folder name in your case?

A slash (/) in all those cases, except "comm-central%20Org" which is, as can be seen easily, just a space.

> - bug 736539, and mail data corruption by bug 736539(corruption looks to
> occur
>   upon Compact after problem of bug happened) 

That could be a factor, I haven't tried if the bug still happens since that landed. Of course, I'm not keen on losing email, so I don't use build younger than SeaMonkey 2.8 / Thunderbird 11 (current release) any more in normal use. I should try again, I know, but I need to make sure I 1) have some way of not losing email and 2) have enough time to work on recovery if the problem happens again. Both, esp. the latter, are not that easy to do.

> - Problem due to Compact, including mail/mail folder corruption when interference
>   by other software occurs.

I saw the problem without any compacting, mails were not accessible correctly in the attached folder (and others) directly after the filters had run.

> - Problem due to anti-virus (quarantine by anti-virus software)

Impossible as there is no anti-virus running on my (Linux) system.


If it's anything we already know about, it's bug 736539 - but it still can be something we don't know about.
Has anyone seen the filter corruption in current Nightly builds?
Comment 69 WADA 2012-04-07 20:01:58 PDT
I could observe mail data corruption by 2012/04/03 build only, with clean data.
  - Start with clean file(file size of any relevant folders = 0),
    so bad data in .msf or data file never exists.
  - I simply repeated following.
    POP3 download with filter move after deletion of popstate.dat, 
    Shift+Delete of some mails, 
    Copy/Move mails by Drag&Drop or by manual filter run
    (both Run Filters on Folder of menu, and Run Now in filter are used),
    Delete by manual filter run etc.
During testing, I incidentally clicked Forward button, and I saw wrong data even though mail data is shown correctly(tested with simple text/plain mail only). And after additional testing including manual Compact, I could see following display by mail viewing.
(header pane)
No subject string, no From:, no To:, no Date:, and some tags added by filters
(message pane)
> >From - Sun Apr 08 10:06:54 2012
> X-Account-Key: account11
>(snip)

(corresponding line in folder file [CRLF]=0x0D0A, [CR]=0x0D)
> Bottom part of previous correct mail,
> and multiple [CRLF]
> >From - Sun Apr 08 10:06:54 2012[CRLF]
> X-Account-Key: account11
>(snip)
> test[CRLF]   <= mail data line
> [CR]
> [CRLF]
> >From - Sun Apr 08 10:06:54 2012
> X-Account-Key: account11
>(snip)

When correctly stored, this mail's end is following(null line at end of this mail).
> test[CRLF]
> [CRLF]
> [CRLF]
> From - Sun Apr 08 11:24:46 2012 <= separator of next mail
Offset of a mail, or length of mail, was perhaps corrupted by filter copy/move or manual copy/move, or the [LF] of the bottom [CRLF] is perhaps lost by copy/move, because the line ending in the test mails is always [CRLF] and neither a standalone [CR] nor a standalone [LF] ever exists.
This bad data is probably not exposed to the user by ordinary mail viewing, because thread pane display is not corrupted and header pane display is not affected until Compact is executed, and Forward is not executed so frequently.
I did "Delete" by filter, and I executed manual filter move from Trash to testing folder. It may be relevant.
I don't know whether "increase by 1 of Order Received column value" is relevant to the problem or not.
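
The byte layout above can be checked mechanically. A minimal sketch, assuming the folder file has been read into a byte array and that every line in it is expected to end in CRLF, as in the test mails here; `findBareLineEndings` is a hypothetical helper name, not Thunderbird code:

```javascript
// Scan raw mbox bytes for line terminators that are not a full CRLF pair.
// Returns a list of { offset, kind }, where kind is "lone CR" or "lone LF".
function findBareLineEndings(bytes) {
  const CR = 0x0d;
  const LF = 0x0a;
  const problems = [];
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === CR) {
      if (bytes[i + 1] === LF) {
        i++; // well-formed CRLF, skip over the LF
      } else {
        problems.push({ offset: i, kind: "lone CR" });
      }
    } else if (bytes[i] === LF) {
      problems.push({ offset: i, kind: "lone LF" });
    }
  }
  return problems;
}

// Sample modeled on the damaged tail shown above: "test" CRLF, CR, CRLF.
const damagedTail = new TextEncoder().encode("test\r\n\r\r\n>From - x\r\n");
console.log(findBareLineEndings(damagedTail));
```

On a correctly stored folder with CRLF-only mails the returned list would be empty, so any entry gives a byte offset worth inspecting in a hex viewer.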

By the way, I used special character(/ # ? space) in folder name in my test, but I don't think it's relevant to problem.
Comment 70 WADA 2012-04-08 02:34:40 PDT
Created attachment 613172 [details]
Mail folder file with corrupted mail data after filter move, delete, compact, then copy folder n

This is a mail folder file generated by following test.
0. Tb trunk 4/03 build on Win-XP, 7 test mails, "Leave Messages in Server" 
1. Delete popstate.dat, restart Tb, Get Msg
   Mails are moved to FolderA by move filter except 1 mail.
   Order Received column = 1 to 6
2. Repeat step 1. Order Received column =  7 to 12
3. Repeat step 1. Order Received column = 13 to 18
4. Repeat step 1. Order Received column = 19 to 24
5. Shift+Delete some mails.
   I deleted 8 mails, so 16 mails remain.
6. Compact at FolderA.
   Order Received column value = ordinal value
   At this step, mail data corruption is not observed.
7. At FolderA, Select All, Copy to FolderB.
   Mail count of FolderB = 8, number of mails in thread pane=8,
   even though copy of 16 mails is requested.
   No problem is seen in thread pane.
   At message pane corrupted data is shown for some messages.
   Data in mail folder file(FolderB) is broken.
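
The mismatch in step 7 (16 mails requested, 8 shown) can be cross-checked against the folder file itself. A rough sketch, assuming every message starts with a line beginning "From " and that such lines inside bodies are escaped as ">From "; `countMboxMessages` is a hypothetical helper and the sample text is made up:

```javascript
// Count messages in mbox text by counting lines that start with "From ".
// Escaped body lines (">From ") do not match and are not counted.
function countMboxMessages(mboxText) {
  return mboxText
    .split(/\r?\n/)
    .filter((line) => line.startsWith("From "))
    .length;
}

const sampleFolder =
  "From - a\r\nSubject: one\r\n\r\nbody\r\n\r\n" +
  "From - b\r\nSubject: two\r\n\r\n>From here on\r\n\r\n";
console.log(countMboxMessages(sampleFolder));
```

If the count from the file differs from the number of rows in the thread pane, the .msf index and the folder file disagree, which is the situation described in step 7.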
Comment 71 WADA 2012-04-08 05:17:21 PDT
If mails of lowest Order Received value (Order Received=1,2,3,4,...) are deleted before Compact of FolderA, broken data is seen at offset=0 of FolderB when mails are copied from FolderA to FolderB.

(mail data in FolderB, at offset=0)
> 000
> X-Mozilla-Keys:
> X-DTI-Virus-Check: checked
>(snip) normal message header lines here
> Content-Transfer-Encoding: 7bit
> 
> Mail body text line
> 
> 
> >From - Sun Apr 08 20:57:35 2012
> X-Account-Key: account11
> X-UIDL: 000004fa4c354af7
> X-Mozilla-Status: 0000
> X-Mozilla-Status2: 00000000
> X-Mozilla-Keys:
> X-DTI-Virus-Check: checked
>(snip)

The following part is lost from the first mail's data (an offset shift by the length of the following data happens).
> From - Sun Apr 08 20:57:35 2012
> X-Account-Key: account11
> X-UIDL: 000004fa4c354af7
> X-Mozilla-Status: 0000
> X-Mozilla-Status2: 00000
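
One way to see this kind of shift is to test whether each offset recorded for a message lands on a "From " separator line. This is an illustrative sketch with made-up sample data and a hypothetical `findBadOffsets` helper; it works on a string, so its offsets equal byte offsets only for ASCII data:

```javascript
// Return the recorded offsets that do NOT point at an mbox "From " separator.
function findBadOffsets(folderText, offsets) {
  return offsets.filter((offset) => !folderText.startsWith("From ", offset));
}

// Made-up sample message, shaped like the headers quoted above.
const message =
  "From - Sun Apr 08 20:57:35 2012\r\n" +
  "X-Mozilla-Status: 0000\r\n" +
  "X-Mozilla-Status2: 00000000\r\n" +
  "X-Mozilla-Keys:\r\n" +
  "\r\n" +
  "Mail body text line\r\n" +
  "\r\n";
const folderText = message + message; // two messages back to back

const correctOffsets = [0, message.length];
// A shifted first offset that lands inside the X-Mozilla-Status2 value.
const shifted = message.indexOf("000\r\nX-Mozilla-Keys:");
const shiftedOffsets = [shifted, message.length];

console.log(findBadOffsets(folderText, correctOffsets)); // no bad offsets
console.log(findBadOffsets(folderText, shiftedOffsets)); // flags the first one
// Reading from the shifted offset starts with "000", as in the damaged FolderB.
console.log(JSON.stringify(folderText.slice(shifted, shifted + 20)));
```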
Comment 72 David :Bienvenu 2012-04-08 12:37:18 PDT
thx, Wada. I do see an issue with steps similar to this. The copy from folderA to folder B is what seems to be broken, though presumably something about the state of the messages in folderA is broken before the copy.
Comment 73 David :Bienvenu 2012-04-09 11:08:11 PDT
It's the compact of folder A that seems to cause the problem. I think we're ending up with incorrect offsets into the mail folder, though I'm not quite sure why displaying messages seems to work fine.
Comment 74 David :Bienvenu 2012-04-09 13:34:54 PDT
Created attachment 613378 [details] [diff] [review]
fix compact handling of msg offset - checked in

This fixes compact's setting of the resulting message offset. Only some operations care about the message offset, and the multiple message move/copy code is one such operation. I'd like to fix as many of the places that use the message offset to go through the store instead, but I'd like to fix compaction before tomorrow's beta.
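
To illustrate the idea (a toy model, not the patch): when compaction drops deleted messages, every surviving message moves toward the start of the file, so the offset stored for it has to be rewritten to its position in the new file. The names `compactFolder`, `offset`, `size` and `deleted` below are hypothetical and the data is in-memory ASCII text:

```javascript
// Toy compaction over an in-memory folder. Each index entry is
// { offset, size, deleted } describing one message's range in folderText.
function compactFolder(folderText, index) {
  let compacted = "";
  const newIndex = [];
  for (const entry of index) {
    if (entry.deleted) continue; // deleted messages are not copied
    // The message now starts wherever the output currently ends.
    const newOffset = compacted.length;
    compacted += folderText.slice(entry.offset, entry.offset + entry.size);
    newIndex.push({ offset: newOffset, size: entry.size, deleted: false });
  }
  return { folderText: compacted, index: newIndex };
}

// Three small messages; the first one is marked deleted.
const messages = [
  "From - A\r\nSubject: one\r\n\r\nbody\r\n\r\n",
  "From - B\r\nSubject: two\r\n\r\nlonger body\r\n\r\n",
  "From - C\r\nSubject: three\r\n\r\nbody\r\n\r\n",
];
const oldIndex = [];
let position = 0;
for (const text of messages) {
  oldIndex.push({ offset: position, size: text.length, deleted: oldIndex.length === 0 });
  position += text.length;
}

const result = compactFolder(messages.join(""), oldIndex);
for (const entry of result.index) {
  // Each updated offset should land on a "From " separator line.
  console.log(entry.offset, result.folderText.startsWith("From ", entry.offset));
}
```

Keeping the pre-compaction offsets in this model would leave the second message's entry pointing into the middle of a body, which matches the symptom that only offset-based operations such as multi-message copy showed damage.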
Comment 75 David :Bienvenu 2012-04-09 15:58:43 PDT
Wada, try server builds with the patch should show up here - http://ftp.mozilla.org/pub/mozilla.org/thunderbird/try-builds/bienvenu@nventure.com-6f2c0763748c/

Steps for corruption that show up without compaction would be helpful.
Comment 76 David :Bienvenu 2012-04-10 06:10:11 PDT
Created attachment 613567 [details] [diff] [review]
add test for offsets to compact test - checked in

mconley, if Neil's busy today, I'm hoping you can review the fix and the unit test so we can get going on the beta build today.
Comment 77 Mike Conley (:mconley) - (Needinfo me!) 2012-04-10 07:19:47 PDT
Ben:

Remember when you showed me that TB bug where your selected message didn't match the one that was being displayed?

I think this is the bug.

-Mike
Comment 78 Ben Hearsum (:bhearsum) 2012-04-10 07:40:42 PDT
(In reply to Mike Conley (:mconley) from comment #77)
> Ben:
> 
> Remember when you showed me that TB bug where your selected message didn't
> match the one that was being displayed?
> 
> I think this is the bug.
> 
> -Mike

I do! I don't use POP or have any local filters though...
Comment 79 Mike Conley (:mconley) - (Needinfo me!) 2012-04-10 07:55:34 PDT
Comment on attachment 613378 [details] [diff] [review]
fix compact handling of msg offset - checked in

Review of attachment 613378 [details] [diff] [review]:
-----------------------------------------------------------------

My experience with this component is limited, but from what I can tell, this looks good.
Comment 80 Mike Conley (:mconley) - (Needinfo me!) 2012-04-10 07:58:53 PDT
Comment on attachment 613567 [details] [diff] [review]
add test for offsets to compact test - checked in

Review of attachment 613567 [details] [diff] [review]:
-----------------------------------------------------------------

David:

Two super minor complaints.  If you were planning on doing something with totalSize and forgot, then r-.

If, however, totalSize is a remnant of something you no longer want to do, r+ with both of these things fixed.

I'll assume the latter unless you say otherwise. Thanks for your work,

-Mike

::: mailnews/base/test/unit/test_folderCompact.js
@@ +117,5 @@
> +function verifyMsgOffsets(folder)
> +{
> +  let msgDB = folder.msgDatabase;
> +  let enumerator = msgDB.EnumerateMessages();
> +  let totalSize = 0;

totalSize doesn't seem to be used.

@@ +122,5 @@
> +  if (enumerator)
> +  {
> +    while (enumerator.hasMoreElements())
> +    {
> +      var header = enumerator.getNext();

let instead of var
Comment 81 David :Bienvenu 2012-04-10 08:18:56 PDT
Comment on attachment 613567 [details] [diff] [review]
add test for offsets to compact test - checked in

comments addressed, thx, and updated patch checked in.
Comment 82 David :Bienvenu 2012-04-10 08:21:51 PDT
fixes for corruption WADA discovered (thx again, WADA) landed on trunk, aurora, and beta -
http://hg.mozilla.org/comm-central/rev/1c2129df7449
http://hg.mozilla.org/releases/comm-aurora/rev/96025e919c70
http://hg.mozilla.org/releases/comm-beta/rev/fa4685b8588e
Comment 83 David :Bienvenu 2012-04-10 08:27:22 PDT
Comment on attachment 613378 [details] [diff] [review]
fix compact handling of msg offset - checked in

[Triage Comment]
Comment 84 Robert Kaiser 2012-04-10 08:43:13 PDT
Just FYI, I've switched back to using trunk again this weekend, and haven't seen corruption so far, so it could be bug 736539, but last time it also took a while until I saw the first problems, and I also haven't run any compaction yet, so it could be that what WADA discovered played a role for me as well.

I'll keep using trunk and if no problems come up for some time, I'll trust that the two fixes made it go away. Thanks everyone for working on investigating and fixing!
Comment 85 Joe Sabash [:JoeS1] 2012-04-10 19:37:22 PDT
David,
Given that pluggable stores landed somewhere around  2011-12-24 16:18:06
Do you have any recommendation on how to clean up any latent side effects for those who have been using the nightlies since then.
Would "rebuild folder" take care of any "disaster waiting to happen scenarios"
Comment 86 David :Bienvenu 2012-04-10 19:40:34 PDT
(In reply to Joe Sabash from comment #85)
> David,
> Given that pluggable stores landed somewhere around  2011-12-24 16:18:06
> Do you have any recommendation on how to clean up any latent side effects
> for those who have been using the nightlies since then.
> Would "rebuild folder" take care of any "disaster waiting to happen
> scenarios"

Yes, rebuild folder would fix the incorrect msg offsets generated by folder compaction.
Comment 87 Tim Meader 2012-04-11 12:10:20 PDT
Does anyone know if this made the cutoff for the 12 Beta 4 candidate builds that are out now? I'd like to give the fix a shot.

Thanks.
Comment 88 David :Bienvenu 2012-04-11 12:30:56 PDT
(In reply to Tim Meader from comment #87)
> Does anyone know if this made the cutoff for the 12 Beta 4 candidate builds
> that are out now? I'd like to give the fix a shot.
> 
Yes, I held up the beta until I could land that patch.

I'm not marking the bug fixed because it's not clear if there aren't other issues. But that was the reproducible one.
Comment 89 Tim Meader 2012-04-11 12:59:00 PDT
Thanks. I'm a bit reluctant to try it now though, given the early feedback here:

http://forums.mozillazine.org/viewtopic.php?p=11900573#p11900573

Testing this is quite hard to do unfortunately given the risk involved.
Comment 90 Robert Roessler 2012-04-11 21:22:58 PDT
(In reply to Tim Meader from comment #89)
> 
> Testing this is quite hard to do unfortunately given the risk involved.

I have been considering this - and there is a way to do it that pretty much removes the risk, and just adds a bit of work for us brave beta users: just use the setting to "Leave messages on server".

Then you can filter/move/compact all you want (interspersing with sanity checks and before/after comparisons) until you are convinced everything is working.

If it seems to be working, then you just need to manage (as in delete) the messages piling up on your server, and clear this setting.

If it still seems broken, then restore your SAVED mbox file copy, and have your client re-download the messages from the server.

When the SeaMonkey beta build that contains this fix is available (SM 2.9 beta 3?), I will likely give it a shot, but protect my messages as outlined above.

OTOH, if I am properly interpreting all of the above traffic, the only fix that has been made is for the compact-related issues - NOT the filter-based-move issues - so I will still not be re-enabling my move steps in filters.  Am I correct on this, David?
Comment 91 David :Bienvenu 2012-04-11 21:36:01 PDT
there have been a couple fixes having to do with incoming messages not getting completely flushed to disk before the filter moves, resulting in the filter moves losing the last CRLF, which could eventually lead to issues.
Comment 92 Ludovic Hirlimann [:Usul] 2012-04-12 00:52:54 PDT
David did you forget to set a few flags last week ?
Comment 93 Robert Kaiser 2012-04-12 05:42:18 PDT
(In reply to Ludovic Hirlimann [:Usul] from comment #92)
> David did you forget to set a few flags last week ?

It's not clear if *this bug* is fixed. David has only landed one patch here that should be one mosaic stone in fixing the problem; some fixes landed in other bugs might complete the picture, but it's not yet clear if there's still something missing that causes this or not.

(In reply to Robert Roessler from comment #90)
> OTOH, if I am properly interpreting all of the above traffic, the only fix
> that has been made is for the compact-related issues - NOT the
> filter-based-move issues - so I will still not be re-enabling my move steps
> in filters.  Am I correct on this, David?

Bug 736539 for example fixed one problem with filters, and I think there was another bug as well that fixed some stuff there.
Comment 94 David :Bienvenu 2012-04-12 07:21:20 PDT
Ludo, see https://bugzilla.mozilla.org/show_bug.cgi?id=730947#c88 - I explicitly said I wasn't marking it fixed because I don't know if there are still reproducible issues.
Comment 95 :aceman 2012-04-12 07:33:38 PDT
There is a new report for TB12 in bug 744706, but currently no STR.
Comment 96 Jens Hatlak (:InvisibleSmiley) 2012-04-12 15:45:58 PDT
Just FYI, AFAICT I was hit by either bug 736539 or this one, too. Unfortunately I cannot tell which one it was since the folder where I noticed it is the target of many automatic filter moves and I also do folder compaction *a lot* (manually; I disabled the automatic one since if compaction kicks in at random intervals, saved search folders for things like mail with the New flag become totally unusable). That target folder contained several thousand messages so I first figured it might be due to that and moved the majority away (which by itself took an eternity, but that's yet another issue not to be discussed here) and compacted afterwards.

I'm always running the latest beta so I'll check whether the upcoming SM 2.9b3 will fix it for me. If not, I guess I'll have to add a dataloss warning to the release notes and cross my fingers for the release...

[Thankfully the mail subjects were left intact! The folder I referred to above is for anything related to Mozilla, mostly bugmail but also lots of other important stuff!]

Is there anything one can do to fix a folder that has already been damaged? I guess not. [If you feel this should not be discussed here, please name a newsgroup or something for follow-up.]

[Maybe I should really start using IMAP the way it's meant to be used and stop using it like it was POP, thus avoiding any issues with local storage for my important mail.]
Comment 97 David :Bienvenu 2012-04-12 15:54:39 PDT
you can repair the folder by hand with a text editor. But I don't understand - are you using POP3 or IMAP? Or are you filtering all your imap mail to local folders?
Comment 98 Jens Hatlak (:InvisibleSmiley) 2012-04-12 16:00:11 PDT
(In reply to David :Bienvenu from comment #97)
> you can repair the folder by hand with a text editor.

Heh, yes, I guess I'll have to try that (once I'm confident enough that it won't break again any time soon). Unfortunately, I discovered that more than one folder was damaged. :-(

> But I don't understand - are you using POP3 or IMAP?

IMAP.

> Or are you filtering all your imap mail to local folders?

This. As I said, not what IMAP was meant for. ;-) To date, I only use IMAP to keep stuff around during the day so that I can access it via webmail.
Comment 99 David :Bienvenu 2012-04-12 16:03:20 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #98)

> 
> This. As I said, not what IMAP was meant for. ;-) To date, I only use IMAP
> to keep stuff around during the day so that I can access it via webmail.

It sounds to me like you were bitten by the compact bug. I don't think anyone else who has had filter corruption issues was filtering from IMAP to local folders (or if they were, they didn't tell me about it).
Comment 100 Joe Sabash [:JoeS1] 2012-04-12 16:14:13 PDT
(In reply to David :Bienvenu from comment #99)
> (In reply to Jens Hatlak (:InvisibleSmiley) from comment #98)
> 
> > 
> > This. As I said, not what IMAP was meant for. ;-) To date, I only use IMAP
> > to keep stuff around during the day so that I can access it via webmail.
> 
> It sounds to me like you were bit by the compact bug. I don't think anyone
> else who has had filter corruption issues was filtering from imap to local
> folders (or if they were, they didn't tell me about it).

Others are filtering from IMAP to local folders for various reasons:
http://forums.mozillazine.org/viewtopic.php?p=11897405#p11897405

He is on the cc list for this bug.
Comment 101 Tim Meader 2012-04-13 08:20:16 PDT
Yeah, I think part of the problem is that most of the bugs relating to all this are stated as POP3-specific, when it seems like at least a few of us who were affected the worst were using it with IMAP accounts.

Either way, after installing 12 Beta 4, cleaning (i.e., deleting) all the messed up messages, and repairing all my mailboxes, I've been using it now for two full days (filtering and compacting as usual) with no issues so far.

Tentatively I'd say this is fixed. It'd be nice if some others could confirm though.
Comment 102 Robert Roessler 2012-04-14 13:10:44 PDT
David (or anyone else that can comment, of course), are the various fixes discussed here present in the "build2" SM 2.9 b3 candidate for Win32 - with a date of 20120413 @ 2345?

Even better, so I don't need to add to this thread, how would I be able to determine this for myself? :)
Comment 103 Jens Hatlak (:InvisibleSmiley) 2012-04-14 13:26:15 PDT
(In reply to Robert Roessler from comment #102)
> David (or anyone else that can comment, of course), are the various fixes
> discussed here present in the "build2" SM 2.9 b3 candidate for Win32 - with
> a date of 20120413 @ 2345?

Yes.

> Even better, so I don't need to add to this thread, how would I be able to
> determine this for myself? :)

Look at http://hg.mozilla.org/releases/comm-beta/graph and you'll see that this bug has been fixed for SM 2.9b3 build 1 already.
Comment 104 Robert Roessler 2012-04-15 15:35:35 PDT
Thanks, Jens - and David! :)

Yes, it now appears that I can once again perform moves from filters - and do compacts - without fear.

My "tests" (besides the filters with moves) consist of examining closely recently arrived messages for corruption and/or mismatches between headers and bodies, and interleaving this with compacts and occasional MSF rebuilds.

After the rebuilds, I always make sure the "Total" and "Unread" counts do not change - that would be bad. ;)
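
A rough sketch of how the header/body and count checks described above could be automated outside the mail client, assuming the local folder is a plain mbox file that Python's standard `mailbox` module can read. The function name `summarize_mbox` and the "no From, Date, or Message-ID header" heuristic are illustrative assumptions, not anything Thunderbird or SeaMonkey does itself.

```python
import mailbox


def summarize_mbox(path):
    """Return (total, suspicious_keys) for a local mbox folder file.

    A message is flagged as suspicious when none of the From, Date, or
    Message-ID headers could be parsed, which is what one would expect if
    the text after a "From " separator line is really part of some other
    message's body. This heuristic is an assumption for illustration.
    """
    box = mailbox.mbox(path, create=False)
    total = 0
    suspicious = []
    try:
        for key, msg in box.iteritems():
            total += 1
            if (msg["From"] is None and msg["Date"] is None
                    and msg["Message-ID"] is None):
                suspicious.append(key)
    finally:
        box.close()
    return total, suspicious
```

Running it on a copy of the folder file before and after a compact or an index rebuild, and comparing the totals, mirrors the manual check of the "Total" count.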
Comment 105 doug2 2012-04-16 11:39:38 PDT
Back on the nightly track with auto-compact turned on (ask first) and limit set to 40MB.  Seeing no problems with corruption, but have been asked if I want to compact 4 times today and no way am I getting that much mail.  How does it estimate the compaction versus the limit?
Comment 106 WADA 2012-04-16 20:39:32 PDT
(In reply to David :Bienvenu from comment #88)
> I'm not marking the bug fixed because it's not clear if there aren't other
> issues. But that was the reproducible one.

There are at least the following problems, which can always be observed with a message filter applied upon download.
(a) Order Received column value starts from "file size"+1 when moved by filter,
    and is increased by 1 if multiple mails are moved to same local mail folder.
(b) Tag is not written to X-Mozilla-Keys: header when tag is added by filter
    upon mail download. This tag is never written to X-Mozilla-Keys: header
    by Compact.
Phenomenon (a) may be relevant to the "Bad mail data when Forward" problem that I saw after manual move/copy/delete and a manual filter run.
Phenomenon (b) can be called "broken header" and "data loss".
These are better handled in separate bug(s).
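
A minimal sketch for checking phenomenon (b) from outside the client, assuming the local folder is a plain mbox file and that keywords appear as whitespace-separated tokens in the X-Mozilla-Keys: header. The function name `keywords_by_message` is made up for this illustration.

```python
import mailbox


def keywords_by_message(path):
    """Return a list of (subject, keywords) pairs for an mbox folder file.

    keywords is the list of whitespace-separated tokens found in the
    X-Mozilla-Keys: header, or an empty list when the header is missing
    or contains only padding.
    """
    box = mailbox.mbox(path, create=False)
    try:
        return [
            (str(msg.get("Subject", "")),
             str(msg.get("X-Mozilla-Keys", "")).split())
            for msg in box
        ]
    finally:
        box.close()
```

A message that shows a tag in the UI but comes back with an empty keyword list here would be a case where the tag exists only in the .msf and not in the mail data.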

I think this bug is better limited to the "broken mail data problem in filter move + Compact even after fix of bug 736539" which you've reproduced and resolved. If "broken mail/folder" still occurs, I think it's better to open a separate bug for "broken mail/folder problem even after fix of bug 736539 and fix of this bug", for ease of analysis/tracking.
Comment 107 David :Bienvenu 2012-04-16 21:19:39 PDT
Why does a) matter? Order received is no longer used as the offset into the message folder. If the storeToken/message offset value is wrong, that would be a different matter.
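
As a hedged illustration of what a "wrong" offset would look like: assuming the stored offset is the byte position in the mbox file where a message's "From " separator line begins (an assumption about the local store, not something confirmed in this bug), a sanity check could be sketched as follows. The helper name `offset_starts_message` is hypothetical.

```python
def offset_starts_message(mbox_path, offset):
    """Return True if the bytes at `offset` begin an mbox "From " line.

    Assumes `offset` is a byte position into the folder file; a value
    that lands in the middle of another message fails this check.
    """
    with open(mbox_path, "rb") as f:
        f.seek(offset)
        return f.read(5) == b"From "
```

An offset that fails this check would show up in the UI as good headers (from the .msf) paired with a body read from the wrong place in the file.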

I'll see if I can reproduce b).
Comment 108 WADA 2012-04-16 21:54:09 PDT
(In reply to David :Bienvenu from comment #83)
> fix compact handling of msg offset - checked in

Checked with Tb trunk 4/16 build.
> Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120416 Thunderbird/14.0a1
Unable to reproduce problem of comment #70 by STR in that comment.
As for problem of comment #70, VERIFIED.

(In reply to David :Bienvenu from comment #107)
> Order received is no longer used as the offset into the message folder.

I see.
If so, "Order Received" value is better to start with 1 and better incremented by 1, instead of using offset value, although it's inconvenient for testers because we can't see offset value in UI.
Anyway, I'll open separate bug for "Order Received" value issue.

Question about it.
No problem in backward compatibility?
If .msf is used by older Tb versions, is .msf always re-built?
How about upward compatibility?
If .msf is created by older Tb versions, is .msf re-built by Tb 14?
If re-built, is there no performance impact? (long rebuild-index time, long Gloda indexing time if big mail folder, etc.)
Comment 109 David :Bienvenu 2012-04-17 15:21:41 PDT
(In reply to WADA from comment #108)
Thx for verifying that the issue in #70 is fixed.

> If so, "Order Received" value is better to start with 1 and better
> incremented by 1, instead of using offset value, although it's inconvenient
> for testers because we can't see offset value in UI.
I would suggest an extension like rkent's junquilla to add the messageOffset and storeToken header attributes as extra columns.

> No problem in backward compatibility?
No, because if messageOffset isn't set, we use the old order received.
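
The fallback described here can be pictured with a toy model in Python. The dictionary and the key name `orderReceived` are stand-ins chosen for this sketch, not the actual header database API; only the rule "use messageOffset when set, otherwise the old order received" comes from the comment above.

```python
def effective_offset(hdr):
    """Pick the folder offset for a message header record.

    `hdr` is a plain dict standing in for a header entry (hypothetical
    model). If "messageOffset" is present it wins; otherwise the legacy
    "orderReceived" value is used as the offset.
    """
    offset = hdr.get("messageOffset")
    return offset if offset is not None else hdr["orderReceived"]
```

For example, `effective_offset({"orderReceived": 7})` falls back to 7, while a record that also carries `"messageOffset": 4096` yields 4096.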

> If .msf is used by older Tb versions, is .msf always re-built?
> How about upward compatibility?
That's usually called "forward compatibility", which is something we never promise. You're right this breaks running old versions after running newer versions, if you have POP3 mail filters. I tried maintaining forward compatibility when I could, but didn't get to it for POP3 filtered messages. .msf files are not rebuilt automatically.

I'll file a bug about the pop3 filter keywords not getting written to the message, since I can recreate that, and have a tentative fix for it.
Comment 110 David :Bienvenu 2012-04-17 17:28:27 PDT
I've requested a try server build with a fix for the pop3 filter keyword issue - http://ftp.mozilla.org/pub/mozilla.org/thunderbird/try-builds/bienvenu@nventure.com-82cc58b3f954 - the bug is bug 746371 .
Comment 111 Joe Sabash [:JoeS1] 2012-04-17 20:04:27 PDT
(In reply to David :Bienvenu from comment #109)
> (In reply to WADA from comment #108)
> Thx for verifying that the issue in #70 is fixed.
> 
> > If so, "Order Received" value is better to start with 1 and better
> > incremented by 1, instead of using offset value, although it's inconvenient
> > for testers because we can't see offset value in UI.
> I would suggest an extension like rkent's junquilla to add the messageOffset
> and storeToken header attributes as extra columns.
> 
> > No problem in backward compatibility?
> No, because if messageOffset isn't set, we use the old order received.
> 
> > If .msf is used by older Tb versions, is .msf always re-built?
> > How about upward compatibility?
> That's usually called "forward compatibility", which is something we never
> promise. You're right this breaks running old versions after running newer
> versions, if you have pop3 mail filters. I tried maintaining forward
> compatibility when I could, but didn't get to it for pop3 filtered messages.
> .msf files are not rebuilt automatically.
> 

So let me see if I have this right:
A pop3 (local folder) user will have no problems going from TB11 to TB12
But there is still the case of "I didn't like TB12 because of xyz,
I reverted to TB11 and some of my mail is gone." (Someone is bound to do this)
Additionally, there is the issue of regression testing, and how this might affect the testing community.
I noticed the wiki here:
https://wiki.mozilla.org/Thunderbird:Pluggable_Mail_Stores
But there is no mention of filter/DB incompatibility there.

At the very least, I think we need a very clear relnote on the subject.
Comment 112 David :Bienvenu 2012-04-19 14:31:55 PDT
*** Bug 744706 has been marked as a duplicate of this bug. ***
Comment 113 Justin Dolske [:Dolske] 2012-04-20 14:21:22 PDT
Ouch. Looks like this is what's been killing me recently. I'm on the current TB beta and just had a folder of mail get zapped yesterday. Are all the current fixes in the latest beta? Or does having a problem indicate there's another bug still?
Comment 114 Tim Meader 2012-04-20 14:24:12 PDT
Since doing a "Repair" on all my folders with Beta 4, I've been fine for all operations.
Comment 115 David :Bienvenu 2012-04-20 14:32:46 PDT
(In reply to Justin Dolske [:Dolske] from comment #113)
> Ouch. Looks like this is what's been killing me recently. I'm on the current
> TB beta and just had a folder of mail get zapped yesterday. Are all the
> current fixes in the latest beta? Or does having a problem indicate there's
> another bug still?

Justin, are you really using POP3? All the current fixes are in the latest beta. Was the folder that got corrupted a folder that had messages filtered into it? Did a compact happen, and if so, was it the first compact that happened since you got b5? And were there deleted messages in that folder? (compact won't touch folders that haven't had messages deleted from them)
Comment 116 Justin Dolske [:Dolske] 2012-04-20 15:21:06 PDT
Oh, maybe I misread the summary. I have move filters in TB, which are moving between 2 IMAP accounts (all 1-way; Google to Zimbra). I get corruption in both the source (Google) and dest (Zimbra). I nuked the .msf files a week or two ago (after they became corrupted repeatedly).

Not sure if it was the first compact, but probably not. I had the auto-compact option on, set to do so with a 1MB threshold. The destination folders have 10-day retention policy.

Different bug?
Comment 117 David :Bienvenu 2012-04-20 15:23:49 PDT
(In reply to Justin Dolske [:Dolske] from comment #116)
> Oh, maybe I misread the summary. I have move filters in TB, which are moving
> between 2 IMAP accounts (all 1-way; Google to Zimbra). I get corruption in
> both the source (Google) and dest (Zimbra). I nuked the .msf files a week or
> two ago (after they became corrupted repeatedly).
> 
> Not sure if it was the first compact, but probably not. I had the
> auto-compact option on, set to do so with a 1MB threshold. The destination
> folders have 10-day retention policy.
> 
> Different bug?

Yes, different code, different bug. Can you file a new bug? I'll set up some filters to do that here and see if I can reproduce it.
Comment 118 Justin Dolske [:Dolske] 2012-04-21 14:40:39 PDT
Hrm, actually my Gmail account is POP3. Not sure why I was thinking it was IMAP!
Comment 119 Robert Kaiser 2012-04-23 10:53:35 PDT
I'm marking this bug FIXED as I filed it on my issue and that seems to be fixed now. I have been running with trunk builds for roughly two weeks, see comment #84, and I have not seen any such problems again.
If there are other problems, they should go in separate bugs; the original issue(s) of this bug are fixed.

Thanks a lot to everyone who investigated it and David for the work on a fix!
Comment 120 WADA 2012-05-05 00:43:49 PDT
*** Bug 740374 has been marked as a duplicate of this bug. ***