Open Bug 1742975 Opened 6 months ago Updated 4 days ago

Bug 1734847 (.msf corruption) NOT FIXED on beta. (NOT VERSION 91.x)

Categories

(MailNews Core :: Database, defect, P1)

Thunderbird 95
x86_64
Windows 8.1

Tracking

(thunderbird_esr91 unaffected, thunderbird99 wontfix, thunderbird100 affected)

People

(Reporter: j.r.andresen, Unassigned)

Details

(Keywords: dataloss)

Attachments

(3 files)

User Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:94.0) Gecko/20100101 Firefox/94.0

Steps to reproduce:

Mail corrupted. Multiple (now as many as 5) messages appear in a single entry. Things are getting worse instead of better.
TB is still crashing on a regular basis. Inbox repair doesn't have any effect on the issue(s).

Expected results:

It seems as though the SOM (start of message) and EOM (end of message) aren't being processed correctly.

Component: Untriaged → Database
OS: Unspecified → Windows 8.1
Product: Thunderbird → MailNews Core
Hardware: Unspecified → x86_64

bug 1742049 comment 1 reports "You'll have to preform Repair Folder to get the .msf file corrected."

Is 95 still generating bad messages after doing that?

Flags: needinfo?(j.r.andresen)

95.0b4 still is.

Repair doesn't correct it.

Flags: needinfo?(j.r.andresen)

I think TCW is also seeing this

Severity: -- → S2
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: dataloss

I am confirming it as well. It's not as bad as it used to be. Meaning, it's not doing it on EVERY message.

I was able to repro this way using my GMail account:

  1. Send yourself a test message with something in the Subject field and something in message body window. It should arrive in your Inbox intact and un-mangled. This will be our 1st test message
  2. Go to your Sent Mail folder, click on the sent Test message you just sent yourself and do an Edit As New Message action to it
  3. Re-Send the same test message once more to yourself so that now you have two of the same test messages in your Inbox
  4. Go to your Inbox now and view the new (2nd) test message you just re-sent. It should appear ok and un-mangled
  5. Go view the 1st test message you sent. It should appear mangled now

In my case, the original 1st test message is now mangled but the 2nd one is still ok

Severity: S2 → --
Status: NEW → UNCONFIRMED
Ever confirmed: false
Keywords: dataloss
Summary: Bug 1734847 NOT FIXED → Bug 1734847 (.msf corruption) NOT FIXED
Severity: -- → S2
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: dataloss
See Also: → 1734847

J R, Aureliano,

does comment 4 match your steps to reproduce?

Flags: needinfo?(j.r.andresen)
Flags: needinfo?(euryalus.0)

Hi Wayne.
Comment 4 is not my STR. I don't have clear STR. This happens to me not in my Google accounts but on my Microsoft Outlook 365 account. I have noticed that it happens if, while I am doing a search, many messages arrive at the same time in my IMAP Inbox.
Anyway, "Repair Folder" does not fix the issue at all.

I have tried with a fresh new profile with TB95.0b4 and I had the same problem.
I have tried with a fresh new profile with TB91.3.2 and I didn't have any problems.

Flags: needinfo?(euryalus.0)

I can confirm I can reproduce corrupted results using the steps in comment 4.
I have 384 emails today with dozens containing the corruptions.

Flags: needinfo?(j.r.andresen)

(In reply to J R Andresen from comment #7)

I can confirm I can reproduce corrupted results using the steps in comment 4.
I have 384 emails today with dozens containing the corruptions.

Are the ones for which you see the corruption from the same sender? Or is it randomly spread out? Meaning, are you seeing any kind of pattern?

Odd as it sounds, my STR from comment 4 seemed to indicate (to me) that something to do with the same sender, same subject and same message body content might have been triggering this issue. But today I got a slew of messages with subject "[Bug 1714846] High CPU consumption when downloading multiple .mp4 files" for which I thought I would certainly see some corruption (same sender, same subject) in the descending / older bunch of message but....nothing. Could the variable here be that the message body is different so it doesn't trigger?

So what's the commonality here and how were J R Andresen and I able to repro so easily?

They are from many senders and have many subjects. Today's are from commercial advertisers. I can forward you a few if you like. Let me know where.

(In reply to J R Andresen from comment #7)

I can confirm I can reproduce corrupted results using the steps in comment 4.
I have 384 emails today with dozens containing the corruptions.

Is this also what is causing your crashes?

Flags: needinfo?(j.r.andresen)

(In reply to J R Andresen from comment #9)

they are from many senders, many subjects. Todays are from commercial advertisers. I can forward you a few if you like. Let me know where?

No, no need to send. By chance are you seeing crashes like in bug 1742590?

Duplicate of this bug: 1743856

What's the next possible item to focus on, or thing(s) we need more information about?

Sorry to use the word "mess", but it is pretty clear from https://mzl.la/3xQU1Dl (this is just version 95 bug reports), even though lately there hasn't been chatter in https://thunderbird.topicbox.com/groups/beta

  • bug 1734157 Thunderbird beta loses e-mails. Going back to regular doesn't recover lost e-mails
  • bug 1740319 Loading some messages from IMAP server cause TB 95.0b1 & 95.0b2 OOM, @ OutOfMemory crash
  • bug 1740486 Attempting a Repair Folder operation fails
  • bug 1740846 IMAP mails displaying other emails in message body
  • bug 1741004 After deleting message and compacting folder, restores duplicated mail
  • bug 1741874 Unified (virtual) Sent folder fails to search in GMail, it gets unchecked
  • bug 1742321 Crash [@ OutOfMemory | large ]
  • bug 1742590 If running a Repair Folder operation and attempting to shut down, crash @ mozilla::`anonymous namespace'::RunWatchdog / @ nsMsgAttachmentData::~nsMsgAttachmentData occurs, @ MimeInlineTextHTMLParsed_parse_line
  • bug 1742782 TB 95.0b4 64bit consumes very large amounts of RAM (up 8G and increasing)
  • bug 1742794 when compacting the INBOX of an IMAP-mailbox the nstmp-file will reach up to 550 GB
  • bug 1743388 Drag & Drop of collapsed email thread from Message List to a Folder not always transferring the entire thread of message when Keep Filter and Quick Filter are applied
  • bug 1744080 Getting daily notification of compacting folders
Flags: needinfo?(benc)
Priority: -- → P1

(In reply to Arthur K. [He/Him] from comment #11)

(In reply to J R Andresen from comment #9)

they are from many senders, many subjects. Todays are from commercial advertisers. I can forward you a few if you like. Let me know where?

No, no need to send. By chance are you seeing crashes like in bug 1742590?

My crashes occur when I simply select a message. It's a random occurrence; at times I can repair the inbox and move on. Other times it will continue to crash on the same message selection.

Flags: needinfo?(j.r.andresen)

(In reply to J R Andresen from comment #14)

(In reply to Arthur K. [He/Him] from comment #11)

(In reply to J R Andresen from comment #9)

they are from many senders, many subjects. Todays are from commercial advertisers. I can forward you a few if you like. Let me know where?

No, no need to send. By chance are you seeing crashes like in bug 1742590?

My crashes occur when I simply select a message. It's random occurrence, at times I can repair the inbox and move on. Other times it will continue to crash on the same message selection.

Hmm, that is different than mine for sure. Do you have a recent crash report from Help > More Troubleshooting Information > Crash Reports for the Last 3 Days that you can post a link to?

Attached file reports

Crash Reports for the Last 3 Days
Report ID Submitted
bp-1146f5bc-f927-483a-9222-9e6d71211204 3 hours ago
bp-a8abe09a-7606-465b-a7e0-0b24c1211203 3 hours ago
bp-a731d87b-fb29-464a-b14a-89c8d1211203 3 hours ago
bp-b8c03cea-e92a-4c8a-8e27-d44851211203 4 hours ago
bp-f86bbd72-cf84-418d-afdd-8044c1211203 12 hours ago
bp-ce0dc048-03aa-493a-85ba-8739e1211203 12 hours ago
bp-6bdc7433-4c50-4a66-b994-50dd41211203 12 hours ago
bp-c8c2ea07-584c-4ff6-931b-8f8001211203 13 hours ago
bp-c5678646-39d4-43c0-be08-a60cd1211203 23 hours ago
bp-6f1f32b9-e47d-417f-bdec-e269f1211203 23 hours ago
bp-c150786c-fb5d-4029-aac2-7269f1211203 24 hours ago
bp-5e4690b1-a6cb-4bee-aa7e-ec2231211203 24 hours ago
bp-e843a298-ce49-4414-bf0a-9de641211203 1 day ago
bp-47258a62-6d32-408d-9095-009361211202 1 day ago
bp-8fbf5f05-6a5d-4d62-89c9-db40e1211202 1 day ago
bp-2b00d1c4-8218-4966-80b3-718091211202 1 day ago
bp-76d2f650-f0c4-4c03-b0fd-947731211202 1 day ago
bp-0fce0cda-bf8b-423c-96e2-ee9eb1211202 1 day ago
bp-7f66279c-a07d-45da-a7e5-fb7141211202 1 day ago
bp-93153d71-39b3-4333-853b-20eb71211202 1 day ago
bp-957cfbc3-f4b6-46dc-85c8-6858c1211202 1 day ago
bp-090bdbc6-3657-40ef-a5f2-092041211202 1 day ago
bp-25f5686c-f839-4aeb-b21a-e7c261211202 2 days ago
bp-e23cf01b-aa79-4f39-98bc-60af51211201 2 days ago
bp-7ad7ba65-a8db-4f83-8624-21f8e1211201 2 days ago
bp-8a6595c7-e6c5-41fb-962f-66f9c1211201 2 days ago
bp-47099ee4-1bfd-48cb-9d07-339921211201 2 days ago
bp-33bd9ffc-c8d8-4be3-858c-a21e41211201 2 days ago
bp-3089301e-8b31-4e28-9cd0-083931211201 2 days ago
bp-c9b2dad8-a8e5-4bb3-a0fb-939611211201 2 days ago
bp-01ad2414-51b3-4ea1-b736-278c61211201 2 days ago
bp-d01df61e-74e6-44f5-a8d9-ab7531211201 2 days ago
bp-9f2e45d6-cf36-4f60-955c-348ba1211201 2 days ago
bp-6866a90a-d6b5-4a27-91c6-386281211201 2 days ago
bp-a91ef240-5d08-4391-81f2-33f9c1211201 2 days ago
bp-1cbdd833-13b9-4bd7-98f1-c76781211201 2 days ago
bp-200e6841-449c-4a8f-b6b7-0abd41211201 2 days ago
bp-e0c926fa-621f-466d-a0e8-59fe31211201 2 days ago
bp-bf874292-a3c6-4e79-82bf-e794c1211201 3 days ago
bp-9e03cef8-c3d8-4563-bcc7-5c1301211201 3 days ago
bp-6848512e-20f6-4c36-b5a6-3261c1211201 3 days ago
bp-03ff1ebf-d847-4c04-a97b-f22151211201 3 days ago
bp-a4ade261-a7c9-4df3-99c5-892881211201 3 days ago
bp-0006bc6d-8dc3-49a2-950d-ca1721211201 3 days ago
bp-f627a622-ab6b-48cf-a15e-4abcc1211201 3 days ago
bp-333ccbe7-58b7-4658-99b3-7f3fb1211201 3 days ago
bp-5f87d9f2-1801-4fd8-bf1b-c3b721211201 3 days ago
bp-a84ca050-7d8b-4656-984f-01a501211201 3 days ago

A random sampling of these seem to point to something different than what I am seeing but most are "OutOfMemory | large or small" it seems. Is this 32-bit TB and 32-bit Windows 7?

https://crash-stats.thunderbird.net/report/bp-a84ca050-7d8b-4656-984f-01a501211201 (@ nsBidiPresUtils::TraverseFrames)
https://crash-stats.thunderbird.net/report/bp-1146f5bc-f927-483a-9222-9e6d71211204 (@ OutOfMemory | large)
https://crash-stats.thunderbird.net/report/bp-5f87d9f2-1801-4fd8-bf1b-c3b721211201 (@ OutOfMemory | small)
https://crash-stats.thunderbird.net/report/bp-e0c926fa-621f-466d-a0e8-59fe31211201 (@ mozilla::ArenaAllocator<1024,4>::Allocate)
https://crash-stats.thunderbird.net/report/bp-090bdbc6-3657-40ef-a5f2-092041211202 (@ mozilla::dom::FontFaceSet::UpdateRules)
https://crash-stats.thunderbird.net/report/bp-93153d71-39b3-4333-853b-20eb71211202 (@ mozilla::ArenaAllocator<8192,8>::Allocate)

64bit OS Windows 8.1 12GBmemory
32bit TB

additional observation:

When TB is started in Windows, it's a 120 MB process as viewed via Task Manager.
I select an email message to view (one that is only a couple of KB on the server) and the TB process size rapidly grows to over 2.5 GB, then crashes.
I can repeat this using the same message over and over.

(In reply to J R Andresen from comment #19)

64bit OS Windows 8.1 12GBmemory
32bit TB

Yeah. it's going to be OOM crash city with 32-bit TB I would surmise. Any reason you haven't switched to 64-bit TB to leverage your system resources? Probably would help with the OOM but not with the actual bug I'm afraid.

I select an email message to view(that is only a couple Kb on the server) and the TB process size rapidly grows to over 2.5Gb then crashes.

I usually see this behavior only when I click on an unread and attempt to exit TB. I'm on 64-bit TB but my process will grow past 10GB and then crash.

JR,

I don't know if it's of any help to the devs here but do you know how to capture a perf profile?

Try this:

  1. CTRL-SHIFT-i to open Developer Tools. Accept the incoming Connection message when prompted
  2. in the Developer Tools window, press F1
  3. in the Default Developer Tools window, click the Performance check-box at top left. It will create a Performance tab at top middle of the screen
  4. Click on the Performance tab and you'll see a Start Recording Performance button. Don't click it yet
  5. Switch back to TB and click on the email that will start growing mem usage and cause TB to crash and then QUICKLY switch back and click the Start Recording Performance
  6. Record about 10 seconds and then stop the recording
  7. At top left, there will be a blue "Recording #1" box with a Save option next to it, click Save to save the recording to a JSON file
  8. Upload the JSON file to this bug report. You may have to .ZIP it as it might be large uncompressed

Note, even though I asked about the crashes, they will probably not be of great interest because they are just a symptom of the cause, which is (presumably) corrupted msf. I think the same could be said of performance info.

We need Ben or Magnus, or some other developer, to weigh in.

(In reply to Wayne Mery (:wsmwk) from comment #23)

Note, even though I asked about the crashes, they will probably not be of great interest because they are just a symptom of the cause, which is (presumably) corrupted msf. I think the same could be said of performance info.

We need Ben or Magnus, or some other developer, to weigh in.

I'm sure you're on point here. It's more for just validating what we probably already assume is the primary cause. More data probably isn't a bad thing though. You never know when some other unrelated bug(s) will get uncovered as a result.

For what it's worth, I first saw this on October 27th. I keep trying to pull up the email where I first saw it happen to figure out what build I was running that day but it's futile. I'll keep trying.

Yes, it's entirely possible there is more than one bug here. I'm just suggesting we don't need to kill ourselves collecting data just yet.

I wasn't able to reproduce it using the steps in Comment 4 :-(

A few questions for Arthur:

  1. Did those steps work reliably, or did you have to perform them a bunch of times to get it to happen? (I tried a few times, but no dice).
  2. Just to confirm: I'm assuming it was via IMAP, right?
  3. Would you be able to send me the tail end of the mbox file for your inbox? Whatever you feel comfortable sending, but at least the last bit containing those two messages (i.e. including the one that appears corrupted)?

The mbox file will be somewhere like <your profile dir>/ImapMail/imap.gmail.com/INBOX. It's probably easiest if the corrupted messages were composed as text rather than html (easier to parse manually!), but anything is fine!

My current working theory is that there is some oddness with the "From " lines separating messages in the mbox file, so it'd be nice to confirm this and come up with a nice simple test case to fix.
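For anyone who wants to check their own mbox before sending it, here is a minimal Python sketch of how the "From " lines could be inspected. It is only an illustration under assumptions: the sample separator format is made up, the helper name `find_from_lines` is hypothetical, and this is not how Thunderbird itself parses the file. It lists every line starting with "From " together with its byte offset and whether the line before it was blank (a separator normally follows a blank line, so a hit without one is worth a closer look).

```python
def find_from_lines(raw: bytes):
    """Return (offset, text, preceded_by_blank) for each line starting with 'From '."""
    hits = []
    offset = 0
    prev_blank = True  # treat the start of the file as a valid message start
    for line in raw.splitlines(keepends=True):
        stripped = line.rstrip(b"\r\n")
        if stripped.startswith(b"From "):
            hits.append((offset, stripped.decode("latin-1"), prev_blank))
        prev_blank = stripped == b""
        offset += len(line)
    return hits


# Made-up sample data: two messages, the second with an unescaped
# body line that happens to start with "From ".
sample = (
    b"From - Mon Dec 06 10:00:00 2021\r\n"
    b"Subject: test 1\r\n"
    b"\r\n"
    b"body one\r\n"
    b"\r\n"
    b"From - Mon Dec 06 10:05:00 2021\r\n"
    b"Subject: test 2\r\n"
    b"\r\n"
    b"hello\r\n"
    b"From here on this is body text\r\n"
)

for offset, text, after_blank in find_from_lines(sample):
    marker = "separator?" if after_blank else "SUSPICIOUS (no blank line before)"
    print(f"{offset:6d}  {marker:34s} {text}")
```

To run it against a real file, read the tail of the mbox with `open(path, "rb")` and pass the bytes in; the offsets are then relative to wherever the read started.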

Flags: needinfo?(benc) → needinfo?(thee.chicago.wolf)

Hello, I sent you 2 examples at this address (benc-at-thunderbird.net)
Regards
Nicolas

(In reply to Ben Campbell from comment #28)

I wasn't able to reproduce it using the steps in Comment 4 :-(

A few questions for Arthur:

  1. Did those steps work reliably, or did you have to perform them a bunch of times to get it to happen? (I tried a few times, but no dice).

For me they worked reliably.

  2. Just to confirm: I'm assuming it was via IMAP, right?

Yes, IMAP.

  3. Would you be able to send me the tail end of the mbox file for your inbox? Whatever you feel comfortable sending, but at least the last bit containing those two messages (i.e. including the one that appears corrupted)?

I just bumped up to the test build of 96.0b1. I'll see if it still repros there. If not, I'll revert back to 95.0b5 and try to repro.

The mbox file will be somewhere like <your profile dir>/ImapMail/imap.gmail.com/INBOX. It's probably easiest if the corrupted messages were composed as text rather than html (easier to parse manually!), but anything is fine!

My current working theory is that there is some oddness with the "From " lines separating messages in the mbox file, so it'd be nice to confirm this and come up with a nice simple test case to fix.

I'll try to see if I can sequester the repro email files to a dedicated mbox and then try and send that to you. My Inbox is north of 1.4GB and there's data in there I cannot send outside my org's walls.

Flags: needinfo?(thee.chicago.wolf)

(In reply to Ben Campbell from comment #28)

I wasn't able to reproduce it using the steps in Comment 4 :-(

And now I am not able to repro either with 95.0b5 or 96.0b1. That's frustrating. I assume sending you a mangled email won't do much good?

Hello,
I should specify that my data is on disk D:
Every day my mailbox gets corrupted. There seem to be memory leaks and/or my disk is working a lot.
Disk D:\ does not appear to be broken.
I can send my destroyed INBOX (with good examples) and my new INBOX directly to a developer (10 MB) (not on the forum).
Every day my mailbox is corrupted.
Nicolas

Sure, you can send it to me and Ben. (Please refer to bug 1742975 - this bug)

(In reply to Arthur K. [He/Him] from comment #31)

And now I am not able to repro either with 95.0b5 or 96.0b1. That's frustrating. I assume sending you a mangled email won't do much good?

Hmm... the symptoms in the original description match up so well with those in Bug 1734847. I thought the fix from that was included in 95.0b4, but comment 2 suggests it was still happening there :-(
I'm wondering if it was fixed in 95.0b4, but the effect was being masked by problems in folder repair? (Bug 1740486)

(In reply to Paour from comment #32)

I can send my destroyed INBOX (with good examples) and my new INBOX directly to a developer (10 MB) (not on the forum)

Thanks Nicolas - I received your example files.

Which version of Thunderbird are you running? The mixed-up emails you're seeing do match Bug 1734847 symptoms.
I suspect that a repair folder would sort things out for you. You mentioned you weren't sure how to do that. Try this:

  • Right-click on Inbox and choose "Properties"
  • Click "Repair Folder"

(please excuse the English-centric instructions ;- )

(In reply to Ben Campbell from comment #34)

(In reply to Arthur K. [He/Him] from comment #31)

And now I am not able to repro either with 95.0b5 or 96.0b1. That's frustrating. I assume sending you a mangled email won't do much good?

Hmm... the symptoms in the original description match up so well with those in Bug 1734847. I thought the fix from that was included in 95.0b4, but comment 2 suggests it was still happening there :-(
I'm wondering if it was fixed in 95.0b4, but the effect was being masked by problems in folder repair? (Bug 1740486)

There could be these two possibilities as you say, but it's contingent upon others who've bumped to 95.0b4/b5 and subsequently run a folder repair with success to mostly eliminate bug 1734847 from the picture. I would LOVE to hear other user experiences as it relates to bug 1740486. I'm going to be a bit shocked if I am the only one having that problem.

I'm presently running a folder repair (that I'd intended to run last night) using 96.0b1 which I started at 9:11AM today. The repair/download operation is running far faster than with 95.0b5 and presently has about 12k of 49k+ left to process. After it's done, I hope to see a couple things: 1) no more mangled messages since bug 1734847 should be fixed and 2) no more CPU use after my currently running repair finishes.

If after finishing a repair operation the CPU use keeps going, I would also surmise that a perf profile against 96.0b1 won't do any more good than the ones I already submitted in bug 1740486?

Hello, I have a lot of "mixed" emails in my Gmail IMAP inbox every day; it started with TB 94b1.
I tried to reconstruct the inbox many times, but it didn't solve the issue.
Feel free to ask me for more information to help with this.

(In reply to Ben Campbell from comment #35)

Which version of Thunderbird are you running?
96.0b1 (the bug started with the last update, 95.0b3 -> 95.0b4; see my duplicate bug https://bugzilla.mozilla.org/show_bug.cgi?id=1743856)

You mentioned you weren't sure how to do that. Try this:
No, I did this many times without effect (see https://bugzilla.mozilla.org/show_bug.cgi?id=1743856#c2)

As my connection is IMAP, I also completely deleted INBOX and INBOX.msf.
The rebuild is OK at the start, but after 10 minutes (update of the subfolder) INBOX becomes corrupted.
I should also mention that I have 5 mailboxes in Thunderbird and another mailbox is corrupted too.

Don't hesitate to ask for tests, I'm available until Christmas

Nicolas

I'm presently running a folder repair (that I'd intended to run last night) using 96.0b1 which I started at 9:11AM today. The repair/download operation is running far faster than with 95.0b5 and presently has about 12k of 49k+ left to process. After it's done, I hope to see a couple things: 1) no more mangled messages since bug 1734847 should be fixed and 2) no more CPU use after my currently running repair finishes.

Well, that was a bust. CPU is still churning away. Anything you'd like me to try?

(In reply to Fernando Hartmann from comment #37)

Hello, I have a lot of "mixed" emails in my Gmail IMAP inbox every day, it started with TB 94b1.
I tried to reconstruct inbox many times, but it don't solved the issue.
Feel free to ask me more informations to help on this.

I forgot to mention that I'm now running TB 96b1.
And starting in TB 95, I'm experiencing a lot of OOM crashes, mainly when accessing emails in these corrupted mailboxes. Some samples:

(In reply to Fernando Hartmann from comment #40)

(In reply to Fernando Hartmann from comment #37)

Hello, I have a lot of "mixed" emails in my Gmail IMAP inbox every day, it started with TB 94b1.
I tried to reconstruct inbox many times, but it don't solved the issue.
Feel free to ask me more informations to help on this.

I forgot to mention that I'm now running TB 96b1
And starting in TB 95, I'm experiencing a lot o OOM crashes mainly during accessing emails in this corrupted mail boxes, some samples:

I've seen this on my machine as well. Depends on the corrupted email though. Once I saved one of them and it turned into a 3GB .eml file, so I imagine even a well-equipped and modern machine could still OOM trying to bring up an even more corrupted one.

Hello,
I suggest this workaround for end users who use an IMAP server:

1- close Thunderbird
2- open the ImapMail folder in your profile

  • remove all .msf files (e.g. imap.free.fr.msf, imap1.free.fr.msf)
  • for each subfolder, remove all files (i.e. the imap.free.fr and imap1.free.fr folders should be empty)

3- start Thunderbird

===========================================================
In my case, I suppose one of my mails in a subfolder (Sent or any other folder) was corrupted.
Deleting the whole mailbox resolved the problem!

If the workaround doesn't work:

1- uninstall Thunderbird
2- export your calendar -> ics
3- save abook.sqlite (address book) from your profile
4- remove your profile
5- install Thunderbird
6- set up your profile and import your ICS calendar
7- close Thunderbird
8- replace abook.sqlite

That's all ...
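For anyone nervous about deleting files by hand, the .msf part of the workaround above could be done more conservatively by moving the index files aside instead of removing them. The following is a minimal Python sketch under assumptions: the helper name `backup_msf_files` and the directory names are hypothetical, it touches only `*.msf` files (not the mbox files in the subfolders), and Thunderbird must be closed before running anything like it on a real profile. The demo at the bottom runs on a throwaway temporary directory, not a real profile.

```python
import shutil
import tempfile
from pathlib import Path


def backup_msf_files(imap_mail_dir, backup_dir):
    """Move every *.msf file under imap_mail_dir into backup_dir, keeping relative paths."""
    imap_mail_dir = Path(imap_mail_dir)
    backup_dir = Path(backup_dir)
    moved = []
    # sorted() materialises the list before anything is moved.
    for msf in sorted(imap_mail_dir.rglob("*.msf")):
        target = backup_dir / msf.relative_to(imap_mail_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(msf), str(target))
        moved.append(target)
    return moved


# Demo on a made-up directory layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "ImapMail"
    account = root / "imap.example.invalid"
    account.mkdir(parents=True)
    (root / "imap.example.invalid.msf").write_text("index")
    (account / "INBOX.msf").write_text("index")
    (account / "INBOX").write_text("mbox data")

    moved = backup_msf_files(root, Path(tmp) / "msf-backup")
    moved_names = sorted(p.name for p in moved)
    remaining = sorted(p.name for p in root.rglob("*") if p.is_file())
    print("moved:", moved_names)
    print("left in place:", remaining)
```

If the problem persists after Thunderbird rebuilds the indexes, the moved files can simply be discarded; they are only indexes, not the messages themselves.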

I tried today with TB 97.0a1 (2021-12-22) (64-bit) and I still encounter the same problem reported in related issue #1734847 (which is closed as verified...): mails are messed up because one mail contains several unrelated mail bodies. Neither Repair Folder nor deleting the msf files (as stated in the previous comment) solves the issue.

Is anyone addressing the issue? It's been 4 months.

It is acknowledged as a bad problem, so we were just discussing this at today's community meeting. The challenge is that a) we don't know which code changed the behavior and b) a developer has not been able to reproduce the issue, both of which would obviously help lead to a solution.

Those who can reproduce can help:

If that route doesn't get progress, then perhaps Ben can provide a special build.

(In reply to Wayne Mery (:wsmwk) from comment #45)

It is acknowledged as a bad problem, so we were just discussing this at today's community meeting. The challenge is that a) we don't know which code changed the behavior ...

Is that so? Looking at BMO references, this bug is a continuation of bug 1734847 which was regressed by bug 1728924. In fact, a backout of the latter was attempted, see bug 1734847 comment #33. I have the impression that code that was removed in bug 1728924 (https://hg.mozilla.org/comm-central/rev/2c8857af0eb3) was in fact needed. What's wrong in this line of argument?

The emphasis here has been on MSF and MSF corruption, but if repair does not fix the issue, I do wonder whether the actual storage is where the issue lies. Initial MSF corruption should be fixed, as a new MSF is generated in the repair. I think someone needs to see the actual mbox store behind the msf to see if corruption, or duplication of message leader (FROM) information, is the issue.

Generally in support it is standard procedure to check if the issue can be replicated without antivirus scanning in the profile folder, or in the operating system's safe mode with networking. Corruption issues usually have an external cause. But that does not appear to have been investigated at all here.

Further checks include whether the storage location used is a local internal drive and whether the profile is in the default location. The local location must also not be subject to streaming backups, cloud synchronisation or network storage. As all of those things have led to issues in the past, perhaps we can get these things clarified here.

Additionally, has the account's local directory been modified from the default? There is no point having the profile locally if the directory is pointing to the Documents folder (more common than it should be; the usual excuse is to facilitate backups) or some cloud-synchronised location.

(In reply to newsfan from comment #46)

(In reply to Wayne Mery (:wsmwk) from comment #45)

It is acknowledged as a bad problem, so we were just discussing this at today's community meeting. The challenge is that a) we don't know which code changed the behavior ...

Is that so? Looking at BMO references, this bug is a continuation of bug 1734847 which was regressed by bug 1728924. In fact, a backout of the latter was attempted, see bug 1734847 comment #33. I have the impression that code that was removed in bug 1728924 (https://hg.mozilla.org/comm-central/rev/2c8857af0eb3) was in fact needed. What's wrong in this line of argument?

I'm not arguing against your point, but if you are correct I have these questions:

  • Which line(s) of that patch might be at fault?
  • AFAICT the problems didn't start until beta 95, but the patch shipped in beta 93. Why the gap in time between landing and reporting?

Given that it cannot be reproduced by a developer we need to be more creative and try something, i.e. anything, because this cannot continue. What is next to try, a try build with a backout to confirm that in fact this code block helps those who CAN reproduce the problem? If not, then what?

p.s. and once the problem is identified there is clearly a need for an automated test

Flags: needinfo?(benc)

AFAICT the problems didn't start until beta 95, but the patch shipped in beta 93.

The issue was first reported for TB 93 in bug 1730676 which was made a duplicate of bug 1734847. There is no gap in reporting. Strangely enough a developer reproduced the issue in bug 1734847 and the same person said: "I think I found a test case today" referring to this bug here:
https://thunderbird.topicbox.com/groups/developers/Tb67ca24581814a31-Mb72a225403851eda401d6e6d

Disclaimer: This is just assembling the published information, no own testing done. Maybe there are multiple issues. That mbox repair allegedly doesn't work any more is additionally worrying.

Hello,

  • AFAICT the problems didn't start until beta 95, but the patch shipped in beta 93. Why the gap in time between landing and reporting?

As I posted in comment 40, I started to have these problems in 94b1.

Of course, as a non-developer, I may have a naive opinion. I can imagine the difficulty of narrowing down what causes the problem, but I really can't understand why using "Repair Folder" doesn't work!
At least in my case, even right after using Repair Folder, the messages are downloaded but the mixing problem is still there, on the same messages that were mixed before!

Thanks for your time!

Generally in support it is standard procedure to check if the issue can be replicated without antivirus scanning in the profile folder, or in the operating system's safe mode with networking. Corruption issues usually have an external cause. But that does not appear to have been investigated at all here.

For my part I can certify that I had no virus (or other external actions).
On the other hand, I have many sub-folders and six e-mail accounts, and after the incriminated update (I applied all of them), I happened to stop the Thunderbird process because it was taking too long (I'm a bad user!).
This is probably the cause of the first corruption, because I think I then corrupted an email or a file, and only my reset procedure (https://bugzilla.mozilla.org/show_bug.cgi?id=1742975#c42) solved the problem.
I haven't had any problems since (and I no longer stop the Thunderbird process).
In my outside opinion, there are two things to address:
1- the initial cause of the problem
2- the eradication of the problem, even with a stable version, if an msf subfolder remains corrupted. Maybe offer an option to repair ALL msf folders

So, I don't know if it's of any help but I just got a spam message today and noticed that the header is not being split out of the message.

Months ago when this issue first manifested, I feel like it was just pulling in the subject from the succeeding message in the Inbox. Today, I think I observed something that seems different than before. Today, I clicked on the spam message and saw that it is pulling in the header and subject from the >PRECEDING< message that wasn't even read yet.

That was a first. I used to have to read unread messages from oldest to newest to see this. I am on 96.0b4 x64. I attached the message here in case it's of any use for analysis. It seems like it doesn't know where one message ends and the next one begins, and just rolls it all into one.

OK, there's lots of different things all going on at once, so it's time for a bit of a recap.

  1. The patch in Bug 1728924 landed. This made some changes to the code that copies messages from elsewhere to a local mail store (ie to a local folder). There's always been an assumption that all local messages are in mbox files, and all the code assumes it has direct access to the raw file (which is problematic for all kinds of reasons). This patch removed some file seeks to loosen this assumption.
    Unfortunately, it turns out that the messageparser the copy code uses is reused without being reinitialised if multiple messages are being copied. The end result is that the .msf database records the wrong message offsets/size for subsequent messages.
    So, if multiple messages come in at once (from IMAP, say) and are moved to a local folder by a filter rule, the first one will be fine, but the others will appear screwed up (because of the borked offset/size in the .msf). The backing local mbox file should be OK though. If those screwed-up messages are then copied to another local folder, then the borked offset/size is used and the resultant mbox will contain screwed-up messages :-(
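
    To make the failure mode above concrete, here is a toy Python sketch of a parser that is reused across messages without being reinitialised. Everything in it is hypothetical (`ToyMessageParser`, `copy_messages`, and the way offset and size are tracked); it is not the Thunderbird code, and the exact wrong values the real bug recorded may well differ. It only shows how stale parser state gives a correct entry for the first message and wrong offset/size entries for the ones after it, while the stored bytes themselves stay intact.

```python
class ToyMessageParser:
    """Hypothetical stand-in for a parser that tracks a message's offset and size."""

    def __init__(self):
        self.reset(0)

    def reset(self, start_offset):
        self.start_offset = start_offset
        self.size = 0

    def feed(self, data):
        self.size += len(data)

    def finish(self):
        return (self.start_offset, self.size)


def copy_messages(messages, reinitialise):
    """Append messages to a store; return the store and the (offset, size) index."""
    store = bytearray()
    parser = ToyMessageParser()  # one parser shared by the whole batch
    index = []
    for msg in messages:
        if reinitialise:
            parser.reset(len(store))  # the step that must not be skipped
        parser.feed(msg)
        store += msg
        index.append(parser.finish())
    return bytes(store), index


batch = [b"AAAA\n", b"BBBBBB\n", b"CC\n"]
store_ok, index_ok = copy_messages(batch, reinitialise=True)
store_bad, index_bad = copy_messages(batch, reinitialise=False)

print("with reset:   ", index_ok)
print("without reset:", index_bad)

# Reading message 2 through the bad index returns message 1 and 2 glued together.
off, size = index_bad[1]
print(store_bad[off:off + size])
```

    In this toy version the stored bytes are identical in both runs; only the index differs, which mirrors the point that the backing mbox should be OK until a message is copied again using the bad offset/size.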

  2. The regression was tracked down and fixed in Bug 1734847. However, before this happened, a bunch of changes were made that rely on the loosening of the "everything is an mbox file" assumption that we're working toward. For example, the protocol-independent message quarantining was moved out of the POP3 code (quarantining means just single messages get embargoed by anti-virus, rather than the entire folder).
    These changes are what caused the attempted backout of the Bug 1728924 patch to fail.

  3. Because the originally-bad patch of Bug 1728924 was out in the tree for a while without the Bug 1734847 fix... a bunch of people ended up with scrambled messages. This should be fixable by "repair folder", but it looks like there are some issues there too (Bug 1740486). I'm not sure that is related - it doesn't happen to everyone, so it might just be that there was already a folder-repair bug for some messages, but the sudden rash of people doing folder-repair has brought more cases to light...
    Worth noting that folder-repair is a completely different operation for local folders than for IMAP. For local folders it just rebuilds the .msf file from the mbox file. For IMAP it re-downloads the messages.
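
To illustrate item 1, here is a toy sketch in Python (not Thunderbird's actual C++ code) of how reusing a parser without reinitialising it can corrupt the index entries while leaving the mbox itself intact. `ToyParser`, `copy_messages` and the `reinit` flag are hypothetical names invented for this illustration, and the model simply assumes that the stale state is the message start offset and the running size.

```python
class ToyParser:
    """Toy stand-in for a message parser; not Thunderbird's real class."""

    def __init__(self):
        self.init(0)

    def init(self, start_offset):
        # Reset the per-message state: where the message starts, bytes seen.
        self.start = start_offset
        self.size = 0

    def feed(self, data):
        # Accumulate the size of the message currently being parsed.
        self.size += len(data)

    def finish(self):
        # The (offset, size) pair that would be recorded in the index.
        return (self.start, self.size)


def copy_messages(messages, reinit):
    """Append messages to an in-memory 'mbox'; return (mbox, index).

    With reinit=False the single parser is reused without being reset,
    which is the (assumed) shape of the regression described above.
    """
    mbox = bytearray()
    index = []
    parser = ToyParser()
    for msg in messages:
        if reinit:
            parser.init(len(mbox))
        parser.feed(msg)
        mbox += msg
        index.append(parser.finish())
    return bytes(mbox), index


msgs = [b"aaaa", b"bb", b"cccccc"]
mbox_good, index_good = copy_messages(msgs, reinit=True)
mbox_bad, index_bad = copy_messages(msgs, reinit=False)
print(index_good)  # [(0, 4), (4, 2), (6, 6)]
print(index_bad)   # [(0, 4), (0, 6), (0, 12)]
```

In this model the first entry is correct either way, later entries swallow the earlier messages, and the mbox bytes are identical in both runs, which matches the symptoms: several messages showing up in a single entry while the backing file is still OK.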

Phew.

Next steps:
I'm pretty confident that Bug 1734847 fixes the Bug 1728924 regression. My suspicion is that most of the problems people are having now are due to a combination of data being borked before the fix went in, combined with folder repair not working as it should (very hypothetical example: maybe a badly-formatted message on an IMAP server throwing the folder repair into an endless loop).
So for now, unless we can nail down a replicatable case of new corruption in non-borked folders, I'm going to focus on Bug 1740486, and make sure folder repair is working properly.
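
Since folder repair for local folders rebuilds the .msf from the mbox file, the core of that rebuild can be sketched roughly as below. This is a minimal Python illustration, not the real repair code: `rebuild_index` is a hypothetical helper, and it assumes that every line starting with "From " is a message separator (body lines being escaped as ">From "). The real repair also has to parse headers and restore per-message flags, which this sketch ignores.

```python
def rebuild_index(mbox: bytes):
    """Return a list of (offset, size) pairs, one per message in the mbox.

    Assumption: a line starting with b"From " begins a new message.
    """
    starts = []
    pos = 0
    while pos < len(mbox):
        # pos always points at the start of a line here.
        if mbox.startswith(b"From ", pos):
            starts.append(pos)
        newline = mbox.find(b"\n", pos)
        if newline == -1:
            break
        pos = newline + 1
    # Each message runs up to the next separator, or to the end of the file.
    ends = starts[1:] + [len(mbox)]
    return [(start, end - start) for start, end in zip(starts, ends)]


first = b"From - Mon\nSubject: a\n\nhello\n\n"
second = b"From - Tue\nSubject: b\n\n>From here\n\n"
print(rebuild_index(first + second))
```

Because the offsets are recomputed purely from the file contents, a rebuild like this would discard any bad offset/size pairs left in the old index, provided the mbox itself was not damaged by a later copy.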

[UPDATE: updated links to Bug 1734847, with the regression fix. They originally linked to this bug by mistake]

Flags: needinfo?(benc)

Ben, please edit the previous comment and use the correct bug numbers. Bug 1742975, which it links to, is this very bug.

I have a question about the .msf corruption. Does it matter where the mail store file and .msf file are located? I seem to have no problem if the files are in Local Folders, but I have the problem if the files are in my pop.att.yahoo.com directory (e.g., Inbox, Sent, Drafts). Thanks.

(In reply to Arthur K. [He/Him] from comment #21)

(In reply to J R Andresen from comment #19)

64bit OS Windows 8.1 12GBmemory
32bit TB

Yeah, it's going to be OOM crash city with 32-bit TB, I would surmise. Any reason you haven't switched to 64-bit TB to leverage your system resources? It would probably help with the OOM, but not with the actual bug, I'm afraid.

This might be a good reason to look into finally moving Thunderbird to 64 bit when possible, see:
Bug 1556748

(In reply to Wayne Mery (:wsmwk) from comment #48)
...

Given that it cannot be reproduced by a developer we need to be more creative and try something, i.e. anything, because this cannot continue. What is next to try, a try build with a backout to confirm that in fact this code block helps those who CAN reproduce the problem? If not, then what?

See above.

This is how one of my emails looks in Thunderbird. There is nothing in the top part (sender, etc.):

left:10px; padding-right:10px">
=20
=20
<!--[if !((mso)|(IE))]><!-- -->
<div class=3D"hse-column-container" style=3D"min-width:280px; max-wid=
th:600px; width:100%; Margin-left:auto; Margin-right:auto; border-collapse:=
collapse; border-spacing:0; background-color:#FFFFFF; padding-top:15px" bgc=
olor=3D"#FFFFFF">
<!--<![endif]-->
=20
<!--[if (mso)|(IE)]>
<div class=3D"hse-column-container" style=3D"min-width:280px;max-widt=
h:600px;width:100%;Margin-left:auto;Margin-right:auto;border-collapse:colla=
pse;border-spacing:0;">
<table align=3D"center" style=3D"border-collapse:collapse;mso-table-l=
space:0pt;mso-table-rspace:0pt;width:600px;" cellpadding=3D"0" cellspacing=
=3D"0" role=3D"presentation" width=3D"600" bgcolor=3D"#FFFFFF">
<tr style=3D"background-color:#FFFFFF;">
<![endif]-->
<!--[if (mso)|(IE)]>
<td valign=3D"top" style=3D"width:600px;padding-top:15px;">
<![endif]-->
<!--[if gte mso 9]>
<table role=3D"presentation" width=3D"600" cellpadding=3D"0" cellspacing=
=3D"0" style=3D"border-collapse:collapse;mso-table-lspace:0pt;mso-table-rsp=
ace:0pt;width:600px">
<![endif]-->
<div id=3D"column_1592509568105_0" class=3D"hse-column hse-size-12">
<table role=3D"presentation" cellpadding=3D"0" cellspacing=3D"0" width=3D=
"100%" style=3D"border-spacing:0 !important; border-collapse:collapse; mso-=
table-lspace:0pt; mso-table-rspace:0pt"><tbody><tr><td class=3D"hs_padded" =
style=3D"border-collapse:collapse; mso-line-height-rule:exactly; font-famil=
y:Arial, sans-serif; font-size:14px; color:#635951; word-break:break-word; =
padding:10px 20px 15px"><div id=3D"hs_cos_wrapper_module_15925095220262" cl=
ass=3D"hs_cos_wrapper hs_cos_wrapper_widget hs_cos_wrapper_type_module" sty=
le=3D"color: inherit; font-size: inherit; line-height: inherit;" data-hs-co=
s-general-type=3D"widget" data-hs-cos-type=3D"module"><div id=3D"hs_cos_wra=
pper_module_15925095220262_" class=3D"hs_cos_wrapper hs_cos_wrapper_widget =
hs_cos_wrapper_type_rich_text" style=3D"color: inherit; font-size: inherit;=
line-height: inherit;" data-hs-cos-general-type=3D"widget" data-hs-cos-typ=
e=3D"rich_text"><p style=3D"mso-line-height-rule:exactly; font-size:14px; l=
ine-height:175%; font-weight:bold"><span style=3D"color: #000000;">

It also has a pop up message. Will attach.

Summary: Bug 1734847 (.msf corruption) NOT FIXED → Bug 1734847 (.msf corruption) NOT FIXED on beta. (NOT VERSION 91.x)
Blocks: 1740319
Duplicate of this bug: 1740319
See Also: → 1734157

I managed to circumvent this bug by unticking "Select this folder for offline use" for all my connected accounts inboxes and then repairing the msf index for all of them.

(In reply to thepcmaniaccc from comment #62)

I managed to circumvent this bug by unticking "Select this folder for offline use" for all my connected accounts inboxes and then repairing the msf index for all of them.

The same steps worked for me

See Also: → 1760931, 1759902, 1761549

Can confirm this is still happening in Thunderbird 100.0b1

Not sure if related, but I did a File:Compact Folders on my Inbox. It came up with one message (could have been more than one) that came in last week, but now was showing today at 2:03, which was the time I ran compact. I was actually looking for that particular message earlier today, and it wasn't showing at all, or was buried somewhere. Other messages are still coming out scrambled. Thunderbird 91.8.1 (64-bit), updated earlier today.

Hello, I am also getting the multi load emails. I did also want to point this out as I believe it is related. I have always moved emails off my exchange server to local folders. In the past the emails would remain the same size as they were on the exchange server. As of late the emails have increased dramatically in size. Example being on the exchange server the email might be between 10 KB and 100 KB. Then when moved to a local folder the email will increase above 500 MB. This can be any email that may only have a few words in it.

Happy to provide any other information that may help.

Thank you very much.

(In reply to MRGSER from comment #66)

Hello, I am also getting the multi load emails. I did also want to point this out as I believe it is related. I have always moved emails off my exchange server to local folders. In the past the emails would remain the same size as they were on the exchange server. As of late the emails have increased dramatically in size. Example being on the exchange server the email might be between 10 KB and 100 KB. Then when moved to a local folder the email will increase above 500 MB. This can be any email that may only have a few words in it.

Happy to provide any other information that may help.

Thank you very much.

And you're using TB 91.8.1? 32-bit? 64-bit?

(In reply to Arthur K. [He/Him] from comment #67)

(In reply to MRGSER from comment #66)

Hello, I am also getting the multi load emails. I did also want to point this out as I believe it is related. I have always moved emails off my exchange server to local folders. In the past the emails would remain the same size as they were on the exchange server. As of late the emails have increased dramatically in size. Example being on the exchange server the email might be between 10 KB and 100 KB. Then when moved to a local folder the email will increase above 500 MB. This can be any email that may only have a few words in it.

Happy to provide any other information that may help.

Thank you very much.

And you're using TB 91.8.1? 32-bit? 64-bit?

Hey Arthur, my apologies here as I should have stated the version. I have used the beta version of TB for many years now. Everything had always been perfect up to the last few months. My current version is 100.0b2 (64-bit). I do see b3 is available so I am going to change to that now.

Let me know if any other info would help.

Thank you once again.

(In reply to Worcester12345 from comment #65)

Not sure if related, but I did a File:Compact Folders on my Inbox. It came up with one message (could have been more than one) that came in last week, but now was showing today at 2:03, which was the time I ran compact. I was actually looking for that particular message earlier today, and it wasn't showing at all, or was buried somewhere. Other messages are still coming out scrambled.
Thunderbird 91.8.1 (64-bit), updated earlier today.

I thought the scrambled or multiple emails joined together was a post-91-beta specific issue. I don't know your version usage history, but if you were running a post-91 beta and then "downgraded" back to 91.8.1, you may need to repair the inbox or other problem folders or, if that doesn't work, remove the mbox and mbox.msf files for the problem folder(s) and let TB re-download and rebuild them.

(In reply to gene smith from comment #69)

(In reply to Worcester12345 from comment #65)

Not sure if related, but I did a File:Compact Folders on my Inbox. It came up with one message (could have been more than one) that came in last week, but now was showing today at 2:03, which was the time I ran compact. I was actually looking for that particular message earlier today, and it wasn't showing at all, or was buried somewhere. Other messages are still coming out scrambled.
Thunderbird 91.8.1 (64-bit), updated earlier today.

I thought the scrambled or multiple emails joined together was a post-91-beta specific issue. I don't know your version usage history, but if you were running a post-91 beta and then "downgraded" back to 91.8.1, you may need to repair the inbox or other problem folders or, if that doesn't work, remove the mbox and mbox.msf files for the problem folder(s) and let TB re-download and rebuild them.

Hey Gene, my apologies as I should have mentioned the version I am on. Currently I am using 100.b3 (64-bit). I have not downgraded yet as I keep thinking one of the updates will correct the problem. However, I do not want it to get too out of hand, and sadly I may be forced to downgrade.

(In reply to MRGSER from comment #70)

(In reply to gene smith from comment #69)

(In reply to Worcester12345 from comment #65)

Not sure if related, but I did a File:Compact Folders on my Inbox. It came up with one message (could have been more than one) that came in last week, but now was showing today at 2:03, which was the time I ran compact. I was actually looking for that particular message earlier today, and it wasn't showing at all, or was buried somewhere. Other messages are still coming out scrambled.
Thunderbird 91.8.1 (64-bit), updated earlier today.

I thought the scrambled or multiple emails joined together was a post-91-beta specific issue. I don't know your version usage history, but if you were running a post-91 beta and then "downgraded" back to 91.8.1, you may need to repair the inbox or other problem folders or, if that doesn't work, remove the mbox and mbox.msf files for the problem folder(s) and let TB re-download and rebuild them.

Hey Gene, my apologies as I should have mentioned the version I am on. Currently I am using 100.b3 (64-bit). I have not downgraded yet as I keep thinking one of the updates will correct the problem. However, I do not want it to get too out of hand, and sadly I may be forced to downgrade.

Please see https://bugzilla.mozilla.org/show_bug.cgi?id=1740486#c19

It's the only thing that fixed this for me. I too am on 100.0 b3 but only after starting over circa 98.0 b2. Been running like a boss ever since and the issue has not returned. Up to you if you want to rip off the bandage or not.

Same here I guess. I am on a "post-91" version of Thunderbird, and also a 91 version, on two different computers. I may have gotten them mixed up. Thanks for pointing this out.

See Also: → 1746632
Duplicate of this bug: 1770104