Closed Bug 1734847 Opened 3 months ago Closed 2 months ago

TB (beta) 94.0b1 occasionally mis-indexing mail "subject" with previous inbox msg

Categories

(MailNews Core :: Database, defect, P1)

Thunderbird 94

Tracking

(thunderbird_esr91 unaffected, thunderbird95+ fixed)

RESOLVED FIXED
96 Branch
Tracking Status
thunderbird_esr91 --- unaffected
thunderbird95 + fixed

People

(Reporter: dan, Assigned: benc)

References

(Regression)

Details

(Keywords: dataloss, regression)

Attachments

(4 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36

Steps to reproduce:

Restarted TB (beta 94.0b1) after TB automatically updated, then checked for new mail.

Actual results:

When retrieving mail, TB 94.0b1 now occasionally displays the "subject" of an email correctly, but on opening the mail it occasionally shows the content of the previous email in the "inbox" (duplicating the content of the previous email) and apparently discards (or at least does not display) the actual contents of the newer email. Restarting Thunderbird and then restarting the PC had no effect. Help!

Expected results:

Correct mail subject and content should be displayed

Now attempting this: after exiting TB, deleted global-messages-db.sqlite file, then restarted TB and it is rebuilding file.

Rebuild of global-messages-db.sqlite file completed... not sure problem was resolved. Exited Beta Channel, installed release channel v91.2.0 and allowed the "downgrade" of my profile. All seems to be OK. Living life in the slow lane now ;)

There have been rare cases of subject/body mismatch for IMAP messages. Strange that this only happens in v94 now but not v91.x.

Duplicate of this bug: 1735580
Status: UNCONFIRMED → NEW
Ever confirmed: true

Dan, Please retest with 94.0b4 which is just released.

Flags: needinfo?(dan)

Would be happy to, but unable now... moved both my PCs back to the release channel and the current release version. Thanks for your help with this!

Flags: needinfo?(dan)
Status: NEW → RESOLVED
Closed: 3 months ago
Component: Untriaged → Database
Product: Thunderbird → MailNews Core
Resolution: --- → INCOMPLETE

Let's confirm this. I don't have steps to reproduce, but have seen it (even on 95 trunk) and there are some other reports as well.

Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Status: REOPENED → NEW
Duplicate of this bug: 1738287
Severity: -- → S1
Priority: -- → P1
Duplicate of this bug: 1738105

This would be caused by recent changes in the backend wrt. file streams, seeking, offsets, etc. This report could also be related:
https://thunderbird.topicbox.com/groups/daily/T51927946941c9c68-Mc061a0a79f75977acd6f7dd2
Overall it looks like the mbox file is accessed at the wrong offset.

Duplicate of this bug: 1738155

Time line - 94.0b1 (per comment 1) shipped 10/6 at rate 100 and was built on 10/4. This bug report (which is the earliest I can find) was created on 10/8 - so this starts at least at the very beginning of beta 94. But perhaps earlier?

Questions:
No one saw this when using 93?
Did anyone using daily 94 see this?
Has anyone seen this for a pop account?

(In reply to Wayne Mery (:wsmwk) from comment #12)

Has anyone seen this for a pop account?

Yes, all my mail accounts are POP3 and that's where I experienced this when I originally reported on 10/8. Sorry I can't help now... moved my two machines back to the release channel.

Thanks for all of everyone's efforts with this!

Dan

I think I saw it on daily just the other day (IMAP).

(In reply to Magnus Melin [:mkmelin] from comment #15)

I think I saw it on daily just the other day (IMAP).

Oops, what I mean is daily prior to 10/6, when daily was still 94

(In reply to Wayne Mery (:wsmwk) from comment #12)

My observation: it started happening pretty recently. I've noticed it before but didn't report it because it only happens occasionally. I'm on 94.0b4 and use it daily, and yes it did happen on a POP account.

(In reply to Wayne Mery (:wsmwk) from comment #16)

(In reply to Magnus Melin [:mkmelin] from comment #15)

I think I saw it on daily just the other day (IMAP).

Oops, what I mean is daily prior to 10/6, when daily was still 94

I'm on the beta update channel and have been for quite some time.

looks like a good list. should be easy to narrow it down with one or two try builds backing out one at a time?

(In reply to Wayne Mery (:wsmwk) from comment #12)

Questions:
No one saw this when using 93?

I was using beta 93.xx prior to the "automatic" update to 94.0b1, and did not experience this issue prior to the update.

Probably not so easy until we have steps to reproduce. After that it would be way easier.

Keywords: dataloss

As I told Magnus in bug 1738105, if there's some kind of logging or whatnot that I can turn on to try and catch this thing in the act, let me know and I'll give it a try. I didn't see it start happening until 94.0b5 and I'm on IMAP.

My guess would be there is no logging that could catch it.

(In reply to Magnus Melin [:mkmelin] from comment #24)

My guess would be there is no logging that could catch it.

Bummer. And nothing in that Error Console output that I added in bug 1738105 is any hint?

Almost forgot and maybe it's related. I can't seem to run a repair operation on a folder anymore. I say seem because when I do run it on my Inbox, I don't see the usual Downloading xxxxxx messages in the Status Bar. I do see the progress bar animation kind of doing something but it stops shortly after I run the repair.

So it looks like I've been pushed a 94.0b5. I received an email at 9:15 and can see the email body when I select it. For an email received at 9:27 and all thereafter, I can select it but cannot see the email body at all. This is happening for 2 POP accounts. I can no longer read incoming emails. :-(

(In reply to Josh from comment #27)

So it looks like I've been pushed a 94.0b5. I received an email at 9:15 and can see the email body when I select it. For an email received at 9:27 and all thereafter, I can select it but cannot see the email body at all. This is happening for 2 POP accounts. I can no longer read incoming emails. :-(

Oops - times above are in EDT.

Don't know if this is what is being reported here but definitely similar. I was sent via email a saved eml as seen by user breezymozilla with 94.0b (see bug 1734843 comment 105) where he attempts to open an email and sees below it many other emails, but just the raw mbox content. The attached png shows the end of the 1st valid email followed by the first line of the raw mbox content. The whole message with raw mbox content came to 28 MB! This is using IMAP.
I couldn't duplicate with 94.0b4 so asked him to just report a new bug but I suspect this is caused by the same thing.

Duplicate of this bug: 1736917

(In reply to Magnus Melin [:mkmelin] from comment #19)

My main suspects in https://hg.mozilla.org/comm-central/pushloghtml?startdate=2021-09-10&enddate=2021-10-04 would be

For debugging,

(1) the first patch,
I would record the current file position near where the seek to the end was removed, then re-insert the seek to the end, read the file position again, and compare it with the position recorded earlier.
If they are the same, the removal of the seek was OK, but if not, we have a problem.
(I found a similar file position problem when I tried to introduce buffered write [still to be submitted/merged in the latest re-incarnation].
When a file stream was converted to a buffered stream, the stream was AUTOMATICALLY rewound to the initial 0 position, which caused a serious bug, and that was why it had been removed many years ago, several years before I tried it.
I could only find this automatic rewinding by tracing the file position very carefully in a similar way. :-( )

(2) the second patch.
I think only careful reading of the code, dumping of various intermediate files, etc. can reveal any surprises or misunderstandings of the operation. (Well, in my attempt to introduce buffered write, I also touched this part, and I was a bit surprised that a buffered input stream could not be turned into a buffered output stream, or something like that. The conversions of the various file types were not quite symmetrical and uniform. If BenC could eliminate such extra code that deals with the limits of the low-level IO functions, so much the better.)

I was assuming the above two are the only potential culprits, though.
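
For anyone who wants to try the position check described in (1), here is a minimal sketch of the idea. It is written in Python rather than against the real C++ stream code, `seek_end_is_noop` is a hypothetical helper name, and an in-memory `io.BytesIO` stands in for the mbox file stream, so treat it as an illustration only.

```python
import io
import os


def seek_end_is_noop(stream):
    """Record the position, seek to the end, and compare the two positions.

    Returns (same, before, after). If `same` is False, the stream was not
    already at end-of-file, so dropping the explicit seek would change
    where the next write lands.
    Note: the stream is left positioned at the end afterwards.
    """
    before = stream.tell()
    after = stream.seek(0, os.SEEK_END)  # seek() returns the new position
    return before == after, before, after


def demo():
    buf = io.BytesIO()
    buf.write(b"From - Mon Oct  4 2021\r\nSubject: one\r\n\r\nbody\r\n")
    at_end = seek_end_is_noop(buf)   # position is already at the end
    buf.seek(0)                      # simulate an unexpected rewind
    rewound = seek_end_is_noop(buf)  # position no longer matches the end
    return at_end[0], rewound[0]
```

The second call in `demo` mimics the automatic rewind mentioned above: the check only passes when the stream really was at the end before the seek.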

Using the test case from bug 1736917 I was able to pin down the regression window to 2021-09-01 -> 2021-09-03 dailies.
https://hg.mozilla.org/comm-central/pushloghtml?startdate=2021-09-01+12%3A00&enddate=2021-09-03+14%3A01

By local backouts I found the offender is https://hg.mozilla.org/comm-central/rev/2c8857af0eb3112dac2250b7f7bdf777e8911771 (bug 1728924)

I'm going to back that out to fix this bug.
Something with m_envelope_pos or m_headerstartpos might be wrong but it will require some debugging to find out what and how to fix it.

The .msf files for affected folders will have been damaged. Repair folder is required and fixes it.

Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Target Milestone: --- → 96 Branch
Regressed by: 1728924

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/70c0061f9f84
Backed out changeset 2c8857af0eb3 (bug 1728924) for causing .msf corruption. rs=backout

Status: ASSIGNED → RESOLVED
Closed: 3 months ago → 2 months ago
Resolution: --- → FIXED

I think the backout is causing failures in mailnews/search/test/unit/test_quarantineFilterMove.js crashing at seekableStream->Tell(&filePos);

Probably safest to back out the backout for now
Backout: https://hg.mozilla.org/comm-central/rev/87f80cf1335e987a0ee19ff8671afb6617a807eb

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Over to you Ben

Assignee: mkmelin+mozilla → benc

Definitely looks like my changeset 2c8857af0eb3 which screwed it up.

I can replicate it now too:

  1. have an IMAP account with a folder containing at least 3 messages
  2. copy those messages to a local folder
  3. observe one or more messages screwed up

It doesn't seem to happen when copying messages from other local folders.
"repair folder" fixes it.

No fix yet, but wanted to note down one anomaly I've spotted:

The .messageSize attribute on the new messages seems to be screwed up after the copy. It uses the message size from the source message. But during the copy, an extra 97 bytes is added when an "X-Mozilla-Keys" header (plus blank space to later write keywords) is inserted. Might also be an issue for X-Mozilla-Status & X-Mozilla-Status2, but I'm not sure (my test messages already have those set in the IMAP mbox).

I'd have thought this would cause all kinds of chaos and mayhem... but seems like it was already happening. The size of the copied messages is wrong even if the patch is backed out.
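
To make the 97-byte figure concrete, here is a rough arithmetic sketch. The padding width of 80 spaces and the single-byte line ending are assumptions chosen because they add up to 97; the thread itself only gives the total. `keys_header` and `copied_size` are hypothetical names.

```python
PADDING_SPACES = 80  # assumption: blank space reserved for keywords


def keys_header(padding=PADDING_SPACES, linebreak=b"\n"):
    """Build a placeholder X-Mozilla-Keys header line with blank padding."""
    return b"X-Mozilla-Keys: " + b" " * padding + linebreak


def copied_size(source_size, header=None):
    """Size of the message as stored after the header line is inserted."""
    header = keys_header() if header is None else header
    return source_size + len(header)
```

Under these assumptions `len(keys_header())` is 16 + 80 + 1 = 97, so a size taken from the source message would be 97 bytes short of what was actually written.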

Attached file FOOK.tar

Just out of interest, here's the more constrained, human-readable test case I'm using now.
Two folders containing the same three messages. One good, one borked.

The mbox files are identical (other than differing "From " line timestamps).

(In reply to Ben Campbell from comment #36)

The .messageSize attribute on the new messages seems to be screwed up after the copy. It uses the message size from the source message. But during the copy, an extra 97 bytes is added when an "X-Mozilla-Keys" header (plus blank space to later write keywords) is inserted.

My mistake - I was looking at the .offlineMessageSize attribute (not entirely sure what that is used for... and it still seems very wrong).

In any case, it does seem like the borked messages are getting the wrong .messageSize. Haven't figured out the exact cause yet (there are a lot of moving parts involved), but I've got some ideas for potential fixes.

I'm not sure if this is helpful at all but I have only seen this problem with combined and screwed up messages when I also have Message Filters enabled. I have a bunch of Message Filters set up for the sole purpose of organizing messages into sub-folders. I also have offline folders enabled for all the folders I've had trouble with. I haven't seen any new corruption since I disabled all message filtering. From what I've seen, it seems like the problem is related to messages being moved/copied, maybe by multiple threads (?), maybe looking at the wrong message size or the corrupt message size in the process. I haven't specifically tried to reproduce or really even analyze what's going wrong - just my observations and gut feel based on what I've seen.

Same here, I have quite a few message filters filtering into Local Folders. And I've now moved back to TB 91.3.0 and the problem isn't occurring, so it was introduced somewhere after that.

Brian and Josh: Yep, the issues you describe there do sound like the breakage I caused.

I think I've figured it out. It's not the filters which are the issue, but the copying of multiple messages from non-local folders to local ones.

For copying from non-local to local, the copy routines pass the message through a message parser (nsParseMailMessageState).
The changes I made in Bug 1728924 assumed that nsParseMailMessageState is only used for a single message at a time. This is usually the case... but here the local folder copy code just streams all the messages through the parser. And it gets confused where it's up to and sets the wrong sizes on the messages.
(there is a derived class, nsMsgMailboxParser, which does handle parsing multiple messages, but that's not being used here, for whatever reason).

I'm not sure why it uses the single parser for multiple messages. Probably it should be creating a new parser for each message. But the copy code is so brittle (Bug 1731177) that I think it's safer to just hack it for now and fiddle the required fields in the parser when each new message is started, so the size calculation is correct.
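
A toy model of the failure mode described above, for readers following along. This is not the real nsParseMailMessageState; `SizeTrackingParser` and `copy_messages` are hypothetical names, and the only point is to show how one parser reused across several messages reports wrong sizes unless its per-message fields are reset.

```python
class SizeTrackingParser:
    """Toy stand-in for a per-message parse state."""

    def __init__(self):
        self.position = 0       # bytes seen so far by this parser
        self.message_start = 0  # where the current message began

    def start_new_message(self, reset):
        # The "hack" from the comment: fix up the start field for each message.
        if reset:
            self.message_start = self.position

    def feed(self, data):
        self.position += len(data)

    def finish_message(self):
        # Size is derived from positions, so a stale start gives a wrong size.
        return self.position - self.message_start


def copy_messages(messages, reset_between):
    """Stream all messages through ONE parser and collect the reported sizes."""
    parser = SizeTrackingParser()
    sizes = []
    for raw in messages:
        parser.start_new_message(reset=reset_between)
        parser.feed(raw)
        sizes.append(parser.finish_message())
    return sizes
```

With three messages of 10, 20 and 30 bytes, `copy_messages(msgs, reset_between=False)` reports 10, 30 and 60, while `reset_between=True` reports the true sizes. Only the first message comes out right without the reset, which fits the observation that the problem needs more than one message in the copy.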

I'm just testing a patch now to do just that.

The real solution is that the parser shouldn't care about the size of the incoming message anyway - the size should be set by the code handling the writing (the mail store). Much more accurate, and better able to handle "From " escaping and all those icky details.
I'm working on it (Bug 1719121, Bug 1733849 and friends) but it's a major job. Assumptions about mbox are all over the place :-(
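
As a rough illustration of "the size should be set by the code handling the writing", here is a sketch of a store-side writer that reports what it actually wrote. `write_message` is a hypothetical function, the separator line and the simple `>From ` escaping are assumptions, and the stream is assumed to be positioned at the end of the mbox.

```python
import io


def write_message(mbox, raw, from_line=b"From - \n"):
    """Append one message and return (offset, stored_size).

    The size is measured from what was written, so the separator line
    and any "From " escaping are included automatically.
    """
    start = mbox.tell()
    mbox.write(from_line)
    for line in raw.splitlines(keepends=True):
        if line.startswith(b"From "):
            mbox.write(b">")  # escape body lines that look like separators
        mbox.write(line)
    return start, mbox.tell() - start
```

The caller never needs to know the incoming size: for a 26-byte message containing one line that starts with "From ", the store reports 8 + 26 + 1 = 35 bytes, and the next message's offset follows directly from that.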

(In reply to Ben Campbell from comment #41)

This sounds interesting.
Many moons ago, I asked in one of the bugs on Bugzilla how TB reports back an error during the copying of multiple messages.
Imagine that you want to copy 100 messages, but the transfer of the 50th message failed due to a filled-up file system, a temporary network outage of the remote file system, etc. Irrespective of what one should do afterward, the error ought to be returned to the caller and eventually to the UI.
Back when I asked that, someone answered that he would check. But I heard nothing afterward.

You may want to make sure the error is returned properly to the caller in your local hack now. (Back when I asked the question above, it seemed to me that TB kickstarted the copying process and waited for the completion via a listener or whatever it is called, but there was no clear path for error information propagation. Some functions involved in the copying process were declared void and obviously there was no way to return an error value from such a function. I looked at the potential pathways including output parameters, but the listener only seemed to care about the termination (error or success).)
I thought that WAS BAD.
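
To show the kind of error path being asked for, here is a small sketch. It is not how Thunderbird's copy listeners work; `copy_all`, `write_one` and `on_done` are hypothetical names, and the point is only that the completion callback receives the error and the index of the failing message instead of a bare "finished".

```python
def copy_all(messages, write_one, on_done):
    """Copy messages one by one and report the outcome to a listener.

    on_done(copied, failed_index, error) is called exactly once:
    failed_index and error are None on success.
    """
    copied = 0
    for index, msg in enumerate(messages):
        try:
            write_one(msg)
        except OSError as err:
            # Stop at the first failure and hand the error to the caller.
            on_done(copied, index, err)
            return False
        copied += 1
    on_done(copied, None, None)
    return True
```

In the 100-message example, a failure while writing the 50th message would reach the listener together with the count of messages already copied, so the UI has something concrete to show.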

(In reply to ISHIKAWA, Chiaki from comment #43)

This sounds a lot like the "copy lots of messages to another server bug" which I tried to summarize here: bug 538375 comment 194. This is something that has been a problem for many years as you can see by the long list of comments.

See Also: → 538375

I don't know if it's related to the attempts to fix this corruption issue but wanted to note bug 1740486 that I opened today.

(In reply to ISHIKAWA, Chiaki from comment #43)

You may want to make sure the error is returned properly to the caller in your local hack now. (Back when I asked the question above, it seemed to me that TB kickstarted the copying process and waited for the completion via a listener or whatever it is called, but there was no clear path for error information propagation. Some functions involved in the copying process were declared void and obviously there was no way to return an error value from such a function. I looked at the potential pathways including output parameters, but the listener only seemed to care about the termination (error or success).)
I thought that WAS BAD.

There are a few listeners involved in message copying, but they seem to be used rather inconsistently. Some paths don't seem to invoke all the callbacks you'd expect, and there are lots of corner cases where stuff gets initialised in multiple different places, depending on the exact details of the messages being copied...
Message copying has just evolved over time into a brittle lump of concerns, made worse by the way message streaming, mbox handling and nsIMsgDBHdr population are all tangled up together (not to mention move/copy/move-to-trash complications and undo support).
It needs some real effort to refactor and simplify all these moving parts, and that's more or less what my focus is at the moment, starting with mbox (with the side effect outcome of getting maildir support solid!).

But it's really hard not to break stuff along the way - in fact this whole bug was a result of a relatively modest refactoring attempt by me! So it's obvious I need to be waaay more careful as I continue.

All of which is to say: yes, I totally agree with you that the error reporting needs to be much more robust! It will be, but for this fix, I'm content if it fixes the problem and doesn't break anything else :-)

OK, so the patch seems to fix the problem for me, and the try build seems OK... so I've flagged it for landing.
But it'd be nice to have a few more people who experienced the issue try it out and confirm it's an improvement (and doesn't obviously break anything else!)...

(In reply to Ben Campbell from comment #41)

Once done, is this a done deal? In other words, if the bug gets fixed, will messages be back to normal?

If not, this needs to be a TOP PRIORITY over all others, due to its DATALOSS condition.

(In reply to Ben Campbell from comment #46)

I will keep my fingers crossed. :-)

BTW, I am trying to update my local patch set to enable buffered write in TB (Bug 1242030).
Copying many messages from one folder to another takes too long IMHO.
Also, along the way, I noticed many cases where error values from low-level I/O routines are ignored, which I needed to fix. The pop3 error handling scheme may not work due to the programming errors I noticed there.

Right now, basically, I am trying to accommodate the download-to-temporary-file patch and others which I could not accommodate due to other patches until several weeks ago.

I have put many sanity checks in my local patch. So if I notice anything strange in my local log, I will report it to Bugzilla. But I am afraid it will take me a few more weeks to get to that stage. I am trying to classify bugs into serious and not-so-serious categories right now, and comparing against other people's jobs on try-comm-central to figure out which ones are caused by my local patches.
Logs from the mochitest and xpcshell tests of the DEBUG version of TB contain so much noise that I am trying to cut it down, too.

TIA

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/91332386d3dd
Fix message size calculations when copying multiple messages to local folder. r=mkmelin

Status: REOPENED → RESOLVED
Closed: 2 months ago → 2 months ago
Resolution: --- → FIXED

(In reply to Worcester12345 from comment #48)

Once done, is this a done deal? In other words, if the bug gets fixed, will messages be back to normal?

For affected folders you'll need to do "Repair Folder". The actual messages are not damaged, but the .msf file of the folder has wrong data.

I raised defect 1736917 which has been marked as a duplicate of this one, for Thunderbird 95.0 appearing to corrupt mail messages when moved to a local folder by filters, but only for mails from two specific mail lists, all the rest of the 43 filters I have to move threads to local folders don't cause the issue.
About a week or so ago V95.0 of Thunderbird on my Linux VM image was upgraded to V96.0a1, and now the threads from the Fedora mail list that are sitting in my IMAP mailbox are exhibiting the issue before the moves happen. So it looks like V96 has further compounded the issue.

regards,
Steve

For affected folders, you'll need to use the Repair Folder to get them working properly again. Did you do that?

That seems unrelated to this bug.

We should get this uplifted to beta.

Duplicate of this bug: 1741249
Duplicate of this bug: 1740287

Comment on attachment 9250105 [details]
Bug 1734847 - Fix message size calculations when copying multiple messages to local folder. r=mkmelin

[Approval Request Comment]
Regression caused by (bug #): bug 1728924
User impact if declined: .msf corruptions
Testing completed (on c-c, etc.): c-c
Risk to taking this patch (and alternatives if risky): continued .msf corruptions on beta

Attachment #9250105 - Flags: approval-comm-beta?

(In reply to Magnus Melin [:mkmelin] from comment #58)

Comment on attachment 9250105 [details]
...
Regression caused by (bug #): bug 1728924
User impact if declined: .msf corruptions
...
Risk to taking this patch (and alternatives if risky): continued .msf corruptions on beta

Is this considered "data loss"?

(In reply to Worcester12345 from comment #59)

Is this considered "data loss"?

Sorry, seeing this in keywords just now.

Duplicate of this bug: 1741478

Comment on attachment 9250105 [details]
Bug 1734847 - Fix message size calculations when copying multiple messages to local folder. r=mkmelin

[Triage Comment]
Approved for beta

Attachment #9250105 - Flags: approval-comm-beta? → approval-comm-beta+
Duplicate of this bug: 1741517
Duplicate of this bug: 1740148
Duplicate of this bug: 1739871
Duplicate of this bug: 1737487

So I just installed the testing build of 95.0b4 and so far so good. At least bug 1740486 seems gone now. I'm presently doing a repair on my Inboxes and I'll report back if recent mangled messages are repaired.

So I was testing 95.0b4 on my home PC and doing Folder Repairs on affected Inboxes and something seems to be worse now. Folder Repair seems to have run all day and night long but never seems to have actually stopped. Activity Mon isn't showing any more activity but Task Manager still shows TB as doing....something. And now, AFAICT, we have bug 1742590 to contend with as a byproduct.

I think we ultimately have to run mozregression on this one because it appears that whatever fix was included in b4 didn't quite do it and caused fallout.

Duplicate of this bug: 1742049
Duplicate of this bug: 1730676

Sorry but I have experienced the same issue also today in TB95.0b4 in a new fresh profile.

See Also: → 1742975
Duplicate of this bug: 1742684

(In reply to gene smith from comment #44)

Oh, thank you for the heads-up. This has been a nagging issue at the back of my mind.
I will put this on my radar when I work on my patch set for buffered-output performance improvement.

Is anyone checking the crash reports? I assume they are being sent? Having them left and right. Also attaching corrupted .eml file. Sorry, doesn't look like there is any capability to do so in Bugzzzzz

(In reply to ISHIKAWA, Chiaki from comment #74)

Oh, thank you for the heads-up.. This has been my nagging issue at the back of my mind.
I will put this on my radar when I work on my patch set for buffered-output performance improvement

Are these open bugs right now? Could you give the bug numbers?

(In reply to Worcester12345 from comment #76)

(In reply to ISHIKAWA, Chiaki from comment #74)

Oh, thank you for the heads-up.. This has been my nagging issue at the back of my mind.
I will put this on my radar when I work on my patch set for buffered-output performance improvement
Are these open bugs right now? Could you give the bug numbers?

Are you interested in the buffered-output bugzilla?

Duplicate of this bug: 1740846
Duplicate of this bug: 1744447

(In reply to ISHIKAWA, Chiaki from comment #77)
...

Are you interested in the buffered-output bugzilla?

No idea what that is, so no.

Just noticing some emails that are crazy mixed up. This is on the newest nightly "beta" Thunderbird. So it apparently does not fix the issue. Or is it an issue where once the data (email) is messed up, it stays that way and won't be brought back or fixed?

Everyone affected must do Repair Folder, since the .msf file has incorrect data.

Since bugs like this happen from time to time, can code for an asynchronous integrity check be added to TB so that it would automatically run the "Repair Folder" command on affected folders in the background, without any action required from users?

(Obviously users that visit here can live with that but the corruption may affect people who are way removed from ever going to properties and manually starting the procedure.)
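
For what such a check might look at, here is a minimal sketch. It does not read real .msf files (those are not parsed here); the index is modelled as a plain dict of message key to (offset, size), and `find_bad_entries` is a hypothetical name. The idea is only that every indexed message should start at a "From " separator line and end where the next one starts.

```python
def find_bad_entries(mbox_bytes, index):
    """Return keys whose (offset, size) do not line up with mbox separators."""
    bad = []
    for key, (offset, size) in index.items():
        end = offset + size
        # A message must begin at a "From " line ...
        starts_ok = mbox_bytes[offset:offset + 5] == b"From "
        # ... and end at the end of the file or at the next "From " line.
        ends_ok = end == len(mbox_bytes) or mbox_bytes[end:end + 5] == b"From "
        if not (starts_ok and ends_ok):
            bad.append(key)
    return bad
```

A folder for which this returns a non-empty list would be a candidate for an automatic Repair Folder. Escaped ">From " lines in message bodies do not trip the check, since only the bytes at the indexed boundaries are inspected.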

@Magnus Melin, I see this issue recurring in 96.0b1 (32-bit). After Repair Folder, the message is “composed” in another way. What should be done?

I'm not sure what you mean by that.
Possible further issue is tracked in bug 1742975, but it's yet unclear what it's about.

Ah, sorry my bad. Bad issue.

(In reply to uzivatel919 from comment #86)

Ah, sorry my bad. Bad issue.

No, it is not. I mean:

  1. I opened a message and saw it again merged with some.
  2. I ran folder repair, which seemed to do nothing.
  3. I repeated step 2 a few times.
  4. Then the message was merged another way.
  5. Restart, repair folder again, now the message appears blank.

Saved .eml message has size about 192 MB.

Crashes upon message opening.

Folder repair continuously fails to remedy it.

It's possible there is some bug wrt compact, see bug 1742975 comment 13.

Can you try to delete the .msf file for the folder instead?

I deleted the whole account (after previous, more gentle attempts). Then I immediately got a big message again. This time 125 MB. I can provide the original message. Maybe it has some structural peculiarity?

(In reply to Magnus Melin [:mkmelin] from comment #82)

Everyone affected must do Repair Folder, since the .msf file has incorrect data.

Is there an easy way to do "repair all folders" at once?

If not, can there be?

(In reply to Magnus Melin [:mkmelin] from comment #92)

It's possible there is some bug wrt compact, see bug 1742975 comment 13.

Can you try to delete the .msf file for the folder instead?

Maybe they should come out with a "Thunderbird Repair Tool" which deletes all these files at once, and does other maintenance items. Kind of like the Mozbackup utility, but for repairs instead. Kind of a "Thunderbird Tuneup".
