Closed Bug 1880765 Opened 9 months ago Closed 9 months ago

ASSERTION: Corrupt file. Should not happen.: 'Error', file .../mozilla/comm/mailnews/local/src/nsParseMailbox.cpp when C-C TB accessed very old mail folder

Categories

(MailNews Core :: Database, defect)

defect

Tracking

(thunderbird_esr115 unaffected, thunderbird124 wontfix)

RESOLVED FIXED
125 Branch
Tracking Status
thunderbird_esr115 --- unaffected
thunderbird124 --- wontfix

People

(Reporter: ishikawa, Assigned: mkmelin)

Details

Attachments

(1 file)

I am testing C-C TB by compiling a DEBUG version under linux on my local PC.
I was testing buffered write of messages. (Currently the message write is not buffered at all and generates so many write system calls to slow down the operation significantly. The slowdown is significant under windows.)

Anyway, I tried to test the patch C-C TB for buffered write to count the number of write system calls issued during message move/copy operation.
For that, I had an old test environment I created several years ago, and used that.
That contained a few mail folders each having a hundred message or so. The folder files were created in 2016 or 2017.

When I tried to access these old folders, C-C TB generated the following error at run time. The line number may be off from the mozilla repository due to local patch.

[Parent 439045, Main Thread] ###!!! ASSERTION: Corrupt file. Should not happen.: 'Error', file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsParseMailbox.cpp:1336

Well, I think the message was repeated as many as the # of messages in the folders C-C visited.

From what I found there was no discernible undesirable behavior. The headers were shown correctly as far as I could tell, and I think the message body was shown correctly, too. (But I am not so sure about the body. But given the local xpcshell test, and manual test drive of patched C-C TB on local PC, and tryserver tests do not show many errors. As a matter of fact, some strange errors reported locally are not reproducible if the test file that reports error(s) is run as a single target of }mach xpcshell-test targetfilename| while they do report errors when they are run in bulk by |mach xpcshell-tst|. The only two tests I observe locally are from test_filter.js which BenC reported he saw on his local PC, and test_addressBook_tracker.js. )
Furthermore, once C-C TB visits the file, it seems it somehow cleanses the folder so that the next time C-C TB visits the folder, the error messages are not shown again.

The message is from the following code and so there was unrecognizable Berkeley mbox status line character.
(Or that the C-C back in 2016 or 2017 produced incorrect status line somehow.)

       if (!mozstatus && statush) {
          /* Parse a little bit of the Berkeley Mail status header. */
          for (const char* s = statush->value; *s; s++) {
            uint32_t msgFlags = 0;
            (void)m_newMsgHdr->GetFlags(&msgFlags);
            switch (*s) {
              case 'R':
              case 'r':
                m_newMsgHdr->SetFlags(msgFlags | nsMsgMessageFlags::Read);
                break;
              case 'D':
              case 'd':
                /* msg->flags |= nsMsgMessageFlags::Expunged;  ### Is this
                 * reasonable? */
                break;
              case 'N':
              case 'n':
              case 'U':
              case 'u':
                m_newMsgHdr->SetFlags(msgFlags & ~nsMsgMessageFlags::Read);
                break;
              default:  // Should check for corrupt file.
                NS_ERROR("Corrupt file. Should not happen.");
                break;
            }
          }
        }

Although the first such test for manual move/copy "cleansed* an folder, I had two others, and so I inserted a dump statement to see what character C-C TB found in the status header line from other folders.
The result is as follows.

[Parent 439045, Main Thread] ###!!! ASSERTION: Corrupt file. Should not happen.: 'Error', file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsParseMailbox.cpp:1336
invalid flag char <<0x4f>>

So the unrecognizable status line character was ASCII 0x4F, namely, 'O'. capital letter O.

Actually I only realized that C-C TB cleanses the folder, essentially ignores this status line and not carrying it over to compacted / rebuilt message file and .msf after I visited all the old folders.
So there are no longer old folder files that has 'O' in the status line, I think. I may have it somewhere, though, if I try hard.

Anyway, this seems not an earth-shattering problem since message copy/move worked (with buffering now thanks to my patch) but quite awkward during testing, and
if 'O' has a valid important meaning in Berkeley mbox specfication (I COULD NOT FIND any meaningful description with quick google search.), we do have a serious problem.
At least, it seems C-C TB has a problem of not being able to read old folder without problems.
Well, the last statement may need qualification.
It may be that the folder files were once created by linux's "mail" program by receiving messages I sent to the mail test account, and the resulting mbox file was copied into C-C TB's mail folder. So it could be that the 'O' was handled by linux's mail program. But since the testbed setup was done 7 or 8 years ago, I don't recall the details.

"Use the source, Luke" is certainly the dictum to follow.

I found the following in Debian's mailutil source package.
mailutil is used to create |mail| binary.
https://salsa.debian.org/debian/mailutils/-/blob/main/ChangeLog

2001-07-09 Sergey Poznyakoff
	* configure.in: check for termios.h termio.h sgtty.h
	Added MU_MAINTAINER_MODE.
	* m4/maintain.m4: (new) Settings specific to maintainer-mode.
	* m4/Makefile.am: Added maintain.m4.

	* mailbox/attribute.c: Fixed flags_to_string(): MU_ATTRIBUTE_SEEN
	and MU_ATTRIBUTE_READ were swapped: the former produced 'R',
	and the latter 'O'.

So the problematic mbox folders seem to have been produced by |mail| program and copied into TB's mail folder.
(Funny, these folders did not seem to case then C-C TB to produce such warnings about 7-8 years ago.)
I am not sure what the difference between 'R' and 'O' but probably we can pretend that the message marked with 'O' can be treated as if it were marked with 'R'.

Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED

After reading the Debian mailutil ChangeLog file and some discussions on debian mailing list I think
'O' means the subject header was shown in the header listing of |mail| program, and 'R' means the message body has been shown.
So, I think having them handled similarly by TB is just fine.

(In reply to ISHIKAWA, Chiaki from comment #0)

Actually I only realized that C-C TB cleanses the folder, essentially ignores this status line and not carrying it over to compacted / rebuilt message file and .msf after I visited all the old folders.
So there are no longer old folder files that has 'O' in the status line, I think. I may have it somewhere, though, if I try hard.

Just to clarify: I'm pretty sure TB will never strip out Status headers (or any headers, for that matter).
It will add in X-Mozilla-Status and X-Mozilla-Status2 headers if it gets the chance (eg, during compaction).
And if the X-Mozilla-Status headers are there, then the parser will ignore the "Status" header.

So that would explain why you only saw the warning message on the first run through - after compacting the folder, the X-Mozilla-* headers would have been added and Status will be ignored.

(In reply to Ben Campbell from comment #4)

(In reply to ISHIKAWA, Chiaki from comment #0)

Actually I only realized that C-C TB cleanses the folder, essentially ignores this status line and not carrying it over to compacted / rebuilt message file and .msf after I visited all the old folders.
So there are no longer old folder files that has 'O' in the status line, I think. I may have it somewhere, though, if I try hard.

Just to clarify: I'm pretty sure TB will never strip out Status headers (or any headers, for that matter).
It will add in X-Mozilla-Status and X-Mozilla-Status2 headers if it gets the chance (eg, during compaction).
And if the X-Mozilla-Status headers are there, then the parser will ignore the "Status" header.

So that would explain why you only saw the warning message on the first run through - after compacting the folder, the X-Mozilla-* headers would have been added and Status will be ignored.

Thank you for the clarification.
So TB created X-Mozilla-Status headers and thus the original Status: line was not removed and just ignored and thus after the first compaction, I didi not see the error message.

With the proposed patch in comment 2, 'O' will be treated as 'R', and the X-Mozilla-Status* line would include that information.

I am simply amazed at the variation of mbox formats.
http://kb.mozillazine.org/Importing_and_exporting_your_mail#Mbox_files
And documents lilsted there fail to mention the status line characters in an easy to spot paragraphs.

Somehow, my scripts to handled mbox for splitting it or extracting a single message for inspection assumed
"From -" as the start of a message (which had been the default, I think, for TB's mbox format.)
But now "-" is gone, and so I had to modify the scripts. I assume this change is intentional for the new handling of "From" at the beginning of a line.
Life is fun. :-)

Severity: -- → S4
OS: Windows 10 → All
Hardware: x86_64 → All
Target Milestone: --- → 125 Branch

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/98b5c2f16da5
tone down warning when unexpected status was found for parsing Berkeley Mail status header. r=BenC

Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Pushed by mkmelin@iki.fi: https://hg.mozilla.org/comm-central/rev/2a9a8a1a695f follow-up, fix parameter. rs=bustage-fix
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: