Mork (.msf) file not immediately sync'd to disk when a copy-filter rule is active, which affects biff message counts
Categories: MailNews Core :: Database, defect
Tracking: Not tracked
People: Reporter: pablo; Assignee: Unassigned
Attachments: 5 files
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36
Steps to reproduce:
::: Environment/Set up :::
o POP account - non-gmail / check once a minute
o Create a shell script to email four or more simple messages to the test account:
echo foo | mailx ...
o Create a folder in the account named "Received"
o Create a copy-all filter rule to copy all email to "Received"
o Run the shell script
o Manually (or wait) get all new messages
o tail -f .../Inbox.msf
o I wrote a wee monitoring script that runs md5sum on Inbox.msf and makes a copy of it whenever it detects a change - I'll attach the time-stamped files.
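A monitor of this kind can be sketched roughly as below. This is not the reporter's actual script (that one is among the attachments); the function names, the polling interval, and the snapshot naming are all assumptions, and it relies on GNU coreutils (md5sum, date +%N, fractional sleep).

```shell
#!/bin/sh
# Hypothetical sketch, not the attached script.

# snapshot_if_changed FILE OUTDIR LASTSUM
#   Copies FILE into OUTDIR under a time-stamped name when its MD5 differs
#   from LASTSUM, then prints the current MD5 on stdout.
snapshot_if_changed() {
    file=$1 outdir=$2 last=$3
    sum=$(md5sum "$file" | cut -d' ' -f1)
    if [ "$sum" != "$last" ]; then
        cp -p "$file" "$outdir/$(basename "$file").$(date +%Y%m%dT%H%M%S.%N)"
    fi
    printf '%s\n' "$sum"
}

# watch_file FILE OUTDIR [INTERVAL]
#   Polls FILE forever (default every 0.2 s); interrupt with Ctrl-C.
watch_file() {
    last=""
    while :; do
        last=$(snapshot_if_changed "$1" "$2" "$last")
        sleep "${3:-0.2}"
    done
}
```

It would be started with something like watch_file .../Inbox.msf /tmp/msf-snapshots after creating the snapshot directory. Because it polls, it can miss intermediate states that exist for less than one interval.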
Actual results:
Periodically, more often than not, Inbox.msf is not fully flushed when a copy-all filter is active and more than four emails are sent in a burst.
I tried to replicate the issue using GMail account but could not. GMail seems to be noticeably laggier than my non-GMail account. Who knows what Google is doing.
The end result is that ^A2= (A2=numNewMsgs) lags. I am working on nbiff, a (new) biff systray tool for Linux, which depends on ^A2= being up to date. :)
For those suffering from insomnia, more on nbiff is here - https://github.com/pablo-blueoakdb/nbiff
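For illustration, the lagging value can be pulled out of the file with a few lines of shell. This is only a sketch and not how nbiff itself reads the file: the helper name is hypothetical, it assumes the cell appears as parenthesised text of the form (^A2=value), and it assumes the last occurrence in the file is the current one.

```shell
# last_a2_value FILE
#   Prints the raw text after the final "^A2=" in FILE, up to the closing
#   parenthesis. Hypothetical helper; the cell layout is an assumption.
last_a2_value() {
    grep -o '\^A2=[^)]*' "$1" | tail -n 1 | cut -d= -f2
}
```

Running it in a loop next to "Get new messages" shows whether the on-disk count trails what the UI displays.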
Expected results:
The Mork file should always be sync'd to disk on a change.
Should TB abort unexpectedly, an up-to-date .msf would also avoid re-receiving emails. I know because I tried to send TB some signals to see if I could force it to flush its .msf to disk via a signal. :) I ended up kill'ing it a few times.
Comment 5•3 years ago
Doesn't mork call |fdatasync()| at key places?
I don't see a direct call to it if I am not mistaken.
Comment 6•3 years ago
|fsync| is not called directly either.
There is a direct call in a gloda-related JavaScript file, but that is all.
Of course, we may be calling fdatasync or fsync through other code (most likely in the Mozilla portion of the code).
Reporter
Comment 7•3 years ago
Hi Chiaki,
Thank you for investigating so far.
Would it be helpful if I ran an strace on the parent and all the children?
Comment 8•3 years ago
(In reply to Pablo from comment #7)
> Hi Chiaki,
> Thank you for investigating so far.
> Would it be helpful if I ran an strace on the parent and all the children?
It certainly would be helpful.
At this moment, I am not sure who can fix mork code, though.
In any case, the lack of fdatasync or fsync at key places should be a concern to many parties
and so your trace that shows the lack of such calls would alert more parties, I think.
Also, my checking was done very briefly using searchfox and so may not be complete.
We will figure that out once your trace shows the lack of fdatasync or fsync.
But sqlite3 code certainly calls fdatasync, etc. if available.
I thought there was a transition movement from mork to sqlite3. But maybe I was wrong.
Comment 9•3 years ago
(In reply to ISHIKAWA, Chiaki from comment #8)
> I thought there was a transition movement from mork to sqlite3. But maybe I was wrong.
not yet
Reporter
Comment 10•3 years ago
(In reply to ISHIKAWA, Chiaki from comment #8)
> (In reply to Pablo from comment #7)
> > Would it be helpful if I ran an strace on the parent and all the children?
> It certainly would be helpful.
I attached to a running thunderbird process (and its children) and captured two sets of results. I've placed them in two sub-directories:
- delay-sync/
- immediate-sync/
Top-level in the tar-ball is timing. This is a very rough time notation of when I've done tasks (e.g. Get new email, Click on Inbox, etc.). The idea is to provide an index into the timed strace files. They're voluminous! :p
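For anyone wanting to reproduce a capture like this with far less output, strace can be restricted to the sync family of system calls. The wrapper below is a sketch, not the command used for the attached traces: the function name and argument order are made up, and the syscall list is an assumption about which calls are worth watching.

```shell
# trace_sync_calls OUTFILE PID [PID...]
#   Attaches strace to every listed PID, follows new threads and children
#   (-f), stamps each line with wall-clock time (-tt), and records only the
#   sync-family system calls into OUTFILE. Hypothetical wrapper.
trace_sync_calls() {
    out=$1
    shift
    pid_args=""
    for pid in "$@"; do
        pid_args="$pid_args -p $pid"
    done
    # pid_args is deliberately unquoted so each "-p PID" pair splits into words.
    # shellcheck disable=SC2086
    strace -f -tt -e trace=fsync,fdatasync,sync,syncfs,msync -o "$out" $pid_args
}
```

Existing child processes are not picked up by -f, so their PIDs need to be passed explicitly, for example from pgrep. Dropping the -e filter gives the full (voluminous) trace.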
I don't think sync()s are being used. Using the timing file, I couldn't find any such calls around the forced sync:
pablo@oreo:/usr2/tmp/trace/delay-sync
└─▬ $ fgrep 10:04:4[0-5] * | fgrep -i sync
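One caveat about the command above: fgrep matches fixed strings, so 10:04:4[0-5] is searched for literally rather than as a range of seconds, and an empty result does not by itself prove that no sync calls happened. A regex-aware variant might look like the sketch below. The function name is hypothetical, and the syscall list and the assumed trace line layout (PID, timestamp, then the call) are assumptions.

```shell
# sync_calls_in_window WINDOW_REGEX FILE...
#   Prints trace lines whose text matches WINDOW_REGEX (an extended regex
#   for the timestamp) and that contain a sync-family call.
#   Returns non-zero when nothing matches.
sync_calls_in_window() {
    window=$1
    shift
    grep -hE -- "$window" "$@" |
        grep -E '(^|[^[:alnum:]_])(sync|fsync|fdatasync|syncfs|msync)\('
}
```

It would be invoked as sync_calls_in_window '10:04:4[0-5]' * from inside delay-sync/ or immediate-sync/.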
> At this moment, I am not sure who can fix mork code, though.
As a database person, and applying the same principles of ACID (Durability in particular), I wonder if we could open() the .msf files with O_DSYNC.
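The effect of O_DSYNC can be tried from the shell without touching Thunderbird: GNU dd opens its output file with O_DSYNC when given oflag=dsync, so each write returns only after the data has reached stable storage. The helper name below is hypothetical, and this only illustrates the open flag; whether Mork's file layer could adopt it is a separate question.

```shell
# write_dsync DEST
#   Copies stdin to DEST with the output opened O_DSYNC (GNU dd only).
#   Hypothetical helper for experimenting with synchronous writes.
write_dsync() {
    dd of="$1" oflag=dsync status=none
}
```

Comparing the run time of a burst of small writes with and without oflag=dsync gives a rough feel for the cost such a change would carry.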
I'm also thinking that I may need to get a special version of thunderbird. The goal is to have some user-level debug switches to create a human-readable log to post.
Reporter
Comment 11•3 years ago
Comment 12•3 years ago
Pablo,
Still seeing this with version 91?
Reporter
Comment 13•3 years ago
Hi Wayne,
Unfortunately, I still see the problem on version 91.
I ran the unit test below with version 91.3.2 (64-bit); I've redacted some of the information:
for i in $(seq 1 6) ; do echo hey | mailx -r devnull@wnXXX -s "[WN] test $i" pablo@wnXXX; done
Comment 14•3 years ago
If I understand bug 418551 correctly, MORK has been removed for Thunderbird 93+, and replaced with jsoncpp.
Comment 15•3 years ago
(In reply to Thomas D. (:thomas8) from comment #14)
> If I understand bug 418551 correctly, MORK has been removed for Thunderbird 93+, and replaced with jsoncpp.
Spoke too soon. Looks like the real deal for messages is Bug 11050 / Meta bug 1572000. Overall de-mork is Meta Bug 453975.