Closed Bug 94602 Opened 23 years ago Closed 23 years ago

local mail folder manipulation is *SLOW*

Categories

(MailNews Core :: Backend, defect)

x86
Linux
defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: dledford, Assigned: Bienvenu)

References

Details

When deleting large numbers of emails, compacting large message folders, or
other similar activities that manipulate large portions of the message database,
the Mozilla mail package is *horribly* slow compared to Netscape.  It would
appear that it locks the folder, does manipulation of a single message, unlocks
the folder, then repeats for each and every single message or something like
that.  For example, moving 543 messages out of one folder and into the Trash
folder took over 2 minutes.  On Netscape, the same operation takes about 10 seconds.
the mail database is the summary file, not the file with the mail messages.
Also, when filing a bug, you need to say the build id of the version you are
running, or else we can't tell how old the bug is. This operation has been sped
up enormously. It might have slowed down recently, especially on Linux, which
seems to have huge issues with the way we manipulate the mailbox. A while ago, I
checked in code that made it so we don't re-open the source mailbox each time
for each message copy. I believe the folder locking is just a semaphore in
memory; it shouldn't have any appreciable time affect compared to the cost of
reading from or writing to disk.
Status: UNCONFIRMED → NEW
Component: Mail Database → Mail Back End
Ever confirmed: true
Sorry, build ID 2001080814

More accurate timing, it took just under 30 seconds to move 63 messages from my
Sent folder to Archives->Sent-Mail-Q3-2001.  The same 63 messages under Netscape
would have been less than 3 seconds probably.
This definitely looks like a linux only issue or atleast it is not a win32 
issue. It takes me between 8-10 secs for deleting/moving 130 local messages. 
I have a similar bug (bug 93260) where people are complaining about compacting 
folder is very slow. 
I can confirm that compacting folders is also slow here (I had a folder with
around 100 messages in it, and compacting was painful, taking most of the
elapsed 44 seconds for compacting).  And, speaking from the standpoint of being
a software developer, I can also confirm that the two problems "feel" similar in
terms of time consumed and responsiveness of the program while it is taking
place.  I'm running on the 2.4.2 kernel that shipped with Red Hat 7.1, it would
be interesting to know if a 2.2 kernel would make a difference on the speed of
the operation (which is possible if it's file locking or similar issues at play
here).  I'll test that and post back here tonight.
2.2 kernel does not make any difference.  OTOH, a strace of the process while it
is busy moving a large number of messages shows exactly where the problem is:

[dledford@aic-cvs dledford]$ more dledford/strace.out 
nanosleep({0, 499684000}, NULL)         = 0
rt_sigprocmask(SIG_SETMASK, [RT_0], NULL, 8) = 0
gettimeofday({997402491, 780252}, NULL) = 0
kill(31332, SIGRT_0)                    = 0
write(6, "\372", 1)                     = 1
gettimeofday({997402491, 781198}, NULL) = 0
ioctl(11, FIONREAD, [0])                = 0
poll([{fd=11, events=POLLIN}, {fd=16, events=POLLIN}], 2, 0) = 0
gettimeofday({997402491, 781599}, NULL) = 0
gettimeofday({997402491, 781677}, NULL) = 0
ioctl(11, FIONREAD, [0])                = 0
gettimeofday({997402491, 781839}, NULL) = 0
ioctl(11, FIONREAD, [0])                = 0
poll([{fd=11, events=POLLIN}, {fd=16, events=POLLIN}, {fd=5, events=POLLIN,
revents=POLLIN}], 3, 0) = 1
gettimeofday({997402491, 782238}, NULL) = 0
gettimeofday({997402491, 783747}, NULL) = 0
lseek(36, 0, SEEK_CUR)                  = 1725450
lseek(36, 0, SEEK_CUR)                  = 1725450
lseek(36, 0, SEEK_END)                  = 1725450
lseek(36, 1725450, SEEK_SET)            = 1725450
lseek(36, 1725450, SEEK_SET)            = 1725450
gettimeofday({997402491, 785075}, NULL) = 0
lseek(36, 0, SEEK_CUR)                  = 1725450
lseek(36, 0, SEEK_CUR)                  = 1725450
lseek(36, 0, SEEK_END)                  = 1725450
lseek(36, 1725450, SEEK_SET)            = 1725450
lseek(36, 1725450, SEEK_SET)            = 1725450
write(36, "From - Sun Aug  5 22:19:42 2001\n"..., 3970) = 3970
lseek(36, 0, SEEK_CUR)                  = 1729420
gettimeofday({997402491, 789549}, NULL) = 0
gettimeofday({997402491, 789737}, NULL) = 0
gettimeofday({997402491, 789818}, NULL) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [RT_0], 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RT_0], [RT_0], 8) = 0
gettimeofday({997402491, 790198}, NULL) = 0
nanosleep({0, 499620000}, NULL)         = 0
[dledford@aic-cvs dledford]$ 

From one nanosleep to the next, everything is fast.  At each nanosleep, there is
a *noticeable* pause, one the order of 1/2 second or so.  For some reason, it is
the nanosleep calls that are resulting in the delay and slowness.  Since this is
during the process of reading from one file and writing to the next, I would
suspect it is the same problem with compaction as well.
david, I remember and also see in mailbox protocol code there is 
 PR_Sleep(500) call to avoid some other problem, Is that
what is causing this. 
Well, I'm not David, but since the nanosleep call is being called with routinely
just a hair under 500,000,000 for the nanosecond portion of the delay, I would
say that's exactly what's causing the slowdown.
it should be using PR_TimeInterval - I thought I checked that fix in, but if
not, it's in my tree at home. Sorry about that.
We're trying to sleep for 500 microseconds, or half a millisecond to get around
some horrible problems in the file channel code - are you saying that we're
really sleeping for 500 milliseconds because of the call to PR_Sleep, or are we
sleeping for half a millisecond but it's causing a half second delay? In other
words, will changing the call to PR_Sleep(PR_MicrosecondsToInterval(500)) fix
the problem? Thanks for looking into this, btw. Also, compact just does a copy
of the non-deleted messages to a temp folder, which is why the copy and compact
feel so similar.
*** Bug 93260 has been marked as a duplicate of this bug. ***
Reading the man page to nanosleep, the range for the nano portion of the time
spec is 0 - 999,999,999.  So, since I had forgotten if nanoseconds were 1
billionth or 1 trillionth of a second, that answered my question.  A nanosecond
is 1 billionth of a second, and according to the strace above, nanosleep is
being called with arguments in the 500,000,000 range for the nanosecond portion
of the sleep, which is indeed roughly 1/2 second instead of the intended 1/2
millisecond.
I am testing that on my linux now. 
Excellent. As i said on my bug just duped, it takes me about a minute for each
100 mails i want to compress
Some problem w/ my linux debug build, crashes on start-up, will try again 
later. 
that happens to me once in a month
reinstalling mozilla fixes this for me
fix checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Summary: Database manipulation is *SLOW* → local mail folder manipulation is *SLOW*
What fix? I dont see any attachments, r, sr
Anyway i will be glad to test this. Will do as soon as i can
fix is described in the bug, use PR_MicrosecondsToInterval- please see bug 90079
for more info.
Maybe bugs 88449 and 85029 will get fixed by this one!

is this checked on trunk and branch or trunk only?
Verified fixed on 2001981106 trunk
My 20mb folder was compacted in less than 40 seconds which is "acceptable"
I have always seen this bug after compacting, mail folders forget the window
properties after compacting
Bug 94909 was filed with further explanation
Status: RESOLVED → VERIFIED
sorry that was 2001081106, anyway i am sure smart people got the idea
trunk only.
*** Bug 92732 has been marked as a duplicate of this bug. ***
Check out: http://www.mozilla.org/mailnews/linux_performance_results 
(under "Batch delete 5 messages"). It is local delete, and has speed up twice 
as fast.
Is this going into branch anytime soon?
no plans for checking it into the branch. It'll be in .9.4, which comes out in
two weeks or so.
CAUTION: mozilla was compacting my mail and downloading it at the same time.
This has happened before and didnt crash, but now it did and mozilla erased my
200 mails from my inbox!

Mail compaction should be done at exit
I filed a talkback, but got lost. I put in the summary something like
"compacting and downloading mail"
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.