POP: mail gets in very slowly in 0.9.9+ and rc1

VERIFIED FIXED

Status

MailNews Core
Networking: POP
--
major
VERIFIED FIXED
16 years ago
9 years ago

People

(Reporter: beanladen, Assigned: Navin Gupta)

Tracking

({perf})

Trunk
PowerPC
Mac System 9.x
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [adt2 rtm])

Attachments

(3 attachments, 2 obsolete attachments)

(Reporter)

Description

16 years ago
Solaris 2.8 with recommended PatchCluster 1
Problem observed since 0.9.9, continues to rc1:

When a bunch of mail (say, >5) is drawn from pop-server,
mail gets in very slowly (about 10 and more seconds
for every mail, regardless the length). During
the download, high network activity and collisions
are observed. nfsstat shows many readdirs. The
browser is completely blocked until download has
finished.
Mail download was fast in 0.9.8 (but ran without PatchCluster 1).
This should be resolved for final Mozilla 1.0.
(Reporter)

Comment 1

16 years ago
Sorry, platform is Sun Sparc, not PC.
Hardware: PC → Sun

Comment 2

16 years ago
cc'ing one of the sun guys.

Comment 3

16 years ago
Hi, grabow,
    I tested on SunOS pingpong 5.9 s81_51 sun4u sparc SUNW,Sun-Blade-100 and
didn't see what you said. I also tested on Redhat Linux 7.1. Can you give out
more clues about your enviroment, your operations to reproduce this bug? You may
test some very simple mails, then test some very complex mails and then test the
mix of them, see what happens. Thanks.
Henry

Updated

16 years ago
Component: Networking - General → Networking - POP
QA Contact: huang → sheelar
Summary: mail gets in very slowly in 0.9.9+ and rc1 → POP: mail gets in very slowly in 0.9.9+ and rc1

Comment 4

16 years ago
Take a look at Bug 140893. Seems to have the same problem under MacOS 8.6 and
OpenVMS.
(Assignee)

Comment 5

16 years ago
*** Bug 140893 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 6

16 years ago
Downloading 508 pop3 msgs on Mac 9.2 takes about 90 seconds which is 1.5 times
slower than linux/win2k. I am wondering if it has always been like this before
also. cc stephend who does perf, but I don't think we measure pop3 msg download
speed. 

Assignee: mscott → naving
Keywords: perf

Comment 7

16 years ago
Created attachment 81979 [details]
POP retrieving performance (MacOS 8.6) Comm4.7/Moz1.0rc2

The same mail msg (12MB) was used. Moz seems to stall every 5 secs and the
download throughput never go over 47k/s when Comm4.7 gets constant throughput
of ~275k/s. Under MacOS 8.6 with Mozilla/5.0 (Macintosh; U; PPC; en-US;
rv:1.0rc2) Gecko/20020501. I use ipop3d as server under LinuxPPC.

Comment 8

16 years ago
Bug 140893 is a duplicate of this bug. 
To resolve that bug, it was suggested to me to increase the send buffer size of
the POP server. I did so, and set it to 65536 byte (64kB). 
The next morning I had about 10 mail messages of just a little less then 1 MB a
piece. Retrieving these messages went fast !!

Can it be that there is a kind of handshaking problem over the buffersize to use
between Mozilla and the POP server ? (or another handshaking problem ?) 

Comment 9

16 years ago
I tried with other POP servers (as Sympatico.ca) on large emails and I got the
same result: Stall and low throughput. Change the send buffer size can be a
workaround... but we have clearly a problem with Moz here.
(Reporter)

Comment 10

16 years ago
Hi Henry (comment #3),
here's my environment: Sun Ultra 5 Sparc, POP server
on an intranet over 10Mbit Ethernet, /home directory
NFS mounted over that net. As a first workaround I
moved the cache to local /tmp. This helped a bit against
total stall, but does by far not reach the short download
times we had prior to 0.9.9 and are not comparative to 
any other browser. The browser itself is irresponsive
during the download of one large mail (say, a few 10th
kb and beyond) and gets a short time slice to recover while 
turning to the next mail to download. Especially interesting
is the huge amount of collisions, which indicate that there's
a lot of interaction underway. A shortened receiving buffer
could trigger this problem.
Something essential on the communication must have been 
changed since 0.9.8. I also saw one guy reporting the same 
problem with an IMAP server on Mac OS last week (but cannot 
find the bug number...), so this may not be POP specific, but 
some problem introduced on an upper or lower communication layer.

Comment 11

16 years ago
As mentioned before I reported this same problem in bug 140893.

My configuration on OpenVMS 7.3 is very simple. I'm running SMTP, POP and
Mozilla all on my workstation. Mozilla's mail accounts point to Local Host
(127.0.0.1), so no actual network traffic takes place. It is all done in memory.
The MTU for this pseudo interface is 4096 byte. But of course the TCPIP stack is
used by POP and Mozilla, so we are talking about real IP traffic !

Furthermore there is hardly any 'real' network traffic. The CPU is running at
about 1%, and the buffered IO rate (includes network) is 3 - 5 IO's per second
on a scale of 500. In short, this system is doing absolutely nothing most of the
time. :-)

Despite all that I had very poor mail retrieval speed before I changed the send
buffer of POP to 64kB. 

I think we can conclude from this that there is a serious problem somewhere in
Mozilla Mail (and News ??).



(Assignee)

Comment 12

16 years ago
I'll look into it more. thanks for good comments that might be helpful to track the
problem. 
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
(Assignee)

Comment 13

16 years ago
I am seeing slowdown in recent builds on mac. It is ok in 0.9.9 release. It would
be nice to narrow down when this started happening. Any help would be greatly
appreciated. 

Comment 14

16 years ago
For now I can say that it had begun before 2002042505 Branch... build that
reproduces the exact behavior shown in my previous attachment. A bug prevent me
to test older builds in the nightly directory under MacOS 8.6.
(Assignee)

Comment 15

16 years ago
This started happening right after I checked in the fix for bug 122361 - 04/05.
I looked at the patch that was checked in and found out
m_outFileStream->seek(PR_SEEK_END, 0) is taking a long time on mac, if I comment
them (2 instances) out I think we get back to before the fix. If I do check for
end-of-file before seeking it still improves performance but not as much. I wonder
why it affects only mac and other platforms, doesn't affect linux and win32 (as
I tested earlier)

conrad, do you know what could be the problem here. 
OS: Solaris → Mac System 9.x
(Assignee)

Updated

16 years ago
Hardware: Sun → Macintosh
I'll take a look.

Comment 17

16 years ago
Since the problem also exists with OpenVMS, may I point you to the effect that
increasing the pop send buffer size had in my setup ? I'm not suggesting that
this is the solution to the problem (it clearly isn't), but if we can figure out
why this change makes the retrieval so much faster, it may lead us to the cause
of the problem. 
(Reporter)

Comment 18

16 years ago
Back to my original Sun problem, I noted by accidence that
downloading is as fast as it has been before when starting Mozilla
on the machine where the Mail directory lives. This again
seems to hint to a buffer problem, here maybe something related
to the NFS buffers; see also comment #8.
Looked at seek(). It's incredibly expensive.
http://lxr.mozilla.org/mozilla/source/xpcom/io/nsIFileStream.cpp#330
1. Since it does a Flush() for each call, it hits the disk whenever there is
anything in the buffer (always after the first time).
2. PR_Available() internally does 2 PR_Seeks so we have a total of 4 PR_Seeks
for this operation.
3. No checks are done for seeking to the current position so, in this case, all
this work is for naught.
4. On the lowest Mac-specific level, PR_Seek has a similar lack of optimization.

Unless both FileImpl::Seek and PR_Seek can be significantly improved, avoid the
call to seek() is possible.
(Assignee)

Comment 20

16 years ago
conrad, Thanks for seek() info. how costly is checking eof()? I will have to
do that atleast in one place to avoid seeking every time.
(Assignee)

Comment 21

16 years ago
Created attachment 83122 [details] [diff] [review]
possible fix 1

This fix helps us to gain back some performance loss but it still does not get 

back to 0.9.9.
(Assignee)

Comment 22

16 years ago
Created attachment 83371 [details] [diff] [review]
proposed fix

This will get back us to the performance as it was before fixing bug 122361. 
We now seek only when we have to -> While downloading new pop3 mail, if the
header
flags are changed, then only it causes a seek because we are sharing folder
stream between pop3 code and mailDatabase code, otherwise not.
(Assignee)

Comment 23

16 years ago
David, can you review the last patch(proposed fix). This has to do with sharing
folder stream between mailDatabase and pop3 code. This was mainly done to avoid 
checking for and/or seeking un-necessarily

>Unless both FileImpl::Seek and PR_Seek can be significantly improved, avoid the
>call to seek() is possible.

Comment 24

16 years ago
It looks like you're seeking to the end of the file whenever we update the
folder flag if m_folderStream is set. That seems to be based on some sort of
very odd and hitherto unknown contract  - that if m_folderStream is set, we must
be getting new pop3 mail. Is that the case? Otherwise, we'd be slowing down
other operations by always seeking to the end of the file when updating flags.
For example, when deleting a bunch of messages, it looks to me like now we'll
seek to the end of the file after updating each message, which will slow that
down greatly.
(Assignee)

Comment 25

16 years ago
m_ownFolderStream is false only when we are sharing folder stream between pop3
and mailDatabase and this happens only when we are downloading msgs. So it will
not slowdown deletion of messages or reading msgs (updating flags) unless you
are doing
this while you are downloading msgs. 

Comment 26

16 years ago
Is that true? I thought we owned the folder stream if we created it, not if
we're not doing get new mail. For example, if I move a bunch of messages from my
inbox to another folder, and while that's going on, I press get new mail, won't
it be the case that the db owns the folder stream and pop3 will share it? 
(Assignee)

Comment 27

16 years ago
Comment #25 is what we have right now. I agree it is like a contract but I
didn't see any other way to restore the performance back to what it was originally. 
In your example we don't need to share any folder stream because we just do
batching when we are actually deleting msgs, so we cannot download msgs if we
are deleting msgs from Inbox. I mean there is no scope of sharing folder stream
the other way. 

BTW, I don't see you on AIM.
(Assignee)

Comment 28

16 years ago
I am comfortable with possible fix 1 also. we can have another bug to fix seek
on mac to restore the performance fully. 

Comment 29

16 years ago
I'd much prefer the first fix, iff checking for eof wasn't also slow (I don't
know if it  is or not - we'd need to ask) - also, in the first fix, are you sure
you can remove the seek to the end after publish msg hdr? That does't seem
right. And the real right fix is to make seeking to the end cheap when we're
already at the end. I don't know if that's hard or not in the file stream or
NSPR code. If it was a simple matter of checking eof, I would have thought they
would have done it. 

fix 2 seems fragile and just asking for other problems. We're shuffling around
the file ptr to make the current caller and code happy but could cause lots of
other problems down the road.

Another possibility is to use Tell to make sure the file pos hasn't changed
unexpectedly.
(Assignee)

Comment 30

16 years ago
possible fix 1 does the right thing, we don't need to seek to end after
publishing msg header, because all the writing to mailbox takes place in
WriteLineToMailbox and we just have to make sure we are at the EOF there. 
(Assignee)

Comment 31

16 years ago
Acutally checking for eof is also slow, that is what I found with my tests, but
it is better than what we have now. 

Comment 32

16 years ago
It seems like checking eof shouldn't be nearly as bad as what's happening now,
and I think your tests showed that, right? And the removing of the seek after
the publishmsghdr code is really a separate change in its own right, isn't it?
It could be removed by itself; it doesn't depend on the other changes in either
of these patches? That seems true to me, and should help a little. I'll put an
r= on the first fix, and if we want to explore better fixes, we can keep going.

Comment 33

16 years ago
Comment on attachment 83122 [details] [diff] [review]
possible fix 1

r=bienvenu
Attachment #83122 - Flags: review+

Comment 34

16 years ago
IMO I don't think that a half-temporary-fix is the way to go. I receive and send
lot of big emails and I still use Comm4.7 for my mail because Moz isn't enought
efficient on POP and SMTP transferts (bug 88489)...
(Assignee)

Comment 35

16 years ago
Created attachment 83761 [details] [diff] [review]
proposed fix, v2

So this patch gets rid of the contract I had in the last patch. We save the
offset
of m_folderStream before using in mailDB code and restore it back after using
it (if we are not the owner of m_folderStream). This is good because we will
not lose any performance and this is something general in the sense that if we
are not the owner of the folderStream we should not tinker with state, and if
we do we should restore it back. 

david, can you review ? thx.
Attachment #83371 - Attachment is obsolete: true

Comment 36

16 years ago
Comment on attachment 83761 [details] [diff] [review]
proposed fix, v2

I like this approach better. However, do we need to have methods and a member
variable to save and restore the folder stream pos? Couldn't this just be
handled with a local variable, since it's all in one synchronous method
(UpdateFolderFlag()?) At least I think that's the case, w/o seeing all the
context from the diff.
I'm also not 100% sure that "owning the folder stream" is equivalent to "no one
else is using the folder stream". Perhaps having a member variable that says
"sharing folder stream" would be more accurate. Shared folder streams would
require this seeking...
(Assignee)

Comment 37

16 years ago
Created attachment 83777 [details] [diff] [review]
patch with changes, making it local var

I was just trying to make it generic, missed out that both the calls were in
the same function. I have changed it to local var. please review.
Attachment #83761 - Attachment is obsolete: true

Comment 38

16 years ago
Comment on attachment 83777 [details] [diff] [review]
patch with changes, making it local var

r=bienvenu - I'm not convinced that there aren't problems where the db is in
batching mode, and thus has a stream, while a get new mail is done, but you're
fairly confident that's not the case. I would suggest, however, putting an
assertion in  nsMailDatabase::SetFolderStream() that m_folderStream is null
whenver this is called, and verifying that doing a get new mail while moving
lots of messages from the inbox doesn't trigger the assertion.Does that make
sense?
Attachment #83777 - Flags: review+
(Assignee)

Comment 39

16 years ago
I added this assertion to SetFolderStream

NS_ASSERTION(!m_folderStream || !aFileStream, "m_folderStream is not null and we
are assigning a non null stream to it");

Both should be non-null to assert and verified we are not hitting this assertion
while we are moving mail and do GetMsg()

Comment 40

16 years ago
Comment on attachment 83777 [details] [diff] [review]
patch with changes, making it local var

sr=mscott
Attachment #83777 - Flags: superreview+
(Assignee)

Comment 41

16 years ago
fixed
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Keywords: adt1.0.0, nsbeta1
Resolution: --- → FIXED

Comment 42

16 years ago
This have added a new warning on brad TBox:

+mailnews/db/msgdb/src/nsMailDatabase.cpp:335
+ `PRInt32 folderStreamPos' might be used uninitialized in this function

(Assignee)

Comment 43

16 years ago
ok, I will fix it. 

Comment 44

16 years ago
Sheelar, could you verify this on the trunk?  Navin, what areas of the product
need to be tested around to make sure this doesn't break anything?
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [adt2 rtm]
(Assignee)

Comment 45

16 years ago
This fix is dependent on the fix in bug 135275. If we want to take this we
have to take that. 

Comment 46

16 years ago
Here are the results of download timings before and after patch on just Mac OS 9.1:

Trunk build:2002-05-16-03
1.Create new profile with pop account
2.Change accounts and settings to leave mssgs on the server
3.Start download of 97 messages of 1MB size for the account
  Time taken: 27 seconds
4.Compose and send a message with an attachment of 9.3MB size
   Time taken to download this single message with a large attachment:8min 30sec

Trunk build: 2002-05-21-03 after the patch on commmercial trunk builds.
1.Create new profile with pop account
2.Change accounts and settings to leave mssgs on the server
3.Start download of 97 messages of 1MB size for the account
   Time taken: 5 seconds
4.Compose and send a message with an attachment of 9.3MB size
   Time taken to download this single message with a large attachment:8min 30sec

As per the results above it is very clear that there is imporved performance
with the patch.  

Updated

16 years ago
Depends on: 135275

Comment 47

16 years ago
Branch landing scheduled? On truck it works great.

Comment 48

16 years ago
marking this bug as verified since it has been verified on the trunk as it is so
far fixed on the commercial trunk.
Status: RESOLVED → VERIFIED

Comment 49

16 years ago
adt1.0.1+ (on ADT's behalf) for checkin to the 1.0 branch, pending Driver's
approval.
Blocks: 143047
Keywords: adt1.0.0 → adt1.0.1+

Comment 50

16 years ago
please checkin to the 1.0.1 branch ASAP. once there, remove the mozilla1.0.1+
keyword and add the fixed1.0.1 keyword.
Keywords: mozilla1.0.1+

Updated

16 years ago
Attachment #83777 - Flags: approval+

Comment 51

16 years ago
this will take me a while since the last patch doesn't apply since it's not
really one diff.

Comment 52

16 years ago
fix checked into 1.01 branch
Keywords: mozilla1.0.1+ → fixed1.0.1

Comment 53

16 years ago
verified the time taken to download 39.1MB pop inbox before and after fix.
05-16 trunk build on mac os9.1 =18 minutes
06-12-08 1.0 branch build on mac os 9.1 = 1 min 11 seconds.
changing keyword to verified1.0.1
Keywords: fixed1.0.1 → verified1.0.1
(Reporter)

Comment 54

14 years ago
I just reported bug 231814, and the longer I think about it, the more I feel that
this problem here has returned in 1.6. Can somebody check if the code that was
fixed here has been changed again ?

Comment 55

14 years ago
yes, this fix was definitely removed in order to fix mailbox corruption
problems. If we don't seek to the end, we can end up in situations where we
write mail into the middle of the file instead of the end...I'll see if there's
some other way to avoid this...
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.