Closed Bug 123063 Opened 23 years ago Closed 19 years ago

Stalls indefinitely at "saving to sent folder..." on SSL/IMAP folders

Product/Component: MailNews Core :: Backend
Type: defect
Priority: Not set
Severity: major
Tracking: Not tracked
Status: RESOLVED FIXED
Reporter: Bill.Burns
Assignee: Bienvenu
Attachments: 3 files, 1 obsolete file

I have my client configured for SSL/IMAP and SSL/SMTP.  I have my "sent" folder
specified to be the IMAP Sent folder.  Quite often (not every time) when I send
an email it is sent quickly but the subsequent saving it to the Sent folder just
stalls with "Saving to Sent folder..." until I cancel the save operation.

I don't see this when I have my "sent" folder specified to be locally stored.
I also don't see this problem if I disable SSL/IMAP when I use a different
server.  (I can't disable SSL/IMAP on my real mail server since it only supports
SSL/IMAP.)

I see the same behavior when using Netscape 6.2.1 RTM.
QA Contact: esther → sheelar
Reporter,
Which build are you using?  Do you have multiple accounts? Does this happen on
the first account in the profile?  
It happens on any profile, including when I create a new profile and set it up.
forgot to mention: I was using N6.2.1 RTM and daily builds up through 1/16/02.
We have a similar problem. Could this have something in common with bug 102816?
Server is RedHat Linux 7.1 running UW IMAP 2000c-10 (2000.287rh) and clients are
Windows 2000 w/ Mozilla 20020202.
reporter,
Which imap server?
in our case the server is running Netscape Messaging Server 4.15 on Solaris; my
client is a Windows 2000sp2.
I'm not the reporter, but I've occasionally seen this behavior on dredd.
However, it hasn't happened to me in a long time (I'm usually living off the trunk).
Hi,

we have this problem too. But it also occurs when using the same account
without SSL.
All,

I have the same problem, I can never copy to sent folder using ssl imap.

I use 1.0RC1 on Win2K and MacOSX; this was already a problem in earlier releases.

We use the Cyrus IMAP server developed at Carnegie Mellon University. We only
listen on ssl imap
A workaround I found is to save to the local "Sent" folder and then just
manually drag it across to the "Sent" folder on the server.

So it looks like a problem with the sending routine.
I don't see this problem with every email saved.  I estimate that I see this on
maybe 15% of the emails I save; it's not 100% reproducible for me (the original
poster).
QA Contact: sheelar → junruh
As an experiment I tried to write a mail and save it to the Drafts folder; this
did not work either. As with copying to the "Sent" folder, I get the error
message "The server responded Permission denied"
Seems to be a duplicate of bug 89285.

*** This bug has been marked as a duplicate of 89285 ***
Status: UNCONFIRMED → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Verified.
Status: RESOLVED → VERIFIED
reopen, probably not dupe of bug 89285, at least on bug 89285 there's
an error msg and this one doesn't
Status: VERIFIED → UNCONFIRMED
Resolution: DUPLICATE → ---
Summary: unreliable "saving to sent folder" on SSL/IMAP folders → Mozilla stalls at "saving to sent folder..." on SSL/IMAP folders
*** Bug 197798 has been marked as a duplicate of this bug. ***
Summary: Mozilla stalls at "saving to sent folder..." on SSL/IMAP folders → Stalls indefinitely at "saving to sent folder..." on SSL/IMAP folders
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4a) Gecko/20030326

This has been a long-standing bug in Mozilla 1.x.  It seems that any message
copy from a local folder to an IMAP folder (SSL or non-SSL) takes an extremely
long time to negotiate.  To make matters worse, there is little to no status
information (i.e. "moved/copied message #4 of 20" used to be the default
behavior in Netscape 4.x).

There have been so many dups on this that I think the original bug report is
starting to be diluted!  Half of this bug relates to extreme slowness, the other
half relates to lack of notification to the user as to what Mozilla is doing
while it appears to be "idle".   Bug 80877  Bug 156471  Bug 169608
I often receive the same error message. The message is sent fine; however, it
cannot make the copy to the Sent folder on the IMAP server. I suspect that the
error is a function of how long it takes me to compose a message: the longer it
takes, the higher the probability of an error.

Config info:
Client/OS
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4b) Gecko/20030409
Windows XP SP-1
Client is configured to use IMAPS and SMTP/TLS (port 25)

SSL gateway is between client and Mirapoint IMAP mailserver. GW serves as a
passive "protocol converter" between SSL and the native clear text IMAP on the
Mirapoint.

-Steve
If I get the error, I resend the message and the copy to the sent folder
*always* works the second time, however it also sends another email to the
recipient (of course).

QA Contact: junruh → esther
There seem to be quite a few votes that this can be confirmed.

However, I need someone to confirm it on a more recent build. I cannot do so.

Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030529

Generally happens when forwarding HTML E-mail with images to SpamCop.  Mail is
forwarded just fine, but occasionally stalls saving to IMAP Sent folder (I don't
have a problem saving to local Sent folder with same rev on another machine).
Confirming
Status: UNCONFIRMED → NEW
Ever confirmed: true
I got the same thing on Windows XP:
If SMTP options are set to "Use SSL when available", all windows hang (cease
responding) after sending an email.
Same problem here: W2k SP4, Mozilla 1.5 or Thunderbird 0.3, 0.4a (20031110), uw
imap 2000c with mbox mailboxes, ssl via sslwrap, SuSE Linux 7.3.

The problem arises especially when sending large attachments or many of them.
After stopping the "copying mail to Sent Folder" operation, it's no longer
possible to save the message; the save button does nothing. If it did not work
the first time, the problem will occur every time a mail is sent, until the end
of the Mozilla session.

It seems that the imap communication hangs. strace on the imap pid just
displays "READ...", and tail on the sent-mail box displays nothing until Mozilla
is completely closed. Sometimes the mail will be copied then.

Still happens here. 
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7a) Gecko/20031222

Prog.
This happens to me on Mozilla1.7RC1 on Linux (Mozilla/5.0 (X11; U; Linux i686;
en-US; rv:1.7b) Gecko/20040421).  Setting OS to All/All.  Also resetting owner
and QA contact to default.
Assignee: mscott → sspitzer
QA Contact: esther
OS: Windows 2000 → All
Hardware: PC → All
Not only is this broken operation fundamental to Thunderbird working on
SSL/IMAP, but it also wastes a good deal of time, as the user might wait extra
time thinking the message is still being sent.

Bug 227995 might be a duplicate of this, so I've added 'DUPEME - bug 123063?' to
the whiteboard on the bug.  When I see bug 227995 I get an error message after
waiting an inordinately long time (usually >45 minutes), so it might or might
not be a dup -- someone here should test this.

As I volunteered in bug 227995, just tell me what to do and I'll do it to give
any needed information to possible fixers.  (Edit: found a link to
<http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap> in another
maybe-dupe bug 163951.  Sometime in the next week I'll test and post some logs,
because this is nearly a constant aggravation with the mail server I have to
use.  I now routinely add my address to CC to make sure I know that the message
was sent and that I have a copy.)
Severity: normal → major
Flags: blocking-aviary1.0?
minusing for now. May plus later pending the logs getting posted by Jeff Walden.
Flags: blocking-aviary1.0? → blocking-aviary1.0-
(In reply to comment #28)
> minusing for now. May plus later pending the logs getting posted by Jeff Walden.

I've just tried for about an hour to reproduce this bug with the "Saving to Sent
folder..." message, and I can't.  I know it's there, but I haven't been able to
reproduce it by making an effort to do so.

For now, I'll keep running Thunderbird and creating logs.  When I eventually
reproduce it, I'll upload the log immediately.
thanks for trying Jeff.
I've seen this as well, with this additional behavior:  if I cancel the "save"
operation and then close the window, Thunderbird keeps a connection open to my
mailbox until I exit the program and restart.  This locks down the mailbox and,
with MIT's mail server, at least, causes duplicate copies of mail to be
delivered.  I don't know if this is a helpful data point, but it seems worth
mentioning.
Oh, my experiences are with the released 0.8 version, not the dev version.  will
switch to that and report back about repeatability here...  

With 0.8, though, I'm seeing approximately a 30% failure rate, I'd guess.  Maybe
as high as 50%.
Is MIT still using Cyrus v 2.1.5? That server has a well-known bug, where it
doesn't always handle append correctly, especially over SSL. Sometimes it just
spins waiting for more data, even though the client has sent all the data it
said it would.

Toggling the offline button on and off will have the side effect of sending more
data to the server, and causing it to get out of the append state. The only
other work around, according to the Cyrus developer I talked to, is upgrading
the server.
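
For readers unfamiliar with the append step: IMAP APPEND sends the message as a "literal", where the client first announces an exact octet count in braces and then sends that many octets. The sketch below only illustrates that framing (per RFC 3501) and why a server that miscounts would sit waiting for more data. `build_append` is a hypothetical helper, not Mozilla code, and it skips the wait for the server's "+" continuation.

```cpp
#include <string>

// Hypothetical helper, not Mozilla code: frames an IMAP APPEND command.
// The {N} marker promises the server exactly N octets of message data.
// Assumes the mailbox name needs no escaping inside the quoted string.
std::string build_append(const std::string& tag,
                         const std::string& mailbox,
                         const std::string& message) {
    std::string cmd = tag + " APPEND \"" + mailbox + "\" {" +
                      std::to_string(message.size()) + "}\r\n";
    // A real client waits for the server's "+" continuation before sending
    // the literal; this sketch simply concatenates the two parts.
    cmd += message;
    cmd += "\r\n";  // ends the command after the literal
    return cmd;
}
```

If the server believes fewer than N octets have arrived, it keeps reading, which would be consistent with extra traffic (such as toggling offline) nudging it out of the append state.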
(In reply to comment #33)
> Is MIT still using Cyrus v 2.1.5? That server has a well-known bug, where it
> doesn't always handle append correctly, especially over SSL. Sometimes it just
> spins waiting for more data, even though the client has sent all the data it
> said it would.

Yes (albeit with security patches installed), and even with my attempts to prod
them they're likely to *keep* using it for the next year.  ("...we may look to
upgrading in the next year.")  For those with the proper MIT certificates, see
the case report at <http://snurl.com/mit_old_imap_server>.
(In reply to comment #34)
> (In reply to comment #33)
> > Is MIT still using Cyrus v 2.1.5? That server has a well-known bug, where it
> > doesn't always handle append correctly, especially over SSL. Sometimes it just
> > spins waiting for more data, even though the client has sent all the data it
> > said it would.
> 
> Yes (albeit with security patches installed), and even with my attempts to prod
> them they're likely to *keep* using it for the next year.  ("...we may look to
> upgrading in the next year.")  For those with the proper MIT certificates, see
> the case report at <http://snurl.com/mit_old_imap_server>.

So... is it acceptable behavior for a client to hang if the server hangs?  The
"cancel" button does not work in a useful manner; there's no way to cause the
client to disconnect, reset state, reconnect, and attempt saving the message a
second time.  If you force the cancel, you can either send the message a second
time in order to get it saved to the folder, or discard the message without
saving a copy.

Even worse, after you've forced this cancel, it holds the connection open
indefinitely, locking your mailbox until you exit Thunderbird altogether.  

I would suggest that even though it's triggered by a server defect, it's a
serious bug that there is anything the server can do that will trigger this
kind of reaction from the client.
No, cancelling should kill the connection and allow you to save the message as a
draft...
Product: MailNews → Core
I have the same problem roughly 50% of the time.

Client: Thunderbird 1.0 (20041206) IMAP/SSL
Server: Kerio MailServer 6.0.6
(In reply to comment #37)
> I have the same problem roughly 50% of the time.
> Client: Thunderbird 1.0 (20041206) IMAP/SSL
> Server: Kerio MailServer 6.0.6

We have this problem all the time (100% !!!) if we try to send attachments
bigger than 100 kilobytes (approx.). After the stall we have to restart the
client to fix this until the next attachment is sent (or tried to). Saving to
Drafts sometimes works sometimes not. Users are frustrated 'cause they don't
know if their mail was sent or not.

Clients: Netscape 7.x, Mozilla 1.x, Thunderbird 0.x - 1.0
Servers: UW Imapd 4.7a + SSL & UW Imapd 2001a + SSL
SSL: SSLWrap 2.04-2.06 + OpenSSL 0.93-0.95a
(where x=all available versions)

Removing SSL or tunneling via SSH without SSL solves this problem but it is not
a real solution. Downgrade to Netscape 4.8 seems to be the only path here since
these guys at Mozilla don't care about us and we don't use M$ Outlook at all
(which works by the way). Have you noticed that this bug thread was opened here
almost  two (2) years ago, great.

And it is not the server side to blame. I have heard similar reports from users
of UW, Cyrus, Courier and now Kerio as well. If it affects all these (and
possibly more) the client should be studied and fixed. And finally, clients
should be immune to these types of problems even if they were caused by servers
- at least so many. We are quite tired of testing and trying without any support
from the developers. 

This was my 2 cents - whatever it is worth for.
We see this bug as well. Here are some of the details:

o Sent folder is on the server (IMAP)
o Message will occasionally send but stall on copying to Sent folder with
"Copying message to Sent folder" dialog.
o Actual mail is sent (SMTP part seems to complete)
o If you cancel out of the dialog, the composition window is in a weird state.
Can't save as draft, can't save to file, can't add text or attachments. But if
you attempt to close the dialog, and select "Save" from the "Message has not
been sent" dialog, the Save will sometimes work.
o If you don't cancel the dialog, you will get the "Error copying message to
Sent folder" error after a long delay (many minutes). You can edit, save,
attach, and attempt a resend of the message at this point.
o Happens regardless of message size.
o Restart of Thunderbird seems to clear things up for a while. But once it
starts acting up, it seems to act up on every message. (perhaps a state mismatch
between client and server?)
Couple other details:

o Thunderbird version 1.0 (20041206)
o Server is UW-IMAP (IMAP4rev1 2004.352)

I can provide a tcpdump if that's useful.

Craig Pratt's descriptions seemed very apt for my experience as well.  What I
wanted to add was that I can make a stalled message window immediately "unstall"
by going to the main Thunderbird window and deleting the message that I was
replying to.  Thunderbird continues to hang on subsequent sends, but the
deletion trick will "unstall" it every time.
I'd like to add something else - I noticed this only yesterday. Here's a little
scenario. 

1) I'm checking emails through my IMAP connection and am replying to them; they
get added to my Sent folders with no problems
2) Then a little later, I'm typing a reply to another message, and at a given
moment the status bar of the window where I am typing the reply displays:

"Sent Receiving: Headers 1 of 1..."

I'm not sure whether it appears immediately or only after some seconds/minutes.
Perhaps it only shows up when TB checks for new messages. But once it's there
for long enough to notice it, it stays there forever. 
Then, when I hit 'send', the 'Copy to Sent folder' stalls for the first time.
All subsequent replies will also stall, although they don't show this message. 
So it seems to me that this message, whatever it means exactly, is the border
between "no problem yet" and "problem". I hope it can help to solve the problem.
I also suppose, considering what comment #41 states, that the problem only
arises when replying to a message (not when composing a new one). 


I've noticed that message more often but only linked it to this problem
yesterday. Did anyone notice this too?


Sygmoral
More data points from what we're seeing:

o Stalling can occur regardless of message size or presence of attachments.
o Message may be new or a reply.

I'll try to watch for the symptom described in comment 42 ("Sent Receiving:
Headers 1 of 1...").
Wow - just like that - went to send a test message and saw the "Sent Receiving:
Headers 1 of 1..." at the bottom of the composition window with a green progress
bar showing 100%. It was there for the duration of my composition. 

However, when I went to send, it sent w/o issue. And I see the message in my
Sent folder. So at least for me, this symptom is not correlated with the stalling.
Don't know how much these behavioral reports help, but the synch idea seems to
be relevant(?):  I sent a message with a large attachment (5MB), and when it
stalled on copying to the sent folder, I cancelled the status window, and
clicked to close the message window (and chose not to save in the resulting
popup).  After this, I sent myself a small one-liner message, but when it
finished sending this small message and began to copy to the sent folder, it
took a very long time, so my assumption is that it was copying my *previous* 5MB
email to the sent folder instead.  In fact, this time it didn't happen to stall,
so the end result was that BOTH of these two messages were in my sent folder
when it finished (hmph, and even yet another message I had previously cancelled
out of a stalled copy-to-sent window).

If this is just noise, I apologize.
I have experienced the same behaviour as in comment #45... not so often though,
but sometimes. When I'm sending many emails, and they're not getting copied to
the Sent folder, but I leave them "loading" anyway, then sometimes I have for
example three windows with a message that have all already been sent but that
are all still trying to get copied to the Sent folder. Then I say, they're not
going to get copied anyway, so let's close the windows one by one; but when I do
that, sometimes after closing the first one, the second or third one will say
"copying failed, want to try again?" and sometimes TB will then manage to copy
it, and afterward I can find all three messages in the Sent folder - although
indeed the first one was cancelled "while copying". 


So I think we agree this problem is quite complex, even to just describe it
accurately...  It appears to me that TB is simply having difficulties to tell us
what exactly is happening while it's trying to copy the message to the Sent
folder. Which does not necessarily mean that it's not working: sometimes after
closing the window that says it's not managing to copy to 'Sent', you'll simply
find it in that folder anyway... 
However most often I don't. So unfortunately I'm still not able to be more
specific; I hope this behaviour is not due to an 'other' bug. 


Perhaps the developers could release a tiny plug-in or extension to provide the
user (us) with more details on what TB is doing, step by step, and what it's
waiting for at a given moment, when trying to copy the message to "Sent"? 


Sygmoral
can folks try with a trunk build from today or yesterday? I fixed a rather
fundamental problem in the copy service then.
I have downloaded the nightly build 20050406, and I am no longer experiencing
this issue!

(However I did get an error message about missing an entry point when I first
started Thunderbird. After saying OK, it loaded anyway. It does this every time
now when I load Thunderbird.)
Ignore the last part of my comment; this was apparently related to Enigmail. I
uninstalled it and no longer get the error. Plus the Copying to sent folder bug
is gone.
great, I'm going to mark this as a dup, then.

*** This bug has been marked as a duplicate of 287658 ***
Status: NEW → RESOLVED
Closed: 22 years ago → 19 years ago
Resolution: --- → DUPLICATE
I regret to inform you the bug has now come back after an hour of usage. It was
working great, and now it is happening again. What I am seeing is that
eventually the message is copied to the sent folder, but it takes forever, way
more than normal. :-(
if it eventually succeeds, then it's not the same as the other bug(s) - I wonder
if your server is slow at times, or if you're running a version of the Cyrus
imap server that has a bug with appending to the sent folder...
We are using SmartMax MailMax for our mail server, not Cyrus.

We have relatively low traffic to our servers, especially IMAP traffic. It is
possible the issue is related to traffic to the server, but it seems weird that
I am the only one experiencing this problem in the office. 

The issue has definitely improved since moving from 1.0.2 to this nightly trunk
build, though.
This is not the same as bug 287658
(https://bugzilla.mozilla.org/show_bug.cgi?id=287658).

That bug is an issue with saving to local folders and this one is about saving
to a sent mail folder on the *server*. 

Is there any hope of a resolution for those of us accessing mail on the
problematic Cyrus servers? Or does it just make sense to switch to another client?
for cyrus users, 1.1 has general tcp timeouts, so we'll eventually timeout our
connection to the problematic cyrus server, which should free everything up. Of
course, MIT could upgrade to a version of the cyrus server w/o this bug.
Just to be clear, this issue is NOT limited to Cyrus servers.  At least I see it
here on the latest Courier IMAP.  I haven't tried the latest build (I hesitate,
since I use Japanese-ified builds), but given the fact that Thomas Winzig took
back his claim that it was resolved, shouldn't this be re-opened (or at least
only closed if others can verify)??
He said it eventually completed - that's not a permanent stall...
(In reply to comment #55)
> for cyrus users, 1.1 has general tcp timeouts, so we'll eventually timeout our
> connection to the problematic cyrus server, which should free everything up. Of
> course, MIT could upgrade to a version of the cyrus server w/o this bug.

Just one user reports his problem was partially solved and this BUG was marked as
resolved ? No no no, this bug affects UW, Cyrus, Courier, Kerio and now MailMax
too. Please read the thread above. This should be reopened ASAP. I verified the
latest released clients (1.7.6 and 1.0.2) and they both fail miserably. Btw, MS
Outlook works well with our UW imapd as does the good ole Netscape Comm 4.8 ...
got the point ?
since this fix was checked into 1.02 or 1.7.6, it's not surprising it's still
broken there. This bug was fixed on the trunk, and will be in the next trunk
release (1.1)
that should be "wasn't checked into..."
(In reply to comment #59)
> since this fix was checked into 1.02 or 1.7.6, it's not surprising it's still
> broken there. This bug was fixed on the trunk, and will be in the next trunk
> release (1.1)

Just tested the latest TB build and with no luck. First mail was sent, copied to
sent folder and the "copying to" -windows timed out in few minutes. The second
mail was sent, copied to sent folder but the "copying to..." -window did not and
I repeat did not time out in 15 minutes so I cancelled it and got the usual
warnings.  Both mails had attachments up to 3-4 Mb in size. So, the fix does not
work for us. Please, reopen the bug.
>"copying to" -windows timed out in few minutes
Did it really time out, in the sense of getting an alert saying the network
operation had timed out? Do you have a virus checker installed? What IMAP server
are you using? Have you tried increasing the timeout (tools | options | advanced...

(In reply to comment #62)
> >"copying to" -windows timed out in few minutes
> Did it really time out, in the sense of getting an alert saying the network
> operation had timed out? Do you have a virus checker installed? What IMAP server
> are you using? Have you tried increasing the timeout (tools | options | advanced...

Yes, it really timed out. It stayed some time in the "Sending authenticate login
information..." & 100% status and after some time it returned "The connection to
server xxx timed out" and "There was an error copying the message to sent
folder" etc etc. The second try took more than 15 minutes and I got frustrated
watching the "Copying to sent folder" -window and cancelled it myself by
clicking the cancel button

Yes, I have a virus scanner - Symantec Antivirus CE 9.01. Imap server has UW
imapd + sslwrap.

I tried the connection timeout setting and it works so that the "Connection to
server xxx timed out" -window appears after exactly the time specified there.
But it doesn't affect the second try at all. It still says "Copying to sent
folder" and it stays there longer. One good thing though, when the connection
timed out, I was able to send mail without attachments and everything worked
fine = no error messages or waiting.

Btw, every time I tried the mail was sent and copied to sent folder.
it could be that the error path for the imap url timeout doesn't clean up the
copy state - I'll have to look into that, if I can figure out a way of making it
fail.
Would a log file help? I can't remember or find out how to generate one. 
I managed to get the imap append to timeout by doing some trickery in the
debugger, and it failed the way you described. So I just have to figure out why
this isn't propagated to the copy state...
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
I've done a lot of investigation and work on this. The cancel wasn't cancelling
the imap load group because cancelling the imap mock channel doesn't abort the
input stream. Once I fixed that, and a couple other problems, I ran into the
following bizarre problem:

The imap thread is in a blocking read waiting for data that will never come. So
it's blocked in nsPipeInputStream::Wait(). The cancel code now aborts the
transport, which eventually causes nsPipe::OnPipeException to get called, which
notifies the nsPipe monitor. If I switch focus from the compose window to
another window (same app or different app) and back to the compose window
*before* doing the cancel, the ui thread nsAutoMonitor::Notify from the nsPipe
Exception code is not received by the nsPipe code which is blocked on the same
monitor. After talking to Darin, I tried changing the nsPipe code to use
NotifyAll but it didn't help. If I don't switch focus, the cancel works fine.

I've added some more logging and generated logs, and it really seems to boil
down to the imap thread never waking up from the monitor. I don't know what
about switching focus to a different window could cause such a strange side
effect. I'll attach the pr logging that shows the problem. Wan Teh, does this
ring any bells, or do you have any advice as to how to track this down?

I've only tried this on win32, so I don't know whether this problem is XP (cross-platform) or not.
Assignee: sspitzer → bienvenu
Status: REOPENED → NEW
Here's where 932 is blocked, waiting for data:
932[42647a0]: OOO WriteSegments [this=42657f8 count=2]
932[42647a0]: OOO rolling back write cursor 552 bytes
932[42647a0]: OOO advancing write cursor by 2
932[42647a0]: nsSocketTransportService::PostEvent [event=775a580]
932[42647a0]: III ReadSegments [this=4264f98 count=4096]
932[42647a0]: III pipe input: waiting for data

Here's where we press cancel and the PipeException is generated

0[3a5c18]: III CloseWithStatus [this=4264f98 reason=80470002]
0[3a5c18]: PPP nsPipe::OnPipeException [reason=80470002 output-only=0]
0[3a5c18]: nsPipeInputStream::OnInputException [this=4264f98 reason=80470002
mcallback=0 mBlocked=1]
0[3a5c18]: PPP nsPipe::OnPipeException notifying monitor after OnInputException



and you'll see that 932 never wakes up.
this patch contains various fixes I needed to make this work as well in the
case where focus isn't switched from the compose window. The most important
part is making the mock channel know about the nsImapProtocol object so it can
tell the thread to die on cancel, which eventually causes us to close the
transport. Another important part is fixing the timeout handling code to set
the timeout correctly when we use a cached connection. I also made cancelling
the progress window stop the urls before sending the on state change, which
seemed like a good idea at the time, but I don't know if it matters and I'll
probably try things w/o that.
Wan-Teh: Basically, the log file shows that one thread is stuck on PR_Wait even
though another thread called PR_Notify on the same monitor.  This seems like an
impossible situation, and yet it's what the log file shows.  Can you think of
any possible explanations for PR_Notify being ignored?
Please give me the source files and line numbers of
the PR_Wait and PR_Notify calls.
the nsPipe code uses nsAutoMonitors:

so we wait here, on the imap thread:
http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsPipe3.cpp#610

and notify here, from the ui thread:
http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsPipe3.cpp#568

I've added a printf if the mon.Notify call fails, and that printf doesn't get hit.
Thanks.  I didn't see anything wrong in the nsPipe3.cpp
code.
Attached patch: proposed fix
OK, this fixes all the issues that I've encountered. To summarize:

on cancel, nsMsgProgress should stop the urls before sending a notification,
because in the notification listeners, we can put up a prompt and issue a
retry.

in compose, we should clear out mSendProgress, because we won't do a retry if
mSendProgress is non null, and has been cancelled (which it has in this case)

in imap, as I said before, make the mock channel cancel kill the nsImapProtocol
thread, so that everything will get torn down.

When reading from the input stream, hold an extra ref to the stream so that if
it gets cancelled, it won't get deleted out from under us. Thx to Darin for
that tip!

Also, make sure we restore the timeouts when using a cached imap connection.
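
The ordering point in the first item (stop the urls, then notify) can be sketched as below. This is a minimal illustration under stated assumptions, not the actual nsMsgProgress code: `Progress`, `running`, and `listeners` are hypothetical names. It shows why the order matters: a listener that issues a retry during notification must not have that retry stopped afterwards.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a progress object; not the real nsMsgProgress.
struct Progress {
    std::vector<std::string> running;  // operations currently in flight
    std::vector<std::function<void(Progress&)>> listeners;

    void Cancel() {
        // 1. Stop everything that is in flight first.
        running.clear();
        // 2. Only then notify. A listener may prompt the user and start a
        //    retry here; because the stop already happened, the retry survives.
        auto snapshot = listeners;  // listeners may modify the list while running
        for (auto& listener : snapshot) listener(*this);
    }
};
```

With the opposite order (notify, then stop), a retry queued by a listener would be cleared by the stop that follows it.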
Attachment #181977 - Attachment is obsolete: true
Attachment #182202 - Flags: superreview?(mscott)
Attachment #182202 - Flags: superreview?(mscott) → superreview+
Comment on attachment 182202
proposed fix

this should clean up a class of imap hangs...
Attachment #182202 - Flags: approval-aviary1.1a?
David,

Could you explain the missed monitor notify we
looked at earlier this week?
In some situations, the monitor was getting destroyed before the imap thread had
a chance to wake up (possibly even before the monitor was notified at the lowest
levels, I don't know for sure because my logging is at a higher level). This
turned out to be because the pipe was getting destroyed out from under us, from
a different thread, because we weren't holding a reference to the input stream.
So the fix was to hold a ref to the input stream while we're in a blocking read.
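
The lifetime rule described here can be sketched with `std::shared_ptr`. This is only an illustration of "hold a ref while reading", not the nsPipe or nsImapProtocol code: `InputStream`, `Channel`, and `ReadHoldingRef` are hypothetical names, and the cancel is simulated by a direct call on the same thread rather than arriving from a second thread.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the pipe's input stream.
struct InputStream {
    std::string status = "open";
    void CloseWithStatus(const std::string& reason) { status = reason; }
};

// Hypothetical owner of the stream.
struct Channel {
    std::shared_ptr<InputStream> stream = std::make_shared<InputStream>();

    // Cancel closes the stream and drops the owner's reference.
    void Cancel() {
        stream->CloseWithStatus("aborted");
        stream.reset();
    }
};

// The reader takes its own reference before it starts to wait, so the
// stream cannot be destroyed while the read is still in progress.
std::string ReadHoldingRef(Channel& channel) {
    std::shared_ptr<InputStream> keepAlive = channel.stream;
    channel.Cancel();          // stands in for a cancel from another thread
    return keepAlive->status;  // still valid: keepAlive holds the last ref
}
```

Had the reader kept only a raw pointer, the object would have been freed inside `Cancel()` and the final read of `status` would touch destroyed memory, which is the kind of failure described above.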
Comment on attachment 182202
proposed fix

a=sspitzer for tbird 1.1a
Attachment #182202 - Flags: approval-aviary1.1a? → approval-aviary1.1a+
fix checked in. Please try tomorrow's trunk build and let me know. This fix
won't stop the stalling in the case where the server is broken, but it should
fix cancel to clean everything up, and it should fix the timeout so that it
works even when using cached connections.
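
As a rough illustration of the cached-connection point (the timeout has to be applied again whenever a cached connection is handed out, not only when a new one is created), here is a small sketch. `Connection` and `ConnectionCache` are hypothetical names and the structure is assumed; the real code lives in the imap protocol classes and may look quite different.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical connection object; only the timeout matters for this sketch.
struct Connection {
    int timeoutSeconds = 0;
};

// Hypothetical cache keyed by host name.
class ConnectionCache {
public:
    explicit ConnectionCache(int timeoutSeconds) : timeout_(timeoutSeconds) {}

    std::shared_ptr<Connection> Get(const std::string& host) {
        auto& slot = cache_[host];
        if (!slot) slot = std::make_shared<Connection>();
        // Re-apply the timeout on every hand-out, whether the connection is
        // new or cached, so a reused connection never runs without one.
        slot->timeoutSeconds = timeout_;
        return slot;
    }

private:
    int timeout_;
    std::map<std::string, std::shared_ptr<Connection>> cache_;
};
```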
Status: NEW → RESOLVED
Closed: 19 years ago → 19 years ago
Resolution: --- → FIXED
(In reply to comment #79)
> fix checked in. Please try tomorrow's trunk build and let me know. This fix
> won't stop the stalling in the case where the server is broken, but it should
> fix cancel to clean everything up, and it should fix the timeout so that it
> works even when using cached connections.

Thank you. It looks like both the cancel and the timeout work well. I have
tested the 20050501 nb for about a week now and have never had a problem with
those features corrected. I still don't believe all the servers mentioned
earlier are broken but I will have a good opportunity to test this later this
year when we upgrade our server.

Btw, thanks to everyone involved for the new detach / delete attachment feature
! That is what we have been looking for since we gave up WinPmail and Nowell six
years ago.
If an error is detected during a send/append (in an IMAP folder operation), EstablishServerConnection is called to re-establish the connection w/o having to close down the compose window.  Also, the progress window is maintained instead of nulled in nsMsgSend.cpp to allow the progress bar to continue operation.
Comment on attachment 202713 [details]
fixes the case where connection to server has intermittent problems.

Interesting, thx for the patch. I can't recreate this problem so I was hoping you could try a different approach:

Instead of calling EstablishServerConnection(), can you try this:

            m_runningUrl->SetRerunningUrl(PR_TRUE);
            m_retryUrlOnError = PR_TRUE;

That should make us go through the normal retry mechanism. 

My memory is fuzzy, but I remember it being pretty important in some cases to null out mSendProgress where I did...

EstablishServerConnection just tries to read the greeting from the server; it doesn't actually connect to the server directly. Does putting that call in actually make us re-connect to the server?
Comment on attachment 202713 [details]
fixes the case where connection to server has intermittent problems.

hmm, I bet it works because it calls CreateNewLineFromSocket, which attempts to read from the connection, which generates an error, and takes us to the retry code. So my suggestion would be a bit more direct.
Yes, that's exactly what's going on.  For an easily reproducible scenario, send a large attachment and play around w/ the network cable while it's in the middle of the copying process.  We had to leave that progress window in place, otherwise it would just get orphaned during the retry, making it appear as if a hang was taking place...
(In reply to comment #83)
> (From update of attachment 202713 [details] [edit])
> hmm, I bet it works because it calls CreateNewLineFromSocket, which attempts to
> read from the connection, which generates an error, and takes us to the retry
> code. So my suggestion would be a bit more direct.

Hello, just wanted to check whether this new fix is included in released versions or not. Our users do not like the hang that happens 99% of the time when sending attachments, even with the timeout fixed now. And since an MS patch broke the send to mail recipient function with Netscape 4.8, we are in desperate need of a working solution now.

THX.
tjsalin@yahoo.com, what imap server are you using? If it's Cyrus 2.1.5, you will need to upgrade your server.
(In reply to comment #86)
> tjsalin@yahoo.com, what imap server are you using? If it's Cyrus 2.1.5, you
> will need to upgrade your server.

Still hanging with the old UW imapd + sslwrap (see comment #63 etc.). I was hoping that this "feature" might get a new approach via the patch introduced in the 11-11-2005 posts. If not, we really have to see if we can speed up the server upgrade process. But still, there remains a doubt that we might face the same problem later again with a new box & code.
Is your server really failing to respond to the append command when the client's finished sending it data? If so, that really does need to be fixed on the server.
(In reply to comment #88)
> Is your server really failing to respond to the append command when the
> client's finished sending it data? If so, that really does need to be fixed on
> the server.

I don't think it is the server because everything works fine without attachments, with Outlook and Netscape 4.x. And if we turn the SSL encryption off it works with TB as well. You still think it is the server ? After reports from users of UW, Cyrus, Courier, Kerio and MailMax. I can believe in one or two but not 5.

If someone could please explain to me why Outlook and Netscape 4.8 work but Mozillas and TBs not. Then I will really consider spending the extra money and upgrading the server.

Thx.
> I don't think it is the server because everything works fine without
> attachments, with Outlook and Netscape 4.x.

Did you mean "with" attachments?

Are your users doing a MAPI send, i.e., sending e-mail from other apps like Word? Not that it should matter...

The only instance I know of this problem that's the server's fault is with the Cyrus 2.1.5 server (and iirc, some earlier version of the Kerio server)

If you can easily reproduce this problem, an imap protocol log generated by following these instructions might be useful:

http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap
(In reply to comment #90)
> Did you mean "with" attachments?

It fails with attachments in TB. It does not fail with attachments in Netscape 4.8. It does not fail with attachments in Outlook. TB works well only without attachments. TB also works well without SSL with attachments.

> Are your users doing a MAPI send, i.e., sending e-mail from other apps like
> Word? Not that it should matter...

No, just starting TB (1.5.0.2) and click write. Compose the message, attach a file with attach button and click send. It delivers the mail and proceeds to status message: sending authenticate login information... 100% and after the timeout period it says: connection to server xxx timed out. When we click ok, it says: there was an error copying the message to sent folder. Retry ? We click cancel and it says: the message was sent successfully, but could not be copied to sent folder... We click cancel and check the sent folder and the mail goes there every time. We just get that timeout and error messages. Just like it was last year when you fixed the timeout issue.

> The only instance I know of this problem that's the server's fault is with the
> Cyrus 2.1.5 server (and iirc, some earlier version of the Kerio server)

Just check the earlier posts for this bug and you'll see there are more.

> If you can easily reproduce this problem, an imap protocol log generated by
> following these instructions might be useful:
> http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap

Tried that but the logfile is empty every time. Is that supposed to work with TB 1.5.0.2 too ? I tried with various log levels too. No typos, checked it several times...

Thx.
did you substitute "IMAP" for "protocol" in the instructions for setting the environment variable?
(In reply to comment #92)
> did you substitute "IMAP" for "protocol" in the instructions for setting the
> environment variable?

Yes, that was it, how stupid of me. Now I was able to produce two samples. One with timeout (attachment size ca. 700k) and one without (attachment size ca. 6k). So it still has something to do with the size as well.

I think the important lines are as follows (the one that worked - right after the attachment):

3604[30a5a18]: 30a4b40:server_name_omitted:A:SendData: 
3604[30a5a18]: ReadNextLine [stream=30a8c58 nb=23 needmore=0]
3604[30a5a18]: 30a4b40:server_name_omitted:A:CreateNewLineFromSocket: 2 OK APPEND completed
3008[23491b8]: 243e4d8:server_name_omitted:S-INBOX:SendData: DONE
3008[23491b8]: ReadNextLine [stream=2403dd0 nb=14 needmore=0]
3008[23491b8]: 243e4d8:server_name_omitted:S-INBOX:CreateNewLineFromSocket: * 384 EXISTS
3008[23491b8]: ReadNextLine [stream=2403dd0 nb=12 needmore=0]
3008[23491b8]: 243e4d8:server_name_omitted:S-INBOX:CreateNewLineFromSocket: * 1 RECENT
3008[23491b8]: ReadNextLine [stream=2403dd0 nb=21 needmore=0]
3008[23491b8]: 243e4d8:server_name_omitted:S-INBOX:CreateNewLineFromSocket: 7 OK IDLE completed

And the one that did not work = timeout (right after the attachment):

3500[3097458]: 30adae0:server_name_omitted:A:SendData: 
3500[3097458]: ReadNextLine [stream=306cc68 nb=0 needmore=1]
3500[3097458]: 30adae0:server_name_omitted:A:CreateNewLineFromSocket: clearing IMAP_CONNECTION_IS_OPEN - rv = 804b000e
3500[3097458]: 30adae0:server_name_omitted:A:TellThreadToDie: close socket connection
3500[3097458]: 30adae0:server_name_omitted:A:CreateNewLineFromSocket: (null)
3500[3097458]: 30adae0:server_name_omitted:A:ProcessCurrentURL: aborting queued urls
3500[3097458]: ImapThreadMainLoop leaving [this=30adae0]
3356[2349188]: ReadNextLine [stream=2401960 nb=14 needmore=0]
3356[2349188]: 243dd10:server_name_omitted:S-INBOX:CreateNewLineFromSocket: * 385 EXISTS
3356[2349188]: ReadNextLine [stream=2401960 nb=12 needmore=0]
3356[2349188]: 243dd10:server_name_omitted:S-INBOX:CreateNewLineFromSocket: * 1 RECENT
3356[2349188]: 243dd10:server_name_omitted:S-INBOX:SendData: DONE
3356[2349188]: ReadNextLine [stream=2401960 nb=21 needmore=0]
3356[2349188]: 243dd10:server_name_omitted:S-INBOX:CreateNewLineFromSocket: 8 OK IDLE completed

Looks like the server is still waiting for more data. If you think the complete logs might help, I can send them to you.

Thx.
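The difference between the two excerpts above comes down to whether a tagged "OK APPEND completed" response ever shows up on a CreateNewLineFromSocket line. A minimal sketch of checking a saved log for that, assuming the line layout shown in these excerpts (`SawAppendCompleted` is a hypothetical helper, not part of the mailnews code):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if any log line shows the server's tagged completion
// response to APPEND, as read by CreateNewLineFromSocket.
bool SawAppendCompleted(const std::vector<std::string>& logLines) {
    const std::string marker = "CreateNewLineFromSocket: ";
    const std::string wanted = " OK APPEND completed";
    for (const std::string& line : logLines) {
        std::string::size_type pos = line.find(marker);
        if (pos == std::string::npos) continue;  // not a server response line
        // Everything after the marker is the raw server response,
        // e.g. "2 OK APPEND completed" (tag, status, text).
        std::string response = line.substr(pos + marker.size());
        if (response.find(wanted) != std::string::npos) return true;
    }
    return false;
}

// Usage with lines taken from the two excerpts in this comment.
inline void DemoLogCheck() {
    std::vector<std::string> worked = {
        "3604[30a5a18]: 30a4b40:server_name_omitted:A:SendData: ",
        "3604[30a5a18]: 30a4b40:server_name_omitted:A:CreateNewLineFromSocket: 2 OK APPEND completed",
    };
    std::vector<std::string> timedOut = {
        "3500[3097458]: 30adae0:server_name_omitted:A:SendData: ",
        "3500[3097458]: 30adae0:server_name_omitted:A:CreateNewLineFromSocket: (null)",
    };
    assert(SawAppendCompleted(worked));
    assert(!SawAppendCompleted(timedOut));
}
```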
yes, that's NS_ERROR_NET_TIMEOUT - I assume the lines before the timeout error are the last bits of data appended to the sent folder, but no corresponding "Append completed" from the server?

you could try increasing the timeout (tools | options | advanced | connection timeout) if, for some reason, your server takes a really long time to respond (doubtful, but perhaps worth a try)
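For anyone reading the log, the `rv = 804b000e` value can be taken apart by hand. The sketch below assumes the usual nsresult layout (failure bit on top, module number plus a base offset of 0x45 in the upper half, error code in the low 16 bits) and that the network module is number 6; the helper names are made up for illustration and the constants should be checked against nsError.h.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, mirroring NS_ERROR_MODULE_BASE_OFFSET and
// NS_ERROR_MODULE_NETWORK as I understand them from nsError.h.
constexpr std::uint32_t kModuleBaseOffset = 0x45;
constexpr std::uint32_t kModuleNetwork = 6;

// High bit set means failure.
constexpr bool IsFailure(std::uint32_t rv) {
    return (rv & 0x80000000u) != 0;
}

// Module number: upper 16 bits, minus the base offset, masked to 13 bits.
constexpr std::uint32_t ModuleOf(std::uint32_t rv) {
    return ((rv >> 16) - kModuleBaseOffset) & 0x1fffu;
}

// Error code within the module: low 16 bits.
constexpr std::uint32_t CodeOf(std::uint32_t rv) {
    return rv & 0xffffu;
}

// Usage: decode the value from the failing log excerpt.
inline void DemoDecode() {
    constexpr std::uint32_t rv = 0x804b000eu;
    assert(IsFailure(rv));
    assert(ModuleOf(rv) == kModuleNetwork);  // 0x4b - 0x45 = 6
    assert(CodeOf(rv) == 14);                // network error 14, the timeout
}
```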

(In reply to comment #94)
> yes, that's NS_ERROR_NET_TIMEOUT - I assume the lines before the timeout error
> are the last bits of data appended to the sent folder, but no corresponding
> "Append completed" from the server?

Yes, all the attachment data is just before the lines I sent. And no append completed in the failed entry.

> you could try increasing the timeout (tools | options | advanced | connection
> timeout) if, for some reason, your server takes a really long time to respond
> (doubtful, but perhaps worth a try)

I have tried with 20, 30, 60, 180 and 360 seconds. It does not work. I'm about to believe that it really is the server, but could you please explain why older Netscape products (4.x) and Outlook work and the new ones + TB do not ? Do the working products automatically assume that all the data is sent and everything is ok if no append completed is received ? Or is there any possibility that the new products miss the server's reply ?

Thx.
I don't know why Netscape 4.x and Outlook work, but at least in the case of Outlook, it's possible it just assumes the whole operation succeeds...

I don't think the client could miss the append ok completed, unless something bad is happening at the ssl level (which I highly doubt). A remote possibility is that a virus checker or proxy server is interfering with the network data.
Product: Core → MailNews Core
See Also: → 1745130