Closed Bug 70928 Opened 24 years ago Closed 23 years ago

Can't use Send Page & failing to send message after a certain size

Categories

(MailNews Core :: Composition, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9

People

(Reporter: scottputterman, Assigned: vparthas)

References

Details

(Whiteboard: [nsbeta1+])

Attachments

(1 file)

Build 2001030505 Windows 2000

Whenever I go to a page and do Send Page, the status says "Sending Message" but
it never finishes.  I hit Get Msg and it never appears.  However, if I write a
new message and send it, it sends fine.
Apparently this doesn't happen on all pages.

It happens to me when going to a bugzilla bug report and then doing Send Page.
It happened to me when sending my status report for today (which was a file on
disk).

Eventually the server sends a timed out error report.
QA Contact: esther → laurel
Here's what I found and I believe it's part of this bug.  (using build
2001-03-05 on winme)
Send Page gives me an error that Send was ok, but saving to my Sent folder
failed.  Sending a large attachment also gives me this error. This only happens
if I have selected my IMAP Sent folder as the designated folder for the copy.
If I change my Copy to Sent folder to point to my Local Sent folder I don't get
this error, and the message is both sent and copied to the designated folder.
Note: once this fails to copy to my IMAP sent folder, I can no longer send
anything without getting the error until I close the app and relaunch.  When
closing the app, I get a crash.  Talkback call stack:
nsMsgComposeAndSend::GetDefaultPrompt  
[d:\builds\seamonkey\mozilla\mailnews\compose\src\nsMsgSend.cpp, line 226]
nsMsgComposeAndSend::NotifyListenersOnStopCopy  
[d:\builds\seamonkey\mozilla\mailnews\compose\src\nsMsgSend.cpp, line 3379]
CopyListener::OnStopCopy  
[d:\builds\seamonkey\mozilla\mailnews\compose\src\nsMsgCopy.cpp, line 142]
nsMsgCopyService::ClearRequest  
[d:\builds\seamonkey\mozilla\mailnews\base\src\nsMsgCopyService.cpp, line 160]
nsMsgCopyService::~nsMsgCopyService  
[d:\builds\seamonkey\mozilla\mailnews\base\src\nsMsgCopyService.cpp, line 144]
nsMsgCopyService::`scalar deleting destructor'            
nsMsgCopyService::Release  
[d:\builds\seamonkey\mozilla\mailnews\base\src\nsMsgCopyService.cpp]        
DeleteEntry  
[d:\builds\seamonkey\mozilla\xpcom\components\nsServiceManager.cpp, line 259]
PL_HashTableEnumerateEntries   [../../../lib/ds/plhash.c, line 414]        
nsHashtable::Reset  
Using the 03-05-01 builds, this doesn't happen on linux.
It does happen on my Mac after the 2nd or 3rd Send Page.
add myself to cc list. buildid on win98:  2001030509
After talking to esther, I just tried sending a page from the browser window and
also composed a new message and attached a 51kb gif file to it. Upon sending,
I saw the status update to "sending message" but did not see the "copy complete"
that usually follows. I did not even see the "sending message failed" alert.
After a few minutes, having minimized my mail window in the meantime, I clicked
on the mail window from my Windows task bar and saw the following alert:
"An error occurred while sending mail. The mail server
responded dredd.mcom.com timing out connection to (xxx.xx.xx.xxx) please check
the message and try again." Clicking ok on this alert brought back the
compose window. I got both compose windows back, one for sending a page
and another for sending an attachment. From the alert message it sounds like it
was not able to establish a connection to the server at all.

I have my copy of sent message go to imap sent folder.
What sheela wrote is exactly what I'm seeing.  At home I also have my Sent
folder on an Imap server.

marking nsbeta1+ and moving to mozilla0.8.1.  We need to be able to send
attachments and send a page. 
Keywords: nsbeta1
Priority: -- → P1
Whiteboard: [nsbeta1+]
Target Milestone: --- → mozilla0.8.1
QA Contact: laurel → esther
removing my name from the cc list. change qa contact as myself cc-ing esther.
Since send is failing
QA Contact: esther → sheelar
Varada, 
I tried a few send variations after going back to build 2001022606. Basically the
26th build has some potential send problems, such as not being able to copy or
timing out the connection to the server.  Following are a few things that I saw:
-Tested sending a page and an attachment: got the error that send was successful
but copy failed.
-But when I checked my inbox and did Get Msg, I did not see the two messages
that I sent.
-But after restarting the application I saw those messages in the inbox.
-Sending a page or an attachment from webmail to a test account came back with
the alert "An error occurred........" and leaves you with the compose window
depending on whether you say cancel or okay.

However, when I exit the application I see netscp6 still in the task
manager.  I have to do ctrl+alt+delete to get rid of it.  I am thinking that, in
spite of ok or cancel on the alert, one of the messages that failed to send or
copy may be keeping the connection to the server open.  So could the first
message that fails to send or to copy to the sent folder be the reason for all
the following failures?
Build 2001022118 has the problem of copying the message to the sent folder when I
attach a file more than 8kb in size. I could send and copy messages up to 8kb.
I saw the failure when sending an attachment of 10kb in file size.  It took a
while for the alert window to come up.  It is the alert window that I mention in
my comment of 2001-03-05 15:21, and eventually you get your compose window back.
Sent folder shouldn't have anything to do with this bug.  I see this on the 
trunk builds Win32 and I don't have a Sent folder set up so I never access the 
sent folder.

It may be good to get an SMTP protocol log here, sheela.
*** Bug 71169 has been marked as a duplicate of this bug. ***
Caused by problem in sockets in the netwerk code.
Fix has been checked in.
Marking fixed.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
windows98, buildid:2001-03-08-09
Verifying this bug is fixed on windows. I still have to verify this on mac and 
linux.  
Varada,
For now reopening this bug for linux and Mac.  Let me know if you need a 
separate bug to be filed for mac and linux.
buildid:  2001-03-08-09 and 2001-03-12-08 on linux
This is still a problem on linux. Send page is not working, and sending a
message with any attachments is also failing.  On a new profile on linux, even a
simple text message cannot be sent.
On Mac buildid 2001-03-08-09: Sending a 141kb jpeg file froze the
application and I had to force quit. On Friday evening, I saw that the first
time you do send page there is an error: "There was a problem including the
file http://home.netscape.com in the message would you like to continue sending
the message without this file?" with cancel and okay buttons.  Clicking ok
resulted in the message being sent and copied to the sent folder.  But when you
view the message, only the link is displayed and the page is not displayed inline.
From the second time onwards, sending a page does not result in the above error.
This error is not coming up with the same build today (03-12-2001).
On windows:  I do see this is fixed, as I mentioned in my earlier comments.
However, as I continued testing on that profile I did see the copy failing
window come up on windows98, but only once.  Sometimes I also see that the
message gets sent and the copy fails without any error.

Also reopening bug 71202
Status: RESOLVED → REOPENED
OS: Windows 2000 → Linux
Resolution: FIXED → ---
Try sending this page from linux(buildid:2001-03-12-08)
http://personalfinance.netscape.com/finance/main.html 
On the same builds mentioned below, I am seeing that sending this page
fails on both linux and mac.  On linux you receive the page but
the copy fails silently the first time.  The second time you try to send this
page, you get the alert window which prompts you to send the message,
but it will not copy. We have tried this on Esther's linux machine with an
existing and a new profile and have problems copying to the imap sent folder.
Mac (buildid: 2001-03-08-10): On a new profile I was able to send the above page.
However, as I mentioned below, sending a jpeg attachment (141kb size) freezes the
application, which means send is failing with larger attachments.  The
above page is about 71kb, which sends and copies to the sent folder.

I have attached the jpeg file I am trying to attach and send with both mac and 
linux, which fails sending.  

changed the severity,
Severity: normal → critical
Adding nsdogfood keyword, and jenm to cc: list.
Keywords: nsdogfood
Adding dougt and darin to the cc list because of a possibility that this problem 
could be fixed in necko.
UGH!!!!   After a day of debugging this problem, I am beginning to believe that 
this bug is due to our mail server flakiness.  Although I have seen the problem, 
I am now having a tough time reproducing it.  I have been able to successfully 
have a mail message with a 32k+ attachment be copied to my IMAP sent directory 
after I send it on both windows and linux.

Is there an isolated imap server that I can get an account on so that I can try 
to reproduce without worrying about how bad our internal mail server can be 
sometimes?

more investigation needed....
okay.  I am able to reproduce it again.  All data is getting written to the IMAP 
server.  We are now deadlocking on a monitor.  
dougt: there's an imap server running on status.
Here is what is happening as best as I can tell.  

SMTP uploads the message.  When it completes, it checks if you want to save a 
copy somewhere. You do, so imap proceeds to copy the message to the Sent 
directory.  Imap finds that there is already a connection to that server (Inbox) 
and uses it (not sure if this is correct - I thought that there was one 
connection per folder?).  It uploads the entire message including the 
attachment.  It follows up with a CRLF.  I see this get written to the server 
without any problem.  On line 4744 of nsImapProtocol, we enter 
ParseIMAPandCheckForNewMail passing a command.  ParseIMAPServerResponse is 
called and asks for some data from the socket via CreateNewLineFromSocket.  
Now, this CreateNewLineFromSocket function basically sits on a monitor 
|m_dataAvailableMonitor| until someone notifies it.  The only two things that 
obviously notify this monitor are the protocol's OnDataAvailable and shutdown.  
The latter we don't need to worry about (in this bug).  Here is the 
problem: during this entire operation of uploading the file to the Sent 
directory, we do not receive any nsIStreamListener notifications on the protocol, 
deadlocking imap.
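
To make the wait pattern above concrete, here is a minimal sketch in standard C++. It is not the nsImapProtocol code: LineReader, OnData and ReadLine are hypothetical names standing in for the protocol object, OnDataAvailable and CreateNewLineFromSocket. It shows a reader that blocks on a condition variable until another thread delivers data, and how a bounded wait would turn a missing notification into a reportable failure instead of a hang.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>

// Hypothetical stand-in for the reading side of a protocol object.
class LineReader {
public:
    // Called by the transport thread when bytes arrive; this plays the
    // role that OnDataAvailable plays in the description above.
    void OnData(const std::string& bytes) {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mBuffer += bytes;
        }
        mDataAvailable.notify_one();
    }

    // Waits until a complete CRLF-terminated line is buffered, but gives
    // up after `timeout`. Returns false on timeout so the caller can
    // report an error instead of blocking forever.
    bool ReadLine(std::string& line, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mMutex);
        bool ready = mDataAvailable.wait_for(lock, timeout, [this] {
            return mBuffer.find("\r\n") != std::string::npos;
        });
        if (!ready) return false;
        std::size_t end = mBuffer.find("\r\n");
        line = mBuffer.substr(0, end);
        mBuffer.erase(0, end + 2);  // drop the line and its CRLF
        return true;
    }

private:
    std::mutex mMutex;
    std::condition_variable mDataAvailable;
    std::string mBuffer;
};
```

If nothing ever calls OnData, ReadLine returns false once the timeout expires. With an unbounded wait, the same situation would be the hang described above.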

So, questions for IMAP gurus:

1.  Should we be reusing the socket that was used to connect to the Inbox?  Do 
we have to reinitialize the protocol when doing this?

2.  After uploading an attachment, are we expecting some kind of response from 
the server?

3.  I understand that some work has gone into removing blocking IO from IMAP.  
What is the state of that?  Does it have the same problem?

Also, I see a lot of possibly unprotected critical sections.  I am not totally 
versed in this implementation, but it scares me that we are twiddling flag bits 
directly and outside a lock on a monitor.  I am wondering why this 
setting/getting of flags isn't protected.
Blocks: 72106
As per varada- 
-changing the summary to also reflect that message after certain size cannot be 
sent. This could be a long message in the compose window or you could attach a 
file.  
-see bug 72106 for the copy failing to imap sent folder which is dependent on 
this bug.
Summary: Can't use Send Page → Can't use Send Page & failing to send message after a certain size
I'm getting differing reports on the state of the bug.  I've been told offline
that we no longer have a problem sending but just a problem copying to the Sent
folder.  I'm moving this to mozilla0.9 so that we can get some IMAP help after
the mailnews perf branch lands.  
Target Milestone: mozilla0.8.1 → mozilla0.9
We should be re-using a connection to copy the sent msg to the sent mail folder;
not necessarily the inbox, but some already-in-use connection. That code has
worked for a few years now, so I doubt that's your problem.

I believe when we've finished sending data that we expect an OK from the server.
That's what we're waiting for.

The imap code counts on necko only calling it on one thread, which is why the
code doesn't need to protect certain flags.
I tried replying to a very large message - it failed in the smtp code with an
invalid socket because someone had closed the socket out from under the smtp
code. This caused necko to PR_Abort. I'm not sure what would happen on a release
build, because PR_Assert doesn't abort. I'm not sure who decided to kill the socket.
I traced through this a bit more - the necko code is closing the socket and then
polling the socket, which PR_Aborts. The smtp code is not closing the socket as
near as I can tell - it seems to be the necko code acting on its own. In
nsSocketTransport::Process, we get to the eSocketState_Done case, and it calls
CloseConnection. It doesn't look like we fell through from the
eSocketState_Error case because mCurrentState was eSocketState_Done.

At this point, I'm at a bit of a loss - I'm just not familiar with this necko
code and debugging it is very difficult. I'll try generating a log.
Here's the end of the log. Note the close connection followed by the poll. It's
unclear to me why the connection was closed.

0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=1969]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2010]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=1951]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
0[a926e0]: nsSocketBOS::Write [this=b8fcd30] Poll succeeded; looping back to
PR_Write
0[a926e0]: nsSocketBOS::Write [this=b8fcd30 count=2042]
58[156a1e0]: nsSocketTransport: Entering Process() [host=nsmail-1:25
this=b8fa870], aSelectFlags=1, CurrentState=5.
58[156a1e0]: nsSocketTransport: Transport [host=nsmail-1:25 this=b8fa870] is in
WaitReadWrite state [readtype=1 writetype=200 status=80470007].
58[156a1e0]: nsSocketTransport: doReadWrite [this=b8fa870, aSelectFlags=1,
mReadRequest=b905fc0, mWriteRequest=0
58[156a1e0]: nsSocketReadRequest: [this=b905fc0] inside OnRead.
58[156a1e0]: nsSocketReadRequest: [this=b905fc0] calling listener [offset=319,
count=36]
58[156a1e0]: nsSocketIS: PR_Read(count=36) returned 36
58[156a1e0]: nsSocketReadRequest: [this=b905fc0] read 36 bytes [offset=355]
58[156a1e0]: nsSocketTransport: doReadWrite [readstatus=80470007 writestatus=0
readsuspend=0 writesuspend=0 mSelectFlags=5]
58[156a1e0]: nsSocketTransport: Leaving Process() [host=nsmail-1:25
this=b8fa870], mStatus = 80470007, CurrentState=5, mSelectFlags=5

58[156a1e0]: nsSocketTransport: Entering Process() [host=nsmail-1:25
this=b8fa870], aSelectFlags=1, CurrentState=5.
58[156a1e0]: nsSocketTransport: Transport [host=nsmail-1:25 this=b8fa870] is in
WaitReadWrite state [readtype=1 writetype=200 status=80470007].
58[156a1e0]: nsSocketTransport: doReadWrite [this=b8fa870, aSelectFlags=1,
mReadRequest=b905fc0, mWriteRequest=0
58[156a1e0]: nsSocketReadRequest: [this=b905fc0] inside OnRead.
58[156a1e0]: nsSocketReadRequest: [this=b905fc0] calling listener [offset=355,
count=8192]
58[156a1e0]: nsSocketIS: PR_Read(count=1693) returned 0
58[156a1e0]: nsSocketReadRequest: [this=b905fc0] done reading socket.
58[156a1e0]: nsSocketTransport: CompleteAsyncRead [this=b8fa870]
58[156a1e0]: nsSocketTransport: doReadWrite [readstatus=0 writestatus=0
readsuspend=0 writesuspend=0 mSelectFlags=4]
58[156a1e0]: nsSocketTransport: Transport [host=nsmail-1:25 this=b8fa870] is in
Done state.
58[156a1e0]: nsSocketTransport::CloseConnection [this=b8fa870] Calling PR_Close
58[156a1e0]: nsSocketTransport: Leaving Process() [host=nsmail-1:25
this=b8fa870], mStatus = 0, CurrentState=3, mSelectFlags=4

0[a926e0]: Assertion failure: NULL != bottom, at
../../../../../pr/src/md/windows/w32poll.c:244
OK, I've found the change that's most likely causing the problem - 

    // From bug 71391:
    //
    // If the socket is not to be reused, then make sure it
    // is closed once we reach eSocketState_Done.
    //
    if (mReuseCount == 0)
        mCloseConnectionOnceDone = PR_TRUE;

Without knowing what's going on here, is it possible that the smtp code could
call SetReuseConnection() and prevent this problem? And I guess the imap code
too if the imap code is having the same problem.
possibly, but that code was added after this problem appeared. I tried setting
the reuseConnection on the IMAP socket without any luck.  Maybe we would also
have to do this on the SMTP socket?  

bienvenu, how about we sit down together and debug this problem?  
Doug, I won't be in the office until Thursday. I tried setting ReuseConnection
to true for smtp, and that in turn revealed the underlying problem because I got
further w/o aborting. We're sending a single line that's too long for the SMTP
server (> 16K). This causes the smtp server to send us an error response about
the too long line. After reading this response, necko closes the connection.
Here's the stack trace and the code that closes the connection. After the read
succeeds, we call CompleteAsyncRead. Later on, I think we keep polling for data
and get the pr_abort.

nsSocketTransport::CompleteAsyncRead() line 516
nsSocketTransport::doReadWrite(short 0x0001) line 978
nsSocketTransport::Process(short 0x0001) line 459 + 13 bytes


        else if (mSelectFlags & PR_POLL_READ) {
            //
            // Read data if available
            //
            if (aSelectFlags & PR_POLL_READ) {
                readStatus = mReadRequest->OnRead();
                if (mReadRequest->IsSuspended()) {
                    mSelectFlags &= ~PR_POLL_READ;
                    readStatus = NS_BASE_STREAM_WOULD_BLOCK;
                }
                else if (NS_SUCCEEDED(readStatus))

/* *** we get here ***** */
                    CompleteAsyncRead();

                else if (NS_FAILED(readStatus) &&
                         (readStatus != NS_BASE_STREAM_WOULD_BLOCK))
                    return readStatus; // this must be a socket error, so bail!
            }

I'm not sure why we're sending the real long line - I assume it's the editor not
inserting any linebreaks when saving as html. So, we need to fix that problem,
but I think also we need to figure out what's going on in necko.
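
To check whether a given message would trip this, a quick diagnostic can measure the longest line in the outgoing data. The sketch below is an illustration only. LongestLineLength is a hypothetical helper, not part of the mailnews code, and the 16K figure it would be compared against is simply the limit reported for this server above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length in bytes of the longest line
// in `message`, treating CRLF as the terminator and not counting it.
std::size_t LongestLineLength(const std::string& message) {
    std::size_t longest = 0;
    std::size_t start = 0;
    while (start <= message.size()) {
        std::size_t end = message.find("\r\n", start);
        if (end == std::string::npos) end = message.size();  // last line
        longest = std::max(longest, end - start);
        start = end + 2;  // skip past the CRLF
    }
    return longest;
}
```

Running this over the saved message body before handing it to smtp would show whether any single line exceeds what the server accepts.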
What are your OnDataAvailable callbacks returning??
nsMsgProtocol::OnDataAvailable() always returns NS_OK.
So, I made some changes to nsMsgProtocol::PostMessage so that it will handle
extra long lines by adding a CRLF to the line. This fixes the problem because we
don't get an error back from the server. I'll attach the patch, but it's not a
good idea to check it in, for several reasons. One is that I *think* this code
will go away when Scott checks in his async write code (which I assume will be
used for smtp). The other is that I'm not sure the patch really does the right
thing in some cases - we can add an extra CRLF if we happen to get a line longer
than 100 bytes because of the way PostMessage works. When mscott resurfaces from
the branch, he can comment.
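
For illustration, the general idea of forcing a break into an over-long line can be sketched as below. This is not the attached patch and does not reproduce how PostMessage buffers its data. BreakLongLines and maxLineLen are hypothetical names, and the sketch assumes the whole message is available in one string. It counts bytes since the last CRLF, so it does not add an extra CRLF to a line that already ends exactly at the limit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: inserts CRLF so that no line in the returned
// data is longer than maxLineLen bytes (the CRLF itself not counted).
std::string BreakLongLines(const std::string& data, std::size_t maxLineLen) {
    assert(maxLineLen > 0);
    std::string out;
    out.reserve(data.size() + 2 * (data.size() / maxLineLen + 1));
    std::size_t lineLen = 0;  // bytes emitted since the last CRLF
    for (std::size_t i = 0; i < data.size(); ++i) {
        // Copy an existing CRLF through unchanged and reset the counter.
        if (data[i] == '\r' && i + 1 < data.size() && data[i + 1] == '\n') {
            out += "\r\n";
            ++i;
            lineLen = 0;
            continue;
        }
        // The current line is full: force a break before this byte.
        if (lineLen == maxLineLen) {
            out += "\r\n";
            lineLen = 0;
        }
        out += data[i];
        ++lineLen;
    }
    return out;
}
```

A forced break changes the message bytes, so it is only safe where the content tolerates an inserted line break. Code that sees the data in chunks would also need to carry lineLen across calls.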

I have *not* seen the copy to sent folder problem with IMAP - though I have seen
a PR_Abort in cases where the imap connection has timed out.
bienvenu, I can semi-reliably reproduce the
*fail-to-copy-to-sent-folder-on-imap-server* problem.
Does the attached patch fix the two problems for anyone experiencing them?
mscott, can you comment on this patch.  Does this make sense for the 0.8.1 branch?

varada,
Mark this bug as resolved and I can verify it. 
I am not having sending problems sending messages with attachments or sending 
page on linux machine anymore.  This is working for me.  I had some corrupt 
image files that I was trying to attach which failed to send.  I can send page 
and message with attachments with no problem from buildid:  2001040306 on linux. 
 
Status: REOPENED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Marking Fixed.
verified,
2001-04-03-10linux
2001-04-03-06win98
2001-04-03-09mac
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core