Closed Bug 83381 Opened 19 years ago Closed 19 years ago

replying to a msg crashes

Categories

(MailNews Core :: Composition, defect, P1)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.1

People

(Reporter: skasinathan, Assigned: mscott)

Details

(Whiteboard: [nsbeta1+] fix in hand r= sr=sspitzer)

Attachments

(4 files)

..so..if you are inside netscape firewall you should have got a msg from 
chofmann about mtbf. I replied to that mail and it crashes after throwing this 
dialog. "An error occured while sending mail. The mailserver responded DATA line 
too long. max 16384....." 

Build: Today's build on Win NT.
here goes the stack trace:
  nsMsgAsyncWriteProtocol::UnblockPostReader 
                                             
[d:\builds\seamonkey\mozilla\mailnews\base\util\nsMsgProtocol.cpp, line 1001]
     
  nsMsgProtocolStreamProvider::OnDataWritable
                                             
[d:\builds\seamonkey\mozilla\mailnews\base\util\nsMsgProtocol.cpp, line 728]
     
  nsOnDataWritableEvent::HandleEvent 
                                             
[d:\builds\seamonkey\mozilla\netwerk\base\src\nsStreamProviderProxy.cpp, line 
143]
     
  PL_HandleEvent 
                                             
[d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 591]
     
  PL_ProcessPendingEvents 
                                             
[d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 524]
     
  _md_EventReceiverProc 
                                             
[d:\builds\seamonkey\mozilla\xpcom\threads\plevent.c, line 1072]

event if we are still generating line too long, we should not crash. See the 
stack trace, reassign to mscott, cc'ing varada about the line lenght problem.
Assignee: ducarroz → mscott
Keywords: nsbeta1
I can't reproduce the crash but I do get the line too long problem.  Because of
this you can't send the message.  Varada could you help look into this?  Putting
on 0.9.1 radar.
Priority: -- → P1
Whiteboard: [nsbeta1+]
Target Milestone: --- → mozilla0.9.1
I try on both Mac and Windows using today's debug build and I cannot crash 
either!
reassigning to varada.
Assignee: mscott → varada
I can reproduce this crash at will using today's commercial release build and 
mozilla debug build from late last night on Win NT.

fyi: After I click send, "Ask Me' dialog pops up and let the default (Both Plain 
text and HTML) selected. 
do you see the same stack each time you crash?
Status: NEW → ASSIGNED
Using today's debug build on WinNT as well as yesterday's release builds I am
unable to reproduce either problem - the line is too long or the crash
I see the same stack everytime I tried. Also, I was able to duplicate this (both
crash and Max data alert) using Navin's Win NT system (just to make sure it is
not just my system). 
Whiteboard: [nsbeta1+] → [nsbeta1+] - hard to reproduce
Finally got a chance to check on this using 0531 build and I was able to 
duplicate it. btw, my SMTP SSL setting was set to "Never", if that matters.
I don't crash with the 6-1 build on win98, but I do consistently get the error
message "sending message failed".

pulling on my winnt box so I can help debug.
If I go to reply to that message, and then hit save, I fail when trying to save
the draft.  (note, my drafts folder is local.)

saving a small message works fine.

The original message has a pretty big <pre> block.  perhaps when we quote (for
reply) that pre block, it turns into a giant line with no CRLFs?

I'm still rebuilding so I don't have a build I can debug yet.
As far as I can see the last few lines in the LOG are from nsSocketTransport -
either LeavingProcess or CloseConnection and after that we crash. I hope that 
Darin can make something out of this. It doesnt crash on mac or on linux.
Surprisingly enough -Most times I am unable to crash it when I run it thru my
debugger - but the very same debug build crashes consistently when run thru the
same scenario from the command line prompt!!.
Whiteboard: [nsbeta1+] - hard to reproduce → [nsbeta1+] - hard to reproduce; needs darin investigation
while the crash is hard to reproduce, the failure to send is not.

I'll help debug this today.
debugging to see why saving the reply as draft (to a local draft folder) 
fails.  (CopyData() in nsLocalMailFolder is failing).

about the reply that is generated, the <pre> line is HUGE.  I'll attach what is 
generated.
the save as draft fails here, in nsLocalMailFolder.cpp, line 2153:

if (!end && aLength > (PRInt32) maxReadCount)
 // this must be a gigantic line with no linefeed; sorry pal cannot handle
 return NS_ERROR_FAILURE;

this just confirms that the giant <pre> line is killing us.
forgive my ignorance of editor:

the original message (in html) has a pre block like this:

<pre>
 1
 2
 3
 ...
 100
</pre>

you leave pre blocks alone since they are preformatted.

when we reply, the original message is wrapped with <blockquote> tags
that turns the pre into this:

<blockquote>
 <pre>1<br>2<br>3<br>....100<br></pre>
</blockquote>

that's what's killing us.

if we are inserting <br>, can't we do "<br>\n"?

or "<br>CRLF" (or whatever).

I'm not able to reproduce the crash, all I can get is the failures to send or 
save.
looks like this is all covered in bug #67334.  I'll see if I can help by fixing 
that.
The crash happens only if the recepient is of unknown format preference and when
the dialog box comes up we choose to send as Plaintext & HTML.
ok, I've got a hack for bug #67334 that will avoid this problem.

I'll attach what the reply looks like when using my hack
The crash occuring in nsMsgProtocol.cpp happens because we have already closed
the socket. We are then trying to write again. This ofcourse leads to the crash.
Adding an addref to the mMsgProtocol led to other problems. 
Still Investigating.
> The crash occuring in nsMsgProtocol.cpp happens because we have already closed
> the socket. We are then trying to write again. 

as I mentioned over aim to varada, a while back we had these crasher bugs for
IMAP and NNTP where we'd crash when we'd tried to write to a socket and it was
closed.

see bug #72317

> Adding an addref to the mMsgProtocol led to other problems. 

huh?  can you elaborate?  why where you doing that?
Can we take the hack from bug 67334 for m.9.1 and avoid the crash?  Is there an
"avoid the crash" hack we can take for m.9.1?  We're almost completely out of
time...
Ok, so I understand being under the gun.  But this is really a "one step forward 
two steps back" proposal.  

First off the patch has nothing to do with the real cause of the bug, if I 
undstand this discussion correctly.  Thus we are masking a crasher in this one 
instance.  Are there other ways to get the crash?  We don't know, right?

Next, this patch won't avoid the situation of long <pre> lines in all cases, so 
there are other things that the user can do to trigger the same situation (and 
presumably crash).  This patch will be active for html paste, and mail quoting, 
as things stnad now in the source.  It will not, though, do anything for normal 
text pasting.  So if someone copies a bunch of normal text (for instance, out of 
a textarea or plaintext editor or a different app) and inserts (pastes) it into 
the message inside of a pre somewhere, you will get the same problem.  We would 
have to expand the patch to cover the text insertion code, and that is the second 
riskiest code to touch in the editor (you can break everything with a screwup 
there), right after the deletion code.

Third, Seth is right that this will cause regressions.  I just confirmed it.  In 
addition to the regression he described inside mail quotes, you can get 
regressions in Composer (mail not involved).  For instance, take this source:

<html>
<body>
non pre text
<pre>some pre text


blank lines above
end of pre text</pre>
some non ore text
</body>
</html>

If you load that in mozilla (with the patch) and copy it from the browser window 
(an html copy) and paste it into Composer, you won't be able to click on any of 
the blank lines.  Doh!

Please don't do this.  Already editor is one of the most problematic parts of the 
app, and half our problems stem from trying to work around issues like this that 
really aren't editor caused.  We need to start biting the bullet and doing the 
"right thing" in the right places if editor is ever going to be decent.

I don't know what we should do short term (but I hope it's not this).  Fixing the 
real cause of the crash would be best.  

Longer term two things should probably happen:

We should get layout&caret code to work together better and do what it takes to 
avoid the "dead zones" in pres with newline (and in empty table cells and list 
items).  

There should be flags to set on the html serializer that will positively 
guarantee that long lines are wrapped if the caller asks for it, even if it 
alters the source in some meaningful way.  After all, what if someone pastes in a 
line that is 1000 chars long with no spaces or returns?  You can't mail it like 
that, and yet to break it up has to alter the content.  

The editor is actually the wrong place to solve either of these problems.
Solving the crash would be great, but my understanding is that we'll still be
left in a state where there are certain messages which will not be able to be
sent. How long term is the long term solution and who are the right people to
involve in it?
This bug should be solely for the purpose of the crashing issue.
The issue of messages not being able to be sent due to long lines is covered by 
at least one other bug (possibly 2 or even more).  Discussions/approaches/etc. 
for the long lines issue should be covered in bug #67334 or another bug.

Removing the depends since this crasher can be fixed independently of bug #67334 
(and a fix to bug #67334 will mask the problem).
No longer depends on: 67334
jfrancis' argument makes a lot of sense.  I respect his decision to move editor
forward, and not hack it up more.

> 1) We should get layout&caret code to work together better...
Is there a bug for this?  If that were fixed, it would allows us to keep the
newlines, and help reduce the frequency of the crash and the "fail to send /
fail to copy msg" problem, by reducing the frequency of the long lines.  
(but not eliminated it, as you pointed out)

> 2) There should be flags to set on the html serializer that will positively 
> guarantee that long lines are wrapped if the caller asks for it
Is there a bug for this?  About wrapping, I get a little confused.  there is
wrapping when the user views a message, and there's also wrapping on send?

in this case, we don't need a CRLF at 72 or 80 columns.  we just need one every
4092 bytes (or something).

To fix (or hide) the crash/ fail to send / fail to copy problems on the 0.9.1
branch, could we do what someone (brade?) suggested and force in a newline
(hopefully in the mailnews code, not editor) to wallpaper over this problem?

I'll investigate such a hack.
I think the content serializer should have a flag that enforces a line length 
(which, as seth should know, should be < 1000 octets so that NNTP works).
seth: i stepped through the code a bit with varada, and it seemed at first as
though nsMsgProtocolStreamProvider was having its mMsgProtocol member deleted
out from under it.  so, we tried making nsMsgProtocolStreamProvider hold a 
hard ref to mMsgProtocol.  that didn't get us very far... it just moved the
crash to UnblockPostReader() while touching m_outputStream.

so, it seems like things are getting prematurely shutdown.  varada said that
we were calling CloseSocket() from the SMTP_FREE state in nsSmtpProtocol::
ProcessProtocolState().  nsMsgProtocol::CloseSocket() seems OK.  i'm not sure
why OnDataWritable would still be processed after a call to m_request->Cancel().

nsOnDataWritableEvent::HandleEvent first checks the socket request's status
and will not call OnDataWritable if the socket request's status is a FAILURE
code.  perhaps there is some problem with the Cancel call?!?

i'll take another look once i get my winnt build finishes.
darin, don't bother spending anymore time on this. It's not a networking
problem. We're just closing the socket down after an abort call and i still have
async write code that's attempting to write to the socket. 

this is my bug. 
re-assigning. I believe I have a fix. 
Assignee: varada → mscott
Status: ASSIGNED → NEW
great!
Attached patch the fixSplinter Review
Status: NEW → ASSIGNED
Whiteboard: [nsbeta1+] - hard to reproduce; needs darin investigation → [nsbeta1+] fix in hand r=? sr=?
Whiteboard: [nsbeta1+] fix in hand r=? sr=? → [nsbeta1+] fix in hand r= sr=sspitzer
r/sr=bienvenu
fix checked into the tip. branch coming up.
fixed on both the tip and the trunk. However the message still won't be sent
because of the long lines. 
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
using buildid: 2001053106 trunk builds on win98,linux,mac I was not able to 
reproduce the crash. However, I saw the crash on suresh winNT machine.  So I 
used suresh's machine to verify that the crash was fixed with buildid: 
2001-06-06-12-0.9.1
Also checked on branch builds:  2001-06-06-12-0.9.1 win98, 2001060611-0.9.linux,
2001-06-06-14-0.9.1
The original crash problem replying to this email has been fixed. But as mscott 
mentioned the send still fails with the long lines error.  
As per comments in 2001-06-05 10:00 by Brade not opening a new bug since bug 
67334 logged.  If this appears to be a differnt problem please comment in the 
bug so I can log a new one for send failing with long lines
Keywords: vtrunk
Blocks: 83396
No longer blocks: 83396
verified using turnk build
2001081506 win98, linux
2001081706 mac
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.