Closed Bug 11094 Opened 20 years ago Closed 20 years ago

[Dogfood] Reply to a message leads to a crash

Categories

(MailNews Core :: Backend, defect, P3, critical)

x86
Windows NT
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: marina, Assigned: rpotts)

References

()

Details

Steps to reproduce:
- select a message;
- click Reply button;
// note: the body is trying to redraw and this leads to a crash
Observed with the 1999-08-02-09-M9 Windows build.
Is it a plain-text or html message you're replying to? I tried plain text w/o a
crash.
It is an HTML mail (no attachments, encoding us-ascii)
I tried this without a problem. Could you send me the message (Edit message as
new as opposed to forward)? thanks.
Lisa or Esther, have you seen this?
We will try this today.
I wasn't having this problem when using the plain text mail-send
option. Then I tried the following:

user_pref("mail.identity.id1.compose_html", true);

With this HTML send option, so far I crashed twice simply trying
to reply to a message which happens to contain high-bit characters --
though I'm not sure if that is relevant at this point.
I need to add that the original messages I was trying to reply to
are in a local mail folder. Try the above ZIP file which
contains 3 Latin 1 msgs I created for another bug. Place the unarchived
mailbox file into a POP3 mail folder and try replying to each of
the 3 messages.

I'll attach the Talkback incident report next.
Here's part of one of the 2 reports I filed with TalkBack:

------------------------------------------------
Trigger Type:  Program Crash
Trigger Reason:  Access violation
Call Stack:    (Signature = 0x51ff088b e3555688)
0x51ff088b
raptorhtml.dll + 0x23eb9 (0x60473eb9)
ender.dll + 0x18e2 (0x600a18e2)
ender.dll + 0x7868 (0x600a7868)
ender.dll + 0x15d87 (0x600b5d87)
ender.dll + 0x16c6 (0x600a16c6)
ender.dll + 0x79fb (0x600a79fb)
ender.dll + 0x15dcc (0x600b5dcc)
xpcom.dll + 0x3de1 (0x60b03de1)
ender.dll + 0x1d176 (0x600bd176)
ender.dll + 0x1f5ee (0x600bf5ee)
raptorweb.dll + 0x5e7e (0x60ad5e7e)
raptorweb.dll + 0x1f4e (0x60ad1f4e)
raptorweb.dll + 0x1ebb (0x60ad1ebb)
necko.dll + 0x50d6 (0x602550d6)
necko.dll + 0x4de4 (0x60254de4)
plds3.dll + 0x18ee (0x60a318ee)
plds3.dll + 0x186a (0x60a3186a)
plds3.dll + 0x1b18 (0x60a31b18)
USER32.dll + 0x11ab (0x77e511ab)
nsappshell.dll + 0x191e (0x609b191e)
apprunner.exe + 0x1b38 (0x00401b38)
KERNEL32.dll + 0x1b623 (0x7001b623)
------------------------------------------------

If you want to see full reports of 2 incidents I filed, go to:

http://cyclone/reports/reporttemplate.cfm?style=1&reportID=1099

and click on the Bug ID number for this bug, 11094, on that page.
The above report was filed with 8/3/99 Win32 M9 necko build.
Assignee: phil → buster
Component: Front End → Composition
OK, this looks like an editor problem, doesn't it? Changing component
to composition and reassigning to Steve Clark. Please reassign if you disagree.
2 additional pieces of info:

1. The message to which you engage "reply" function does not
   have to contain Latin 1 high-bit characters to induce this bug.
   I had no problem re-creating this bug with all ASCII message.
2. This problem does not seem to occur when original messages
   you want to reply to are on an IMAP server. (After drawing the window
   content, you're not able to dismiss that window -- freezing problem,
   but that's another bug.)
I can confirm that the crash is related to the POP server, i'm crashing today
again on POP but not on IMAP, my yesterday's Talkback ID incident is the one i
sent to David in the e-mail ID 11924651.
Confirming using 1999080212 build: I crash clicking Reply or Forward on an HTML
or plain text message in my POP (qatest03) inbox.  I don't have any
international characters in my messages.

1. Launch Messenger
2. Select a message in my POP inbox
3. Click Reply
Results:  The compose window comes up with the name in the To: field and then
you crash.  This was tested on Win95, I still have to try this on linux and mac.
Severity: normal → critical
Summary: Reply to a message leads to a crash → Reply to a message in a POP local folder leads to a crash
This is very critical to have fixed for mail usage. Thanks.
When I tried this, I crashed with just netlib/necko and javascript on the stack.
I heard in the IRC group yesterday that rhp/akkana/simon discovered some timing
problems between message compose and the editor. I believe it had to do with us
setting text in the editor widget (which would happen when you do a reply)
before the editor widget had finished loading. Simon and rhp had talked about
tweaking an editor interface to allow us to become a listener so we could add
the body at the appropriate time.

This crash could be related to that. I'm going to add rhp to the cc list and see
what he thinks.
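
A minimal sketch of the listener idea described above, in case it helps the discussion. This is not the actual editor interface: EditorWidget, addLoadListener, onLoadFinished and quoteIntoEditor are hypothetical names. It only shows the ordering, where the quoted body is queued until the editor reports that loading has finished instead of being written into a half-initialized widget.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an editor widget that loads asynchronously.
class EditorWidget {
public:
    using Listener = std::function<void(EditorWidget&)>;

    // Register a callback to run once loading has finished.
    // If the editor is already loaded, the callback runs immediately.
    void addLoadListener(Listener listener) {
        if (loaded_) {
            listener(*this);
            return;
        }
        listeners_.push_back(std::move(listener));
    }

    // Called by the (hypothetical) document loader when the load completes.
    void onLoadFinished() {
        loaded_ = true;
        std::vector<Listener> pending;
        pending.swap(listeners_);  // tolerate listeners registered re-entrantly
        for (auto& listener : pending) {
            listener(*this);
        }
    }

    // Inserting text is only valid after the load has finished.
    void insertText(const std::string& text) {
        assert(loaded_ && "insertText called before the editor finished loading");
        body_ += text;
    }

    const std::string& body() const { return body_; }

private:
    bool loaded_ = false;
    std::string body_;
    std::vector<Listener> listeners_;
};

// Compose-side helper: defer the quoted body until the editor is ready.
inline void quoteIntoEditor(EditorWidget& editor, std::string quoted) {
    editor.addLoadListener([quoted = std::move(quoted)](EditorWidget& e) {
        e.insertText(quoted);
    });
}
```

With this shape, a reply can call quoteIntoEditor as soon as the compose window opens, and the text only lands in the editor after onLoadFinished has run.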
Blocks: 11091
Summary: Reply to a message in a POP local folder leads to a crash → Reply to a message in leads to a crash
I have to alter my earlier comments because replying to a message on the IMAP
server also leads to a crash, though randomly. Here is Talkback report ID# 11978860:

Trigger Type:  Program Crash
Trigger Reason:  Access violation
Call Stack:    (Signature = 0x00020006 ec294005)
0x00020006
ender.dll + 0x160c (0x600a160c)
ender.dll + 0x15e7 (0x600a15e7)
ender.dll + 0x16c6 (0x600a16c6)
xpcom.dll + 0x3dc2 (0x60b03dc2)
ender.dll + 0x1bebd (0x600bbebd)
ender.dll + 0x1bd90 (0x600bbd90)
raptorhtml.dll + 0xb621 (0x6045b621)
raptorhtml.dll + 0xc494 (0x6045c494)
ender.dll + 0x2646 (0x600a2646)
ender.dll + 0x93e8 (0x600a93e8)
ender.dll + 0x1a8a (0x600a1a8a)
ender.dll + 0x7a96 (0x600a7a96)
ender.dll + 0x15e5f (0x600b5e5f)
ender.dll + 0x1c874 (0x600bc874)
ender.dll + 0x1c973 (0x600bc973)
ender.dll + 0x1d182 (0x600bd182)
ender.dll + 0x1f5ee (0x600bf5ee)
raptorweb.dll + 0x5e7e (0x60ad5e7e)
raptorweb.dll + 0x1f4e (0x60ad1f4e)
raptorweb.dll + 0x1f92 (0x60ad1f92)
raptorweb.dll + 0x1f68 (0x60ad1f68)
raptorweb.dll + 0x1ebb (0x60ad1ebb)
necko.dll + 0x50d6 (0x602550d6)
necko.dll + 0x4de4 (0x60254de4)
plds3.dll + 0x18ee (0x60a318ee)
plds3.dll + 0x186a (0x60a3186a)
plds3.dll + 0x1b18 (0x60a31b18)
USER32.dll + 0x13ed (0x77e713ed)
nsappshell.dll + 0x191e (0x609b191e)
apprunner.exe + 0x1b38 (0x00401b38)
KERNEL32.dll + 0x1b304 (0x77f1b304)
observed with 1999-08-03-08 windows build
Assignee: buster → putterman
Component: Composition → Back End
Doesn't seem like an editor bug. Reassigning component to "back end" since it
seems to matter whether the message is IMAP or POP.  reassigning to putterman,
so he can pick the right mail engineer.
cc'ing simon in case he has any insight.
Hmm.  According to the bug, both IMAP and POP cause this and more importantly,
there are no mailnews dll's in either stack trace, just mostly Ender with some
Necko and layout. If an engineer can reproduce this (and I can't) could you put
a useful stack trace in here that actually has function names?
Kat, you mentioned that there was a zip file with the messages?   Can you attach
it to the bug report?  I don't see it.  Perhaps development can use those
messages to try to reproduce the crash.  Thanks.
Can somebody get us a stack trace with symbols, please? Those without are almost
useless.
Assignee: putterman → rhp
I've tried putting Kat's messages(the zip file is in the URL) in a mail folder
and I can reply to each of the 3 messages with no problem.

perhaps this is a Release build problem or as mscott mentioned, a timing
problem.  I'm going to start off reassigning to rhp to investigate this and add
akkana to the cc list.
With rhp's help, I can now crash.  Make sure you don't have the old quoting pref
turned on if you want to crash.  Here's the stack:

nsFrame::GetOffsets(const nsFrame * const 0x0cfb2190, int &, int & 0) line 429 +
3 bytes
nsEditor::GetSelection(nsEditor * const 0x0cfd20b0, nsIDOMSelection * *
0x0012fb6c) line 488 + 24 bytes
nsTextEditor::~nsTextEditor() line 160 + 33 bytes
nsHTMLEditor::~nsHTMLEditor() line 137 + 8 bytes
nsHTMLEditor::`scalar deleting destructor'(unsigned int 1) + 15 bytes
nsEditor::Release(nsEditor * const 0x0cfd20b0) line 380 + 102 bytes
nsTextEditor::Release(nsTextEditor * const 0x0cfd20b0) line 218
nsHTMLEditor::Release(nsHTMLEditor * const 0x0cfd20b0) line 152
nsCOMPtr<nsISupports>::assign_with_AddRef(nsISupports * 0x00000000) line 649
nsCOMPtr<nsISupports>::operator=(nsISupports * 0x00000000) line 549
nsEditorShell::PrepareDocumentForEditing(nsEditorShell * const 0x0cfc4cb0,
nsIURI * 0x0cfd6720) line 793
nsEditorShell::OnEndDocumentLoad(nsEditorShell * const 0x0cfc4cb8,
nsIDocumentLoader * 0x0cf1dea0, nsIChannel * 0x0cfd6520, int 0,
nsIDocumentLoaderObserver * 0x0cf1c924) line 2908 + 28 bytes
nsWebShell::OnEndDocumentLoad(nsWebShell * const 0x0cf1c924, nsIDocumentLoader *
0x0cf1dea0, nsIChannel * 0x0cfd6520, int 0, nsIDocumentLoaderObserver *
0x0cf1c924) line 3063
nsDocLoaderImpl::FireOnEndDocumentLoad(nsIDocumentLoader * 0x0cf1dea0, int 0)
line 1122
nsDocLoaderImpl::OnStopRequest(nsDocLoaderImpl * const 0x0cf1dea4, nsIChannel *
0x0cfd6520, nsISupports * 0x00000000, unsigned int 0, const unsigned short *
0x00000000) line 1029
nsOnStopRequestEvent::HandleEvent(nsOnStopRequestEvent * const 0x0cf29da0) line
274
nsStreamListenerEvent::HandlePLEvent(PLEvent * 0x0cf29da4) line 149 + 12 bytes
PL_HandleEvent(PLEvent * 0x0cf29da4) line 509 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00f1b290) line 470 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x00a207e6, unsigned int 49307, unsigned int 0,
long 15839888) line 932 + 9 bytes
Assignee: rhp → akkana
Ok, I know what this is now. I've talked with Akkana about this on IRC this
morning. The problem that is happening here is that quoting out of libmime has
a META tag that sets the charset to UTF-8. Well, when that happens, it causes a
reload of the page for I18N autodetection and when that happens, there is a
problem cleaning up the first editor shell. Basically, a member variable in the
editor is not refcounted, gets freed by someone else, and is then
accessed by the editor.

Akkana, I think you have a bead on this one so I am reassigning.

- rhp
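
To illustrate the failure mode described above (a non-owning member that someone else frees and the editor then touches), here is a small sketch in modern C++. It is not the editor's code: PresShellStub and ObservingEditor are hypothetical names, and std::weak_ptr stands in for whatever lifetime mechanism would actually be used. The point is only that the holder can test whether the object still exists before using it.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the presentation shell owned by the document viewer.
struct PresShellStub {
    std::string selection = "caret at 0";
};

// Hypothetical editor that observes, but does not own, the shell.
class ObservingEditor {
public:
    explicit ObservingEditor(const std::shared_ptr<PresShellStub>& shell)
        : shell_(shell) {
        assert(shell != nullptr && "editor must be created with a live shell");
    }

    // Returns the selection if the shell still exists, or an empty string
    // when the owner has already destroyed it (for example during a reload).
    std::string selectionOrEmpty() const {
        if (auto shell = shell_.lock()) {
            return shell->selection;
        }
        return std::string();
    }

    bool shellAlive() const { return !shell_.expired(); }

private:
    std::weak_ptr<PresShellStub> shell_;  // non-owning; liveness can be checked
};
```

A destructor written against this shape would skip its selection cleanup when shellAlive() is false rather than dereferencing a stale pointer.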
Status: NEW → ASSIGNED
Target Milestone: M9
We're entering nsEditorShell::PrepareDocumentForEditing
(from nsWebShell::OnEndDocumentLoad) twice, and so destroying the first editor in
order to create the second one.  But by the time we get there, parts (e.g.
mPresShell) of the first editor are already bogus.
Summary: Reply to a message in leads to a crash → Reply to a message leads to a crash
We might have to add an onunload handler in the XUL, which calls into the
editorShell and clears out the editor from the editorShell.
Unfortunately adding an onunload handler in the XUL turns out not to work
(doesn't get called).

Rich, can you explain again why I18n requires reloading the same document a
second time, rather than just loading it right the first time?

A temporary workaround, if this is blocking you, is to ifdef out most of the
code in nsTextEditor::~nsTextEditor, from
nsCOMPtr<nsIDOMSelection>selection;
through the end of the clause just before deleting mRules.  That cures the crash
(but causes a leak, so it's not an adequate permanent solution).

Both the pres shell and the document are garbage at this point in the reload
case, so ~nsTextEditor shouldn't count on getting anything from them.
Addreffing them in nsEditor::Init doesn't help, though.
Assignee: akkana → ftang
Status: ASSIGNED → NEW
The more I delve into this, the more I realize that this is a problem the editor
can't solve.

The problem is that the document is being loaded twice, but when the first
document goes away (to be replaced by the second), the editor isn't told about
it in time, so it doesn't know that it can't use its pointers to the old
document.

This is really two problems:
1. We shouldn't have to load every document twice -- that's wasteful;
2. We shouldn't reload a document without somehow notifying things that depend
on that document (like the editor) that the first document is going away.

We need to solve at least #2, but preferably #1 as well.  This needs the attention
of someone who knows why the document is being reloaded, so we can find out why
that's happening and what we can do to streamline the process (or at least send
notifications).  Rich says that's ftang.  Reassigning -- not to wash my hands of
the issue (the editor needs to work with whoever solves this problem) but
because the editor needs backend help in order to solve the deeper problem.
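
Point #2 above (tell dependents before the first document goes away) could look roughly like the following. This is a sketch under assumptions, not a proposal for the real webshell/docloader interfaces: DocumentStub, DocumentObserver and NotifiedEditor are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

class DocumentStub;

// Hypothetical interface for anything that keeps pointers into a document.
class DocumentObserver {
public:
    virtual ~DocumentObserver() = default;
    virtual void documentGoingAway(DocumentStub& doc) = 0;
};

// Hypothetical document that warns its observers before it is destroyed.
class DocumentStub {
public:
    ~DocumentStub() {
        // Iterate over a copy so observers may unregister while being notified.
        std::vector<DocumentObserver*> snapshot = observers_;
        for (DocumentObserver* observer : snapshot) {
            observer->documentGoingAway(*this);
        }
    }

    void addObserver(DocumentObserver* observer) {
        assert(observer != nullptr);
        observers_.push_back(observer);
    }

    void removeObserver(DocumentObserver* observer) {
        observers_.erase(
            std::remove(observers_.begin(), observers_.end(), observer),
            observers_.end());
    }

private:
    std::vector<DocumentObserver*> observers_;
};

// Hypothetical editor that drops its document pointer when notified.
class NotifiedEditor : public DocumentObserver {
public:
    ~NotifiedEditor() override { detach(); }

    void attach(DocumentStub& doc) {
        detach();
        doc_ = &doc;
        doc.addObserver(this);
    }

    void detach() {
        if (doc_ != nullptr) {
            doc_->removeObserver(this);
            doc_ = nullptr;
        }
    }

    void documentGoingAway(DocumentStub& doc) override {
        if (doc_ == &doc) {
            doc_ = nullptr;  // the document is dying; just forget it
        }
    }

    bool hasDocument() const { return doc_ != nullptr; }

private:
    DocumentStub* doc_ = nullptr;
};
```

In the reload case, the first document's destruction would clear the editor's pointer, so a later teardown of that editor has nothing stale left to touch.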
Naoki worked on the Mail reload problem also with ftang and mscott.
Cc'ing nhotta.
Summary: Reply to a message leads to a crash → [Dogfood] Reply to a message leads to a crash
We really need this fixed soon so that we have enough time to test the fix for
M9. Thanks.
I'm going to try to check in a workaround for the crash that doesn't delete the
editor in the reload case (leaving a memory leak, which is bad but at least not
as bad as a crash).  I have the fix in my tree, but I hit a cvs conflict and now
I'm having to pull and rebuild everything in order to test the fix.
Nope, my attempted fix (skip setting mEditor to 0 in PrepareDocumentForEditing)
alas, doesn't work: now the window comes up, but with no editor since the editor
is still pointing at the old doc.  I don't see any way to fix this without
always leaking all editors (not just the ones in reply windows), which is likely
to hork more than it fixes.

Rich, how about temporarily disabling the charset so we don't reload the
document?
Well, the problem is that if I take out the META charset stuff, ALL non us-ascii
quoting will break. Do we have an idea of when the real fix might be in place?
I can break all I18N stuff if I know that a fix is coming, but if I make my
change and then this problem gets pushed into the M10 bucket, my bug list will
explode with re-opened bugs.

I'll await comments before I do anything.

- rhp
I am NOT sure if it applies here but there is a way to override META by setting
a force charset.
parser->SetDocumentCharset(charsetStr, kCharsetFromPreviousLoading);
I do NOT know the context of this bug so I am NOT sure if this applies for this
bug.
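
For readers unfamiliar with the idea of a charset "source", here is a rough sketch of how a forced charset can take precedence over a META tag. The enum values, their ordering, and the DocumentCharset class are all hypothetical; the real constants (such as kCharsetFromPreviousLoading) and the parser's actual rules are not shown in this bug and may differ.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical priority levels; a higher value is more authoritative.
enum CharsetSource {
    kSourceDefault = 0,  // built-in fallback
    kSourceMetaTag = 1,  // charset found in a META tag
    kSourceForced  = 2   // charset forced by the caller
};

// Hypothetical holder for a document's charset and where it came from.
class DocumentCharset {
public:
    // Accepts the new charset only if its source is at least as
    // authoritative as the one already recorded. Returns true on change.
    bool set(std::string charset, CharsetSource source) {
        assert(!charset.empty());
        if (source < source_) {
            return false;  // e.g. a META tag cannot override a forced charset
        }
        charset_ = std::move(charset);
        source_ = source;
        return true;
    }

    const std::string& charset() const { return charset_; }
    CharsetSource source() const { return source_; }

private:
    std::string charset_ = "ISO-8859-1";
    CharsetSource source_ = kSourceDefault;
};
```

Under this assumption, forcing the charset before the META tag is seen means the META value is rejected, so nothing would need to trigger a reload.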
Status: NEW → ASSIGNED
Depends on: 7330
Apparently the double loading problem is partly due to 7330; adding that
dependency.

But there's a problem reproducing this in today's build: we no longer see the
crash.  On Linux, I see one nsEditorShell::OnStartDocumentLoad, but there's no
corresponding call to OnEndDocumentLoad so we never set up the editor in the
compose window.

No one seems to know what changed to cause this.  The editor window and New
Message still work fine -- this is just on replies that OnEndDocumentLoad no
longer gets called (even once, much less twice).
Assignee: ftang → akkana
Status: ASSIGNED → NEW
akkana, I reassign this back to you. Please try the following thing first.

Go into intl/chardet/src/nsObserverBase.cpp and uncomment
the following line in your local tree (remove the // at the beginning of the
line):
 19 //#define DONT_INFORM_WEBSHELL


rebuild intl/chardet

and try it again. If the problem goes away, then it is related to the meta
charset reloading problem; otherwise, it has nothing to do with the meta
charset, since uncommenting the #define makes the observer ignore the meta
charset. If the problem goes away after you uncomment it, then please reassign this
bug back to me. Thanks.
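
A sketch of the kind of compile-time switch described above, for anyone who has not looked at that file. Only the macro name DONT_INFORM_WEBSHELL comes from this bug; WebShellStub and onMetaCharsetSeen are hypothetical, and the real code in nsObserverBase.cpp is not reproduced here.

```cpp
#include <cassert>
#include <string>

// Uncomment the next line to make the observer ignore META charset hints.
//#define DONT_INFORM_WEBSHELL

// Hypothetical stand-in for the shell that would reload the document.
struct WebShellStub {
    int reloadRequests = 0;
    std::string requestedCharset;
};

// Hypothetical observer callback, invoked when a META charset is seen.
// Returns true if a reload was requested.
inline bool onMetaCharsetSeen(WebShellStub& shell, const std::string& charset) {
    assert(!charset.empty());
#ifdef DONT_INFORM_WEBSHELL
    (void)shell;  // notification compiled out: the META charset is ignored
    return false;
#else
    shell.requestedCharset = charset;
    ++shell.reloadRequests;
    return true;
#endif
}
```

With the define left commented out, seeing a META charset requests one reload; with it enabled, the shell is never told, which is what makes this a quick test for whether the reload is involved.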
Assignee: akkana → rhp
I tried Frank's fix, and it doesn't make any difference to the new behavior of
the compose window (that OnEndDocumentLoad never gets called and so the editor
never gets initialized).  Something has changed in mail replies which is perhaps
masking this bug.  Tossing the hot potato back to Rich.
Assignee: rhp → rpotts
This is another twist on the webshell/docloader notification problem that Rick
Potts is working on. I'm forwarding this bug to him (if there aren't any
objections).
*** Bug 11573 has been marked as a duplicate of this bug. ***
*** Bug 11764 has been marked as a duplicate of this bug. ***
*** Bug 11763 has been marked as a duplicate of this bug. ***
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
*** This bug has been marked as a duplicate of 10335 ***
Blocks: 11889
Status: RESOLVED → VERIFIED
verified dup
I'm not seeing the relationship of this bug and the one it was marked a
duplicate of (10335).  Please let me know if this is really a dup of bug 10335.
I've made a note in duplicate bug to verify the mail case when fixed.  The
duplicate bug is marked M9 target milestone.
I don't see how this bug got to be a DUP of 10335. I guess this resolution
should be cleared until serious tests are made to verify its functionality.
Status: VERIFIED → REOPENED
Depends on: 10335
This is one of those cases where the outward phenomena may be diverse though they
can be reduced to a single underlying cause. (cf. warren's 8/11 comment.)

I think the best thing to do in this kind of case from a tester's point of view is to
keep it resolved/fixed and wait for the other bug to get fixed and then re-test.

Accordingly, I'm going to make this resolved/fixed and
mark the dependency on 10335. lchiang already made a comment in 10335
and so all we need to do is re-test when that gets fixed.
Status: REOPENED → RESOLVED
Closed: 20 years ago → 20 years ago
Resolution: DUPLICATE → FIXED
Status: RESOLVED → VERIFIED
Using 19990820 builds on win98 and mac and 19990816 builds on linux Replying to
an HTML message does not crash (original scenario).  This bug has many comments
and then was listed as a dup of 10335 (which is fixed and verified).  I am
verifying this based on the original scenario (no international characters or
plain text editing involved).  If this is crashing with international characters
or in plain text a new bug needs to be logged.  Note: there is already a bug for
plain text crashing with New Msg and Reply (11984).  This bug is verified.
Product: MailNews → Core
Product: Core → MailNews Core