Closed Bug 357861 Opened 18 years ago Closed 18 years ago

GC crash (e.g. shutdown, window close, etc) [@ nsTextServicesDocument] after composing a message with spellcheck turned on

Categories

(Core :: Spelling checker, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: stephend, Assigned: mscott)

References

()

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

Build ID: 2006-10-24-09 SeaMonkey trunk, but this has been happening for ~1 week. Steps to Reproduce: 1. Launch Navigator. 2. Launch Mail. 3. Send a brief test message to yourself. 4. Read the message. 5. Do a File | Quit. Expected Results: Graceful shutdown. Actual Results: mcsmurf's frame dump/stack: ChildEBP RetAddr WARNING: Frame IP not in any known module. Following frames may be wrong. 0012fd2c 1000543a 0x298ec63 0012fd34 02f1fc1f xpcom_core!nsCOMPtr_base::~nsCOMPtr_base(void)+0xc [h:\mozilla\tree-main\mozilla\seamonkey-build\xpcom\build\nscomptr.cpp @ 82] 0012fd58 02f215e0 editor!nsTextServicesDocument::~nsTextServicesDocument(void)+0x72 [h:\mozilla\tree-main\mozilla\editor\txtsvc\src\nstextservicesdocument.cpp @ 124] 0012fd60 02f1f600 editor!nsTextServicesDocument::`scalar deleting destructor'(void)+0x8 0012fd68 1000543a editor!nsTextServicesDocument::Release(void)+0x14 [h:\mozilla\tree-main\mozilla\editor\txtsvc\src\nstextservicesdocument.cpp @ 160] 0012fd70 05c6661d xpcom_core!nsCOMPtr_base::~nsCOMPtr_base(void)+0xc [h:\mozilla\tree-main\mozilla\seamonkey-build\xpcom\build\nscomptr.cpp @ 82] 0012fd80 05c66ea3 spellchk!mozInlineSpellChecker::~mozInlineSpellChecker(void)+0x43 [h:\mozilla\tree-main\mozilla\extensions\spellcheck\src\mozinlinespellchecker.cpp @ 530] 0012fd88 05c665b3 spellchk!mozInlineSpellChecker::`scalar deleting destructor'(void)+0x8 0012fd90 1000543a spellchk!mozInlineSpellChecker::Release(void)+0x18 [h:\mozilla\tree-main\mozilla\extensions\spellcheck\src\mozinlinespellchecker.cpp @ 511] 0012fd98 02f2e81b xpcom_core!nsCOMPtr_base::~nsCOMPtr_base(void)+0xc [h:\mozilla\tree-main\mozilla\seamonkey-build\xpcom\build\nscomptr.cpp @ 82] 0012fdac 02f24f49 editor!nsEditor::~nsEditor(void)+0x136 [h:\mozilla\tree-main\mozilla\editor\libeditor\base\nseditor.cpp @ 218] 0012fdc4 02eeccf4 editor!nsPlaintextEditor::~nsPlaintextEditor(void)+0x6e [h:\mozilla\tree-main\mozilla\editor\libeditor\text\nsplaintexteditor.cpp @ 116] 0012fdf4 02eedc03 editor!nsHTMLEditor::~nsHTMLEditor(void)+0x347 [h:\mozilla\tree-main\mozilla\editor\libeditor\html\nshtmleditor.cpp @ 236] 0012fdfc 02f2da96 editor!nsHTMLEditor::`scalar deleting destructor'(void)+0x8 0012fe04 0128a8c3 editor!nsEditor::Release(void)+0x1b [h:\mozilla\tree-main\mozilla\editor\libeditor\base\nseditor.cpp @ 220] 0012fe34 019d6184 xpc3250!XPCJSRuntime::GCCallback(struct JSContext * cx = 0x02a2feb8, JSGCStatus status = JSGC_END (1))+0x406 [h:\mozilla\tree-main\mozilla\js\src\xpconnect\src\xpcjsruntime.cpp @ 591] 0012fe44 011d9777 gklayout!DOMGCCallback(struct JSContext * cx = 0x011b41da, JSGCStatus status = 44236472 (No matching enumerant))+0x14 [h:\mozilla\tree-main\mozilla\dom\src\base\nsjsenvironment.cpp @ 3157] 0012fe88 011b41da js3250!js_GC(struct JSContext * cx = 0x02a2feb8, JSGCInvocationKind gckind = GC_NORMAL (0))+0x7c8 [h:\mozilla\tree-main\mozilla\js\src\jsgc.c @ 3041] 0012fe98 01de33b1 js3250!JS_GC(struct JSContext * cx = 0x0040379b)+0x32 [h:\mozilla\tree-main\mozilla\js\src\jsapi.c @ 1944] 0012fee4 0040379b profile!nsProfile::ShutDownCurrentProfile(unsigned int shutDownType = 0x2a2feb8)+0x121 [h:\mozilla\tree-main\mozilla\profile\src\nsprofile.cpp @ 1354]
I don't suppose the patch in bug 355064 makes any difference?
(In reply to comment #1) > I don't suppose the patch in bug 355064 makes any difference? If you're asking me, I unfortunately don't have a build system or tree right now...
The patch in Bug 355064 seems to fix this bug here, indeed... (I tested the patch in Attachment 240853 [details] [diff]).
Neil, do you understand why your work in Bug 355064 fixes this crash? (I couldn't tell). I wonder what got checked in on the trunk that exposed this too...Hmmm.
(In reply to comment #4) >Neil, do you understand why your work in Bug 355064 fixes this crash? Unfortunately I don't really understand the spellchecker backends at all.
(In reply to comment #0) > Steps to Reproduce: > > 1. Launch Navigator. > 2. Launch Mail. > 3. Send a brief test message to yourself. > 4. Read the message. > 5. Do a File | Quit. For me this is much shorter. 1.start mailnews 2.click reply to answer a messege 3.close Composer(its not necessary to write anything) CRASH! 'last good' here 2006100902 'first bad' here 2006101006 TB25471086W; TB25472746M and from today TB25741848K. Maybe this is not exactly the same crash, but I found out, that SM don't crash, when I switch of 'Check Spelling As You Type' in the Spellcheckermenu. Was already anything checked in for bug 355064?
Judging from the regression range mentioned in comment 6, this could be a fallout from bug 278677.
Blocks: 278677
Flags: blocking1.9?
(In reply to comment #6) >Maybe this is not exactly the same crash, but I found out, that SM don't crash, >when I switch of 'Check Spelling As You Type' in the Spellcheckermenu. Yes, this is the inline spellchecker again. What happens here is that when the compose window is finally destroyed at shutdown, it destroys the editor, which destroys its array of edit action listeners, then it destroys its inline spellchecker, which tries to remove an action listener from the now destroyed editor.
*** Bug 361896 has been marked as a duplicate of this bug. ***
*** Bug 362298 has been marked as a duplicate of this bug. ***
another TB TB27177972E on Mozilla/5.0 (Windows; U; Windows NT 5.0; fr-FR; rv:1.9a1) Gecko/20061209 SeaMonkey/1.5a
*** Bug 363952 has been marked as a duplicate of this bug. ***
Same condition here, and same workaround per Comment #6: Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.9a2pre) Gecko/20061222 MultiZilla/1.8.3.0a SeaMonkey/1.5a
Me too. Talkback incidents: 27856695 27796256 27795733 all after composing plain text mail or new messages with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a2pre) Gecko/20061226 SeaMonkey/1.5a Crashes occur either (a) immediately after sending the message or (b) upon shutdown
At least the check-ins in Bug 355064 did not fix this bug here...not sure what is going on here (although I only tested Attachment 240853 [details] [diff] at that time in Comment 3 and that one was obsoleted by a new patch).
There are too many dupes... I'm changing the summary because it was hard to find this bug searching for "nsTextServicesDocument" (the windows stacks are ugly). I'm posting here a version of the stack of the crashing while sending the message (yeah, no quit) and with our new GC: #7 0xb06221f2 in ~nsCOMPtr (this=0x97cac30) at ../../../dist/include/xpcom/nsCOMPtr.h:583 #8 0xb0678c1b in ~nsTextServicesDocument (this=0x97cac10) at nsTextServicesDocument.cpp:124 #9 0xb0670efa in nsTextServicesDocument::Release (this=0x97cac10) at nsTextServicesDocument.cpp:159 #10 0xb2d41e5e in ~nsCOMPtr (this=0x956bb80) at ../../../dist/include/xpcom/nsCOMPtr.h:583 #11 0xb2d4cabb in ~mozInlineSpellChecker (this=0x956bb60) at mozInlineSpellChecker.cpp:530 #12 0xb2d4cff1 in mozInlineSpellChecker::Release (this=0x956bb60) at mozInlineSpellChecker.cpp:511 #13 0xb069faca in ~nsCOMPtr (this=0x953f668) at ../../../dist/include/xpcom/nsCOMPtr.h:583 #14 0xb069de87 in ~nsEditor (this=0x953f620) at nsEditor.cpp:217 #15 0xb068286f in ~nsPlaintextEditor (this=0x953f620) at nsPlaintextEditor.cpp:116 #16 0xb061d910 in ~nsHTMLEditor (this=0x953f620) at nsHTMLEditor.cpp:236 #17 0xb069d76b in nsEditor::Release (this=0x953f620) at nsEditor.cpp:219 #18 0xb061d287 in nsHTMLEditor::Release (this=0x953f620) at nsHTMLEditor.cpp:246 #19 0xb6f381a7 in XPCJSRuntime::GCCallback (cx=0x8261990, status=JSGC_END) at xpcjsruntime.cpp:590 #20 0xb6217f1c in DOMGCCallback (cx=0x8261990, status=JSGC_END) at nsJSEnvironment.cpp:3173 #21 0xb7d5707a in js_GC (cx=0x8261990, gckind=GC_NORMAL) at jsgc.c:3198 #22 0xb7d23380 in JS_GC (cx=0x8261990) at jsapi.c:1886 #23 0xb6f0bf87 in nsXPConnect::BeginCycleCollection (this=0x8165610) at nsXPConnect.cpp:454 #24 0xb7e8da1d in nsCycleCollector::Collect (this=0xb7ecbd60) at nsCycleCollector.cpp:1581 #25 0xb7e8db77 in nsCycleCollector_collect () at nsCycleCollector.cpp:1687 #26 0xb621848a in nsJSContext::Notify (this=0x97d9b10, timer=0x9826fb8) at nsJSEnvironment.cpp:3129 #27 0xb7e7da81 in nsTimerImpl::Fire (this=0x9826fb8) at nsTimerImpl.cpp:386 #28 0xb7e7dbbe in nsTimerEvent::Run (this=0xb3f4cf38) at nsTimerImpl.cpp:456 #29 0xb7e783bc in nsThread::ProcessNextEvent (this=0x80de330, mayWait=1, result=0xbfd2d9c0) at nsThread.cpp:482 #30 0xb7e127a1 in NS_ProcessNextEvent_P (thread=0xb72e2717, mayWait=1) at nsThreadUtils.cpp:225 #31 0xb5af9be3 in nsBaseAppShell::Run (this=0x81d1180) at nsBaseAppShell.cpp:153 #32 0xb5b65278 in nsAppStartup::Run (this=0x81cfe40) at nsAppStartup.cpp:218 #33 0x0804dc92 in main1 (argc=3, argv=0xbfd2dc54, nativeApp=<value optimized out>) at nsAppRunner.cpp:1228 #34 0x0804dfef in main (argc=Cannot access memory at address 0x0 ) at nsAppRunner.cpp:1722
OS: Windows XP → All
Hardware: PC → All
Summary: Shutdown crash in Mail after sending and reading a message. → crash [@ nsTextServicesDocument] after composing a message.
(In reply to comment #16) > There are too many dupes... > > I'm changing the summary because it was hard to find this bug searching for > "nsTextServicesDocument" (the windows stacks are ugly). It will be just as hard to find without "shutdown," so I'm re-adding that.
Summary: crash [@ nsTextServicesDocument] after composing a message. → Shutdown crash [@ nsTextServicesDocument] after composing a message.
I've been hitting this about once a day, so I decided to poke at it. This is actually a bug introduced by the interaction of the inline spell-checker code and the unfortunate way weak references work. The relevant part of the stack is: nsTextServicesDocument::~nsTextServicesDocument() nsTextServicesDocument::Release() nsCOMPtr_base::~nsCOMPtr_base() mozInlineSpellChecker::~mozInlineSpellChecker() mozInlineSpellChecker::Release() nsCOMPtr_base::~nsCOMPtr_base() nsEditor::~nsEditor() Now ~nsTextServicesDocument does: 119 scott 1.58 nsCOMPtr<nsIEditor> editor (do_QueryReferent(mEditor)); 120 if (editor && mNotifier) 121 editor->RemoveEditActionListener(mNotifier); (code added for inline spellcheck). The problem is that the do_QueryReferent has an object to work with, since the nsSupportsWeakReference destructor of the editor has NOT yet been called. But the call to QI to get nsIEditor is calling into the vtable of a deleted object (the nsHTMLEditor in this case), and even if that succeeds the RemoveEditActionListener call is messing with the members of a deleted object. So we crash. There are a few approaches I see: 1) Make all this stuff use the cycle collector and be done with it. 2) Make the nsSupportsWeakReference::ClearWeakReferences from the most-derived subclass of nsEditor. 3) Null out the mNotifier on the nsTextServicesDocument in ~mozInlineSpellChecker. 4) Null out the mTextServicesDocument on the mozInlineSpellChecker in ~nsHTMLEditor. Might be other more extravagant things we could do here, of course. #3 and #4 might not actually work; someone who actually understands how this code is supposed to work should pick an approach.
Assignee: nobody → mscott
Component: Editor → Spelling checker
QA Contact: spelling-checker
Summary: Shutdown crash [@ nsTextServicesDocument] after composing a message. → GC crash (e.g. shutdown, window close, etc) [@ nsTextServicesDocument] after composing a message.
Oh, and the reason I changed the summary is that I hit this about once every 20-30 mails I send, not just on shutdown. I bet on Windows you could reproduce it before shutdown if you disabled compose window recycling the way it's disabled on Linux.
(In reply to comment #13) > Same condition here, and same workaround per Comment #6: > > Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.9a2pre) Gecko/20061222 > MultiZilla/1.8.3.0a SeaMonkey/1.5a > This issue continues with: Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.9a2pre) Gecko/20070124 MultiZilla/1.8.3.0a SeaMonkey/1.5a I actually thought it was resolved (I was able to get one message through), but upon my second composition/send, BLINK! Out it went. :-( Per Boris' Comment #20, compose window recycling is disabled on my OS/2 installation (due to some other issues with it).
I'm still trying to reproduce this on windows (with the compose recycler turned off) in tbird and haven't been able to generate this particular stack trace yet. One odd thing has jumped out at me. There's a method on editor called PreDestroy which instructs mInlineSpellChecker to release all its goodies including the text services document. This fires before the destructor. Here's the lxr link: http://lxr.mozilla.org/seamonkey/source/editor/libeditor/base/nsEditor.cpp#504 This code was added to fix Bug 347561. At least for me, when closing a compose window and watching the html editor instance get destroyed, I never see a call to PreDestroy.
Well, here's a data point: I can crash (per bug 356439) with 3a1-0108 but not with -0110; *with* -0113, but not with -0125.
I can hit this bug (steps from bug 361896) with builds up to 2007-01-09-09-trunk, but not with later builds, starting with 2007-01-10-08-trunk (including the 1/13 build). But I don't see anything relevant within that window.
Here's the talkback ID for the crach I just gen'd with -0113. (Win2K) TB28716910W Not a lot of stack info, but the same old nsTextServicesDocument::`scalar deleting destructor' And this is gen'd simply opening the program (offline, in this case), opening a compose window, closing the compose window (dismissing the "Save?" prompt with Cancel), and closing the program.
Interesting, it looks like using the InlineSpellCheckerUI API may leave you more susceptible to this crash. I converted Thunderbird trunk builds to that a couple days ago (Bug 355064) and am now starting to see more reports of this crash.
I'll say. The average for the three days prior to 1/29 was 1.79 crashes per hour with 'scalar deleting destructor', but on 1/29 and after the average is 3.67 crashes per hour. It has doubled the rate of crashes.
Jerry Baker, are you running with check-spelling-inline turned on? I leave it off, and I don't see crashes when I run the trunk.
Yes. I have been running with it on since the feature became available. It only started crashing yesterday.
While I'm not crashing in the debugger, I am now consistently seeing the stack trace posted by bz now that tbird uses InlineSpellCheckerUI. Neil and I think the cause is that nsIEditor::PreDestroy doesn't get called for the composition case. There are two places where nsIEditor::PreDestroy is called: 1) nsTextControlFrame explicitly calls nsIEditor::PreDestroy in its Destroy method. This is why we don't see the crash when spell checking text boxes in Firefox. 2) For the editor composition case, nsIEditor::PreDestroy gets called indirectly from nsEditingSession::TearDownEditorOnWindow, which passes null as an argument to nsDocShellEditorData::SetEditor. nsDocShellEditorData::SetEditor calls nsIEditor::PreDestroy. Unfortunately for us, in this second case, TearDownEditorOnWindow only calls SetEditor when nsEditingSession::mDoneSetup is true. I'm seeing mDoneSetup get set to true via nsEditingSession::MakeWindowEditable during creation of the composition window. But shortly after making the window editable, we call TearDownEditorOnWindow again in nsEditingSession::StartDocumentLoad thus clearing mDoneSetup back to false. Then when the compose window is closed, the destructor for nsDocShellEditorData calls mEditingSession->TearDownEditorOnWindow but mDoneSetup is false so PreDestroy never gets called.
Attached patch Possible patchSplinter Review
I've no idea whether this works, because I don't dogfood message compose, so this is solely based on Scott's excellent analysis. I put the mDoneSetup = PR_TRUE; at the beginning of SetupEditorOnWindow because otherwise it doesn't make sense to call TearDownEditorOnWindow when SetupEditorOnWindow fails. Alternatively I was wondering whether TearDownEditorOnWindow is safe enough to ignore mDoneSetup completely.
Attachment #253616 - Flags: superreview?(mscott)
Attachment #253616 - Flags: review?
Comment on attachment 253616 [details] [diff] [review] Possible patch *Neil glares at bugzilla
Attachment #253616 - Flags: review? → review?(daniel)
I've been stepping through the debugger with Neil's patch to nsEditingSession and things look good. The call stack when deleting composer's editor instance is much better. nsEditor::PreDestroy is now getting called on the compose window's editor instance, which clears its owning reference to mInlineSpellChecker. I then see the following: 1) The cycle collector ends up garbage collecting the inline spell checker which in turn calls the destructor for nsTextServicesDocument which does: nsCOMPtr<nsIEditor> editor (do_QueryReferent(mEditor)); if (editor && mNotifier) editor->RemoveEditActionListener(mNotifier); It's still able to get an editor object and it calls RemoveEditActionListener but the destructor for the editor instance hasn't fired yet so we are ok. 2) Later on, the dtor for nsHTMLEditor fires, the spell checker reference has already been cleared (and in this case already deleted). 3) The call stack stays happy without any calls into the vtable of a deleted object. I'd like to note that even if the editor instance gets deleted before the inline spell check object gets collected, things are still ok because they've been decoupled by PreDestroy. The editor dtor won't trigger the inline spell check being destroyed so the text services document dtor won't in turn call back into the editor object. Instead when the inline spell checker gets destroyed attempts to fetch the weak reference for the editor object will fail. In short, I believe the patch will fix the crash folks are seeing. Thanks Neil!
Attachment #253616 - Flags: superreview?(mscott) → superreview+
I'm using the attached patch so far and I'm not experiencing crashes while running SeaMonkey trunk. As I'm on Linux, the crashes were very frequent (no need to close the program). So I believe a lot the patch will fix this crash. ;)
Still crashing with 20070205 trunk. Did the patch get checked in?
(In reply to comment #36) > Still crashing with 20070205 trunk. Did the patch get checked in? No; it's very clearly still waiting on review from Daniel.
Summary: GC crash (e.g. shutdown, window close, etc) [@ nsTextServicesDocument] after composing a message. → GC crash (e.g. shutdown, window close, etc) [@ nsTextServicesDocument] after composing a message with spellcheck turned on
(In reply to comment #37) > No; it's very clearly still waiting on review from Daniel. Is this still necessary after mscott's superreview+?
is it really necessary for people to ask questions in bugzilla? the web has search engines, and oddly enough if you search for "super review", the second hit is: http://www.mozilla.org/hacking/reviewers.html which if you read it, you will discover that review is still necessary.
I read that entire page and it wasn't clear to me that code still needs a "regular review" after a "super review."
> mozilla.org requires module owner and peer review before code is checked > into the mozilla.org CVS repositiory. Checking into most (but not all) of > the Mozilla tree also requires an additional level of pre-check-in code > review. what part of "requires additional" means that it trumps the other? now would everyone please stop poking this bug? it's going nowhere fast.
Comment on attachment 253616 [details] [diff] [review] Possible patch r=daniel@glazman.org provided the fact you test the patch (a) with the Midas demo (b) with a document linked from the Midas demo page (the editor should be shut down on that new page, right ?)
Attachment #253616 - Flags: review?(daniel) → review+
(In reply to comment #43) >the editor should be shut down on that new page, right ? Right.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Testing with TB 3a1-0214, I don't see a crash (per bug 356439); the previous trunk build I had, -0210, would crash. But see comment 23 -- I can't say those STR are necessarily sufficient to generate the crash.
Whatever Midas tests were used to test this should be added to the regression suite. And thanks for fixing this!
Flags: blocking1.9? → in-testsuite?
Both my talkback query (see URL) and my own testing of version 3.0a1pre (20070630) confirm that this bug is now fixed. Verified.
Status: RESOLVED → VERIFIED
The bug I had which was duplicated to this one had no Talkback report. Maybe it was a different bug after all?
Crash Signature: [@ nsTextServicesDocument]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: