Closed Bug 7938 Opened 25 years ago Closed 25 years ago

Crash on exit

Categories

(Core Graveyard :: Tracking, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: fur, Assigned: akkzilla)

(Originally filed as bug #7545)

------- Additional Comments From akkana@netscape.com  06/09/99 18:28 -------
I think this may have something to do with the registry.  I moved my ~/.mozilla
aside, went through the irritating profile wizard again, and then found that I
couldn't get the crash on exit any more (which I'd been seeing every single run
all day before I removed the registry).  I continued not to see any crashes on
exit, until I updated a few files and rebuilt a library, then on the next run, I
started getting exit crashes every time again.

Adding dp to cc list -- dp, any ideas?

------- Additional Comments From dp@netscape.com  06/09/99 19:47 -------
Incidentally, I just filed a bug on JS Engine with the same stack trace. Maybe
it already made it fur's way.

So I have been seeing this too. What akkana reports is interesting: a DLL
change causes this. Hmm! Let me try it out. So the sequence is:

- Remove the registry
- Run (all fine)
- Change a DLL
- Run: crash (every time)

I will try it. I doubt it is that easy. Any clue which DLL it is?

------- Additional Comments From dp@netscape.com  06/09/99 19:49 -------
*** Bug 7861 has been marked as a duplicate of this bug. ***

------- Additional Comments From akkana@netscape.com  06/10/99 15:22 -------
Actually we have two bugs here -- the one I'm seeing regularly on Linux, which
comes and goes depending on how up to date the registry is (when I don't see the
bug, I can usually make it come back by making a significant change to a library
and recompiling, then running without removing the registry first), which has
the stack trace I gave, and the one in the JS GC code, which happens on Windows.
Not clear whether these are the same crash or not.  There's no GC in the stack
traces I'm seeing, just app core, non-GC JS code, and RDF.

The stack trace I posted before was abbreviated: here's the full trace:

#0  0x40958fac in ?? ()
#1  0x4067cac7 in gdk_exit ()
#2  0x400217df in nsAppShellService::Shutdown (this=0x807bc08)
    at nsAppShellService.cpp:413
#3  0x40a6139f in nsEditorAppCore::Exit (this=0x80f54d8)
    at nsEditorAppCore.cpp:957
#4  0x40a71204 in EditorAppCoreExit (cx=0x80f44a0, obj=0x825ae98, argc=0,
    argv=0x81c7bc0, rval=0xbfffe188) at nsJSEditorAppCore.cpp:1103
#5  0x40445c83 in js_Invoke (cx=0x80f44a0, argc=0, constructing=0)
    at jsinterp.c:650
#6  0x40456456 in js_Interpret (cx=0x80f44a0, result=0xbfffe590)
    at jsinterp.c:2199
#7  0x40445ce1 in js_Invoke (cx=0x80f44a0, argc=0, constructing=0)
    at jsinterp.c:666
#8  0x40456456 in js_Interpret (cx=0x80f44a0, result=0xbfffe9c4)
    at jsinterp.c:2199
#9  0x40445ce1 in js_Invoke (cx=0x80f44a0, argc=1, constructing=0)
    at jsinterp.c:666
#10 0x40445f98 in js_CallFunctionValue (cx=0x80f44a0, obj=0x8152118,
    fval=135602464, argc=1, argv=0xbfffeb48, rval=0xbfffeb4c) at jsinterp.c:735
#11 0x4041fd95 in JS_CallFunctionValue (cx=0x80f44a0, obj=0x8152118,
    fval=135602464, argc=1, argv=0xbfffeb48, rval=0xbfffeb4c) at jsapi.c:2554
#12 0x4039f831 in nsJSEventListener::HandleEvent (this=0x829c370,
    aEvent=0x82c2bb0) at nsJSEventListener.cpp:97
#13 0x40d4a296 in nsEventListenerManager::HandleEvent (this=0x829c038,
    aPresContext=@0x80d4da0, aEvent=0xbfffecc0, aDOMEvent=0xbfffec38,
    aFlags=3, aEventStatus=@0xbfffecfc) at nsEventListenerManager.cpp:569
#14 0x40b0fd62 in RDFElementImpl::HandleDOMEvent (this=0x829bae0,
    aPresContext=@0x80d4da0, aEvent=0xbfffecc0, aDOMEvent=0xbfffec38,
    aFlags=1, aEventStatus=@0xbfffecfc) at nsRDFElement.cpp:2278
#15 0x400ea22e in nsMenuItem::DoCommand (this=0x82e78d8) at nsMenuItem.cpp:404
#16 0x400e9d3d in nsMenuItem::MenuItemSelected (this=0x82e78d8,
    aMenuEvent=@0xbfffed40) at nsMenuItem.cpp:300
#17 0x400eb2e2 in menu_item_activate_handler (w=0x82e5728, p=0x82e78d8)
    at nsGtkEventHandler.cpp:505
#18 0x4064e4ad in gtk_marshal_NONE__NONE ()
#19 0xab15 in ?? ()
QA Contact: leger → sujay
Changing qa_contact to myself since I originally filed the bug. I will
regress it once it is fixed.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → DUPLICATE
Status: RESOLVED → VERIFIED
Per don - all Crash on Exit bug info moved to:
http://bugzilla.mozilla.org/show_bug.cgi?id=7799

*** This bug has been marked as a duplicate of 7799 ***
Target Milestone: M7
Status: VERIFIED → REOPENED
Re-opening bug ...
Resolution: DUPLICATE → ---
Assignee: don → slamm
Status: REOPENED → NEW
Summary: Crash on exit in AppRunner → Crash on exit
Clearing resolution.
Here are some useful comments from bug #7799:

------- Additional Comments From don@netscape.com  06/14/99 20:44 -------
Since jband has fixed bug #7940, the JavaScript garbage collection problem, can
someone see if this is still happening?


------- Additional Comments From don@netscape.com  06/14/99 20:47 -------
BTW, if jband's fix for bug #7940 does not fix at least the Linux-specific
manifestation of this bug, then perhaps we should re-open bug #7938.


------- Additional Comments From akkana@netscape.com  06/15/99 10:38 -------
Still happening -- on my very first run of apprunner (no arguments, just
browser) in this morning's build, when I exited (by the windowmanager delete
button -- there's nothing in any of the menus so I can't do File->Exit) I
crashed in:
#0  0x10 in ?? ()
#1  0x40974ac7 in nsCOMPtr<nsIFactory>::~nsCOMPtr (this=0x40983c08,
    __in_chrg=2) at nsEditorShellFactory.cpp:131
#2  0x4096c34c in __tcf_0 () at nsEditorShellFactory.cpp:131
#3  0x407f9585 in exit (status=0) at exit.c:55
#4  0x40666ac7 in gdk_exit ()
#5  0x400236b6 in nsAppShellService::UnregisterTopLevelWindow (this=0x8091dd8,
    aWindow=0x80c63e8) at nsAppShellService.cpp:631
#6  0x40024101 in nsWebShellWindow::Close (this=0x80c63e8)
    at nsWebShellWindow.cpp:386
#7  0x40024213 in nsWebShellWindow::HandleEvent (aEvent=0xbffff03c)
    at nsWebShellWindow.cpp:434
Note that I did not ever bring up the editor in this run; not clear why it was
trying to delete an nsEditorShellFactory.  Sounds like the list of top level
windows may be confused.  This may or may not be the same bug.
Priority: P3 → P1
Steve, see what you can figure out with Akkana's stack traces ...
Blocks: 7799
No longer blocks: 7799
Component: Apprunner → other
Now (after the Wed. M7 deadline) I'm getting a stack trace more similar to the
first one:
#1  0x4097bdc3 in nsCOMPtr<nsIFactory>::~nsCOMPtr (this=0x40989d0c,
    __in_chrg=2) at nsEditorShellFactory.cpp:131
#2  0x409730d0 in __tcf_0 () at nsEditorShellFactory.cpp:131
#3  0x407f7585 in exit (status=0) at exit.c:55
#4  0x40664ac7 in gdk_exit ()
#5  0x400231f7 in nsAppShellService::Shutdown (this=0x8092310)
    at nsAppShellService.cpp:415
#6  0x4096e92f in nsEditorShell::Exit (this=0x82e8660) at nsEditorShell.cpp:980
#7  0x400a6840 in XPTC_InvokeByIndex (that=0x82e8660, methodIndex=23,
    paramCount=0, params=0xbfffdf6c) at xptcinvoke_unixish_x86.cpp:154
#8  0x4110daa7 in nsXPCWrappedNativeClass::CallWrappedMethod (this=0x82e8df8,
    cx=0x80f84d0, wrapper=0x82e97a8, desc=0x82e8fbc, callMode=CALL_METHOD,
    argc=0, argv=0x81e0d50, vp=0xbfffe150) at xpcwrappednativeclass.cpp:605
Removing my name from cc.
Whiteboard: Investigate for M7
Whiteboard: Investigate for M7 → slamm: Unable to reproduce
I have not been able to get this crash.
Target Milestone: M7 → M8
OK, since we can't reproduce it yet, let's move it to M8 and give it some more
air time.
Well, okay; but I'm seeing this every time I exit (occasionally it goes away for
a few consecutive runs, but then it always comes back).  It's not at all
difficult for me to reproduce.

I suspect it has something to do with the editor shell leaking (bug 5806).
I'm trying to debug 5806. I need jband's help to do so.
Me too. I get the crash many many times.
Simon and I spent some time looking at this.  We discovered that if you change
g_pNSIFactory (currently a statically constructed nsCOMPtr) to a regular
pointer, initialized to 0, in GetEditorShellFactory in nsEditorShellFactory.cpp,
the crash goes away.  The crash happens on runs where we reregister the editor
library; on exit, we try to destruct this nsCOMPtr but apparently there's
something wrong with it or it wasn't properly constructed by the static
constructor, and the destructor causes a crash.

Adding ramiro and scc to the cc list in case they're interested in the static
constructor issue.  I'll look into this more to try to find out what the
difference is between this statically constructed nsCOMPtr and all the other
statically constructed nsCOMPtrs we're using for factory registration even
though the C++ guidelines say not to.
Consultation with dp resulted in the following theory:
The library is loaded in order to register itself, then forcibly unloaded.  The
constructor for the statically allocated object is called at library load time,
but the destructor isn't called until exit; however, by then, the library has
been unloaded so the code to implement the destructor no longer exists, and we
crash.
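A minimal sketch of the destruction timing this theory relies on. It does not load or unload a library; it only shows that the destructor of a static object is deferred until exit, which is the window in which the library could already be gone. `HolderSketch` and `GetHolderSketch` are hypothetical stand-ins for `nsCOMPtr<nsIFactory>` and the factory getter, not the real classes, and the function-local placement of the static is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for nsCOMPtr<nsIFactory>: any type with a
// non-trivial destructor behaves the same way for this purpose.
struct HolderSketch {
    static int constructed;
    static int destroyed;
    HolderSketch() { ++constructed; }
    ~HolderSketch() { ++destroyed; }
};
int HolderSketch::constructed = 0;
int HolderSketch::destroyed = 0;

// Assumed shape of the factory getter: a function-local static object.
// The compiler registers the destructor to run during exit(), long after
// this function has returned.
HolderSketch& GetHolderSketch() {
    static HolderSketch holder;
    return holder;
}

// After any number of calls the object has been constructed exactly once
// and its destructor is still pending.
void CheckDeferredDestruction() {
    GetHolderSketch();
    GetHolderSketch();
    assert(HolderSketch::constructed == 1);
    assert(HolderSketch::destroyed == 0);
}
```

If the code for `~HolderSketch` lived in a shared library that had been unloaded before exit, the pending destructor call would jump into unmapped memory, which is consistent with the `__tcf_0` frames under `exit` in the traces above.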

dp has reviewed and agreed with my change from nsCOMPtr to a straight pointer
(which will not be deallocated, so the factory will be leaked) for M7, and
suggested that other libraries ought to be doing the same, getting rid of any
static nsCOMPtrs they might be using in their factory methods.
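A rough sketch of the temporary fix as described, assuming the getter simply caches a plain pointer. `FactorySketch` and `GetFactoryLeaky` are hypothetical names; the real code uses `nsIFactory` and XPCOM reference counting.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an nsIFactory implementation.
struct FactorySketch {
    int refCount = 0;
    void AddRef() { ++refCount; }
};

// A plain pointer has no destructor, so nothing is registered to run at exit.
static_assert(std::is_trivially_destructible<FactorySketch*>::value,
              "raw pointers need no exit-time teardown");

FactorySketch* GetFactoryLeaky() {
    static FactorySketch* gFactory = 0;  // plain pointer, initialized to 0
    if (!gFactory) {
        gFactory = new FactorySketch();
        gFactory->AddRef();  // reference owned by the cache, never released
    }
    gFactory->AddRef();      // reference handed to the caller
    return gFactory;
}
```

The trade-off is the one stated above: the cached factory is intentionally leaked, so no library code has to run for it at exit.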
Assignee: slamm → akkana
Target Milestone: M8 → M7
Marking M7 because I'd like to check in the temporary fix.
Status: NEW → ASSIGNED
Whiteboard: slamm: Unable to reproduce → have temporary fix
I have looked at the fix and approve this.
This fix won't be complete without fixing all of these cases. Akkana, are you
going to take care of all of them?

./editor/base/nsEditFactory.cpp:33:  static nsCOMPtr<nsIFactory>  g_pNSIFactory;
./editor/base/nsHTMLEditFactory.cpp:36:  static nsCOMPtr<nsIFactory> g_pNSIFactory;
./editor/base/nsTextEditFactory.cpp:33:  static nsCOMPtr<nsIFactory> g_pNSIFactory;
./editor/base/nsTextEditor.cpp:1404:static nsCOMPtr<nsIDOMElement>
./editor/base/nsEditorShellFactory.cpp:131:  static nsCOMPtr<nsIFactory> g_pNSIFactory;
I was going to check in to M7 only the fix for the one that was known to cause a
crash, since that was the only one that's been explicitly approved at this
point, then check in the others for M8; but I could check them all in now if you
think it's a better idea (I tend to think we should do them all).

I haven't found any instances of this outside the editor, which is odd because I
know someone copied our code from somewhere, but maybe it's since been fixed
everywhere else.
I think it is better to fix all.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Checked in a fix for all the editor cases.  We're no longer keeping a static
pointer to the factory; instead we create the factory again whenever we need it.
This needs to be investigated for performance issues when ender becomes the
standard text widget; we may need to go back to a static model (but one which
can unload along with the ender library, which the old model didn't do right).
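A hedged sketch of the "create the factory again whenever we need it" approach, assuming ordinary reference counting. `FreshFactorySketch` and `CreateFactorySketch` are hypothetical names, not the checked-in code.

```cpp
#include <cassert>

// Hypothetical reference-counted factory; liveCount tracks instances
// so the lifetime can be observed.
class FreshFactorySketch {
public:
    static int liveCount;
    FreshFactorySketch() : mRefCnt(0) { ++liveCount; }
    void AddRef() { ++mRefCnt; }
    void Release() { if (--mRefCnt == 0) delete this; }
private:
    ~FreshFactorySketch() { --liveCount; }  // only reachable via Release()
    int mRefCnt;
};
int FreshFactorySketch::liveCount = 0;

// No static storage at all: each call builds a new factory and hands the
// caller the only reference, so nothing is left to destroy at exit.
FreshFactorySketch* CreateFactorySketch() {
    FreshFactorySketch* factory = new FreshFactorySketch();
    factory->AddRef();
    return factory;
}

// Example caller: take a factory, use it, release it.
void UseFactoryOnceSketch() {
    FreshFactorySketch* factory = CreateFactorySketch();
    assert(FreshFactorySketch::liveCount >= 1);
    factory->Release();
}
```

The cost is one allocation per request, which is the performance concern noted above.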
*** Bug 8043 has been marked as a duplicate of this bug. ***
now I get this after exiting apprunner on Linux:

Gtk-CRITICAL **: file gtkmain.c: line 532 (gtk_main_quit): assertion `main_loops != NULL' failed.
Whiteboard: have temporary fix
I see huge numbers of GTK errors and warnings on the console whenever I run
apprunner.  They're annoying and I hope someone is working on fixing them, but
they aren't related to this crash.
Status: RESOLVED → VERIFIED
verified in 6/21 build.
Product: Core → Core Graveyard