Closed Bug 62401 Opened 24 years ago Closed 23 years ago

Crash on shutdown after opening Manage Bookmarks in session

Categories

(Core :: Preferences: Backend, defect, P1)

defect

Tracking

()

RESOLVED WORKSFORME
mozilla0.9

People

(Reporter: bugzilla, Assigned: alecf)

References

Details

(Keywords: crash)

Attachments

(2 files)

Build ID: new trunk

Steps to Reproduce:

(1) Load Mozilla
(2) Open the Manage Bookmarks window (Bookmarks | Manage Bookmarks... or Ctrl+B)
(3) Close the Manage Bookmarks window
(4) Quit Mozilla

Result:  crash in JS3250.DLL

In the console of my debug build:

JS API usage error: 1 contexts left in runtime upon JS_DestroyRuntime.
JS engine warning: leaking GC root 'res->input' at 02835F20
JS engine warning: 1 GC root remains after destroying the JSRuntime.
                   This root may point to freed memory. Objects reachable
                   through it have not been finalized.

Doesn't sound pretty.  Tried to get a stack and Windows gave me the BSOD then 
threatened to force reinstallation, so I gave up.
Hi reporter, I have a suggestion for you:
For you might 'new trunk' mean 'really new', but tomorrow, or the day after
that, this will be an old trunk for people using every days latest build. So I
suggest please always specify the build. So we can try to reproduce this, and
test with a newer build perhaps. To check if this is fixed already. Then your
helping other testers much more, like me.

And as I see the name Blake Ross, Isn't that a well know person in this
cummunity? So you should know the rules, or am I wrong with stating this?

Also, there's already a bug about opening Bookmark, Help, About and closing
windows and quiting Mozilla. But I whas unable to reproduce it! Might this be
related to that bug?

B.T.W. I just run the test with build 2000120804 on WinNT4 Sp6b and I have 264
bookmarks on file (if that helps), but WFM!

Thank You, HJ.
Keywords: crash
*** Bug 62400 has been marked as a duplicate of this bug. ***
*** Bug 62417 has been marked as a duplicate of this bug. ***
blake: Does not the phrase "JS API usage error" suggest that it is the *user* of
the JS API, not the JS engine that implements that API, that should get this
bug?

alecf: can you dig into this a bit?  Thanks.
/be
Assignee: rogerl → alecf
Component: Javascript Engine → Bookmarks
QA Contact: pschwartau → claudius
Brendan: Yes, I assumed there was probably something wrong with the bookmarks 
code, but I also didn't think JS should crash because of it.

Perhaps this (specifically that reference to the bookmarks window?) is causing 
problems:

function doUnload()
{
    // Get the current window position/size.
    var x = window.screenX;
    var y = window.screenY;
    var h = window.outerHeight;
    var w = window.outerWidth;

    // Store these into the window attributes (for persistence).
    var win = document.getElementById("bookmark-window");
    win.setAttribute("x", x);
    win.setAttribute("y", y);
    win.setAttribute("height", h);
    win.setAttribute("width", w);
}

...called, obviously, onunload.
nav triage team:

make nsbeta1, p1. QA: is this reproducible on other platforms?
Keywords: nsbeta1
Priority: P3 → P1
ok, so that leak is a regular expression leak of some kind... 
http://lxr.mozilla.org/seamonkey/source/js/src/jsregexp.c#2317

There is only one regexp in bookmarks.js, I'm guessing it's the culprit:
http://lxr.mozilla.org/seamonkey/source/xpfe/components/bookmarks/resources/book
marks.js#817

Sure enough, if I remove that .replace(), we don't crash.

brendan? this is looking like a js engine bug.
Assignee: alecf → rogerl
Component: Bookmarks → Javascript Engine
QA Contact: claudius → pschwartau
btw: I couldn't repro this on Mac on linux but I did rather easily on Win98 with a 12/08 build
For the umpteenth time, this is not a JS engine bug.  A JS API user is leaking a
whole JSContext, which leaks its regExpStatics.input root.

Alec, it's you or jst, I'm sorry to say.

/be
Assignee: rogerl → alecf
it must be jst then - because the only thing that triggers this is the use of a
regular expression. If I don't use it, it doesn't crash...

the bookmarks code in question is 100% JS and the only direct manipulation of
the JS API from the C++ bookmarks code is not even called if you just open the
window and close it again
Assignee: alecf → jst
jst, alecf: sorry to be less helpful earlier.  The fix is in nsGlobalWindow.cpp,
in GlobalWindowImpl::SetNewDocument -- where it calls JS_ClearScope, it should
also call ::JS_ClearRegExpStatics(cx); given a new JSContext *cx that is first
set to (JSContext*) mContext->GetNativeContext().

This is because a RegExp object has the same parent as its constructor (RegExp),
which is the window object.  The window object holds a strong XPCOM ref on its
private data (GlobalWindowImpl), which owns its mContext, which owns its native
context (cx), which owns a GC root (regExpStatics.input), which points at the
RegExp object.

The classic codebase called JS_ClearRegExpStatics, but that call was not carried
forward in the new DOM.

/be
So something like this?

Index: dom/src/base/nsGlobalWindow.cpp
===================================================================
RCS file: /cvsroot/mozilla/dom/src/base/nsGlobalWindow.cpp,v
retrieving revision 1.366
diff -u -r1.366 nsGlobalWindow.cpp
--- nsGlobalWindow.cpp  2000/12/03 01:21:54     1.366
+++ nsGlobalWindow.cpp  2000/12/13 09:57:09
@@ -369,6 +369,7 @@
 //      and I mean everything, to be leaked (that's bad)
           ::JS_ClearScope((JSContext *) mContext->GetNativeContext(),
                           (JSObject *) mScriptObject);
+          ::JS_ClearRegExpStatics((JSContext *)mContext->GetNativeContext());
         }
       }
       nsCRT::free(str);

I assume this is an old bug that just happend to be found now, right?
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.8
It's been around for a fairly long time, yes.
jst: yeah, old bug; your patch should fix it.  How does it prove itself during
testing?

/be
Setting default component and QA -
Status: ASSIGNED → NEW
Component: Javascript Engine → DOM Level 0
QA Contact: pschwartau → desale
Brendan, unfortunately the patch doesn't prove itself being very helpful in
resolving this problem, no change what so ever. More ideas?
Status: NEW → ASSIGNED
OS: Windows ME → All
Hardware: PC → All
JS_ClearRegExpStatics is the right tool -- maybe it's just not being used at the
right time.  Is it being called before the "JS API usage error" debugging
printfs appear?  Maybe there's some other, independent strong ref cycle that is
causing the context to leak.

/be
Brendan, this is getting interesting, I checked if we really leak a nsJSContext
by putting a breakpoint in the constructor and destructor and here's what I
found, + means constructor and - means destructor:

Starting mozilla:

+this 0x0193bfb0
+this 0x02ddb510

profile dialog opens

-this 0x0193bfb0
-this 0x02ddb510

profile dialog done

+this 0x02c92360
+this 0x02d0a3a0
+this 0x032224c0
+this 0x034499c0
+this 0x0344c3c0
+this 0x039a20c0
+this 0x03a26770
+this 0x03aa2a60
+this 0x03c5e400

(Main window appears)

+this 0x03c22a00
+this 0x048712c0

Main window done, opening bookmark window.

+this 0x049669f0
+this 0x049bb7b0

bookmark window open, closing bookmark window

-this 0x049669f0

closing main window.

-this 0x02c92360
-this 0x03c22a00
-this 0x03c5e400
-this 0x02d0a3a0
-this 0x039a20c0
-this 0x048712c0
-this 0x0344c3c0
-this 0x049bb7b0
-this 0x034499c0
-this 0x03a26770
-this 0x03aa2a60
-this 0x032224c0

If you count the lines with + and - on them you'll see that there are 15 of
each, i.e. no contexts leaked and they're all destroyed before we call
JS_DestroyRuntime() but we still crash and we get this on the console:

WEBSHELL- = 0
...
JS API usage error: 1 contexts left in runtime upon JS_DestroyRuntime.
JS engine warning: leaking GC root 'res->input' at 03271D70
JS engine warning: 1 GC root remains after destroying the JSRuntime.
                   This root may point to freed memory. Objects reachable
                   through it have not been finalized.

Here's the stack for the crash, we crash doing:

    PR_Lock(rt->rtLock);

and 'rt' looks like garbage, most likely pointing to a destroyed runtime, the
call to JS_DestroyRuntime() happened before the crash so yeah, the runtime must
be gone.

js_LockRuntime(JSRuntime * 0x010479d0) line 879 + 3 bytes
js_DestroyContext(JSContext * 0x03271cc0, int 2) line 159 + 9 bytes
JS_DestroyContext(JSContext * 0x03271cc0) line 887 + 11 bytes
PREF_CleanupPrefs() line 551 + 13 bytes
PREF_Cleanup() line 536
nsPref::ShutDown(nsPref * const 0x00559050) line 627
nsPref::~nsPref() line 225
nsPref::`scalar deleting destructor'(unsigned int 1) + 15 bytes
nsPref::Release(nsPref * const 0x00559050) line 529 + 135 bytes
nsCOMPtr<nsIPref>::~nsCOMPtr<nsIPref>() line 489

So it looks like the crash is unrelated to the warning about a context leak and
it looks like there's something weird going one with the contexts too since all
the native nsJSContexts are destroyed. (this test was done with my above change)
It must be the prefs context, then.  Who should get this bug?

/be
Whiteboard: [nsbeta1+]
Alec, could you look into this? It looks like the prefs code is doing JS engine
calls after the js runtime has been destroyed when mozilla shuts down after the
manage bookmarks window has been opened.
Assignee: jst → alecf
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
getting rid of old nsbeta1+ (we have keywords now)
Whiteboard: [nsbeta1+]
Hmm...this seems to be win-only per claudius's earlier comment.  So possibly 
related to bug 59530?
It crashes for me with build 2001010504 on Win32 (Windows 2000), but I get an
error message saying an area of memory can't be 'written'. The title of the
message is 'NSPR:EventReceiver: mozilla.exe'. I just thought this might help.
the proposed patch makes preferences an observer of the XPCOM shutdown
notification, which happens BEFORE the XPCOM services shutdown. the JS runtime
is stored in a service, so we know the runtime is valid at this point.
Shutdown() calls PREF_CleanupPrefs() which eliminates our pointer to the global
runtime.

I have verified that your prefs still save when you exit, and that we don't
crash. We were already calling ShutDown() in the destructor (as shown by the
stack trace) so that essentially makes this a no-op (but I'm leaving it in there
in case by some STRANGE chance you want to shut down the pref service before
mozilla itself shut down.

Looking for reviewers. brendan? jst?
Whiteboard: fix in hand
Looks good to me, but I'd like to see

  NS_ConvertASCIItoUCS2(NS_XPCOM_SHUTDOWN_OBSERVER_ID)

changed to

  NS_LITERAL_STRING(NS_XPCOM_SHUTDOWN_OBSERVER_ID).get()

Change that when you check in you've got my r=jst.
What jst wrote (although the .get() isn't necessary, is it?)

I'm gonna file a bug on nsCRT::strcmp(const PRUnichar *, const char *) for its
useless table lookup to map ISO-Latin-1 to UCS2.

sr=brendan@mozilla.org.

/be
Brendan, currently .get() is not required on nsLiteralString's (or
NS_ConvertASCIItoUCS2's) but I've heard rumors that it will be sometime soon, so
using .get() will probably save whomever makes .get() required some work.
You can't use NS_LITERAL_STRING with #define'ed strings, because it expands badly:

NS_LITERAL_STRING(FOO).get() puts an "L" in front of FOO and this turns into

nsLiteralString(LFOO).get()

which is undefined, and creates a compiler error.
We really need to fix NS_XPCOM_SHUTDOWN_OBSERVER_ID to do like the others
(which are defined like:
#define FOO (nsLiteralString("foo").get())

but that's another bug... which I'll go file now.
Argh, sorry for forgetting that nasty little C-pre-processor order of ## vs.
argument substitution gotcha.  But hey, fixing NS_XPCOM_SHUTDOWN_OBSERVER_ID to
expand to a NS_LITERAL_STRING(...).get() will avoid the asymmetric nsCRT::strcmp
call, although it will also inflate via NS_ConvertASCIItoUCS2 in two places in
your patch, on platforms that can't use uint16 to hold UTF-16 for L"foo" string
chars.

/be
fix is in, new bug is bug 66008
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Seeing this again...(but was fixed for a few days, iirc)
ugh, no I had to back this out (forgot to reopen) because it was causing
additional leaks.
Status: REOPENED → ASSIGNED
moving "fix in hand" bugs to moz 0.9
Target Milestone: mozilla0.8 → mozilla0.9
ok, the problem before was that a few people tried to access prefs after I had
recieved the shutdown notification

this fix only cleans up the JS stuff when I get the notification. The pref
hashtable is still there.

brendan/jst care to review?
Looks ok, one nit: no need to call JS_SetGlobalObject(cx, NULL) if you're about
to JS_DestroyContext(cx).

+        if (gGlobalConfigObject) {
+            JS_SetGlobalObject(gMochaContext, NULL);
+            gGlobalConfigObject = NULL;
+        }
+
+        JS_DestroyContext(gMochaContext);
+        gMochaContext = NULL;

/be
oh, ok, I was just copying and pasting code. fixed in my tree, and now checked
in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → FIXED
And this was once again backed out. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
thanks for reopening
yeah, I had to back this out because it caused a crash on exit on tinderbox. I
cannot reproduce the crash on my machine, but I'll continue to investigate..
Status: REOPENED → ASSIGNED
Could the crash on exit be related to brendan's changes too?  (It was a JS
assertion.)  On some of the tinderboxes (maybe the optimized ones where
JS_ASSERT is a no-op?) there was no crash on exit, but there were additional
leaks instead.
Keywords: dom0
*** Bug 69800 has been marked as a duplicate of this bug. ***
Keywords: dom0
Resetting target milestone, not sure Alec can get this in for mozilla0.9
Target Milestone: mozilla0.9 → ---
ok, a bit hasty there, putting back mozilla0.9, sorry for the spam
Target Milestone: --- → mozilla0.9
I'm close, and this is pretty critical - we can't have people crashing just
because they opened the bookmarks window. Crash on exit means that we lose lots
of user data...putting back on 0.9 milestone
If it's any help, I'm getting this too (crash on close after opening bookmarks),
but didn't start doing it until I tried using the 'file bookmark' on the
bookmark dialog toolbar.  Instead of adding one, it added some reference back to
the top of my bookmark list, thus creating a sort of 'infinite loop' in my
bookmarks (that is, if I look at my bookmarks, when I get to the directory where
I tried to file it, I have instead a "Bookmarks for <my name>" subdirectory
there, which contains all my root level bookmarks, and I can repeatedly keep
going deeper like that).

I tried deleting that, manually editing the bookmarks file, and even replacing
the bookmarks file with my Netscape 4.x one.  That last one fixed my bookmarks
at least, but still generates the error on close if I open the bookmark dialog.

Currently using "Build ID: 2001022820" (was using milestone 0.8 when first occured).
to be clear, this has NOTHING to do with the contents of your bookmarks - this
is something broken with libpref's use of JSContexts.

I'm actually updating the component just so we're clear.
Component: DOM Level 0 → Preferences: Backend
Whiteboard: fix in hand
alecf: what's going wrong, can you say more?

/be
Mass moving most of mozilla0.9 bugs to mozilla0.8.1
Target Milestone: mozilla0.9 → mozilla0.8.1
Marking nsbeta1+ since we're trying to fix this anyway
Keywords: nsbeta1nsbeta1+
*** Bug 71603 has been marked as a duplicate of this bug. ***
*** Bug 63767 has been marked as a duplicate of this bug. ***
moving to 0.9 since I dont' have a fix that makes linux happy yet.
Target Milestone: mozilla0.8.1 → mozilla0.9
I think shaver fixed this with his latest pref checkin
Status: ASSIGNED → RESOLVED
Closed: 24 years ago23 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: