Closed Bug 464828 Opened 16 years ago Closed 16 years ago

js_ThreadDestructorCB doesn't get called for all threads

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dmandelin, Assigned: dmandelin)

References

Details

(Keywords: fixed1.9.1)

Attachments

(2 files, 1 obsolete file)

My test case: start a browser. Shut it down. js_ThreadDestructorCB appears not to have been called at all. The attached printf patch should show a record for it and doesn't.

The result is that js_FinishJIT doesn't get called either. And so the fragment caches are not cleaned up (by Fragmento::clearFrags), and they appear in the shutdown leak report. Both the trace cache and the native regexp cache appear in the leak report. I'm not sure why the problem only became obvious with regexps, it could be that they are very frequent in chrome and smaller and so there are more of them.
Blocks: 463814
So I thought there was a bug on this, but I can't find it.  I recall knowing before that we didn't destroy the private data for the main thread, though.  The constraint of not regressing bug 365048 makes this somewhat harder to fix, though.  If we didn't have that constraint, we could just destroy the thread-private data when destroying the runtime.

(I think the leak may date back to being a regression from bug 312338.)
(In reply to comment #1)
> (I think the leak may date back to being a regression from bug 312338.)

Sorry, bug 312238.
I've figured out more about what's going on here. 

NSPR doesn't call thread-private-data (TPD) destructors on the main thread. This was a specific decision made to fix bug 383977, which was a crash on shutdown.

This makes it hard to avoid shutdown leaks in TraceMonkey, because TM has a trace monitor for each JS thread, including of course the main thread. 

I think the ideal solution would be to change NSPR to call TPD destructors on the main thread and make whatever other changes are required to avoid crashes on shutdown. But comment 2 of bug 383977 says:

>I'm now convinced that it is unsafe for NSPR's shared library
>finalization function to call back to the thread private data
>destructors.  First, the libraries in which the destructors are
>defined in may have been shut down or unloaded.  Second, this
>breaks backward compatibility, because NSPR did not call the
>TPD destructors when it is unloaded before.

I don't see why the first reason (libraries containing the dtor functions may gave been unloaded) is true, but maybe I'm just ignorant on that. The second reason is more problematic, although I think not having those functions called is super-annoying and unexpected, and also doesn't match the docs.

Anyway, I'm not sure if we could update NSPR in time for 1.9.1. Can we? If we can, then we should be able to get this to work. For example, we could add a parameter to the callback that says whether it is being called from the main thread. That way, dtors that are unsafe to run from the main thread could check for that and do nothing.

If we can't do that, then we need to call js_FinishJIT on main thread shutdown by other means. Based on a suggestion from jst, it looks like NS_ShutdownXPCOM_P would be a reasonable place to do this, right after it calls nsCycleCollector_shutdown. (That function calls JS_GC, so clearly we can't shut down before it.) If we go this route, presumably we need to consult with a browser & JS shutdown expert, which is certainly not me.
(In reply to comment #3)
> If we can't do that, then we need to call js_FinishJIT on main thread shutdown
> by other means. Based on a suggestion from jst, it looks like

Why call js_FinishJIT when you can just delete the TPD for the main thread (and set it to null) and fix a bunch of other leaks at the same time?  (In comment 1, I was pretty much thinking that, but then realized that you couldn't just do that in JS_DestroyRuntime per the bug I cited there.)
(In reply to comment #4)
> Why call js_FinishJIT when you can just delete the TPD for the main thread (and
> set it to null) and fix a bunch of other leaks at the same time?  (In comment
> 1, I was pretty much thinking that, but then realized that you couldn't just do
> that in JS_DestroyRuntime per the bug I cited there.)

That's a good idea but I don't know how to do it. The highest (in the call graph) functions I found were _pt_thread_death (non-portable, private) and _PR_DestroyThreadPrivate (also private). 

Were you thinking of setting up a mechanism to call PR_SetThreadPrivate(index, 0) for each TPD index? (Or actually I guess you might have just meant doing that for the JIT thing, which is clearly better than calling js_FinishJIT.)
I was thinking of something much less fancy:

    JSThread *thread;

    thread = (JSThread *)PR_GetThreadPrivate(threadTPIndex);
    js_ThreadDestructorCB(thread);
    PR_SetThreadPrivate(threadTPIndex, NULL); /* does this do what I want? */
(In reply to comment #0)
> The result is that js_FinishJIT doesn't get called either.

What about calling js_FinishJIT during the shutdown of the JS Runtime? It happens on the main thread for the browser and the shell.
(In reply to comment #7)
> What about calling js_FinishJIT during the shutdown of the JS Runtime? It
> happens on the main thread for the browser and the shell.

We thought of doing this, but we were afraid of adding hidden requirements about which threads are used for starting and shutting down runtimes. Let me think it through more carefully:

- The trace monitor for thread T is created when js_GetCurrentThread is called on T (e.g., when a context is created on that thread).

- The destructor callback is registered (for all threads) the first time JS_NewRuntime is called. This will result in the trace monitor being cleaned up for all threads (when they exit) except the main thread.

So I think it's OK to clean that up in the last call to JS_DestroyRuntime or the call to JS_ShutDown. It looks like JS_ShutDown can be used for both browser and shell. If some change later causes that call to occur on a different thread, then the main thread won't be cleaned up, but that isn't catastrophic and should be noticed by the leak finders.
Here's a patch based on the JS_ShutDown idea. Seems to do the right thing in the browser.
Attachment #350843 - Flags: review?(igor)
Comment on attachment 350843 [details] [diff] [review]
Patch to clear tp data in JS_ShutDown

>diff -r f379309b622f js/src/jscntxt.h
> /*
>+ * Cleanup thread-private data on the current thread. This happens 
>+ * automatically for most threads, so this method should only be called 
>+ * on shutdown on the main thread.
>+ */
>+extern JSBool
>+js_CleanupThreadPrivateData();

The comment is not precise - it should state that the cleanup happens automatically when the thread terminates. So the function should be called for any thread that survives JS_DestroyRuntime. Otherwise the patch is OK.
Attachment #350843 - Flags: review?(igor)
Drive-by nit-pick: s/Cleanup/Clean up/ in that comment, too, please.

/be
I revised the comments per your recs, except that I referred to JS_ShutDown since that's where I do the final cleanup. I thought it was possible to create multiple runtimes but I don't know if that ever really happens. I also added some bug number xrefs because I forgot to do that the first time around.
Assignee: general → dmandelin
Attachment #350843 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #350999 - Flags: review?
Attachment #350999 - Flags: review? → review?(igor)
Attachment #350999 - Flags: review?(igor) → review+
Pushed to TM as fe5804f75874.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Attachment #350999 - Flags: approval1.9.1+
fixed1.9.1 kw?
Not yet fixed on 1.9.1, AIUI. Not even sure it got merged into mozilla-central, though it is RESO FIXED which implies as much.
The changeset is in m-c. What is showing it not to be fixed there?
(In reply to comment #16)
> The changeset is in m-c. What is showing it not to be fixed there?

I have been doing all the branch merging, so the team can spend their time on other things. Looks like I missed the flag here, I'll verify.
Flags: in-testsuite-
Flags: in-litmus-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: