Closed
Bug 905926
Opened 11 years ago
Closed 11 years ago
LAST_CONTEXT GCs are totally insane
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla27
People
(Reporter: bholley, Assigned: bholley)
References
Details
Attachments
(9 files, 2 obsolete files)
2.98 KB,
patch
|
jonco
:
review+
|
Details | Diff | Splinter Review |
3.53 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
2.02 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
9.07 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
3.22 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
7.88 KB,
patch
|
jonco
:
review+
|
Details | Diff | Splinter Review |
11.43 KB,
patch
|
billm
:
review+
jonco
:
review+
|
Details | Diff | Splinter Review |
4.86 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
1.25 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
In bug 905364, I worked some magic so that we don't always create the SafeJSContext whenever CAPS fires up. In general, this just delays the creation a bit until it's really needed. But in certain very trivial xpcshell situations, we actually never create the SafeJSContext at all.
This caused crashes, which I spent some time investigating. As it turns out, the JS engine has a panic attack when the last JSContext is destroyed, and starts freeing memory buffers willy-nilly without calling finalize hooks. In particular, during the 'last gc', we just call js_delete() on every compartment, even if it's marked and has a live global with all sorts of stuff hanging off of it: http://mxr.mozilla.org/mozilla-central/source/js/src/jsgc.cpp#2530
So in effect, the engine has an invariant that there must be a JSContext alive for the lifespan of anything useful happening on the JSRuntime. The SafeJSContext used to do this for the browser, but doesn't (necessarily) anymore. So now we run into a chicken-and-egg scenario with xpcshell. If we destroy xpcshell's JSContext before shutting down XPCOM, SpiderMonkey takes a bulldozer to XPConnect's carefully-managed data structures, causing us to crash. If we shut down XPCOM first, we end up simultaneously destroying the runtime, which probably isn't kosher to do with an outstanding context.
A simple workaround is to force the creation of the SafeJSContext in xpcshell. But it seems like we should fix this. Why do we do it this way, and not in JS_DestroyRuntime or somesuch?
Assignee | ||
Updated•11 years ago
|
Summary: LAST_CONTEXT GC is totally insane → LAST_CONTEXT GCs are totally insane
Some time ago we used to need a JSContext in order to perform a GC because we would pass the context to finalizers. Igor did a big refactor a while ago that removed this requirement. I'm pretty sure it should be possible to do all this teardown stuff in JS_DestroyRuntime now.
Comment 2•11 years ago
|
||
That sounds like an improvement.
I probably said this elsewhere, but DestroyRuntime should really call all the finalizers. Leaking everything is outrageous.
Assignee | ||
Updated•11 years ago
|
Assignee: general → bobbyholley+bmo
Assignee | ||
Comment 3•11 years ago
|
||
Assignee | ||
Comment 4•11 years ago
|
||
Blocks: 890243
Assignee | ||
Comment 5•11 years ago
|
||
Bill is going to take a look here.
This patch makes some changes that I think are needed to allow us to run safely without any JSContexts around. It passes jit-tests. I wasn't able to test otherwise since I get a crash at startup with just Bobby's patches applied.
Bobby, it's possible that this patch will fix the xpcshell test problem. I'm not sure what the deal is with the startup crash. It claims it's OOMing, but that doesn't make any sense. Have you tried to build your stuff locally? I based your patches on top of 764ac2c7cb0a, which was green on tinderbox, although I haven't tried it locally.
Never mind, the OOM was a stupid thing caused by updating Ubuntu without rebooting.
I couldn't reproduce the failures from try, so I pushed my patch to try on top of everything else. Maybe we'll get lucky and it will fix the problem.
https://tbpl.mozilla.org/?tree=Try&rev=130fb1068768
Assignee | ||
Comment 9•11 years ago
|
||
hm, still GC crashes. :-(
There are still some crashes left, but I think some stuff got fixed. Everything that's left seems to be caused by calling cyclecollector::DeferredFinalize during the final GC, which apparently is forbidden because the cycle collector has already shut down. I'm not too familiar with this code.
Comment 12•11 years ago
|
||
There's two parts to CC shutdown. First, there's the CC part, which doesn't matter here. The second part is nsCycleCollector_forgetJSRuntime, which nulls out the runtime component of the CC per-thread data which is what the DeferredFinalize is routed through. It looks like that gets called in ~CycleCollectedJSRuntime, right before we call JS_DestroyRuntime(). Is that also used in XPCShell?
Comment 13•11 years ago
|
||
Just at a glance, moving the nsCycleCollector_forgetJSRuntime after JS_DestroyRuntime() might work, though you'd have to be very careful, because presumably you'd end up with a dangling pointer in the CC at some point in during JS_DestroyRuntime().
I took Andrew's suggestion and moved down nsCycleCollector_forgetJSRuntime.
https://tbpl.mozilla.org/?tree=Try&rev=22ba816f2e7f
I guess I messed up the rebasing.
https://tbpl.mozilla.org/?tree=Try&rev=e7a224bec480
This one actually compiles.
https://tbpl.mozilla.org/?tree=Try&rev=3af588d149f7
Assignee | ||
Comment 17•11 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #16)
> This one actually compiles.
> https://tbpl.mozilla.org/?tree=Try&rev=3af588d149f7
The basic problem with the crash here is that XPCWrappedJS::Release needs to acquire the map lock, which is destroyed by the time we invoke JS_DestroyRuntime. We're going to just expose an idempotent mechanism on CycleCollectedJSRuntime so that we can destroy the JSRuntime in the XPCJSRuntime destructor. I've filed bug 911303 on doing this properly.
Assignee | ||
Comment 18•11 years ago
|
||
Assignee | ||
Comment 19•11 years ago
|
||
Assignee | ||
Comment 20•11 years ago
|
||
Assignee | ||
Comment 21•11 years ago
|
||
Assignee | ||
Comment 22•11 years ago
|
||
This is all green except for one jit-test, Debugger-onNewGlobalObject-15.js, which I actually wrote not so long ago.
The issue is that the list of debuggers is non-empty when the runtime is destroyed, because the Debugger object never got GCed. The problem goes away when I revert my patch to make SHG-marking lastContext()-independent. Weird.
I'll talk to bill about this on monday. Hopefully we can get it figured out and get this stuff up for review.
Assignee | ||
Comment 23•11 years ago
|
||
Assignee | ||
Comment 24•11 years ago
|
||
infra issues :\
https://tbpl.mozilla.org/?tree=Try&rev=ee48b61ff12c
Assignee | ||
Comment 25•11 years ago
|
||
_finally_ green on linux64. I'm going to do a full try push to see if there are any other lingering failures.
Assignee | ||
Comment 26•11 years ago
|
||
Assignee | ||
Comment 27•11 years ago
|
||
This is looking green, though the macosx results haven't come in yet. I'm going to upload patches and flag for review.
Assignee | ||
Updated•11 years ago
|
Attachment #793087 -
Attachment is obsolete: true
Assignee | ||
Updated•11 years ago
|
Attachment #792959 -
Attachment is obsolete: true
Assignee | ||
Comment 28•11 years ago
|
||
This patch fixes two small issues.
(1) It tightens an assertion in finalizeNow. Stuff finalized in finalizeNow
should never be finalized off the main thread, so we shouldn't have to
worry about BFS_JUST_FINISHED.
(2) If the only compartment that's left to GC is the atoms compartment, then
we're not collecting it. I just moved the |any| check down below the
code that decides whether to schedule the atoms compartment for GC.
Attachment #803483 -
Flags: review?(jcoppeard)
Assignee | ||
Comment 29•11 years ago
|
||
The current mechanism is pretty sloppy, and falls over if the context count ever
drops to zero and then increases again. Let's do this right.
Attachment #803484 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 30•11 years ago
|
||
This interacts badly with our attempts to shut things down nicely, and happens
to be entirely unused. the primary impact of it ends up being that the cx
destruction triggers a GC, which we depend on. But we can do that manually.
Attachment #803485 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 31•11 years ago
|
||
Attachment #803486 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 32•11 years ago
|
||
Attachment #803487 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 33•11 years ago
|
||
Attachment #803488 -
Flags: review?(jcoppeard)
Assignee | ||
Comment 34•11 years ago
|
||
This patch is a joint effort between billm and myself. I'm marking myself as
the author because I worked on it more recently, which means that Bill should
probably review it again in addition to jonco, and self-review is frowned upon.
There were a few places where we relied on !rt->hasContexts() to mean
"this is the last GC". That was never really valid, and it especially isn't
valid with the previous patch. This patch replaces the check everywhere it's
used in the GC.
Attachment #803489 -
Flags: review?(wmccloskey)
Attachment #803489 -
Flags: review?(jcoppeard)
Assignee | ||
Comment 35•11 years ago
|
||
In the current setup, we'll end up GCing once when destroying the SafeJSContext,
and then once again immediately after in the explicit GC we do before destroying
the XPCJSRuntime. It would be nice to avoid the first GC entirely, but we
currently need both to avoid leaking.
All in all, these patches cause us to GC three times during shutdown, rather
than twice as we did before, because the second GC was rolled together with
the runtime destruction GC when we destroyed the last JSContext. There are a
number of ways to eliminate these, at least in opt builds, but mccr8 thinks
it probably doesn't matter, since there shouldn't be much left in the heap after
the second GC.
We can probably get away with eliminating rambo GC entirely at some point. But
this might become irrelevant for the browser if we end up doing bug 662444.
It would also be interesting to see what, if anything, the rambo GC actually
collects. We might even be able to get away with asserting that all the zones
are gone and removing the GC entirely.
We also take the opportunity to kill mOwnSafeJSContext, which no longer holds
any meaning.
Attachment #803490 -
Flags: review?(wmccloskey)
Assignee | ||
Comment 36•11 years ago
|
||
Attachment #803491 -
Flags: review?(wmccloskey)
Comment on attachment 803484 [details] [diff] [review]
Part 2 - Explicitly track whether a context has been created. v1
Review of attachment 803484 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jscntxt.cpp
@@ +186,5 @@
> * keywords, numbers, strings and self-hosted scripts. If one of these
> * steps should fail, the runtime will be left in a partially initialized
> * state, with zeroes and nulls stored in the default-initialized remainder
> * of the struct. We'll clean the runtime up under DestroyContext, because
> * cx will be "last" as well as "first".
Please remove the last sentence from the comment.
Attachment #803484 -
Flags: review?(wmccloskey) → review+
Comment on attachment 803485 [details] [diff] [review]
Part 3 - Stop creating an ephemeral JSContext during XPConnect shutdown. v1
Review of attachment 803485 [details] [diff] [review]:
-----------------------------------------------------------------
I think there's probably more we can do to improve this sequence, but it's definitely outside the scope of the bug.
Attachment #803485 -
Flags: review?(wmccloskey) → review+
Comment on attachment 803486 [details] [diff] [review]
Part 4 - Invoke JS_DestroyRuntime before we totally tear down the XPCJSRuntime. v1
Review of attachment 803486 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/base/CycleCollectedJSRuntime.h
@@ +86,5 @@
> CycleCollectedJSRuntime(uint32_t aMaxbytes,
> JSUseHelperThreads aUseHelperThreads);
> virtual ~CycleCollectedJSRuntime();
>
> + // Idempotent. Subclasses may destroy their runtime earlier in execution if
I think runtimes should be plural, although I'm not sure.
Attachment #803486 -
Flags: review?(wmccloskey) → review+
Attachment #803487 -
Flags: review?(wmccloskey) → review+
Comment on attachment 803489 [details] [diff] [review]
Part 7 - Stop using hasContexts(). v2
Review of attachment 803489 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/vm/Runtime.cpp
@@ +422,5 @@
>
> + /*
> + * Flag us as being destroyed. This allows the GC to free things like
> + * interned atoms and Ion trampolines.
> + * */
Last line of comment is messed up.
Attachment #803489 -
Flags: review?(wmccloskey) → review+
Comment on attachment 803490 [details] [diff] [review]
Part 8 - Be more explicit about GCing twice in the nsXPConnect destructor. v1
Review of attachment 803490 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/xpconnect/src/nsXPConnect.cpp
@@ +110,5 @@
>
> + // In order to clean up everything properly, we need to GC twice: once now,
> + // to clean anything that can go away on its own (like the Junk Scope, which
> + // we unrooted above), and once after forcing a bunch of shutdown in
> + // XPConnect, to clean the stuff we forcibly disconnected.
This doesn't explain why we couldn't get away with only the latter GC. If you know the answer, can you add something?
Attachment #803490 -
Flags: review?(wmccloskey) → review+
Attachment #803491 -
Flags: review?(wmccloskey) → review+
Updated•11 years ago
|
Attachment #803483 -
Flags: review?(jcoppeard) → review+
Updated•11 years ago
|
Attachment #803489 -
Flags: review?(jcoppeard) → review+
Updated•11 years ago
|
Attachment #803488 -
Flags: review?(jcoppeard) → review+
Assignee | ||
Comment 42•11 years ago
|
||
Updated patches to address review comments. One more full push to try just to be super safe. There were no substantive changes to the patches since comment 26, so the only danger would be a mismerge while updating to tip:
https://tbpl.mozilla.org/?tree=Try&rev=b7b1b04fcccd
Assignee | ||
Comment 43•11 years ago
|
||
Comment 44•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8093f0c7cc19
https://hg.mozilla.org/mozilla-central/rev/ce0d77291a93
https://hg.mozilla.org/mozilla-central/rev/338b7a7e8561
https://hg.mozilla.org/mozilla-central/rev/93dcb6fe927f
https://hg.mozilla.org/mozilla-central/rev/ecb564ad9140
https://hg.mozilla.org/mozilla-central/rev/021a57afb505
https://hg.mozilla.org/mozilla-central/rev/e78212e4cfa3
https://hg.mozilla.org/mozilla-central/rev/afa1358c1a80
https://hg.mozilla.org/mozilla-central/rev/5ffef86946ed
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
You need to log in
before you can comment on or make changes to this bug.
Description
•