Closed
Bug 1016738
Opened 10 years ago
Closed 10 years ago
Simplify/fix "dead compartment" logic
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: billm, Assigned: billm)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files, 2 obsolete files)
28.58 KB,
patch
|
billm
:
review+
|
Details | Diff | Splinter Review |
1.04 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
The dead compartment/zone code has been polluting our codebase for a while. It's really weird and hard to understand. Eric ran into a problem with it in bug 966518. Originally, I thought we could just work around the problem, but I think it's better to simplify the whole setup.
The purpose of the code is to collect objects even if they get "revived" by wrapper remapping during incremental GC. This revival can happen because we do the remapping to all wrappers, not just ones that are alive. If we have a compartment that's completely dead, then remapping a wrapper inside it might cause us to trigger a barrier or allocate. Either of these operations will likely cause the global to be marked, which will then cause lots of other stuff to be marked.
This is sort of a big deal because it happens consistently during tests, where IGCs span over many tests, and each test causes a new navigation (and consequent wrapper remapping) when we load it.
The idea I came up to simplify this is that we can detect, at the end of every GC, whether a dead compartment was revived. If it was, we can do a non-incremental collection of just the zones that we want to collect. In theory, that collection should be really fast. We won't need to do any marking, just finalization. Previously, we were doing a non-incremental GC of all zones in that case. Since the cost of this collection is no longer so expensive, I think it's fine to remove all the weird guards and stuff that were intended to ensure that these collections where rare. So the patch eliminates AutoMaybeTouchDeadZones and AutoMarkInDeadZone.
While testing the patch, I ran into a further issue. Apparently the existing mechanism hasn't really been working since the zones patch landed. My original testcase for all this was to run reftests and watch compartments accumulate. With zones, the maybeAlive and scheduledForDestruction flags moved to the zone. However, reftests uses a single zone for all the tests. So the flags aren't much use--the zone is alive, but many of its compartments are dead. As such, the number of compartments can grow really big while running reftests. I guess we were just sort of getting away with this somehow.
Anyway, I moved these flags back to the compartment. Since we can't always get to the compartment from a given GC thing, I had to make an approximation that a compartment is alive only if it has live objects or scripts. I think that's pretty much true. Even if the approximation is wrong, the only bad thing that will happen is that we'll try to do one of these non-incremental GCs on the given zone.
With these changes, the number of compartments seems to stay fairly steady during the first few hundred reftests.
Assignee | ||
Comment 1•10 years ago
|
||
This seems pretty green on try. Luke, you reviewed the previous changes related to this stuff. Jon, maybe you should look at it too.
Attachment #8429697 -
Attachment is obsolete: true
Attachment #8430208 -
Flags: review?(luke)
Attachment #8430208 -
Flags: review?(jcoppeard)
Comment 2•10 years ago
|
||
Comment on attachment 8430208 [details] [diff] [review]
dead-compartments v2
Review of attachment 8430208 [details] [diff] [review]:
-----------------------------------------------------------------
It's nice to see AutoMaybeTouchDeadZones and AutoMarkInDeadZone go.
::: js/src/gc/Tracer.cpp
@@ +639,5 @@
> Zone *zone = static_cast<Cell *>(thing)->tenuredZone();
> if (zone->isCollecting()) {
> + // See the comment on SetMaybeAliveFlag to see why we only do this for
> + // objects and scripts.
> + switch (kind) {
It might be worth adding a comment GCMarker::appendGrayRoot() only gets called on incremental GC (we don't buffer gray roots for non-incremental GC). I don't think this matters because dead compartments never have their contents inadvertently marked live during non-incremental GC.
Attachment #8430208 -
Flags: review?(jcoppeard) → review+
Comment 3•10 years ago
|
||
Comment on attachment 8430208 [details] [diff] [review]
dead-compartments v2
Review of attachment 8430208 [details] [diff] [review]:
-----------------------------------------------------------------
Much nicer
::: js/src/jsgc.cpp
@@ +3023,5 @@
> for (CompartmentsIter c(rt, SkipAtoms); !c.done(); c.next()) {
> for (JSCompartment::WrapperEnum e(c); !e.empty(); e.popFront()) {
> + const CrossCompartmentKey &key = e.front().key();
> + if (key.kind == CrossCompartmentKey::ObjectWrapper ||
> + key.kind == CrossCompartmentKey::DebuggerObject)
I think you should additionally be able to get to the compartment of DebuggerScript, DebuggerEnvironment (environment = scope object), and maybe DebuggerSource.
@@ +4903,1 @@
> } while (repeat);
Unrelated, but: it seems 'repeat' is only used for the while condition, so I think it'd be a tiny bit simpler to put the condition in the while() directly.
::: js/src/jswrapper.h
@@ -310,5 @@
> - * because they operate on all compartments, whether live or dead. A brain
> - * transplant can cause a formerly dead object to be "reanimated" by causing a
> - * read or write barrier to be invoked on it during the transplant. In this way,
> - * a zone becomes a zombie, kept alive by repeatedly consuming
> - * (transplanted) brains.
I will miss this comment.
Attachment #8430208 -
Flags: review?(luke) → review+
Assignee | ||
Comment 4•10 years ago
|
||
Comment 5•10 years ago
|
||
Backed out for a spike in various GC-related crashes starting with this push.
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ac2205532ed
https://tbpl.mozilla.org/php/getParsedLog.php?id=40744831&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=40744890&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=40734131&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=40735967&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=40740817&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=40740348&tree=Mozilla-Inbound
Comment 7•10 years ago
|
||
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Assignee | ||
Comment 8•10 years ago
|
||
I had to back this out because of some WebRTC crashes.
https://hg.mozilla.org/integration/mozilla-inbound/rev/7831f5db9e25
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 9•10 years ago
|
||
Once we verify that this makes the crashes go away, we'll want to uplift this backout to Aurora as well.
Comment 10•10 years ago
|
||
No crashes on inbound since this landed. Please request Aurora approval at your earliest convenience.
Comment 11•10 years ago
|
||
Merge of backout:
https://hg.mozilla.org/mozilla-central/rev/7831f5db9e25
Assignee | ||
Comment 12•10 years ago
|
||
Comment on attachment 8430208 [details] [diff] [review]
dead-compartments v2
Are we supposed to request approval for backouts?
[Approval Request Comment]
Bug caused by (feature/regressing bug #): this one
User impact if declined: unknown
Testing completed (on m-c, etc.): on m-c
Risk to taking this patch (and alternatives if risky): none
String or IDL/UUID changes made by this patch: none
Attachment #8430208 -
Flags: approval-mozilla-aurora?
Comment 13•10 years ago
|
||
Comment on attachment 8430208 [details] [diff] [review]
dead-compartments v2
Fair point.
Attachment #8430208 -
Flags: approval-mozilla-aurora?
Comment 14•10 years ago
|
||
Comment 15•10 years ago
|
||
Bug 1022235 is now fixed, which was the thing most heavily triggered by this change. Note however that when this change landed, we started getting oranges without the load manager turned on; turning it on in bug 1022212 triggered bug 1022235. So it must have been tripping up something else - possibly another incorrect lifetime, or a bug in this patch.
With bug 1022235 landed (and on Aurora), it should be easier to look into this.
Assignee | ||
Comment 16•10 years ago
|
||
In the hopes that this is all fixed now, I've relanded this.
https://hg.mozilla.org/integration/mozilla-inbound/rev/aca2ec2457ec
Comment 17•10 years ago
|
||
Sorry, looks like they're still there. Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/3124e2c41202
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Assignee | ||
Updated•10 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 19•10 years ago
|
||
when this was backed out we had a perf win on android 4.0 tsvgx tests. Something to keep an eye out for when we land it again.
Updated•10 years ago
|
Blocks: GC.stability
Assignee | ||
Comment 20•10 years ago
|
||
Terrence mentioned on IRC the other day that a GC bug got fixed that might have impacted this patch. I pushed it to try again and the results look better. I no longer see the nuking crashes that I remember from before (I think they were in mochitest-1). However, it looks like there are still some test failures that need addressing.
I need some advice on what to focus on. There's an OSX 10.8 opt xpcshell test failure that clearly needs to be fixed. Other than that, I'm a little unsure. There's one Linux x64 debug crashtest failure that looks like one of the WebRTC issues we ran into before (initially starred in bug 1018372). But it seems pretty rare. There's also some weird stuff that I'm not sure about.
Ryan, can you offer any help here? What failures need to be addressed before landing and what do you think we can live with? Despite the multiple backouts we've had on this bug, I still haven't found a bug in the patch. It just seems to be really good at turning up bugs elsewhere.
Here are my try pushes. Please feel free to retrigger as you see fit.
https://tbpl.mozilla.org/?tree=Try&rev=ebce8337a8ed
https://tbpl.mozilla.org/?tree=Try&rev=f7fb02279093
https://tbpl.mozilla.org/?tree=Try&rev=b38987fbab99
Flags: needinfo?(ryanvm)
Comment 21•10 years ago
|
||
My recollection was the issues mainly manifested in the first DOM crashtest after WebRTC (i.e. things going down the tubes during cleanup). I'll retrigger more crashtest runs, but if things look good there, I don't see any reason to hold back on this.
That said, this looks exactly like the kind of thing we were hitting before:
https://tbpl.mozilla.org/php/getParsedLog.php?id=45153067&tree=Try
Seems like if we think this is an issue with WebRTC shutdown, we need more debugging done with this patch applied.
Flags: needinfo?(ryanvm)
Target Milestone: mozilla33 → ---
Assignee | ||
Comment 22•10 years ago
|
||
Rebased version.
Attachment #8430208 -
Attachment is obsolete: true
Attachment #8467996 -
Flags: review+
Assignee | ||
Comment 23•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/906c4230aec8
The WebRTC stuff should be fixed. This landing may lead to an increase in a pre-existing (and already very frequent) 10.8 xpcshell failure. RyanVM told me that was okay. Hopefully, if it happens, it will make it easier to track the issue down.
Comment 24•10 years ago
|
||
Introduced a build warning:
In file included from /home/ben/code/moz/inbound/js/src/jit/Ion.h:13:0,
from /home/ben/code/moz/inbound/js/src/jit/IonAllocPolicy.h:17,
from /home/ben/code/moz/inbound/js/src/jit/LIR.h:17,
from /home/ben/code/moz/inbound/js/src/jit/shared/Lowering-shared.h:13,
from /home/ben/code/moz/inbound/js/src/jit/shared/Lowering-x86-shared.h:10,
from /home/ben/code/moz/inbound/js/src/jit/x64/Lowering-x64.h:10,
from /home/ben/code/moz/inbound/js/src/jit/x64/Lowering-x64.cpp:7,
from /home/ben/code/moz/builds/d64/js/src/Unified_cpp_js_src7.cpp:2:
/home/ben/code/moz/inbound/js/src/jscompartment.h: In constructor ‘JSCompartment::JSCompartment(JS::Zone*, const JS::CompartmentOptions&)’:
/home/ben/code/moz/inbound/js/src/jscompartment.h:460:30: warning: ‘JSCompartment::jitCompartment_’ will be initialized after [-Wreorder]
js::jit::JitCompartment *jitCompartment_;
^
/home/ben/code/moz/inbound/js/src/jscompartment.h:456:10: warning: ‘bool JSCompartment::scheduledForDestruction’ [-Wreorder]
bool scheduledForDestruction;
^
In file included from /home/ben/code/moz/builds/d64/js/src/Unified_cpp_js_src7.cpp:93:0:
/home/ben/code/moz/inbound/js/src/jscompartment.cpp:37:1: warning: when initialized here [-Wreorder]
JSCompartment::JSCompartment(Zone *zone, const JS::CompartmentOptions &options = JS::CompartmentOptions())
^
I'll check in once inbound reopens.
Attachment #8468360 -
Flags: review+
Comment 25•10 years ago
|
||
(forgot to say that r+ is from efaust on irc)
https://hg.mozilla.org/integration/mozilla-inbound/rev/01d12ccde2bb
Comment 26•10 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #23)
> The WebRTC stuff should be fixed.
The final patch didn't land yet? Oh well, guess we'll fast-track it at this point.
Comment 27•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/906c4230aec8
https://hg.mozilla.org/mozilla-central/rev/01d12ccde2bb
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Updated•10 years ago
|
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•