932982 - IonMonkey: Don't throw away / invalidate active ion code during GC

Reporter

Description

•

11 years ago

This used to be a utopia, but it is coming closer and closer. The walls making this impossible are falling, and there seem to be only a few things left before this can become a reality.

This is important because we don't have GGC and a GC is still common. Every GC removes all ionscripts and we fall back to baseline, so we need to compile every script again. As a result, every GC adds some "startup time" before we are fast again. GC shouldn't push us to the ground and let us crawl until we can stand up again.

With GGC this is also nice to have. We will still GC! So for keeping our performance during GC this will be much, much better, especially for code that does graphical things. Constant speed is preferred in that case, without big drops in performance.

I feel this bug is becoming more and more important. I spoke with nbp and terrence before to find out what was needed, but didn't find time to look into this, especially since I'm fuzzy about a lot of the details needed to implement it. Today Jan and Brian were talking about it. Both have more knowledge about it, and Brian mentioned he would possibly take this after wrapping up his other work.

I'm opening this as a bug/feature/todo list, since I have other bugs that will depend on this.

Hannes Verschore [:h4writer]

Reporter

Comment 1

•

11 years ago

Tested this on pdf.js with "gcPreserveCode()" and our score goes from 13873 to 17561. That would be a 26% improvement for pdf.js and would bring us on par with Chrome (V8).
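As a quick check of the figure above, the relative improvement can be computed directly from the two scores. The helper name `percentImprovement` is made up for this sketch and is not part of any benchmark harness.

```cpp
#include <cassert>

// Relative improvement of `after` over `before`, in percent.
// Hypothetical helper, used only to check the arithmetic in the comment above.
double percentImprovement(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

With the scores quoted above, `percentImprovement(13873, 17561)` comes out at roughly 26.6, which matches the 26% figure.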

Blocks: 807162

Till Schneidereit [:till]

Comment 2

•

11 years ago

Since we expect Shumway to run pretty much permanently, this is of fairly high importance.

Blocks: 885526

Whiteboard: [Shumway:m2]

Brian Hackett [Laid off!]

Assignee

Comment 3

•

11 years ago

Attached patch: patch

This patch removes the special type marking steps that need to be taken when we are preserving jitcode. Instead of marking all scripts, type objects and singletons, we just mark the heap as normal and can collect whatever is dead.

Since discarding jitcode aggressively is still good for memory consumption this leaves the related heuristics unchanged, except that the lastCodeRelease catch is removed as we can still collect stuff when keeping jitcode around. Presumably these heuristics can be relaxed to keep jitcode around more often once this patch goes in.

This patch seems stable with jit-tests with or without forcing ShouldPreserveJITCode to always return true.

Besides removal of markTypes, the changes this makes:

- Trace type constraints along with their associated type sets. Since these constraints are what trigger invalidation they need to persist through the GC. As with type sets though these just hold weak references.

- Remove NEW_SCRIPT_REGENERATE junk, which was a hack to deal with the fact we always threw away type constraints on GC.

- Compress the compiler outputs indexing jit compilations on GC, so that the array does not grow without bounds.

- Move some stuff from TypeCompartment to TypeZone for ease of use.

- Removal of the pending array on compartments. This is kind of unrelated and is more cleanup; resolving constraints is no longer reentrant so the pending array isn't needed to avoid blowing the native stack.
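To illustrate the "weak references" point in the first list item, here is a minimal standalone sketch of sweeping a constraint list after marking. It is not the code from the patch: `Script`, `Constraint`, `shouldSweep` as written here, and `sweepConstraints` are simplified stand-ins invented for this example. The assumption is only that a constraint does not keep its target alive, and is dropped during sweeping once the target turns out to be dead.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for something a constraint watches (e.g. a script).
struct Script {
    bool marked = false;  // set by the marking phase if the script is still reachable
};

// Hypothetical constraint holding a weak (non-owning) pointer to its target.
// It never marks the script itself, so it cannot keep the script alive.
struct Constraint {
    const Script* script;

    // True when the target died during this GC and the constraint must go.
    bool shouldSweep() const {
        assert(script != nullptr);
        return !script->marked;
    }
};

// Rebuild the constraint list after marking, keeping only the constraints
// whose target survived. Surviving constraints persist across the GC, so
// they can still trigger invalidation later.
std::vector<Constraint> sweepConstraints(const std::vector<Constraint>& old) {
    std::vector<Constraint> kept;
    for (const Constraint& c : old) {
        if (!c.shouldSweep())
            kept.push_back(c);
    }
    return kept;
}
```

The point of the design is that the heap is marked as usual, without special treatment for type information, and the constraints are then filtered against whatever survived.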

Assignee: nobody → bhackett1024

Attachment #8338909 - Flags: review?(wmccloskey)

Attachment #8338909 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Updated

•

11 years ago

Depends on: 943667

Jan de Mooij [:jandem]

Comment 4

•

11 years ago

Comment on attachment 8338909
patch

Review of attachment 8338909:
-----------------------------------------------------------------

::: js/src/gc/Zone.cpp
@@ -85,5 @@
> -{
> - /*
> - * Mark all scripts, type objects and singleton JS objects in the
> - * compartment. These can be referred to directly by type sets, which we
> - * cannot modify while code which depends on these type sets is active.

Good to see this go.

::: js/src/jit/Ion.cpp
@@ +2541,1 @@
> // If this script has Ion code on the stack, invalidation() will return

Pre-existing: s/invalidation()/invalidated()

::: js/src/jsinfer.cpp
@@ +4266,4 @@
> /*
> * Sweep analysis information and everything depending on it from the
> * compartment, including all remaining mjit code if inference is
> * enabled in the compartment.

This comment could use an update.

Attachment #8338909 - Flags: review?(jdemooij) → review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 5

•

11 years ago

Comment on attachment 8338909
patch

Review of attachment 8338909:
-----------------------------------------------------------------

Thanks very much! Have you done measurements about the cost of the sweeping? We really need to do that before landing. I think we should run Gregor's MemBench test (http://gregor-wagner.com/tmp/mem) and record the time for PHASE_DISCARD_ANALYSIS in each slice, looking at how the distribution changes. Once you're ready to push this to try, can you post a link? I don't mind running the measurements.

Please update the comment here:
http://mxr.mozilla.org/mozilla-central/source/js/src/jsinfer.h#275

Also here?
http://mxr.mozilla.org/mozilla-central/source/js/src/jsinfer.h#859

::: js/src/jsgc.cpp
@@ +2837,4 @@
> ShouldPreserveJITCode(JSCompartment *comp, int64_t currentTime)
> {
> JSRuntime *rt = comp->runtimeFromMainThread();
> if (rt->gcShouldCleanUpEverything || !comp->zone()->types.inferenceEnabled)

As long as you're touching this code, what's the purpose for the !comp->zone()->types.inferenceEnabled check here?

::: js/src/jsinfer.cpp
@@ +922,5 @@
> + cx->zone()->types.addPendingRecompile(cx, script_);
> + }
> +
> + TypeConstraint *sweep(TypeZone &zone)
> + {

Brace should go on previous line.

@@ +1031,5 @@
> ? property.maybeTypes()->isSubset(expected)
> : property.maybeTypes()->empty();
> }
> +
> + bool shouldSweep() { return false; }

I can't help but notice that every implementation of shouldSweep in the type constraints returns false. I guess they provide a nice place to comment on why we don't need to worry about sweeping, so it's probably okay.

@@ +1435,1 @@
> {}

Braces should be moved up a line.

@@ +3120,5 @@
>
> void newType(JSContext *cx, TypeSet *source, Type type) {}
> +
> + TypeConstraint *sweep(TypeZone &zone)
> + {

Brace should go on previous line.

@@ +3176,5 @@
> object->clearAddendum(cx);
> }
> +
> + TypeConstraint *sweep(TypeZone &zone)
> + {

Brace should go on previous line.

::: js/src/jsinfer.h
@@ +1302,5 @@
> bool pendingInvalidation_ : 1;
>
> + // During sweeping, the list of compiler outputs is compacted and invalidated
> + // outputs are removed. This gives the new index for a valid compiler output.
> + uint32_t sweepIndex_ : 29;

Can you put some code somewhere so that we MOZ_CRASH if we ever exceed 29 bits?

@@ +1428,5 @@
> /* Pool for type information in this zone. */
> static const size_t TYPE_LIFO_ALLOC_PRIMARY_CHUNK_SIZE = 8 * 1024;
> js::LifoAlloc typeLifoAlloc;
>
> + /* Valid & Invalid script referenced by type constraints. */

Could you improve this comment?
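The request for a crash when the index exceeds 29 bits could be sketched roughly as follows. This is a standalone illustration, not the code that landed: `CompilerOutput` here is a simplified stand-in with an invented `scriptId` payload, `compactOutputs` and `kMaxSweepIndex` are hypothetical names, and `std::abort()` stands in for `MOZ_CRASH`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Largest value that fits in a 29-bit bitfield.
constexpr uint32_t kMaxSweepIndex = (1u << 29) - 1;

// Simplified stand-in for a record describing one JIT compilation.
struct CompilerOutput {
    int scriptId;              // illustrative payload only
    bool valid : 1;            // false once the compilation has been invalidated
    uint32_t sweepIndex : 29;  // new index of this output after compaction
};

// Build a compacted copy of `old` that drops invalidated outputs. Each valid
// entry in `old` records its new position in sweepIndex, so that anything
// still holding an old index could be remapped afterwards.
std::vector<CompilerOutput> compactOutputs(std::vector<CompilerOutput>& old) {
    std::vector<CompilerOutput> compacted;
    for (CompilerOutput& co : old) {
        if (!co.valid)
            continue;
        std::size_t newIndex = compacted.size();
        if (newIndex > kMaxSweepIndex)
            std::abort();  // stand-in for MOZ_CRASH: index would not fit in 29 bits
        co.sweepIndex = static_cast<uint32_t>(newIndex);
        compacted.push_back(co);
    }
    assert(compacted.size() <= old.size());
    return compacted;
}
```

Checking before the store matters because assigning an out-of-range value to the bitfield would silently truncate it and leave later lookups pointing at the wrong output.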

Attachment #8338909 - Flags: review?(wmccloskey) → review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 6

•

11 years ago

It just occurred to me that we can eliminate the code from bug 755604, either here or as a follow-up.

Brian Hackett [Laid off!]

Assignee

Comment 7

•

11 years ago

(In reply to Bill McCloskey (:billm) from comment #5)
> Thanks very much! Have you done measurements about the cost of the sweeping?
> We really need to do that before landing. I think we should run Gregor's
> MemBench test (http://gregor-wagner.com/tmp/mem) and record the time for
> PHASE_DISCARD_ANALYSIS in each slice, looking at how the distribution
> changes. Once you're ready to push this to try, can you post a link? I don't
> mind running the measurements.

This try run is looking pretty good. Can you do the measurements? Thanks!

https://tbpl.mozilla.org/?tree=Try&rev=3fffb7eca297

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 8

•

11 years ago

I measured and it looks pretty much like you would expect. Times get a little worse, but it's not too bad. Here are the distributions for PHASE_DISCARD_ANALYSIS:

without this patch: <5:53 <10:11 <15:8 <20:5 <25:0 <30:0 <35:1 <40:0
with this patch: <5:41 <10:14 <15:6 <20:7 <25:1 <30:1 <35:0 <40:2

It's laid out as "<TIME:COUNT", so "<5:41" means that 41 slices spent <5ms in PHASE_DISCARD_ANALYSIS. There are a few more slices on the high end with the patch, but it doesn't seem too bad. There's some natural variation between runs, so some of this could be noise anyway. The important thing is that there isn't a huge spike that was missing before.

I also looked at total slice times to make sure we're not blowing up somewhere else.

without this patch: <12:269 <20:17 <30:19 <42:55 <50:10 <60:1 <80:0 <100:1 >=100:3
with this patch: <12:210 <20:23 <30:19 <42:47 <50:10 <60:2 <80:1 <100:0 >=100:5

Again, looks similar but maybe a little worse. I looked specifically at the extra slices in the >=100 bucket. It looks like we just hit two additional non-incremental GCs in the run with the patch, which may also explain the higher numbers for PHASE_DISCARD_ANALYSIS.

Anyway, I feel comfortable taking the patch. The cost is reasonable, and it really will help us architecturally.
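For anyone wanting to compare such distributions mechanically, the following sketch parses the "LABEL:COUNT" layout described above and sums the slice counts. The `Bucket` struct and the function names are invented for this example, and it assumes each slice is counted in exactly one bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One histogram bucket, e.g. label "<5" with count 53.
struct Bucket {
    std::string label;
    int count;
};

// Parse whitespace-separated "LABEL:COUNT" tokens such as "<5:53 <10:11".
std::vector<Bucket> parseDistribution(const std::string& text) {
    std::vector<Bucket> buckets;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {
        std::size_t colon = token.rfind(':');
        assert(colon != std::string::npos);
        buckets.push_back({token.substr(0, colon),
                           std::stoi(token.substr(colon + 1))});
    }
    return buckets;
}

// Total number of slices recorded across all buckets.
int totalSlices(const std::vector<Bucket>& buckets) {
    int total = 0;
    for (const Bucket& b : buckets)
        total += b.count;
    return total;
}
```

Applied to the PHASE_DISCARD_ANALYSIS numbers above, this gives 78 slices without the patch and 72 with it, so the share of slices under 5 ms goes from roughly 68% (53 of 78) to roughly 57% (41 of 72) under that one-bucket-per-slice assumption.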

Jan de Mooij [:jandem]

Updated

•

11 years ago

Keywords: perf

Brian Hackett [Laid off!]

Assignee

Comment 9

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b93e7a83b308

Ryan VanderMeulen [:RyanVM]

Comment 10

•

11 years ago

Backed out for Linux mochitest-bc leaks. https://hg.mozilla.org/integration/mozilla-inbound/rev/31e30e41f853 https://tbpl.mozilla.org/php/getParsedLog.php?id=31822295&tree=Mozilla-Inbound

Brian Hackett [Laid off!]

Assignee

Comment 11

•

11 years ago

Restoring the test which causes us to throw away jitcode in non-inference-enabled compartments on every GC seems to make this nonsense orange go away. https://hg.mozilla.org/integration/mozilla-inbound/rev/a29d7f5ddde6 https://tbpl.mozilla.org/?tree=Try&rev=83999c07630e

Ryan VanderMeulen [:RyanVM]

Comment 12

•

11 years ago

Backed out for ggc asserts. https://hg.mozilla.org/integration/mozilla-inbound/rev/afa4bcb0150e https://hg.mozilla.org/integration/mozilla-inbound/rev/afa4bcb0150e

Ryan VanderMeulen [:RyanVM]

Comment 13

•

11 years ago

Whoops, bad log link there. https://tbpl.mozilla.org/php/getParsedLog.php?id=31879625&tree=Mozilla-Inbound Also, looks like this had B2G mochitest-2 permafail. https://tbpl.mozilla.org/php/getParsedLog.php?id=31884927&tree=Mozilla-Inbound

Brian Hackett [Laid off!]

Assignee

Comment 14

•

11 years ago

The assertion failure is actually a bug in bug 939614 exposed by an assertion added in this patch. The mochitest-2 failure didn't happen in my Try run (https://tbpl.mozilla.org/?tree=Try&rev=e2bfcd435ff6) so it might be related to bug 939614 as well, but since you backed out both bugs at once we won't know I guess until this relands.

Ryan VanderMeulen [:RyanVM]

Comment 15

•

11 years ago

There were green M2 runs after bug 939614 landed until this patch landed.

Brian Hackett [Laid off!]

Assignee

Comment 16

•

11 years ago

Bug 939614 had several extant bugs when it landed which may have interacted with this patch. As I said earlier, the mochitest-2 failure didn't happen on Try.

Brian Hackett [Laid off!]

Assignee

Comment 17

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/2a1e0a4f8d72

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 18

•

11 years ago

Backed out again in https://hg.mozilla.org/integration/mozilla-inbound/rev/ad1ad5c34405 because the b2g m2 orange returned after this landed.

Brian Hackett [Laid off!]

Assignee

Comment 19

•

11 years ago

Well, this is still not showing up on try: https://tbpl.mozilla.org/?tree=Try&rev=690be4075355 Does anyone have any ideas on how to deal with this? I'm leaning towards WONTFIX.

Ryan VanderMeulen [:RyanVM]

Comment 20

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/2e5ff5614254 Try looks good. Let's try relanding this with a clobber and see if that makes a difference.

Ryan VanderMeulen [:RyanVM]

Comment 21

•

11 years ago

Green!

Brian Hackett [Laid off!]

Assignee

Comment 22

•

11 years ago

Thanks Ryan!

Phil Ringnalda (:philor)

Comment 23

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/2e5ff5614254

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla29

Brian Hackett [Laid off!]

Assignee

Updated

•

11 years ago

Blocks: 984537