GGC regressed performance on dom-attr

RESOLVED FIXED in mozilla32

Status

defect
RESOLVED FIXED
Opened 6 years ago; closed 5 years ago

People

(Reporter: terrence, Assigned: terrence)

Tracking

(Blocks 1 bug)

Version: unspecified
Target Milestone: mozilla32
Points: ---

Firefox Tracking Flags

(firefox31-)

Details

(Whiteboard: [talos_regression])

Attachments

(11 attachments)

47.05 KB, patch
jonco: review+
terrence: checkin+
25.00 KB, patch
jonco: review+
terrence: checkin+
9.03 KB, patch
jonco: review+
terrence: checkin+
6.26 KB, patch
jonco: review+
terrence: checkin+
25.29 KB, patch
jonco: review+
terrence: checkin+
26.46 KB, patch
jonco: review+
terrence: checkin+
22.11 KB, patch
jonco: review+
terrence: checkin+
35.24 KB, patch
jonco: review+
terrence: checkin+
6.78 KB, patch
jonco: review+
terrence: checkin+
17.03 KB, patch
jonco: review+
984 bytes, patch
jonco: review+
test( "element.expando = value", function(){
	for ( var i = 0; i < num; i++ )
		a["test" + num] = function(){};
});
Blocks: 990085
Whiteboard: [talos_regression]
This repros in a linux64 shell build.
At 1M iterations, we go from 200ms to 290ms. GGC is repeatedly GCing because of FULL_STORE_BUFFER on the slots buffer. Although each minor GC is only 50us, the total minor GC time is 74ms, explaining the majority of the regression.
If we fix the FULL_STORE_BUFFER GCs through some form of deduplication, we shuffle almost all of this work into the dedup, gaining at most 4-8ms on the total. I have a feeling that if we make the store buffer compaction faster but sloppier, we'll end up translating the overhead into extra marking. Still worth a try, though.
Checking in here: as we are <2 weeks away from the uplift of this to Aurora, do we have a plan in place for the performance regressions?
Flags: needinfo?(terrence)
Bug 990336 and bug 988950 should address the performance gap on the affected micro-benchmarks.
Flags: needinfo?(terrence)
Assignee: nobody → terrence
Depends on: 988950, 990336
Also bug 995442, maybe some bug 984101 too, and probably others I'm not thinking of. I expect us to be in a good position to release by end-of-week, if not sooner.
Depends on: 995442, 984101
Allowing normal deduplication of the slots buffer worked, in a sense. We now do only 2 minor GCs here, one of them for DestroyContext. Surprisingly, we're still quite a bit slower. On the other hand, in the meantime, Jan optimized Lambda creation, so everything is faster. The new numbers for 1M iterations are: 150ms before and 260ms after, roughly the same ratio.

If we switch the test to create a bare object instead of a function, we are actually a bit faster with ggc, so it is something specific to functions. The big obvious difference is that we're not allocating these in the nursery -- they're captured by the script currently, so it wouldn't matter anyway. Looking at cachegrind output, both scripts take roughly the same amount of time in the jit. The object version spends a significant portion of its time in minor collection, while the function version spends no time there. Neither version has significant data in the store buffer. There is no one heavy user: the profile information is spread out all over the run. There is no major source of overhead outside the jit other than atomization, which is unchanged with ggc and is the same on both versions of the test.

Also note that the testcase in bug 945555 is almost identical to this test and its performance is roughly 3x faster with ggc.
Depends on: 945555
If we run these as --baseline-eager --no-ion, the difference is from 330ms -> 360ms, so the ggc difference is all in ion.
(In reply to Terrence Cole [:terrence] from comment #8)
> If we run these as --baseline-eager --no-ion, the difference is from 330ms
> -> 360ms, so the ggc difference is all in ion.

The interpreter/Baseline still allocate lambdas tenured, can we fix that to match Ion (or fix Ion)?
The MIR that's generated before and after is identical because we are hitting the CallSetElement fallback. This enters the interpreter and after checking a surprisingly large number of special cases, it takes the common path, at long last ending up in setSlot, doing getSlotRef().set(). The only difference between the two binaries is the execution of this post barrier. We know from gc dumps that this post barrier is not causing any pathological algorithmic behavior. This basically eliminates everything but codegen.

Looking at clang's -O3 codegen: mostly miracles, all the way down. The bottom of the stack is identical with these 5 frames, all the intermediates and parameter juggling handily removed:

#1  js::NativeSet<(js::ExecutionMode)0>
#2  js::baseops::SetPropertyHelper<(js::ExecutionMode)0>
#3  setGeneric
#4  SetObjectElementOperation
#5  js::SetObjectElement

In the non-ggc case, clang does a callq to ObjectImpl::setSlot, which inlines HeapSlot::set, the pre-barrier, and the actual set -- a surprising amount of code, even if it's mostly jumped over normally. In the ggc case, clang instead inlines ObjectImpl::setSlot and does a callq to HeapSlot::set, which then inlines both the pre- and post-barrier.

First, this makes the code in NativeSet a bit less elegant; HeapSlot::set takes two more parameters than ObjectImpl::setSlot. Then, the actual set is just quite long: 365 bytes of code. This is actually large enough to be reflected in the executed instruction counts:

==27287== I   refs:      3,104,099,224
==27287== D   refs:      1,370,402,602  (854,508,317 rd   + 515,894,285 wr)

==27325== I   refs:      3,335,606,115
==27325== D   refs:      1,473,968,889  (930,717,225 rd   + 543,251,664 wr)

This isn't really enough instructions to account for a 30% slowdown. When we include the I1 misses and LL refs, however, the picture gets much clearer.

==27287== I1  misses:       18,790,554
==27287== LL refs:          22,482,207  ( 20,687,583 rd   +   1,794,624 wr)

==27325== I1  misses:       46,294,016
==27325== LL refs:          50,043,843  ( 48,231,231 rd   +   1,812,612 wr)

Even though the absolute numbers here are small in comparison to the totals, these are all going to be high-latency.
(In reply to Terrence Cole [:terrence] from comment #7)
> Also note that the testcase in bug 945555 is almost identical to this test
> and its performance is roughly 3x faster with ggc.

I'd guess the testcase there isn't falling out of Ion like we are here.
Okay, so the question is what to do about it. The bare logical minimum we need is:

if (zone->needsPreBarriers())
    MarkValue(trc, value, "pre-barrier");
value = newValue;
if (value.isObject() && !IsInsideNursery(owner) && IsInsideNursery(&value.toObject()))
    rt->storeBuffer.slotsBuffer[last++] = SlotRef(owner, offset, kind);

Unfortunately, this simple concept is burdened with a ton of other complicating crap that's been slowly getting piled on. What it actually looks like in practice is more like:

if (!(value.isString() && StringIsPermanentAtom(value.toString())) &&
    zone->needsBarrier())
{
    Value tmp(value);
    js::gc::MarkValueUnbarriered(shadowZone->barrierTracer(), &tmp, "write barrier");
}
value = newValue;
if (value.isObject() &&
    storeBuffer.isEnabled() &&
    CurrentThreadCanAccessRuntime(value.toObject().chunk()->info.trailer.runtime) &&
    !IsInsideNursery(owner) && IsInsideNursery(&value.toObject()))
{
    SlotRef *t = rt->storeBuffer.slotsBuffer.alloc();
    *t = SlotRef(owner, offset, kind);
}
https://tbpl.mozilla.org/?tree=Try&rev=efeb048d445e

This is part 1 of several. The end goal is to unify the heap Ptr/Value/Id classes using the same mechanism used by Rooted/Handle. This should be a tremendous reduction in SLOC and hopefully much simpler as well once I'm done.
Attachment #8412752 - Flags: review?(jcoppeard)
Comment on attachment 8412752 [details] [diff] [review]
rebase_heapptr_to_ptr-v0.diff

Review of attachment 8412752 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +292,5 @@
> +template <typename T>
> +struct InternalGCMethods {};
> +
> +template <typename T>
> +struct InternalGCMethods<T *>

It would be nice to share these eventually with the GCMethods in RootingAPI.h; however, for the first pass I wanted to maintain the existing inlining behavior, which cannot be done from outside the API currently.

@@ +306,4 @@
>  /*
>   * Base class for barriered pointer types.
>   */
> +template <class T>

The Union bits appear to be completely unused now. Nothing was required other than deleting the code.

@@ +605,5 @@
> +typedef HeapPtr<BaseShape*> HeapPtrBaseShape;
> +typedef HeapPtr<UnownedBaseShape*> HeapPtrUnownedBaseShape;
> +typedef HeapPtr<types::TypeObject*> HeapPtrTypeObject;
> +typedef HeapPtr<types::TypeObjectAddendum*> HeapPtrTypeObjectAddendum;
> +typedef HeapPtr<jit::JitCode*> HeapPtrJitCode;

This was a nice opportunity to unify our style and switch absolutely everything over to typedefed names.

::: js/src/gc/Marking.h
@@ +11,5 @@
>  
>  class JSAtom;
>  class JSLinearString;
>  
> +

Sorry, did not notice this. Will remove it.

@@ +54,2 @@
>   *     This function should be used for marking JSObjects, in preference to all
> + *     others below. Use it when you have HeapPtr<JSObject*>, which

I'll change these names to HeapPtrObject.

@@ +102,1 @@
>  type *Update##base##IfRelocated(JSRuntime *rt, type **thingp);

This was a bit finicky to get right. Thankfully, we're finally getting close to having a fully typed API, which should let us templatize this mess.

@@ -243,5 @@
> - * The unioned HeapPtr stored in script->globalObj needs special treatment to
> - * typecheck correctly.
> - */
> -void
> -MarkObject(JSTracer *trc, HeapPtr<GlobalObject, JSScript *> *thingp, const char *name);

Our one and only "user" of the unioned bits. Fortunately, unused.

@@ +283,5 @@
>      MarkScript(trc, o, name);
>  }
>  
>  inline void
> +Mark(JSTracer *trc, HeapPtr<jit::JitCode*> *code, const char *name)

Not sure why I didn't use the typedef here. Fixed now.
https://tbpl.mozilla.org/?tree=Try&rev=237e835c870a

This uses the new BarrieredPtr to implement BarrieredValue and subclasses. It removes 222 lines of code. Id coming up next.
Attachment #8412921 - Flags: review?(jcoppeard)
Comment on attachment 8412921 [details] [diff] [review]
rebase_barrieredvalue_on_barrieredptr-v0.diff

Review of attachment 8412921 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/Value.h
@@ +1609,5 @@
>      JSString *toString() const { return value()->toString(); }
>      JSObject &toObject() const { return value()->toObject(); }
>      JSObject *toObjectOrNull() const { return value()->toObjectOrNull(); }
>      void *toGCThing() const { return value()->toGCThing(); }
> +    uint64_t asRawBits() const { return value()->asRawBits(); }

Was not exposed before. No reason not to, since it's just .get().asRawBits() otherwise.

::: js/src/gc/Barrier.cpp
@@ -22,5 @@
> -        return true;
> -
> -    return ZoneOfValue(value) == zone ||
> -           zone->runtimeFromAnyThread()->isAtomsZone(ZoneOfValue(value));
> -}

The zone variant of |set| was never actually called, so we're not actually losing an assertion.

::: js/src/gc/Barrier.h
@@ +365,5 @@
> +        JSRuntime *rt = static_cast<js::gc::Cell *>(vp->toGCThing())->runtimeFromAnyThread();
> +        JS::shadow::Runtime *shadowRuntime = JS::shadow::Runtime::asShadowRuntime(rt);
> +        shadowRuntime->gcStoreBufferPtr()->removeRelocatableValue(vp);
> +#endif
> +    }

I did not attempt to make any improvements to the way we treat these. I plan to follow with optimization and hopefully simplification.

@@ +372,5 @@
>  /*
>   * Base class for barriered pointer types.
>   */
>  template <class T>
> +class BarrieredPtr : public HeapBase<T>

I'm deriving HeapBase<T> in order to pull in ValueOperations. I think it makes sense to re-use the public Heap infrastructure for this. At least, I don't see any obvious foot-guns and the Try run is green.

@@ +387,5 @@
>          this->value = v;
>      }
>  
>      /* Use this if the automatic coercion to T isn't working. */
> +    const T &get() const { return value; }

C++ was having trouble using the return as a |const Value&| for some reason. There may have been a confounding definition in an intermediate state or something. In any case this should be strictly faster on 32bit, so I decided to keep it.

@@ +398,5 @@
>      void unsafeSet(T v) { value = v; }
>  
>      T operator->() const { return value; }
>  
> +    operator const T &() const { return value; }

Ditto.

@@ +598,4 @@
>              relocate();
>              this->value = v;
> +        } else {
> +            this->value = v;

The pointer case is either null or non-null, but with value we need the else. I didn't think it was worth adding new infrastructure just to assert we don't hit this case for pointers when the usage is pretty clear from construction.

@@ +605,5 @@
>      }
>  
>    protected:
> +    static bool isMarkable(const T &v) {
> +        return InternalGCMethods<T>::isMarkable(v);

I added this static method solely to save on typing above.

@@ +760,5 @@
>          post(rt, owner, kind, slot, v);
>      }
>  
> +    //operator Value() const { return this->value; }
> +    //operator const Value &() const { return this->ptr; }

D'oh! Will remove this hunk.

@@ +786,5 @@
>          value = v;
>          post(shadowZone->runtimeFromAnyThread(), owner, kind, slot, v);
>      }
>  
> +    static void writeBarrierPost(JSObject *obj, Kind kind, uint32_t slot, const Value &target)

Caught this when reading. Totally not critical to this particular patch, just sneaking this and the next in since it's arguably related.

::: js/src/jsfun.h
@@ +216,5 @@
>      }
>  
>      void setGuessedAtom(JSAtom *atom) {
> +        JS_ASSERT(!atom_);
> +        JS_ASSERT(atom);

I was hoping to get away without operator== and operator!= entirely, so that C++ would enforce this particular style guideline. Unfortunately, I had to add them back in the next patch for HeapId, which we actually do compare to raw jsid in quite a few places.

::: js/src/jsiter.cpp
@@ +1756,5 @@
>               * write barriers here.
>               */
>              HeapValue::writeBarrierPre(gen->regs.sp[-1]);
>              gen->regs.sp[-1] = arg;
> +            HeapValue::writeBarrierPost(gen->regs.sp[-1], &gen->regs.sp[-1]);

The runtime saves us a few instructions, but this path isn't hot and it's the only user.
This saves about 100 lines. I think as a followup we should MOZ_CRASH at JSID_IS_OBJECT so that we're not carrying around probably-rotted code. I'd remove it entirely, but there is discussion of re-using it for Symbols at some point.
Attachment #8412930 - Flags: review?(jcoppeard)
This removes the * from FixedHeapPtr to make it match the others and renames it to ImmutableTenuredPtr. There is only the one PropertyName user, which I have called ImmutablePropertyNamePtr. I think this is a fair balance between clarity and length, but disagreements are more than welcome.
Attachment #8412933 - Flags: review?(jcoppeard)
(In reply to Terrence Cole [:terrence] from comment #16)
> @@ +605,5 @@
> >      }
> >  
> >    protected:
> > +    static bool isMarkable(const T &v) {
> > +        return InternalGCMethods<T>::isMarkable(v);
> 
> I added this static method solely to save on typing above.

Could this be

  using InternalGCMethods<T>::isMarkable;

? Maybe it's not a good idea even if it works; too unusual.

(In reply to Terrence Cole [:terrence] from comment #18)
> This removes the * from FixedHeapPtr to make it match the others and renames
> it to ImmutableTenuredPtr. There is only the one PropertyName user, which I
> have called ImmutablePropertyNamePtr. I think this is a fair balance between
> clarity and length, but disagreements are more than welcome.

I think I may have suggested ImmutableTenuredPtr, but I do wonder if the Tenured part should be in there -- what happens when we have compacting GC? Will we even still have these?
EncapsulatedPtr is a terribly unhelpful name in pretty much every dimension. I think we should call these PreBarriered to more accurately reflect their usage, even if it is a bit jargony.
Attachment #8412988 - Flags: review?(jcoppeard)
Comment on attachment 8412988 [details] [diff] [review]
rename_encapsulated_to_prebarriered-v0.diff

Review of attachment 8412988 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +695,5 @@
>  typedef BarrieredPtr<JSScript*> BarrieredPtrScript;
>  
> +typedef PreBarriered<JSObject*> PreBarrieredObject;
> +typedef PreBarriered<JSScript*> PreBarrieredScript;
> +typedef PreBarriered<jit::JitCode*> PreBarrieredJitCode;

This consciously removes Ptr to closer match Rooted.
This gives the same treatment to ReadBarriered: move the * outside the template and merge with Value.
Attachment #8413905 - Flags: review?(jcoppeard)
Comment on attachment 8413905 [details] [diff] [review]
deref_readbarriered_and_merge_value-v0.diff

Review of attachment 8413905 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier-inl.h
@@ -18,5 @@
> -        JSObject::readBarrier(&value.toObject());
> -    else if (value.isString())
> -        JSString::readBarrier(value.toString());
> -    else
> -        JS_ASSERT(!value.isMarkable());

The only code using this is the cross-compartment wrapper map, which only has objects and strings. These branches are always going to test true, and the subsequent mark is out-of-line, so there's really no point inlining this method.

@@ -26,5 @@
> -
> -inline
> -ReadBarrieredValue::operator const Value &() const
> -{
> -    return get();

Since this is just forwarding, I have no idea why it couldn't just be in the header.

@@ -32,5 @@
> -
> -inline JSObject &
> -ReadBarrieredValue::toObject() const
> -{
> -    return get().toObject();

There is only one caller, so I just moved the .get() there so we can kill this file.

::: js/src/gc/Barrier.h
@@ -727,5 @@
> -typedef PreBarriered<jsid> PreBarrieredId;
> -typedef RelocatablePtr<jsid> RelocatableId;
> -typedef HeapPtr<jsid> HeapId;
> -
> -typedef ImmutableTenuredPtr<PropertyName*> ImmutablePropertyNamePtr;

Just moving these below ReadBarriered so that we can add them to the list. Also sorting the individual sections.
BarrieredPtr is now a misnomer.
Attachment #8413982 - Flags: review?(jcoppeard)
Comment on attachment 8413982 [details] [diff] [review]
rename_barrieredptr_to_barrieredbase-v0.diff

Review of attachment 8413982 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ -768,5 @@
>  struct TypeObjectAddendum;
>  }
>  
> -typedef BarrieredPtr<JSObject*> BarrieredPtrObject;
> -typedef BarrieredPtr<JSScript*> BarrieredPtrScript;

Unlike the other types here, we do not want people to actually use BarrieredBase. Removing the typedefs adds another hint that it is special.

@@ -794,5 @@
>  typedef HeapPtr<jit::JitCode*> HeapPtrJitCode;
>  typedef HeapPtr<types::TypeObject*> HeapPtrTypeObject;
>  typedef HeapPtr<types::TypeObjectAddendum*> HeapPtrTypeObjectAddendum;
>  
> -typedef BarrieredPtr<Value> BarrieredValue;

And this was just wrong: BarrieredValue does not actually provide post-barriers.
I looked at the results for linux and dom-attr doesn't appear to be fixed -- what specific platforms were regressed here?
Bug 619558 comment 36 says the regression was across the board.
Comment on attachment 8412752 [details] [diff] [review]
rebase_heapptr_to_ptr-v0.diff

Review of attachment 8412752 [details] [diff] [review]:
-----------------------------------------------------------------

Glad to see the Unioned stuff go, I always wondered why we needed that.
Attachment #8412752 - Flags: review?(jcoppeard) → review+
Comment on attachment 8412921 [details] [diff] [review]
rebase_barrieredvalue_on_barrieredptr-v0.diff

Review of attachment 8412921 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +297,5 @@
>  
>  template <typename T>
>  struct InternalGCMethods<T *>
>  {
> +    static bool isMarkable(T *v) { return bool(v); }

I would prefer |v != nullptr| here.
Attachment #8412921 - Flags: review?(jcoppeard) → review+
Attachment #8412930 - Flags: review?(jcoppeard) → review+
Comment on attachment 8412933 [details] [diff] [review]
rename_and_ref_fixedptr-v0.diff

Review of attachment 8412933 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +573,1 @@
>          value = ptr;

Can we assert that the pointer is tenured here if it's not too inconvenient?
Attachment #8412933 - Flags: review?(jcoppeard) → review+
Attachment #8412988 - Flags: review?(jcoppeard) → review+
Attachment #8413905 - Flags: review?(jcoppeard) → review+
Attachment #8413982 - Flags: review?(jcoppeard) → review+
https://tbpl.mozilla.org/?tree=Try&rev=1896d4d12d6f

Putting the store buffer pointer in the trailer turns out to be pure awesome-sauce. Will make comments inline.
Attachment #8415393 - Flags: review?(jcoppeard)
Attachment #8412752 - Flags: checkin+
Attachment #8412921 - Flags: checkin+
Attachment #8412930 - Flags: checkin+
Attachment #8412933 - Flags: checkin+
Attachment #8412988 - Flags: checkin+
Attachment #8413905 - Flags: checkin+
Attachment #8413982 - Flags: checkin+
Comment on attachment 8415393 [details] [diff] [review]
storebuffer_in_trailer-v0.diff

Review of attachment 8415393 [details] [diff] [review]:
-----------------------------------------------------------------

Cool, it seems this combines the runtime check with checking whether the object is in the nursery.

::: js/src/gc/Barrier.h
@@ +357,4 @@
>      static void postBarrier(Value *vp) {
>  #ifdef JSGC_GENERATIONAL
> +        if (vp->isObject()) {
> +            gc::StoreBuffer *sb = reinterpret_cast<gc::Cell *>(&vp->toObject())->storeBuffer();

I think you don't need to cast here because JSObject ultimately derives from Cell.

@@ +366,5 @@
>  
>      static void postBarrierRelocate(Value *vp) {
>  #ifdef JSGC_GENERATIONAL
> +        if (vp->isObject()) {
> +            gc::StoreBuffer *sb = reinterpret_cast<gc::Cell *>(&vp->toObject())->storeBuffer();

Ditto above.

::: js/src/gc/Heap.h
@@ +623,4 @@
>  #endif
>  
>      JSRuntime       *runtime;
>  };

The padding looks a bit strange.  If we have it in the same place on both architectures we won't need the #ifdefs:

  uint32_t location;
  uint32_t padding;
  StoreBuffer *storeBuffer;
  JSRuntime *runtime;
Attachment #8415393 - Flags: review?(jcoppeard) → review+
Comment on attachment 8415393 [details] [diff] [review]
storebuffer_in_trailer-v0.diff

Review of attachment 8415393 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +232,5 @@
>          }
>  #endif
>      }
>  
> +    static void writeBarrierPost(T *thing, gc::Cell **addr) {}

We need to cast in quite a few more places now, particularly headers that don't understand that Object derives from Cell, but I think it makes the intention clearer. I'd appreciate your thoughts once you've seen the changes that were required.

@@ +358,5 @@
>  #ifdef JSGC_GENERATIONAL
> +        if (vp->isObject()) {
> +            gc::StoreBuffer *sb = reinterpret_cast<gc::Cell *>(&vp->toObject())->storeBuffer();
> +            if (sb)
> +                sb->putValueFromAnyThread(vp);

This saves the indirection through the runtime, which is nice. The real win, though, is that the null check here is actually checking whether the target is in the nursery. This filters not-in-remembered-set edges faster /and/ means that we can remove |!nursery.isInside(deref(v))| from maybeInRememberedSet: a mask, three compares, etc. This is probably the biggest win from this patch.

@@ +368,5 @@
>  #ifdef JSGC_GENERATIONAL
> +        if (vp->isObject()) {
> +            gc::StoreBuffer *sb = reinterpret_cast<gc::Cell *>(&vp->toObject())->storeBuffer();
> +            if (sb)
> +                sb->putRelocatableValueFromAnyThread(vp);

Relocate works on the post value again.

@@ +377,5 @@
>      static void postBarrierRemove(Value *vp) {
>  #ifdef JSGC_GENERATIONAL
>          JSRuntime *rt = static_cast<js::gc::Cell *>(vp->toGCThing())->runtimeFromAnyThread();
>          JS::shadow::Runtime *shadowRuntime = JS::shadow::Runtime::asShadowRuntime(rt);
> +        shadowRuntime->gcStoreBufferPtr()->removeRelocatableValueFromAnyThread(vp);

Now that I'm thinking about it, removal should work on prior value, so it should be safe to do the same optimization here. This doesn't check maybeInRememberedSet however, so it's not a huge perf concern.

@@ +863,4 @@
>  #ifdef DEBUG
>      bool preconditionForSet(JSObject *owner, Kind kind, uint32_t slot);
>      bool preconditionForSet(Zone *zone, JSObject *owner, Kind kind, uint32_t slot);
> +    bool preconditionForWriteBarrierPost(JSObject *obj, Kind kind, uint32_t slot, Value target) const;

I have no idea why this was implemented differently from the two above.

@@ +881,2 @@
>          value = v;
> +        post(owner, kind, slot, v);

The runtime now buys us absolutely nothing, so I've ripped out all the runtime-taking paths.

::: js/src/gc/Nursery.h
@@ +202,5 @@
>      }
>  
>      MOZ_ALWAYS_INLINE void initChunk(int chunkno) {
>          NurseryChunkLayout &c = chunk(chunkno);
> +        c.trailer.storeBuffer = JS::shadow::Runtime::asShadowRuntime(runtime())->gcStoreBufferPtr();

We still need gcStoreBufferPtr() in a few places, so moving this out-of-line wouldn't allow us to remove it yet.

::: js/src/gc/StoreBuffer.cpp
@@ -352,5 @@
>  JS_PUBLIC_API(void)
>  JS::HeapValuePostBarrier(JS::Value *valuep)
>  {
>      JS_ASSERT(valuep->isMarkable());
> -    if (valuep->isString() && StringIsPermanentAtom(valuep->toString()))

This is totally superfluous since we only want barriers for things pointing at objects.

@@ +365,5 @@
>  JS::HeapValueRelocate(JS::Value *valuep)
>  {
>      /* Called with old contents of *valuep before overwriting. */
>      JS_ASSERT(valuep->isMarkable());
> +    if (valuep->isString() && valuep->toString()->isPermanentAtom())

No need to call the out-of-line StringIsPermanentAtom from a cpp file. Whoops.

::: js/src/gc/StoreBuffer.h
@@ +244,5 @@
>          bool operator!=(const CellPtrEdge &other) const { return edge != other.edge; }
>  
>          bool maybeInRememberedSet(const Nursery &nursery) const {
> +            JS_ASSERT(nursery.isInside(*edge));
> +            return !nursery.isInside(edge);

Sadly we don't know if the edge here is inside of the GC heap, so we can't safely check the chunk trailer.

@@ +336,5 @@
>  
>          bool operator==(const WholeCellEdges &other) const { return edge == other.edge; }
>          bool operator!=(const WholeCellEdges &other) const { return edge != other.edge; }
>  
> +        bool maybeInRememberedSet(const Nursery &nursery) const {return true; }

O_o  Will fix.

@@ -362,5 @@
>          void *data;
>      };
>  
> -    template <typename Edge>
> -    bool isOkayToUseBuffer(const Edge &edge) const {

This used to call maybeInRememberedSet, but it had to be moved after the reentrancy guard for reasons I don't remember anymore. I guess I forgot to fix the signature when I made that change.

@@ +378,2 @@
>           */
>          if (!CurrentThreadCanAccessRuntime(runtime_))

This is the elephant in the room. This unconditionally calls PR_CurrentThread(). Not only is this out of line, it's probably quite slow in general. The next step is going to be implementing putFromMainThread, which will assert this instead of testing it.

::: js/src/vm/ObjectImpl.h
@@ +970,4 @@
>  #ifdef JSGC_GENERATIONAL
> +    gc::StoreBuffer *storeBuffer = obj->storeBuffer();
> +    if (storeBuffer)
> +        storeBuffer->putRelocatableCellFromAnyThread(cellp);

Gah, I really shouldn't make last-minute changes. Will fix.
Attachment #8415393 - Flags: review+ → review?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #35)
> Comment on attachment 8415393 [details] [diff] [review]
> storebuffer_in_trailer-v0.diff
> 
> Review of attachment 8415393 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Cool, it seems this combines the runtime check with the checking whether the
> object is in the nursery.
> 
> ::: js/src/gc/Barrier.h
> @@ +357,4 @@
> >      static void postBarrier(Value *vp) {
> >  #ifdef JSGC_GENERATIONAL
> > +        if (vp->isObject()) {
> > +            gc::StoreBuffer *sb = reinterpret_cast<gc::Cell *>(&vp->toObject())->storeBuffer();
> 
> I think you don't need to cast here because JSObject ultimately derives from
> Cell.

Sadly this isn't the case. We need full definitions of Barrier available to define JSObject, since it embeds some HeapPtr<>. This means we only have the empty public definition of JSObject in Barrier.h. If we could express that relation without exposing the types, it would make everything in the public tracing api much simpler.
 
> ::: js/src/gc/Heap.h
> @@ +623,4 @@
> >  #endif
> >  
> >      JSRuntime       *runtime;
> >  };
> 
> The padding looks a bit strange.  If we have it in the same place on both
> architectures we won't need the #ifdefs:
> 
>   uint32_t location;
>   uint32_t padding;
>   StoreBuffer *storeBuffer;
>   JSRuntime *runtime;

Wow, great observation! Done.
> if (!CurrentThreadCanAccessRuntime(runtime_))
This also calls InParallelSection, which uses TLS. So let's make that *catastrophically* slow.
> @@ +377,5 @@
> >      static void postBarrierRemove(Value *vp) {
> >  #ifdef JSGC_GENERATIONAL
> >          JSRuntime *rt = static_cast<js::gc::Cell *>(vp->toGCThing())->runtimeFromAnyThread();
> >          JS::shadow::Runtime *shadowRuntime = JS::shadow::Runtime::asShadowRuntime(rt);
> > +        shadowRuntime->gcStoreBufferPtr()->removeRelocatableValueFromAnyThread(vp);
> 
> Now that I'm thinking about it, removal should work on prior value, so it should be safe to do the same
> optimization here. This doesn't check maybeInRememberedSet however, so it's not a huge perf concern.

Actually, the conditions for calling relocate vs remove are isMarkable/non-null, whereas the conditions for actually inserting would be more tightly constrained. This runs into at least one problem if we do 1) ptr = Nursery; 2) ptr = Tenured; 3) ~ptr. This fails to remove the entry on destruction because the tenured can't get to the store buffer to do the removal. I was thinking we removed at (2), but since we're only checking bool(ptr), we end up just inserting a second entry for the tenured entry.

I'm just going to leave all the relocatable barrier code alone for now, as they aren't actually implicated in any perf issues yet. I think we can move the isNursery check to needsPostBarrier to fix this, but it's worth a separate issue.
Comment on attachment 8415709 [details] [diff] [review]
speed_up_some_barriers-v0.diff

Review of attachment 8415709 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/VMFunctions.cpp
@@ -713,5 @@
>  void
>  PostWriteBarrier(JSRuntime *rt, JSObject *obj)
>  {
>      JS_ASSERT(!IsInsideNursery(rt, obj));
> -    rt->gcStoreBuffer.putWholeCell(obj);

I'm surprised how little effect this had on octane. Oh well.
Attachment #8415709 - Flags: review?(jcoppeard)
This gets a small but notable speedup on the testcase in this bug. I'm clearly still missing something important.
Attachment #8415710 - Flags: review?(jcoppeard)
(In reply to Terrence Cole [:terrence] from comment #36)
> > +    static void writeBarrierPost(T *thing, gc::Cell **addr) {}
> 
> We need to cast in quite a few more places now, particularly headers that
> don't understand that Object derives from Cell, but I think it makes the
> intention clearer. I'd appreciate your thoughts once you've seen the changes
> that were required.

I like the typing, but I don't like all the extra casting :)

Can we have overloaded AsCell(JSObject *thing) style functions to do the casting for us?  Maybe if we declare it inline and only supply a definition when JSObject has been defined then we won't need the casting.

> The real win
> though is that the null check here is actually checking if the target is in
> the nursery. This filters not-in-remember set faster /and/ means that we can
> remove |!nursery.isInside(deref(v))| from maybeInRememberedSet: a mask,
> three compares, etc.

Slightly off-topic, but IsInsideNursery() does:

> return uintptr_t(p) >= runtime->gcNurseryStart_ && uintptr_t(p) < runtime->gcNurseryEnd_;

I suspect compilers may optimize this pattern already, but in case they don't we can change it to something like:

> return uintptr_t(p) - runtime->gcNurseryStart_ < Nursery::NurserySize;

Which would save a compare and branch.
Comment on attachment 8415393 [details] [diff] [review]
storebuffer_in_trailer-v0.diff

Review of attachment 8415393 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +232,5 @@
>          }
>  #endif
>      }
>  
> +    static void writeBarrierPost(T *thing, gc::Cell **addr) {}

This does make the intention clearer, but unfortunately it means we have to cast (almost?) everywhere we call it.

I experimented with making it |T **addr| and that didn't work out either.

TBH I'm not sure this is worth it.

We can add an assert that |*addr == thing| that will prevent misuse.

::: js/src/gc/StoreBuffer.cpp
@@ +365,5 @@
>  JS::HeapValueRelocate(JS::Value *valuep)
>  {
>      /* Called with old contents of *valuep before overwriting. */
>      JS_ASSERT(valuep->isMarkable());
> +    if (valuep->isString() && valuep->toString()->isPermanentAtom())

Shouldn't we have the same check as in HeapValuePostBarrier() here?

However, if we're going to do this I think we could just change GCMethods<JS::Value>::needsPostBarrier(const JS::Value &v) to check v.isObject() rather than isMarkable() and save an API call.
Attachment #8415393 - Flags: review?(jcoppeard) → review+
Comment on attachment 8415709 [details] [diff] [review]
speed_up_some_barriers-v0.diff

Review of attachment 8415709 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: js/src/gc/StoreBuffer.h
@@ +447,5 @@
>      void putCellFromAnyThread(Cell **cellp) { putFromAnyThread(bufferCell, CellPtrEdge(cellp)); }
>      void putSlotFromAnyThread(JSObject *obj, int kind, int32_t start, int32_t count) {
>          putFromAnyThread(bufferSlot, SlotsEdge(obj, kind, start, count));
>      }
> +    void putWholeCellFromAnyThread(Cell *cell) {

This doesn't appear to be used - can we get rid of it?
Attachment #8415709 - Flags: review?(jcoppeard) → review+
Comment on attachment 8415710 [details] [diff] [review]
speed_up_nativesetslot-v0.diff

Review of attachment 8415710 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, gets rid of the main thread test.

::: js/src/gc/Barrier.h
@@ +924,5 @@
> +                if (ContextFriendFields::get(cx)->isJSContext())
> +                    storeBuffer->putSlotFromMainThread(owner, kind, slot, 1);
> +                else
> +                    storeBuffer->putSlotFromAnyThread(owner, kind, slot, 1);
> +            }

So I'm a little confused now...

At the moment, only the main thread can ever touch things in the nursery.  That means if we extract a store buffer pointer from a chunk trailer, then we must be on the main thread (and we assert this in putFromMainThread).  If we're not on the main thread, we don't ever insert entries into the store buffer (but we don't seem to assert that an entry wasn't required, which I guess we could do).

This would mean that the call of storeBuffer->putSlotFromAnyThread() should not be required.  Did I misunderstand what this does?

(Also I think in future PJS is planning to make use of nurseries off the main thread, but that just means our thread assertion will have to be updated to take account of that).
(In reply to Jon Coppeard (:jonco) from comment #46)
> Comment on attachment 8415710 [details] [diff] [review]
> speed_up_nativesetslot-v0.diff
> 
> Nice, gets rid of the main thread test
> 
> ::: js/src/gc/Barrier.h
> @@ +924,5 @@
> > +                if (ContextFriendFields::get(cx)->isJSContext())
> > +                    storeBuffer->putSlotFromMainThread(owner, kind, slot, 1);
> > +                else
> > +                    storeBuffer->putSlotFromAnyThread(owner, kind, slot, 1);
> > +            }
> 
> So I'm a little confused now...
> 
> At the moment, only the main thread can ever touch things in the nursery. 
> That means if we extract a store buffer pointer from a chunk trailer, then
> we must be on the main thread (and we assert this in putFromMainThread).  If
> we're not on the main thread, we don't ever insert entries into the store
> buffer (but we don't seem to assert that an entry wasn't required, which I
> guess we could do).
> 
> This would mean that the call of storeBuffer->putSlotFromAnyThread() should
> not be required.  Did I misunderstand what this does?
> 
> (Also I think in future PJS is planning to make use of nurseries off the
> main thread, but that just means our thread assertion will have to be updated
> to take account of that).

You are 100% correct: we do not need the putSlotFromAnyThread here. More to the point: this is a lot of ugly code and the wins are surprisingly marginal, at least on linux. I'm going to push the two preceding patches independently to see how big a boost we get from putWholeCellFromMainThread. If that's actually significant on some platform I'll move forward with this; otherwise I'm not sure passing a context is worth it.
Oh, oh, oh... I am an idiot. I've been so focused on reducing overhead that I forgot /why/ GGC is faster. The Lambdas are getting allocated tenured, so we are not getting any of GGC's benefits, only the overhead.

The reasons for this are: (1) generally putting functions in the nursery is a bad idea because they are held live by scripts; (2) things allocated through the interpreter are generally on a slow path anyway so we should avoid wasting nursery and store-buffer space that would be better used by the jits. This particular case is different because Lambdas are (apparently) not held live by their script and the interpreter routine is just getting called from a jit stub here.

This one-line patch allocates Lambdas in the nursery and gets us back to performance parity. It seems to pass all jsapi-tests and jit-tests; I will post try results if and when the try push finishes.
Attachment #8415970 - Flags: review?(jcoppeard)
Backed out for mochitest-2 asan failures.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e5a6e3bfa4be
Attachment #8415970 - Flags: review?(jcoppeard) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f4e9cf85af8

Try run seems to be green, modulo the inbound bustage from yesterday morning:
https://tbpl.mozilla.org/?tree=Try&rev=907c8fdeb598
When this lands again we will see a win of 4.5% on linux32/linux64 on the tart test!
This is getting a bit messy, so let me clarify what the current state is. There are 3 fixes here that I'd like to land in this bug.

1) Lambda objects in the nursery. This is currently "breaking" jit-tests: we have a bunch of tests that set gcMaxBytes to a small value, then do a recursive call. These are marked to fail with OOM, but they no longer generate garbage, so instead they hit the stack limit first. I think I've got a fix; we'll see shortly: https://tbpl.mozilla.org/?tree=Try&rev=089d0461401a.

This will get us a 100% perf improvement on the dom-attr micro-benchmark, although it is unlikely to help elsewhere. It may even be a net loss, so we'll need to watch other tests as well.

2) StoreBufferPtr through the chunk trailer. This is currently broken because of Relocatable<>. We are using needsPostBarrier for insertion, which only checks isMarkable/non-null. This allows us to put a nursery pointer, then a non-nursery pointer (whose chunk trailer has a null StoreBuffer*). Now when we assign a nullptr and try to remove, we have no access to the store buffer and end up with a store buffer entry pointing at a slot that now holds nullptr. I think the right solution here is going to be to make the needsPostBarrier check also check IsInsideNursery, but this will warrant a complete try run.

This will get us 4.5% on tart and will be a small win across the board.

3) Do not check CurrentThreadCanAccessRuntime from putWholeCell. Blocked on (2). Should be a small win everywhere.
Duplicate of this bug: 973584
Repushing part 2:
http://hg.mozilla.org/integration/mozilla-inbound/rev/f5ac479e69d4

Green try run, particularly missing the prior M-2 bustage:
https://tbpl.mozilla.org/?tree=Try&rev=e4c1c0a48952
Comment on attachment 8415710 [details] [diff] [review]
speed_up_nativesetslot-v0.diff

Review of attachment 8415710 [details] [diff] [review]:
-----------------------------------------------------------------

Not sure whether you're planning to land this, but it looks fine apart from previous comments.
Attachment #8415710 - Flags: review?(jcoppeard) → review+
Attachment #8415393 - Flags: checkin+
Attachment #8415709 - Flags: checkin+
Comment on attachment 8415710 [details] [diff] [review]
speed_up_nativesetslot-v0.diff

I am leaning towards no; it's a bunch of complexity and extra code for a tiny gain.
This shows up as a 2.3% regression on OSX 10.6 for Kraken. Is this expected?
Flags: needinfo?(terrence)
(In reply to Joel Maher (:jmaher) from comment #62)
> this shows up as a 2.3% regression on osx 10.6 for kraken, is this expected?

The new code uses fewer instructions: I got 23 talos improvement e-mails for the change. The 10.6 numbers appear bi-modal to me so it is almost certainly a changed alignment or access pattern. It's certainly not worth investigating; we can get bigger general wins in the time it would take to figure out what's going off the rails in this one narrow case.
Flags: needinfo?(terrence)
sounds like a bunch of steps forward and a quarter step back, sounds like a plan!
(In reply to Joel Maher (:jmaher) from comment #64)
> sounds like a bunch of steps forward and a quarter step back, sounds like a
> plan!

When we landed GGC, we got improvements of 6-10% across the board on octane: a real benchmark. We also got a handful of 2-4% regressions on some silly micro-benchmarks. I wouldn't even bother with /this/ bug if it weren't trivially reproducible and a good excuse to do some much-needed cleanups at the same time.
thanks for taking the time to clean this up.  From a performance tracking perspective I am fine with closing this.
(In reply to Joel Maher (:jmaher) from comment #66)
> thanks for taking the time to clean this up.  From a performance tracking
> perspective I am fine with closing this.

Let's land the lambdas-in-nursery first.

And maybe some out-of-lining. Initial results of that are mixed, so I'm going to try one more thing there.
No longer depends on: 945555
Moving all the barrier code out-of-line saves ~100 lines of very ugly code immediately and probably would allow us to remove some stuff from the shadow namespace too. It appears entirely performance neutral, other than a winxp regression that may just be noise. I think it would be worth checking it in, at least for a couple of days, to get some statistical rigor behind "it is performance neutral".
Blocks: GC.performance
No longer blocks: GenerationalGC
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/8c234572141a
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
GGC won't be part of 31.
Depends on: 1004457
You need to log in before you can comment on or make changes to this bug.