Closed Bug 1255352 Opened 8 years ago Closed 8 years ago

Land initial CacheIR patch

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: jandem, Assigned: jandem)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Attached patch PatchSplinter Review
Our current IC design has a number of problems, most importantly:

* Baseline and Ion ICs don't share much code. The same or similar logic and code is duplicated (often slightly differently and more than once per JIT).
* Ion ICs can handle cases that Baseline doesn't support, and vice versa.
* Baseline ICs are sometimes not 'flexible' enough: an example of this is the does-not-exist stub: Baseline does not attach a stub if an object on the proto chain has obj->hasUncacheableProto(), because it's really hard to support this case with the current ICs.
* It's hard to do something like: unwrap a (Window)Proxy, CCW, etc. and then optimize as a normal getprop, without duplicating/complicating a lot of code.
* ICs and Baseline stub classes are boilerplate-heavy (there are > 7000 lines of code in BaselineIC.h and SharedIC.h).
* Register allocation in ICs, especially Ion ICs, can be hard to get right.
* Adding ICs is a lot of work. I want to add an Ion IC for JSOP_IN. We also need ICs for the new |super| property accesses for ES6 classes (these are like normal GetProp/SetProp, but have an additional receiver argument).
* For Ion there's no good mechanism to discard or update stubs that are no longer valid, so we sometimes attach similar stubs multiple times.

I've been working on a design that addresses all of those issues. The idea is that we emit a very simple, linear (there are guards but no loops or branches) CacheIR bytecode, and generate Baseline and Ion IC code from it. The IR for a simple read-slot getprop looks like this:

  GuardIsObject 0
  GuardShape 0
  LoadFixedSlotResult 0

The generated CacheIR will be exactly the same for Baseline and Ion, but they will compile each op to different machine code and Ion can skip certain guards (if it knows they will never fail).

This ensures we optimize exactly the same cases in Baseline and Ion. We still need different CacheIR -> IC code generators, but at that point all the high-level decisions and VM bits are taken care of.

The actual shapes and slot offsets will be stored separately from the CacheIR. Ion can bake those directly into the JitCode, but (just like we do now) Baseline code will store them in the ICStub, allowing us to share IC stub code. Sharing Baseline stub code happens transparently with this design: stubs that have the same CacheIR can share JIT code.

We'll no longer need Baseline ICStub classes for each case, as the stubs are allocated dynamically. Stubs have a pointer to their (shared) CacheIR code, that allows Ion to optimize based on Baseline ICs. (Later on we could compile CacheIR we get from Baseline stubs to Ion MIR instructions, allowing us to inline and optimize more cases in IonBuilder without ICs).

I'm attaching a first version that generates CacheIR for NativeObject getprop slot reads (one of the most common IC stubs), and generates Baseline IC code from it. Later on we can convert the remaining getprop stubs, support other ICs, and implement the Ion backend.

Note that this already handles cases we didn't support before, for instance the hasUncacheableProto example I mentioned before is now optimized by Baseline.

Most of this patch is putting the basic infrastructure in place, but after that, supporting new ICs or stubs will hopefully require minimal changes or boilerplate.
Attachment #8728911 - Flags: review?(efaustbmo)
Sorry for the large patch btw, but I think the basic infrastructure is hard to review without seeing how it all fits together.

Also, large parts of this are just removing Baseline code that we no longer need.
Comment on attachment 8728911 [details] [diff] [review]
Patch

Review of attachment 8728911 [details] [diff] [review]:
-----------------------------------------------------------------

This /is/ much shorter and cleaner! This looks good. I have a bunch of random nits and ergonomics comments, but no major concerns. I am excited to see how the merge with Ion and the addition of accessors et al. will turn out.

::: js/src/jit/BaselineCacheIR.cpp
@@ +101,5 @@
> +        if (kind_ == PayloadReg)
> +            return payloadReg() == reg;
> +        if (kind_ == ValueReg) {
> +#if defined(JS_NUNBOX32)
> +            return valueReg().typeReg() == reg || valueReg().payloadReg() == reg;

This is fine to stay here. Is it useful to expose this as part of ValueOperand? Is there any reason not to confine all the #if to one place?

@@ +110,5 @@
> +        return false;
> +    }
> +    bool aliasesReg(ValueOperand reg) {
> +#if defined(JS_NUNBOX32)
> +        return aliasesReg(reg.typeReg()) || aliasesReg(reg.payloadReg());

hmm, I guess that makes this one more of a pain. Still, it feels like this code "knows too much" about the layout of ValueOperand.

@@ +217,5 @@
> +};
> +
> +// RAII class to put a scratch register back in the allocator's availableRegs
> +// set when we're done with it.
> +class MOZ_RAII AutoScratchRegister

Isn't this even better if it auto allocates, asserts if it cannot, and frees when it leaves scope, with implicit conversions to Register everywhere? I guess I will have to look at usage ergonomics before knowing for sure.

@@ +275,5 @@
> +
> +        MOZ_ASSERT(inputs_.length() == other.inputs_.length());
> +
> +        for (size_t i = 0; i < inputs_.length(); i++) {
> +            if (inputs_[i] != other.inputs_[i])

cute that each keeps their own copy, so that we can pop or spill value stack slots without fear.

@@ +380,5 @@
> +                               orig.valueReg());
> +            }
> +            break;
> +          default:
> +            MOZ_CRASH();

Please add a string here. "Invalid kind in..." or whatever.

@@ +399,5 @@
> +      : CacheIRCompiler(cx, writer),
> +        stubDataOffset_(stubDataOffset)
> +    {}
> +
> +    MOZ_WARN_UNUSED_RESULT bool init(CacheKind kind);

:)

@@ +560,5 @@
> +      }
> +
> +      case OperandLocation::ValueStack: {
> +        // The value is on the stack, but boxed. If it's on top of the stack we
> +        // unbox it and then remove it from the stack, else we just unbox.

"If it's on top of the stack, unbox and then pop it. If we need the registers later, we can always spill back. If it's not on the top of the stack, just unbox"

or something. It took me a minute to understand why popping was good. I didn't expect it, here.

@@ +641,5 @@
> +            }
> +            if (loc.kind() == OperandLocation::ValueReg) {
> +                ValueOperand reg = loc.valueReg();
> +#ifdef JS_NUNBOX32
> +                bool inUse = currentOpRegs_.has(reg.typeReg()) || currentOpRegs_.has(reg.payloadReg());

Isn't a RegisterSet has() overload for ValueOperand preferable?

@@ +695,5 @@
> +bool
> +BaselineCacheIRCompiler::emitGuardShape()
> +{
> +    Register obj = allocator.useRegister(masm, reader.objOperandId());
> +    AutoScratchRegister scratch(allocator, allocator.allocateRegister(masm));

This usage makes me think that I was right about AutoScratchRegsiter.

@@ +972,5 @@
> +        return nullptr;
> +
> +    // Just a sanity check: the caller should ensure we don't attach an
> +    // unlimited number of stubs.
> +    MOZ_ASSERT(stub->numOptimizedStubs() < 20);

20 needs a name, and a #define or static const. Seems like a number that will run around nameless to all the callers, if we let it.

@@ +997,5 @@
> +
> +        // Allocate the shared CacheIRStubInfo. Note that the putCacheIRStubCode
> +        // call below will transfer ownership to the stub code HashMap, so we
> +        // don't have to worry about freeing it below.
> +        stubInfo = CacheIRStubInfo::New(kind, stubDataOffset, writer);

MOZ_ASSERT(!stubInfo) for ludicrous safety.

::: js/src/jit/CacheIR.cpp
@@ +52,5 @@
> +    ValOperandId valId(writer.ref().setInputOperandId(0));
> +
> +    if (val_.isObject()) {
> +        RootedObject obj(cx_, &val_.toObject());
> +        ObjOperandId objId = writer.ref().guardIsObject(valId);

mozilla::Maybe overloads operator* and operator->. writer->guardIsObject() here and *writer below is surely cleaner.

@@ +72,5 @@
> +
> +    MOZ_ASSERT(!holder);
> +
> +    // Just because we didn't find the property on the object doesn't mean it
> +    // won't magically appear through various engine hacks:

nit: this colon either deserves a list of the lower checks, or should be removed

@@ +77,5 @@
> +    if (obj->getClass()->getProperty)
> +        return false;
> +
> +    // Don't generate missing property ICs if we skipped a non-native object, as
> +    // lookups may extend beyond the prototype chain (e.g.  for DOMProxy

hyper-nit: double space after e.g.

@@ +117,5 @@
> +        holder.set(&baseHolder->as<NativeObject>());
> +    }
> +
> +    if (IsCacheableGetPropReadSlotForIonOrCacheIR(obj, holder, shape) ||
> +        IsCacheableNoProperty(obj, holder, shape, pc))

It's cool to see the various stages of refactor we've done over the years finally come together into a coherent and clean implementation :)

@@ +135,5 @@
> +    // in reshaping the holder, and thus the failure of the shape guard.
> +    MOZ_ASSERT(obj != holder);
> +
> +    if (obj->hasUncacheableProto())
> +        writer.guardProto(objId, obj->getProto());

This is correct, but I would love to see a comment explaining why we are looking at the proto after checking something called hasUncacheableProto(). Looks like it really means "must guard on proto".

@@ +158,5 @@
> +    }
> +}
> +
> +static void
> +TestMatchingReceiver(CacheIRWriter& writer, JSObject* obj, Shape* shape, ObjOperandId objId)

This is a lovely change. I'm quite happy to see this factored this way. Looks like it already existed since bhackett did some work on the Ion cached, but it's quite nice.

@@ +194,5 @@
> +            holderId = writer.loadObject(holder);
> +            writer.guardShape(holderId, holder->as<NativeObject>().lastProperty());
> +        } else {
> +            // The property does not exist. Guard on everything in the
> +            // prototype chain.

nit: "This is guaranteed to see only Native objects because of CanAttachNativeGetProp()"

@@ +215,5 @@
> +    // Slot access.
> +    if (holder)
> +        EmitLoadSlotResult(writer, holderId, &holder->as<NativeObject>(), shape);
> +    else
> +        writer.loadUndefinedResult();

please add a boolean accessor to OperandId for validity, where invalid is equivalent to having id UINT16_MAX, and assert it in this if branch on both sides. The logic above looks like it should permit this, and takes enough twists and turns that this makes me feel safer.

@@ +261,5 @@
> +    MOZ_ASSERT(!emitted_);
> +
> +    if (!obj->is<UnboxedPlainObject>())
> +        return true;
> +    UnboxedExpandoObject* expando = obj->as<UnboxedPlainObject>().maybeExpando();

hyper nit: insert blank line above this declaration.

::: js/src/jit/CacheIR.h
@@ +22,5 @@
> +//
> +// IRWriter
> +// --------
> +// CacheIR bytecode is written using IRWriter. This class also records some
> +// meta data that's used by the Baseline and Ion code generators to generate

hyper nit: normally, I think of metadata as one word.

@@ +138,5 @@
> +    // our scope.
> +    JS::AutoCheckCannotGC nogc;
> +
> +    void writeOp(CacheOp op) {
> +        MOZ_ASSERT(uint32_t(op) <= UINT8_MAX);

There's some shenanigans with a compiler bug that means we can't give the uint8 size class to CacheOp, right?

::: js/src/jit/x86/MacroAssembler-x86.h
@@ +161,5 @@
>  
> +        // If we have a BaseIndex that uses both result registers, first compute
> +        // the address and then load the Value from there.
> +        if ((baseReg == val.payloadReg() && indexReg == val.typeReg()) ||
> +            (baseReg == val.typeReg() && indexReg == val.payloadReg()))

niiiiiiice. I bet this was fun to find.
Attachment #8728911 - Flags: review?(efaustbmo) → review+
Thanks for the fast review! Great suggestions.

(In reply to Eric Faust [:efaust] from comment #2)
> This is fine to stay here. Is it useful to expose this as part of
> ValueOperand? Is there any reason not to confine all the #if to one place?

Yeah, maybe we should move this into ValueOperand... I'll try that and post a followup patch.

> Isn't this even better if it auto allocates, asserts if it cannot, and frees
> when it leaves scope, with implicit conversions to Register everywhere? I
> guess I will have to look at usage ergonomics before knowing for sure.

Good idea, I agree it's nicer for this class to take just an allocator argument. (It already does implicitly convert to Register.)

> Isn't a RegisterSet has() overload for ValueOperand preferable?

Yeah, but ValueOperand contains 2 registers on 32-bit so we have to make it clear this new method will also return true if the RegisterSet contains only one of them... hasPartOf, hasAtLeastOneOf, or something.

> mozilla::Maybe overloads operator* and operator->. writer->guardIsObject()
> here and *writer below is surely cleaner.

Ah, that is cleaner.

> There's some shenanigans with a compiler bug that means we can't give the
> uint8 size class to CacheOp, right?

Yeah. Though I think we require gcc 4.8 (as of this week) and 4.7 was the buggy one. I'll check.

> niiiiiiice. I bet this was fun to find.

Fortunately one of the asserts in this function failed... Without that assert it'd have been annoying :)
(In reply to Jan de Mooij [:jandem] from comment #3)
> > Isn't a RegisterSet has() overload for ValueOperand preferable?
> 
> Yeah, but ValueOperand contains 2 registers on 32-bit so we have to make it
> clear this new method will also return true if the RegisterSet contains only
> one of them... hasPartOf, hasAtLeastOneOf, or something.

what about calling it aliases / mightAlias or something alike. I've seen that used before for registers and multiple register Operands.
(In reply to Hannes Verschore [:h4writer] from comment #4)
> what about calling it aliases / mightAlias or something alike. I've seen
> that used before for registers and multiple register Operands.

Yup, I went with aliases(..). Thanks :)
Keywords: leave-open
Alias: CacheIR
Depends on: 1258105
Depends on: 1258349
Depends on: 1259925
Sorry for the bug spam, but keeping this bug open after the initial patch landed is a bit confusing.

Let's file a new meta bug.
Alias: CacheIR
Keywords: leave-open
Summary: Use a simple IR for IC code generation → Land initial CacheIR patch
Target Milestone: --- → mozilla48
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Blocks: CacheIR
No longer depends on: 1258105, 1259925
Comment on attachment 8728911 [details] [diff] [review]
Patch

Review of attachment 8728911 [details] [diff] [review]:
-----------------------------------------------------------------

Just looking back at this code. Trying to better understand CacheIR and had some questions.

::: js/src/jit/BaselineCacheIR.cpp
@@ +832,5 @@
> +    Register output = allocator.defineRegister(masm, reader.objOperandId());
> +
> +    FailurePath* failure;
> +    if (!addFailurePath(&failure))
> +        return false;

Why do we need to add the FailurePath. We don't use "failure"?
Can't we just remove it? And if not. Why?

::: js/src/jit/BaselineInspector.cpp
@@ +121,5 @@
> +    //   GuardIsObject 0
> +    //   GuardGroup 0
> +    //   1: GuardAndLoadUnboxedExpando 0
> +    //   GuardShape 1
> +    //   LoadUnboxedExpando 0

What is holding us back to reuse the already loaded "UnboxedExpando" during "GuardAndLoadUnboxedExpando". Why do we need to reload it?
See above questions.
Flags: needinfo?(jdemooij)
(In reply to Hannes Verschore [:h4writer] from comment #9)
> > +    FailurePath* failure;
> > +    if (!addFailurePath(&failure))
> > +        return false;
> 
> Why do we need to add the FailurePath. We don't use "failure"?
> Can't we just remove it? And if not. Why?

Good catch! Yes it's unused so we should remove it. I'll fix.

> What is holding us back to reuse the already loaded "UnboxedExpando" during
> "GuardAndLoadUnboxedExpando". Why do we need to reload it?

This just matches what Ion ICs do.. With CacheIR we can optimize this by making expandoId in TestMatchingReceiver an outparam, then the caller can decide to use that for something else. I will file a followup bug and see how that works out :)
Flags: needinfo?(jdemooij)
You need to log in before you can comment on or make changes to this bug.