BaselineCompiler: Optimize JSOP_ARRAYPUSH

RESOLVED INVALID

Status

()

defect
P5
normal
RESOLVED INVALID
5 years ago
2 years ago

People

(Reporter: djvj, Assigned: sankha, Mentored)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [lang=C++])

Attachments

(1 attachment, 5 obsolete attachments)

Reporter

Description

5 years ago
Jan just implented a vmcall-based handling of JSOP_ARRAYPUSH in baseline.  This is an easy-to-optimize op, and it should be optimized with a nice stub.
Ben: If you are interested, you can work on this bug. FYI, we already handle ArrayPush in IonMonkey (inside the CodeGenerator.cpp file [1]).

[1] http://dxr.mozilla.org/mozilla-central/source/js/src/jit/CodeGenerator.cpp#5627
Whiteboard: [mentor=efaust][lang=C++]

Comment 2

5 years ago
Thanks Nicolas. Yeah I'd love to work on this bug if its okay that I don't get too much done in the next week or so. I have an exam on the 9th that I need to study for, and I'm currently working full time in a coop position. I'll do my best to get a good start this week though. After the 9th I'm free to work on it on weekends and weeknights.

Updated

5 years ago
Assignee: nobody → benhackett4
(In reply to Ben Hackett from comment #2)
> Thanks Nicolas. Yeah I'd love to work on this bug if its okay that I don't
> get too much done in the next week or so.

Go for it. It's not pressing. Let me know if you have any questions.

Comment 4

5 years ago
Hi Eric,

Yes I do have questions. I'm pretty lost, sorry. I'm very new to all of this.

How exactly do I go about optimizing the method?? As far as I understand it: I should add a stub to an inline cache chain. What part of the code corresponds to the inline cache chain? And more fundamentally, what is a stub? Does it contain type information? I thought inline caching was modifying JIT compiled code - this sounds different than appending type information.

Thanks for any help :)
Flags: needinfo?(efaustbmo)
Reporter

Comment 5

5 years ago
Hi Ben,

I can answer these questions.

First off, we want to modify the Baseline JIT engine.  This engine uses inline caches (ICs) as the main (only) optimization strategy.  For unoptimized ops, the baseline JIT will typically just directly call a C++ vm-function to execute the desired behaviour.  For optimized ops, the baseline JIT will emit a jump to a "fallback stub" that handles the behaviour for that op.  The fallback stub then tries to generate an optimized stub for that particular case, and if successful attaches that stub to the chain for the op, and returns.

Baseline ICs are designed for code sharing.  The stub itself is split up into two parts: the first is a structure (inheriting from ICStub in BaselineIC.hpp) which holds any state that a stub needs to execute.  The second is a pointer to the jitcode generated for that stub.  When we "jump" to a stub within an IC, what happens is the following:

  1. The main jitcode stores a pointer to the first stub of the chain in BaselineStubReg (a well-known register that stores the stub pointer).
  2. The main jitcode stores any operands needed by the stubs in register values R0 and R1.
  3. The main jitcode reads the code pointer from the stub loaded, and jumps to that code pointer.

The each stub itself belongs to a single IC chain in a single Baseline-jitted script, but the code pointed to by stubs can be shared between multiple stubs across different scripts.  The code uses the stub pointer to read the state it needs to perform its operation.

Every IC chain starts with just a fallback stub on its chain.  The first time the IC is entered, the fallback stub executes, and it runs some C++ code which implements the operation.  The fallback stub then investigates the inputs it received, and determines whether or not the input can be optimized.  If so, it generates an optimized stub for that case, adds it to the IC chain (before itself), and returns.


The initial approach you can take is:

1. Visit BaselineCompiler.cpp - and read the code for the "emit_JSOP_*" methods.  Many of these methods emit an IC.  For example, emit_JSOP_GETELEM emits a jump to an IC to optimize the GETELEM operation.

2. Note that in emit_JSOP_ARRAYPUSH, we generate a call to a VM function with callVM.  Instead of doing this, we want to do something similar to what emit_JSOP_GETELEM does: jump to a fallback IC for the op.

3. The fallback IC for the op will be defined in BaselineIC.hpp.  See the header documentation on BaselineIC.hpp to see how ICs fit together.  Then, add a ICArrayPush_Fallback stub.  Like most other fallback stubs, this stub will pack its arguments, and call a DoArrayPushFallback function.

4. The DoArrayPushFallback stub should call the VM-function (NewboryArrayPushInfo) to push the array.

After these changes, the basic infrastructure for optimizing JSOP_ARRAYPUSH will be in place.  Once you get here, either efaust or I can assist you in adding the actual optimized stub for this op.
Flags: needinfo?(efaustbmo)

Comment 6

5 years ago
Thanks Kannan. 

Sigh. I'm still feeling like I'm useless here. It feels like it will be years before I'm actually helping and not being taught!?

Anyways, here are my new questions:
Does ICArrayPush_Fallback inherit from ICFallbackStub? Does it need to worry about type stubs? Are there any methods/properties I need to add that arent' inherited? Is DoArrayPushFallback a member?  Is NewbornArrayPushInfo the paramater or the function?
Reporter

Comment 7

5 years ago
Hey Ben,

Don't get discouraged.  I realize there's a bit of a learning curve here.  You're dealing with language semantics, an interpreter semantics for that language, and native code generation that implements those semantics in a way that allows for dynamic optimization.  These are not trivial concepts, but they're not inaccessible either.  I felt pretty overwhelmed when I started working on this stuff too.  It's totally worth getting past that point.

Ask questions.  Don't worry about asking too many.  It takes a bit of time to wrap your head around this stuff.  Also, post your work-in-progreess patches frequently and needinfo? me, and I can comment on your approach.

As for your question:

ICArrayPush_Fallback should inherit from ICFallbackStub.  All fallback stubs should inherit from ICFallbackStub.  It defines core fields required for all fallback stubs.

NewbornArrayPushInfo is just a jit-code wrapper for a C++ function (NewbornArrayPush) that performs the operation.  The current approach is a "trivial jit" of the op.  All we do is embed a call to a C++ function that implements the op.  We use this approach for rare ops that don't need heavy optimization.  For example, see emit_JSOP_DELNAME in BaselineCompiler.cpp.  We don't care to optimize DELNAME all that much, so the jitcode just emits a call to the |DeleteName| C++ VM-function.

You can see the definition of NewboryArrayPushInfo right above where it's used: it's a statically declared value of type VMFunction.  It just holds the necessary information about the wrapped function so that the |callVM(NewbornArrayPushInfo)| can do the right things to enter C++ from jitcode.

The DoArrayPushFallback function you're writing won't be called from the same place NewbornArrayPush will be called.  It'll be called from the jitcode generated for ICArrayPush_Fallback.  See BaselineIC.cpp for other examples of the jitcode for fallback functions.


Hope this helps.  Please ask more questions if you run into any snags or areas which you don't understand.  Either here or on IRC.

Lastly, thanks for working on this :)  I know it can be frustrating to climb the learning curve for this stuff, and I appreciate you taking on that challenge.

Comment 8

5 years ago
Thanks again Kannan.

I tried to hg pull -u and I thought everything went okay but it deleted emit_JSOP_ARRAYPUSH. 

Anyways here is the beginning of the patch. Do I have anything right?

Regarding ICArrayPush_Fallback, I'm still confused on how to implement it. All of the other fallback classes are moderately large classes with varying functions. 

Do you have any advice about understanding the code base? I'm finding it difficult because everything is related to 10 other things, and each thing is like 5000 lines of code.
Flags: needinfo?(kvijayan)

Comment 9

5 years ago
Posted patch JSOP_ARRAYPUSH.patch (obsolete) — Splinter Review
Attachment #8434608 - Flags: review?(kvijayan)
Reporter

Comment 10

5 years ago
Comment on attachment 8434608 [details] [diff] [review]
JSOP_ARRAYPUSH.patch

Review of attachment 8434608 [details] [diff] [review]:
-----------------------------------------------------------------

This looks right :)  I understand the feeling of "everything is related to everything else, how am I supposed to make sense of this?".  It'll slowly start coming together.

The next step is to implement the ICArrayPush_Fallback stub.  See BaselineIC.h, and BaselineIC.cpp for examples (you can probably crib from the implementation of ICGetElem_Fallback for everything except for the "doArrayPushFallback" C++ implementation (which should just call the right C++ functions to do the array push and return).

After that is done, the next step would be to add optimized stubs.. but for now the fallback stub implementation is the only thing that needs to be considered.

Cancelling review on this since it's an incomplete patch :)  Looking good so far.
Attachment #8434608 - Flags: review?(kvijayan)
Reporter

Updated

5 years ago
Flags: needinfo?(kvijayan)
Mentor: efaustbmo
Whiteboard: [mentor=efaust][lang=C++] → [lang=C++]

Comment 11

5 years ago
Hey everyone, I hate to do this but I just keep getting progressively busier and busier. I don't have time to work on this. Sorry and thanks for helping me out!

Updated

5 years ago
Assignee: benhackett4 → nobody
Reporter

Comment 12

5 years ago
No worries.  Thanks for taking the effort when you could.
Posted patch patch v0 (obsolete) — Splinter Review
I decided to take a try at this just to get a better idea. :)

This patch compiles, but I am getting the following jit-test failures. Can you please help me?

    basic/bug630377.js
    basic/bug791465.js
    basic/testArrayComp1.js
    basic/testArrayComp2.js
    collections/Map-iterator-remove-1.js
    collections/Map-values-2.js
    collections/Set-iterator-remove-1.js
    debug/Debugger-debuggees-18.js
    debug/Environment-identity-03.js
    for-of/generators-4.js
    for-of/proxy-2.js
Attachment #8434608 - Attachment is obsolete: true
Attachment #8446818 - Flags: feedback?(kvijayan)
Reporter

Comment 14

5 years ago
Comment on attachment 8446818 [details] [diff] [review]
patch v0

Review of attachment 8446818 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!  This is mostly there.  Just a few small issues to address.

::: js/src/jit/BaselineCompiler.cpp
@@ +1802,5 @@
> +    if (!emitOpIC(stubCompiler.getStub(&stubSpace_)))
> +        return false;
> +
> +    // Mark R0 as pushed stack value.
> +    frame.push(R0);

JSOP_ARRAYPUSH does not push anythign back onto the stack.  So this is pushing whatever bogus content there is in R0 back on the stack, throwing the stack state out of whack, and probably leading to most of your failures.

::: js/src/jit/BaselineIC.cpp
@@ +9403,5 @@
> +// ArrayPush_Fallback
> +//
> +static bool
> +DoArrayPushFallback(JSContext *cx, BaselineFrame *frame, ICArrayPush_Fallback *stub_,
> +                    HandleObject obj, const Value &v)

The last argument type should be HandleValue.

@@ +9422,5 @@
> +    EmitRestoreTailCallReg(masm);
> +
> +    // Push arguments.
> +    masm.pushValue(R0);
> +    masm.pushValue(R1);

Since the BaselineCompiler unboxes the R1 array into R1.scratchReg(), you should only be pushing R1.scratchReg().

Currently, pushing R0 is pushing a js::Value on the stack, but the format of a js::Value of an object and a raw object pointer are different.  The sizes are also different on 32-bit machines (js::Value is 2 words, whereas a pointer is 1 word).

This should be: masm.push(R1.scratchReg())

::: js/src/jit/BaselineIC.h
@@ +376,5 @@
>      _(Call_ScriptedApplyArray)  \
>      _(Call_ScriptedApplyArguments) \
>      _(Call_ScriptedFunCall)     \
>                                  \
> +    _(ArrayPush_Fallback)       \

Nit: Add an extra blank escaped line after this (like the line before this entry), since each group of stubs for an op has a blank line between it.

@@ +2892,5 @@
>  
> +// ArrayPush
> +//      JSOP_ARRAYPUSH
> +
> +class ICArrayPush_Fallback : public ICStub

This class should inherit from ICFallbackStub, not ICStub.
Attachment #8446818 - Flags: feedback?(kvijayan) → feedback+
Posted patch patch v1 (obsolete) — Splinter Review
This patch works and passes all jit-tests. I also implemented the optimized stub.

> The last argument type should be HandleValue.

I tried this, but the shell was crashing, maybe because NewbornArrayPush take a Value? So I am using Value again.
Attachment #8446818 - Attachment is obsolete: true
Attachment #8447617 - Flags: review?(kvijayan)
Reporter

Comment 16

5 years ago
(In reply to Sankha Narayan Guria [:sankha93] from comment #15)
> Created attachment 8447617 [details] [diff] [review]
> patch v1
> 
> This patch works and passes all jit-tests. I also implemented the optimized
> stub.
> 
> > The last argument type should be HandleValue.
> 
> I tried this, but the shell was crashing, maybe because NewbornArrayPush
> take a Value? So I am using Value again.

HandleValue should auto-convert to 'const Value &' and to 'Value' implicitly when passed to such functions.  Not using a handle-value for a vm-call argument (which is rooted) is an error, so you definitely want to use Handlevalue.
Reporter

Comment 17

5 years ago
Comment on attachment 8447617 [details] [diff] [review]
patch v1

Review of attachment 8447617 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/BaselineCompiler.cpp
@@ +1794,5 @@
>  BaselineCompiler::emit_JSOP_ARRAYPUSH()
>  {
>      // Keep value in R0, object in R1.
>      frame.popRegsAndSync(2);
>      masm.unboxObject(R1, R1.scratchReg());

Add a comment here indicating that R1 is guaranteed to be a boxed Array object.  Then, in #ifdef DEBUG code, emit native instructions to assert that this guarantee holds at runtime:

  frame.popRegsAndSync(2);
  #ifdef DEBUG
  {
    Label fail;
    Label ok;
    // Stub register is unused in mainline code so we can use it as
    // scratch.
    Register scratchReg = BaselineStubReg;
    masm.branchTestObject(Assembler::NotEqual, R1, &fail);
    Register objReg = masm.extractObject(R1, ExtractTemp0);
    masm.branchTestObjClass(Assembler::Equal, objReg, scratchReg, &ArrayObject::class_, &ok);

    masm.bind(&fail);
    masm.assumeUnreachable("JSOP_ARRAYPUSH operand 1 is not an array.");

    masm.bind(&ok);
  }
  #endif

  masm.unboxObject(R1, R1.scratchReg());
  ...

::: js/src/jit/BaselineIC.cpp
@@ +9458,5 @@
> +{
> +    Label failure;
> +
> +    masm.branchTestObject(Assembler::NotEqual, R1, &failure);
> +    Register obj = masm.extractObject(R1, ExtractTemp1);

The main-line jitcode in BaselineCompiler.cpp already unboxes R1 into R1.scratchReg(), so by the time we get here, R1.scratchReg() contains the array object.  These two lines should be removed, since R1 is not even guaranteed to be valid anymore (on X86-64, the unbox of R1 into R1.scratchReg() will replace the Value in R1 with a bare pointer, and this code would be incorrect).

@@ +9460,5 @@
> +
> +    masm.branchTestObject(Assembler::NotEqual, R1, &failure);
> +    Register obj = masm.extractObject(R1, ExtractTemp1);
> +
> +    GeneralRegisterSet regs(availableGeneralRegs(1));

availableGeneralRegs(1) will reserve R0, but not R1.scratchReg() which you are using.  Also do:

Register obj = R1.scratchReg()
regs.take(obj);

To ensure that you don't allocate R1.scratchReg() to something else later in the stub.

@@ +9465,5 @@
> +    Register elementsTemp = regs.takeAny();
> +    Register length = regs.takeAny();
> +
> +    masm.loadPtr(Address(obj, JSObject::offsetOfElements()), elementsTemp);
> +    masm.load32(Address(elementsTemp, ObjectElements::offsetOfLength()), length);

You need to additionally ensure that |capacity > length| to ensure you have space in the array to store.  So:

  masm.branch32(Assembler::BelowOrEqual,
                Address(elementsTemp, ObjectElements::offsetOfCapacity()),
                length, &failure);

The occasional times when the array needs to grow, we'll just end up making our way back to the fallback stub slow-path the capacity-increasing push, and then return to executing in the optimized stub after that.
Attachment #8447617 - Flags: review?(kvijayan) → review+
Reporter

Comment 18

5 years ago
Comment on attachment 8447617 [details] [diff] [review]
patch v1

Review of attachment 8447617 [details] [diff] [review]:
-----------------------------------------------------------------

Whoops.  Meant to cancel review and ask for re-review after comments addressed.

The patch is looking pretty good.  After these last changes and fixing you should be good for final review.
Attachment #8447617 - Flags: review+
Posted patch patchv2 (obsolete) — Splinter Review
Attachment #8447617 - Attachment is obsolete: true
Attachment #8448765 - Flags: review?(kvijayan)
Reporter

Comment 20

5 years ago
Comment on attachment 8448765 [details] [diff] [review]
patchv2

Review of attachment 8448765 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.  Address the comments and make sure it passes tests and this should be ready for pushing.

::: js/src/jit/BaselineIC.cpp
@@ +9464,5 @@
> +    Register elementsTemp = regs.takeAny();
> +    Register length = regs.takeAny();
> +
> +    masm.loadPtr(Address(obj, JSObject::offsetOfElements()), elementsTemp);
> +    masm.load32(Address(elementsTemp, ObjectElements::offsetOfLength()), length);

Nit: it might be nice to assert here (in debug mode), that (length == initializedLength):

#ifdef DEBUG
{
  Label ok;
  masm.branch32(Assembler::Equal,
                Address(elementsTemp,
                        ObjectElements::offsetOfInitializedLength()),
                length,
                &ok);
  masm.assumeUnreachable("ArrayPush array length != initializedLength");
  bind(&ok);
}
#endif

@@ +9470,5 @@
> +                  Address(elementsTemp, ObjectElements::offsetOfCapacity()),
> +                  length, &failure);
> +    Int32Key key = Int32Key(length);
> +
> +    masm.storeValue(R0, BaseIndex(elementsTemp, length, TimesEight));

Nit: I make it a practice, whenever we implicitly assume in jitcode that the sizeof(Value) == 8, to add a JS_STATIC_ASSERT(sizeof(Value) == 8) before it.

Someday, we may want to change the size of a value and it'd be nice to know where all we bake in assumptions about that.
Attachment #8448765 - Flags: review?(kvijayan) → review+
Reporter

Comment 21

5 years ago
Btw, I just benched this on linux x64, with the following code:

function test_f(a) {
    return [i*i for (i of a)];
}

On an array of 10 million integers, this takes 77ms on a reference build (without patch), and 51ms with the patch.  Not bad at all :)

If you're interested, after landing this patch - the next optimization level can be modified to support this instruction.  In fact, the Ion optimization path for this op should be relatively straightforward.  Then those numbers should improve an order of magnitude.  Currently Ion will just give up compiling if it sees JSOP_ARRAYPUSH, so that entire script runs in baseline.  Adding ion support for this op will enable ion compilation of the rest of the function, which should be pretty sweet.

But so far, this is a nice 33% win on the code above.  Nice :)
Posted patch patchv3 (obsolete) — Splinter Review
try: https://tbpl.mozilla.org/?tree=Try&rev=93b93bc7b348
Assignee: nobody → sankha93
Attachment #8448765 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8450926 - Flags: review+
I filed bug 1034578 to support this instruction in IonMonkey.
Reporter

Comment 24

5 years ago
(In reply to Sankha Narayan Guria [:sankha93] from comment #22)
> Created attachment 8450926 [details] [diff] [review]
> patchv3
> 
> try: https://tbpl.mozilla.org/?tree=Try&rev=93b93bc7b348

The oranges you are seeing in that try run are due to another secure bug.  Try rebasing on top of a green m-i tree and re-pushing to try.
Flags: needinfo?(efaustbmo)
Priority: -- → P5
Taking over mentorship.
Mentor: efaustbmo → hv1989
Flags: needinfo?(efaustbmo)
I just rebased the old patch on the current tip. A lot of the old code has changed.

I ran the JIT tests on builds from this patch and they seem to pass. Next I will be taking a look at why the last time tests broke in other parts of the tree when this was landed last time.
Attachment #8450926 - Attachment is obsolete: true
Flags: needinfo?(sankha93)
Attachment #8811170 - Flags: feedback?(hv1989)
JSOP_ARRAYPUSH is used for array comprehensions. We removed the ancient/legacy array comprehension syntax a while ago (bug 1220564), and the newer syntax proposal we implemented is also not going to be standardized after all, so will also be removed at some point. I'm not sure this is worth spending time on.
(In reply to Jan de Mooij [:jandem] from comment #30)
> JSOP_ARRAYPUSH is used for array comprehensions. We removed the
> ancient/legacy array comprehension syntax a while ago (bug 1220564), and the
> newer syntax proposal we implemented is also not going to be standardized
> after all, so will also be removed at some point. I'm not sure this is worth
> spending time on.

Good to know.

I think we can accept taking the patch, since it is close to ready. But if a lot of work still needs to happen or a lot of debugging we should probably mark this "wontfix". It is your call sankha. I can search another bug report you can work on if you want.
Comment on attachment 8811170 [details] [diff] [review]
patchv3 rebased

Review of attachment 8811170 [details] [diff] [review]:
-----------------------------------------------------------------

Looks like an important part is not in the code yet. I've added pointers to it in my review.
I assume that is the failures that you were seeing. Now adding this might take a while to implement,
as a result I would totally understand if you decide not to do it (given that it most likely will get deleted eventually).
Let me know what you think.

::: js/src/jit/BaselineCompiler.cpp
@@ +2174,5 @@
> +        masm.assumeUnreachable("JSOP_ARRAYPUSH operand 1 is not an array.");
> +
> +        masm.bind(&ok);
> +    }
> +    #endif

We can do this debug check in DoArrayPushFallback. Would be cleaner.

::: js/src/jit/BaselineIC.cpp
@@ +7610,5 @@
> +    if (stub.invalid())
> +        return true;
> +
> +    if (!stub->hasStub(ICStub::ArrayPush_Native))
> +    {

bracket should be on the previous line, according to style guides

@@ +7650,5 @@
> +//
> +
> +bool
> +ICArrayPush_Native::Compiler::generateStubCode(MacroAssembler &masm)
> +{

One important bit I'm missing here is that we need to keep track of the types that flows into this opcode.
You see that happening:
http://searchfox.org/mozilla-central/source/js/src/jsarray.cpp#2058

In SharedIC.h you'll see a big comment with extra explanation for TypeUpdate ICs. Eventually you'll have to add a "callTypeUpdateIC" in your code.

@@ +7664,5 @@
> +
> +    masm.loadPtr(Address(obj, NativeObject::offsetOfElements()), elementsTemp);
> +    masm.load32(Address(elementsTemp, ObjectElements::offsetOfLength()), length);
> +    #ifdef DEBUG
> +    {

Nits: The ifdef should start at column 0. And I see no need for the {}.
Also add a newline before the ifdef and after the endif
Attachment #8811170 - Flags: feedback?(hv1989)
Support for Array.push optimization in Baseline was added in Bug 1366375.
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1366375
This is actually unrelated to Array.push.

JSOP_ARRAYPUSH was an op emitted for array comprehensions and I removed generator/array comprehensions (including JSOP_ARRAYPUSH) in bug 1414340.
Resolution: DUPLICATE → INVALID
You need to log in before you can comment on or make changes to this bug.