Closed Bug 1492995 Opened 11 months ago Closed 4 months ago

CacheIR support for String + Boolean

Categories

(Core :: JavaScript Engine: JIT, enhancement, P3, minor)

enhancement

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox64 --- wontfix
firefox68 --- fixed

People

(Reporter: mgaudet, Assigned: asorholm, Mentored)

References

(Blocks 1 open bug, )

Details

Attachments

(1 file)

It doesn't look like it would be hugely impactful, but I see lots of IC failures-to-attach from String+boolean on the Google Docs document from Bug 1488435

> {
>   "name":"BinaryArith",
>   "file":"https://docs.google.com/static/document/client/js/210745210-kix_main_i18n_integrated_kix_core.js",
>   "mode":0,
>   "line":2089,
>   "column":686,
>   "pc":"127452b27",
>   "op":"add",
>   "rhs":{
>     "type":"boolean"
>   },
>   "lhs":{
>     "type":"string",
>     "value":"2:2:"
>   }
> },

Fixing this bug will require

  1. Learning about the of the high level notion of Inline Caches
  2. Learning about the specific way that Inline Caches are implemented in SpiderMonkey: CacheIR (1) (2)
  3. Learn about how the test cases for other binary operator caches are written
  4. Learn how to see the results of the CacheIR attachment process (CACHEIR_LOGS)
  5. Write a new test case that show that adding a boolean to a string doesn't successfully attach
  6. Learn how the existing String+Number code works
  7. Use that knowledge to handle the String+Boolean case.

An excellent, but optional extension to this would be to write a microbenchmark to show the effect of handling this case in the small.

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #1)

Do you have any tips for converting boolean values to strings¹? I'm imitating BinaryArithIRGenerator::tryAttachStringNumberConcat, which leads to needing to imitate CacheIRCompiler::emitCallInt32ToString (i.e. this line but with boolean). I've found CodeGenerator::visitBooleanToString to use as inspiration, but am worried I'm just making myself more confused and ignoring an easier and more simple solution.

1: Based on my reading of the ES6 specifications and my Node shell's behavior, I understand the String+Boolean case should concatenate the string with the boolean word, for example 'string'+false should be 'stringfalse'.

Flags: needinfo?(mgaudet)

Ok, you're definitely on the right path (and great find with visitBooleanToString).

The approach I would take is to add a new CacheIR opcode. Something like BooleanToStringResult; the opcode would be named with the Result suffix because it will fill in the output register of the cache.

You'll implement the code generation for it in CacheIRCompiler.cpp. Because its implementation would work for Ion, you'll also need to add it to the CACHE_IR_SHARED_OPS list.

The CodeGeneration will look a lot like that in visitBooleanToString; though you'll get the names from cx_->runtime()->commonNames.

Other crib points I'd note:

Does visitBooleanToString make enough sense to you? I can explain how it works if you're not used to reading MacroAssembler.

(Note, I haven't tried to compile any of the above :D )

Flags: needinfo?(mgaudet)

(PS: I concur with your spec reading!)

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #3)

Does visitBooleanToString make enough sense to you? I can explain how it works if you're not used to reading MacroAssembler.

I think I need an explanation of it. I've been using searchfox to examine and understand the classes and methods it is using, in order to find and implement analogous methods in BinaryArithIRGenerator, but I am not having much luck forming a good understanding of anything.

Oops, BinaryArithIRGenerator above should be CacheIRCompiler

Hey Adam, here's a quick documentation pieces of MacroAssembler I whipped up:

The SpiderMonkey MacroAssembler is the dominant interface to emitting code. For the most part, it tries to provide a hardware agnostic interface to the emission of native machine code.

Let's use visitBooleanToString as a worked example:

void CodeGenerator::visitBooleanToString(LBooleanToString* lir) {
  Register input = ToRegister(lir->input());
  Register output = ToRegister(lir->output());
  const JSAtomState& names = gen->runtime->names();
  Label true_, done;

  masm.branchTest32(Assembler::NonZero, input, input, &true_);
  masm.movePtr(ImmGCPtr(names.false_), output);
  masm.jump(&done);

  masm.bind(&true_);
  masm.movePtr(ImmGCPtr(names.true_), output);

  masm.bind(&done);
}

Let's go through this bit by bit:

  • Register input = ToRegister(lir->input());: So at the top, we have two Register declarations. These correspond to machine registers (so, r11 or eax etc., depending on the architecture). In this case, we are looking at the IonMonkey code generator, and so the choice of which registers to use was made by the IonMonkey register allocator, so we simply take its decision: this is the ToRegister(...) bits.

  • const JSAtomState& names = gen->runtime->names();: This isn't really related to the MacroAssembler, but suffice it to say JSAtomState holds a variety of pre-determined names, and we're interested in the pointers to the true and false names right now.

  • Label true_, done;: Next we have the declaration of two labels. These correspond to the labels you would put in if you were writing assembly by hand. A label when created isn't actually associated with a particular point in the code. That happens when you masm.bind(&label). You can however branch to or jump to a label, even when it has yet to be bound.

  • masm.branchTest32(Assembler::NonZero, input, input, &true_);: This corresponds to a test-and-branch sequence. In assembly, test usually implies you take two arguments, and bitwise and them together, in order to set processor register flags. Effectively this is saying branch to true if input & input != 0.

  • masm.movePtr(ImmGCPtr(names.false_), output); This moves a pointer value into a register. ImmGCPtr is a decoration that indicates a couple of things: First, we're moving the pointer as an Immediate: that is to say, a constant that will be put directly into the code. The GCPtr portion tells the system that this pointer is a GCPtr or a pointer managed by the garbage collector. We need to tell the macroassmbler about this so it can remember the pointer, and put it in a table for the Garbage Collector so when doing a Moving GC that changes the address of this value, so that the garbage collector can update it.

  • masm.jump(&done);: Un-conditionally jump to the done lable.

  • masm.bind(&true_);: Bind the true label. When something jumps to the true label, we want them to land here in the code stream.

  • masm.movePtr(ImmGCPtr(names.true_), output);: This moves a different pointer into the output register.

  • masm.bind(&done);: Bind the done label.

The way to think of the MacroAssembler is that it's actually outputting code for most of these operations. (Labels turn out to be a bit magical, but it's Ok not to think about it normally).

So, what does this look like in actually emitted code? I added a masm.breakpoint() to just before the branch, and ran the ion tests (../jit-test/jit_test.py --jitflags=all ./dist/bin/js ion). This found me one test case that actually exercised this code path: ../jit-test/jit_test.py --debugger=lldb --jitflags=all ./dist/bin/js ion/bug964229-2.js. I then disassembled the code with dis -s $rip -e $rip+40

        0x2b872bbb5456: testl  %edx, %edx                 ; branchTest32 (the test)
        0x2b872bbb5458: jne    0x2b872bbb546d             ; branchTest32 (the jump to true)
        0x2b872bbb545e: movabsq $0x37c1c9d258a0, %rax     ; movPtr 
        0x2b872bbb5468: jmp    0x2b872bbb5477             ; jump done
true:   0x2b872bbb546d: movabsq $0x37c1c9d27480, %rax     ; movPtr
done:   0x2b872bbb5477: <unrelated instruction> 

I've annotated this with the rough macroassmbler that generated the code. The addresses that the jumps hit are those that the Labels got bound to. The choice of registers made by Ion, we can infer, to be Register input = edx and Register output = rax.

MacroAssembler is of course a very rich interface, and there are definitely lots of pieces I didn't cover here, but hopefully this helps!

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #7)

Hey Adam, here's a quick documentation pieces of MacroAssembler I whipped up:

This is really helpful, thanks for putting this write up together for me, I really appreciate it!

For my patch: I have my additions to a point where my method is compiling and passing the tests I've written for it. I haven't benchmarked my changes yet. I was going to use the instructions here, but I'm not clear which is the currently preferred benchmarking suite for SpiderMonkey.

Do you have/know of any documentation on writing microbenchmarks for SpiderMonkey? I've been looking through searchfox and the MDN but haven't found anything helpful yet. By microbenchmarks, do you mean tests like in this directory? (If so, I've written a test for my changes based on those in that directory.)

I haven't run the full jit-test/jit_test.py and tests/jstests.py suites yet, but I was thinking of getting your feedback on my existing changes before I do that, because the full tests take a very long time on my computer. Would creating a Phabricator revision with my current changes for feedback be alright?

Flags: needinfo?(mgaudet)

Always feel free to through up a work in progress patch! If you want, you can submit as a draft (I think moz-phab submit --draft)

For now, a simple microbenchmark would suffice:

function test(a,b) { return a + b; } 

let start = performance.now();
for (var i = 0; i < BIGNUMBER; i++) { 
  test("is ", true); 
}
print(performance.now() - start);

You'll want to adjust the value of BIGNUMBER to make the test run for at least a second. Then you can disable use an environment variable disable/enable your IC; I tend to do something like static bool enabled = getenv("ENABLED") != nullptr; -- you'll want to disable IonMonkey when benchmarking, because in a simple microbenchmark like this, it's like Ion can ruin your testing by inlining things and not using inline caches where you want it. Do js --no-ion to disable IonMonkey.

Flags: needinfo?(mgaudet)

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #9)

Then you can disable use an environment variable disable/enable your IC;

What am I disabling with an environment variable?

I tend to do something like static bool enabled = getenv("ENABLED") != nullptr;

Where do I put this? It looks like C++ to me, so it would be in the file that triggers IC to begin with?

Yeah -- I'd put that in the tryAtttachStringBool code you've presumably added; just return false if the environment variable isn't enabled. That way you control when the stub gets attached.

so something like

bool CompareIRGenerator::tryAttachStringBool() { 
  static bool enabled = getenv("ENABLED") != nullptr;
  if (!enabled) { return false; } 
}
Assignee: nobody → asorholm

Here are the results from implementing the microbenchmark that you suggested, @Matt:

Tests Performance w/o String+Boolean stubs (ms) Performance w/ String+Boolean stubs (ms)
./dist/bin/js --no-ion -f ../jit-test/tests/cacheir/bench/string-plus-boolean.js 2574.291015625 2355.072998046875
../jit-test/jit_test.py -o --jitflags=baseline --args='--no-ion' ./dist/bin/js cacheir/bench/string-plus-boolean.js 2548.40283203125 2344.177978515625

So the change gives around an 8% improvement in performance for String+Boolean arithmetic.

The contents of cacheir/bench/string-plus-boolean.js:

// |jit-test| --no-ion
// Set environment variable "ENABLED" to non-null values to enable attaching
// of boolean+string stub, and set "ENABLED" to null value to disable attaching
// of boolean+string stub.

// Microbenchmark test to show failure of boolean + String addition
// @see: https://bugzilla.mozilla.org/show_bug.cgi?id=1492995

function test(a,b) { return a + b };

let start = performance.now();
for (var i = 0; i < 1000000; i++) {
  test("is ", true);
}

print(performance.now() - start);

Note: on my machine, I measure 15% on that microbenchmark; it may be you're measuring a Debug build vs. OPT build

I've thrown up a try build here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f3d2bc9b7053b855322cbee009bd8f4e0e85eacf

(In reply to Matthew Gaudet (he/him) [:mgaudet] from comment #14)

Note: on my machine, I measure 15% on that microbenchmark; it may be you're measuring a Debug build vs. OPT build

Oops, you're correct!

I reran the tests with an OPT build and these are my results:

Tests Performance w/o String+Boolean stubs (ms) Performance w/ String+Boolean stubs (ms)
./dist/bin/js --no-ion -f ../jit-test/tests/cacheir/bench/string-plus-boolean.js 109.6708984375 90.802978515625
../jit-test/jit_test.py -o --jitflags=baseline --args='--no-ion' ./dist/bin/js cacheir/bench/string-plus-boolean.js 108.057861328125 86.70703125

Giving around an 18% improvement in performance.

I've thrown up a try build here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f3d2bc9b7053b855322cbee009bd8f4e0e85eacf

Thanks!

Pushed by mgaudet@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fd2bf318a8b2
Adding CacheIR support for String + Boolean. r=mgaudet

Hrm.

I don't yet have an exact answer for what's going on; end of day for me, and I will dig in more tomorrow morning (because the patch -looked- fine to me, but it seems I've missed something)

Here's a couple of odd notes tho:

Guard incorrect?

If I inject masm.printf() calls into the emitBooleanToString(), I notice a curious pattern:

+++ b/js/src/jit/CacheIRCompiler.cpp
@@ -4289,20 +4289,22 @@ bool CacheIRCompiler::emitBooleanToStrin
   Register boolean = allocator.useRegister(masm, reader.objOperandId());
   Register result = allocator.defineRegister(masm, reader.stringOperandId());
   const JSAtomState& names = cx_->names();
   Label true_, done;
 
   masm.branchTest32(Assembler::NonZero, boolean, boolean, &true_);
 
   // False case
+  masm.printf("XXXfalse\n");
   masm.movePtr(ImmGCPtr(names.false_), result);
   masm.jump(&done);
 
   // True case
+  masm.printf("XXXtrue\n");
   masm.bind(&true_);
   masm.movePtr(ImmGCPtr(names.true_), result);
   masm.bind(&done);
 
   return true;
 }

When I run the test case (the stand alone one that was in an earlier revision of the patch), I only see XXXfalse get printed; and if tweak the test case to print the return value of the function-under-test, I see that we only seem to invoke BooleanToString on cases where the input is false; this suggests to me that there's something wierd going in in the guards.

The test case still passes however, on x86.

ARM failure.

If I do a debug build with ARM32 (add --enable-simulator=arm to the configure line of a 32-bit target.) The test case Adam wrote fails straight away:

$ ./dist/bin/js --baseline-eager ../jit-test/tests/cacheir/string-plus-boolean.js 
truetrue
XXXfalse
../jit-test/tests/cacheir/string-plus-boolean.js:14:13 Error: Assertion failed: got "falsetrue", expected "truetrue"
Stack:
  warmup@../jit-test/tests/cacheir/string-plus-boolean.js:14:13
  @../jit-test/tests/cacheir/string-plus-boolean.js:22:1

This could be consistent with the guard being incorrect, and reacting differently on ARM.

Will find out more tomorrow; I'll run this under rr to get an answer.

/me facepalms.

this is why I shouldn't debug late at night. See the diff I posted above? note where the printf to XXXtrue is?

Totally wrong. Anyhow. Still investigating now.

(Anba pointed out the real bug on phabricator, now waiting for revision)

To test on arm32 simulator, I think it should suffice to add this to your configure flags (if on Linux; not so sure about windows): --target=i686-pc-linux --enable-simulator=arm. If that doesn't work, drop by JSAPI and I'm sure someone can help sort it out.

(In reply to Andreea Pavel [:apavel] from comment #17)

Backed out for SM bustages and other failures

I updated my patch to address the issues you outlined in on the Phabricator revision. I pushed my patch to try, and (if I did everything correctly) added the tasks that failed to the revision's jobs.

Here is a link to the try build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6186e82563fb8c61549b72b1419215170e1193f5

Flags: needinfo?(asorholm)
Pushed by mgaudet@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f91e832b0717
Adding CacheIR support for String + Boolean. r=mgaudet
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
You need to log in before you can comment on or make changes to this bug.