Closed Bug 1210554 Opened 9 years ago Closed 9 years ago

ARM64: Branch target out of range after inserting constant pools

Categories

(Core :: JavaScript Engine: JIT, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla45
Tracking Status: firefox45 --- fixed

People

(Reporter: jolesen, Assigned: jolesen)

References

Details

Attachments

(11 files, 4 obsolete files)

7.31 KB, patch, sfink: review+
9.36 KB, patch, sstangl: review+
7.05 KB, patch, sstangl: review+
16.80 KB, patch, sstangl: review+
20.98 KB, patch, sstangl: review+
2.17 KB, patch, sstangl: review+
1.49 KB, patch, sfink: review+
15.28 KB, patch
14.43 KB, patch, nbp: review+
12.27 KB, patch
15.47 KB, patch
Build SpiderMonkey with --enable-sim=arm64.
$  dist/bin/js --baseline-eager ../js/src/jit-test/tests/debug/Script-getLineOffsets-06.js 
Assertion failure: is_int19(imm19), at /Users/jolesen/gecko-dev/js/src/jit/arm64/vixl/Assembler-vixl.h:1638
Segmentation fault: 11

A branch target goes out of range once the sizes of the inserted constant pools are incorporated:

$ lldb -- dist/bin/js --baseline-eager ../js/src/jit-test/tests/debug/Script-getLineOffsets-06.js 
(lldb) target create "dist/bin/js"
Current executable set to 'dist/bin/js' (x86_64).
(lldb) settings set -- target.run-args  "--baseline-eager" "../js/src/jit-test/tests/debug/Script-getLineOffsets-06.js"
(lldb) r
Process 90592 launched: '/Users/jolesen/gecko-dev/obj-a64simdev/dist/bin/js' (x86_64)
Assertion failure: is_int19(imm19), at /Users/jolesen/gecko-dev/js/src/jit/arm64/vixl/Assembler-vixl.h:1638
Process 90592 stopped
* thread #1: tid = 0xb0c0d, 0x0000000100a4d323 js`vixl::Assembler::ImmCondBranch(imm19=313127) + 67 at Assembler-vixl.h:1638, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100a4d323 js`vixl::Assembler::ImmCondBranch(imm19=313127) + 67 at Assembler-vixl.h:1638
   1635	  }
   1636	
   1637	  static Instr ImmCondBranch(int imm19) {
-> 1638	    VIXL_ASSERT(is_int19(imm19));
   1639	    return truncate_to_int19(imm19) << ImmCondBranch_offset;
   1640	  }
   1641	
(lldb) p/x imm19
(int) $0 = 0x0004c727
(lldb) bt
* thread #1: tid = 0xb0c0d, 0x0000000100a4d323 js`vixl::Assembler::ImmCondBranch(imm19=313127) + 67 at Assembler-vixl.h:1638, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000100a4d323 js`vixl::Assembler::ImmCondBranch(imm19=313127) + 67 at Assembler-vixl.h:1638
    frame #1: 0x0000000100a43ea4 js`vixl::Assembler::b(at=0x0000000105ecace8, imm19=313127, cond=eq) + 36 at MozAssembler-vixl.cpp:118
    frame #2: 0x0000000100a454b7 js`vixl::MozBaseAssembler::RetargetNearBranch(i=0x0000000105ecace8, byteOffset=1252508, final=false) + 247 at MozAssembler-vixl.cpp:497
    frame #3: 0x0000000100a021a5 js`js::jit::AssemblerBufferWithConstantPools<1024ul, 4ul, vixl::Instruction, vixl::MozBaseAssembler>::patchBranch(this=0x00007fff5fbf8e30, i=0x0000000105ecace8, curpool=842, branch=(offset = 1828)) + 405 at IonAssemblerBufferWithConstantPools.h:891
    frame #4: 0x00000001009cc268 js`js::jit::AssemblerBufferWithConstantPools<1024ul, 4ul, vixl::Instruction, vixl::MozBaseAssembler>::executableCopy(this=0x00007fff5fbf8e30, dest_="\x9f#") + 472 at IonAssemblerBufferWithConstantPools.h:914
    frame #5: 0x00000001009b5341 js`js::jit::Assembler::executableCopy(this=0x00007fff5fbf8d68, buffer="\x9f#") + 49 at Assembler-arm64.cpp:143
    frame #6: 0x000000010078ca28 js`js::jit::JitCode::copyFrom(this=0x0000000105ca3520, masm=0x00007fff5fbf8d68) + 72 at Ion.cpp:793
    frame #7: 0x000000010080b4cd js`js::jit::JitCode* js::jit::Linker::newCode<(js::AllowGC)1>(this=0x00007fff5fbf8b38, cx=0x0000000103745800, kind=BASELINE_CODE) + 637 at Linker.h:72

In the patchBranch() frame, the branch offset crosses the permitted +/- 1 MB boundary for an arm64 conditional branch:

(lldb) p/x Assembler::GetBranchOffset(ci)
(ptrdiff_t) $7 = 0x00000000000d0f28
(lldb) p/x offset
(ptrdiff_t) $8 = 0x0000000000131c9c

The difference is the accumulated size of the constant pools inserted between the branch and its target: the raw offset 0xd0f28 (855,848 bytes) is still within the +/- 1 MB (0x100000) range of a conditional branch, but the pool-adjusted offset 0x131c9c (1,252,508 bytes) is not.
See the discussion in bug 1207827. Constant pools and branches with limited range should be dealt with together.
Alright, here's the approach I am going with for handling limited-range branches in ARM64:

- Assembling a branch to a bound label is the easy case: If the label is in range, branch directly to the label, otherwise generate two instructions:

  b.cond !cond, skip
  b label
skip:

This is already implemented in MacroAssembler::B(Label* label, Condition cond).

- Assembling a branch to an unbound label is trickier: we don't know whether the label will be in range of the branch by the time it is bound. Optimistically assume that it will be, and emit a short-range branch. Also register the branch's deadline with the AssemblerBuffer, which will detect when the branch is about to go out of range.

Before the branch goes out of range, AssemblerBuffer will emit a branch veneer (preferably along with a constant pool). The branch veneer is an unconditional branch targeting the still unbound label:

original_branch:
  b.cond veneer

  ... lots of data

veneer:
  b label

The linked list of branches associated with the unbound label will contain both the original short-range branch and the veneer branch. Assembler::bind() can tell them apart because it won't be able to patch the short-range branch.
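Putting the two cases together, the branch-emission logic can be sketched as follows. This is a minimal sketch, assuming illustrative helper names; registerBranchDeadline() is the real buffer interface introduced in the patches below:

    // Sketch: conditional branch to a label, covering bound and unbound cases.
    void MacroAssembler::B(Label* label, Condition cond) {
      if (label->bound() && CondBranchCanReach(label)) {  // illustrative check
        // Bound and in range: the imm19 field (+/- 1 MB) reaches directly.
        b(label, cond);
      } else if (label->bound()) {
        // Bound but too far: invert the condition and hop over an
        // unconditional branch, which has a +/- 128 MB range.
        Label skip;
        b(&skip, InvertCondition(cond));  // b.cond !cond, skip
        b(label);                         // b label
        bind(&skip);                      // skip:
      } else {
        // Unbound: optimistically emit a short-range branch, linked into
        // the label's use list, and register the last buffer offset the
        // branch immediate could still reach as its deadline.
        BufferOffset branch = b(label, cond);
        BufferOffset deadline(branch.getOffset() + CondBranchForwardRange);
        armbuffer_.registerBranchDeadline(CondBranchRangeType, deadline);
      }
    }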
Assignee: nobody → jolesen
Also minor fixes to the AssemblerBuffer class:

- Tighten encapsulation / data hiding.
- Use consistent types size_t + void* for raw byte data.
This is the data structure that will be used to keep track of
unresolved forward short-range branches.
AssemblerBufferWithConstantPools geta a branchDeadlines_ member which
keeps track of forward branch to unbound labels.

Add a hasSpaceForInsts() method which collects the logic for checking
for available space in one place. Insert a constant pool both when
constant pool loads are about to go out of range, and when short-range
branch deadlines are about to expire.

Add registerBranchDeadline() and unregisterBranchDeadline() methods
that the assembler will use to add and remove branches to be tracked.
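A minimal sketch of how these pieces fit together, assuming helper names that may differ from the patch:

    // Sketch: can numInsts instructions plus numPoolEntries constant pool
    // entries be appended without flushing the current pool?
    bool hasSpaceForInsts(unsigned numInsts, unsigned numPoolEntries) const
    {
        size_t nextOffset = sizeExcludingCurrentPool() + numInsts * InstSize;
        // 1. Constant pool loads must still be able to reach the pool if
        //    it were dumped right after the new instructions.
        if (numPoolEntries && !poolLoadsInRange(nextOffset, numPoolEntries))
            return false;
        // 2. No registered short-range branch deadline may expire before
        //    there is still room to emit the guard branch and a veneer.
        if (!branchDeadlines_.empty()) {
            size_t deadline = branchDeadlines_.earliestDeadline().getOffset();
            if (nextOffset + guardSize_ * InstSize > deadline)
                return false;
        }
        return true;
    }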
Test the existing functionality of AssemblerBufferWithConstantPools
using a fake ISA that is much more constrained than ARM and ARM64.

Document the Assembler callbacks that are required to use
AssemblerBufferWithConstantPools, and implement mock versions for the
unit test.
This is the second part of the short branch handling in
AssemblerBufferWithConstantPools. The PatchShortRangeBranchToVeneer()
callback is called from finishPool() to patch short-range branches that
are about to expire.

Implement no-op versions of the callback for ARM and ARM64. These
versions will never be called as long as no short-range branches are
registered. They only exist to prevent linker errors in unoptimized
builds. In an optimized build, the unused function calls will be
optimized out because DeadlineSet<0>::empty() is hardwired to return
true.
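For reference, the ARM no-op version amounts to something like this sketch (the exact signature is whatever the buffer template expects):

    // Never called: ARM registers no short-range branches, and because
    // DeadlineSet<0>::empty() is hardwired to return true, optimized
    // builds drop the call entirely. This definition exists only to
    // satisfy the linker in unoptimized builds.
    void
    Assembler::PatchShortRangeBranchToVeneer(ARMBuffer*, unsigned rangeIdx,
                                             BufferOffset deadline,
                                             BufferOffset veneer)
    {
        MOZ_CRASH("Unexpected short-range branch");
    }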
We already have an ARM64 ImmBranchType which classifies the branch
instructions in the ISA. The /range/ classification is required because
we need unique small integers to pass to
AssemblerBufferWithConstantPools::registerBranchDeadline(). The b.cond
and cbz instructions have the same range, but different branch types.

Classify the 32 KB and 1 MB range branches as 'short-range'. Request
that these branch ranges be tracked by the new
AssemblerBufferWithConstantPools::NumShortBranchRanges facility.

Also add two functions for computing the maximum forward and backward
reach of branches, given their range enumerator.
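The resulting classification looks roughly like this (sketch; the comments give the ARM64 encodings, and the exact enumerator spelling may differ from the patch):

    // Sketch: branch ranges ordered so that the short ones come first and
    // double as range indexes for registerBranchDeadline().
    enum ImmBranchRangeType {
      TestBranchRangeType,    // tbz/tbnz: imm14, +/- 32 KB
      CondBranchRangeType,    // b.cond/cbz/cbnz: imm19, +/- 1 MB
      UncondBranchRangeType,  // b/bl: imm26, +/- 128 MB
      UnknownBranchRangeType,

      // Short-range branch types to be tracked by the assembler buffer.
      NumShortBranchRangeTypes = UncondBranchRangeType
    };

    // Maximum forward and (negative) backward reach, in bytes, of a
    // branch with the given range type.
    static ptrdiff_t ImmBranchMaxForwardOffset(ImmBranchRangeType range);
    static ptrdiff_t ImmBranchMinBackwardOffset(ImmBranchRangeType range);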
Instead of storing byte offsets in the branch instructions using a
label, store instruction offsets, just like the finished branches do.
Use a 0 pc offset to terminate the linked list instead of -1.

This increases the maximum distance between linked branches to be the
same as the range of the branch instructions. Previously, the
supported range was only 1/4 of what the branch instructions can
encode.

Provide protected functions for manipulating the linked list in
MozBaseAssembler, and rewrite Assembler::bind() and retarget() to use
them instead of decoding branches manually.

Move the LinkAndGet*OffsetTo functions into MozBaseAssembler. Our
version of these functions is completely different from the VIXL
versions.
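The new list encoding can be sketched as follows; this closely mirrors the helpers quoted in the reviews below, lightly simplified:

    // Sketch: a linked branch's immediate field holds the distance, in
    // instructions, to the next use of the same unbound label. A branch
    // can never target itself, so 0 is free to act as the terminator.
    BufferOffset MozBaseAssembler::NextLink(BufferOffset cur)
    {
      Instruction* link = getInstructionAt(cur);
      ptrdiff_t offset = link->ImmPCRawOffset();  // raw instruction count
      if (offset == 0)                            // end of the use list
        return BufferOffset();
      return BufferOffset(int(cur.getOffset() + offset * kInstructionSize));
    }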
Add a branch range argument to LinkAndGetOffsetTo(): ARM64 branches
can't encode arbitrary ranges, so the linked list of unbound label uses
needs some consideration. We can't assume that a newly assembled branch
instruction will be able to point backwards to label->offset().

Change LinkAndGetOffsetTo() to a normal function instead of a template.
We don't need the code duplication just to apply different scale
factors. Throw the premature microoptimizers a bone by replacing the
element_size template argument with its logarithm.

Implement Assembler::PatchShortRangeBranchToVeneer() to insert the
veneer branch after the original short-range branch in the linked list
of uses of the unbound label.

Fix Assembler::bind() to understand that not all branches can reach the
label. Verify that these branches jump to a veneer instead.

Register short-range branches in LinkAndGetOffsetTo(), and unregister
them again in Assembler::bind().
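The splice itself boils down to a fragment like this (sketch; ImmPCRawOffset() and EncodeOffset() match names quoted in the reviews below, the control flow is paraphrased):

    // Sketch: the pool machinery has reserved room for one unconditional
    // branch at 'veneer'. Hand the short branch's old next-link to the
    // veneer, then make the veneer the short branch's next link.
    Instruction* branchInst = getInstructionAt(branch);
    Instruction* veneerInst = getInstructionAt(veneer);
    ptrdiff_t next = branchInst->ImmPCRawOffset();  // 0 = end of list
    if (next == 0) {
      Assembler::b(veneerInst, 0);  // veneer becomes the end of the list
    } else {
      // Rebase the link so it still names the same instruction.
      ptrdiff_t delta = (branch.getOffset() - veneer.getOffset()) / kInstructionSize;
      Assembler::b(veneerInst, next + delta);
    }
    // Finally, point the short-range branch at the in-range veneer.
    branchInst->SetImmPCRawOffset(EncodeOffset(branch, veneer));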
When handed a call that had been disabled by ToggleCall(), this
function would crash.
Attachment #8685645 - Flags: review?(sphink)
Attachment #8685646 - Flags: review?(sstangl)
Attachment #8685647 - Flags: review?(nicolas.b.pierron)
Attachment #8685648 - Flags: review?(nicolas.b.pierron)
Attachment #8685649 - Flags: review?(nicolas.b.pierron)
Attachment #8685650 - Flags: review?(nicolas.b.pierron)
Attachment #8685651 - Flags: review?(sstangl)
Attachment #8685652 - Flags: review?(sstangl)
Attachment #8685653 - Flags: review?(sstangl)
Attachment #8685654 - Flags: review?(sstangl)
Attachment #8685645 - Flags: review?(sphink) → review+
The js::jit::AssemblerBuffer class has incompatible definitions in
the js/src/jit/shared and js/src/jit/x86-shared directories.
Attachment #8686175 - Flags: review?(sphink)
Comment on attachment 8685647 [details] [diff] [review]
Implement BranchDeadlineSet. r=nbp

Review of attachment 8685647 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/shared/IonAssemblerBuffer.h
@@ +58,5 @@
>      }
> +
> +    // Comparator for use with std::sort etc.
> +    struct Less {
> +        bool operator()(BufferOffset a, BufferOffset b) const {

Any reason not to overload operator<(const BufferOffset& b)?

::: js/src/jit/shared/IonAssemblerBufferWithConstantPools.h
@@ +105,5 @@
> +    // The offsets in each vector are always kept in ascending order.
> +    //
> +    // This means that as forward branches are added to the assembler buffer,
> +    // their deadlines will always be appended to the vector corresponding to
> +    // their range. If long-range and short-range branches shared the same

Start the previous sentence with "Assuming that we have different vectors for different ranges, this means …"
And for the moment remove the hypothetical case of having multiple ranges.

@@ +145,5 @@
> +        earliest_ = BufferOffset();
> +        for (unsigned r = 0; r < NumRanges; r++) {
> +            auto& vec = vectorForRange(r);
> +            if (!vec.empty() &&
> +                (!earliest_.assigned() || vec[0].getOffset() < earliest_.getOffset())) {

style-nit: as the condition spans multiple lines, move the opening brace to a new line.
Attachment #8685647 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8685648 [details] [diff] [review]
Wire up branchDeadlines_ partially. No Asm callbacks yet. r=nbp

Review of attachment 8685648 [details] [diff] [review]:
-----------------------------------------------------------------

(re-ask for review when answered)

::: js/src/jit/shared/IonAssemblerBufferWithConstantPools.h
@@ +406,5 @@
>  };
>  
>  
>  // The InstSize is the sizeof(Inst) but is needed here because the buffer is
>  // defined before the Instruction.

nit: Merge this comment with the one below.

@@ +631,5 @@
>      static const unsigned NO_DATA = unsigned(-2);
>  
> +    // Check if it is possible to add numInst instructions and numPoolEntries
> +    // constant pool entries without needing to flush the current pool.
> +    bool hasSpaceForInsts(unsigned numInsts, unsigned numPoolEntries) const

I have a question about some unlikely case:

Let's say I have an assembler which needs 2 instructions to make any jump, and that instructions have size 1.
I have 4 jumps recorded in the list of deadlines, under 2 different range indexes:

  (0, BufferOffset(20)) (0, BufferOffset(22))
  (1, BufferOffset(21)) (1, BufferOffset(23)) 

As I understand it, the following will only trigger if the pool end goes beyond the first deadline.

So my question is, how do we ensure that we have enough instruction space for such cases?  Or is this example not realistic, so we should assert against it?

Similarly, we could end up in a loop of veneer forward branches if the size needed for encoding a jump is as large as the space between 2 deadlines.

@@ +773,5 @@
> +    // rangeIdx
> +    //   A number < NumShortBranchRanges identifying the range of the branch.
> +    //
> +    // deadline
> +    //   The highest buffer offset the the short-range branch can reach

typo-nit: s/the the/the/
Attachment #8685648 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8685649 [details] [diff] [review]
Implement constant pool test. r=nbp

Review of attachment 8685649 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/jsapi-tests/testAssemblerBuffer.cpp
@@ +319,5 @@
> +    //  12: pool header
> +    //  16: poolData
> +    //  20: 2
> +    //
> +    ab.putInt(1);

It was not clear to me at first that putInt(1) is used to write instructions.
Maybe it would make sense to use 0x22220000 as an instruction prefix for any random instruction.
Attachment #8685649 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8685650 [details] [diff] [review]
Add PatchShortRangeBranchToVeneer(). r=nbp

Review of attachment 8685650 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi-tests/testAssemblerBuffer.cpp
@@ +459,5 @@
> +    CHECK_EQUAL(*ab.getInst(br1), 0xb1bb00cc);
> +    CHECK_EQUAL(*ab.getInst(br2), 0xb1bb0d2d);
> +
> +    // Cancel one of the pending branches.
> +    ab.unregisterBranchDeadline(1, BufferOffset(off.getOffset() + TestAssembler::BranchRange));

I am not sure I understand this use case yet; do you have an example to clarify it?

@@ +488,5 @@
> +    CHECK_EQUAL(*ab.getInst(br2), 0xb3bb0000 + 20);         // br2 pc+20 (patched)
> +    CHECK_EQUAL(*ab.getInst(BufferOffset(28)), 0xb0bb0010); // br pc+16 (guard)
> +    CHECK_EQUAL(*ab.getInst(BufferOffset(32)), 0xffff0000); // pool header 0 bytes (not counting veneers.)
> +    CHECK_EQUAL(*ab.getInst(BufferOffset(36)), 0xb2bb00cc); // veneer1 w/ original 'cc' offset.
> +    CHECK_EQUAL(*ab.getInst(BufferOffset(40)), 0xb2bb0d2d); // veneer2 w/ original 'cc' offset.

comment-nit: veneer 2 w/ original 'd2d' offset.
Attachment #8685650 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Nicolas B. Pierron [:nbp] from comment #17)
> Comment on attachment 8685648 [details] [diff] [review]
> I have a question about some unlikely case:
> 
> Let's say I have an assembler which needs 2 instructions to make any
> jump, and that instructions have size 1.
> I have 4 jumps recorded in the list of deadlines, under 2 different
> range indexes:
> 
>   (0, BufferOffset(20)) (0, BufferOffset(22))
>   (1, BufferOffset(21)) (1, BufferOffset(23)) 
> 
> As I understand it, the following will only trigger if the pool end
> goes beyond the first deadline.
> 
> So my question is, how do we ensure that we have enough instruction
> space for such cases?  Or is this example not realistic, so we should
> assert against it?
> 
> Similarly, we could end up in a loop of veneer forward branches if the
> size needed for encoding a jump is as large as the space between 2
> deadlines.

That's a good point. It is unlikely, but I don't think we can ignore it. ARM64 has two different branch ranges: the tbz/tbnz instructions can branch 32 KB ahead while the cbz/cbnz/b.cond conditional branches can branch 1 MB ahead. It would be possible to have a tbz and a cbz branch that happen to have the same deadline. In that case, we would only reserve space for one veneer, and we would not be able to add a veneer for the second branch.

This is a conservative fix: In hasSpaceForInsts(), make sure that there is room for NumShortBranchRanges veneers after the end of the pool, and that the last of the veneers is placed before the earliest deadline. Something like:

            if (deadline + guardSize_ < poolEnd + NumShortBranchRanges * guardSize_)
                return false;
 
This is assuming that guardSize_ is the size of an unconditional branch.

The problem I describe above is not exactly what you asked about. It fixes this case:

  (0, BufferOffset(20)) (1, BufferOffset(20))

I had initially thought that this was good enough because there is a limited density to the branches. The next deadlines in each range must be a whole instruction size after the earliest deadline. But you are right, that is not good enough. In the worst case, the deadlines can be packed so densely that we have NumShortBranchRanges new deadlines expiring for every one veneer inserted.

Assuming 2-byte branches as in your example, this worst case scenario is possible:

  (0, BufferOffset(20)) (1, BufferOffset(20))
  (0, BufferOffset(22)) (1, BufferOffset(22))
  (0, BufferOffset(24)) (1, BufferOffset(24))

The six veneers for these branches would have to be inserted no later than at offset 14:

14: veneer for (0,20)
16: veneer for (1,20)
18: veneer for (0,22)
20: veneer for (1,22)
22: veneer for (0,24)
24: veneer for (1,24)

This is an annoying problem because a situation like this is extremely unlikely to occur. I do think that it should be addressed, though. I don't want to build a JIT compiler that is just extremely likely to work.

It would be possible to compute the necessary margin by looking at the front of the deadline queues. Suppose the deadlines for all ranges were merge-sorted into one array d[]. Then we want the smallest i > 0 such that:

  d[i] - d[0] >= i * branchSize, or i = len(d)

The earliest deadline should be adjusted by (i-1)*branchSize to accommodate the 'decompression' of the dense branches.

We will find that i=1 99.9% of the time. With that assumption, it should be possible to compute i efficiently.
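A direct transcription of that search, assuming the merged array d[] is already sorted:

    // Sketch: by how many bytes must the earliest deadline be moved up to
    // leave room for veneers, when each veneer occupies branchSize bytes?
    static ptrdiff_t
    VeneerMargin(const ptrdiff_t* d, size_t len, ptrdiff_t branchSize)
    {
        // Find the smallest i > 0 such that d[i] - d[0] >= i * branchSize,
        // or i == len. Nearly always exits on the first iteration.
        size_t i = 1;
        while (i < len && d[i] - d[0] < ptrdiff_t(i) * branchSize)
            i++;
        // Adjust the earliest deadline by (i - 1) * branchSize.
        return ptrdiff_t(i - 1) * branchSize;
    }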
(In reply to Nicolas B. Pierron [:nbp] from comment #19)
> Comment on attachment 8685650 [details] [diff] [review]
> Add PatchShortRangeBranchToVeneer(). r=nbp
> 
> Review of attachment 8685650 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jsapi-tests/testAssemblerBuffer.cpp
> @@ +459,5 @@
> > +    CHECK_EQUAL(*ab.getInst(br1), 0xb1bb00cc);
> > +    CHECK_EQUAL(*ab.getInst(br2), 0xb1bb0d2d);
> > +
> > +    // Cancel one of the pending branches.
> > +    ab.unregisterBranchDeadline(1, BufferOffset(off.getOffset() + TestAssembler::BranchRange));
> 
> I am not sure I understand this use case yet; do you have an example to
> clarify it?

When you bind a label before the branches expire (the common case), Assembler::bind() is going to call unregisterBranchDeadline() because the branch won't need a veneer after all.

I'll clarify this in a comment in the test case.
Comment on attachment 8686175 [details] [diff] [review]
Remove testAssemblerBuffer from unified build.

Review of attachment 8686175 [details] [diff] [review]:
-----------------------------------------------------------------

I'm certainly ok with non-unifying testAssemblerBuffer, but I'd like to better understand the problem to be sure there isn't an additional fix needed to identify a similar problem more reliably.

Why are there two different definitions of jit::AssemblerBuffer? Or, more specifically, which one is testAssemblerBuffer using, and would we ever have anything else forcibly using that same one?

I don't know which patch testAssemblerBuffer is in, but is it testing an AssemblerBuffer that would never get used by the spidermonkey compiled along with this, in some configurations? Would it be better to skip the test in those cases?

Sorry, I don't want to dig up the relevant code, but I'd like to understand what the real issue is here. (That said, I'm fine with landing this as-is for now.)
Attachment #8686175 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink, :s:] from comment #22)
> Comment on attachment 8686175 [details] [diff] [review]
> Remove testAssemblerBuffer from unified build.
> 
> Review of attachment 8686175 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm certainly ok with non-unifying testAssemblerBuffer, but I'd like to
> better understand the problem to be sure there isn't an additional fix
> needed to identify a similar problem more reliably.
> 
> Why are there two different definitions of jit::AssemblerBuffer? Or, more
> specifically, which one is testAssemblerBuffer using, and would we ever have
> anything else forcibly using that same one?
> 
> I don't know which patch testAssemblerBuffer is in, but is it testing an
> AssemblerBuffer that would never get used by the spidermonkey compiled along
> with this, in some configurations? Would it be better to skip the test in
> those cases?

Thanks, Steve

One version of jit::AssemblerBuffer is defined in jit/x86-shared/AssemblerBuffer-x86-shared.h
It is used when JS_CODEGEN_X86 or JS_CODEGEN_X64 is defined.

The other version is in jit/shared/IonAssemblerBuffer.h.
This one is used for JS_CODEGEN_ARM, JS_CODEGEN_ARM64, JS_CODEGEN_MIPS32, and JS_CODEGEN_MIPS64.

The tests in jsapi-tests/testAssemblerBuffer.cpp cover the second version. Since this version is supposed to be target-independent, the test exercises the code regardless of which JS_CODEGEN_* is set.

One AssemblerBuffer version is a template class, and the other version is a plain non-template class, so there is no chance of linkage collisions in jsapi-tests. But that is probably more luck than we deserve.

An alternative solution would be to disable testAssemblerBuffer.cpp under JS_CODEGEN_X86/JS_CODEGEN_X64.
Ok, thanks. Given that the jit/shared one is intended to be target-independent, it seems worth testing that it is, which is exactly what you're doing already by enabling testAssemblerBuffer.cpp for all arches. I'm fine with your patch as-is. Having the same name seems a little worrisome, but I'm fine relying on the linker to complain if that's ever an issue.
Updated patch after nbp review:
- Implement relational operator overloads for BufferOffset.
- Add size() and maxRangeSize() methods to BranchDeadlineSet.
- Tweak comment.
Attachment #8685647 - Attachment is obsolete: true
Updated patch after nbp review:
- Tweaked comments.
- Addressed nbp's "unlikely case" in hasSpaceForInsts().

Rather than implementing a lazy merge sort to detect how much headroom is
required for veneers, I realized that a more conservative approach can be much
faster:

- Since branches and veneers have the same size, there is no problem when we
  only have one branch range. Each branch deadline must be at least an
  instruction size after its predecessor within the same range class.

- In an architecture with multiple branch range classes, the same applies to the
  range class that happens to have the most deadlines registered. We would be
  able to emit all veneers for this range, but deadlines for the remaining ranges
  may cause trouble. Therefore, reserve space for all veneers that don't belong
  to the range with the most deadlines.

For ARM64, there will be two ranges: 1 MB for normal conditional branches and
cbz/cbnz, and 32 KB for tbz/tbnz. Most of the time, there will be more pending
1 MB branches, so we simply reserve space for all the tbz/tbnz branches. This is
unlikely to be more than a handful, so we won't lose much.
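With the size() and maxRangeSize() methods added above, the reservation in hasSpaceForInsts() comes out to something like this fragment (sketch; variable names are illustrative):

    // Sketch: deadlines in the most-populated range class are already at
    // least one instruction apart, so only the remaining deadlines need
    // extra headroom reserved for their veneers.
    size_t minority = branchDeadlines_.size() - branchDeadlines_.maxRangeSize();
    size_t deadline = branchDeadlines_.earliestDeadline().getOffset();
    // nextOffset is the buffer offset after the proposed instructions.
    if (nextOffset + (guardSize_ + minority) * InstSize > deadline)
        return false;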
Attachment #8687492 - Flags: review?(nicolas.b.pierron)
Attachment #8685648 - Attachment is obsolete: true
Updated patch:
- Use 0x2222xxxx as fake arithmetic instruction opcodes.
Attachment #8685649 - Attachment is obsolete: true
Updated patch after nbp review:
- Clarify use of unregisterBranchDeadline().
Attachment #8685650 - Attachment is obsolete: true
Attachment #8685646 - Flags: review?(sstangl) → review+
Comment on attachment 8685651 [details] [diff] [review]
Add enum ImmBranchRangeType. r=sstangl

Review of attachment 8685651 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/vixl/MozBaseAssembler-vixl.h
@@ +40,5 @@
>  
>  
>  class MozBaseAssembler;
> +typedef js::jit::AssemblerBufferWithConstantPools<1024, 4, Instruction, MozBaseAssembler,
> +                                                  NumShortBranchRangeTypes> ARMBuffer;

This patch probably won't apply by itself, since it's missing the relevant modifications to AssemblerBufferWithConstantPools.
Attachment #8685651 - Flags: review?(sstangl) → review+
Comment on attachment 8685652 [details] [diff] [review]
Change representation of unbound Label linked lists. r=sstangl

Review of attachment 8685652 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/Assembler-arm64.cpp
@@ +229,2 @@
>  
> +    while (branchOffset.assigned()) {

Oh this is so much nicer.

::: js/src/jit/arm64/vixl/MozAssembler-vixl.cpp
@@ +67,5 @@
> +    Instruction* link = getInstructionAt(cur);
> +    // Raw encoded offset.
> +    ptrdiff_t offset = link->ImmPCRawOffset();
> +    // End of the list is encoded as 0.
> +    if (offset == 0)

kEndOfLabelUseList?

@@ +71,5 @@
> +    if (offset == 0)
> +        return BufferOffset();
> +    // The encoded offset is the number of instructions to move.
> +    offset *= kInstructionSize;
> +    offset += cur.getOffset();

Prefer |ptrdiff_t nextOffset = cur.getOffset() + offset * kInstructionSize|.

@@ +108,2 @@
>    if (armbuffer_.oom())
> +    return 0;

kEndOfLabelUseList

@@ +118,5 @@
>    if (!label->used()) {
>      // The label is unbound and unused: store the offset in the label itself
>      // for patching by bind().
>      label->use(branch.getOffset());
> +    return 0;

kEndOfLabelUseList

@@ +128,2 @@
>    label->use(branch.getOffset());
> +  MOZ_ASSERT(offset != 0);

kEndOfLabelUseList. This code is much nicer!

@@ +143,5 @@
>  }
>  
> +ptrdiff_t
> +MozBaseAssembler::LinkAndGetPageOffsetTo(BufferOffset branch, Label* label)
> +{

There are a bunch of such changes to this file. They should all be changed back to only one line, since this file follows VIXL style, and non-comment lines are given 100 columns in both SM and VIXL styles.
Attachment #8685652 - Flags: review?(sstangl) → review+
Comment on attachment 8685653 [details] [diff] [review]
Dynamically track short-range branches. r=sstangl

Review of attachment 8685653 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/vixl/MozAssembler-vixl.cpp
@@ +111,3 @@
>  ptrdiff_t
> +MozBaseAssembler::LinkAndGetOffsetTo(BufferOffset branch, ImmBranchRangeType branchRange,
> +                                     unsigned elementBits, Label* label)

The name elementBits is misleading. Maybe "elementShift" would be clearer?

@@ +122,5 @@
>      return label_offset - branch_offset;
>    }
>  
> +  // Keep track of short-range branches targeting unbound labels. We may need
> +  // to insert veneers in PatchShortRangeBranchToVeneer() below.

PatchShortRangeBranchToVeneer doesn't actually insert a veneer -- it just gets passed the veneer's address. It's not clear from this patch what does actually create the veneer, but I assume that's coming up.

@@ +137,2 @@
>      label->use(branch.getOffset());
>      return 0;

kEndOfLabelUseList? Maybe?

@@ +145,5 @@
> +
> +  // What is the earliest buffer offset that would be reachable by the branch
> +  // we're about to add?
> +  ptrdiff_t earliestReachable =
> +    branch.getOffset() + Instruction::ImmBranchMinBackwardOffset(branchRange);

This looks like it might fit in the 100 column limit.

@@ +152,5 @@
> +  // new branch, we can simply insert the new branch at the front of the list.
> +  if (label->offset() >= earliestReachable) {
> +      ptrdiff_t offset = EncodeOffset(branch, BufferOffset(label));
> +      label->use(branch.getOffset());
> +      MOZ_ASSERT(offset != 0);

kEndOfLabelUseList

@@ +185,5 @@
> +  } while (next.assigned());
> +  SetNextLink(exbr, branch);
> +
> +  // This branch becomes the new end of the list.
> +  return 0;

kEndOfLabelUseList

@@ +491,5 @@
> +  // Verify that the branch range matches what's encoded.
> +  MOZ_ASSERT(Instruction::ImmBranchTypeToRange(branchInst->BranchType()) == branchRange);
> +
> +  // We want to insert veneer after branch in the linked list of instructions
> +  // that use the same unbound label.

This code doesn't actually do any inserting... the veneer space is already allocated and passed in via BufferOffset.

@@ +493,5 @@
> +
> +  // We want to insert veneer after branch in the linked list of instructions
> +  // that use the same unbound label.
> +  // The veneer should be an unconditional branch.
> +  ptrdiff_t offset = branchInst->ImmPCRawOffset();

instructionOffset?

@@ +496,5 @@
> +  // The veneer should be an unconditional branch.
> +  ptrdiff_t offset = branchInst->ImmPCRawOffset();
> +
> +  // If offset is 0, this is the end of the linked list.
> +  if (offset == 0) {

kEndOfLabelUseList

@@ +502,5 @@
> +  } else {
> +      // Make the offset relative to veneer so it targets the same instruction
> +      // as branchInst.
> +      offset *= kInstructionSize;
> +      offset += branch.getOffset() - veneer.getOffset();

ptrdiff_t byteOffset = (instructionOffset * kInstructionSize) + branch.getOffset() - veneer.getOffset();

@@ +507,5 @@
> +      Assembler::b(veneerInst, offset / kInstructionSize);
> +  }
> +
> +  // Now point branchInst at veneer. See also SetNextLink() above.
> +  branchInst->SetImmPCRawOffset(EncodeOffset(branch, veneer));

Does anything assert that veneer is actually in range of branchInst?
Attachment #8685653 - Flags: review?(sstangl) → review+
Comment on attachment 8685654 [details] [diff] [review]
Handle toggled calls in CodeFromJump(). r=sstangl

Review of attachment 8685654 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/Assembler-arm64.cpp
@@ +486,5 @@
>  
>  static JitCode*
>  CodeFromJump(JitCode* code, uint8_t* jump)
>  {
>      Instruction* branch = (Instruction*)jump;

Calling this branch is a bad idea, since it's not actually a branch. I haven't thought of any better names, though. Maybe just "inst".

@@ +506,3 @@
>          target = (uint8_t*)branch->ImmPCOffsetTarget();
> +    } else if (branch->IsLDR()) {
> +        // This is an ldr+blr call that is enabled.

MOZ_ASSERT that the nextInst()->IsBLR()?

@@ +507,5 @@
> +    } else if (branch->IsLDR()) {
> +        // This is an ldr+blr call that is enabled.
> +        target = (uint8_t*)branch->Literal64();
> +    } else if (branch->IsADR()) {
> +        // This is a disabled call: adr+nop. See ToggleCall().

MOZ_ASSERT that the nextInst()->IsNOP()?
Attachment #8685654 - Flags: review?(sstangl) → review+
(In reply to Sean Stangl [:sstangl] from comment #31)
> > +  // Now point branchInst at veneer. See also SetNextLink() above.
> > +  branchInst->SetImmPCRawOffset(EncodeOffset(branch, veneer));
> 
> Does anything assert that veneer is actually in range of branchInst?

Yes, SetImmPCRawOffset() eventually VIXL_ASSERT's that the offset can be encoded.

For security reasons, we may want to change those to release asserts, though.
Comment on attachment 8687492 [details] [diff] [review]
Wire up branchDeadlines_ partially. No Asm callbacks yet.

Review of attachment 8687492 [details] [diff] [review]:
-----------------------------------------------------------------

patch comment: s/geta a/get a/

::: js/src/jit/shared/IonAssemblerBufferWithConstantPools.h
@@ +652,5 @@
>      static const unsigned NO_DATA = unsigned(-2);
>  
> +    // Check if it is possible to add numInst instructions and numPoolEntries
> +    // constant pool entries without needing to flush the current pool.
> +    bool hasSpaceForInsts(unsigned numInsts, unsigned numPoolEntries) const

Do we have a way to assert the correctness of these arguments, as valid upper bounds of the generated code?
Attachment #8687492 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Nicolas B. Pierron [:nbp] from comment #34)
> > +    // Check if it is possible to add numInst instructions and numPoolEntries
> > +    // constant pool entries without needing to flush the current pool.
> > +    bool hasSpaceForInsts(unsigned numInsts, unsigned numPoolEntries) const
> 
> Do we have a way to assert the correctness of these arguments, as valid
> upper bounds of the generated code?

Yes.

This function is normally called with (1,0) or (1,1) arguments before inserting a single instruction. It is called from enterNoPool() with the user-provided maxInst argument. There is a dedicated mechanism for verifying that the user doesn't attempt to insert more no-pool instructions than promised.

In general, the instruction encoding functions assert that the encoded offsets on constant pool loads and branches are in range.