Open Bug 1752520 Opened 2 years ago Updated 2 years ago

Ion's RA: after splitting, spill bundle interferes with other fragments

Categories

(Core :: JavaScript Engine: JIT, enhancement, P2)

enhancement

Tracking

()

People

(Reporter: jseward, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file)

This

(func (param $p1 i64) (result i64)
  (local.get $p1)
)

results, on x64-linux, in the following function body, omitting
frame-management insns:

00000024  48 8b cf                  mov %rdi, %rcx
00000027  48 8b c1                  mov %rcx, %rax
0000002B  c3                        ret

One wonders why it didn't move %rdi directly into %rax. Debug printing makes
this clear:

  LIR instructions (Pre-allocation LIR)
    Block 0
      2-3 WasmParameter [def v1<g>:rdi]
      4-5 WasmParameter [def v2<g>:r14]
      6-7 WasmReturnI64 [use v2:F:r14] [use v1:F:rax]

Live range and bundle construction proceed as expected:

  Live ranges by virtual register (after liveness analysis):
    v1 3-7 { 3_def 7_v1:F:rax }
    v2 5-7 { 5_def 7_v2:F:r14 }

  Live ranges by bundle (after grouping/queueing regs):
    LB0(parent=none v1 3-7 { 3_def 7_v1:F:rax })
    LB1(parent=none v2 5-7 { 5_def 7_v2:F:r14 })

It is LB0 of interest here. It has conflicting fixed-register requirements
(defined by %rdi, used by %rax) and so must be split, which it duly is:

  Allocating LB0(parent=none v1 3-7 { 3_def 7_v1:F:rax }) [prio 5] [weight 800]
    Requirement rdi, fixed by definition
    Requirement rax, due to use at 7
    Splitting LB0(parent=none v1 3-7 { 3_def 7_v1:F:rax }) ..
    .. bundle does not contain hot code
    .. bundle is defined by a register
    .. bundle's last use is a register use
    .. into:
      LB3(parent=LB2 v1 3-3 { 3_def })
      LB4(parent=LB2 v1 6-7 { 7_v1:F:rax })
      LB2(parent=none v1 4-7 { })

Leaving aside the question of why it is split into 3 bundles rather than 2, we
get two somewhat-plausible bundles LB3 and LB4 which are attached to their
"parent" spill-bundle LB2. (The fact that LB2 has no parent indicates it is a
spill bundle).

However, LB2 interferes with (overlaps) LB4 at positions 6 and 7. We move on to
the main allocation loop, which defers LB2 and allocates LB4 to %rax
(reasonably) and LB3 to %rdi (also reasonably):

  Allocating LB2(parent=none v1 4-7 { }) [prio 4] [weight 0]
    postponed spill (no hint or register requirement)
  Allocating LB4(parent=LB2 v1 6-7 { 7_v1:F:rax }) [prio 2] [weight 2000000]
    Requirement rax, due to use at 7
    allocated to rax
  Allocating LB3(parent=LB2 v1 3-3 { 3_def }) [prio 1] [weight 2000000]
    Requirement rdi, fixed by definition
    allocated to rdi

Finally, the spill-bundle allocation loop processes LB2. It notices that LB2
overlaps with LB4, which rules out %rax. It chooses instead %rcx, hence forcing
a move through that:

  Spill or allocate LB2(parent=none v1 4-7 { })
    rax collides with LB4(parent=LB2 v1 6-7 rax { 7_v1:F:rax }) [weight 2000000]
    allocated to rcx

This is clearly a poor result. But it isn't clear what the intent of the
original authors was:

  • is it correct that the split produces produces overlapping fragments?

    • if no, why has this happened?
    • if yes, what would this even mean, for a value to be live in two different
      bundles at the same program point?
    • if yes, how is it intended to avoid the use of an extra register?
  • why split the bundle into 3 pieces? Splitting it into 2 pieces would have
    resolved the fixed-register-use conflict.

As a newcomer to this code it is disconcerting to see bundle splitting producing
overlapping fragments. A comfortable-sounding invariant would be that the
results of a split exactly cover the original, with no holes and no overlaps.
But that's obviously not the case here.


Temporarily kludging BacktrackingAllocator::splitAt() so as to not generate
overlaps for this specific example ..

    Splitting LB0(parent=none v1 3-7 { 3_def 7_v1:F:rax }) ..
    .. into:
      LB3(parent=LB2 v1 3-3 { 3_def })
      LB4(parent=LB2 v1 6-7 { 7_v1:F:rax })
      LB2(parent=none v1 4-5 { })

does "fix" the problem: LB2 no longer overlaps either of its children. Hence
the spill-bundle allocation loop allocates LB2 to %rax

  Spill or allocate LB2(parent=none v1 4-5 { })
    allocated to rax

and we get the result we want:

00000024  48 8b c7                  mov %rdi, %rax
00000028  c3                        ret

However, it's not clear [to me] whether or not it's just by luck that LB2 got
allocated to the same register (%rax) that LB4 got allocated to, hence avoiding
the need for a move at the LB2->LB4 boundary (program point in between 5 and 6).
If the original bundle had been split into only two pieces rather than three
then this question would be moot.

Blocks: sm-jits
Type: defect → enhancement
Priority: -- → P1
See Also: → 1752582

For a more general assessment of splitting invariants, see bug 1752582.

Severity: -- → N/A
Priority: P1 → P2

Comment 0:

However, it's not clear [to me] whether or not it's just by luck that LB2 got
allocated to the same register (%rax) that LB4 got allocated to

Here's similar example. It too has conflicting fixed-reg uses. Reading the
log implies that the comment 0 case really did wind up with LB2 allocated to
%rax just by chance. And hence that merely de-overlapping the split bundles
won't fix the problem in general -- the "floating" parent (spill) bundle could
well end up being assigned to a third register, hence generating an unnecessary
MoveGroup. The full log is in the attachment; I just show the highlights here.

LB3833 has a fixed-reg conflict and is split into three pieces, with the spill
bundle LB3840 overlapping one child, LB3843:

Splitting LB3833(parent=none v3 7-18 { 7_def:F:rdx 18_v3:F:rcx }) ..
.. bundle does not contain hot code
.. bundle is defined by a register
.. bundle's last use is a register use
.. into:
  LB3842(parent=LB3840 v3 7-7 { 7_def:F:rdx })
  LB3843(parent=LB3840 v3 18-18 { 18_v3:F:rcx })
  LB3840(parent=none v3 8-18 { })

LB3842 and LB3843 are given %rdx and %rcx respectively, since that's
unavoidable:

Allocating LB3842(parent=LB3840 v3 7-7 { 7_def:F:rdx }) [priority 1] [weight 2000000]
  Requirement rdx, fixed by definition
  allocated to rdx
Allocating LB3843(parent=LB3840 v3 18-18 { 18_v3:F:rcx }) [priority 1] [weight 2000000]
  Requirement rcx, due to use at 18
  allocated to rcx

Now we come to the spill bundle, LB3840. We'd hope that it would get either
%rdx or %rcx so we could get away in total with just one MoveGroup. That
doesn't happen for whatever reason, but what is of significance is what the
search loop does:

Spill or allocate LB3844(parent=none v4 10-18 { })
  rax collides with LB3848(parent=none v2 6-14 rax { 12_v2:A } ## v7 16-16 rax { 16_v7:A }) [weight 200]
  rcx collides with LB3850(parent=LB3848 v2 14-14 rcx { 14_v2:A } ## v7 15-15 rcx { 15_def }) [weight 1500]
  rdx collides with LB3838(parent=none v8 17-18 rdx { 17_def 18_v8:F:rdx }) [weight 2000]
  rbx collides with LB3840(parent=none v3 8-18 rbx { }) [weight 0]
  rsi collides with LB3836(parent=none v6 13-18 rsi { 13_def 18_v6:F:rsi }) [weight 666]
  rdi collides with LB3831(parent=none v1 3-18 rdi { 3_def:F:rdi 18_v1:F:rdi }) [weight 250]
  r8 collides with LB3847(parent=LB3844 v4 18-18 r8 { 18_v4:F:r8 }) [weight 2000000]
  allocated to r9

Imagine that %rax had been free here. Then LB3840 would have been allocated it
regardless of the status of %rdx and %rcx, even though one of the latter would
have been a better choice.

Observations re comment 2:

(1) If the spill bundle didn't overlap its children, then there would be at
least the possibility that it could later be allocated to the same register
as one or more of the children. This would open the way to removing the
spurious move, but wouldn't guarantee removal, since removal also relies on
"getting lucky" as described in comment 2.

(2) If, when the spill bundle was created (as a result of splitting a bundle
with a fixed-reg conflict), the spill bundle was "hinted" to use the
register(s) involved in the conflict, the chances of removing the spurious
move are much higher, since this removes the need to "get lucky".

Blocks: sm-regalloc
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: