Closed Bug 560370 Opened 14 years ago Closed 14 years ago

PPC cmov handling logic incorrect

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P4)

PowerPC
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: rreitmai, Unassigned)

Details

(Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)

Attachments

(2 files, 2 obsolete files)

Here's a code snippet showing the bad generation; notice the mr r9,r4 at 00104dbb80, which is clearly wrong.

00104dbb5c   li r12,3 (0x3)  <= restore 3
00104dbb60   li r11,0 (0x0)  <= restore 0
00104dbb64   li r10,6 (0x6)  <= restore 6
00104dbb68   li r4,1 (0x1)  <= restore 1
    sti vars[40] = get_cache_handler1
00104dbb6c   stw r3,-16(r31)
    sti tags[20] = 0
00104dbb70   stw r11,-64(r31)
    eq1 = eq get_cache_handler1, 4       # codegen'd with the cmov
    cmov1 = cmov eq1 ? 1 : get_cache_handler1
00104dbb74   cmpwi cr7,r3,4 (0x4)
00104dbb78   beq cr7,0x104dbb80
00104dbb7c   mr r4,r3
0x104dbb80:
00104dbb80   mr r9,r4
    sti vars[40] = cmov1
00104dbb84   stw r4,-16(r31)
    sti tags[20] = 0
00104dbb88   stw r11,-64(r31)
    sti vars[8] = cmov1
00104dbb8c   stw r4,-48(r31)
    sti tags[4] = 0
00104dbb90   stw r11,-80(r31)
    188130304 = int 188130304
00104dbb94   lis r8,2870 (0xb360000)
00104dbb98   ori r8,r8,41984 (0xa400)
    sti varTraits[4] = 188130304
00104dbb9c   stw r8,-120(r31)
    sti vars[40] = 1
00104dbba0   stw r9,-16(r31)
    sti tags[20] = 6
00104dbba4   stw r10,-64(r31)
    sti vars[16] = 1
00104dbba8   stw r9,-40(r31)
    sti tags[8] = 6
00104dbbac   stw r10,-76(r31)
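
In plain terms, the LIR pair above computes the following (a hypothetical C++ rendering, not generated code; get_cache_handler1 stands in for the loaded handler value):

#include <cstdint>

// eq1   = eq get_cache_handler1, 4
// cmov1 = cmov eq1 ? 1 : get_cache_handler1
static int32_t cmov1_of(int32_t get_cache_handler1) {
    bool eq1 = (get_cache_handler1 == 4);
    return eq1 ? 1 : get_cache_handler1;
}

Reading the surrounding stores, r4 both holds the restored constant 1 and gets picked as the cmov result register, so by the time mr r9,r4 tries to copy the constant into r9, r4 may already hold get_cache_handler1 instead.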
Attached patch v1 (obsolete) — Splinter Review
Duplicate the logic used on the x86 platform, which is more robust in handling the various register permutations.
Attachment #440054 - Flags: review?(edwsmith)
Attachment #440054 - Attachment is patch: true
Attachment #440054 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 440054 [details] [diff] [review]
v1

Warning: the logic in the patch isn't the same as x86, and if it is safe anyway, then it's not obvious why. The x86 backend calls asm_cmp last, but the PPC backend does more register allocation work after calling asm_branch (which calls asm_cmp). asm_cmp() must allocate registers for the operands of the compare, which further increases register pressure. It's not obvious that rt (the register for the iftrue operand) will remain in a register during that time.

Instead of asm_branch(), you could underrunProtect(8) (two instructions, check my math); then you don't need any of the long-branch cases in asm_branch() or asm_branch_near(), just the MR(rr,rf) and a single-instruction short branch.

Then handle rt, then do asm_cmp. *Then* I think it will match x86's behavior.
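
For reference, a minimal standalone sketch of the ordering being suggested (stand-in stubs only, not the real Assembler members; underrunProtect, MR, asm_cmp, and the beq shape are names taken from this discussion, and the deque simply models the back-to-front code emission the nanojit backends use, where whatever the routine emits last is the first instruction executed):

#include <cstdio>
#include <deque>
#include <string>

static std::deque<std::string> code;
static void emit(const std::string& ins) { code.push_front(ins); }  // back-to-front

static void underrunProtect(int bytes) { (void)bytes; }             // reserve buffer space (stub)
static void MR(const char* rd, const char* rs) { emit(std::string("mr    ") + rd + "," + rs); }
static void BEQ_short(const char* target)      { emit(std::string("beq   ") + target); }
static void asm_cmp()                           { emit("cmpwi cr7,ra,imm"); } // may allocate operand regs

static void asm_cmov_sketch() {
    underrunProtect(8);      // room for the two fixed-size instructions below
    MR("rr", "rf");          // iffalse move, skipped when the condition holds
    BEQ_short("over_mr");    // single-instruction short branch over the MR
    MR("rr", "rt");          // make sure rr holds the iftrue value (if it doesn't already)
    asm_cmp();               // called last, so the compare executes first
}

int main() {
    asm_cmov_sketch();
    for (const auto& ins : code) std::printf("%s\n", ins.c_str());
}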
Attachment #440054 - Flags: review?(edwsmith) → review-
(In reply to comment #2)
> (From update of attachment 440054 [details] [diff] [review])
> Warning: the logic in the patch isn't the same as x86

True, comment 1 is incorrect; it should have said 'mimic' rather than 'duplicate'.

> the ppc backend does more register allocation work after calling asm_branch

Good point, none of the registers are guaranteed to be around after this call. One way we could lock down rt, rf, and rr would be to add a 'dontUseRegisters' parameter to asm_cmp() that forces these regs to be excluded.

> Instead of asm_branch(), you could underrunProtect(8) (two instructions, check
> my math) then you don't need any of the long-branch cases in asm_branch() or
> asm_branch_near(), just the MR(rr,rf) and single-instruction short branch.

Right, that would be a good optimization, but I think we're still left with the larger issue of having to generate an MR() *after* the asm_cmp().

x86 doesn't have this issue, because the sequence is 'cmp a,b  mov rr,rt  cmov rr,rf'. Notice that no allocation occurs between the cmp and the mov, i.e. the asm_cmp() call comes after the mov.

For PPC, the sequence is roughly 'mov rr,rt  cmp a,b  br x  mov rr,rf'; thus we have to ensure that rr and rt exist *after* the asm_cmp() call.
Attached patch v2 (obsolete) — Splinter Review
Updated to reflect prior comments and concerns; not yet heavily tested.
Attachment #440054 - Attachment is obsolete: true
Attachment #440324 - Flags: review?
Attachment #440324 - Flags: review? → review?(edwsmith)
Whiteboard: WE-2572294
Assignee: nobody → rreitmai
Target Milestone: --- → flash10.1
Status: NEW → ASSIGNED
passes all testing.
Comment on attachment 440324 [details] [diff] [review]
v2

Now that you are using asm_branch_near (good!), you can make the call to asm_cmp() be last, which will avoid the need to add an exclusion set to asm_cmp(), and the register-manipulation code will correspond 1:1 with the x86 backend.
Attachment #440324 - Flags: review?(edwsmith) → review-
Priority: -- → P2
(In reply to comment #6)
> you can make the call to asm_cmp() be last

hmm, comment 3 shows why this can't be done in the PPC case.  Am I missing something?
Attached patch v3Splinter Review
Gotcha; you simply want to move the MR *between* the cmp and the branch (which is safe, since mr does not modify the condition register set by the compare). Done in the latest patch. Also squeezed underrunProtect down to 8 and put the release logic prior to asm_cmp(), which more closely matches the x86 version.
Attachment #440324 - Attachment is obsolete: true
Attachment #440867 - Flags: review?(edwsmith)
(In reply to comment #8)
>  underrunProtect down to 8 

Not true; the patch still has it at 16, but that's still correct, just more conservative.
Comment on attachment 440867 [details] [diff] [review]
v3

Looks right; assuming it passes tests, it should be good to go.
Attachment #440867 - Flags: review?(edwsmith) → review+
pushed to http://hg.mozilla.org/projects/nanojit-central/rev/8047bc5db3b1
Group: tamarin-security
Whiteboard: WE-2572294 → WE-2572294, fixed-in-nanojit,
Whiteboard: WE-2572294, fixed-in-nanojit, → WE-2572294, fixed-in-nanojit, fixed-in-tr
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tr → WE-2572294, fixed-in-nanojit, fixed-in-tamarin
This re-strengthens the assert in asm_cmov() by fixing the typo in the isS14 macro.  (bd should be i).
Attachment #441521 - Flags: review?(rreitmai)
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

I'm hesitant to + this, could we use & masking the bits instead?

something like : (((uint32)(x)) & 0xfff) == (uint32_t)(x)
http://hg.mozilla.org/tracemonkey/rev/20583089b31b
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin → WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/20583089b31b
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #441521 - Flags: review?(rreitmai) → review+
(In reply to comment #13)
> (From update of attachment 441521 [details] [diff] [review])
> I'm hesitant to + this, could we use & masking the bits instead?
> 
> something like : (((uint32)(x)) & 0xfff) == (uint32_t)(x)

That only works correctly for positive values of x. To handle both positive and negative values of x, shifting is the way to go.
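
To make the point concrete, here is a standalone sketch (not the actual nanojit macro; isS14_shift and isS14_mask are names invented here, and the mask is widened to 14 bits): the shift form sign-extends the low 14 bits and so handles negative displacements, while the mask form compares against a zero-extended value and rejects them.

#include <cstdint>
#include <cstdio>

// Shift-based check: sign-extend the low 14 bits and compare with the
// original value (32 - 14 = 18 bits are shifted out and back; assumes the
// usual two's-complement arithmetic right shift).
static bool isS14_shift(int32_t x) {
    return (int32_t(uint32_t(x) << 18) >> 18) == x;
}

// Mask-based check in the style suggested above: it rejects every negative
// value, because a negative in-range displacement never equals its masked
// (zero-extended) form.
static bool isS14_mask(int32_t x) {
    return (uint32_t(x) & 0x3fff) == uint32_t(x);
}

int main() {
    const int32_t samples[] = { 0, 1, 8191, 8192, -1, -8192, -8193 };
    for (int32_t x : samples)
        std::printf("%6d  shift:%d  mask:%d\n", x, isS14_shift(x), isS14_mask(x));
    // A signed 14-bit field holds -8192..8191; only the shift form gets the
    // negative half of that range right.
}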
Assignee: rreitmai → edwsmith
Assigning to me and re-opening to fix the macro bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Priority: P2 → P4
Target Milestone: flash10.1 → flash10.2
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

NJ: http://hg.mozilla.org/projects/nanojit-central/rev/9b65dfcae9f7
Assignee: edwsmith → nobody
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

fixup imported to TR:
http://hg.mozilla.org/tamarin-redux/rev/5ec0793eebe7
http://hg.mozilla.org/mozilla-central/rev/732dbc8fa456
Status: REOPENED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Flags: flashplayer-bug+