Bug 560370 (Closed): Opened 15 years ago, Closed 15 years ago

PPC cmov handling logic incorrect

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P4)

PowerPC
macOS

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: rreitmai, Unassigned)

Details

(Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)

Attachments

(2 files, 2 obsolete files)

Here's a code snippet showing bad generation; notice the mr r9,r4 at 00104dbb80, which is clearly wrong.

00104dbb5c  li r12,3 (0x3)        <= restore 3
00104dbb60  li r11,0 (0x0)        <= restore 0
00104dbb64  li r10,6 (0x6)        <= restore 6
00104dbb68  li r4,1 (0x1)         <= restore 1
    sti vars[40] = get_cache_handler1
00104dbb6c  stw r3,-16(r31)
    sti tags[20] = 0
00104dbb70  stw r11,-64(r31)
    eq1 = eq get_cache_handler1, 4    # codegen'd with the cmov
    cmov1 = cmov eq1 ? 1 : get_cache_handler1
00104dbb74  cmpwi cr7,r3,4 (0x4)
00104dbb78  beq cr7,0x104dbb80
00104dbb7c  mr r4,r3
0x104dbb80:
00104dbb80  mr r9,r4
    sti vars[40] = cmov1
00104dbb84  stw r4,-16(r31)
    sti tags[20] = 0
00104dbb88  stw r11,-64(r31)
    sti vars[8] = cmov1
00104dbb8c  stw r4,-48(r31)
    sti tags[4] = 0
00104dbb90  stw r11,-80(r31)
    188130304 = int 188130304
00104dbb94  lis r8,2870 (0xb360000)
00104dbb98  ori r8,r8,41984 (0xa400)
    sti varTraits[4] = 188130304
00104dbb9c  stw r8,-120(r31)
    sti vars[40] = 1
00104dbba0  stw r9,-16(r31)
    sti tags[20] = 6
00104dba4  stw r10,-64(r31)
    sti vars[16] = 1
00104dbba8  stw r9,-40(r31)
    sti tags[8] = 6
00104dbbac  stw r10,-76(r31)
Attached patch v1 (obsolete) — Splinter Review
Duplicate the logic used on the x86 platform, which is more robust in handling the various register permutations.
Attachment #440054 - Flags: review?(edwsmith)
Attachment #440054 - Attachment is patch: true
Attachment #440054 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 440054 [details] [diff] [review]
v1

Warning: the logic in the patch isn't the same as x86, and if it is safe anyway, then it's not obvious why. The x86 backend calls asm_cmp last, but the ppc backend does more register allocation work after calling asm_branch (which calls asm_cmp). asm_cmp() must allocate registers for the operands of the compare, which further increases register pressure. It's not obvious that rt (the register for the iftrue operand) will remain in a register during that time.

Instead of asm_branch(), you could underrunProtect(8) (two instructions, check my math); then you don't need any of the long-branch cases in asm_branch() or asm_branch_near(), just the MR(rr,rf) and a single-instruction short branch. Then handle rt, then do asm_cmp. *Then* I think it will match x86's behavior.
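A sketch of the two-instruction tail the review is asking for, assuming fixed 4-byte PPC instructions (so underrunProtect(8) reserves exactly enough room); the label and register names here are illustrative, not from the patch:

```
  beq  cr7,skip     ; single short conditional branch (4 bytes)
  mr   rr,rf        ; MR(rr,rf): overwrite the result with the iffalse value (4 bytes)
skip:
```

Since nanojit emits code backwards, the MR is generated first and the branch second; with a fixed, known-short displacement like this, none of the long-branch machinery in asm_branch() is needed.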
Attachment #440054 - Flags: review?(edwsmith) → review-
(In reply to comment #2)
> (From update of attachment 440054 [details] [diff] [review])
> Warning: the logic in the patch isn't the same as x86

True, comment 1 is incorrect; 'mimic' rather than 'duplicate'.

> the ppc backend does more register allocation work after calling asm_branch

Good point; none of the registers are guaranteed to be around after this call. One way we could lock down rt, rf, and rr would be to add a 'dontUseRegisters' parameter to asm_cmp() that forces these regs to be excluded.

> Instead of asm_branch(), you could underrunProtect(8) (two instructions, check
> my math) then you don't need any of the long-branch cases in asm_branch() or
> asm_branch_near(), just the MR(rr,rf) and single-instruction short branch.

Right, that would be a good optimization, but I think we're still left with the larger issue of having to generate an MR() *after* the asm_cmp(). x86 doesn't have this issue, because its sequence is 'cmp a,b; mov rr,rt; cmov rr,rf'. Notice that no allocation occurs between the cmp and the mov, i.e. the asm_cmp() call comes after the mov. For PPC, the sequence is roughly 'mov rr,rt; cmp a,b; br x; mov rr,rf'; thus we have to ensure that rr and rt exist *after* the asm_cmp() call.
Attached patch v2 (obsolete) — Splinter Review
Updated to reflect prior comments/concerns; not yet heavily tested.
Attachment #440054 - Attachment is obsolete: true
Attachment #440324 - Flags: review?
Attachment #440324 - Flags: review? → review?(edwsmith)
Whiteboard: WE-2572294
Assignee: nobody → rreitmai
Target Milestone: --- → flash10.1
Status: NEW → ASSIGNED
Passes all testing.
Comment on attachment 440324 [details] [diff] [review]
v2

Now that you are using asm_branch_near (good!), you can make the call to asm_cmp() be last, which will avoid the need to add an exclusion set to asm_cmp(). And the register manipulation code will correspond 1:1 with the x86 backend.
Attachment #440324 - Flags: review?(edwsmith) → review-
Priority: -- → P2
(In reply to comment #6)
> you can make the call to asm_cmp() be last

Hmm, comment 3 shows why this can't be done in the PPC case. Am I missing something?
Attached patch v3Splinter Review
Gotcha; you simply want to move the MR *between* the cmp and branch. Done in the latest patch. Also squeezed underrunProtect down to 8 and put the release logic prior to asm_cmp(), which more closely matches the x86 version.
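If I'm reading the v3 description right, the resulting execution-order sequence is roughly the following (a sketch assembled from the comments above, not the actual patch; register names are illustrative):

```
  cmp  a,b        ; emitted by asm_cmp(), now the last call in (backwards) generation order
  mr   rr,rt      ; moved between compare and branch; mr does not touch the condition register
  beq  skip
  mr   rr,rf
skip:
```

Because nanojit emits instructions last-to-first, making asm_cmp() the final call means no register allocation happens after rr and rt are pinned, which is exactly the x86 ordering described in comment 3.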
Attachment #440324 - Attachment is obsolete: true
Attachment #440867 - Flags: review?(edwsmith)
(In reply to comment #8)
> underrunProtect down to 8

Not true; the patch still has it as 16, but that's still correct, just more conservative.
Comment on attachment 440867 [details] [diff] [review]
v3

Looks right; assuming it passes tests, it should be good to go.
Attachment #440867 - Flags: review?(edwsmith) → review+
Group: tamarin-security
Whiteboard: WE-2572294 → WE-2572294, fixed-in-nanojit,
Whiteboard: WE-2572294, fixed-in-nanojit, → WE-2572294, fixed-in-nanojit, fixed-in-tr
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tr → WE-2572294, fixed-in-nanojit, fixed-in-tamarin
This re-strengthens the assert in asm_cmov() by fixing the typo in the isS14 macro (bd should be i).
Attachment #441521 - Flags: review?(rreitmai)
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

I'm hesitant to + this; could we use & to mask the bits instead? Something like: (((uint32)(x)) & 0xfff) == (uint32_t)(x)
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin → WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Attachment #441521 - Flags: review?(rreitmai) → review+
(In reply to comment #13)
> (From update of attachment 441521 [details] [diff] [review])
> I'm hesitant to + this, could we use & masking the bits instead?
>
> something like : (((uint32)(x)) & 0xfff) == (uint32_t)(x)

That only works correctly for positive values of x. To handle both positive and negative values of x, shifting is the way to go.
Assignee: rreitmai → edwsmith
Assigning to me and re-opening to fix the macro bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Priority: P2 → P4
Target Milestone: flash10.1 → flash10.2
Assignee: edwsmith → nobody
Comment on attachment 441521 [details] [diff] [review] Fix the isS14 macro, then use it in asm_cmov() fixup imported to TR: http://hg.mozilla.org/tamarin-redux/rev/5ec0793eebe7
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Flags: flashplayer-bug+