Closed
Bug 560370
Opened 14 years ago
Closed 14 years ago
PPC cmov handling logic incorrect
Categories
(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P4)
Tracking
(Not tracked)
RESOLVED
FIXED
Q3 11 - Serrano
People
(Reporter: rreitmai, Unassigned)
Details
(Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)
Attachments
(2 files, 2 obsolete files)
1.98 KB, patch | edwsmith: review+ | Details | Diff | Splinter Review
759 bytes, patch | rreitmai: review+ | Details | Diff | Splinter Review
Here's a code snippet showing bad generation; notice the mr r9,r4 at 00104dbb80, which is clearly wrong.

00104dbb5c  li r12,3 (0x3)    <= restore 3
00104dbb60  li r11,0 (0x0)    <= restore 0
00104dbb64  li r10,6 (0x6)    <= restore 6
00104dbb68  li r4,1 (0x1)     <= restore 1

sti vars[40] = get_cache_handler1
00104dbb6c  stw r3,-16(r31)
sti tags[20] = 0
00104dbb70  stw r11,-64(r31)
eq1 = eq get_cache_handler1, 4
# codegen'd with the cmov
cmov1 = cmov eq1 ? 1 : get_cache_handler1
00104dbb74  cmpwi cr7,r3,4 (0x4)
00104dbb78  beq cr7,0x104dbb80
00104dbb7c  mr r4,r3
0x104dbb80:
00104dbb80  mr r9,r4
sti vars[40] = cmov1
00104dbb84  stw r4,-16(r31)
sti tags[20] = 0
00104dbb88  stw r11,-64(r31)
sti vars[8] = cmov1
00104dbb8c  stw r4,-48(r31)
sti tags[4] = 0
00104dbb90  stw r11,-80(r31)
188130304 = int 188130304
00104dbb94  lis r8,2870 (0xb360000)
00104dbb98  ori r8,r8,41984 (0xa400)
sti varTraits[4] = 188130304
00104dbb9c  stw r8,-120(r31)
sti vars[40] = 1
00104dbba0  stw r9,-16(r31)
sti tags[20] = 6
00104dbba4  stw r10,-64(r31)
sti vars[16] = 1
00104dbba8  stw r9,-40(r31)
sti tags[8] = 6
00104dbbac  stw r10,-76(r31)
Reporter
Comment 1•14 years ago
Duplicate the logic used on the x86 platform, which is more robust in handling the various register permutations.
Attachment #440054 -
Flags: review?(edwsmith)
Updated•14 years ago
Attachment #440054 -
Attachment is patch: true
Attachment #440054 -
Attachment mime type: application/octet-stream → text/plain
Comment 2•14 years ago
Comment on attachment 440054 [details] [diff] [review]
v1

Warning: the logic in the patch isn't the same as x86, and if it is safe anyway, then it's not obvious why. The x86 backend calls asm_cmp last, but the PPC backend does more register allocation work after calling asm_branch (which calls asm_cmp). asm_cmp() must allocate registers for the operands of the compare, which further increases register pressure. It's not obvious that rt (the register for the iftrue operand) will remain in a register during that time.

Instead of asm_branch(), you could underrunProtect(8) (two instructions, check my math); then you don't need any of the long-branch cases in asm_branch() or asm_branch_near(), just the MR(rr,rf) and a single-instruction short branch. Then handle rt, then do asm_cmp. *Then* I think it will match x86's behavior.
Attachment #440054 -
Flags: review?(edwsmith) → review-
Reporter
Comment 3•14 years ago
(In reply to comment #2)
> (From update of attachment 440054 [details] [diff] [review])
> Warning: the logic in the patch isn't the same as x86

True, comment 1 is incorrect; 'mimic' rather than 'duplicate'.

> the ppc backend does more register allocation work after calling asm_branch

Good point, none of the registers are guaranteed to be around after this call. One way we could lock down rt, rf, and rr would be to add a 'dontUseRegisters' parameter to asm_cmp() that forces these regs to be excluded.

> Instead of asm_branch(), you could underrunProtect(8) (two instructions, check
> my math) then you don't need any of the long-branch cases in asm_branch() or
> asm_branch_near(), just the MR(rr,rf) and single-instruction short branch.

Right, that would be a good optimization, but I think we're still left with the larger issue of having to generate an MR() *after* the asm_cmp(). x86 doesn't have this issue, because its sequence is 'cmp a,b ; mov rr,rt ; cmov rr,rf'. Notice no allocation occurs between the cmp and the mov, i.e. the asm_cmp() call is after the mov. For PPC, the sequence is roughly 'mov rr,rt ; cmp a,b ; br x ; mov rr,rf'; thus we have to ensure that rr and rt exist *after* the asm_cmp() call.
Reporter
Comment 4•14 years ago
Updated to reflect prior comments/concerns; not yet heavily tested.
Attachment #440054 -
Attachment is obsolete: true
Attachment #440324 -
Flags: review?
Reporter
Updated•14 years ago
Attachment #440324 -
Flags: review? → review?(edwsmith)
Updated•14 years ago
Status: NEW → ASSIGNED
Reporter
Comment 5•14 years ago
passes all testing.
Comment 6•14 years ago
Comment on attachment 440324 [details] [diff] [review]
v2

Now that you are using asm_branch_near (good!), you can make the call to asm_cmp() be last, which will avoid the need to add an exclusion set to asm_cmp(). And the register manipulation code will correspond 1:1 with the x86 backend.
Attachment #440324 -
Flags: review?(edwsmith) → review-
Updated•14 years ago
Priority: -- → P2
Reporter
Comment 7•14 years ago
(In reply to comment #6)
> you can make the call to asm_cmp() be last

Hmm, comment 3 shows why this can't be done in the PPC case. Am I missing something?
Reporter
Comment 8•14 years ago
Gotcha; you simply want to move the MR *between* the cmp and branch. Done in the latest patch. Also squeezed underrunProtect down to 8 and put the release logic prior to asm_cmp(), which more closely matches the x86 version.
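For illustration only (the register names rr/rt/rf, the condition register, and the label are schematic, not taken from the patch), the emitted instruction shape this discussion converges on is roughly:

```text
cmpwi  cr7, ra, imm    ; compare (asm_cmp; called last, since nanojit emits code backwards)
mr     rr, rt          ; the MR, now placed between the compare and the branch
beq    cr7, Ldone      ; single-instruction short branch (asm_branch_near)
mr     rr, rf          ; fall-through: overwrite with the iffalse value
Ldone:
```

If the condition holds, the branch skips the second mr and rr keeps the iftrue value; otherwise rr is overwritten with the iffalse value, so no register needs to survive past the asm_cmp() call.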
Attachment #440324 -
Attachment is obsolete: true
Attachment #440867 -
Flags: review?(edwsmith)
Reporter
Comment 9•14 years ago
(In reply to comment #8)
> underrunProtect down to 8

Not true, the patch still has it as 16, but that's still correct, just more conservative.
Comment 10•14 years ago
Comment on attachment 440867 [details] [diff] [review]
v3

Looks right; assuming it passes tests it should be good to go.
Attachment #440867 -
Flags: review?(edwsmith) → review+
Reporter
Comment 11•14 years ago
pushed to http://hg.mozilla.org/projects/nanojit-central/rev/8047bc5db3b1
Group: tamarin-security
Reporter
Updated•14 years ago
Whiteboard: WE-2572294 → WE-2572294, fixed-in-nanojit,
Whiteboard: WE-2572294, fixed-in-nanojit, → WE-2572294, fixed-in-nanojit, fixed-in-tr
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tr → WE-2572294, fixed-in-nanojit, fixed-in-tamarin
Comment 12•14 years ago
This re-strengthens the assert in asm_cmov() by fixing the typo in the isS14 macro. (bd should be i).
Attachment #441521 -
Flags: review?(rreitmai)
Reporter
Comment 13•14 years ago
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

I'm hesitant to + this; could we use & masking the bits instead? Something like:

(((uint32)(x)) & 0xfff) == (uint32_t)(x)
Comment 14•14 years ago
http://hg.mozilla.org/tracemonkey/rev/20583089b31b
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin → WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey
Comment 15•14 years ago
http://hg.mozilla.org/mozilla-central/rev/20583089b31b
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reporter
Updated•14 years ago
Attachment #441521 -
Flags: review?(rreitmai) → review+
Comment 16•14 years ago
(In reply to comment #13)
> (From update of attachment 441521 [details] [diff] [review])
> I'm hesitant to + this, could we use & masking the bits instead?
>
> something like : (((uint32)(x)) & 0xfff) == (uint32_t)(x)

That only works correctly for positive values of x. To handle positive and negative values of x, shifting is the way to go.
Updated•14 years ago
Assignee: rreitmai → edwsmith
Comment 17•14 years ago
Assigning to me and re-opening to fix the macro bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•14 years ago
Priority: P2 → P4
Target Milestone: flash10.1 → flash10.2
Comment 18•14 years ago
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

NJ: http://hg.mozilla.org/projects/nanojit-central/rev/9b65dfcae9f7
Updated•14 years ago
Assignee: edwsmith → nobody
Comment 19•14 years ago
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

Fixup imported to TR: http://hg.mozilla.org/tamarin-redux/rev/5ec0793eebe7
Comment 20•14 years ago
TR: http://hg.mozilla.org/tamarin-redux/rev/aa5a0da409e9
Comment 21•14 years ago
TM (fixup): http://hg.mozilla.org/tracemonkey/rev/f4588c09ff8b
Comment 22•14 years ago
http://hg.mozilla.org/mozilla-central/rev/732dbc8fa456
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Updated•13 years ago
Flags: flashplayer-bug+