Closed Bug 560370 Opened 14 years ago Closed 14 years ago

PPC cmov handling logic incorrect

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P4)

PowerPC
macOS
defect

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: rreitmai, Unassigned)

Details

(Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)

Attachments

(2 files, 2 obsolete files)

Here's a code snippet showing the bad generation; notice the mr r9,r4 at 00104dbb80, which is clearly wrong.

00104dbb5c   li r12,3 (0x3)  <= restore 3
00104dbb60   li r11,0 (0x0)  <= restore 0
00104dbb64   li r10,6 (0x6)  <= restore 6
00104dbb68   li r4,1 (0x1)  <= restore 1
    sti vars[40] = get_cache_handler1
00104dbb6c   stw r3,-16(r31)
    sti tags[20] = 0
00104dbb70   stw r11,-64(r31)
    eq1 = eq get_cache_handler1, 4       # codegen'd with the cmov
    cmov1 = cmov eq1 ? 1 : get_cache_handler1
00104dbb74   cmpwi cr7,r3,4 (0x4)
00104dbb78   beq cr7,0x104dbb80
00104dbb7c   mr r4,r3
0x104dbb80:
00104dbb80   mr r9,r4
    sti vars[40] = cmov1
00104dbb84   stw r4,-16(r31)
    sti tags[20] = 0
00104dbb88   stw r11,-64(r31)
    sti vars[8] = cmov1
00104dbb8c   stw r4,-48(r31)
    sti tags[4] = 0
00104dbb90   stw r11,-80(r31)
    188130304 = int 188130304
00104dbb94   lis r8,2870 (0xb360000)
00104dbb98   ori r8,r8,41984 (0xa400)
    sti varTraits[4] = 188130304
00104dbb9c   stw r8,-120(r31)
    sti vars[40] = 1
00104dbba0   stw r9,-16(r31)
    sti tags[20] = 6
00104dbba4   stw r10,-64(r31)
    sti vars[16] = 1
00104dbba8   stw r9,-40(r31)
    sti tags[8] = 6
00104dbbac   stw r10,-76(r31)
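
In plain terms, the LIR pair above computes the following (a hypothetical C++ rendering, not generated code; get_cache_handler1 stands in for the loaded handler value):

#include <cstdint>

// eq1   = eq get_cache_handler1, 4
// cmov1 = cmov eq1 ? 1 : get_cache_handler1
static int32_t cmov1_of(int32_t get_cache_handler1) {
    bool eq1 = (get_cache_handler1 == 4);
    return eq1 ? 1 : get_cache_handler1;
}

Reading the surrounding stores, r4 both holds the restored constant 1 and gets picked as the cmov result register, so by the time mr r9,r4 tries to copy the constant into r9, r4 may already hold get_cache_handler1 instead.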
Attached patch v1 (obsolete) — Splinter Review
Duplicate the logic used on the x86 platform, which is more robust in handling the various register permutations.
Attachment #440054 - Flags: review?(edwsmith)
Attachment #440054 - Attachment is patch: true
Attachment #440054 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 440054 [details] [diff] [review]
v1

Warning: the logic in the patch isn't the same as x86, and if it is safe anyway, then it's not obvious why. The x86 backend calls asm_cmp last, but the PPC backend does more register allocation work after calling asm_branch (which calls asm_cmp). asm_cmp() must allocate registers for the operands of the compare, which further increases register pressure. It's not obvious that rt (the register for the iftrue operand) will remain in a register during that time.

Instead of asm_branch(), you could underrunProtect(8) (two instructions, check my math); then you don't need any of the long-branch cases in asm_branch() or asm_branch_near(), just the MR(rr,rf) and a single-instruction short branch.

Then handle rt, then do asm_cmp. *Then* I think it will match x86's behavior.
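
For reference, a minimal standalone sketch of the ordering being suggested (stand-in stubs only, not the real Assembler members; underrunProtect, MR, asm_cmp, and the beq shape are names taken from this discussion, and the deque simply models the back-to-front code emission the nanojit backends use, where whatever the routine emits last is the first instruction executed):

#include <cstdio>
#include <deque>
#include <string>

static std::deque<std::string> code;
static void emit(const std::string& ins) { code.push_front(ins); }  // back-to-front

static void underrunProtect(int bytes) { (void)bytes; }             // reserve buffer space (stub)
static void MR(const char* rd, const char* rs) { emit(std::string("mr    ") + rd + "," + rs); }
static void BEQ_short(const char* target)      { emit(std::string("beq   ") + target); }
static void asm_cmp()                           { emit("cmpwi cr7,ra,imm"); } // may allocate operand regs

static void asm_cmov_sketch() {
    underrunProtect(8);      // room for the two fixed-size instructions below
    MR("rr", "rf");          // iffalse move, skipped when the condition holds
    BEQ_short("over_mr");    // single-instruction short branch over the MR
    MR("rr", "rt");          // make sure rr holds the iftrue value (if it doesn't already)
    asm_cmp();               // called last, so the compare executes first
}

int main() {
    asm_cmov_sketch();
    for (const auto& ins : code) std::printf("%s\n", ins.c_str());
}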
Attachment #440054 - Flags: review?(edwsmith) → review-
(In reply to comment #2)
> (From update of attachment 440054 [details] [diff] [review])
> Warning: the logic in the patch isn't the same as x86

True, comment 1 is incorrect; it should have said 'mimic' rather than 'duplicate'.

> the ppc backend does more register allocation work after calling asm_branch

Good point, none of the registers are guaranteed to be around after this call. One way we could lock down rt, rf, and rr would be to add a 'dontUseRegisters' parameter to asm_cmp() that forces these regs to be excluded.

> Instead of asm_branch(), you could underrunProtect(8) (two instructions, check
> my math) then you don't need any of the long-branch cases in asm_branch() or
> asm_branch_near(), just the MR(rr,rf) and single-instruction short branch.

Right, that would be a good optimization, but I think we're still left with the larger issue of having to generate an MR() *after* the asm_cmp().

x86 doesn't have this issue, because the sequence is 'cmp a,b  mov rr,rt  cmov rr,rf'. Notice that no allocation occurs between the cmp and the mov, i.e. the asm_cmp() call comes after the mov.

For PPC, the sequence is roughly 'mov rr,rt  cmp a,b  br x  mov rr,rf'; thus we have to ensure that rr and rt exist *after* the asm_cmp() call.
Attached patch v2 (obsolete) — Splinter Review
Updated to reflect prior comments and concerns; not yet heavily tested.
Attachment #440054 - Attachment is obsolete: true
Attachment #440324 - Flags: review?
Attachment #440324 - Flags: review? → review?(edwsmith)
Whiteboard: WE-2572294
Assignee: nobody → rreitmai
Target Milestone: --- → flash10.1
Status: NEW → ASSIGNED
passes all testing.
Comment on attachment 440324 [details] [diff] [review]
v2

Now that you are using asm_branch_near (good!), you can make the call to asm_cmp() be last, which will avoid the need to add an exclusion set to asm_cmp(), and the register-manipulation code will correspond 1:1 with the x86 backend.
Attachment #440324 - Flags: review?(edwsmith) → review-
Priority: -- → P2
(In reply to comment #6)
> you can make the call to asm_cmp() be last

hmm, comment 3 shows why this can't be done in the PPC case.  Am I missing something?
Attached patch v3Splinter Review
Gotcha; you simply want to move the MR *between* the cmp and the branch (which is safe, since mr does not modify the condition register set by the compare). Done in the latest patch. Also squeezed underrunProtect down to 8 and put the release logic prior to asm_cmp(), which more closely matches the x86 version.
Attachment #440324 - Attachment is obsolete: true
Attachment #440867 - Flags: review?(edwsmith)
(In reply to comment #8)
>  underrunProtect down to 8 

Not true; the patch still has it at 16, but that's still correct, just more conservative.
Comment on attachment 440867 [details] [diff] [review]
v3

Looks right; assuming it passes tests, it should be good to go.
Attachment #440867 - Flags: review?(edwsmith) → review+
pushed to http://hg.mozilla.org/projects/nanojit-central/rev/8047bc5db3b1
Group: tamarin-security
Whiteboard: WE-2572294 → WE-2572294, fixed-in-nanojit,
Whiteboard: WE-2572294, fixed-in-nanojit, → WE-2572294, fixed-in-nanojit, fixed-in-tr
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tr → WE-2572294, fixed-in-nanojit, fixed-in-tamarin
This re-strengthens the assert in asm_cmov() by fixing the typo in the isS14 macro.  (bd should be i).
Attachment #441521 - Flags: review?(rreitmai)
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

I'm hesitant to + this, could we use & masking the bits instead?

something like : (((uint32)(x)) & 0xfff) == (uint32_t)(x)
http://hg.mozilla.org/tracemonkey/rev/20583089b31b
Whiteboard: WE-2572294, fixed-in-nanojit, fixed-in-tamarin → WE-2572294, fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/20583089b31b
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #441521 - Flags: review?(rreitmai) → review+
(In reply to comment #13)
> (From update of attachment 441521 [details] [diff] [review])
> I'm hesitant to + this, could we use & masking the bits instead?
> 
> something like : (((uint32)(x)) & 0xfff) == (uint32_t)(x)

That only works correctly for positive values of x. To handle both positive and negative values of x, shifting is the way to go.
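
To make the point concrete, here is a standalone sketch (not the actual nanojit macro; isS14_shift and isS14_mask are names invented here, and the mask is widened to 14 bits): the shift form sign-extends the low 14 bits and so handles negative displacements, while the mask form compares against a zero-extended value and rejects them.

#include <cstdint>
#include <cstdio>

// Shift-based check: sign-extend the low 14 bits and compare with the
// original value (32 - 14 = 18 bits are shifted out and back; assumes the
// usual two's-complement arithmetic right shift).
static bool isS14_shift(int32_t x) {
    return (int32_t(uint32_t(x) << 18) >> 18) == x;
}

// Mask-based check in the style suggested above: it rejects every negative
// value, because a negative in-range displacement never equals its masked
// (zero-extended) form.
static bool isS14_mask(int32_t x) {
    return (uint32_t(x) & 0x3fff) == uint32_t(x);
}

int main() {
    const int32_t samples[] = { 0, 1, 8191, 8192, -1, -8192, -8193 };
    for (int32_t x : samples)
        std::printf("%6d  shift:%d  mask:%d\n", x, isS14_shift(x), isS14_mask(x));
    // A signed 14-bit field holds -8192..8191; only the shift form gets the
    // negative half of that range right.
}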
Assignee: rreitmai → edwsmith
Assigning to me and re-opening to fix the macro bug.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Priority: P2 → P4
Target Milestone: flash10.1 → flash10.2
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

NJ: http://hg.mozilla.org/projects/nanojit-central/rev/9b65dfcae9f7
Assignee: edwsmith → nobody
Comment on attachment 441521 [details] [diff] [review]
Fix the isS14 macro, then use it in asm_cmov()

fixup imported to TR:
http://hg.mozilla.org/tamarin-redux/rev/5ec0793eebe7
http://hg.mozilla.org/mozilla-central/rev/732dbc8fa456
Status: REOPENED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Flags: flashplayer-bug+