Closed Bug 580515 Opened 15 years ago Closed 15 years ago

TM: LIR_cmovd mishandled with X86_FORCE_SSE2=no

Tracking

()

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

blocking2.0

---

betaN+

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey)

Attachments

(1 file)

patch (agasint TM 58013:2000c21b7910) 15 years ago Nicholas Nethercote [inactive] 34.94 KB, patch	edwsmith : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

15 years ago

Nanojit's i386 back-end has two modes. You can avoid generating SSE code (ie. use the x87 stack only for FP computation) by specifying the env var X86_FORCE_SSE2=no. I'm probably the only person who tests this configuration regularly. The following three trace-tests are failing X86_FORCE_SSE2=no debug32/js -j -e "const platform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/closures/set-outer-trace-3.js X86_FORCE_SSE2=no debug32/js -j -e "const platform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/closures/t004.js X86_FORCE_SSE2=no debug32/js -j -e "const platform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/closures/t027.js The failure is this: Assertion failure: (_allocator.active[FST0] && _fpuStkDepth == -1) || (!_allocator.active[FST0] && _fpuStkDepth == 0) (../nanojit/Assembler.cpp:352) It could well be a latent bug in the i386 backend that fatvals has exposed.

Luke Wagner [:luke]

Comment 1

•

15 years ago

(In reply to comment #0) > It could well be a latent bug in the i386 backend that fatvals has exposed. That error is unlike anything I've ever seen caused by the tracer. Do you know what situation it is asserting?

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

15 years ago

(In reply to comment #1) > > That error is unlike anything I've ever seen caused by the tracer. Probably because you've never run with X86_FORCE_SSE2=no before. > Do you know > what situation it is asserting? In x87 mode the backend is really simplistic, it only ever pushes a single FP value onto the x87 stack at a time. Sometimes you have to pop what's on the stack explicitly, and sometimes you consume it naturally. There's 'bool pop' parameters in various places in Nativei386.cpp, eg. see the first arg of FSTQ. It also bleeds into Assembler.cpp, esp. see prepareResultReg() which contains a big comment about this stuff. What the assertion means is that something has gone wrong with the stack management -- eg. it's trying to push a second value on the stack, or pop an empty stack (I'm not sure which). The _fpuStkDepth variable is how the back-end tracks how many elements the stack has. I've never quite understood how it works, though.

Nicholas Nethercote [inactive]

Assignee

Comment 3

•

15 years ago

I heard on IRC that we can assume SSE2 for Firefox 4.0. In which case this bug is moot. Shaver, can you confirm this?

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

15 years ago

The three tests mentioned in comment 0 are no longer failing. wmaddox fixed up some problems in the x87 assertion checking, that's probably what fixed it. We do have one new failure: X86_FORCE_SSE2=no /home/njn/moz/ws0/js/src/debug32/shell/js -j -e "const p latform='linux2'; const libdir='/home/njn/moz/ws0/js/src/trace-test/lib/';" -f /home/njn/moz/ws0/js/src/trace-test/lib/prolog.js -f /home/njn/moz/ws0/js/src/trace-test/tests/basic/test586387.js /home/njn/moz/ws0/js/src/trace-test/tests/basic/test586387.js:14: Error: Assertion failed: got 37, expected 496 A regression range is needed.

Philip Taylor

Comment 5

•

15 years ago

For the bug 586387 test: The first bad revision is: changeset: 48822:e866db9165ef user: Luke Wagner <lw@mozilla.com> date: Tue Aug 03 22:06:44 2010 -0700 summary: Bug 584158 - ensure that typed arrays cannot produce non-canonical nans (r=gal)

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

15 years ago

(In reply to comment #5) > > The first bad revision is: > changeset: 48822:e866db9165ef > user: Luke Wagner <lw@mozilla.com> > date: Tue Aug 03 22:06:44 2010 -0700 > summary: Bug 584158 - ensure that typed arrays cannot produce non-canonical > nans (r=gal) This revision added uses of LIR_cmovd, so I think it very likely that LIR_cmovd isn't handled properly on i386 without SSE2.

Summary: TM: tests failing with X86_FORCE_SSE2=no due to fatvals → TM: LIR_cmovd mishandled with X86_FORCE_SSE2=no

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

15 years ago

Here's a smaller test case: var v = new Float32Array(32); var t = 0; for (var i = 0; i < v.length; ++i) t += v[i]; print(t); Run normally, it prints zero. Run with X86_FORCE_SSE2=no it prints a bogus value, often -Infinity, often a very large negative double. Here's the relevant LIR and native code: ldf2d1 = ldf2d.tdata addi1[0] ...... fld32 0(eax) eqd1 = eqd ldf2d1, ldf2d1 immd1 = immd nan cmovd1 = cmovd eqd1 ? ldf2d1 : immd1/*nan*/ ...... fld f0 ...... fcompp ...... fnstsw_ax ...... test ah, 68 ...... fldq -40(ebp) <= restore cmovd1 ...... mov eax,-28(ebp) <= restore eos ...... jnp ...... ...... ffree f0 ...... fincstp ...... fldq (......) The LIR is from TM's canonicalizeNaNs(). It compares ldf2d1 to itself (a NaN) test. If it fails (ie. it's a NaN), it replaces it with a canonical NaN value. I haven't yet worked out the problem with the generated code, my x87 knowledge is weak.

Status: NEW → ASSIGNED

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

15 years ago

(In reply to comment #7) > > eqd1 = eqd ldf2d1, ldf2d1 > immd1 = immd nan > cmovd1 = cmovd eqd1 ? ldf2d1 : immd1/*nan*/ > ...... fld f0 > ...... fcompp > ...... fnstsw_ax > ...... test ah, 68 > ...... fldq -40(ebp) <= restore cmovd1 > ...... mov eax,-28(ebp) <= restore eos > ...... jnp ...... > ...... ffree f0 > ...... fincstp > ...... fldq (......) The |fldq -40(ebp)| looks wrong. When generating code backwards we have these steps: - After the cmovd1 instruction, cmovd1 is in FST0. - When we get back to the comparison part (in asm_cmpd()) we need to use FST0 to hold ldf2d1 to do the comparison. - So we have to "spill" cmovd1 out of FST0; because we're going backwards, this generates a restore (the |fldq -40(ebp)|) - The problem is that -40(ebp), the spill slot assigned for cmovd1, doesn't hold something meaningful at this point. We're restoring garbage, because we're restoring cmovd1 before it's even been computed! Man, I hate x87 code. Not sure what the fix is.

Nicholas Nethercote [inactive]

Assignee

Comment 9

•

15 years ago

BTW, I think the |ffree f0; fincstp| sequence can be changed to just |fstp f0|. It just needs to pop the stack.

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

blocking2.0: --- → ?

Robert Sayre

Updated

•

15 years ago

blocking2.0: ? → betaN+

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Depends on: 615159

Nicholas Nethercote [inactive]

Assignee

Comment 10

•

15 years ago

Attached patch patch (agasint TM 58013:2000c21b7910) — Details — Splinter Review

This ended up being a bit more complicated than I thought it would. The best explanation of what this patch does comes from the following comment I added to asm_cmov(): // The obvious way to handle this is as follows: // // mov rr, rt # only needed if rt is live afterwards // do comparison // jt end // mov rr, rf // end: // // The problem with this is that doing the comparison can cause // registers to be evicted, possibly including 'rr', which holds // 'ins'. And that screws things up. So instead we do this: // // do comparison // mov rr, rt # only needed if rt is live afterwards // jt end // mov rr, rf // end: // // Putting the 'mov' between the comparison and the jump is ok // because move instructions don't modify the condition codes. The "obvious way" was how things were done prior to the patch. The rest of the patch is just a lot of moving stuff around to allow this to happen. In particular: - I had to allow separation of conditional tests and conditional branches, which required adding asm_branch_helper() et al. - In NativeX64.cpp, this required operand swapping performed for some double comparisons to be done in asm_cmpd() instead of asm_branchd() and asm_condd(). This makes the X64 back-end more like the i386 back-end, which I think is a good thing. - In Nativei386.cpp, I used the commonly-used FSTP(FST0) instruction to pop the stack instead of the unusual FINCSTP()/FFREE() sequence. This allowed FINCSTP() and FFREE() to be removed. - I added a lirasm test. Without the patch, it asserts when in non-SSE2 mode. The failing test mentioned in comment 4 now works, too.

Attachment #493853 - Flags: review?(edwsmith)

Edwin Smith

Comment 11

•

15 years ago

Comment on attachment 493853 [details] [diff] [review] patch (agasint TM 58013:2000c21b7910) Tamarin doesn't currently use cmovd, so I haven't tested on the sandbox. Bug 606561 actually worked around the x87 bug; we can re-try a cmovd version of Math.abs() once this lands.

Attachment #493853 - Flags: review?(edwsmith) → review+

Nicholas Nethercote [inactive]

Assignee

Comment 12

•

15 years ago

http://hg.mozilla.org/projects/nanojit-central/rev/4effe362e918 http://hg.mozilla.org/tracemonkey/rev/a687492cff3d

Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey

Robert Sayre

Comment 13

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/a687492cff3d

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

TM: LIR_cmovd mishandled with X86_FORCE_SSE2=no

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey)

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated

Updated

Updated

Comment 10

Comment 11

Comment 12

Comment 13

Attachment

General

Description

File Name

Content Type