Closed Bug 576247 Opened 14 years ago Closed 13 years ago

TM: asm_stkarg not implemented for x86 [nanojit]

Categories

(Core :: JavaScript Engine, defect)

x86_64
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla7
Tracking Status
blocking2.0 --- -
status2.0 --- wanted

People

(Reporter: abittner, Assigned: m_kato)

References

()

Details

(Keywords: 64bit, Whiteboard: [win64] fixed-in-nanojit)

Attachments

(4 files, 3 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; Win64; x64; en-US; rv:2.0b2pre) Gecko/20100630 Minefield/4.0b2pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; Win64; x64; en-US; rv:2.0b2pre) Gecko/20100630 Minefield/4.0b2pre

coming across a german website containing also the following html code:
<h1 class="head">Einfach stark in Preis und Leistung.</h1>

running everything on fresh win7x64 system, with having both native x64 and native x86 (via wow64) firefox4 installed

firefox 4.0b2pre compiled as x64/win64 renders the last part of that sentence "stark in Preis und Leistung." badly.

firefox 4.0b2pre as x86/win32 renders it properly.

when rightclicking on the html area of that text in the normal firefox gui it presents those text characters/words of the sentence as some internal png/base64 stuff (im not that familiar with the depths of firefox rendering engine and so forth).

copypaste-ing that png/base64representation via the clipboard actually reproduces the bad rendering on both of the 32bit and 64bit compiled firefox browsers.

so on the 64bit/x64 compiled binary of firefox there seems to be some rendering problem, or maybe the x64 specific parts of de-coding and rendering stuff badly create this png/base64 representation in the firefox memory cache for html content.

Reproducible: Always




here are the rightclick - view image url representations on the various firefox compilations:

x64 means: Build identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; Win64; x64; en-US; rv:2.0b2pre) Gecko/20100630 Minefield/4.0b2pre
running on freshly installed win7x64, latest updates



"Einfach" x64 representation: (renders properly (at least visually, didnt binary check the base64 part posted here below, on both x32/x64))



------------

"stark" x64 representation, badly rendered



------------
"in"  - x64 representation (badly rendered)



------------

"Preis" x64 representation, badly rendered:



-------------

"und" x64 representation, badly rendered



------------

"Leistung." x64 representation, badly rendered



-------------


try copy/pasting these data: urls into your address location area and see the bad image rendering on a normally working win32/x86 firefox4.0b2pre

------------------
additional update on a x32/x86 firefox4.0b2pre

x32 means: Build identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; WOW64; en-US; rv:2.0b2pre) Gecko/20100630 Minefield/4.0b2pre

running on the very same machine, win7x64 fresh, latest updates.

all data urls are being displayed fine visually, just for reference/comparison to the x64 data: urls posted before

"Einfach" x32 representation



-------------

"stark" as x32 representation:



---------------

"in" as x32 representation:



------------------

"Preis" as x32 representation:



---------------------

"und" as x32 representation:



--------------

"Leistung." as x32 representation:



---------------------




conclusion:

only the first word "Einfach" gets the same encoding/decoding in both the x64 and x32 versions of firefox, the rest of the words encode/decode differently.


maybe this is some hint for where to look for the reason.

regards.
x32 firefox renders normally, x64 firefox renders badly.

but copy/paste the data: url representation of the badly rendered stuff from x64 firefox over to x32 firefox gives bad rendering on x32 firefox as well.

so some decode/encode curriuption, memory corruption or something must be happening on the x64 firefox.

cheers.
the opposite direction holds true as well: 

pasting the x32 data: representations back into the native x64 firefox renders the images/words correctly there as well.
bug still in:

Build identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; Win64; x64; en-US; rv:2.0b2pre) Gecko/20100701 Minefield/4.0b2pre

about:buildconfig
Source

Built from http://hg.mozilla.org/mozilla-central/rev/d9253109fd88
Build platform
target
x86_64-pc-mingw32
Build tools
Compiler 	Version 	Compiler flags
cl 	16.00.30319.01 	-TC -nologo -W3 -Gy -Fdgenerated.pdb -DNDEBUG -DTRIMMED -Zi -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1
cl 	16.00.30319.01 	-GR- -TP -nologo -Zc:wchar_t- -W3 -Gy -Fdgenerated.pdb -wd4800 -DNDEBUG -DTRIMMED -Zi -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1
Configure arguments

--target=x86_64-pc-mingw32 --host=x86_64-pc-mingw32 --enable-application=browser --enable-update-channel=nightly --enable-update-packaging --enable-debug-symbols

---------------
regards.
still the same buggage with newer x64 compiled build

Build identifier: Mozilla/5.0 (Windows; Windows NT 6.1; Win64; x64; en-US; rv:2.0b2pre) Gecko/20100703 Minefield/4.0b2pre

-----------
about:buildconfig
Source

Built from http://hg.mozilla.org/mozilla-central/rev/55f39d8d866c
Build platform
target
x86_64-pc-mingw32
Build tools
Compiler 	Version 	Compiler flags
cl 	16.00.30319.01 	-TC -nologo -W3 -Gy -Fdgenerated.pdb -DNDEBUG -DTRIMMED -Zi -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1
cl 	16.00.30319.01 	-GR- -TP -nologo -Zc:wchar_t- -W3 -Gy -Fdgenerated.pdb -wd4800 -DNDEBUG -DTRIMMED -Zi -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1
Configure arguments

--target=x86_64-pc-mingw32 --host=x86_64-pc-mingw32 --enable-application=browser --enable-update-channel=nightly --enable-update-packaging --enable-debug-symbols

---------------------

same display errors on the mentioned url.
Hardware: x86 → x86_64
any comments on this stuff? is x64 for mozilla still too exotic and low-prio?

do we have an x64 discussion area, meeting place, weekly/regular meetings or anything?

win64/x64 is huge by now if you trust microsofts own word and published stats. almost half of all total sold win7 licenses were x64 up til now. in the us the current percentage is even around 60%+ for x64.

well whatever. maybe some body cares or could explain how these differences occur and where this stuff originates from.

cheers.

ref: http://windowsteamblog.com/windows/b/bloggingwindows/archive/2010/07/08/64-bit-momentum-surges-with-windows-7.aspx
Version: unspecified → Trunk
I have verified that:
1) the win64 build on a 64-bit machines renders incorrectly as per screen-shot
2) the win32 build renders it properly

I don't know which component should this belong to but the fonts are not being rendered properly.

This is a very obvious regression and would not like us to ship with a problem like this. Thanks abbittner for filing. My apologies for the late reply we were at a summit all last week .

abittner could you keep a saved version of the page? could you please get us information wrt the fonts that are being used on that page?

(In reply to comment #6)
> any comments on this stuff? is x64 for mozilla still too exotic and low-prio?
> 
It is not low-priority but it requires a lot of changes in how we do things. For instance, I am not sure how many developers are actually running the 64-bit builds and how many nightly users we actually have.
I (most likely) will be leading the push to get it into our beta releases (maybe beta 3).

> do we have an x64 discussion area, meeting place, weekly/regular meetings or
> anything?
> 
Best place to ask is http://groups.google.com/group/mozilla.dev.planning
I am not sure since I am neither QA nor developer so I am not the best person to answer your questions.
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Keywords: 64bit
Whiteboard: [win64]
tried to narrow down the html/javascript code to be still able to experience the bug. some javascript stuff (third party mit/bsd licensed) is neccessary, which creates fonts and stylesheet information via javascript it seems. im not that of an expert that deep into javascript, stylesheets and font technology.

see 7zip archive. html file with six .js files recreates the bug.

extract to any directory and start the html file via your x64 firefox browser and/or x32 browser to compare results.
update on builds:

Build identifier: Mozilla/5.0 (Windows; Windows NT 6.1; Win64; x64; en-US; rv:2.0b2pre) Gecko/20100714 Minefield/4.0b2pre

about:buildconfig
Source

Built from http://hg.mozilla.org/mozilla-central/rev/c26c255bade9
Build platform
target
x86_64-pc-mingw32
Build tools
Compiler 	Version 	Compiler flags
cl 	16.00.30319.01 	-TC -nologo -W3 -Gy -Fdgenerated.pdb -DNDEBUG -DTRIMMED -Zi -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1
cl 	16.00.30319.01 	-GR- -TP -nologo -Zc:wchar_t- -W3 -Gy -Fdgenerated.pdb -wd4800 -DNDEBUG -DTRIMMED -Zi -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1
Configure arguments

--target=x86_64-pc-mingw32 --host=x86_64-pc-mingw32 --enable-application=browser --enable-update-channel=nightly --enable-update-packaging --enable-debug-symbols

------------

still experiences the malfully rendered situation on x64/win64 native.

was just posting for the records, as i think i recently read somewhere that somebody from the x64 team, buildengineering or whoever was experimenting with compiling x64 either with visualc++2008/2010, so that might give some hints what libraries or parts get compiled in some strange way that the font/js/css parts create the bad font representation at some point.

regards.
ok, so here goes the next x64 stuff, maybe its related, maybe i need to open a completely new bugreport.


i can 100% repro an immediate non-crashreporter crash with simply navigating to html5test.com

on this latest x64 firefox that i am testing.
Build identifier: Mozilla/5.0 (Windows; Windows NT 6.1; Win64; x64; en-US;
rv:2.0b2pre) Gecko/20100714 Minefield/4.0b2pre



its not reporting anything back to mozilla, bug creating hugeloads of wer (windows error reporting) crashreports sending back to msft.

crashing module is xul.dll each and every time.

can anyone else repro this?
target crashtriggering addy is: http://html5test.com/

doesnt crash but computes some html5 capability score x32 firefox builds for me.

on x64 it gives me the crashy.

i call for somebody taking the wer (windows error reporting) stuff seriously and making use of wer-reported crashes and go for registering with microsoft and adding wer-capability to the major mozilla products and learning from the wer-reports as well and not just the breakpad/crashreporter reports.

thanks and regards.

--------------------

Description
Faulting Application Path:	C:\Program Files (x86)\Minefield\firefox.exe

Problem signature
Problem Event Name:	APPCRASH
Application Name:	firefox.exe
Application Version:	2.0.0.3846
Application Timestamp:	4c3dca57
Fault Module Name:	xul.dll
Fault Module Version:	2.0.0.3846
Fault Module Timestamp:	4c3dc993
Exception Code:	c0000005
Exception Offset:	0000000000276928
OS Version:	6.1.7601.2.1.0.256.1
Locale ID:	1033
Additional Information 1:	25c0
Additional Information 2:	25c09b994a48fea9e9d21b3b8db14eb6
Additional Information 3:	613f
Additional Information 4:	613f902109b04e71c13aa2112ec8251b

Extra information about the problem
Bucket ID:	19070399

--------

Description
Faulting Application Path:	C:\Program Files (x86)\Minefield\firefox.exe

Problem signature
Problem Event Name:	APPCRASH
Application Name:	firefox.exe
Application Version:	2.0.0.3846
Application Timestamp:	4c3dca57
Fault Module Name:	xul.dll
Fault Module Version:	2.0.0.3846
Fault Module Timestamp:	4c3dc993
Exception Code:	c0000005
Exception Offset:	0000000000276928
OS Version:	6.1.7601.2.1.0.256.1
Locale ID:	1033
Additional Information 1:	25c0
Additional Information 2:	25c09b994a48fea9e9d21b3b8db14eb6
Additional Information 3:	613f
Additional Information 4:	613f902109b04e71c13aa2112ec8251b

Extra information about the problem
Bucket ID:	19070399

---------------

Description
Faulting Application Path:	C:\Program Files (x86)\Minefield\firefox.exe

Problem signature
Problem Event Name:	APPCRASH
Application Name:	firefox.exe
Application Version:	2.0.0.3846
Application Timestamp:	4c3dca57
Fault Module Name:	xul.dll
Fault Module Version:	2.0.0.3846
Fault Module Timestamp:	4c3dc993
Exception Code:	c0000005
Exception Offset:	0000000000276928
OS Version:	6.1.7601.2.1.0.256.1
Locale ID:	1033
Additional Information 1:	25c0
Additional Information 2:	25c09b994a48fea9e9d21b3b8db14eb6
Additional Information 3:	613f
Additional Information 4:	613f902109b04e71c13aa2112ec8251b

Extra information about the problem
Bucket ID:	19070399

-------------------

Description
Faulting Application Path:	C:\Program Files (x86)\Minefield\firefox.exe

Problem signature
Problem Event Name:	APPCRASH
Application Name:	firefox.exe
Application Version:	2.0.0.3846
Application Timestamp:	4c3dca57
Fault Module Name:	xul.dll
Fault Module Version:	2.0.0.3846
Fault Module Timestamp:	4c3dc993
Exception Code:	c0000005
Exception Offset:	0000000000276928
OS Version:	6.1.7601.2.1.0.256.1
Locale ID:	1033
Additional Information 1:	25c0
Additional Information 2:	25c09b994a48fea9e9d21b3b8db14eb6
Additional Information 3:	613f
Additional Information 4:	613f902109b04e71c13aa2112ec8251b

Extra information about the problem
Bucket ID:	19070399

---------------

greetings.
https://developer.mozilla.org/en/How_to_get_a_stacktrace_with_WinDbg and please don't use a bug about "error in rendering" to describe "it crashes", use a new bug (they're cheap).
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
This goes away when the jit is turned off. I guess it will need further minimization.
I've reproduced this just using cufon at:

http://people.mozilla.org/~jmuizelaar/cufon-test/test.html
Summary: x64/win64 build of firefox4.0b2pre display error in rendering html text (created as internal png representation) text → cufon gives corrupted text on x64 with jit
Since bezierCurveTo has a lot of float parameters, I have to implement asm_stkarg for double type...
Assignee: general → m_kato
Summary: cufon gives corrupted text on x64 with jit → TM: cufon gives corrupted text on x64 with jit [nanojit]
Summary: TM: cufon gives corrupted text on x64 with jit [nanojit] → TM: asm_stkarg not implemented for x86 [nanojit]
OS: Windows 7 → All
Attached patch fix (obsolete) — Splinter Review
Attachment #459735 - Flags: review?(edwsmith)
Comment on attachment 459735 [details] [diff] [review]
fix

Nice catch!  Did the TODO assert fire when testing?

The encoding of MOVQSPX doesn't incorporate the REX.W byte, and so will fail if r == XMM8-XMM15.  The current encoding in the patch is:

   00 24 44 D6 0F 66 00 06LL
                 ^
This is where REX.W belongs.  In the single-quadword encoding, it should be 40 (hex) 

Where does the byte D6 come from?  In looking at the X64 manual, the encoding of MOVQ r/m64 <- xmm is:  66 REX.W 0F 7E /r, which would lead to an instruction word like this:

   0x 00 24 44 7E 0F 40 66 07 LL => 0x0024447E0F406607

In the existing code (before the patch), the comment on X64_movspr is incorrect,
it should read:

   // 64bit store gpr -> [rsp+d8] (sib required)
                               ^

Notice how in MOVQSPR, we use U64(d)<<56 and thus only use the low 8 bits of the offset.  This will be fine in asm_stkarg() as long as stk_off is a signed 8-bit value, which is ensured by an assert.  

To land this, I recommend:

1. fix the X64_movqspx constant as above: add REX.W, fix the opcode byte (D6 should be 7E), update the length byte from 06 to 07, and fix the comment.  If D6 is the correct opcode value, please explain.

2. Also please fix the comment on X64_movspr (replace "d32" with "d8")

3. Update the shift expressions in MOVQSPX() to correspond to the correct bit offsets in X64_movqspx.  The in the patch was copied from MOVSPR(), but since the instruction encoding is different, the shift offsets are also different.

4. In a debugger, stop at the line in asm_stkarg() that calls MOVQSPX, then hand modify r to test both XMM0-7 as well as XMM8-XMM15, in order to ensure the REX.W byte is properly computed.  Step past MOVQSPX(), then disassemble the generated instruction starting at the address in _nIns, to make sure it is valid.  This should be straightforward in VS2008/2010's debugger, or in gdb.
Attachment #459735 - Flags: review?(edwsmith) → review-
blocking2.0: ? → beta5+
Attached patch v2 (obsolete) — Splinter Review
Attachment #459735 - Attachment is obsolete: true
Attached patch reftestSplinter Review
Kato-san: is your v2 patch ready for review?
This doesn't need to be in b5.
blocking2.0: beta5+ → beta6+
Attachment #469829 - Flags: review?(edwsmith)
blocking2.0: beta6+ → betaN+
Attachment #469829 - Flags: review?(edwsmith) → review?(dmandelin)
dmandelin, did you mean to ask yourself for a review?
Comment on attachment 469829 [details] [diff] [review]
v2

Because of the intricacy of this patch, I have explanatory comments below along with review comments. |*| indicates a review comment/request.

>diff --git a/js/src/nanojit/NativeX64.cpp b/js/src/nanojit/NativeX64.cpp

New function to "compute" REX byte for movqspx. More precisely, the incoming op has the a REX byte of 0x40 already. This function is intended to either mask in the correct REX byte, if one is needed; or else splice out the REX byte.

>+    static inline uint64_t rexspx(uint64_t op, Register r) {
>+        int shift = 64 - 8*oplen(op) + 8;

For |MOVQ mem, reg|, we need a REX byte only if |reg| is past the low 7. So the condition tests that, and if so, returns the original with something masked in (see below). 

The else part handles the case where |reg| is in the low 7. In that case, we copy the prefix byte over the REX byte and decrement the instruction length. This is in parallel with the previous function |rexprb|. I believe this doesn't zero out the old prefix byte, but I assume this is OK as long as the length says to ignore it.

* |rexprb| has a comment for that case. Maybe it should be here too?

* More importantly, the "then" case seems to be sort of a no-op: the REX byte on input is 0x40, so ORing in another 0x40 won't do anything. Maybe 0x44 was wanted here instead? It seems to me that it would be OK to have 0x44 be the REX byte in the header file, since that's what the REX byte will be if we need one, and then have the "then" case simply return |op|.

>+        return r & 8 ? op | 0x40 << shift:
>+               ((op & ~(255LL<<shift)) | (op>>(shift-8)&255) << shift) - 1;
>+    }
>+
>diff --git a/js/src/nanojit/NativeX64.h b/js/src/nanojit/NativeX64.h
>--- a/js/src/nanojit/NativeX64.h
>+++ b/js/src/nanojit/NativeX64.h
>@@ -241,16 +241,17 @@ namespace nanojit
>         X64_learip  = 0x00000000058D4807LL, // 64bit RIP-relative lea. reg <- disp32+rip (modrm = 00rrr101 = 05)
>         X64_movlr   = 0xC08B400000000003LL, // 32bit mov r <- b
>         X64_movbmr  = 0x0000000080884007LL, // 8bit store r -> [b+d32]
>         X64_movsmr  = 0x8089406600000004LL, // 16bit store r -> [b+d32]
>         X64_movlmr  = 0x0000000080894007LL, // 32bit store r -> [b+d32]
>         X64_movlrm  = 0x00000000808B4007LL, // 32bit load r <- [b+d32]
>         X64_movqmr  = 0x0000000080894807LL, // 64bit store gpr -> [b+d32]

* Ed asked for the comment on the following line to be corrected, s/d32/d8.

>         X64_movqspr = 0x0024448948000005LL, // 64bit store gpr -> [rsp+d32] (sib required)

Ed also asked for clarification on the D6. I checked the manual and I also got that D6 is right. I further verified that by using the MSVC disasm on the bytes given here. (Ed mentioned 7E, which is for the "load" version of MOVQ with xmm regs.) 

>+        X64_movqspx = 0x002444D60F406607LL, // 64bit store xmm -> [rsp+d8] (sib required)

So, next steps:

1. Fix REX byte handling for upper 8 registers, or explain why it's right the way it is.

2. Fix old comment per Ed's #2.

3. Check instruction encoding in the debugger per Ed's #4.
Attachment #469829 - Flags: review?(dmandelin)
(In reply to comment #21)
> dmandelin, did you mean to ask yourself for a review?

Yes, Ed asked for someone to steal the review request.
What's the status of this?  Seems like it's close to being fixed...
blocking2.0: betaN+ → -
status2.0: --- → wanted
Should this really not be blocking2.0? I have come across this bug before and when I last checked it was marked as blocking2.0 but now it's simply marked as "wanted" for 2.0.

Given that it is a clear regression and makes some text completely unreadable, I find it rather puzzling that it has been given such a low priority.
Win64 isn't a supported platform for FF4, which is why we're not blocking on it.
Also, although this is possible x86-64 nanojit bug, this depends on OS ABI.

Since Win64 ABI passes 4 parameters only by register, so this bug occurs.  If trace jit calls a function (such as DOM method) that 7th or 8th parameter is float/double, this will occurs on Mac or Linux.  But there is no function like this.
Attached patch fix v3 (obsolete) — Splinter Review
Attachment #469829 - Attachment is obsolete: true
Attached patch v3.1Splinter Review
Attachment #521721 - Attachment is obsolete: true
Attachment #521723 - Flags: review?(edwsmith)
Comment on attachment 521723 [details] [diff] [review]
v3.1

The patch looks ok to me.
Attachment #521723 - Flags: review?(edwsmith) → review+
http://hg.mozilla.org/projects/nanojit-central/rev/6176f4f11b27
Whiteboard: [win64] → [win64] fixed-in-nanojit
http://hg.mozilla.org/tracemonkey/rev/c3499d9cf12d
http://hg.mozilla.org/mozilla-central/rev/c3499d9cf12d
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla7
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: