Closed
Bug 102113
Opened 23 years ago
Closed 23 years ago
nsCompressedCharMap crashes during startup on 64bit Solaris.
Categories
(Core :: Layout, defect)
RESOLVED
FIXED
mozilla0.9.6
People
(Reporter: pavlov, Assigned: bstell)
Details
(Keywords: 64bit)
Attachments
(1 file, 1 obsolete file)
5.33 KB, patch (shanjian: review+, brendan: superreview+)
stack trace:
=>[1] nsCompressedCharMap::SetChar(this = 0xffffffff7fff4d10, aChar = 338U), line 156 in "nsCompressedCharMap.cpp"
[2] InitGlobals(), line 810 in "nsFontMetricsGTK.cpp"
[3] nsFontMetricsGTK::Init(this = 0x10050a490, aFont = STRUCT, aLangGroup = 0x100407380, aContext = 0x100441250), line 1035 in "nsFontMetricsGTK.cpp"
[4] nsFontCache::GetMetricsFor(this = 0x100506bc0, aFont = STRUCT, aLangGroup = 0x100407380, aMetrics = (nil)), line 568 in "nsDeviceContext.cpp"
[5] DeviceContextImpl::GetMetricsFor(this = 0x100441250, aFont = STRUCT, aLangGroup = 0x100407380, aMetrics = (nil)), line 233 in "nsDeviceContext.cpp"
[6] ComputeLineHeight(aRenderingContext = 0x100508910, aStyleContext = 0x1004ffd18), line 2140 in "nsHTMLReflowState.cpp"
[7] nsHTMLReflowState::CalcLineHeight(aPresContext = 0x10040c290, aRenderingContext = 0x100508910, aFrame = 0x1004ffd80), line 2182 in "nsHTMLReflowState.cpp"
[8] nsBlockReflowState::nsBlockReflowState(this = 0xffffffff7fffa140, aReflowState = STRUCT, aPresContext = 0x10040c290, aFrame = 0x1004ffd80, aMetrics = STRUCT, aBlockMarginRoot = 4194304), line 156 in "nsBlockReflowState.cpp"
[9] nsBlockFrame::Reflow(this = 0x1004ffd80, aPresContext = 0x10040c290, aMetrics = STRUCT, aReflowState = STRUCT, aStatus = 0), line 693 in "nsBlockFrame.cpp"
I am seeing this on a build on Solaris 8, built with Forte 6U2 with -xarch=v9.
Comment 1•23 years ago
|
||
In LXR, there's nothing at line 156 in the current version of nsCompressedCharMap.cpp:
http://lxr.mozilla.org/seamonkey/source/gfx/src/nsCompressedCharMap.cpp
Which version of Mozilla are you using?
Reporter | ||
Comment 2•23 years ago
|
||
It ends up being line 171 because of the license changes. There haven't been any other changes to the file. I will update my tree, though, so my line numbers will be right.
Assignee | ||
Comment 3•23 years ago
|
||
Pav: is sheep a 64 bit system? If not is there a system I can build/debug on?
Reporter | ||
Comment 4•23 years ago
|
||
Yeah, sheep can be a 64-bit system. Add /opt/64bit/bin to the beginning of PATH and /opt/64bit/lib to the beginning of LD_LIBRARY_PATH (this is where I installed the 64-bit glib/gtk/libIDL libraries on sheep). Then set CC to "cc -xarch=v9", CXX to "CC -xarch=v9", and ASFLAGS to "-xarch=v9". Run configure as you normally would, and build. When it is done, you'll have a 64-bit build. dbx/workshop work as normal.
Assignee | ||
Comment 5•23 years ago
|
||
okay, made the indicated changes and I have started a build
Assignee | ||
Comment 6•23 years ago
|
||
It seems to be failing to find a 64 bit thread locking routine.
rm -f libmozjs.so
CC -xarch=v9 -I/usr/openwin/include -mt -DDEBUG -DDEBUG_ -DTRACING -g -G -Qoption ld -z,muldefs -h libmozjs.so -o libmozjs.so jsapi.o jsarena.o jsarray.o jsatom.o jsbool.o jscntxt.o jsdate.o jsdbgapi.o jsdhash.o jsdtoa.o jsemit.o jsexn.o jsfun.o jsgc.o jshash.o jsinterp.o jslock.o jslog2.o jslong.o jsmath.o jsnum.o jsobj.o jsopcode.o jsparse.o jsprf.o jsregexp.o jsscan.o jsscope.o jsscript.o jsstr.o jsutil.o jsxdrapi.o prmjtime.o lock_SunOS.o -xildoff -lm -lposix4 -ldl -lnsl -lsocket -L../../dist/bin -L/builds/bstell/mozilla/dist/lib -lplds4 -lplc4 -lnspr4 -lpthread -ldl -lsocket -ldl -lm
ld: fatal: file lock_SunOS.o: wrong ELF class: ELFCLASS32
ld: fatal: File processing errors. No output written to libmozjs.so
Reporter | ||
Comment 7•23 years ago
|
||
Is this build on top of another build or a fresh tree? Sun's cache might be getting confused if this is on top of another build. I would recommend doing a 'gmake -f client.mk distclean' on the tree.
Assignee | ||
Comment 8•23 years ago
|
||
I did a "gmake -f client.mk distclean" then a "./configure" before the build
Comment 9•23 years ago
|
||
Looks like lock_SunOS.s wasn't built using -xarch=v9. Can you double check ASFLAGS in config/autoconf.mk and make sure it was set? Also, can you check the compile line in the log to see how lock_SunOS.o was built?
Assignee | ||
Comment 10•23 years ago
|
||
config/autoconf.mk: ASFLAGS = -K PIC -L -P -D_ASM -D__STDC__=0
Assignee | ||
Comment 11•23 years ago
|
||
/usr/ccs/bin/as -o lock_SunOS.o -K PIC -L -P -D_ASM -D__STDC__=0 lock_SunOS.s
Comment 12•23 years ago
|
||
Ok, that's the problem. Re-reading your previous comment, I don't see where CC/CXX/ASFLAGS were passed into the build. You need to either add those settings to your mozconfig or pass them on the ./configure line. Add:
CC="cc -xarch=v9"
CXX="CC -xarch=v9"
ASFLAGS="-xarch=v9"
to ~/.mozconfig, or run:
env CC="cc -xarch=v9" CXX="CC -xarch=v9" ASFLAGS="-xarch=v9" ./configure
Assignee | ||
Comment 13•23 years ago
|
||
okay, I finally have a build.
Assignee | ||
Comment 14•23 years ago
|
||
okay, I'll set "-g" in CFLAGS and CCFLAGS and see if I get debug symbols
Comment 15•23 years ago
|
||
bstell: mark it assigned if you agree to work on it.
Assignee | ||
Updated•23 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 16•23 years ago
|
||
Here is the error:
signal BUS (invalid address alignment)
Looks like the array needs to be 64-bit aligned.
Comment 17•23 years ago
|
||
bstell, sorry I didn't catch this 64-bit impurity in review. RISCs generally require natural alignment. The only way to ensure it is with a union around the array of PRUint16s. That'll cost an extra "u." member name and dot operator, but no big deal. /be
Comment 18•23 years ago
|
||
Oh, and (of course) round up to a 0 mod 8 byte boundary when allocating from the map -- is that going to waste too much space? We have to 0 mod 4 align for uint32 access, already. /be
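A minimal sketch of the round-up described above, assuming lengths and offsets in the map are counted in 16-bit units. The helper name `RoundUpToAlignment` and the use of `std::uint16_t` in place of PRUint16 are illustrative assumptions, not code from the tree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the Mozilla tree): round a length counted
// in 16-bit units up to the next multiple of `alignBytes`, and return the
// result in 16-bit units as well. `alignBytes` is assumed to be a multiple
// of sizeof(std::uint16_t), e.g. 4 or 8.
inline std::size_t RoundUpToAlignment(std::size_t lenInUint16,
                                      std::size_t alignBytes = 8) {
    const std::size_t unitsPerBlock = alignBytes / sizeof(std::uint16_t);
    return (lenInUint16 + unitsPerBlock - 1) / unitsPerBlock * unitsPerBlock;
}
```

With 8-byte alignment this pads each allocation by at most three 16-bit units (6 bytes), which gives a rough bound on the space cost being asked about.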
Assignee | ||
Comment 19•23 years ago
|
||
Assignee | ||
Comment 20•23 years ago
|
||
Attachment 52160 forces the map into 16 bit access. This stops the crash.
A complete fix would probably involve typing the memory arrays (both
stack and heap) to ALU_TYPE and doing casts for all the 16 bit accesses.
At present the 64 bit version runs, the profile manager looks okay,
but the pages are completely blank. Not even images show. I believe
this is unrelated but it prevents me from verifying this patch.
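For illustration, a sketch of what "16 bit access" means here: a page is copied one 16-bit element at a time, so no load or store is wider than the alignment the array is guaranteed to have. The function name `CopyPage16`, the page length of 16 entries, and `std::uint16_t` standing in for PRUint16 are assumptions; attachment 52160 itself is not reproduced in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page length in 16-bit entries (illustrative).
const std::size_t kPageLen = 16;

// Hypothetical stand-in for the page copy: every access is 16 bits wide,
// so it is safe for any pageOffset into an array of 16-bit elements,
// regardless of how that offset falls relative to 4- or 8-byte boundaries.
inline void CopyPage16(std::uint16_t* map, std::size_t pageOffset,
                       const std::uint16_t* srcPage) {
    for (std::size_t i = 0; i < kPageLen; ++i) {
        map[pageOffset + i] = srcPage[i];
    }
}
```

The trade-off is more loads and stores per page than a copy done in ALU-sized words, which is presumably why wider accesses were kept on the table for later.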
Assignee | ||
Updated•23 years ago
|
Target Milestone: --- → mozilla0.9.5
Assignee | ||
Comment 21•23 years ago
|
||
This close to 0.9.4 branch I'd prefer to get the simplest fix in.
Assignee | ||
Comment 22•23 years ago
|
||
local files display but remote URLs do not
Assignee | ||
Comment 23•23 years ago
|
||
failing to display remote URLs is probably a separate bug
Assignee | ||
Comment 24•23 years ago
|
||
When I click the off-line icon I get this error: ###!!! ASSERTION: Should have thread when shutting down.: 'Not Reached', file nsSocketTransportService.cpp, line 733 ###!!! Break: at file nsSocketTransportService.cpp, line 733 JavaScript error: line 0: uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIIOService.offline]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: chrome://communicator/content/utilityOverlay.js :: toggleOfflineStatus :: line 69" data: no]
Reporter | ||
Comment 25•23 years ago
|
||
Comment on attachment 52160
patch; force all to use 16 bit access
r=pavlov
Attachment #52160 -
Flags: review+
Comment 26•23 years ago
|
||
Comment on attachment 52160
patch; force all to use 16 bit access
Good for 0.9.5, sr=brendan@mozilla.org. Please leave this bug open so we can look into wider memory accesses for 0.9.6 trunk.
Attachment #52160 -
Flags: superreview+
Comment 27•23 years ago
|
||
Comment on attachment 52160
patch; force all to use 16 bit access
a=asa (on behalf of drivers) for checkin to 0.9.5.
Attachment #52160 -
Flags: approval+
Comment 28•23 years ago
|
||
Did this get checked into the m0.9.5 branch? If so, please move it to m0.9.6 if you want to keep it open.
Assignee | ||
Updated•23 years ago
|
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Assignee | ||
Updated•23 years ago
|
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Assignee | ||
Comment 29•23 years ago
|
||
from bobbell@zk3.dec.com in bug 108950:
> nsCompressedCharMap::SetChars accesses unaligned memory. With the default
> settings on Tru64 UNIX, Tru64 UNIX correctly detects this, corrects it, and
> prints a warning message. However, Tru64 UNIX can also be set to crash with
> this behavior, and it is technically incorrect.
>
> The problem was discovered using a recent nightly build. Lines 298 and 299 of
> nsCompressedCharMap.cpp are at fault. They read:
> NS_ASSERTION(page[i]==0, "this page should be unused");
> page[i] = aPage[i];
>
> page (from my crash dump) is a pointer on a four byte boundary. This is
> because it is an offset into mCCMap, which is an array of 16-bit data types.
> However, page is a pointer to ALU_TYPE, which on Tru64 UNIX is a 64-bit data
> type. Thus, page is not properly aligned.
Assignee | ||
Comment 30•23 years ago
|
||
What I do not understand is where the misalignment comes from. (I would appreciate anyone pointing out what I am missing or where I am mistaken.) The pages are each 16 shorts (32 bytes), so the page-to-page distance should maintain the same ALU boundary alignment as the start of the map. In the section around line 299, the code that accesses the page does so in ALU-sized groups, so it should maintain the same ALU boundary alignment as the start of the page. Doesn't malloc return memory that is aligned to the largest ALU size? If not, how could the code safely alloc space for the largest ALU? Is the base CCMap address on a 4 byte boundary?
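The arithmetic behind this question can be sketched as follows, taking the figure of 16 shorts (32 bytes) per page from the comment above. The function name is hypothetical. Because 32 is a multiple of both 4 and 8, stepping from page to page never changes the misalignment, so any misalignment of a page must already be present at the start of the map.

```cpp
#include <cassert>
#include <cstdint>

// A page is 16 shorts, i.e. 32 bytes (figure taken from the comment above).
const std::uint64_t kPageBytes = 16 * sizeof(std::uint16_t);

// Hypothetical helper: how many bytes page number `pageIndex` sits past an
// `aluBytes` boundary, given the byte address at which the map starts.
inline std::uint64_t PageMisalignment(std::uint64_t mapBase,
                                      std::uint64_t pageIndex,
                                      std::uint64_t aluBytes) {
    return (mapBase + pageIndex * kPageBytes) % aluBytes;
}
```

For a map that starts on an 8-byte boundary every page is aligned; for a map that starts 4 bytes past one, every page is off by the same 4 bytes.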
Comment 31•23 years ago
|
||
I'm seeing this problem again with a tip v9 build using WS5.
(/opt/SUNWspro/WS5.0/bin/sparcv9/dbx) where
current thread: t@1
=>[1] nsCompressedCharMap::SetChar(this = 0xffffffff7fff5e80, aChar = 338U), line 223 in "nsCompressedCharMap.cpp"
[2] InitGlobals(), line 825 in "nsFontMetricsGTK.cpp"
[3] nsFontMetricsGTK::Init(this = 0x1004e9ec0, aFont = STRUCT, aLangGroup = 0x100420340, aContext = 0x100441430), line 1050 in "nsFontMetricsGTK.cpp"
[4] nsFontCache::GetMetricsFor(this = 0x1004e5a50, aFont = STRUCT, aLangGroup = 0x100420340, aMetrics = (nil)), line 631 in "nsDeviceContext.cpp"
[5] DeviceContextImpl::GetMetricsFor(this = 0x100441430, aFont = STRUCT, aLangGroup = 0x100420340, aMetrics = (nil)), line 266 in "nsDeviceContext.cpp"
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsHTMLReflowState.o"
dbx: warning: see `help pathmap'
[6] ComputeLineHeight(0x1004e9340, 0x1004e0360, 0xffffffff7fffb104, 0xffffffff74d9ef8c, 0x0, 0xffffffff74d7a808), at 0xffffffff74e1521c
[7] nsHTMLReflowState::CalcLineHeight(0x1004417b0, 0x1004e9340, 0x1004e03c0, 0x0, 0x0, 0x0), at 0xffffffff74e1550c
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBlockReflowState.o"
[8] nsBlockReflowState::nsBlockReflowState(0xffffffff7fffb038, 0xffffffff7fffb4b8, 0x1004417b0, 0x1004e03c0, 0xffffffff7fffb5c8, 0x400000), at 0xffffffff74d9ef8c
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBlockFrame.o"
[9] nsBlockFrame::Reflow(0xffffffff7fffb5c8, 0x1004417b0, 0xffffffff7fffb5c8, 0xffffffff7fffb4b8, 0xffffffff7fffbc04, 0xffffffff755926c0), at 0xffffffff74d7ebb4
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsContainerFrame.o"
[10] nsContainerFrame::ReflowChild(0x100489e90, 0x1004e03c0, 0x1004417b0, 0xffffffff7fffb5c8, 0xffffffff7fffb4b8, 0x0), at 0xffffffff74db3324
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsHTMLFrame.o"
[11] CanvasFrame::Reflow(0x0, 0x0, 0xffffffff7fffbc04, 0xffffffff7fffb7f8, 0xffffffff7fffbc04, 0x2), at 0xffffffff74e055bc
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBoxToBlockAdaptor.o"
[12] nsBoxToBlockAdaptor::Reflow(0x1004e02d0, 0xffffffff7fffc920, 0x1004417b0, 0xffffffff7fffbbc0, 0xffffffff7fffcc98, 0xffffffff7fffbc04), at 0xffffffff75164ebc
[13] nsBoxToBlockAdaptor::DoLayout(0x0, 0x0, 0x76c, 0x76c, 0x1, 0xffffffff75164588), at 0xffffffff7516435c
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBox.o"
[14] nsBox::Layout(0x1004e02d0, 0xffffffff7fffc920, 0xffffffff7fffbff8, 0x0, 0x0, 0x2), at 0xffffffff751553fc
dbx: warning: can't find file "/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsScrollBoxFrame.o"
Comment 32•23 years ago
|
||
*** Bug 108950 has been marked as a duplicate of this bug. ***
Comment 33•23 years ago
|
||
In response to Brian Stell's comment #30:
> Doesn't malloc return memory that is aligned to the largest ALU size?
> If not how could the code safely alloc space for the largest ALU?
> Is the base CCMap address on a 4 byte boundary?
malloc() does indeed return memory that is aligned so that it can be used by any data type (which here would make it 64-bit aligned). However, here the memory is not being explicitly malloc'ed. The definition of the class nsCompressedCharMap includes:
protected:
  PRUint16 mUsedLen; // in PRUint16
  PRUint16 mAllOnesPage;
  PRUint16 mCCMap[CCMAP_MAX_LEN];
Thus, mCCMap is only guaranteed to be aligned for PRUint16 access. I believe what the Compaq cxx compiler is doing internally is aligning mUsedLen on a 64-bit boundary (either intentionally or by chance), which puts mCCMap only four bytes (two PRUint16's) later, which is not on a 64-bit boundary. From some debug printfs I added:
mCCMap @ 0x11fff30f4
page_offset == 0x40
page == 0x11fff3174
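The layout described above can be checked with a simplified stand-in for the three quoted members. This is a sketch, not the real class: `CharMapLayout`, `PageAddress`, and the array length of 256 are assumptions (the value of CCMAP_MAX_LEN is not given in this thread), and `std::uint16_t` stands in for PRUint16. The printf values are consistent with page_offset being counted in 16-bit units: 0x11fff30f4 + 0x40 * 2 = 0x11fff3174, which is 4 bytes past an 8-byte boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative length only; the real CCMAP_MAX_LEN is not given in the thread.
const std::size_t kCCMapMaxLen = 256;

// Simplified stand-in for the members quoted above (not the real class).
struct CharMapLayout {
    std::uint16_t mUsedLen;
    std::uint16_t mAllOnesPage;
    std::uint16_t mCCMap[kCCMapMaxLen];  // starts 4 bytes into the object
};

// Byte address of a page, given the map's byte address and a page offset
// counted in 16-bit units.
inline std::uint64_t PageAddress(std::uint64_t mapAddr,
                                 std::uint64_t pageOffset) {
    return mapAddr + pageOffset * sizeof(std::uint16_t);
}
```

Since the array starts 4 bytes into the object, an object that happens to begin on an 8-byte boundary places every page 4 bytes off that boundary.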
Assignee | ||
Comment 34•23 years ago
|
||
Thanks for the insight; this I can fix.
Assignee | ||
Comment 35•23 years ago
|
||
bobbell: could you try this patch? thanks
Attachment #52160 -
Attachment is obsolete: true
Assignee | ||
Comment 36•23 years ago
|
||
the patch was made in the gfx directory
Assignee | ||
Updated•23 years ago
|
Target Milestone: mozilla0.9.7 → mozilla0.9.6
Comment 37•23 years ago
|
||
Comment on attachment 57165
patch; use a union to make the C++ object align the map on the largest ALU
Why not give that "ALU_TYPE dummy;" member the canonical (and less insulting :-) name, namely 'align'? sr=brendan@mozilla.org in any event. /be
Attachment #57165 -
Flags: superreview+
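A minimal sketch of the union technique named in the patch description, using the member name 'align' suggested above. It is not the attached patch, which is not reproduced in this thread. The type names, the array length, and modelling ALU_TYPE as a 64-bit integer are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumptions: ALU_TYPE is modelled as a 64-bit integer, and the map
// length is illustrative rather than the real CCMAP_MAX_LEN.
typedef std::uint64_t AluType;
const std::size_t kMapLen = 256;

// Hypothetical stand-in for the class members, with the map wrapped in a
// union so the compiler must place it on an AluType boundary.
struct AlignedCharMap {
    std::uint16_t mUsedLen;
    std::uint16_t mAllOnesPage;
    union {
        std::uint16_t mCCMap[kMapLen];
        AluType align;  // never read or written; present only for alignment
    } u;
};

// Small helper to test whether a pointer is a multiple of `alignment` bytes.
inline bool IsAlignedFor(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

The cost is the extra "u." at each access plus a few bytes of padding before the union, as noted earlier in the thread. Compilers with C++11 support can get the same effect with `alignas`, but that was not an option at the time.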
Comment 38•23 years ago
|
||
bstell, can you get r= and then mail drivers@mozilla.org for a= to check in for 0.9.6? Thanks, /be
Assignee | ||
Comment 39•23 years ago
|
||
okay, after only 4 hours I have a 64 bit build on sheep and it crashes without the patch and runs with that patch. (This is so weird: both my linux systems are still horked from the network upgrade :( so I'm using my Win98 system to display the Solaris client.)
Comment 40•23 years ago
|
||
Comment on attachment 57165
patch; use a union to make the C++ object align the map on the largest ALU
I don't see any problem with the patch. r=shanjian
Attachment #57165 -
Flags: review+
Assignee | ||
Comment 42•23 years ago
|
||
checked into 0.9.6 branch
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED