Closed
Bug 34949
Opened 24 years ago
Closed 24 years ago
Mork code is not 64 bit clean
Categories
(SeaMonkey :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
M16
People
(Reporter: jim_nance, Assigned: Bienvenu)
Details
Attachments
(4 files)
642 bytes,
patch
|
Details | Diff | Splinter Review | |
2.33 KB,
patch
|
Details | Diff | Splinter Review | |
1.93 KB,
patch
|
Details | Diff | Splinter Review | |
673 bytes,
patch
|
Details | Diff | Splinter Review |
Running mozilla under Tru64 Unix generates unaligned access messages.
After looking at the code it appears that it makes assumptions that
are not valid on 64 bit machines. For example, from mork.h:
typedef unsigned long mork_u4; // make sure this is four bytes
typedef long mork_i4; // make sure this is four bytes
Which I think is causing this problem:
Thread stopped on Unaligned Access
stopped at [void InitWeeBookAtom(class morkEnv*, const morkBuf&, class
morkAtomSpace*, mork_aid):495 0x30011042f9c]
495 mAtom_Change = morkChange_kNil;
(ladebug) w
490 void
491 morkWeeBookAtom::InitWeeBookAtom(morkEnv* ev, const morkBuf& inBuf,
492 morkAtomSpace* ioSpace, mork_aid inAid)
493 {
494 mAtom_Kind = 0;
> 495 mAtom_Change = morkChange_kNil;
496 if ( ioSpace )
497 {
498 if ( inAid )
499 {
(ladebug) p this
0x14017e054
(ladebug)
Note that this is aligned on a 4 byte boundary, which is not appropriate for
a structure which contains longs, as this one mistakedly does.
I found some more stuff. From db/mork/src/morkConfig.h: /* ===== pooling ===== */ #define MORK_CONFIG_PTR_SIZE_4 1 /* sizeof(void*) == 4 */ //define MORK_CONFIG_ALIGN_8 1 /* must have 8 byte alignment */
Autoconfig should be setting up the stuff that is in the morkConfig.h header, so I am adding cls to the CC list.
I changed the typedefs, and the defines to reflect the 64 bit hardware I am running on, and that did not fix the unaligned access problems. I just turned off Arena support in Mork to see if that fixes the problem. FWIW, this should be a problem under 64 bit AIX, IRIX, Solaris, & HPUX. What do those platforms do on unaligned accesses?
Comment 4•24 years ago
|
||
this bug belongs to david b, who is owning mork now.
Assignee: asadotzler → bienvenu
Assignee | ||
Comment 5•24 years ago
|
||
I'm not going to be able to do anything about this without some help from someone who knows about these 64 bit platforms.
Assignee | ||
Comment 7•24 years ago
|
||
OK, but that doesn't help me :-(
bienvenu, The 64 bit problems are pretty simple. Pointers and longs on 64 bit platforms are 8 bytes instead of the usual 4 bytes. If you look in mork.h and mdb.h you will see comments where it is assumed that longs are 4 bytes. The correct way to fix this is probably to change all the types that assume a length to PRUint32 or PRInt16 or whatever PR type is appropriate. Another issue is that there is a #define that is set if you have 4 byte pointers. This needs to be tied into the autoconf system somehow since pointer size varies from platform to platform. Finally there is the issue of alignment. The address of a 64 bit quantity like a pointer or a long must be divisible by 8. If you use new or malloc to obtain memory for your structures, this happens automatically. Mork uses its own memory allocator, and it seems to align memory so that it is divisable by 4 but not always by 8. I dont know where this code is and someone may need to track it down. I have a report from wtc that the unaligned access messages go away after fixing mdb.h, so that may have fixed this problem.
Assignee | ||
Comment 9•24 years ago
|
||
using PR types is easy, now that I'm maintaining this code. I'll do that. who can I ask about the autoconf system? I know in theory about the sub-allocator used by mork, but I need to know if there's anything left to do here (wtc?)
Reporter | ||
Comment 10•24 years ago
|
||
Chris Seawood is a good autoconf contact.
Comment 11•24 years ago
|
||
It'd be cake to add a sizeof(long) == 8 test to configure that would set USE_64BIT or SIZEOF_LONG_IS_8 or something.
Assignee | ||
Comment 12•24 years ago
|
||
that would be very helpful, I think. Why hasn't anyone needed this before?
Status: NEW → ASSIGNED
Reporter | ||
Comment 13•24 years ago
|
||
It may be possible to do the check with a test like if(sizeof(void*)==8) { } else if(sizeof(void*)==4) { } statement. This may be what other code is doing. I dont know what mork does with that define, but if you can use something like the expression above, then you will not need autoconf changes. Unfortunatly you can not use sizeof() expressions in #if statements, so you would have to change the actual code.
Assignee | ||
Comment 14•24 years ago
|
||
i'd rather not do it at runtime...mork is all set up for using a #define
Comment 15•24 years ago
|
||
Unfortunately, you're doing a runtime check either way. The test in configure has to be a runtime check. Unfortunately, doing it in configure means that cross-compiling will probably use the wrong value. When cross-compiling, configure will error on the side of caution and assume the runtime check would have failed.
Comment 16•24 years ago
|
||
Comment 17•24 years ago
|
||
As far as I know, although sizeof (variablename) is a runtime operator, I believe that sizeof (typename) is actually a compile-time operator (and can even generate compile-time errors). I don't have the K&R book nearby to check.
Comment 18•24 years ago
|
||
And one consequence of that is that there shouldn't be any cross-compilation problems relating to such a test.
Reporter | ||
Comment 19•24 years ago
|
||
I did a clean build with the types in mork.h and mdb.h changed from long to int, and with the defines in morkConfig.h set for 8 byte alignment 8 byte pointers. The unaligned access messages are gone. Both wtc and I were having some problems last night wich SIGSEGVs, but this seems to have been fixed by the rebuild. This is not terriably supprising as the alpha build is not using dependencies, so its easy to get it out of whack. Are there any plans to fix any of these problems for M15? Somehow I doubt it, but it would be nice to at least change the variables that are supposed to be 4 bytes from longs to PRInt32s in mork.h and mdb.h. This should be a fairly safe change.
Assignee | ||
Comment 20•24 years ago
|
||
No, I was going to check it in for m16 - I had no idea anyone wanted it for m15.
Target Milestone: --- → M16
Reporter | ||
Comment 21•24 years ago
|
||
I was wrong about things working with different types and defines, the SIGSEGV is still there. I am debugging it now.
Reporter | ||
Comment 22•24 years ago
|
||
I have attached a patch that finishes fixing this problem. The SIGSEGV I was getting does not look like it was related to these changes. I tested this patch under Tru64 5.0.
Reporter | ||
Comment 23•24 years ago
|
||
Reporter | ||
Comment 24•24 years ago
|
||
I forgot to mention that this patch includes the patch that chris did earlier for the autocon changes. This is in my tree. I will check it in if someone gives me permission.
Reporter | ||
Comment 25•24 years ago
|
||
Reporter | ||
Comment 26•24 years ago
|
||
Dauphin noticed that that last patch was wrong. This one should be better.
Comment 27•24 years ago
|
||
r=cls
Assignee | ||
Comment 28•24 years ago
|
||
the mork part looks good. I already checked in the use of PRInt32 to mork.h and mdb.h - david
Reporter | ||
Comment 29•24 years ago
|
||
I checked in the patch
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 30•24 years ago
|
||
OK, I was wrong. There is a problem with crashes. Its just not completely
reproducable, which is why I thought it was gone. Here is what I know
so far. We get a SEGV at:
stopped at [morkRowObject* morkRow::AcquireRowObject(class morkEnv*, class
morkStore*):250 0x3000f856b20]
250 morkRowObject* ro = mRow_Object;
If I go up 1 frame I see:
>1 0x3000f856c14 in ((morkRow*)0x4017c1b0)->AcquireRowHandle(ev=0x14025ec00,
ioStore=0x1401fd800) "/house/jnance/mbuild/mozilla/db/mork/src/morkRow.cpp":271
271 morkRowObject* object = this->AcquireRowObject(ev, ioStore);
The "this" pointer, is a 32 bit number, which wrong. I assume that somewhere
someone has truncated the top 32 bits by casting to/from an integer. I am
having some problems with the debugger, so I can not go up any further,
but I did put in some code and this is the second call to AcquireRowHandle().
I think what is happening is that the memory management code for mork
is returning this pointer after it has recycled it. I dont know the
code very well. Can anyone shed any light on how this works?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 31•24 years ago
|
||
David Mc who wrote this code has left the company - I know how to read the code but I can't really tell you how things work more than you already know, I'm afraid. It sounds like something going on in mork pools. Perhaps the handles are truncating the pointers. You can try turning off the pools in morkConfig.h by commenting out the appropriate #define.
Reporter | ||
Comment 32•24 years ago
|
||
OK, I made a little more progress. Here is the code that extracts the recycled row element pointer: /*| get_assoc: read the key and/or value at index inPos |*/ void morkMap::get_assoc(void* outKey, void* outVal, mork_pos inPos) const { mork_num valSize = this->FormValSize(); if ( valSize && outVal ) /* map holds values? caller wants the value? */ { valSize is 4, which is wrong. I assume this value was set by the person who saved it, but I dont know where that was yet. My stack frame looks like this: >0 0x3000f84d6dc in ((morkMap*)0x14047cb00)->get_assoc(outKey=0x0, outVal=0x11fffaf38, inPos=0) "/house/jnance/mbuild/mozilla/db/mork/src/morkMap.cpp":268 268 if ( valSize && outVal ) /* map holds values? caller wants the value? >1 0x3000f84e614 in ((morkMap*)0x14047cb00)->Get(ev=0x14025ec00, inKey=0x11fffaf40, outKey=0x0, outVal=0x11fffaf38, outChange=0x0) "/house/jnance/mbuild/mozilla/db/mork/src/morkMap.cpp":683 683 this->get_assoc(outKey, outVal, i); #3 0x3000f859e4c in ((morkAtomRowMap*)0x14047cb00)->GetAid(ev=0x14025ec00, inAid=129) "/house/jnance/mbuild/mozilla/db/mork/src/morkAtomMap.h~alt~deccxx_CDDAEC6C":334 #4 0x3000f85ae60 in ((morkRowSpace*)0x140243f80)->FindRow(ev=0x14025ec00, inCol=130, inYarn=0x11fffb0d8) "/house/jnance/mbuild/mozilla/db/mork/src/morkRowSpace.cpp":554 #5 0x3000f85ff00 in ((morkStore*)0x1401fd800)->FindRow(ev=0x14025ec00, inScope=128, inColumn=130, inYarn=0x11fffb0d8) "/house/jnance/mbuild/mozilla/db/mork/src/morkStore.cpp":1204 #6 0x3000f83c104 in ((orkinStore*)0x14025e800)->FindRow(mev=0x14025ed48, inRowScope=128, inColumn=130, inTargetCellValue=0x11fffb0d8, outRowOid=0x11fffb0d0, acqRow=0x11fffb0c8) "/house/jnance/mbuild/mozilla/db/mork/src/orkinStore.cpp":685 #7 0x3000f009dd4 in ((nsGlobalHistory*)0x1402a5fa0)->SetPageTitle(aURL=0x1409edd10="file:///house/jnance/atom.html", aTitle=0x1409ee080) "/house/jnance/mbuild/mozilla/xpfe/components/history/src/nsGlobalHistory.cpp":664 #8 0x3000d01d0e0 in ((nsDocShell*)0x140985300)->SetTitle(aTitle=0x1409ee080) "/house/jnance/mbuild/mozilla/docshell/base/nsDocShell.cpp":1633 #9 0x30012bc6fb0 in ((nsHTMLDocument*)0x140a38800)->SetTitle(aTitle=const class { ... }) "/house/jnance/mbuild/mozilla/layout/html/document/src/nsHTMLDocument.cpp":792
Reporter | ||
Comment 33•24 years ago
|
||
I found the problem, and I have a fix. I am not exactly sure its the right fix, but it works. The problem is we are storing pointers in a hash table, and when we created the hash table, we told it the size of the stuff we were putting in it was 4, which is wrong on 64 bit OSs. This fix makes things run: Index: morkAtomMap.cpp =================================================================== RCS file: /cvsroot/mozilla/db/mork/src/morkAtomMap.cpp,v retrieving revision 1.4 diff -u -r1.4 morkAtomMap.cpp --- morkAtomMap.cpp 1999/11/06 03:17:06 1.4 +++ morkAtomMap.cpp 2000/04/21 20:41:55 @@ -423,7 +423,7 @@ morkAtomRowMap::morkAtomRowMap(morkEnv* ev, const morkUsage& inUsage, nsIMdbHeap* ioHeap, nsIMdbHeap* ioSlotHeap, mork_column inIndexColumn) - : morkIntMap(ev, inUsage, sizeof(mork_aid), ioHeap, ioSlotHeap, + : morkIntMap(ev, inUsage, sizeof(mork_ip), ioHeap, ioSlotHeap, /*inHoldChanges*/ morkBool_kFalse) , mAtomRowMap_IndexColumn( inIndexColumn ) { I am not sure if the correct place to fix this is here, or down in morkIntMap, but the result is going to be the same either way. Comments?
Reporter | ||
Comment 34•24 years ago
|
||
Assignee | ||
Comment 35•24 years ago
|
||
I'm not sure either - I suspect your way is better. Actually, I think your way is right - the parameter you changed is the size of the value to the map, and the values are pointers. Looks good.
Comment 36•24 years ago
|
||
I found that there's a morkPointerMap class. Do you think we should use that instead of morkIntMap? However, there is a comment regarding morkPointerMap that seems wrong: // keySize for morkPointerMap equals sizeof(mork_u4) Another idea is to define mork_aid as mork_ip as opposed to mork_token.
Assignee | ||
Comment 37•24 years ago
|
||
I think mork_aid is an atom id, which probably is a 32 bit integer id. It looks like pointerMap is wrong, like you say, and may have different performance characteristics. If what you've done works, that seems good to me.
Comment 38•24 years ago
|
||
Please ignore my previous comments. I didn't understand the code. I read the relevant mork code several times and now I finally get it. The value we store in morkAtomRowMap is (morkRow*), so the morkAtomRowMap constructor should pass an inValSize of sizeof(mork_ip) (== sizeof(morkRow*)) to the base class morkIntMap constructor. mork_aid is the key. Since mork_aid is defined as mork_token, which is defined as mork_u4, it is correct to base morkAtomRowMap on morkIntMap, which uses mork_u4 as the key. Jim Nance's patch (id=7824) is correct. r=wtc@netscape.com.
Assignee | ||
Comment 39•24 years ago
|
||
isn't this fixed now? Does whoever fixed it want to take it and mark it fixed?
Comment 40•24 years ago
|
||
I verified that the latest patch (id=7824) has been checked in. Marked the bug fixed.
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Comment 41•24 years ago
|
||
Sorry for the spam. New QA Contact for Browser General. Thanks for your help Joseph (good luck with the new job) and welcome aboard Doron Rosenberg
QA Contact: jelwell → doronr
Comment 42•24 years ago
|
||
can one of you please Verify that this is FIxed. Neither Doron nor I are going to be able to verify this. Thanks.
Updated•20 years ago
|
Product: Browser → Seamonkey
You need to log in
before you can comment on or make changes to this bug.
Description
•