Closed
Bug 327609
Opened 18 years ago
Closed 18 years ago
nsAutoLock trashes stack and triggers MSVC 7 RTC warning about corrputed stack perhaps due to incorrect calling conventions
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
INVALID
People
(Reporter: bent.mozilla, Assigned: dougt)
Details
Attachments
(3 files)
I've been working on replacing some windows-specific threading code to use nspr and nsAutoLock. I've repeatedly run into the same crash where it looks like the nsAutoLock constructor is trashing the stack and then the destructor causes an exception trying to either unlock a bad PRLock* or the function returns to a bogus memory address. Running in debug mode with the runtime check compiler switches (/RTC1 and /GS) turned on I receive the warning 'Run-Time Check Failure #2 - Stack around the variable 'guidLock' was corrupted.' Here's the function I'm crashing in: NS_IMETHODIMP DatabaseQuery::SetDatabaseGUID(const PRUnichar *dbGUID) { if (!dbGUID) return NS_ERROR_NULL_POINTER; nsAutoLock guidLock(m_pDatabaseGUIDLock); // m_DatabaseGUID = dbGUID; return NS_OK; } The constructor and destructor: DatabaseQuery::DatabaseQuery() : m_pDatabaseGUIDLock(nsnull) { PRLock* pLock = PR_NewLock(); if (pLock) m_pDatabaseGUIDLock = pLock; } DatabaseQuery::~DatabaseQuery() { if (m_pDatabaseGUIDLock) PR_DestroyLock(m_pDatabaseGUIDLock); } When I saw that the stack was getting trashed I commented out everything in the function except the nsAutoLock just to make sure that the lock was indeed the culprit. I verified that m_pDatabaseGUIDLock is getting set correctly. Then I hopped into the debugger and started looking at the memory. Here's what's going on: nsAutoLock has two members: PRLock* mLock PRBool mLocked. nsAutoLockBase has three members (in debug mode): void* mAddr nsAutoLockBase* mDown nsAutoLockType mType The VC compiler is either not making enough space on the stack or the stack is misaligned. The base variables are initialized in the nsAutoLockBase constructor and occupy the 12 bytes immediately preceeding the other stack data (return pointer and other stuff). Then, when the nsAutoLock constructor runs, it overwrites the next 8 bytes (which contained the return address and some other data) and I get a friendly access violation. I'll describe that process a bit more here: Stack (before constructors): ... <--- lots more cc above here (this is VC padding) cc cc cc cc cc cc cc cc cc cc cc cc <--- &guidLock cc cc cc cc cc cc cc cc f0 1f e2 00 <--- return address .. .. .. .. \ .. .. .. .. } <--- other data .. .. .. .. / Then, in steps: (1,2,3) the base constructor is run; (4,5) then the other. ... cc cc cc cc cc cc cc cc 68 47 43 01 <--- &guidLock (1) AND mAddr 00 00 00 00 <--- (2) mDown 00 00 00 00 <--- (3) mType f0 1f e2 00 <--- return address .. .. .. .. <--- other data ... cc cc cc cc cc cc cc cc 68 47 43 01 <--- mAddr 00 00 00 00 <--- mDown 00 00 00 00 <--- mType 68 47 43 01 <--- return address <- (4) mLock 01 00 00 00 <--- other data <- (5) mLocked .. .. .. .. I'm pretty sure the problem has to do with calling conventions... My compiler is set to use __cdecl by default, and I'm pretty sure that nspr is compiled with __cdecl: bsmedberg pointed me to prtypes.h to figure that out. However, I realized that the NS_IMETHODIMP macro gets expanded to 'virtual __stdcall nsresult' and so that function was actually using __stdcall. We use the NS_IMETHODIMP macro because that is what XPIDL told us to use in this XPCOM component. Sadly, replacing all the instances of NS_IMETHODIMP declaring SetDatabaseGUID to 'virtual __cdecl nsresult' didn't fix the problem entirely. When compiled with that convention the stack was still misaligned but instead of overwriting 8 bytes of data it only overwrote 4. So I believe that I'm on the right track to figuring this out... Is XPIDL giving me the correct recommendation? Or is this somehow a real bug in nsAutoLock? I've CC'ed the only folks I could find in bugzilla that had previously worked with nsAutoLock. There aren't many of you!
Reporter | ||
Comment 1•18 years ago
|
||
Trace of the call stack. I'm pretty sure this is useless info, but just in case...
Ok. bugzilla and IE hate me. and I hate them. I'm having a really terrible morning. This comment has been eaten *Twice*. I'm very sorry for the fact that the comment you get is not going to be very good. For now, let's pretend calling conventions are a red herring. please put back the NS_IMETHODIMP stuff and do: make -C xpcom/threads nsAutoLock.i make ... CDatabaseQuery.i zip the two .i files and your mozconfig together and attach them. I won't be able to look before sunday.
Comment 3•18 years ago
|
||
if you want your functions to be callable from javascript, you better use the calling convention that NS_IMETHODIMP gives you...
Reporter | ||
Comment 4•18 years ago
|
||
(In reply to comment #2) Here you go. Something interesting that I noticed: Line 130032 of DatabaseQuery.i is where nsAutoLockBase is defined. It does *not* show that it has any members (mAddr, mDonw, mType). Here's what the included header says, though: class NS_COM nsAutoLockBase { protected: nsAutoLockBase() {} enum nsAutoLockType {eAutoLock, eAutoMonitor, eAutoCMonitor}; #ifdef DEBUG nsAutoLockBase(void* addr, nsAutoLockType type); ~nsAutoLockBase(); void Show(); void Hide(); void* mAddr; nsAutoLockBase* mDown; nsAutoLockType mType; #else nsAutoLockBase(void* addr, nsAutoLockType type) {} ~nsAutoLockBase() {} void Show() {} void Hide() {} #endif }; Weird. I'll investigate this a bit more today.
Comment 5•18 years ago
|
||
(In reply to comment #4) > Weird. I'll investigate this a bit more today. What's the mystery? You've pretty clearly compiled with and without DEBUG defined. /be
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → INVALID
Reporter | ||
Comment 6•18 years ago
|
||
Argh, yes. Silly. The MSVC standard define for debug mode is _DEBUG, and the mozilla define is DEBUG. Sheesh. Thanks guys.
You need to log in
before you can comment on or make changes to this bug.
Description
•