Closed Bug 526586 Opened 11 years ago Closed 11 years ago
Crash in [@ ns
Thread Manager::Init()] when accessing TLS variable
Bug 521750 added a TLS global variable to nsThreadManager. On Windows XP (debug and release) builds of Fennec + XULRunner, we are crashing when the TLS variable is accessed in nsThreadManager::Init Call stack: > xul.dll!nsThreadManager::Init() Line 110 + 0xc bytes C++ xul.dll!NS_InitXPCOM3_P(nsIServiceManager * * result=0x0012ec5c, nsIFile * binDirectory=0x009abf00, nsIDirectoryServiceProvider * appFileLocationProvider=0x0012ed74, const nsStaticModuleInfo * staticComponents=0x023266f8, unsigned int componentCount=50) Line 477 + 0xc bytes C++ xul.dll!ScopedXPCOMStartup::Initialize() Line 1073 + 0x29 bytes C++ xul.dll!XRE_main(int argc=1, char * * argv=0x00908b10, const nsXREAppData * aAppData=0x009abe08) Line 3278 + 0xb bytes C++ fennec.exe!NS_internal_main(int argc=1, char * * argv=0x00908b10) Line 502 + 0x1a bytes C++ fennec.exe!wmain(int argc=1, wchar_t * * argv=0x00904f60) Line 110 + 0xd bytes C++ fennec.exe!__tmainCRTStartup() Line 327 + 0x19 bytes C fennec.exe!wmainCRTStartup() Line 196 C Disassembly: #ifdef NS_TLS gTLSIsMainThread = true; 01D8FEA8 mov edx,dword ptr [__tls_index (2B2E75Ch)] 01D8FEAE mov eax,dword ptr fs:[0000002Ch] 01D8FEB4 mov ecx,dword ptr [eax+edx*4] <=== CRASH 01D8FEB7 mov byte ptr [ecx+101h],1 #endif eax and edx are both 0 I checked with a release bug from here: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mobile-trunk/fennec-1.0b5.en-US.win32.zip The debug build was from my own m-c tree.
__declspec(thread) doesn't work in a DLL loaded dynamically with LoadLibrary (such as, for example, a COM DLL). The only way around is to drop __declspec(thread) and use thread-local strorage APIs (TlsAlloc et al) directly. This is a pretty unfortunate regression!
We need to fix this for 220.127.116.11, but won't block 18.104.22.168 on it since it only affects XULRunner/embedded. This probably needs to block 1.9.2 as well.
Releng is going to push XR 22.214.171.124 out as official. We were going to skip 126.96.36.199 and got straight to 188.8.131.52, but we can change that plan.
dbaron: Can you own this? This blocks 184.108.40.206 which freezes on November 10 at 11:59pm.
Whiteboard: [needs owner]
I know almost nothing about this stuff; I think Benjamin would be a better owner.
Assigning this to Benjamin for now then.
Assignee: nobody → benjamin
Whiteboard: [needs owner]
so, this is the lazy patch. i think dbaron should be able to review it, since it's written using gecko (nspr) style code. one more advanced approach would involve distinct functions and a function pointer to deal with the differences between w5 and w6.
Comment on attachment 410463 [details] [diff] [review] untested The whole point of the fast check was to avoid the extra function all overhead of the NSPR functions. I don't know yet whether direct calls to TlsAlloc/TlsGetValue will have smaller overhead.
Attachment #410463 - Flags: review-
This compiles... going to run through tryserver momentarily.
Attachment #411230 - Flags: review? → review?(dbaron)
this is holding the beta 5 release due to desktop win32 build being dead.
tracking-fennec: --- → 1.0+
Flags: blocking1.9.2? → blocking1.9.2+
Again, with #undef ERROR for treewide sanity.
i've backed the offending code out of 1.9.2 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/bc5789c6e2c3 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0e7b0f021674 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/e3bb4f1d970a
I think stuart's objection to bsmedberg's latest patch was: [2009-11-09 15:38:51] <stuart> yes, his patch doesn't work [2009-11-09 15:39:11] <stuart> (you can't include windows.h in a core headerfile and hope for a real build) Maybe, to fix that, we should just not bother with the inline on Windows?
My latest version, http://hg.mozilla.org/try/rev/d4a8dbb3eee0 does exactly that, but I'm worried about the performance costs of the non-inline version, since this is a hot path and the inline can make it almost disappear but the called version can't so much. If m-c is relatively clear somebody is welcome to try it today, or I can do it in the morning tomorrow which is likely to be quieter.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
dbaron: I hear Benjamin isn't fully around today. Can you port this over to 1.9.1 before code freeze tonight?
Comment on attachment 411440 [details] [diff] [review] Final patch as checked in We should get this onto 1.9.2 and 1.9.2... and also reland the other patches on 1.9.2.
Attachment #411440 - Flags: review+
Landed on 1.9.2 along with the relanding of bug 521750: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/224d7a8a1c4b
Attachment #411440 - Flags: approval220.127.116.11?
Comment on attachment 411440 [details] [diff] [review] Final patch as checked in Approved for 18.104.22.168. a=ss
Attachment #411440 - Flags: approval22.214.171.124? → approval126.96.36.199+
Summary: Crash in nsThreadManager::Init when accessing TLS variable → Crash in [@ nsThreadManager::Init()] when accessing TLS variable
Crash Signature: [@ nsThreadManager::Init()]
You need to log in before you can comment on or make changes to this bug.