Bug 521750 added a TLS global variable to nsThreadManager. On Windows XP (debug and release) builds of Fennec + XULRunner, we are crashing when the TLS variable is accessed in nsThreadManager::Init Call stack: > xul.dll!nsThreadManager::Init() Line 110 + 0xc bytes C++ xul.dll!NS_InitXPCOM3_P(nsIServiceManager * * result=0x0012ec5c, nsIFile * binDirectory=0x009abf00, nsIDirectoryServiceProvider * appFileLocationProvider=0x0012ed74, const nsStaticModuleInfo * staticComponents=0x023266f8, unsigned int componentCount=50) Line 477 + 0xc bytes C++ xul.dll!ScopedXPCOMStartup::Initialize() Line 1073 + 0x29 bytes C++ xul.dll!XRE_main(int argc=1, char * * argv=0x00908b10, const nsXREAppData * aAppData=0x009abe08) Line 3278 + 0xb bytes C++ fennec.exe!NS_internal_main(int argc=1, char * * argv=0x00908b10) Line 502 + 0x1a bytes C++ fennec.exe!wmain(int argc=1, wchar_t * * argv=0x00904f60) Line 110 + 0xd bytes C++ fennec.exe!__tmainCRTStartup() Line 327 + 0x19 bytes C fennec.exe!wmainCRTStartup() Line 196 C Disassembly: #ifdef NS_TLS gTLSIsMainThread = true; 01D8FEA8 mov edx,dword ptr [__tls_index (2B2E75Ch)] 01D8FEAE mov eax,dword ptr fs:[0000002Ch] 01D8FEB4 mov ecx,dword ptr [eax+edx*4] <=== CRASH 01D8FEB7 mov byte ptr [ecx+101h],1 #endif eax and edx are both 0 I checked with a release bug from here: http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mobile-trunk/fennec-1.0b5.en-US.win32.zip The debug build was from my own m-c tree.
__declspec(thread) doesn't work in a DLL loaded dynamically with LoadLibrary (such as, for example, a COM DLL). The only way around is to drop __declspec(thread) and use thread-local strorage APIs (TlsAlloc et al) directly. This is a pretty unfortunate regression!
We need to fix this for 18.104.22.168, but won't block 22.214.171.124 on it since it only affects XULRunner/embedded. This probably needs to block 1.9.2 as well.
Releng is going to push XR 126.96.36.199 out as official. We were going to skip 188.8.131.52 and got straight to 184.108.40.206, but we can change that plan.
dbaron: Can you own this? This blocks 220.127.116.11 which freezes on November 10 at 11:59pm.
I know almost nothing about this stuff; I think Benjamin would be a better owner.
Assigning this to Benjamin for now then.
Created attachment 410463 [details] [diff] [review] untested so, this is the lazy patch. i think dbaron should be able to review it, since it's written using gecko (nspr) style code. one more advanced approach would involve distinct functions and a function pointer to deal with the differences between w5 and w6.
Comment on attachment 410463 [details] [diff] [review] untested The whole point of the fast check was to avoid the extra function all overhead of the NSPR functions. I don't know yet whether direct calls to TlsAlloc/TlsGetValue will have smaller overhead.
Created attachment 410489 [details] [diff] [review] untested using winapi
Created attachment 411230 [details] [diff] [review] TlsAlloc, using a static initializer, rev. 1 This compiles... going to run through tryserver momentarily.
this is holding the beta 5 release due to desktop win32 build being dead.
Created attachment 411269 [details] [diff] [review] TlsAlloc, using a static initializer, rev. 1.1 Again, with #undef ERROR for treewide sanity.
Created attachment 411299 [details] [diff] [review] backout the bad code
i've backed the offending code out of 1.9.2 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/bc5789c6e2c3 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0e7b0f021674 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/e3bb4f1d970a
I think stuart's objection to bsmedberg's latest patch was: [2009-11-09 15:38:51] <stuart> yes, his patch doesn't work [2009-11-09 15:39:11] <stuart> (you can't include windows.h in a core headerfile and hope for a real build) Maybe, to fix that, we should just not bother with the inline on Windows?
My latest version, http://hg.mozilla.org/try/rev/d4a8dbb3eee0 does exactly that, but I'm worried about the performance costs of the non-inline version, since this is a hot path and the inline can make it almost disappear but the called version can't so much. If m-c is relatively clear somebody is welcome to try it today, or I can do it in the morning tomorrow which is likely to be quieter.
dbaron: I hear Benjamin isn't fully around today. Can you port this over to 1.9.1 before code freeze tonight?
Comment on attachment 411440 [details] [diff] [review] Final patch as checked in We should get this onto 1.9.2 and 1.9.2... and also reland the other patches on 1.9.2.
Landed on 1.9.2 along with the relanding of bug 521750: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/224d7a8a1c4b
Comment on attachment 411440 [details] [diff] [review] Final patch as checked in Approved for 18.104.22.168. a=ss