Crash in [@ nsThreadManager::Init()] when accessing TLS variable

RESOLVED FIXED

Status

()

Core
XPCOM
--
critical
RESOLVED FIXED
8 years ago
6 years ago

People

(Reporter: mfinkle, Assigned: Benjamin Smedberg)

Tracking

({crash, regression, topcrash})

unspecified
x86
Windows XP
crash, regression, topcrash
Points:
---
Bug Flags:
blocking1.9.2 +

Firefox Tracking Flags

(status1.9.2 beta3-fixed, blocking1.9.1 .6+, status1.9.1 .6-fixed, fennec1.0+)

Details

(crash signature)

Attachments

(2 attachments, 4 obsolete attachments)

Bug 521750 added a TLS global variable to nsThreadManager. On Windows XP (debug and release) builds of Fennec + XULRunner, we are crashing when the TLS variable is accessed in nsThreadManager::Init

Call stack:

>	xul.dll!nsThreadManager::Init()  Line 110 + 0xc bytes	C++
 	xul.dll!NS_InitXPCOM3_P(nsIServiceManager * * result=0x0012ec5c, nsIFile * binDirectory=0x009abf00, nsIDirectoryServiceProvider * appFileLocationProvider=0x0012ed74, const nsStaticModuleInfo * staticComponents=0x023266f8, unsigned int componentCount=50)  Line 477 + 0xc bytes	C++
 	xul.dll!ScopedXPCOMStartup::Initialize()  Line 1073 + 0x29 bytes	C++
 	xul.dll!XRE_main(int argc=1, char * * argv=0x00908b10, const nsXREAppData * aAppData=0x009abe08)  Line 3278 + 0xb bytes	C++
 	fennec.exe!NS_internal_main(int argc=1, char * * argv=0x00908b10)  Line 502 + 0x1a bytes	C++
 	fennec.exe!wmain(int argc=1, wchar_t * * argv=0x00904f60)  Line 110 + 0xd bytes	C++
 	fennec.exe!__tmainCRTStartup()  Line 327 + 0x19 bytes	C
 	fennec.exe!wmainCRTStartup()  Line 196	C


Disassembly:

#ifdef NS_TLS
  gTLSIsMainThread = true;
01D8FEA8  mov         edx,dword ptr [__tls_index (2B2E75Ch)] 
01D8FEAE  mov         eax,dword ptr fs:[0000002Ch] 
01D8FEB4  mov         ecx,dword ptr [eax+edx*4] <=== CRASH
01D8FEB7  mov         byte ptr [ecx+101h],1 
#endif

eax and edx are both 0

I checked with a release bug from here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mobile-trunk/fennec-1.0b5.en-US.win32.zip

The debug build was from my own m-c tree.
(Assignee)

Comment 1

8 years ago
__declspec(thread) doesn't work in a DLL loaded dynamically with
LoadLibrary (such as, for example, a COM DLL). The only way around is to
drop __declspec(thread) and use thread-local strorage APIs (TlsAlloc et
al) directly.

This is a pretty unfortunate regression!
We need to fix this for 1.9.1.6, but won't block 1.9.1.5 on it since it only affects XULRunner/embedded.

This probably needs to block 1.9.2 as well.
blocking1.9.1: --- → .6+
status1.9.1: --- → wanted
Flags: blocking1.9.2?
Releng is going to push XR 1.9.1.4 out as official. We were going to skip 1.9.1.4 and got straight to 1.9.1.5, but we can change that plan.
Keywords: regression
dbaron: Can you own this? This blocks 1.9.1.6 which freezes on November 10 at 11:59pm.
Whiteboard: [needs owner]
I know almost nothing about this stuff; I think Benjamin would be a better owner.
Assigning this to Benjamin for now then.
Assignee: nobody → benjamin
Whiteboard: [needs owner]

Comment 7

8 years ago
Created attachment 410463 [details] [diff] [review]
untested

so, this is the lazy patch. i think dbaron should be able to review it, since it's written using gecko (nspr) style code.

one more advanced approach would involve distinct functions and a function pointer to deal with the differences between w5 and w6.
(Assignee)

Comment 8

8 years ago
Comment on attachment 410463 [details] [diff] [review]
untested

The whole point of the fast check was to avoid the extra function all overhead of the NSPR functions. I don't know yet whether direct calls to TlsAlloc/TlsGetValue will have smaller overhead.
Attachment #410463 - Flags: review-

Comment 9

8 years ago
Created attachment 410489 [details] [diff] [review]
untested using winapi
Attachment #410463 - Attachment is obsolete: true
Attachment #410489 - Flags: review?(benjamin)
(Assignee)

Comment 10

8 years ago
Created attachment 411230 [details] [diff] [review]
TlsAlloc, using a static initializer, rev. 1

This compiles... going to run through tryserver momentarily.
Attachment #410489 - Attachment is obsolete: true
Attachment #411230 - Flags: review?
Attachment #410489 - Flags: review?(benjamin)
(Assignee)

Updated

8 years ago
Attachment #411230 - Flags: review? → review?(dbaron)

Comment 11

8 years ago
this is holding the beta 5 release due to desktop win32 build being dead.
tracking-fennec: --- → 1.0+
Flags: blocking1.9.2? → blocking1.9.2+
(Assignee)

Comment 12

8 years ago
Created attachment 411269 [details] [diff] [review]
TlsAlloc, using a static initializer, rev. 1.1

Again, with #undef ERROR for treewide sanity.
Attachment #411230 - Attachment is obsolete: true
Attachment #411269 - Flags: review?(dbaron)
Attachment #411230 - Flags: review?(dbaron)

Comment 13

8 years ago
Created attachment 411299 [details] [diff] [review]
backout the bad code
Attachment #411269 - Attachment is obsolete: true
Attachment #411269 - Flags: review?(dbaron)

Comment 14

8 years ago
i've backed the offending code out of 1.9.2

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/bc5789c6e2c3
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0e7b0f021674
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/e3bb4f1d970a
I think stuart's objection to bsmedberg's latest patch was:

[2009-11-09 15:38:51] <stuart> yes, his patch doesn't work
[2009-11-09 15:39:11] <stuart> (you can't include windows.h in a core headerfile and hope for a real build)

Maybe, to fix that, we should just not bother with the inline on Windows?
(Assignee)

Comment 16

8 years ago
My latest version, http://hg.mozilla.org/try/rev/d4a8dbb3eee0 does exactly that, but I'm worried about the performance costs of the non-inline version, since this is a hot path and the inline can make it almost disappear but the called version can't so much. If m-c is relatively clear somebody is welcome to try it today, or I can do it in the morning tomorrow which is likely to be quieter.
(Assignee)

Comment 17

8 years ago
http://hg.mozilla.org/mozilla-central/rev/f1975d08b880
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Assignee)

Comment 18

8 years ago
Created attachment 411440 [details] [diff] [review]
Final patch as checked in
dbaron: I hear Benjamin isn't fully around today. Can you port this over to 1.9.1 before code freeze tonight?
(Assignee)

Updated

8 years ago
Duplicate of this bug: 527758
Comment on attachment 411440 [details] [diff] [review]
Final patch as checked in

We should get this onto 1.9.2 and 1.9.2... and also reland the other patches on 1.9.2.
Attachment #411440 - Flags: review+
Landed on 1.9.2 along with the relanding of bug 521750:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/224d7a8a1c4b
Attachment #411440 - Flags: approval1.9.1.6?
Comment on attachment 411440 [details] [diff] [review]
Final patch as checked in

Approved for 1.9.1.6. a=ss
Attachment #411440 - Flags: approval1.9.1.6? → approval1.9.1.6+
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/a42d516d6d72
status1.9.1: wanted → .6-fixed
status1.9.2: --- → final-fixed
Summary: Crash in nsThreadManager::Init when accessing TLS variable → Crash in [@ nsThreadManager::Init()] when accessing TLS variable
Severity: normal → critical
Keywords: crash, topcrash

Updated

8 years ago
Duplicate of this bug: 529216
Crash Signature: [@ nsThreadManager::Init()]
You need to log in before you can comment on or make changes to this bug.