Last Comment Bug 521750 - Put a runtime NS_IsMainThread check in nsCycleCollector::Suspect2 and Forget2
: Put a runtime NS_IsMainThread check in nsCycleCollector::Suspect2 and Forget2
Status: RESOLVED FIXED
[first fix landed in 1.9.2beta2 and 1...
:
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: x86 Windows XP
: P1 normal (vote)
: mozilla1.9.3a1
Assigned To: David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
:
Mentors:
: 500879 (view as bug list)
Depends on: 526586 528687 529216 543241 586784
Blocks: C191ConfSync 500879 504392 513334 519878 525814 549743 567949
  Show dependency treegraph
 
Reported: 2009-10-11 23:18 PDT by David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
Modified: 2010-08-16 22:34 PDT (History)
21 users (show)
mbeltzner: blocking1.9.2+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
.2-fixed
.5+
.9-fixed


Attachments
Much faster NS_IsMainThread check, rev. 1 (11.11 KB, patch)
2009-10-12 09:45 PDT, Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg]
no flags Details | Diff | Splinter Review
Much faster NS_IsMainThread check, rev. 2 (11.08 KB, patch)
2009-10-12 12:20 PDT, Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg]
no flags Details | Diff | Splinter Review
patch for cycle collector part only (7.93 KB, patch)
2009-10-14 16:51 PDT, David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
peterv: review+
Details | Diff | Splinter Review
the faster-NS_IsMainThread() half (6.42 KB, patch)
2009-10-15 13:54 PDT, David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
dbaron: review+
Details | Diff | Splinter Review
the fast-NS_IsMainThread part, rev. 3 (6.17 KB, patch)
2009-10-27 09:09 PDT, Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg]
dbaron: review+
Details | Diff | Splinter Review
fast NS_IsMainThread for 1.9.1 branch (6.16 KB, patch)
2009-10-29 16:53 PDT, David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
samuel.sidler+old: approval1.9.1.5+
dveditz: approval1.9.1.6+
Details | Diff | Splinter Review
cycle collector patch for 1.9.1 branch (7.57 KB, patch)
2009-10-29 16:54 PDT, David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
samuel.sidler+old: approval1.9.1.5+
dveditz: approval1.9.1.6+
Details | Diff | Splinter Review
use AC_TRY_LINK when checking for __thread (1.30 KB, patch)
2009-10-29 17:01 PDT, David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch)
benjamin: review+
samuel.sidler+old: approval1.9.1.5+
samuel.sidler+old: approval1.9.1.6+
Details | Diff | Splinter Review
configure.in: add some compilation options when detecting __thread support (1.25 KB, patch)
2009-10-31 19:19 PDT, Bartłomiej Brzozowiec (BartZilla)
no flags Details | Diff | Splinter Review
add some actually used compilation options when detecting __thread support (1.25 KB, patch)
2009-11-11 11:52 PST, Bartłomiej Brzozowiec (BartZilla)
benjamin: review+
mbeltzner: approval1.9.2.2+
mbeltzner: approval1.9.1.9+
Details | Diff | Splinter Review

Description David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-11 23:18:33 PDT
Right now nsCycleCollector::Suspect2 has an assertion that NS_IsMainThread() returns true.  If we changed this to a runtime test for all builds (which would return null if it's false), I think this would alleviate a substantial number of the crashes in Firefox 3.5.* related to violations of our threadsafety rules in extensions.

Benjamin, you said in http://groups.google.com/group/mozilla.dev.platform/browse_thread/thread/c61c764813ba11ab/5472241d613b1684 that you had ways to make NS_IsMainThread() faster; we are likely to need this in order to do this without hurting performance, although it might be worth checking what the current performance hit is.
Comment 1 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 06:01:43 PDT
Yeah, at least on platforms which support compiler-native thread-locals (which is all of our tier-1 platforms) we can just declare one of those and use inline C++ to test it, which is really really cheap and involves no function-call overhead at all. I'll try to put a patch together for that.
Comment 2 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 09:33:40 PDT
Are the compiler-native thread locals actually faster, or just less typing?
Comment 3 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 09:39:21 PDT
They are faster. I believe they are a lot faster, but we'll obviously need to measure.
Comment 4 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 09:45:14 PDT
Created attachment 405870 [details] [diff] [review]
Much faster NS_IsMainThread check, rev. 1

Compiles locally on Linux: running it through tryserver now.
Comment 5 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 10:59:52 PDT
We don't want to fault (which disables collection and produces leaks that are probably just about as bad as crashing), we just want to return null (and potentially miss an object in the purple buffer).  The only case where we might want to fault is the case where nsCycleCollectingAutoRefCnt::decr calls Forget2.
Comment 6 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 11:14:02 PDT
We certainly want to put up an error console message. I see how skipping Suspect/Suspect2 isn't a big deal. But it seems to me that skipping Forget or Forget2 could end up causing CC to reference dead objects and should cause a fault. Especially if there were UI hooked up to the 'cycle-collector-fault' notification which could prompt the user to restart/uninstall extensions/whatever.
Comment 7 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 11:35:19 PDT
But the only dangerous case for Forget is the rarer call from decr rather than the more-common call from incr (especially given what these extensions tend to be doing, which is calling methods on existing objects).

But we can solve that by making the case in decr in which Forget fails null out e->mObject, and making sure that the cycle-collector handles that case (if it doesn't already).  (That won't catch the case of fixing this for external glue users being refcounted, but I don't think we've hit that problem.)
Comment 8 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 11:36:59 PDT
And, I agree we should put up at least a console message (if not more UI) when the cycle collector faults (maybe at the point where we skip the collection?).  But that should probably be a different bug.
Comment 9 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 11:47:33 PDT
Filed bug 521853 on Firefox UI for CC faults.
Comment 10 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 11:50:54 PDT
Note that I'd be ok with a fault in Forget, but I'd really prefer not to fault in Suspect, Forget2, or Suspect2; just return instead (and not in a macro).
Comment 11 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 12:20:29 PDT
Created attachment 405897 [details] [diff] [review]
Much faster NS_IsMainThread check, rev. 2

I think this is what you meant. I've verified that the code in nsCycleCollector.cpp can deal with a null mObject.
Comment 12 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 13:32:27 PDT
Mostly, except for a few things:

 * I don't think we should notify "cycle-collector-fault" if we didn't actually fault
 * Based on that, I think CC_MAINTHREAD_CHECK becomes much simpler (one line in each half of the ifdef)
 * Given *that*, CC_MAINTHREAD_FAULT shouldn't exist, and the Fault() call can be explicit in Forget (test CC_MAINTHREAD_CHECK, call Fault)


 * Forget2 shouldn't set e->mObject to null; that would leave a confused state in the forget-called-from-incr case (the more common one) and we'll start crashing the next time we reference-count the object after the next cycle collect (since it won't have gotten an unmarkPurple call); instead it should be decr that sets it to null when Forget2 fails, but incr's handling of Forget2 failure is already correct (and changing Forget2 would break that)


I didn't look closely at the TLS stuff yet.
Comment 13 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-12 13:36:13 PDT
CC_MAINTHREAD_CHECK should definitely at least issue some error-console notification, which is a side-effect of the current runnable.
Comment 14 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-12 13:56:04 PDT
(In reply to comment #13)
> CC_MAINTHREAD_CHECK should definitely at least issue some error-console
> notification, which is a side-effect of the current runnable.

I think we'd need to figure out how to do that without significant performance degradation; this could be happening a pretty large number of times.
Comment 15 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-13 11:24:11 PDT
Do you want me to update the patch per comment 12, or are you planning to?
Comment 16 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-13 11:45:53 PDT
Could you do it? I'm not sure exactly what you want in Forget2.
Comment 17 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-13 13:59:50 PDT
OK, I've written the changes I was thinking of, and now I'm testing them.  Current state is http://hg.mozilla.org/users/dbaron_mozilla.com/patches-clean/file/c3dfaf8dbbaa/cc-main-thread-check
Comment 18 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-13 15:07:59 PDT
When I tried compiling this in a 1.9.1 branch debug build, I got:

c:\builds\1.9.1\obj\firefox-debug-vc9\dist\include\xpcom\nsThreadUtils.h(104) :
error C2492: 'gTLSIsMainThread' : 'thread' data may not have dll interface
make[5]: *** [nsConsoleService.obj] Error 2
Comment 19 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-14 05:12:04 PDT
Yeah, apparently you can't have threadlocal dllexports (makes sense). We could make this enable-libxul-only, I suppose.
Comment 20 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-14 16:51:28 PDT
Created attachment 406349 [details] [diff] [review]
patch for cycle collector part only

I separated out the faster-NS_IsMainThread part from the cycle collection part; here's just the latter, modified to do what I wanted.

I checked that, with the RelevantKnowledge spyware installed, I'm:
 * failing a bunch of the NS_IsMainThread checks
 * going through multiple cycle collections after that without crashing
 * at the end of nsPurpleBuffer::SelectPointers, mCount is 0 each time.
Comment 21 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-15 00:14:27 PDT
I pushed a baseline (06656ee2b5dc), the baseline plus attachment 406349 [details] [diff] [review], and the baseline plus attachment 406349 [details] [diff] [review] and the NS_IsMainThread performance improvement patch at http://hg.mozilla.org/users/dbaron_mozilla.com/patches-clean/file/fbf27e7fed33/faster-is-main-thread to try.  I got the following talos results:

both patches  e8b16838a0e1
just cc patch: 9b3f756f3927
baseline: 06656ee2b5dc

Linux try talos:
  twinopen:
    both: 112.74, 112.47
    just cc: 114.32, 113.81
    baseline: 111.53
  tdhtml:
    both: 736.65, 735.35
    just cc: 738.38, 737.94
    baseline: 735.24

MacOSX Darwin 9.0.0 try talos:
  twinopen:
    both: 320.11, 320.42
    just cc: 323.11, 323.89
    baseline: 320.42, 318.84
  tdhtml:
    both: 784.68, 789.79
    just cc: 793.15, 788.09
    baseline: 791.03, 788.53

WINNT 5.1 try talos appears to be some number of days behind

WINNT 6.0 try talos:
  twinopen:
    both: 122.32
    just cc: 126.37, 125.84
    baseline: 126.74, 126.11
  tdhtml:
    both: 715.12
    just cc: 714.18, 715.85
    baseline: 714.44, 714.24
Comment 22 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-15 13:54:13 PDT
Created attachment 406528 [details] [diff] [review]
the faster-NS_IsMainThread() half
Comment 23 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-15 14:00:24 PDT
Comment on attachment 406528 [details] [diff] [review]
the faster-NS_IsMainThread() half

r=dbaron on this half

nsThreadUtils gets compiled twice, right?  Once with MOZILLA_INTERNAL_API defined and once without (for the external glue library)?
Comment 24 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-15 14:01:00 PDT
Comment on attachment 406528 [details] [diff] [review]
the faster-NS_IsMainThread() half

Oh... except you need to do something about the bit that this doesn't compile in libxul builds.  I'm not sure what to do there.
Comment 25 Damon Sicore (:damons) 2009-10-26 16:39:15 PDT
Progress here?  This affects top crash killing efforts.  It's a big deal.
Comment 26 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-26 16:59:19 PDT
I'm going to put up a new patch for the TLS stuff tomorrow, hopefully. dbaron is doing the cycle-collector part.
Comment 27 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2009-10-27 09:09:07 PDT
Created attachment 408607 [details] [diff] [review]
the fast-NS_IsMainThread part, rev. 3

This fixes the --disable-libxul parts by hiding the TLS inside xpcom_core.
Comment 28 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-27 14:31:12 PDT
I'll re-merge the two and run the pair through try server.
Comment 29 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-27 14:34:46 PDT
And I think I'm going to drop the |CheckMainThreadIfFast| and just make it use NS_IsMainThread directly.
Comment 30 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-27 14:35:34 PDT
(In reply to comment #29)
> And I think I'm going to drop the |CheckMainThreadIfFast| and just make it use
> NS_IsMainThread directly.

Actually, never mind.
Comment 31 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-28 10:31:47 PDT
http://hg.mozilla.org/mozilla-central/rev/24ce78dd533f
http://hg.mozilla.org/mozilla-central/rev/ea6f9b5338b3
Comment 32 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-29 13:47:47 PDT
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/62cbe17eeeb3
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/f9c274218b33
Comment 33 Bartłomiej Brzozowiec (BartZilla) 2009-10-29 14:49:05 PDT
(In reply to comment #32)
> http://hg.mozilla.org/releases/mozilla-1.9.2/rev/62cbe17eeeb3
> http://hg.mozilla.org/releases/mozilla-1.9.2/rev/f9c274218b33

I've got following error during linking libxpcom_core.so:
nsThreadUtils.o(.text+0x21): In function `NS_IsMainThread_P()':
/home/users/firefox/mozhg/src/objdir-ff-debug/xpcom/build/nsThreadUtils.cpp:139: undefined reference to `___tls_get_addr'
../threads/libxpcomthreads_s.a(nsThreadManager.o)(.text+0xb15): In function `nsThreadManager::Init()':
/home/users/firefox/mozhg/src/xpcom/threads/nsThreadManager.cpp:110: undefined reference to `___tls_get_addr'
collect2: ld returned 1 exit status

I guess changes from this bug are the culprit...
Comment 34 Bartłomiej Brzozowiec (BartZilla) 2009-10-29 14:55:05 PDT
(In reply to comment #33)
> (In reply to comment #32)
> > http://hg.mozilla.org/releases/mozilla-1.9.2/rev/62cbe17eeeb3
> > http://hg.mozilla.org/releases/mozilla-1.9.2/rev/f9c274218b33

Ooops, I cited links to changes in "mozilla-1.9.2", but I got this error during compiling trunk (I haven't tried compiling 1.9.2 branch).
Comment 35 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-29 15:44:37 PDT
What compiler?
Comment 36 Bartłomiej Brzozowiec (BartZilla) 2009-10-29 16:15:53 PDT
GCC 4.3.3, locally installed (my system compiler is too old... OTOH I use system-wide ld, which comes from binutils 2.15.94.0.2.2...)
Comment 37 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-29 16:53:26 PDT
Created attachment 409227 [details] [diff] [review]
fast NS_IsMainThread for 1.9.1 branch
Comment 38 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-29 16:54:48 PDT
Created attachment 409228 [details] [diff] [review]
cycle collector patch for 1.9.1 branch

Pretty trivial merging; for the previous patch, just some context differences in comments; for this one, one (debugging-only) function it was patching didn't exist back on 1.9.1.
Comment 39 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-29 17:01:06 PDT
Created attachment 409229 [details] [diff] [review]
use AC_TRY_LINK when checking for __thread

This should fix BartZ's error; this makes our autoconf test check that everything links.  Presumably __thread support requires some pieces from the compiler and some from something in binutils.
Comment 40 Bartłomiej Brzozowiec (BartZilla) 2009-10-29 17:56:56 PDT
(In reply to comment #39)
> Created an attachment (id=409229) [details]
> use AC_TRY_LINK when checking for __thread
> 
> This should fix BartZ's error; this makes our autoconf test check that
> everything links.  Presumably __thread support requires some pieces from the
> compiler and some from something in binutils.

Even with this patch applied it doesn't work :-(. During configuration there is still
..
checking for __thread keyword for TLS variables... yes
..

and compilation ends with the same error as in comment 33.
Thanks for trying, I guess I have to compile newer binutils...
Comment 41 Bartłomiej Brzozowiec (BartZilla) 2009-10-29 19:12:12 PDT
(In reply to comment #40)
> and compilation ends with the same error as in comment 33.
> Thanks for trying, I guess I have to compile newer binutils...

I compiled and locally installed binutils 2.20 and the error still persists (I checked with strace whether gcc really uses newer ld - it does (I put in my PATH my local bin dirs before system ones)).

Any ideas?
Comment 42 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-30 11:49:21 PDT
(In reply to comment #40)
> (In reply to comment #39)
> > Created an attachment (id=409229) [details] [details]
> > use AC_TRY_LINK when checking for __thread
> > 
> > This should fix BartZ's error; this makes our autoconf test check that
> > everything links.  Presumably __thread support requires some pieces from the
> > compiler and some from something in binutils.
> 
> Even with this patch applied it doesn't work :-(. During configuration there is
> still

Did you build with client.mk (which will regenerate configure from configure.in), or did you just rerun configure?
Comment 43 Daniel Veditz [:dveditz] 2009-10-30 12:20:44 PDT
Comment on attachment 409228 [details] [diff] [review]
cycle collector patch for 1.9.1 branch

Approved for 1.9.1.5, a=dveditz for release-drivers
Comment 44 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-30 14:09:39 PDT
Landed AC_TRY_LINK patch:
http://hg.mozilla.org/mozilla-central/rev/1c84969780d4
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/ae07ff148290
Comment 45 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-30 15:08:39 PDT
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/ce53b1c742c3 (with AC_TRY_LINK rolled in)
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/0ecb15e3abbe
Comment 46 Bartłomiej Brzozowiec (BartZilla) 2009-10-31 14:50:26 PDT
(In reply to comment #42)
> Did you build with client.mk (which will regenerate configure from
> configure.in), or did you just rerun configure?

I always start compilation with
make -f client.mk build
I never run configure script directly.
I even removed the whole objdir - the error still persists...
Comment 47 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-31 14:53:29 PDT
Could you experiment to figure out under what conditions your compiler can compile and link a program with __thread and under what conditions it can't?  Is it a C vs. C++ difference?  Something related to one of the command line options we pass?  Something else?
Comment 48 Bartłomiej Brzozowiec (BartZilla) 2009-10-31 17:09:25 PDT
(In reply to comment #47)
> Could you experiment to figure out under what conditions your compiler can
> compile and link a program with __thread and under what conditions it can't? 
> Is it a C vs. C++ difference?  Something related to one of the command line
> options we pass?  Something else?

I haven't tested all possible combinations of options given during compilation, but I was playing with some of them for a while and...:

$ cat test.cpp 
__thread bool tlsIsMainThread = false;
int main() {
        return tlsIsMainThread;
}
$ g++-4.3.3 -o test test.cpp -shared -fPIC -Wl,-z,defs
/home/users/firefox/tmp/cc8h2amk.o: In function `main':
test.cpp:(.text+0x22): undefined reference to `___tls_get_addr'
collect2: ld returned 1 exit status

Removing one of the options (either "-shared" or "-fPIC" or "-Wl,-z,defs") results in not reporting the error during compilation anymore.

The same story with C version of the program:

$ cat test.c 
__thread int tlsIsMainThread = 0;
int main( void ) {
        return tlsIsMainThread;
}
$ gcc-4.3.3 -o test test.c -shared -fPIC -Wl,-z,defs
/home/users/firefox/tmp/ccQIxv7p.o: In function `main':
test.c:(.text+0x22): undefined reference to `___tls_get_addr'
collect2: ld returned 1 exit status

Again, removing one of the options (either "-shared" or "-fPIC" or "-Wl,-z,defs") results in not reporting the error during compilation anymore.

I read that support for "__thread" requires support from the following elements: (1) compiler; (2) linker; (3) system libraries. I locally installed (1) and (2), but glibc on this system is a little bit rusty (2.3.6). I guess this is the root cause of my problem, but upgrading glibc could be a little bit problematic and I'd prefer not to do this at this time...
Comment 49 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-10-31 17:23:01 PDT
(In reply to comment #48)
> I read that support for "__thread" requires support from the following
> elements: (1) compiler; (2) linker; (3) system libraries. I locally installed
> (1) and (2), but glibc on this system is a little bit rusty (2.3.6). I guess
> this is the root cause of my problem, but upgrading glibc could be a little bit
> problematic and I'd prefer not to do this at this time...

You can get a working build by locally patching the autoconf test to fail (e.g., by introducing a typo in it).
Comment 50 Bartłomiej Brzozowiec (BartZilla) 2009-10-31 19:19:28 PDT
Created attachment 409560 [details] [diff] [review]
configure.in: add some compilation options when detecting __thread support

(In reply to comment #49)
> You can get a working build by locally patching the autoconf test to fail
> (e.g., by introducing a typo in it).

Sure, but why not to try compiling with the relevant options and check the result? Seems that options that resulted in the error are in vars DSO_PIC_CFLAGS and DSO_LDOPTS. This patch add these options to LDFLAGS right before compiling test code.
Comment 51 Bartłomiej Brzozowiec (BartZilla) 2009-10-31 19:24:07 PDT
(In reply to comment #50)
> Created an attachment (id=409560) [details]
> configure.in: add some compilation options when detecting __thread support
 
Actually, I guess that checks for GNU_CC are kinda silly since __thread is GCC-specific, right?
Comment 52 Samuel Sidler (old account; do not CC) 2009-11-02 08:18:18 PST
All three of these patches need to land on GECKO1914_20091006_RELBRANCH for 1.9.1.5.
Comment 53 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-02 09:12:01 PST
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/4dba03f79519 (patches 1 & 3)
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/40f1c74ccf91
Comment 54 Al Billings [:abillings] 2009-11-02 10:11:51 PST
Nothing for QA to do here to verify this for 1.9.1.5.
Comment 55 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-02 11:19:05 PST
(In reply to comment #54)
> Nothing for QA to do here to verify this for 1.9.1.5.

That's not entirely true; one thing that is probably worth doing is installing RelevantKnowledge spyware (see bug 521748) and checking that it at least hasn't caused any new crashes.
Comment 56 Al Billings [:abillings] 2009-11-02 11:56:10 PST
How often does it crash with the spyware? That isn't clear in the other bug (nor are there any steps for triggering the crashes with the spyware).

QA can install the spyware but, for example, a lack of a crash won't prove much other than it didn't crash that time.
Comment 57 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-02 20:05:29 PST
Lack of a crash will at least demonstrate that it didn't make things worse.
Comment 58 Al Billings [:abillings] 2009-11-02 21:41:15 PST
Well, I installed the software mentioned in bug 521748 but I'm not sure how a crash is supposed to be triggered. Nothing obvious happened but I wouldn't expect that.
Comment 59 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-08 08:39:36 PST
Based on looking at the 3.5.5 crash data, it looks like this fixed the crashes at:
  nsCycleCollectingAutoRefCnt::decr(nsISupports*)
  nsEventListenerManager::Release()
  nsGlobalChromeWindow::Release()
  nsPresContext::Release()
and various other Release methods, but that it did NOT fix the crashes at:
  nsGlobalWindow::cycleCollection::UnmarkPurple(nsISupports*)
  nsCycleCollectingAutoRefCnt::unmarkPurple()
which I was also expecting it would fix.  I need to look into why that's the case.
Comment 60 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-08 09:48:28 PST
I filed bug 527339 on figuring out the nsGlobalWindow::cycleCollection::UnmarkPurple(nsISupports*) crashes so that I don't make this bug too complicated.
Comment 61 Stuart Parmenter 2009-11-09 15:58:02 PST
I've backed out the revs in comment 44 and 32 on 1.9.2 due to 526586 blocking the Fennec beta 5 release.

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/bc5789c6e2c3
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0e7b0f021674
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/e3bb4f1d970a
Comment 62 Mike Beltzner [:beltzner, not reading bugmail] 2009-11-10 08:02:46 PST
Now that beta 5 has been built, can we get this re-landed on mozilla-1.9.2 and a follow up filed to have the problems that mobile was having?
Comment 63 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-10 13:25:16 PST
The problem mobile was having was bug 526586.

I relanded this on 1.9.2:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/6c82d7aee415
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/c2812111ecd1
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/b0312c665250

and landed bsmedberg's fix for bug 526586 there:
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/224d7a8a1c4b
Comment 64 Bartłomiej Brzozowiec (BartZilla) 2009-11-11 11:52:11 PST
Created attachment 411751 [details] [diff] [review]
add some actually used compilation options when detecting __thread support

(In reply to comment #51)
> (In reply to comment #50)
> > Created an attachment (id=409560) [details] [details]
> > configure.in: add some compilation options when detecting __thread support
> 
> Actually, I guess that checks for GNU_CC are kinda silly since __thread is
> GCC-specific, right?

OK, let's try to review and checkin this (or should I fill separate bug?).
Attached patch adds (via LDFLAGS) some actually used compilation options when compiling real code, so __thread support detection is more robust: without this patch compilation on my system fails (see comment #33), because test code is compiled without -shared -fPIC -Wl,-z,defs (and succeeds), but these options (among others) are used when compiling real code (and with these options linking code with __thread fails on my system, see comment #48).
Comment 65 Reed Loden [:reed] (use needinfo?) 2009-11-16 08:49:19 PST
Comment on attachment 411751 [details] [diff] [review]
add some actually used compilation options when detecting __thread support

http://hg.mozilla.org/mozilla-central/rev/d6c911c1f46e
Comment 66 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2009-11-18 12:59:34 PST
*** Bug 500879 has been marked as a duplicate of this bug. ***
Comment 67 Jesse Ruderman 2010-03-02 16:16:20 PST
Bug 549743 changed this to an abort :)
Comment 68 Serge Gautherie (:sgautherie) 2010-03-08 09:55:01 PST
Comment on attachment 411751 [details] [diff] [review]
add some actually used compilation options when detecting __thread support


Any reason why this should not go on branches where previous patches went?
Comment 69 Mike Beltzner [:beltzner, not reading bugmail] 2010-03-08 10:28:42 PST
Should we take this despite comment 67 which says that the structure changed? If so, we'll need a roll up patch for each branch.
Comment 70 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2010-03-08 10:33:37 PST
Comment 67 doesn't have anything to do with whether attachment 411751 [details] [diff] [review] should go on branches; it's describing something that we did (and should do) on trunk only.
Comment 71 Mike Beltzner [:beltzner, not reading bugmail] 2010-03-10 12:52:08 PST
Comment on attachment 411751 [details] [diff] [review]
add some actually used compilation options when detecting __thread support

a=beltzner, and I'll find someone to land these.
Comment 72 David Baron :dbaron: ⌚️UTC+2 (review requests must explain patch) 2010-03-10 13:43:31 PST
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/54fc0a861260
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/0d8bde830a1a

Not toggling any flags, since bug is already status1.9.2:beta2-fixed, as it should be.  (The follow-up patch probably should have had its own bug.)
Comment 73 Mike Beltzner [:beltzner, not reading bugmail] 2010-03-10 13:58:49 PST
It should have, but since the most recent fix went in, I've moved the flags over to that release. I think that sucks less.

Still: serge, next time, please file follow up bugs.

Note You need to log in before you can comment on or make changes to this bug.