Closed
Bug 376874
Opened 18 years ago
Closed 18 years ago
Crash running trace-malloc on multi-core machine (put suppress_tracing in thread-local storage)
Categories
(Core :: General, defect)
Core
General
Tracking
()
RESOLVED
FIXED
mozilla1.9alpha8
People
(Reporter: bzbarsky, Assigned: dbaron)
References
Details
Attachments
(2 files, 2 obsolete files)
|
29.45 KB,
patch
|
brendan
:
review+
brendan
:
review+
|
Details | Diff | Splinter Review |
|
20.34 KB,
patch
|
brendan
:
review+
brendan
:
review+
|
Details | Diff | Splinter Review |
BUILD: MOZ_CO_DATE="Sun Apr 8 16:56:35 CDT 2007" Firefox build
STEPS TO REPRODUCE: firefox --trace-malloc=/home/bzbarsky/log.txt
EXPECTED RESULTS: No crash
ACTUAL RESULTS:
Program received signal SIGSEGV, Segmentation fault.
0xb7bc2344 in realloc (ptr=0x9425b68, size=123)
at ../../../../mozilla/tools/trace-malloc/lib/nsTraceMalloc.c:1640
1640 log_event8(logfp, TM_EVENT_REALLOC,
Current language: auto; currently c
(gdb) p oldsite
$3 = (callsite *) 0xffffffa3
(gdb) p oldsite->serial
Cannot access memory at address 0xffffffa7
(gdb) p oldsize
$7 = 3078786724
(gdb) p alloc->trackfp
$11 = (FILE *) 0xe7fde800
(gdb) p *alloc->trackfp
So something is broken with those variables there...
| Assignee | ||
Comment 1•18 years ago
|
||
This is actually Linux, right?
And I'm guessing you're on a multi-CPU machine.
This is a threadsafety bug in realloc -- it assumes another thread doesn't tweak suppress_tracing across the exit/enter of the monitor. I suppose this also means we should never exit the monitor while suppress_tracing is nonzero, which is a bit of a problem. I suppose we actually do need to convert it to thread-local storage; I wonder how expensive that is.
| Reporter | ||
Comment 2•18 years ago
|
||
Er.. yes. Linux. I forgot I wasn't sitting at my desktop when filing. ;)
And yes, I'm on a dual-cpu machine with hyperthreading (so 4 cpus as far as the OS is concerned).
OS: Mac OS X → Linux
| Assignee | ||
Comment 3•18 years ago
|
||
This uses thread-local storage. Seems to work on both Linux and Windows. I gave up on using NSPR's thread-local storage for the reasons described in the comment in the patch (after spending quite a few hours debugging those problems) and just used the Windows and pthreads APIs directly, which took me less than 30 minutes.
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
| Assignee | ||
Comment 4•18 years ago
|
||
This patch reduces the scope of the locking (although there's more to do) and replaces tmmon (a PRMonitor) with tmlock (a PRLock).
I was considering macro-izing the suppress-and-lock pairs, but I just wrote them all out. Maybe worth revisiting, although right now I have a bunch of other patches merged to being on top of this one and that change might be rather painful until the whole mess lands.
| Assignee | ||
Comment 5•18 years ago
|
||
Oops, had the wrong type for tpIndex (although it worked anyway).
Attachment #275616 -
Attachment is obsolete: true
| Assignee | ||
Updated•18 years ago
|
Attachment #275617 -
Flags: review?(brendan)
| Assignee | ||
Comment 6•18 years ago
|
||
Relative to the last patch, this:
* removes some additional suppression code in nsWinTraceMalloc.cpp
that I found and that was causing problems on Windows
* fixes the signature of get_tm_thread (since I forgot I was using
C rather than C++) to use (void) rather than ()
Attachment #275635 -
Attachment is obsolete: true
Attachment #275909 -
Flags: review?(brendan)
Comment 7•18 years ago
|
||
Comment on attachment 275909 [details] [diff] [review]
patch, part 1: use thread-local storage
Looks good, only aesthetic nits (my old code, I still care ;-):
* Make macro bodies line up on basic offset indentation stops.
* tpIndex name not in style (tp_index? thread_private_index? wait, shouldn't it be tls_index?).
* Slight preference for tm_thread *t to be the first parameter, not the last, to backtrace.
r=me with nit-picking.
/be
Attachment #275909 -
Flags: review?(brendan) → review+
Updated•18 years ago
|
Attachment #275909 -
Flags: review+
Comment 8•18 years ago
|
||
Comment on attachment 275617 [details] [diff] [review]
patch, part 2: lock less
So long as the FIXMEs all have bugs slated to be fixed, r=me.
/be
Attachment #275617 -
Flags: review?(brendan)
Attachment #275617 -
Flags: review+
| Assignee | ||
Comment 9•18 years ago
|
||
Fixes both checked in to trunk at 2007-08-10 15:19 -0700.
There's no bug on the FIXME (maybe), but the other two are bug 374829 (already fixed) and bug 391749.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
| Assignee | ||
Updated•18 years ago
|
OS: Linux → All
Hardware: PC → All
Summary: Crash running trace-malloc → Crash running trace-malloc on multi-core machine (put suppress_tracing in thread-local storage)
Target Milestone: --- → mozilla1.9 M8
You need to log in
before you can comment on or make changes to this bug.
Description
•