Closed Bug 376874 Opened 18 years ago Closed 18 years ago

Crash running trace-malloc on multi-core machine (put suppress_tracing in thread-local storage)

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9alpha8

People

(Reporter: bzbarsky, Assigned: dbaron)

References

Details

Attachments

(2 files, 2 obsolete files)

BUILD: MOZ_CO_DATE="Sun Apr 8 16:56:35 CDT 2007" Firefox build STEPS TO REPRODUCE: firefox --trace-malloc=/home/bzbarsky/log.txt EXPECTED RESULTS: No crash ACTUAL RESULTS: Program received signal SIGSEGV, Segmentation fault. 0xb7bc2344 in realloc (ptr=0x9425b68, size=123) at ../../../../mozilla/tools/trace-malloc/lib/nsTraceMalloc.c:1640 1640 log_event8(logfp, TM_EVENT_REALLOC, Current language: auto; currently c (gdb) p oldsite $3 = (callsite *) 0xffffffa3 (gdb) p oldsite->serial Cannot access memory at address 0xffffffa7 (gdb) p oldsize $7 = 3078786724 (gdb) p alloc->trackfp $11 = (FILE *) 0xe7fde800 (gdb) p *alloc->trackfp So something is broken with those variables there...
This is actually Linux, right? And I'm guessing you're on a multi-CPU machine. This is a threadsafety bug in realloc -- it assumes another thread doesn't tweak suppress_tracing across the exit/enter of the monitor. I suppose this also means we should never exit the monitor while suppress_tracing is nonzero, which is a bit of a problem. I suppose we actually do need to convert it to thread-local storage; I wonder how expensive that is.
Er.. yes. Linux. I forgot I wasn't sitting at my desktop when filing. ;) And yes, I'm on a dual-cpu machine with hyperthreading (so 4 cpus as far as the OS is concerned).
OS: Mac OS X → Linux
This uses thread-local storage. Seems to work on both Linux and Windows. I gave up on using NSPR's thread-local storage for the reasons described in the comment in the patch (after spending quite a few hours debugging those problems) and just used the Windows and pthreads APIs directly, which took me less than 30 minutes.
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
This patch reduces the scope of the locking (although there's more to do) and replaces tmmon (a PRMonitor) with tmlock (a PRLock). I was considering macro-izing the suppress-and-lock pairs, but I just wrote them all out. Maybe worth revisiting, although right now I have a bunch of other patches merged to being on top of this one and that change might be rather painful until the whole mess lands.
Oops, had the wrong type for tpIndex (although it worked anyway).
Attachment #275616 - Attachment is obsolete: true
Relative to the last patch, this: * removes some additional suppression code in nsWinTraceMalloc.cpp that I found and that was causing problems on Windows * fixes the signature of get_tm_thread (since I forgot I was using C rather than C++) to use (void) rather than ()
Attachment #275635 - Attachment is obsolete: true
Attachment #275909 - Flags: review?(brendan)
Comment on attachment 275909 [details] [diff] [review] patch, part 1: use thread-local storage Looks good, only aesthetic nits (my old code, I still care ;-): * Make macro bodies line up on basic offset indentation stops. * tpIndex name not in style (tp_index? thread_private_index? wait, shouldn't it be tls_index?). * Slight preference for tm_thread *t to be the first parameter, not the last, to backtrace. r=me with nit-picking. /be
Attachment #275909 - Flags: review?(brendan) → review+
Attachment #275909 - Flags: review+
Comment on attachment 275617 [details] [diff] [review] patch, part 2: lock less So long as the FIXMEs all have bugs slated to be fixed, r=me. /be
Attachment #275617 - Flags: review?(brendan)
Attachment #275617 - Flags: review+
Fixes both checked in to trunk at 2007-08-10 15:19 -0700. There's no bug on the FIXME (maybe), but the other two are bug 374829 (already fixed) and bug 391749.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
OS: Linux → All
Hardware: PC → All
Summary: Crash running trace-malloc → Crash running trace-malloc on multi-core machine (put suppress_tracing in thread-local storage)
Target Milestone: --- → mozilla1.9 M8
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: