Closed
Bug 322959
Opened 20 years ago
Closed 13 years ago
Update trace-malloc to new DbgHelp APIs
Categories
(Core :: XPCOM, enhancement)
Tracking
()
RESOLVED
INVALID
People
(Reporter: bc, Assigned: matthew.gertner)
References
Details
Attachments
(5 files, 4 obsolete files)
68.80 KB,
text/plain
|
Details | |
1.27 KB,
text/plain
|
Details | |
6.88 KB,
text/plain
|
Details | |
65.83 KB,
patch
|
Details | Diff | Splinter Review | |
77.11 KB,
patch
|
Details | Diff | Splinter Review |
Since trace-malloc was removed from xpcom/base, nsStackFrameWin.cpp and nsStackFrameWin.h have been updated in xpcom/base to use the new DbgHelp APIs for Windows XP but not in tools/trace-malloc/lib/.
Matthew did some work to make the calls to the DbgHelp APIs thread safe in nsTraceMalloc.c as well.
This bug is about updating trace-malloc's versions of nStackFrame.* to the versions in xpcom/base, incorporating Matthew's thread changes, and laying the ground work for improving nsGetTypeName() to properly return type names in Windows.
Reporter | ||
Comment 1•20 years ago
|
||
Reporter | ||
Comment 2•20 years ago
|
||
These are the changes I made to nsStackFrameWin.h copied from xpcom/base. I am not sure if these changes are needed on the xpcom version.
Reporter | ||
Comment 3•20 years ago
|
||
These are the changes I made to nsStackFrameWin.cpp copied from xpcom/base. I am not sure if these changes are needed on the xpcom version.
Reporter | ||
Comment 4•20 years ago
|
||
This patch applies the changes to bring trace-malloc's versions of nsStackFrameWin.* up to the xpcom/base versions, with the above additional changes, along with Matthew's changes and an initial attempt to get type information from DbgHelp in nsGetTypeName(). nsGetTypeName() still needs work.
Attachment #208116 -
Flags: review?
Reporter | ||
Updated•20 years ago
|
Attachment #208116 -
Flags: review? → review?(brendan)
one significant note, SymInitialize should really only be called once, calling it after it has successfully been called in the process space will fail which causes things to be fairly unhappy.
Assignee | ||
Comment 6•20 years ago
|
||
I guess I should also add support for VC71. As far as I can see, the current implementation is hardcoded to use MSVCRTD.DLL, which is only used by VC6 builds. Any opinions?
Assignee | ||
Comment 7•20 years ago
|
||
Brendan, I need help with this. I am encountering a problem when DLLs are loaded dynamically with trace-malloc turned on (Win32). The problem is related to the fact that the stack walking code must now be performed in a separate thread. When a new library is loaded, the LoadLibrary call is caught and the memory allocation hooks are set. If the module allocates memory in DllMain (e.g. global variables in the module with a constructor that use new, like strings), trace-malloc tries to generate a stacktrace, which requires starting a new thread. For some reason, it is not possible to start the new thread. I guess that synchronization in DllMain can be problematic, so perhaps this is related to the reason.
Assuming I am diagnosing the problem correctly, we shouldn't even try to walk the stack for memory allocated in DllMain. I see three possible solutions:
1) Instead of hooking into LoadLibrary, hook into some other function that is called after DllMain has finished for the new module.
2) Detect that the module is in DllMain and skip the trace-malloc code.
3) Post something to the system message queue so that the trace-malloc is performed asynchronously after LoadLibrary.
I'm not sure whether either 1) or 2) is possible, and I'm not sure whether 3) will solve the problem (can we be sure that posting to the message queue ensures that the notification will be received after DllMain has completed?). I'll spend some more time on this tomorrow, but if you have any ideas about how to address this, I'd be most appreciative.
Assignee | ||
Comment 8•20 years ago
|
||
It seems like the problem is actually with generating the stacktrace when in DllMain. I'm in the processing of figuring out why (and hopefully fixing it).
Assignee | ||
Comment 9•20 years ago
|
||
Okay, I found the problem.
For the record: TraceMalloc hooks the LoadLibraryX functions in each DLL to ensure that the memory allocation hooks are added to any DLLs loaded dynamically during executation. Since LoadLibrary can actually result in the loading of any number of referenced DLLs, it enumerates all modules in memory and hooks them. It does this using EnumerateLoadedModules.
In the stackwalking code, there is a case where SysGetModuleInfo fails to get the filename and line number for a stackframe (not sure why). In this case, the code checks whether the frame's address can be found in the address space of a known module. It does this using (no prices for guessing) EnumerateLoadedModules. Now, if we consult MSDN:
"All DbgHelp functions, such as this one, are single threaded. Therefore, calls from more than one thread to this function will likely result in unexpected behavior or memory corruption. To avoid this, you must synchronize all concurrent calls from more than one thread to this function."
So it looks like a threading problem. The obvious solution is to enumerate the modules into a list when patching, and then do the patching from the list so we're no longer inside EnumerateLoadedModules. A cleaner approach would be to find some other way of catching DLL loading during execution (a system-wide hook?).
Component: General → XPCOM
Flags: review?(brendan)
Product: Firefox → Core
QA Contact: general → xpcom
Version: unspecified → Trunk
Attachment #208116 -
Flags: superreview?(brendan)
Attachment #208116 -
Flags: review?(dougt)
Assignee | ||
Updated•20 years ago
|
Attachment #208116 -
Flags: superreview?(brendan)
Attachment #208116 -
Flags: review?(dougt)
Assignee | ||
Comment 10•19 years ago
|
||
Okay, here's a first crack at creating a generic stackwalking service. I would imagine this being used as a unified interface for all Mozilla stackwalking needs: tracerefcnt, tracemalloc and deadlock detection with NS_TRACE_MALLOC_XXX. I also see this as a component of the type of future logging system I proposed in http://wiki.mozilla.org/Logging_and_Diagnostics.
I've written this for the 1.8.0.x branch, since we need it and we're not porting to trunk for a while. Since I'm setting up an event queue to run the Windows stacktraces on a separate thread, I had to use the old API instead of the snazzy new nsIThreadManager. I don't think it will be a big problem at all to port to the new API. Also, this is Windows-only for now. Once again, I think a Unix port based on the existing nsStackFrameUnix should be pretty straightforward.
At this point, the biggest question for me is: should this be an XPCOM service? The main argument for doing this is that I believe it will be very useful to standardize all stack traces around nsIStackFrame, which implies use of XPCOM anyway with all the associated allocation, refcounting, etc. The counterargument is that this could add a lot of noise to tracemalloc and tracerefcnt, not to mention reentrancy issues. I don't see any issue at all with deadlock detection and logging.
I think that the reentrancy issues can be solved easily by simply blocking reentrancy in the stack walking service and returning a null stackframe. Also, it might make sense to give nsStackWalkService and nsStackFrame custom AddRef() and Release() methods; this should make tracerefcnt a non-issue. So the real issue is tracemalloc. Do we even need this anymore now that tools like Purify are so effective?
I'm going to go ahead and move the deadlock detection stuff over to the new service, since I need to get this working urgently. Other than that, I'd like to get some sort of opinion from someone authoritative that this approach is valid.
Attachment #228049 -
Flags: review?
Comment 11•19 years ago
|
||
Comment on attachment 228049 [details] [diff] [review]
XPCOM-based stack walking service
ToString/PR_smprintf aren't compatible - use nsPrintfCString
ToString is an xpcom api, which means nsMemory, PR_smprintf requires PR_smprintf_free or something similarly very special
PrintError's comment claims it'll print to stderr and an additional stream
it prints to stdout
+ nsRefPtr<nsStackFrame> lastFrame = 0;
and similar autos shouldn't need to be initialized
+ if (ok)
+#ifdef USING_WXP_VERSION
+ addr = frame64.AddrPC.Offset;
+#else
+ addr = frame.AddrPC.Offset;
+#endif
+ else
+ PrintError("WalkStack");
i'd put {}s around the #if..#end.. block and probably around the else block
Assignee | ||
Comment 12•19 years ago
|
||
I've gone ahead and written the necessary code to serialize stack traces using the new XPCOM service. I also migrated the nsAutoLock deadlock detection stuff to use the service when stacktraces are turned on. This is really useful (I already used it to locate two deadlock risks in our code). Since this means the stack trace service has to be around really early in the startup process, I modified the XPCOM startup code to set it up in a similar way to nsDirectoryService. It would be nice if there were a more generic way to do this but I don't think it's too awful a hack.
Brendan, I've requested review from you but if you don't have time or don't think you're the right person to look at this, perhaps you can suggest who is. I really need some sort of assurance that I'm not totally wasting my time trying to get this into Mozilla. If someone can give this to me, then I'll go ahead and port over tracerefcnt and tracemalloc as well. Note that the actual stackwalking code itself probably doesn't need review since I didn't change it much from the existing version in xpcom and tracemalloc/lib (same code, two places).
Attachment #228049 -
Attachment is obsolete: true
Attachment #228676 -
Flags: review?
Attachment #228049 -
Flags: review?
Assignee | ||
Updated•19 years ago
|
Attachment #228676 -
Flags: review? → review?(brendan)
mark may also be able to give some tips here as he's worked on bug 331357...
*** Bug 341467 has been marked as a duplicate of this bug. ***
No longer blocks: 341467
Assignee | ||
Updated•19 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 15•19 years ago
|
||
I added a new method to the stackwalk service to enable generation of stack traces from Windows CONTEXT structures. I needed this to implement a crash handler that dumps the stack when an unhandled exception is thrown, the beginning of what I hope will be a more modern Talkback implementation.
Note that I was also able to use the service to implement a refcounting framework that can be turned on on a class-by-class basis (recompile required, however). If anyone's interested, let me know.
Attachment #228676 -
Attachment is obsolete: true
Attachment #228676 -
Flags: review?(brendan)
Assignee | ||
Comment 16•19 years ago
|
||
One day I will submit a patch without having to correct it immediately afterwards. Anyway, I went nuts and actually tested this on a clean tree before posting.
I should also have mentioned for clarity that the patch still applies to the 1.8.0.x branch. I dare say we'll be moving onto 1.8.x soon.
Attachment #233049 -
Attachment is obsolete: true
Assignee | ||
Comment 17•19 years ago
|
||
Attachment #233051 -
Attachment is obsolete: true
I'm planning to make trace-malloc use the XPCOM copy of the code in bug 374829.
Comment 19•13 years ago
|
||
Is this bug still valuable, or has the recent profiler work made this irrelevant?
Severity: normal → enhancement
Flags: needinfo?(vdjeric)
Comment 20•13 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #19)
> Is this bug still valuable, or has the recent profiler work made this
> irrelevant?
We've been linking to imagehlp library statically for a long time, and trace-malloc also uses the NS_StackWalk API these days. There's no more value to this bug.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Flags: needinfo?(vdjeric)
Resolution: --- → INVALID
You need to log in
before you can comment on or make changes to this bug.
Description
•