Closed Bug 658995 Opened 13 years ago Closed 13 years ago

Trace malloc doesn't work on linux with standalone xpcom glue

Categories

(Core :: XPCOM, defect)

All
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla7

People

(Reporter: glandium, Assigned: glandium)

References

Details

Attachments

(2 files, 4 obsolete files)

This was discovered after bug 552864 landed. There are basically two problems:
- Some libxul functions are registered with atexit(), and that is bound to fail after libxul is unloaded. Somehow, this is not a problem on windows and mac.
- malloc/free/... functions aren't diverted with the ones from nsTraceMalloc.c because they are already resolved to libc's when libxul is loaded.
This makes use of libc's malloc/free/... functions until libxul is loaded, and reverts to these after libxul is unloaded.
When libxul is loaded, use libxul's functions
Attachment #534438 - Flags: review?(benjamin)
Use base::AtExitManager it is possible to avoid problems that would arise with atexit() after libxul is unloaded.
The RegisterAtExitCallback is required in nsTraceMalloc.c because we can't call base::AtExitManager::RegisterCallback directly from C code.
We need to initialize an AtExitManager before NS_LogInit() in NS_InitXPCOM2 because NS_LogInit ends up trying to register a callback. We also add an AtExitManager in XRE_main because the profile lock registers a callback too, and it does so before NS_InitXPCOM2. Then, the one in NS_InitXPCOM2 becomes redundant, but we have a bunch of executables that do NS_InitXPCOM2 without ever calling XRE_main.
Attachment #534440 - Flags: review?(benjamin)
Attachment #534438 - Flags: review?(benjamin) → review?(dbaron)
Under what conditions do we use standalone XPCOM glue?  i.e., what cases does this bug affect?

When does the AtExitManager stuff run (relative to when main() exits)?  For some of this, we really want it to run after main() is done, or at the very least after XRE_main() is done.

Casting function pointers is a sign that something is wrong.  Why do you need to do that in nsStandardURL.cpp in patch 2?
(In reply to comment #4)
> Under what conditions do we use standalone XPCOM glue?  i.e., what cases
> does this bug affect?

bug 552864 switches Firefox to use the standlone XPCOM glue to allow preloading of the libraries (bug 	632404) for massive cold startup improvement.

> When does the AtExitManager stuff run (relative to when main() exits)?  For
> some of this, we really want it to run after main() is done, or at the very
> least after XRE_main() is done.

The way the AtExitManager stuff works, it runs the callbacks when destroying the AtExitManager instance. In the most common case, it means when exiting XRE_main.

> Casting function pointers is a sign that something is wrong.  Why do you
> need to do that in nsStandardURL.cpp in patch 2?

Because AtExitManager callbacks take a void * parameter we don't use. I could modify the callbacks to take one, though.
(In reply to comment #5)
> The way the AtExitManager stuff works, it runs the callbacks when destroying
> the AtExitManager instance. In the most common case, it means when exiting
> XRE_main.

I'm wondering if using a static initializer wouldn't make more sense, actually...
Comment on attachment 534440 [details] [diff] [review]
part 2 - Use base::AtExitManager instead of atexit()

Ugh, more chromium code we'll have to rip out later :-(
Attachment #534440 - Flags: review?(benjamin) → review+
Apparently, MSVC doesn't like what chromium-config.mk adds:

nsDebugHelpWin32.cpp

e:/builds/moz2_slave/try-w32-dbg/build/tools/trace-malloc/lib/nsDebugHelpWin32.cpp(73) : error C2664: 'LoadLibraryW' : cannot convert parameter 1 from 'const char [12]' to 'LPCWSTR'

        Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast

e:/builds/moz2_slave/try-w32-dbg/build/tools/trace-malloc/lib/nsDebugHelpWin32.cpp(275) : error C2664: 'lstrcmpiW' : cannot convert parameter 1 from 'const char *' to 'LPCWSTR'

        Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or function-style cast
chromium-config.mk defined UNICODE ... you can just change those calls from ::Foo() to :FooA.
(In reply to comment #5)
> > Casting function pointers is a sign that something is wrong.  Why do you
> > need to do that in nsStandardURL.cpp in patch 2?
> 
> Because AtExitManager callbacks take a void * parameter we don't use. I
> could modify the callbacks to take one, though.

Please do.  That's much better than casting function pointers and making assumptions about calling conventions.
Comment on attachment 534438 [details] [diff] [review]
part 1 - Properly divert memory allocation functions for trace malloc with standalone glue on Linux

ok, r=dbaron
Attachment #534438 - Flags: review?(dbaron) → review+
This one doesn't rely on base::AtExitManager at all. It relies on static destructors.
While this approach works very well, it's however not very obvious when reading the code what it is doing. I'd like to find a way to make that into nice templates or macros so that documentation could lie in a single location, but that might not be possible at all.
Attachment #534730 - Flags: review?(benjamin)
Attachment #534440 - Attachment is obsolete: true
Comment on attachment 534730 [details] [diff] [review]
part 2 - Use static destructors instead of atexit()

Ugh, but ok.
Attachment #534730 - Flags: review?(benjamin) → review+
Attachment #534438 - Attachment is obsolete: true
As will land (added comments)
Attachment #534730 - Attachment is obsolete: true
I merged this from m-i to m-c, but it will be backed out shortly while we investigate Talos regressions:
http://hg.mozilla.org/mozilla-central/rev/6e230988614f
http://hg.mozilla.org/mozilla-central/rev/6b23b52e68d6
I figured what went wrong with the leak reports: posix_memalign wasn't diverted. I thus added posix_memalign and cfree, which are the two functions that nsTraceMalloc.c exports and that we weren't. I don't think we use cfree from anywhere, but at least, we on par with what nsTraceMalloc.c diverts. The interdiff with the previous version is simple enough (and mostly copied from nsTraceMalloc.c) that I don't need review.
Attachment #539435 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/eef5d644e17e
http://hg.mozilla.org/mozilla-central/rev/f4ddad2c0eb7
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [inbound]
Target Milestone: --- → mozilla7
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: