Closed
Bug 422055
Opened 16 years ago
Closed 16 years ago
Use jemalloc on OpenSolaris
Categories
(Core :: Memory Allocator, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9
People
(Reporter: ginnchen+exoracle, Assigned: ginnchen+exoracle)
References
Details
(Whiteboard: [RC2+])
Attachments
(4 files, 10 obsolete files)
10.79 KB,
text/html
|
Details | |
11.93 KB,
patch
|
ted
:
review+
beltzner
:
approval1.9+
|
Details | Diff | Splinter Review |
2.19 KB,
patch
|
mtschrep
:
approval1.9+
|
Details | Diff | Splinter Review |
4.55 KB,
patch
|
ted
:
review+
shaver
:
approval1.9+
|
Details | Diff | Splinter Review |
There's a little hack. I need to put -ljemalloc above -lpthread, otherwise pthread leaks. Because -lpthread is in LDFLAGS, and in rules.mk, LDFALGS is above LIBS, so I have to overwrite LDFLAGS. Maybe we should introduce a new variable for rules.mk, or use WRAP_MALLOC_LIB and move it above LDFLAGS. I don't know if it works on other platform. I use sysconf() instead of kstat, since kstat uses malloc, and we could not do malloc in malloc_init().
I've applied the patch to Firefox 3 beta 4. Here're test results: (all values are median of 3 tests) It seems we save less memory on SPARC, I guess it relates to the pagesize. It's 4096 on x86, 8192 on SPARC. We have a little trade off on performance, I hope we can get it back by using PGO. ***** There's an issue I didn't solve with draft patch. There's no posix_memalign() in Solaris libc. So if the user uses LD_PRELOAD, or embeds Firefox in another app, e.g. yelp, we will leak.
Changes: 1. enable it by default in configure.in 2. On Solaris, libc and other malloc lib has memalign but not posix_memalign. If we use posix_memalign in jemalloc, and other malloc lib or libc is loaded earlier, we will leak. To avoid this, use memalign instead and hide posix_memalign.
Attachment #308596 -
Attachment is obsolete: true
Attachment #309044 -
Flags: review?(ted.mielczarek)
Comment 4•16 years ago
|
||
Comment on attachment 309044 [details] [diff] [review] patch +ifdef MOZ_MEMORY +ifeq ($(OS_ARCH),SunOS) +LDFLAGS = -L$(DIST)/lib -ljemalloc @LDFLAGS@ +else +ifneq ($(OS_ARCH),WINNT) +LIBS += -ljemalloc +endif +endif +endif I don't like this. First of all, you should use EXPAND_LIBNAME_PATH here, second of all, I don't like the @LDFLAGS@ expansion there. I realize you need to get the ordering correct, but I think there must be a better way to do it. Also, you'll need review from other people on the js and jemalloc portions of this patch.
Attachment #309044 -
Flags: review?(ted.mielczarek) → review-
Right, I don't like it, either. So, should I add something like SOLARIS_JEMALLOC_LIBS to rules.mk?
Comment 6•16 years ago
|
||
That would be better, I think.
Attachment #309044 -
Attachment is obsolete: true
Attachment #309932 -
Flags: review?(ted.mielczarek)
Attachment #309932 -
Flags: review?(pavlov)
Comment 8•16 years ago
|
||
The change from using kstat_*() to sysconf() to get the number of processors looks good. The patch in 422960 will cause a conflict with the patch in this bug, but resolution should be obvious. The patch in bug 418016 integrated jemalloc into libxul, which will impact the build-related aspects of the patch in this bug. Also, earlier comments in this bug refer to "leaking" if multiple memory allocators are used, but I would expect undefined behavior (including likely crashes) if we mix allocator use. I'm not very happy about the prospect of burdening all users of posix_memalign() in mozilla with having to conditionally use memalign() instead. Is this really necessary for standard use cases like yelp? If it is strictly to make LD_PRELOAD of alternate malloc libraries work, I'm not convinced that it is worth the ongoing maintenance burden.
on windows we do have multiple allocators and it has to work (and it generally does). on unix multiple allocators should be made to work (the only cases i know of today don't work because they're buggy, and those are IMEs).
Comment 10•16 years ago
|
||
My comments are specific to run-time selection of memory allocator, whereas I think you (timeless) are referring to compile-time selection. I agree that nothing should be done to prevent compile-time selection, but I am questioning the value of requiring developers to write code like the js/src/jsgc.c portion of the patch in this bug everywhere that aligned memory is allocated. It may be that a simpler solution is to just use memalign() in all cases.
Comment 11•16 years ago
|
||
Comment on attachment 309932 [details] [diff] [review] patch v3 I agree with Jason about the posix_memalign change. That should be fixed differently. (i.e. not using POSIX_MEMALIGN) (there is code there to do other things already (mmap, oversized allcs, etc). I don't understand this part: +#elif defined(MOZ_MEMORY_SOLARIS) +inline int +posix_memalign(void **memptr, size_t alignment, size_t size) Shouldn't be needed.
Attachment #309932 -
Flags: review?(pavlov) → review-
Assignee | ||
Comment 12•16 years ago
|
||
Stuart, I'm trying to hide posix_memalign, just like using __hidden keyword, therefore any problem that links to libjemalloc (or libxul) won't use posix_memalign and libc free together. Jason, If js/src/jsgc.c uses memalign() in all cases, should I add inline before posix_memalign in jemalloc.c ?
Comment 13•16 years ago
|
||
Ginn: I don't know how the dynamic loader on Solaris works, but on Linux linking will cause everything to use jemalloc's free, malloc, posix_memalign. That is how things should work. If we need to hook it up on solaris differently we should do that but we shouldn't hide the symbol.
Comment 14•16 years ago
|
||
We want jsgc.c to use posix_memalign, not memalign. It should be using the posix_memalign from jemalloc on Solaris.
Assignee | ||
Comment 15•16 years ago
|
||
If jsgc.c uses posix_memalign, that would be a problem on Solaris, if any malloc library or evan libc library is loaded earlier. It would work for Firefox because we put -lxul before any other libraries. But it wouldn't work for yelp or LD_PRELOAD case. Here's a blog on this. http://blogs.sun.com/quenelle/entry/malloc_interposition_can_t_possibly
Comment 16•16 years ago
|
||
That basically says "don't miss allocation APIs" as best I can tell, so I'm not sure why having posix_memalign is a problem.
Assignee | ||
Comment 17•16 years ago
|
||
Comparing to jemalloc, libc misses posix_memalign. That's the problem. e.g. thunderbird will work as "undefined behavior (including likely crashes)" since we didn't put -lxul ahead when linking thunderbird-bin. Any program that uses libmozjs.so is same.
Comment 18•16 years ago
|
||
I don't understand. If you're saying people using posix_memalign _will_ use jemalloc and people using malloc won't, then I don't understand this patch at all.
Comment 19•16 years ago
|
||
My understanding of Ginn Chen's comments is that if we use posix_memalign() from jemalloc on Solaris, then using LD_PRELOAD to override jemalloc with some other allocator will fail, because jemalloc's posix_memalign() will continue to call into jemalloc, but (for example) free() will call into the replacement allocator. It just occurred to me that we could solve this problem on Solaris by reversing the way jemalloc implements memalign() and posix_memalign(), so that posix_memalign() calls memalign() rather than the other way around. That way we can always use posix_memalign(), and if LD_PRELOAD is used to override jemalloc, the posix_memalign() code will call the replacement memalign(). As far as I know, there are no platforms for which this reversal would cause problems, so I expect we can do it unconditionally.
Assignee | ||
Comment 20•16 years ago
|
||
Jason, I've tested, reversing the way jemalloc implements memalign() and posix_memalign() would make LD_PRELOAD work. But we need to make sure jemalloc's memalign() is not inlined into posix_memalign(). We can use __attribute__ ((noinline)) for GNUC and #pragma noinline for Sun Studio. I hope it will not be a performance overhead. Another thing is: If libjemalloc is out of libxul (using -disable-libxul option), we need to add -ljemalloc when linking libmozjs.so or linking any program that uses libmozjs.so.
Assignee | ||
Comment 21•16 years ago
|
||
(In reply to comment #20) > Another thing is: > If libjemalloc is out of libxul (using -disable-libxul option), we need to add > -ljemalloc when linking libmozjs.so or linking any program that uses > libmozjs.so. > Ignore this part, jemalloc will always be part of libxul. No matter --enable-libxul or --disable-libxul, we need to add -lxul for linking programs that uses libmozjs.so. e.g. xpcshell.
Assignee | ||
Comment 22•16 years ago
|
||
Attachment #309932 -
Attachment is obsolete: true
Attachment #309932 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 23•16 years ago
|
||
Comment on attachment 311348 [details] [diff] [review] patch v4 Jason, can you give some comments?
Attachment #311348 -
Flags: review?(jasone)
Comment 24•16 years ago
|
||
Attached is a patch derived from patch v4. Patch v4 does not compile on Linux, due to a lacking prototype for memalign(). The fix is to reorder memalign() and posix_memalign(). There are some minor issues with error condition handling. memalign() is not required to do validation of the alignment argument, so I moved that code back into posix_memalign() and added an assertion in memalign(). I also adjusted the MALLOC_XMALLOC code, though that is a pedantic change since the code is disabled anyway. I don't think we need or want to avoid inlining memalign() on Linux, so I adjusted that code to apply only to Solaris. Ginn, can you please look over the revised patch and make sure it works correctly on Solaris?
Attachment #311348 -
Attachment is obsolete: true
Attachment #311348 -
Flags: review?(jasone)
Assignee | ||
Comment 25•16 years ago
|
||
It works great on Solaris. Thank you! In case someone build it with gcc on Solaris, I think it would be better if we write it like, #elif defined(MOZ_MEMORY_SOLARIS) #if defined(__SUNPRO_C) void * memalign(size_t alignment, size_t size); #pragma no_inline(memalign) #elif defined(__GNU_C__) _attribute__((noinline)) #endif VISIBLE void * memalign(size_t alignment, size_t size) #else
Attachment #312824 -
Flags: review?(ted.mielczarek)
Comment 26•16 years ago
|
||
Ginn, I think patch v4b addresses your feedback. Please let me know if I misunderstood your intent.
Attachment #312824 -
Attachment is obsolete: true
Attachment #312824 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 27•16 years ago
|
||
Comment on attachment 312959 [details] [diff] [review] patch v4b [Checkin: Comment 30] Yes, it does. Thanks.
Attachment #312959 -
Flags: review?(ted.mielczarek)
Comment 28•16 years ago
|
||
Comment on attachment 312959 [details] [diff] [review] patch v4b [Checkin: Comment 30] r=me on the build bits.
Attachment #312959 -
Flags: review?(ted.mielczarek) → review+
Attachment #312959 -
Flags: approval1.9?
Comment 29•16 years ago
|
||
Comment on attachment 312959 [details] [diff] [review] patch v4b [Checkin: Comment 30] a1.9=beltzner
Attachment #312959 -
Flags: approval1.9? → approval1.9+
Assignee | ||
Comment 30•16 years ago
|
||
Checking in configure.in; /cvsroot/mozilla/configure.in,v <-- configure.in new revision: 1.1989; previous revision: 1.1988 done Checking in browser/app/Makefile.in; /cvsroot/mozilla/browser/app/Makefile.in,v <-- Makefile.in new revision: 1.154; previous revision: 1.153 done Checking in config/rules.mk; /cvsroot/mozilla/config/rules.mk,v <-- rules.mk new revision: 3.594; previous revision: 3.593 done Checking in memory/jemalloc/Makefile.in; /cvsroot/mozilla/memory/jemalloc/Makefile.in,v <-- Makefile.in new revision: 1.8; previous revision: 1.7 done Checking in memory/jemalloc/jemalloc.c; /cvsroot/mozilla/memory/jemalloc/jemalloc.c,v <-- jemalloc.c new revision: 1.12; previous revision: 1.11 done
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 31•16 years ago
|
||
There're 2 issues. 1) When compiling thunderbird, it doesn't have MOZ_ENABLE_LIBXUL, therefore XPCOM_LIBS doesn't have LIBXUL_LIBS. The result is xpcshell failed to be linked, because posix_memalign is missing. 2) On Solaris/SPARC, we don't have alloca(). I didn't notice bug 420678 used it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 32•16 years ago
|
||
Move the definition of SOLARIS_JEMALLOC_LDFLAGS to rules.mk, so that it will always apply to xpcshell and *-bin.
Attachment #319343 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 33•16 years ago
|
||
In alloca.h, we have #define alloca(x) __builtin_alloca(x)
Attachment #319344 -
Flags: review?(jasone)
Comment 34•16 years ago
|
||
Comment on attachment 319344 [details] [diff] [review] fix bustage on Solaris/SPARC This alloca() trouble is a pain that I don't want to have to deal with on an ongoing basis. I'm going to modify the code to not use alloca() at all.
Comment 35•16 years ago
|
||
This patch removes the alloca() call, thus avoiding the need for portability fixes. Some platforms have alloca.h, and others define alloca() in stdlib.h. I think we would have needed to add an autoconf test to solve this problem.
Attachment #319344 -
Attachment is obsolete: true
Attachment #319414 -
Flags: review?(ginn.chen)
Attachment #319344 -
Flags: review?(jasone)
Assignee | ||
Comment 36•16 years ago
|
||
Comment on attachment 319414 [details] [diff] [review] Remove alloca() call removal of alloca is great, since it's machine-, compiler- and system-dependent. Nit: I think we can also remove #define alloca _alloca added by bug 420678.
Attachment #319414 -
Flags: review?(ginn.chen)
Attachment #319414 -
Flags: review+
Attachment #319414 -
Flags: approval1.9?
Comment 37•16 years ago
|
||
Per comment #36, remove the unneeded alloca #define. Thanks for catching that, Ginn.
Attachment #319414 -
Attachment is obsolete: true
Attachment #319414 -
Flags: approval1.9?
Attachment #319569 -
Flags: approval1.9?
Comment 38•16 years ago
|
||
Comment on attachment 319569 [details] [diff] [review] Remove alloca() call (and #define) [Checkin: Comment 39] a+ based on Pav's risk assessment and moz2 meeting today
Attachment #319569 -
Flags: approval1.9? → approval1.9+
Assignee | ||
Comment 39•16 years ago
|
||
Checking in jemalloc.c; /cvsroot/mozilla/memory/jemalloc/jemalloc.c,v <-- jemalloc.c new revision: 1.14; previous revision: 1.13 done
Assignee | ||
Comment 40•16 years ago
|
||
Attachment #319343 -
Attachment is obsolete: true
Attachment #319931 -
Flags: review?(ted.mielczarek)
Attachment #319343 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 41•16 years ago
|
||
Comment on attachment 319931 [details] [diff] [review] patch revised since bug 418016 reverts to libjemalloc.so Sadly it's not working. Because if we compile xpidl with -ljemalloc, xpidl could not be run at compiling time, since dist/libjemalloc.so is a symbolic link.
Attachment #319931 -
Attachment is obsolete: true
Attachment #319931 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 42•16 years ago
|
||
This is the best fix I can imagine right now. Add dependence of jemalloc to mozjs, since it uses posix_memalign. Apply SOLARIS_JEMALLOC_LDFLAGS to every binary that we want to use jemalloc.
Attachment #319960 -
Flags: review?(ted.mielczarek)
Comment 43•16 years ago
|
||
I discovered a syntax error that slipped into revision 1.12 of jemalloc.c as part of the commit on 30 April 2008: # elif (defined(__GNU_C__) should be: # elif (defined(__GNU_C__)) ^ I'm a little surprised this hasn't caused any trouble, but if there are no associated build errors, then we can wait to fix it until the patch in bug #422960 is committed.
Comment 44•16 years ago
|
||
This issue still breaks Firefox 3.0 on Solaris. ask for blocker 1.9.
Flags: blocking1.9?
Comment 45•16 years ago
|
||
(In reply to comment #44) > This issue still breaks Firefox 3.0 on Solaris. ask for blocker 1.9. What is the current problem you are referring to? If comment #41 describes the issue at hand, perhaps there is a simple solution, such as passing an appropriate RPATH-related flag during linking.
Comment 46•16 years ago
|
||
Alfred does comment 45 work for you?
Flags: blocking1.9? → blocking1.9-
Whiteboard: [not needed for 1.9]
Comment 47•16 years ago
|
||
Jason, the Solaris tinerboxes(http://tinderbox.mozilla.org/Firefox-Ports/) are still broken as comment #41 mentioned. Ginn, any comments for Jason's suggestion?
Assignee | ||
Comment 48•16 years ago
|
||
Jason, I think RPATH doesn't work for symbolic links to another directories.
Comment 50•16 years ago
|
||
Comment on attachment 319960 [details] [diff] [review] fix linkage on Solaris Could we do this centrally, in config.mk or rules.mk? It sucks to have that block duplicated in every binary's Makefile.
Assignee | ||
Comment 51•16 years ago
|
||
Ted, I tried, but then we need an exception for xpidl, as I mentioned in comment #41. Also I think it's necessary to add -ljemalloc for mozjs any way, because it uses posix_memalign().
Assignee | ||
Comment 52•16 years ago
|
||
Use a black list rather than a white list. Is it look better?
Attachment #322490 -
Flags: review?(ted.mielczarek)
Comment 53•16 years ago
|
||
Comment on attachment 322490 [details] [diff] [review] alternative patch for linkage [Checkin: Comment 59] Yeah, I like that better.
Attachment #322490 -
Flags: review?(ted.mielczarek) → review+
Comment 54•16 years ago
|
||
Comment on attachment 319960 [details] [diff] [review] fix linkage on Solaris Use the other patch.
Attachment #319960 -
Flags: review?(ted.mielczarek) → review-
Updated•16 years ago
|
Whiteboard: [not needed for 1.9] → [not needed for 1.9?][RC2?][has patch][has ted review]
Comment 55•16 years ago
|
||
Ted we should take this on trunk
Comment 56•16 years ago
|
||
Was that a question or directive?
Assignee | ||
Comment 57•16 years ago
|
||
It doesn't affect any platform other than Solaris. It would be better to commit it on 1.9.0 branch. Otherwise some people who use source tar ball may have trouble to build it on Solaris.
Updated•16 years ago
|
Attachment #322490 -
Flags: approval1.9?
Comment on attachment 322490 [details] [diff] [review] alternative patch for linkage [Checkin: Comment 59] Approved, please land by noon PDT to make Firefox 3.0.
Attachment #322490 -
Flags: approval1.9? → approval1.9+
Whiteboard: [not needed for 1.9?][RC2?][has patch][has ted review] → [not needed for 1.9?][RC2+][has patch][has ted review]
Comment 59•16 years ago
|
||
mozilla/browser/app/Makefile.in 1.156 mozilla/config/rules.mk 3.595 mozilla/js/src/Makefile.in 3.125 mozilla/xpcom/typelib/xpidl/Makefile.in 1.45 mozilla/xpcom/typelib/xpt/tools/Makefile.in 1.33
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Whiteboard: [not needed for 1.9?][RC2+][has patch][has ted review] → [RC2+]
Target Milestone: --- → mozilla1.9
Updated•16 years ago
|
Attachment #312959 -
Attachment description: patch v4b → patch v4b
[Checkin: Comment 30]
Updated•16 years ago
|
Attachment #319569 -
Attachment description: Remove alloca() call (and #define) → Remove alloca() call (and #define)
[Checkin: Comment 39]
Updated•16 years ago
|
Attachment #322490 -
Attachment description: alternative patch for linkage → alternative patch for linkage
[Checkin: Comment 59]
Updated•16 years ago
|
Attachment #319960 -
Attachment is obsolete: true
Comment 60•16 years ago
|
||
re comment 43, bug 446302 reports that this is a real issue. i don't understand why this was left unfixed....
Comment 61•16 years ago
|
||
(In reply to comment #60) > re comment 43, bug 446302 reports that this is a real issue. i don't understand > why this was left unfixed.... It was fixed in changeset 663c51189e98 on 20 June 2008, as per comment #5 in bug #442960.
Comment 62•16 years ago
|
||
it was not fixed in CVS. you have a responsibility to all branches you break, not just central.
You need to log in
before you can comment on or make changes to this bug.
Description
•