Use jemalloc on OpenSolaris

RESOLVED FIXED in mozilla1.9

Status

()

Core
Memory Allocator
RESOLVED FIXED
10 years ago
7 years ago

People

(Reporter: Ginn Chen, Assigned: Ginn Chen)

Tracking

Trunk
mozilla1.9
All
OpenSolaris
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9 -

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [RC2+])

Attachments

(4 attachments, 10 obsolete attachments)

10.79 KB, text/html
Details
11.93 KB, patch
ted
: review+
beltzner
: approval1.9+
Details | Diff | Splinter Review
2.19 KB, patch
Mike Schroepfer
: approval1.9+
Details | Diff | Splinter Review
4.55 KB, patch
ted
: review+
shaver
: approval1.9+
Details | Diff | Splinter Review
(Assignee)

Description

10 years ago
 
(Assignee)

Comment 1

10 years ago
Created attachment 308596 [details] [diff] [review]
draft patch

There's a little hack.
I need to put -ljemalloc above -lpthread, otherwise pthread leaks.
Because -lpthread is in LDFLAGS, and in rules.mk, LDFALGS is above LIBS, so I have to overwrite LDFLAGS.

Maybe we should introduce a new variable for rules.mk, or use WRAP_MALLOC_LIB and move it above LDFLAGS. I don't know if it works on other platform.

I use sysconf() instead of kstat, since kstat uses malloc, and we could not do malloc in malloc_init().
(Assignee)

Comment 2

10 years ago
Created attachment 308837 [details]
test result

I've applied the patch to Firefox 3 beta 4.
Here're test results: (all values are median of 3 tests)

It seems we save less memory on SPARC, I guess it relates to the pagesize. It's 4096 on x86, 8192 on SPARC.
We have a little trade off on performance, I hope we can get it back by using PGO.

*****
There's an issue I didn't solve with draft patch.
There's no posix_memalign() in Solaris libc.
So if the user uses LD_PRELOAD, or embeds Firefox in another app, e.g. yelp, we will leak.
(Assignee)

Comment 3

10 years ago
Created attachment 309044 [details] [diff] [review]
patch

Changes:
1. enable it by default in configure.in

2. On Solaris, libc and other malloc lib has memalign but not posix_memalign.
If we use posix_memalign in jemalloc, and other malloc lib or libc is loaded earlier, we will leak. To avoid this, use memalign instead and hide posix_memalign.
Attachment #308596 - Attachment is obsolete: true
(Assignee)

Updated

10 years ago
Attachment #309044 - Flags: review?(ted.mielczarek)
Comment on attachment 309044 [details] [diff] [review]
patch

+ifdef MOZ_MEMORY
+ifeq ($(OS_ARCH),SunOS)
+LDFLAGS = -L$(DIST)/lib -ljemalloc @LDFLAGS@
+else
+ifneq ($(OS_ARCH),WINNT)
+LIBS += -ljemalloc
+endif
+endif
+endif

I don't like this. First of all, you should use EXPAND_LIBNAME_PATH here, second of all, I don't like the @LDFLAGS@ expansion there. I realize you need to get the ordering correct, but I think there must be a better way to do it.

Also, you'll need review from other people on the js and jemalloc portions of this patch.
Attachment #309044 - Flags: review?(ted.mielczarek) → review-
(Assignee)

Comment 5

10 years ago
Right, I don't like it, either.

So, should I add something like SOLARIS_JEMALLOC_LIBS to rules.mk?  
That would be better, I think.
(Assignee)

Comment 7

10 years ago
Created attachment 309932 [details] [diff] [review]
patch v3
Attachment #309044 - Attachment is obsolete: true
Attachment #309932 - Flags: review?(ted.mielczarek)
(Assignee)

Updated

10 years ago
Attachment #309932 - Flags: review?(pavlov)

Comment 8

10 years ago
The change from using kstat_*() to sysconf() to get the number of processors looks good.

The patch in 422960 will cause a conflict with the patch in this bug, but resolution should be obvious.

The patch in bug 418016 integrated jemalloc into libxul, which will impact the build-related aspects of the patch in this bug.  Also, earlier comments in this bug refer to "leaking" if multiple memory allocators are used, but I would expect undefined behavior (including likely crashes) if we mix allocator use.  

I'm not very happy about the prospect of burdening all users of posix_memalign() in mozilla with having to conditionally use memalign() instead.  Is this really necessary for standard use cases like yelp?  If it is strictly to make LD_PRELOAD of alternate malloc libraries work, I'm not convinced that it is worth the ongoing maintenance burden.

Comment 9

10 years ago
on windows we do have multiple allocators and it has to work (and it generally  does). on unix multiple allocators should be made to work (the only cases i know of today don't work because they're buggy, and those are IMEs).

Comment 10

10 years ago
My comments are specific to run-time selection of memory allocator, whereas I think you (timeless) are referring to compile-time selection.  I agree that nothing should be done to prevent compile-time selection, but I am questioning the value of requiring developers to write code like the js/src/jsgc.c portion of the patch in this bug everywhere that aligned memory is allocated.  It may be that a simpler solution is to just use memalign() in all cases.

Comment 11

10 years ago
Comment on attachment 309932 [details] [diff] [review]
patch v3

I agree with Jason about the posix_memalign change.  That should be fixed differently. (i.e. not using POSIX_MEMALIGN) (there is code there to do other things already (mmap, oversized allcs, etc).

I don't understand this part:
+#elif defined(MOZ_MEMORY_SOLARIS)
+inline int
+posix_memalign(void **memptr, size_t alignment, size_t size)

Shouldn't be needed.
Attachment #309932 - Flags: review?(pavlov) → review-
(Assignee)

Comment 12

10 years ago
Stuart,
I'm trying to hide posix_memalign, just like using __hidden keyword, therefore any problem that links to libjemalloc (or libxul) won't use posix_memalign and libc free together.

Jason,
If js/src/jsgc.c uses memalign() in all cases, should I add inline before posix_memalign in jemalloc.c ?

Comment 13

10 years ago
Ginn: I don't know how the dynamic loader on Solaris works, but on Linux linking will cause everything to use jemalloc's free, malloc, posix_memalign.  That is how things should work.  If we need to hook it up on solaris differently we should do that but we shouldn't hide the symbol.

Comment 14

10 years ago
We want jsgc.c to use posix_memalign, not memalign.  It should be using the posix_memalign from jemalloc on Solaris.
(Assignee)

Comment 15

10 years ago
If jsgc.c uses posix_memalign, that would be a problem on Solaris, if any malloc library or evan libc library is loaded earlier.

It would work for Firefox because we put -lxul before any other libraries.
But it wouldn't work for yelp or LD_PRELOAD case.

Here's a blog on this.
http://blogs.sun.com/quenelle/entry/malloc_interposition_can_t_possibly

Comment 16

10 years ago
That basically says "don't miss allocation APIs" as best I can tell, so I'm not sure why having posix_memalign is a problem.
(Assignee)

Comment 17

10 years ago
Comparing to jemalloc, libc misses posix_memalign. That's the problem.

e.g. thunderbird will work as "undefined behavior (including likely crashes)" since we didn't put -lxul ahead when linking thunderbird-bin.
Any program that uses libmozjs.so is same.


Comment 18

10 years ago
I don't understand.  If you're saying people using posix_memalign _will_ use jemalloc and people using malloc won't, then I don't understand this patch at all.  

Comment 19

10 years ago
My understanding of Ginn Chen's comments is that if we use posix_memalign() from jemalloc on Solaris, then using LD_PRELOAD to override jemalloc with some other allocator will fail, because jemalloc's posix_memalign() will continue to call into jemalloc, but (for example) free() will call into the replacement allocator.

It just occurred to me that we could solve this problem on Solaris by reversing the way jemalloc implements memalign() and posix_memalign(), so that posix_memalign() calls memalign() rather than the other way around.  That way we can always use posix_memalign(), and if LD_PRELOAD is used to override jemalloc, the posix_memalign() code will call the replacement memalign().

As far as I know, there are no platforms for which this reversal would cause problems, so I expect we can do it unconditionally.
(Assignee)

Comment 20

10 years ago
Jason,
I've tested, reversing the way jemalloc implements memalign() and posix_memalign() would make LD_PRELOAD work.

But we need to make sure jemalloc's memalign() is not inlined into posix_memalign().
We can use  __attribute__ ((noinline)) for GNUC and #pragma noinline for Sun Studio.

I hope it will not be a performance overhead.

Another thing is:
If libjemalloc is out of libxul (using -disable-libxul option), we need to add -ljemalloc when linking libmozjs.so or linking any program that uses libmozjs.so.
(Assignee)

Comment 21

10 years ago
(In reply to comment #20)
> Another thing is:
> If libjemalloc is out of libxul (using -disable-libxul option), we need to add
> -ljemalloc when linking libmozjs.so or linking any program that uses
> libmozjs.so.
> 
Ignore this part, jemalloc will always be part of libxul.

No matter --enable-libxul or --disable-libxul,
we need to add -lxul for linking programs that uses libmozjs.so.
e.g. xpcshell.
(Assignee)

Comment 22

10 years ago
Created attachment 311348 [details] [diff] [review]
patch v4
Attachment #309932 - Attachment is obsolete: true
Attachment #309932 - Flags: review?(ted.mielczarek)
(Assignee)

Comment 23

10 years ago
Comment on attachment 311348 [details] [diff] [review]
patch v4

Jason, can you give some comments?
Attachment #311348 - Flags: review?(jasone)

Comment 24

9 years ago
Created attachment 312824 [details] [diff] [review]
patch v4a

Attached is a patch derived from patch v4.

Patch v4 does not compile on Linux, due to a lacking prototype for memalign().  The fix is to reorder memalign() and posix_memalign().

There are some minor issues with error condition handling.  memalign() is not required to do validation of the alignment argument, so I moved that code back into posix_memalign() and added an assertion in memalign().  I also adjusted the MALLOC_XMALLOC code, though that is a pedantic change since the code is disabled anyway.

I don't think we need or want to avoid inlining memalign() on Linux, so I adjusted that code to apply only to Solaris.

Ginn, can you please look over the revised patch and make sure it works correctly on Solaris?
Attachment #311348 - Attachment is obsolete: true
Attachment #311348 - Flags: review?(jasone)
(Assignee)

Comment 25

9 years ago
It works great on Solaris. Thank you!

In case someone build it with gcc on Solaris,
I think it would be better if we write it like,

#elif defined(MOZ_MEMORY_SOLARIS)
#if defined(__SUNPRO_C)
void *
memalign(size_t alignment, size_t size);
#pragma no_inline(memalign)
#elif defined(__GNU_C__)
_attribute__((noinline))
#endif
VISIBLE
void *
memalign(size_t alignment, size_t size)
#else
(Assignee)

Updated

9 years ago
Attachment #312824 - Flags: review?(ted.mielczarek)

Comment 26

9 years ago
Created attachment 312959 [details] [diff] [review]
patch v4b
[Checkin: Comment 30]

Ginn, I think patch v4b addresses your feedback.  Please let me know if I misunderstood your intent.
Attachment #312824 - Attachment is obsolete: true
Attachment #312824 - Flags: review?(ted.mielczarek)
(Assignee)

Comment 27

9 years ago
Comment on attachment 312959 [details] [diff] [review]
patch v4b
[Checkin: Comment 30]

Yes, it does.
Thanks.
Attachment #312959 - Flags: review?(ted.mielczarek)
Comment on attachment 312959 [details] [diff] [review]
patch v4b
[Checkin: Comment 30]

r=me on the build bits.
Attachment #312959 - Flags: review?(ted.mielczarek) → review+
(Assignee)

Updated

9 years ago
Attachment #312959 - Flags: approval1.9?
Comment on attachment 312959 [details] [diff] [review]
patch v4b
[Checkin: Comment 30]

a1.9=beltzner
Attachment #312959 - Flags: approval1.9? → approval1.9+
(Assignee)

Comment 30

9 years ago
Checking in configure.in;
/cvsroot/mozilla/configure.in,v  <--  configure.in
new revision: 1.1989; previous revision: 1.1988
done
Checking in browser/app/Makefile.in;
/cvsroot/mozilla/browser/app/Makefile.in,v  <--  Makefile.in
new revision: 1.154; previous revision: 1.153
done
Checking in config/rules.mk;
/cvsroot/mozilla/config/rules.mk,v  <--  rules.mk
new revision: 3.594; previous revision: 3.593
done
Checking in memory/jemalloc/Makefile.in;
/cvsroot/mozilla/memory/jemalloc/Makefile.in,v  <--  Makefile.in
new revision: 1.8; previous revision: 1.7
done
Checking in memory/jemalloc/jemalloc.c;
/cvsroot/mozilla/memory/jemalloc/jemalloc.c,v  <--  jemalloc.c
new revision: 1.12; previous revision: 1.11
done
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED

Updated

9 years ago
Component: General → jemalloc
QA Contact: general → jemalloc
(Assignee)

Comment 31

9 years ago
There're 2 issues.
1) When compiling thunderbird, it doesn't have MOZ_ENABLE_LIBXUL, therefore XPCOM_LIBS doesn't have LIBXUL_LIBS.
The result is xpcshell failed to be linked, because posix_memalign is missing.

2) On Solaris/SPARC, we don't have alloca(). I didn't notice bug 420678 used it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 32

9 years ago
Created attachment 319343 [details] [diff] [review]
fix thunderbird and seamonkey bustage

Move the definition of SOLARIS_JEMALLOC_LDFLAGS to rules.mk, so that it will always apply to xpcshell and *-bin.
Attachment #319343 - Flags: review?(ted.mielczarek)
(Assignee)

Comment 33

9 years ago
Created attachment 319344 [details] [diff] [review]
fix bustage on Solaris/SPARC

In alloca.h, we have
#define alloca(x)   __builtin_alloca(x)
Attachment #319344 - Flags: review?(jasone)

Comment 34

9 years ago
Comment on attachment 319344 [details] [diff] [review]
fix bustage on Solaris/SPARC

This alloca() trouble is a pain that I don't want to have to deal with on an ongoing basis.  I'm going to modify the code to not use alloca() at all.

Comment 35

9 years ago
Created attachment 319414 [details] [diff] [review]
Remove alloca() call

This patch removes the alloca() call, thus avoiding the need for portability fixes.  Some platforms have alloca.h, and others define alloca() in stdlib.h.  I think we would have needed to add an autoconf test to solve this problem.
Attachment #319344 - Attachment is obsolete: true
Attachment #319414 - Flags: review?(ginn.chen)
Attachment #319344 - Flags: review?(jasone)
(Assignee)

Comment 36

9 years ago
Comment on attachment 319414 [details] [diff] [review]
Remove alloca() call

removal of alloca is great, since it's machine-, compiler- and system-dependent.

Nit: I think we can also remove
#define alloca _alloca
added by bug 420678.
Attachment #319414 - Flags: review?(ginn.chen)
Attachment #319414 - Flags: review+
Attachment #319414 - Flags: approval1.9?

Comment 37

9 years ago
Created attachment 319569 [details] [diff] [review]
Remove alloca() call (and #define)
[Checkin: Comment 39]

Per comment #36, remove the unneeded alloca #define.  Thanks for catching that, Ginn.
Attachment #319414 - Attachment is obsolete: true
Attachment #319414 - Flags: approval1.9?
(Assignee)

Updated

9 years ago
Attachment #319569 - Flags: approval1.9?

Comment 38

9 years ago
Comment on attachment 319569 [details] [diff] [review]
Remove alloca() call (and #define)
[Checkin: Comment 39]

a+ based on Pav's risk assessment and moz2 meeting today
Attachment #319569 - Flags: approval1.9? → approval1.9+
(Assignee)

Comment 39

9 years ago
Checking in jemalloc.c;
/cvsroot/mozilla/memory/jemalloc/jemalloc.c,v  <--  jemalloc.c
new revision: 1.14; previous revision: 1.13
done
(Assignee)

Comment 40

9 years ago
Created attachment 319931 [details] [diff] [review]
patch revised since bug 418016 reverts to libjemalloc.so
Attachment #319343 - Attachment is obsolete: true
Attachment #319931 - Flags: review?(ted.mielczarek)
Attachment #319343 - Flags: review?(ted.mielczarek)
(Assignee)

Comment 41

9 years ago
Comment on attachment 319931 [details] [diff] [review]
patch revised since bug 418016 reverts to libjemalloc.so

Sadly it's not working. Because if we compile xpidl with -ljemalloc, xpidl could not be run at compiling time, since dist/libjemalloc.so is a symbolic link.
Attachment #319931 - Attachment is obsolete: true
Attachment #319931 - Flags: review?(ted.mielczarek)
(Assignee)

Comment 42

9 years ago
Created attachment 319960 [details] [diff] [review]
fix linkage on Solaris

This is the best fix I can imagine right now.

Add dependence of jemalloc to mozjs, since it uses posix_memalign.
Apply SOLARIS_JEMALLOC_LDFLAGS to every binary that we want to use jemalloc.
Attachment #319960 - Flags: review?(ted.mielczarek)

Comment 43

9 years ago
I discovered a syntax error that slipped into revision 1.12 of jemalloc.c as part of the commit on 30 April 2008:

#  elif (defined(__GNU_C__)

should be:

#  elif (defined(__GNU_C__))
                           ^

I'm a little surprised this hasn't caused any trouble, but if there are no associated build errors, then we can wait to fix it until the patch in bug #422960 is committed.

Comment 44

9 years ago
This issue still breaks Firefox 3.0 on Solaris. ask for blocker 1.9.
Flags: blocking1.9?

Comment 45

9 years ago
(In reply to comment #44)
> This issue still breaks Firefox 3.0 on Solaris. ask for blocker 1.9.

What is the current problem you are referring to?  If comment #41 describes the issue at hand, perhaps there is a simple solution, such as passing an appropriate RPATH-related flag during linking.

Comment 46

9 years ago
Alfred does comment 45 work for you?
Flags: blocking1.9? → blocking1.9-
Whiteboard: [not needed for 1.9]

Comment 47

9 years ago
Jason, the Solaris tinerboxes(http://tinderbox.mozilla.org/Firefox-Ports/) are still broken as comment #41 mentioned. Ginn, any comments for Jason's suggestion?
(Assignee)

Comment 48

9 years ago
Jason, I think RPATH doesn't work for symbolic links to another directories.

Updated

9 years ago
Duplicate of this bug: 433608
Comment on attachment 319960 [details] [diff] [review]
fix linkage on Solaris

Could we do this centrally, in config.mk or rules.mk? It sucks to have that block duplicated in every binary's Makefile.
(Assignee)

Comment 51

9 years ago
Ted, I tried, but then we need an exception for xpidl, as I mentioned in comment #41.

Also I think it's necessary to add -ljemalloc for mozjs any way, because it uses posix_memalign().
(Assignee)

Comment 52

9 years ago
Created attachment 322490 [details] [diff] [review]
alternative patch for linkage
[Checkin: Comment 59]

Use a black list rather than a white list.
Is it look better?
Attachment #322490 - Flags: review?(ted.mielczarek)
Comment on attachment 322490 [details] [diff] [review]
alternative patch for linkage
[Checkin: Comment 59]

Yeah, I like that better.
Attachment #322490 - Flags: review?(ted.mielczarek) → review+
Comment on attachment 319960 [details] [diff] [review]
fix linkage on Solaris

Use the other patch.
Attachment #319960 - Flags: review?(ted.mielczarek) → review-

Updated

9 years ago
Whiteboard: [not needed for 1.9] → [not needed for 1.9?][RC2?][has patch][has ted review]

Comment 55

9 years ago
Ted we should take this on trunk
Was that a question or directive?
(Assignee)

Comment 57

9 years ago
It doesn't affect any platform other than Solaris.
It would be better to commit it on 1.9.0 branch.

Otherwise some people who use source tar ball may have trouble to build it on Solaris.

Attachment #322490 - Flags: approval1.9?
Comment on attachment 322490 [details] [diff] [review]
alternative patch for linkage
[Checkin: Comment 59]

Approved, please land by noon PDT to make Firefox 3.0.
Attachment #322490 - Flags: approval1.9? → approval1.9+
Whiteboard: [not needed for 1.9?][RC2?][has patch][has ted review] → [not needed for 1.9?][RC2+][has patch][has ted review]
mozilla/browser/app/Makefile.in 	1.156
mozilla/config/rules.mk 	3.595
mozilla/js/src/Makefile.in 	3.125
mozilla/xpcom/typelib/xpidl/Makefile.in 	1.45
mozilla/xpcom/typelib/xpt/tools/Makefile.in 	1.33
Status: REOPENED → RESOLVED
Last Resolved: 9 years ago9 years ago
Resolution: --- → FIXED
Whiteboard: [not needed for 1.9?][RC2+][has patch][has ted review] → [RC2+]
Target Milestone: --- → mozilla1.9
Attachment #312959 - Attachment description: patch v4b → patch v4b [Checkin: Comment 30]
Attachment #319569 - Attachment description: Remove alloca() call (and #define) → Remove alloca() call (and #define) [Checkin: Comment 39]
Attachment #322490 - Attachment description: alternative patch for linkage → alternative patch for linkage [Checkin: Comment 59]
Attachment #319960 - Attachment is obsolete: true

Updated

9 years ago
Blocks: 446302

Comment 60

9 years ago
re comment 43, bug 446302 reports that this is a real issue. i don't understand why this was left unfixed....

Comment 61

9 years ago
(In reply to comment #60)
> re comment 43, bug 446302 reports that this is a real issue. i don't understand
> why this was left unfixed....

It was fixed in changeset 663c51189e98 on 20 June 2008, as per comment #5 in bug #442960.

Comment 62

9 years ago
it was not fixed in CVS. you have a responsibility to all branches you break, not just central.

Updated

7 years ago
Blocks: 586962
You need to log in before you can comment on or make changes to this bug.