75620 - Change clients that link to xpcom from using nsMemory::*

Reporter

Description

•

24 years ago

Callers nsMemory::* #include "nsMemory.h", which contains method declarations only. So at runtime, everytime a string is allocated or freed, this then calls into nsMemory (defined in nsMemoryImpl.cpp), which gets a global nsIMemory service, and then performs a virtual function call on that, which resolves back to class nsMemoryImpl (defined in nsMemoryImpl.cpp), which then calls PR_Free. Strings also go through a templated string allocator which wraps nsMemory::Free via a couple of virtual methods as well. Fun. (I can guess that this is designed to allow memory pools, or something) All of these just boil down into PR_Free, which is another (out of line) function which just calls free (except, of course, #if defined (WIN16)). This happens for almost every allocation/deallocation of memory in mozilla. (There is an nsCRT::free(char*) method, but that just calls down to PL_strfree, which again just calls free) nsIMemory is used from JS, but thats only for one of the debug menu memory options. I believe that this is overkill. :)

Peter "jag" Annema

Comment 1

•

24 years ago

So to keep the idl, and to please performance, we could use something like: %{C++ #if 0 %} idl fu %{C++ #endif inline implementations in c++ %} Or is this sacrilege?

Doug Turner (:dougt)

Comment 2

•

23 years ago

dp, is this something you or your team can get data on?

Assignee: kandrot → dp

Chad Austin

Comment 3

•

23 years ago

Attached patch patch for testing purposes (diff -u) — Details — Splinter Review

Chad Austin

Comment 4

•

23 years ago

If anyone wants to do some perftests (the only real way to see if this is faster)...

Chad Austin

Updated

•

23 years ago

Keywords: perf

Suresh Duddi (gone)

Assignee

Comment 5

•

23 years ago

Reassigning to dougt, the new owner of xpcom. This is a good thing to do.

Assignee: dp → dougt

Doug Turner (:dougt)

Updated

•

23 years ago

Blocks: 98275

Sebastian Späth

Updated

•

23 years ago

Blocks: 71874

Suresh Duddi (gone)

Assignee

Comment 6

•

23 years ago

Comment on attachment 47375 [details] [diff] [review] patch for testing purposes (diff -u) r=dp

Attachment #47375 - Flags: review+

Dan Mosedale (:dmosedale, :dmose)

Comment 7

•

23 years ago

The memory pressure observers are nice things to have, and that's implemented in nsMemory, iirc. Undoubtedly scc can speak more to the string allocators.

Doug Turner (:dougt)

Comment 8

•

23 years ago

running an internal page loader test (cowtools), here are the raw results: Current nsMemory implemention: Test id: 3BB036B00B Avg. Median : 823 msec Minimum : 271 msec Average : 1728 msec Maximum : 6164 msec After change: Test id: 3BB039A325 Avg. Median : 767 msec Minimum : 271 msec Average : 851 msec Maximum : 4137 msec It does nothing to hurt performance. I need to understand what the benefit of use PR_Malloc is over just malloc. Should we just call it a day and use PR_foo, or do we really want to call directly to the OS? Also, do we really need to test the result of every memory allocation call for failure so that we can FlushMemory? Seams to me, if we get to the point of malloc's failing, we are way too late.

Bradley Baetz (:bbaetz)

Comment 9

•

23 years ago

This *halves* our average? dougt, can you run those tests again? Since we don't support win16 (does NSPR still support it?), PR_Malloc == malloc, except that the compiler can assume things about malloc (the results from two calls to malloc don't alias). Also, what about declaring PR_malloc (and nsMemory::Alloc, if possible) __attribute__ ((__malloc__)) under gcc: `malloc' The `malloc' attribute is used to tell the compiler that a function may be treated as if it were the malloc function. The compiler assumes that calls to malloc result in a pointers that cannot alias anything. This will often improve optimization. This is only valid for gcc >= 2.9.6 though.

Brendan Eich [:brendan]

Comment 10

•

23 years ago

dougt, any chance that your second (w/ patch) run was hitting the cached pages loaded by the first run? Given threadsafe malloc, as dbaron points out, any trivial (few to dozen) cycle layers on top should be noise. But I'm all for cutting out layers. In the JS engine, malloc is used directly except where JS_malloc adds value by reporting an out of memory condition before returning NULL, using JS's error/exception reporting machinery. I don't think every place that *call* malloc must check for null and call a memory pressure API. Rather, the theory I recall (warren's, based on wade hennesy's idea) was to call the the flushers from a background thread, when we noticed pressure (but before it got so high that malloc was likely to fail). waterson knows all! There may have been (my memory is failing me) a related idea that subsystems should flush when successfully allocating, when pressure was rising, so as to avoid failing subsequently. I forget the details of that was supposed to work. /be

Doug Turner (:dougt)

Comment 11

•

23 years ago

cached must have been primed as I see results very close to running without this patch. As long as the cost of two function class is cheap relative to the malloc call, this patch will no buy us much. brendan, check out the current nsMemoryImpl class. EVERY time you alloc something, it checks that the result against null. If the result of the allocation is null, we call FlushMemory. What is kinda funny is FlushMemory does allocate memory. So, if alloc failed, and the value requested is equal to or greater than what FlushMemory requires, there will be a nice crash. Maybe we should just rip out these checks altogether?

Doug Turner (:dougt)

Updated

•

23 years ago

Target Milestone: --- → mozilla1.0

Cathleen

Updated

•

23 years ago

Blocks: 92580

Bradley Baetz (:bbaetz)

Updated

•

23 years ago

Blocks: deCOM

Cathleen

Comment 12

•

23 years ago

doug, can we try to fix this for soon, by 0.9.7?

Target Milestone: mozilla1.0 → mozilla0.9.7

Brendan Eich [:brendan]

Comment 13

•

23 years ago

Can we get apples-to-apples page-load results (cached for with and without this patch, and uncached too, or averaged together in the same way for both trials)? /be

Doug Turner (:dougt)

Comment 14

•

23 years ago

dp, I think that this is a dup of a bug that you have been working on. If so, can you dup this one?

Suresh Duddi (gone)

Assignee

Comment 15

•

23 years ago

Yeah. I think the right thing is to make my bug a dup of this one. I will do that.

Suresh Duddi (gone)

Assignee

Comment 16

•

23 years ago

*** Bug 105501 has been marked as a duplicate of this bug. ***

Doug Turner (:dougt)

Comment 17

•

23 years ago

nsMemory is a static class that should move into the xpcom glue library - as is. Any optimization to this class to call directly into malloc is a bad thing to do since the purpose of this class is to allow plugin/component authors easy access to global allocation. If clients are already linking to xpcom, then they should just be using plain malloc/free anyways. The optimizaiton needed here is to make callers call the right allocator. DP, is it worth scrubbing the callers in the mozilla sourcebase to use the vanilla allocator? In the other bug you said that it was 0.4 percent of startup, but noise possibly accounts for some of that.

Doug Turner (:dougt)

Updated

•

23 years ago

Summary: calls to nsMemory::* go through too much indirection. → Change clients that link to xpcom from using nsMemory::*

Doug Turner (:dougt)

Comment 18

•

23 years ago

Rick has some objections to changes any of the callers. Rick, can you doc your concern?

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 19

•

23 years ago

Why not: * change nsMemory to be a static class with inlined calls to malloc and free * change nsMemoryImpl to call the nsMemory inlined malloc/free, rather than the other way around * Give the class in xpcom/glue (which would just be a simple way to maintain a per-library pointer to the global nsMemoryImpl, the way nsMemory does now) a new name other than nsMemory, since it would be for clients of xpcom that don't link against xpcom, as opposed to nsMemory, which is for those who do link against xpcom. (I still don't see why we need to care about pluggable allocators, though. We're nowhere near consistent enough.)

Mike Shaver (:shaver emeritus)

Comment 20

•

23 years ago

I _really_ don't see why we care about pluggable allocators for 1.0, for which we know we need to get speed and space improvements from all over, and for which we also know -- I think -- that we're not going to be able to support pluggable allocators. XPCOM 2.0, maybe?

Bradley Baetz (:bbaetz)

Comment 21

•

23 years ago

Note that we can't call malloc/free directly - we have to use static functions in another dll. The reason for this (which I didn't know when I filed this bug) is that on windows, debug dlls use a different mscrt dll to non-debug dlls. And malloc/free in these apparently use different heaps. So if an object is created in one dll, with a malloc'd member variable, and destroyed in one with different debug settings, Bad Things will happen. Or so I've been told, by several people. I'd appreciate it if a windows person could confirm this, though. ISTR seeing a comment in the source somewhere about this, too, but I can't find it now.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 22

•

23 years ago

Then Windows builds can pay the cost of those static functions and let the rest of us not pay that cost and can use inlined nsMemory methods. That's an advantage of encapsulating things in nsMemory.

Doug Turner (:dougt)

Comment 23

•

23 years ago

it is not so much we want to replace a allocator globally. We want to allow embedders the ablity to allocate and free into the mozilla environment. If they use their free() on a allocated buffer that mozilla created, this *will* cause heap corruption if their free differs from the free() that xpcom uses. Rick also commented in the other bug - the solution to this our problem is not to speed up free(), but to stop calling it so damn much. I do like what dbaron suggested. This solves all the problems raised. Comments on that approach? Also see 112262

rpotts (gone)

Comment 24

•

23 years ago

OK. i am really fed up with all of this CRAP about why calling nsMemory::Free() is so slow and how we should make it inline to speed mozilla up!!! will someone please grow a brain and realize that malloc/free are SLOW and removing a function call is NOT going to make a huge difference one way or the other! the problem is that we are calling free() too many times! that's the issue that we need to address! now how to make the function inline!! -- rick

Mike Shaver (:shaver emeritus)

Comment 25

•

23 years ago

Well, I can't grow a brain, but maybe you can spare a cutting of your throbbing cranium for a graft? Nobody, that I can see, is saying that inlining nsMemory::* will cure all of our performance woes, make us coffee and wash dmose's feet. What we _are_ saying is that memory management consumes a substantial amount of our CPU time, and that inlining nsMemory::* will be a relatively easy way to recover some of that, across the width of the application. How do you propose to call free() less? Leaking more, if it turns out to be possible, isn't likely to help. Do you have a better concrete means of partially addressing the cost of our memory consumption (which is, and again nobody disputes this, way too large) than that proposed here?

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 26

•

23 years ago

My jprof data show that of time spent within nsMemory::Free in page loading: - 8% is in nsMemory::Free - 3% is in nsMemoryImpl::Free - 7% is in PR_Free - the rest actually within free I'm not inclined to believe that these numbers are accurate, but they're at least suggestive that there's some real cost here. virtual function calls are expensive, especially on heavily pipelined processors, since an instruction cache miss can cost hundreds of cycles (not just the 20 or so cycles from a pipeline halt that one will get with every virtual function call). It's also not clear how much not using |malloc| and |free| hurts our compilers' ability to optimize due to the assumptions malloc and free allow the compiler to make about non-aliasing (could it affect optimization of other codepaths that don't go through malloc and free?). Finally, I think it would be easier to just try it and run some page load tests in a well-controlled environment than to get better numbers using profiling. After all, we allocate too much, so it could be an easy performance win. Fixing our wider too-much-allocation problems probably requires both copy-on-write strings and switching code over to interfaces that use AString rather than "return"ing (allocated and copied) strings. Judging by quick glances at page load profiles, a vast majority (around 99% - to be exact, 156 out of the 157 timer hits in nsMemory::Alloc in a page loading profile I have) of allocations that go through nsMemory::Alloc / nsMemory::Free are string buffers, many of them string objects' internal buffers. (We spend more time in new and delete, but those aren't string buffers.) Are we willing to abandon our frozen or quasi-frozen interfaces that require such copying? Of course, another solution would be to change the string code that's calling nsMemory::Alloc and nsMemory::Free to call malloc and free directly, and then we'd just avoid the problem, since by far the largest consumer wouldn't get the extra ~20% (again, that's just a guess from jprof) overhead around malloc and free.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 27

•

23 years ago

> Do you have a better concrete means of > partially addressing the cost of our memory consumption (which is, and again > nobody disputes this, way too large) than that proposed here? Our memory consumption and the performance cost of calling malloc and free are two very different problems. Much of the latter is incurred for very short-lived allocations that have very little influence on the former. (Within page load, allocation of content-model data structures (e.g., text nodes) stands out as an exception.)

Mike Shaver (:shaver emeritus)

Comment 28

•

23 years ago

>> Do you have a better concrete means of >> partially addressing the cost of our memory consumption (which is, and again >> nobody disputes this, way too large) than that proposed here? > > Our memory consumption and the performance cost of calling malloc and free are > two very different problems. Much of the latter is incurred for very > short-lived allocations that have very little influence on the former. Our memory consumption is a two-faceted problem: we use too much memory, which bloats our heap, and we consume it too often, which hogs the CPU. Those two problems (space and time) are very different, and I never claimed otherwise, but they're both costs of our bad memory habits.

rpotts (gone)

Comment 29

•

23 years ago

ok... i've taken my prozac and the world is looking a little rosier now :-) i *really* do not think that inlining the calls to malloc/free on *any* client code is a good idea because it makes it *very* difficult for embedding applications to use their own memory managers. if we decide to call malloc/free in the static XPCOM glue code, then we are subtly forcing embeddors to use the same MM as us (ie. MSCRT). I believe that this is unacceptable for many applications. the data that i collected about startup memory memory allocations indicated that for the first 40K allocations (that's all i botherer to track) most of them were held onto for very short periods of time!! most of these temporary allocations were caused by temporary strings as out parameters... so, if we are *really* worried about the overhead of free() during startup i think that it would be better to identify the (hopefully small number) of functions that are generating these string allocations and consider changing them to use nsAString (or some other strategy)... -- rick

Mike Shaver (:shaver emeritus)

Comment 30

•

23 years ago

Embedding applications that don't want to have to use the CRT's malloc are probably going to have to define their own malloc symbol, because it's used all over just about every library in the world. And if they have their own malloc symbol, won't we pick that up? I was pretty sure I've read pages talking about how to get the MSCRT to use a custom allocator, underneath its own malloc, but I can't find them right now. I'll look again later. I agree that nsAString is probably a win too, but it's almost certainly more work than this change.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 31

•

23 years ago

If we want embedders to be able to use pluggable allocators without overriding malloc and free themselves, we'd move to a world where an XPCOM string getter couldn't just fill in an out param with PR_smprintf or PL_strdup. Instead it would have to use nsMemory::Alloc (perhaps via ToNewCString and nsDependentString or through nsCRT::strdup, if nsCRT::strdup actually called nsMemory::Alloc, which I don't think it does right now), copy the string into the out buffer and fill in the out param, and call PL_strfree or PL_smprintf_free. Do we really want to pay the cost of that consistency for pluggable allocators. (We can't even keep g_free and g_malloc vs. our own allocators straight, and we only deal with those in a handful of files -- see bug 95599.) I'd argue that we wouldn't want to deal with the cost of all that allocation. Or are we willing to abandon passing ownership of allocated memory from C to C++ (or come up with an even uglier solution)?

rpotts (gone)

Comment 32

•

23 years ago

i'm not talking about embeddors that want to use custom allocators without overriding malloc/free... i'm talking about embeddors that *have* done this work. my concern is that if nsMemory:: in the glue code calls directly into malloc, then if an embeddor has overridden it we could end up corrupting their heap. when we allocate memory within mozilla we always use MSCRT... but suppose an embeddor wanted to use say SmartHeap... they would have to create *special purpose* allocation/free routines which called MSCRT *just* to deal with freeing the blocks of memory that we handed out... that's why i strongly feel that the nsMemory glue code *must* call through nsImemory...

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 33

•

23 years ago

> that's why i strongly feel that the nsMemory glue code *must* call through > nsImemory... In my proposal above the nsMemory glue code wouldn't be used by embedders -- it would be used only by those who link against XPCOM. Furthermore, why is it so important for it to go through nsIMemory if the nsIMemory implementation will always call malloc/free (which is what it seems you're saying above -- that an embedder would change their allocator but not ours, and would need to use ours explicitly in some places).

Doug Turner (:dougt)

Comment 34

•

23 years ago

moving out a milestone.

Target Milestone: mozilla0.9.7 → mozilla0.9.8

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 35

•

23 years ago

Well, the checkin of bug 112262 makes my suggestion in comment #19 a lot harder to implement now.

Doug Turner (:dougt)

Comment 36

•

23 years ago

we would need to rename 'xpcom linked' nsMemory users.

Mike Shaver (:shaver emeritus)

Comment 37

•

23 years ago

> i *really* do not think that inlining the calls to malloc/free on *any* client > code is a good idea because it makes it *very* difficult for embedding > applications to use their own memory managers. As per Brendan's comment #10, the JS engine calls down into malloc(3), and is unlikely to change that behaviour. So embedders that can't use malloc(3) are going to need to at the least re#define malloc to point at their own allocator. At that point, they can rebuild our tree with minimal additional pain. (Avantgo uses spidermonkey on the Palm, which to my knowledge doesn't have a malloc(3), so there are existance proofs of the viability of this approach.) If we just need a way for embedders and XPCOM components to use the same allocator we use, then let's document malloc(3)/free(3) as that allocator, and forget this whole nsMemory thing, no? (I'm a little ticked that the glue-code stuff went ahead, without responses to dbaron's counterindications in this bug, but I guess we can back it out later if we need to?)

Doug Turner (:dougt)

Comment 38

•

23 years ago

Maybe the avocates of removing nsIMemory and nsMemory are missing the point. 1. Currently client replacement of the allocator via nsIMemory is NOT SUPPORTED. It may be at some point. 2. When you embed gecko, there are interfaces which pass buffers around. These buffers need to be allocated and freed from the same heap. We provide a interface to access the same heap. 3. Some day, the nsIMemory implemention may be better than malloc. 4. Why does everyone call into nsIMemory? We could have asked everyone to use malloc/free but history has proven that this does not work. Ask potts for the number of headache he gotten from chasing heap corruption down caused by mismatch allocations/deallocations. 5. This change was requested buy the mozilla embedding and plugin teams. dbaron's and dp's comments about speeding up startup if we inline free() don't fly with me (the data is not conclusive and imo it is the wrong approach). shaver, don't be ticked. It is Friday :-)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 39

•

23 years ago

I'm not complaining about that interface. I'm complaining that all of our internal code that's linked against XPCOM has to go through it. Now that the name that we're using throughout all our code is the name of what you put in xpcom/glue, we either have to change all the Mozilla code or change the name of what you put in xpcom/glue if we want to fix this.

Mike Shaver (:shaver emeritus)

Comment 40

•

23 years ago

If you want to mix allocators, you're going to get heap corruption. Just like you get corruption if people mix different system allocators, you'll get corruption if people mix nsMemory:: with the wrong system allocator. So don't mix allocators. Just don't. I don't see why using the term "nsMemory" rather than "free" makes the problem go away, or even mutate interestingly. The history I have -- 5 years of the JS engine being used by hundreds of embedders and extenders all over the place -- says that using malloc and free works great. If people are mixing up allocators, they will lose. People who can't use system malloc(3) are already faced with the effort of making their other system-and-component libraries replace their malloc(3) calls with something that points to their custom allocator. Many environments (Palm, for example) have existing bodies of work that make that quite a reasonable task. Do all Windows applications have to use this extra layer of indirection? Did 4.x have plugin troubles when people didn't use NS_Alloc or whatever? Maybe I just need to read more about Windows applications and heap management. I'll go and read up this weekend, so I don't look like an idiot. But what sort of data would be conclusive? What problems are there with dbaron's data? (I think this sucks just as much on a Friday as on any other day. =/)

Doug Turner (:dougt)

Comment 41

•

23 years ago

i will be the first to back this out and stript mozilla of nsMemory all together if we can get complete agreement from all parties: rick potts, patrick beard, comment please. (I think this sucks less on fridays since I am usually drinking. :-) )

Patrick C. Beard

Comment 42

•

23 years ago

On the Mac, we always have had to use the browser's allocator within plugins. There is no standard *FAST* allocator in the classic Mac OS. malloc just doesn't exist. We don't need nsMemory, but we do need nsIMemory. Static linking is bad.

Brendan Eich [:brendan]

Comment 43

•

23 years ago

dougt, I'm gonna wrassle with ya: >1. Currently client replacement of the allocator via nsIMemory is NOT >SUPPORTED. It may be at some point. And monkeys might fly outta my butt. In the order of things to do for 1.0 and any milestone within sight, this is so far over the horizon that I bet donuts it won't ever happen. Are you citing it as a "might be nice" argument? That's not compelling. "Might be nice" is too often a warning sound. >2. When you embed gecko, there are interfaces which pass buffers around. >These buffers need to be allocated and freed from the same heap. We provide a >interface to access the same heap. Good for embedders, but no reason to avoid doing what dbaron suggested. >3. Some day, the nsIMemory implemention may be better than malloc. Here I'll bet more than donuts. If we put in a better allocator, it will be called malloc, because of 1, above (too many naked malloc refs in our code). Better malloc is better, period. If the embedder can't use it, then the embedder will have to rebind our malloc refs, too -- or do some kind of prelinking of our code against the better malloc, and the embedding code against the other malloc. In the latter case, nsIMemory helps the embedder, but it only adds an unnecessary layer in our code. >4. Why does everyone call into nsIMemory? We could have asked everyone to use >malloc/free but history has proven that this does not work. Ask potts for the >number of headache he gotten from chasing heap corruption down caused by >mismatch allocations/deallocations. Everyone does not call nsIMemory. For example, nsCRT::strdup calls PL_strdup, which calls malloc -- and too many XPCOM methods use nsCRT::strdup still (see http://lxr.mozilla.org/mozilla/search?string=nsCRT%3A%3Astrdup). >5. This change was requested buy the mozilla embedding and plugin teams. >dbaron's and dp's comments about speeding up startup if we inline free() don't >fly with me (the data is not conclusive and imo it is the wrong approach). What does the first sentence have to do with the second? You need to provide some numbers as suggestive as dbaron's, but in the opposite sense, to dismiss his data (which he himself has already qualified appropriately). The change being requested by embedding and plugin teams doesn't cut any ice against dbaron's profiling results -- it's a different universe of discourse, it is not technical data. Of course we need to cut down on malloc/free calls (I'm to blame for our simple COM-ish string/wstring-map-to-char-arrays decision as much as anyone -- warren will never forgive me!). But we also need to cut out gratuitous layering, which blows modern, heavily pipelined CPUs' caches. We don't need to field- upgrade the malloc used by every part of Mozilla for 1.0 or any milestone being planned, so I do not think dbaron's suggestion should be dismissed or opposed by your code change in bug 112262, without all of us first agreeing about what we are trying to do here, and how. /be

Darin Fisher

Comment 44

•

23 years ago

LONG-WINDED PREFACE: nsMemory::Alloc currently maps to the system malloc. all of the allocations mozilla generates result from a call to the system malloc. mozilla does not implement the system malloc. the system malloc can change and mozilla would still work just fine, provided the runtime linker can find a malloc for mozilla when it's libraries are loaded. what we need to do is ensure that people building apps against mozilla always link their apps such that mozilla (as well as the embedder's app) pick up the one (and hopefully only) global malloc symbol. normally, this will just be in libc (or MSCRT) and everything will just happily work. of course this becomes difficult if the embedder #define's malloc to something else... like my_own_private_malloc. then, we've got the potential for mixing up allocators, and the nightmare ensues. IMO the solution to this problem is to make it very clear to embedders that they must use the system malloc/free when interacting with mozilla because this is the malloc/free that mozilla is going to use when it hands back strings and what-not. maybe i'm trivializing things here, but i see no reason why this can't be solved with proper documentation. if an embedder is using a special malloc for their own stuff, then they should be careful to isolate the code that talks to mozilla in such a way that they can ensure that the global malloc/free is always used. of course, on windows there's the problem of debug MSCRT vs. optimized MSCRT, which probably means on windows, we need to provide xpcom (glue or otherwise) entry points to the malloc/free that mozilla likes. these entry points MUST NOT be inlined for the embedder because that would defeat the purpose. they would have to be implemented inside the library so that they would _want_ the same malloc/free as the rest of mozilla. this is IMO nsMemory::Alloc, nsMemory::Free, etc. they should be implemented in a library that we build such that they _want_ to the same malloc/free as the rest of mozilla. so, i guess what i'm saying is that nsMemory::Alloc need not be implemented as a call to some nsIMemory implementation. the implementation of nsMemory::Alloc should just call malloc. if there are memory pressure things that we want to do, we can still do them. for example, if malloc fails, we could talk to some interface to handle the memory pressure (failure) condition. also, i'm not advocating getting rid of nsIMemory. specific components may like to abstract the memory allocator (eg. nsIPipe -- which is particularly significant to me since i am working on a specialized allocator for necko that is optimized for 4k allocations only). CANDIDATE SOLUTION: i think what's most important is that we make it possible for mozilla libraries to be built such that nsMemory::Alloc turns into an inlined malloc (why? because we can, and anything else is just overkill). for embedders, we need a different solution. the implementation could be controlled with a preprocessor directive (much like -D__KERNEL__ in linux land). for embedders that want to link to xpcom.dll, nsMemory::Alloc should be a DLL export implemented as a call to malloc. for embedders that don't want to link to xpcom.dll, nsMemory::Alloc should be a call to a nsIMemory implementation just as it currently is. any problems with this solution?

rpotts (gone)

Comment 45

•

23 years ago

ok... if everyone is hell-bent on making mozilla lightening fast by inlining all calls to free, then far be it from me to stand in the way of genius :-) my only line in the sand is to preserve the existing use of nsMemory by clients outside of the mozilla codebase. ever since we created the static nsMemory class in XPCOM we have actively told external developers to call it! i can live with dbaron's proposal to create a static class which calls free directly -- i think that it it totally unnecessary and harmful because it diverts attention from the real problem which is the frequency with which we call free not the two extra function calls we make in the process... but i cannot live with the external, public interface being 're-invented' in such a way that embeddors can no longer use it... i'm really shocked that brendan, of all people, is being so capricious about changing a longstanding, public interface for no good reason. if people feel strongly that calls to free() must be made faster, then by all means, create a new static class, say nsMonkeysFlyOutOfBrendansButt, which has inline versions of alloc/free and fix up everyplace in our codebase that currently calls nsMemory::Free(...) to call nsMonkeysFlyOutOfBrendansButt::Free() but *do not* make the existing nsMemory class unusable by external consumers!! in the end, does anyone honestly expect to see a measurable performance improvement by these changes that merits the amount of time (and emotion) that has *already* been invested in this discussion? -- rick

Darin Fisher

Comment 46

•

23 years ago

i like dbaron's suggestion (comment #19) except for the third part. i agree with rick in that we should not try to change what API embedders use. in fact, i don't see why it wouldn't be possible to allow everyone to call nsMemory:: methods. as i said previously, why can't there be multiple versions of the nsMemory:: methods. depending on some #define(s), you'd get different version(s). for example, there could be some #define that is used when building mozilla... say MOZ_INLINE_NSMEMORY that would cause all of our mozilla components to build against an inline version of the nsMemory:: methods. without this #define, you'd just get the plain old nsMemory:: impl that we currently have. this seems like a very minimal and sufficient solution. or, am i missing something?

Mike Shaver (:shaver emeritus)

Comment 47

•

23 years ago

> ok... if everyone is hell-bent on making mozilla lightening fast by > inlining all calls to free, then far be it from me to stand in the way > of genius Everyone here who thinks that this, on its own, will make Mozilla lightning-fast, please raise your hand. We don't need any more strawmen here, thanks; the actual issues are contentious enough. I don't see an @FROZEN in nsMemory.h (though I do see one in nsIMemory.idl; but we're not discussing changing the binary interface there, just not using it anymore). Who's been recommending unfrozen interfaces to embedding folk _again_? Haven't we been burned by that enough? Who has a released-to-the-world component that will survive all the other freezing-driven changes but will be burned by this? > i'm really shocked that brendan, of all people, is being so capricious about > changing a longstanding, public interface for no good reason. (Brendan, quite obviously, thinks that there is good reason to do this, as do dbaron and I. If you don't, that's fine, but don't cons up inconsistencies by projecting your premise onto other people's reasoning. This isn't some wanky nsAllocator->nsMemory renaming.) I don't think anyone's attention will be diverted from the frequency of malloc/free calls by the reduction in cost of those calls -- unless it actually takes the malloc/free patterns out of the profiler's crosshairs, in which case we shall all rejoice. But anyone who thinks that we're not going to need lots of these sorts of maybe-a-percent wins to get in shape, even if some huge 10% wins fly out of someone's butt, is, deluding themselves, IMO. We are _always_ going to have to allocate and free lots of memory as it crosses XPCOM interface boundaries, and I don't foresee us shedding the over-componentization albatross by 1.0, by a long shot. And I don't care at all about sunk-cost economics arguments around how we shouldn't bother to do this, because it won't be enough gain to have justified the emotional energy already put into it. (Surely getting _no_ gain by walking away now is even worse, by that logic, not that I'm especially fond of this variety of sunk-costing either.) As far as measurable goes, I think dbaron has already given us _measurements_, with responsible qualification and description, of the kind of gain we're looking at here. If you think that more time should be spent making us allocate/free less, then why are you spending that time posting to this bug? =)

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 48

•

23 years ago

> for example, there could be some #define that is used when building mozilla... > say MOZ_INLINE_NSMEMORY that would cause all of our mozilla components to build > against an inline version of the nsMemory:: methods. without this #define, > you'd just get the plain old nsMemory:: impl that we currently have. In some cases (the ones that provide the rationale for nsIMemory), an embedder who does not link against XPCOM needs to use an nsMemory class that maintains a pointer to an nsIMemory object so that the calls to malloc and free are made in another library. At least, I think that's the case if I understand the Windows debug/opt issue. As of this afternoon, nsMemory is in xpcom/glue, which is meant to be statically linked in to any library that doesn't want to link against XPCOM. However, we don't want all the code in our browser to have to go through the extra level of (virtual function) indirection. |#define nsMemory nsMemoryForMozillaItself| (or whatever) seems a bit ugly to me, so I'd rather the callers who don't link against XPCOM be the ones who deal with a new name. After all, nsMemory has only been available to those callers since this afternoon, so there shouldn't be huge compatibility issues to deal with.

Darin Fisher

Comment 49

•

23 years ago

dbaron: good point. today, consumers of nsMemory:: are implicitly linked to xpcom.dll, and so there really isn't anyone depending on the nsIMemory indirection. however, making the nsMemory:: impl's inline is likely to cause problems for embedders (even though this is not a frozen API). is that something we're OK with? do we have any idea of how many embedders would even care? like it or not, i think it is something we should consider. as a side note: why is it so important for embedders to be able to explicitly link to xpcom.dll? why does implicit linking not suffice?

Brendan Eich [:brendan]

Comment 50

•

23 years ago

>i'm really shocked that brendan, of all people, is being so capricious about >changing a longstanding, public interface for no good reason. Rick, I'm not advocating changing the public interface embedders have been told to use, but I'm sure I blew it by not saying so more clearly, when I was (a) defending the important part of dbaron's proposal ("inlining" calls within Mozilla code) and (b) wrassling with dougt over the lack of agreement before he took action. If it isn't clear by now, I'm with darin: keep embedders using a public, "indirect-able" API, but inline our internal calls that must always go to malloc-the-symbol-we-link-against. I guess that means (and this is what dbaron was trying to avoid) that our internal calls will have to be renamed. Or does it (see dbaron's comment #48)? Are embedders in fact using nsMemory, or only nsIMemory? I'm surprised by all the emotion too -- for my part, this is not a burning issue that we need to rush along, or fret over excessively compared to any other problem, and I hope people will believe me when I say that I'm not mad at anyone. /be

Doug Turner (:dougt)

Comment 51

•

23 years ago

the nsMemory class is use all over the place. Take a look at the galleon code as an example. Just an small example is this: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/galeon/galeon/src/mozilla/ExternalProtocolService.cpp?rev=HEAD&content-type=text/vnd.viewcvs-markup I have looked at many gecko embeded clients. all of them use this class.

Dan Mosedale (:dmosedale, :dmose)

Comment 52

•

23 years ago

Throwing out pluggable allocators seems to imply throwing out our only current hope for dealing with low-memory situations in any reasonable way (the memory flusher stuff). It seems unfortunate that folks are willing to do this without proposing some alternate solution to the low-memory issue.

Darin Fisher

Comment 53

•

23 years ago

dmose: from my comment #44: > so, i guess what i'm saying is that nsMemory::Alloc need not be implemented as a > call to some nsIMemory implementation. the implementation of nsMemory::Alloc > should just call malloc. if there are memory pressure things that we want to > do, we can still do them. for example, if malloc fails, we could talk to some > interface to handle the memory pressure (failure) condition. why isn't memory pressure handling orthogonal to memory allocation? i mean isn't it something done only when malloc fails?

Brendan Eich [:brendan]

Comment 54

•

23 years ago

As I said to dmose the other day, waiting till malloc fails usually means waiting till the app has chewed up all swap and thrased your system to its knees. That's too late. Anyway, it has nothing to do with *how* one calls the malloc-wrapper, which could still be inlined. dougt: well, more _de facto_ standardization of nsMemory.h by its use among embedders does make me want to preserve compatibility -- although if you feel so strongly, then shaver's comment #47 should guilt you into putting a @status FROZEN in nsMemory.h, no? So, if we want to support nsMemory usage among current embedders' codebases, then I think darin has the right solution: an #ifdef MOZILLA_CLIENT (JS uses this macro already, defining it for itself when mozilla/js/src is built as part of Mozilla-the-appsuite/platform) or similar test. /be

Doug Turner (:dougt)

Comment 55

•

23 years ago

things in xpcom/glue do not need to be "frozen". Embedders are expected to take snapshots of this code and link against that. It might help if I finish the document regarding this and post it to the newsgroup. :-) If someone wants to make this class inline when it is linked into xpcom, fine by me. Note that the class can not inlined for anything compiled into the glue library (namely mozilla/strings).

Patrick C. Beard

Comment 56

•

23 years ago

On Friday, December 7, 2001, at 08:58 PM, shaver@mozilla.org wrote: >I don't see an @FROZEN in nsMemory.h (though I do see one in >nsIMemory.idl; but we're not discussing changing the binary interface >there, just not using it anymore). Who's been recommending unfrozen >interfaces to embedding folk _again_? Haven't we been burned by that >enough? Who has a released-to-the-world component that will survive >all the other freezing-driven changes but will be burned by this? As long as you mean by "just not using it anymore", not using it internally to implement the nsMemory class, I have no problem with this statement. As long as we keep the nsIMemory interface useable via the service manager, plugin developers should be be unaffected.

Brendan Eich [:brendan]

Comment 57

•

23 years ago

dp, cathleen: do we need a spin-off bug, or should this bug be reassigned to one of you (dp :-) to finish fixing, by inlining conditionally at compile time, for "inside the embedded Mozilla" fast paths to malloc, etc.? /be

Suresh Duddi (gone)

Assignee

Comment 58

•

23 years ago

I wont mind fixing this just as wouldn't mind fixing it, I think. Doug why not squish this when you have the chance ? Reassign to me and I will take care of it.

Doug Turner (:dougt)

Comment 59

•

23 years ago

Over to dp.

Assignee: dougt → dp

Suresh Duddi (gone)

Assignee

Comment 60

•

23 years ago

I will hazard to summarize this: - Caller of nsMemory:: that link with xpcom changed to call malloc/free directly Intent: Save the two extra levels of call indirection that happens - Everything else stays exactly the same.

Status: NEW → ASSIGNED

Priority: -- → P4

Target Milestone: mozilla0.9.8 → mozilla0.9.9

Dan Mosedale (:dmosedale, :dmose)

Comment 61

•

23 years ago

How about just doing the following: * avoid the global service get and the virtual function call * avoid using PR_Free() under the hood * keep the existing methods in nsMemory so that the memory-pressure observer stuff stays intact. * inline the nsMemory::* methods

Suresh Duddi (gone)

Assignee

Comment 62

•

23 years ago

Moving past Mozilla 1.0

Target Milestone: mozilla0.9.9 → mozilla1.0.1

Suresh Duddi (gone)

Assignee

Updated

•

23 years ago

Target Milestone: mozilla1.0.1 → mozilla1.1

Doug Turner (:dougt)

Comment 63

•

22 years ago

dup of 153215. 153215 has a better patch, than is why this is being dup'ed of that. :-) *** This bug has been marked as a duplicate of 153215 ***

Status: ASSIGNED → RESOLVED

Closed: 22 years ago

Resolution: --- → DUPLICATE