Open Bug 1250998 Opened 4 years ago Updated 2 years ago

Add support for sized deallocation to SM

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

People

(Reporter: mwu, Assigned: mwu)

Details

Attachments

(2 files, 1 obsolete file)

Attached patch Support sized deallocation in SM (obsolete) — Splinter Review
jemalloc 4 supports sdallocx, which is a sized deallocation function. Most places that call js_free can easily determine the size of the thing that's being freed instead of having jemalloc figure it out. js_free_unsized is introduced for freeing things we can't easily find the size of. (usually C strings, though some parts of the JIT code do fancy things to pack a bunch of data into a single allocation) Since SM depends on MFBT, support for sized deallocation is also added to MFBT. I've only tested sized deallocation on standalone SM. This patch only adds the API - I have a separate patch for hooking up to je_sdallocx, but just for testing.
Out of curiosity, do you have any measurements of the performance improvement of calling sdallocx over free?
Some places where free() performance matters:

* Large strings - we could create a micro-benchmark that allocates thousands of different strings, then triggers a gc, and see if there's any difference.

* Vectors: malloc/free sometimes show up in profiles when we use a (temporary) Vector.
Comment on attachment 8723167 [details] [diff] [review]
Support sized deallocation in SM

Review of attachment 8723167 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/Utility.h
@@ +245,5 @@
>  }
>  
> +static inline void js_free(void* p, size_t s)
> +{
> +    free(p);

Would it be possible to assert that the size is correct in debug build?  I can forsee a lot of ways that can make it bitrot quickly.

Also, I would recommend to change all the current js_free by js_free_unsized first, and then make patches to fix the obvious and easy one.

I see that some data structure got extra fields added to them to keep the size of the allocation, where I wonder if this is a wise choice, as the allocator might be able to find the size in with a memory-efficient way if the data are packed by allocation sizes.
(In reply to Jan de Mooij [:jandem] from comment #2)
> Some places where free() performance matters:

No disagreement that free() performance matters, I was just interested to know how much sdallocx improves things.  That is, "free() is slow" has been the wisdom for a long time now and I was wondering if sdallocx basically fixes this, which would be awesome.
(In reply to Luke Wagner [:luke] from comment #4)
> No disagreement that free() performance matters, I was just interested to
> know how much sdallocx improves things.

Sure. I wasn't disagreeing with your comment, I just wanted to point out some hot free() calls worth benchmarking. I agree measuring this is important.

Also, if the motivation for this is purely performance, converting *all* of them seems suboptimal: 90% of them are probably not performance sensitive but explicitly passing the size does add some complexity.
For the most expensive things free() does, I doubt the difference actually matters. It's probably still worth getting numbers.
(In reply to Luke Wagner [:luke] from comment #1)
> Out of curiosity, do you have any measurements of the performance
> improvement of calling sdallocx over free?

Haven't measured anything yet.

(In reply to Nicolas B. Pierron [:nbp] from comment #3)
> Would it be possible to assert that the size is correct in debug build?  I
> can forsee a lot of ways that can make it bitrot quickly.
> 

Yeah, that's a problem. Problems don't show up unless you actually hook js_free to sdallocx. The places where we can actually easily assert the size is correct are the places where we don't need to.

> Also, I would recommend to change all the current js_free by js_free_unsized
> first, and then make patches to fix the obvious and easy one.
> 

There is a patch and it works and passes tests when hooked into sdallocx, so this advice is a bit late. :)

> I see that some data structure got extra fields added to them to keep the
> size of the allocation, where I wonder if this is a wise choice, as the
> allocator might be able to find the size in with a memory-efficient way if
> the data are packed by allocation sizes.

Yeah probably need to take another look at all the data structures that were changed and see if things are worth it. I remember that most of the structures I added fields to were large to begin with though..

(In reply to Jan de Mooij [:jandem] from comment #5)
> Also, if the motivation for this is purely performance, converting *all* of
> them seems suboptimal: 90% of them are probably not performance sensitive
> but explicitly passing the size does add some complexity.

In most cases, we get sized dealloc for "free" when the code knows the exact type that's being freed. For explicit js_free calls, I considered the complexity before converting, but there's probably some cases worth converting to js_free_unsized instead.
Attachment #8723167 - Attachment is obsolete: true
Just for testing purposes. Doing this right might require some attention to whether sdallocx has a prefix on the platform. Also not sure if this works with the way we use jemalloc on OSX..
https://github.com/jemalloc/jemalloc/commit/4cfe55166e0173be745c53adb0fecf50d11d1227 says that they have a debug assert for the correct size already in place and that they saw a 10% performance win on their micro benchmark.
I was thinking about verifying the correct size when sdallocx isn't hooked up. Things might break between landing support and turning it on. (which depends on enabling jemalloc 4)
FWIW, this would also be useful for an idea I'm working on to speed up allocation of small arrays by using the GC heap instead of jemalloc directly - though that idea still has some problems to work out, so it might not come to fruition.
You need to log in before you can comment on or make changes to this bug.