Closed Bug 366446 (js-pluggable-malloc) Opened 18 years ago Closed 12 years ago

[PATCH] Pluggable memory allocation support for JS engine

Categories

(Core :: JavaScript Engine, enhancement)

x86
Linux
enhancement
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: peter, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.1) Gecko/20061205 Iceweasel/2.0.0.1 (Debian-2.0.0.1+dfsg-1)
Build Identifier: 

The following patch is a result of discussions on IRC with Mozilla developers and in house requirements to limit the memory usage of the JS engine when parsing documents.

In short, it allows all the memory allocations in the JS engine to be tracked to ensure that it does not use memory beyond specified bounds by allowing custom function pointers to the calling program's custom memory allocation functions.

This code has been tested to some degree (but not extensively due to #365048) and works as expected.  This code is very unlikely to be accepted into the Mozilla CVS due to:

 - its unhelpful modification of public APIs
 - its crazy and repeated use of function pointer checks making the resulting code rather unsatisfactory

Instead it is submitted in the hope that a more consistent and better solution can be found by the JS maintainers, and possible use by others.

This is against Mozilla HEAD CVS


Reproducible: Always
Attached patch Pluggable memory support (obsolete) — Splinter Review
Comment on attachment 250960 [details] [diff] [review]
Pluggable memory support

ignoring whether brendan would accept this patch...

> JS_PUBLIC_API(JSRuntime *)
>-JS_NewRuntime(uint32 maxbytes)
>+JS_NewRuntime(uint32 maxbytes, JSAllocator *allocator)

You can't change the arguments to a public API. You'd have to introduce new names, and make the old ones somehow cooperate with the new APIs.

>+    rt->allocator = allocator;
this is strange:
>+puts("set allocator here");

don't use tabs:
>+	if (cx->runtime->allocator)

>Index: js/src/jsapi.h

my earlier comment especially applies to all of the changes here.

Note that brendan has referred to JJS_GetStringChars lack of a JSContext* as an api botch. I'd think he'd rather fix it to have a JSContext than to give it a JSRuntime*.

I'm not sure that we over indent each struct to match the other structs, that seems unlikely.
>+struct JSAllocator {
>+    /* Custom allocation */
>+    void                *allocHookData;

>Index: js/src/jsemit.c
>-                if (!AddSpanDep(cx, cg, pc, pc2, off))
>+                if (off != 0 && !AddSpanDep(cx, cg, pc, pc2, off))

please explain this change.

>Index: js/src/jsfile.c
>-    file->path = strdup(filename);
>+    file->path = JS_strdup(filename);

How does this help? it doesn't have a cx or rt handy.

I also don't understand why you insist on passing a runtime in addition to a cx. seems kinda silly.
I guess this relates to not knowing if you'll have a cx....

>-            bytes = js_DeflateString(NULL, JSSTRING_CHARS(str),
>+            bytes = js_DeflateString(rt, NULL, JSSTRING_CHARS(str),
>                                            JSSTRING_LENGTH(str));

not you fault, but it appears this argument doesn't line up ...

and note of course that I can imagine brendan complaining about perf, so you'd probably want to provide numbers.
(In reply to comment #2)
> Note that brendan has referred to JS_GetStringChars lack of a JSContext* as an
> api botch. I'd think he'd rather fix it to have a JSContext than to give it a
> JSRuntime*.

Onr can always get JSRuntime* instance from JSString* via js_GetGCStringRuntime internal API, see http://lxr.mozilla.org/seamonkey/source/js/src/jsgc.h#84 . 

Timeless, didn't you read comment 0 where Peter allows as to how this patch can't go in as-is?

"API botch" does not mean "change to add gratuitous parameter", especially now that Igor has fixed things so we don't need the formal parameter.  We can say "sorry" while keeping API compatibility.  APIs never forget.

The way to add a hook is with a hook-setting API, not by changing existing API.  It should be easy enough to add void JS_SetMallocHooks(JSRuntime *rt) or something like that.  Anyone who wants this bug fixed, please patch accordingly.

/be
Why is the allocator provided in JS_NewRuntime not actually used to allocate the runtime itself?
Apologies for the debug in the patch.  But as I stated, I don't seriously expect this patch to go in, it is for commentary only, and the coding standard reflects that.

> >Index: js/src/jsemit.c
> >-                if (!AddSpanDep(cx, cg, pc, pc2, off))
> >+                if (off != 0 && !AddSpanDep(cx, cg, pc, pc2, off))

>please explain this change.

Sorry, that was an unrelated change which is used in our code, that didn't mean to be in the patch.

> >Index: js/src/jsfile.c
> >-    file->path = strdup(filename);
> >+    file->path = JS_strdup(filename);

> How does this help? it doesn't have a cx or rt handy.

JS_strdup comes from a GCed arena pool, like JS_malloc.

The gratuitous passing of JSRuntime is because there are places where the context is NULL - deliberately so, due to my understanding.

> Why is the allocator provided in JS_NewRuntime not actually used to allocate
> the runtime itself?

No reason at all.   

By all means suggest ways that this can be done properly and completely - remember, there are lots of places in the code where allocation is done, and changes have to catch every single one - apart from JS_malloc et al, which are out of a pool which is allocated as one.

>  It should be easy enough to add void JS_SetMallocHooks(JSRuntime *rt) or
> something like that.  Anyone who wants this bug fixed, please patch
> accordingly.
> 

Oh yes.  The problem here is that JS_NewRuntime performs allocations that need to be caught - that's why I couldn't do this.
This revised patch takes into account comments made:

 - I avoid passing JSRuntime where possible by getting it from a JSString
 - There's always a default allocator to avoid the comparisons
Comment on attachment 250960 [details] [diff] [review]
Pluggable memory support

Obsolete
Attachment #250960 - Attachment is obsolete: true
you added trailing whitespace to at least one line, don't do that :)

I think until you can find some other way, you should have JS_NewRuntimeWithAllocator(....).

Alternatively, if brendan's ok with something that doesn't make sense, you could do:

JSRuntime* my_create_runtime(void) {
 JSRuntime *rt;
 JSAllocator *old;
 old = JS_SetRuntimeAllocator(my_allocator_set);
 rt = JS_NewRuntime(maxbytes);
 JS_SetRuntimeAllocator(old);
 return rt;
}

It isn't thread safe, and thread safety would be left to the callers to enforce, but basically in this system, spidermonkey would have a static variable that controls which allocator is pulled into the runtime by JS_NewRuntime.
Sorry, I should have re-read comment 0 before commenting in bug 365048 -- I now see why you are using a JSRuntime per document. That's still too heavy, and it precludes sharing. If these documents are anything like the ones reflected as DOMs in the browser embedding, you need to be able to share object references across related windows (frames, iframes). You can't do that if each DOM is in its own runtime, without some ugly proxy scheme.

This patch just needs some macroization, some sanity around js_DeflateString (one version will crash dereferencing a null rt), and style nit-picking. But given the probably misuse of JSRuntime to enforce a memory quota, I wonder if it's the right thing for SpiderMonkey.

I'm sorry if I was not paying attention to IRC, or if I didn't give good advice there -- I don't remember the exchange. Anyway, there are other ways to enforce memory quotas per document than making each live in its own JS VM. A per-context allocator struct containing the callbacks might be made to work; you'd have to balance currently unbalanced JS_malloc/free and malloc/JS_free pairs. We would want an internal API that enforces matched pair usage via C's weak type system, if possible.

/be
Status: UNCONFIRMED → NEW
Ever confirmed: true
> That's still too heavy

Agree. We hope a change to one runtime per thread might give us some performance increase 

> share object references across related windows (frames, iframes)

We don't care about this right now, and it's questionably whether we would in future, but the issue is noted.

> A per-context allocator struct containing the callbacks might be made to work

Unfortunately, some functions do not have a JSContext available, although they do have a runtime, with the current code.  

Ideally, we want a patch that we can feedback to you and is acceptable; that will save everyone hassle in future.  If not, then it is likely we will run with something very like the above, since for the moment is does precisely what we need.

Thanks.


(In reply to comment #12)
> > A per-context allocator struct containing the callbacks might be made to work
> 
> Unfortunately, some functions do not have a JSContext available, although they
> do have a runtime, with the current code.  

Yes, I wrote "might" on purpose -- but those functions that take an rt might be changed to take a cx. The internal APIs could be changed without compatibility breaks. The external APIs that take an rt and feed it into internal functions would have to stay as is, but many of these should not use a quota'd allocator in any case (e.g., JS_AddNamedRootRT -- the gc roots double hashtable must be allocated by a per-runtime or global allocator).

I'm hopeful we can get a patch in that meets your needs and does not break API compatibility or impose a noticeable performance or code size hit. A per-context allocator might be better. Care to give it a try? If not, perhaps one of the usual JS hackers will help.

/be
I would try, but I'm not sure what to do about the string handling functions - especially where only a JSString is passed in and only a JSRuntime is available, and the other functions where a NULL cx is deliberately passed.  

I really do need to catch all the allocations made by the engine, no matter how they are done - if not, I risk mismatched counts, and messing with the current hackery I have in our external allocation functions - which is to allocate space for an extra integer at the front of the memory, which contains the allocation size, allowing us to keep an accurate count on free or realloc.
Agreed that alloc/free pairs must be matched; this needs type system enforcement.

Don't agree that you need to count the allocations that are per-runtime, such as the GC roots double hash table.  That should not be subject to a per-document allocator, and I don't think you should worry about each document's "share" in that per-runtime table being an appreciable amount of memory, in the scenario where you use a cx-based allocator callback set.

All JS APIs for strings take a cx except JS_GetStringBytes. This too is a cache that could be ignored for first approximation. Even better, you could simply avoid using this API. It is only for backward compatibility and debugging; it decimates UTF-16 code points to 8-bit chars. Avoid.

/be
And JS_GetStringChars.  But JS_GetStringBytes is called over 90 places in the existing code.
(In reply to comment #16)
> And JS_GetStringChars.  But JS_GetStringBytes is called over 90 places in the
> existing code.
> 

I files bug 366725 to really deprecate JS_GetString(Chars|Bytes) and replace all calls to them with JSContext* versions. 
Summary: [PATCH] Pluggable memory support for JS engine → [PATCH] Pluggable memory allocation support for JS engine
Comment on attachment 251195 [details] [diff] [review]
Revised pluggable memory support patch

Generally you want to request review from a specific person by setting the review flag to ? and entering the person's bugzilla id.

>     str = JS_NewString(cx, names, strlen(names));
>-    if (!str) {
>-        free(names);
>+    if (!str)
>+        cx->runtime->freeHook(cx->runtime->allocHookData, names);
>         return JS_FALSE;
>     }

You removed a critical opening brace, which means this patch must not compile, right?

>+static void *JS_Runtime_malloc(void *data, size_t size)
>+{
>+    return malloc(size);
>+}

These are misnamed: js_runtime_malloc, etc. Also their return type should be passed to the JS_STATIC_DLL_CALLBACK macro instead of writing out static void *, etc.

>+static void JS_Runtime_free(void *data, void *current)
>+{
>+    free(current);
>+}
>+
>+

Nit: double blank lines are something prevailing style avoids.

> JS_PUBLIC_API(JSRuntime *)
>-JS_NewRuntime(uint32 maxbytes)
>+JS_NewRuntime(uint32 maxbytes, JSAllocator *allocator)

We talked about how this is not a compatible API change. How about a JS_SetRuntimeAllocator API instead?

>+    p = cx->runtime->allocator->mallocHook(cx->runtime->allocator->allocHookData, nbytes);

Only remaining nit, here and again several times, is to wrap at column 79 (numbering from 1). Overflow actual params indent to start in the same column as the first param. If you can't make that all fit, break after -> and underhang the first right operand of a -> operator.

>+struct JSAllocator {
>+    /* Custom allocation */
>+    void                *allocHookData;
>+    void                *(*mallocHook)(void *, size_t);
>+    void                *(*callocHook)(void *, size_t, size_t);
>+    void                *(*reallocHook)(void *, void *, size_t);
>+    void                (*freeHook)(void *, void *);

These need JS_DLL_CALLBACK macro decoration -- see jspubtd.h.

Thanks,

/be
Someone should clean this up and land it, crediting Peter.

/be
Alias: js-pluggable-malloc
Embeddings can now use JS_USE_CUSTOM_ALLOCATOR.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: