Closed Bug 321985 Opened 15 years ago Closed 15 years ago

XDR: faster decoding of atoms

(Core :: JavaScript Engine, enhancement)
Windows XP
Not set

(Reporter: igor, Assigned: igor)

(2 files, 5 obsolete files)

Spin off from bug 279839: see comment 22 there.

Currently, when decoding the atom table, JS_XDRScript does the following for each string atom:

1. JS_XDRString (called during JS_XDRValue) reads the string characters into a malloc'ed buffer and calls JS_NewUCString to construct the string.

2. js_AtomizeValue turns the string into an atom.

This can be optimized in the following ways:

1. The string characters can be read into a temporary buffer, which would then be atomized using js_AtomizeChars. This way, the overhead of creating a new string is avoided when the corresponding atom already exists.

2. All strings can be read into a temporary area and atomized at once using a new API for bulk allocation/atomization. This would reduce the cost of taking various locks.

3. Better yet, a single string can be constructed to hold all the characters from the strings that do not yet have corresponding atoms, and then atoms would be constructed that keep dependent strings of the big string as keys. This would require adding a notion of an immutable dependent string, but the memory savings of one malloc header per string could be worth it.
Attached patch Using js_AtomizeChars (obsolete) — Splinter Review
This is a preliminary implementation of item 1 from the list above. It adds new XDR functions to explicitly control encoding/decoding of jsvals. The atom map XDR code uses the new API to decode strings into a temporary buffer, which is later passed to js_AtomizeChars.

I did not test the patch: I still need to figure out a good way to test the XDR code.
Flags: testcase-
Attachment #207252 - Flags: review?(brendan)
Is it possible to get profiling details similar to those from bug 279839 comment 10, with and without the patch applied?
Blocks: 279839
Here is a patch on top of the previous one to print the number of newly added vs. total atoms read from XDR scripts. After the browser starts it prints:

XDR atoms total: new/total=1543/7543 (20.5%)

Which means that only 20.5% of all strings read from XDR scripts were not already atomized, and the patch saved 6000 useless malloc/js_NewString calls.

The stats also indicate that further optimizations, like using dependent strings for new atoms, would bring little if any reduction in memory overhead, as the number of new strings in most XDR scripts is low.
Comment on attachment 209588 [details] [diff] [review]
Patch to print number of new/total xdr atoms

Unsolicited r=me if you wrap all this with #ifdef DEBUG_notme or similar.  Instrumentation is good, if it doesn't get in the way of readability too badly.

Attachment #209588 - Flags: review+
Comment on attachment 207252 [details] [diff] [review]
Using js_AtomizeChars

>+    uint32 length, i, index, valueType, slen;

Nit: slight preference given context for type or tag instead of valueType, and nchars instead of slen.

>+    if (xdr->mode == JSXDR_ENCODE)
>+        vector = map->vector;
>     if (xdr->mode == JSXDR_DECODE) {

Second if can be eliminated via else for first if.

>+    for (i = 0; i != length; ++i) {
>+        if (xdr->mode == JSXDR_ENCODE)
>+            index = i;
>+        if (!JS_XDRUint32(xdr, &index))
>+            goto bad;
>+        JS_ASSERT(0 <= index && index < length);

0 <= index is vacuous given unsigned type.

>+        if (!JS_XDRValueType(xdr, &v, &valueType))

I don't think the XDR API needs type and body entry points for jsval serialization.  It doesn't hurt, but if API users will only ever JS_XDRValue, it doesn't help.  It hurts a little, actually, in complexity.  It *could* hurt down the road in terms of backward compatibility.

Alternative is js_XDRValueType/Body as library-private (extern, but js_ and no JS_{PUBLIC,FRIEND}_API declspecs) helper functions.

>+            if (xdr->mode == JSXDR_DECODE) {
>+                if (!(vector[index] = js_AtomizeValue(cx, v, 0)))

Nit: outside of loop control, nested assignment is frowned upon -- but I see shaver wrote some long ago in nearby code.  Not a big deal, but in this case it begs the question of why if-if instead of if(A&&B), where the assignment nests in B.  Breaking the assignment out into a statement before the if seems most readable to me.

>+            if (xdr->mode == JSXDR_DECODE) {
>+                if (!chars) {
>+                    JS_ARENA_ALLOCATE_CAST(chars, jschar *, &cx->tempPool,
>+                                           slen * sizeof(jschar));
>+                    if (!chars)
>+                        goto bad;

Need to JS_ReportOutOfMemory somewhere here or at label bad (if not reached by other code that does its own error reporting).

>+                    charsCapacity = slen;
>+                } else if (charsCapacity < slen) {
>+                    JS_ARENA_GROW_CAST(chars, jschar *, &cx->tempPool,
>+                                       charsCapacity * sizeof(jschar),
>+                                       (slen - charsCapacity) *
>+                                       sizeof(jschar));
>+                    if (!chars)
>+                        goto bad;


>+            if (xdr->mode == JSXDR_DECODE) {
>+                if (!(vector[index] = js_AtomizeChars(cx, chars, slen, 0)))
>+                    goto bad;

Ah, more nested assignment, and here (and above in the first nested-assignment nit) we have an error reported already, so the label bad: code can't call JS_ReportOutOfMemory.  Need a label out_of_memory: that does that, followed by goto bad.

>+JS_XDRString(JSXDRState *xdr, JSString **strp)
>+    uint32 len;
>+    jschar *chars;
>+    if (xdr->mode == JSXDR_ENCODE)
>+        len = JSSTRING_LENGTH(*strp);
>+    if (!JS_XDRStringLength(xdr, &len))
>+        return JS_FALSE;
>+    if (xdr->mode == JSXDR_DECODE) {
>+        chars = (jschar *) JS_malloc(xdr->cx, (len + 1) * sizeof(jschar));
>+        if (!chars)
>+            return JS_FALSE;
>+    } else {
>+        chars = JSSTRING_CHARS(*strp);
>+    }
>+    JS_XDRStringChars(xdr, chars, len);

Don't drop return value on floor, and with it error or exception, here.

>+    if (xdr->mode == JSXDR_DECODE) {
>         chars[len] = 0;
>         if (!(*strp = JS_NewUCString(xdr->cx, chars, len)))
>Index: src/jsxdrapi.h
>--- src.orig/jsxdrapi.h
>+++ src/jsxdrapi.h
>@@ -156,6 +156,12 @@ extern JS_PUBLIC_API(JSBool)
> JS_XDRCStringOrNull(JSXDRState *xdr, char **sp);
> extern JS_PUBLIC_API(JSBool)
>+JS_XDRStringLength(JSXDRState *xdr, uint32 *lengthp);
>+extern JS_PUBLIC_API(JSBool)
>+JS_XDRStringChars(JSXDRState *xdr, jschar *chars, uint32 length);

In addition to not extending the public API, you could avoid js_XDRStringLength or whatever, and just use JS_XDRUint32.

Given the stats above, the problem of slow JS_XDRScript should be sufficiently addressed via JS_XDRAtom. When decoding string and double atoms, the new function should read the data into a temporary buffer and then call js_AtomizeChars or js_AtomizeDouble.

Compared with the patch above, this approach would not expose the details of jsval serialization beyond jsxdrapi.c, at the cost of calling JS_ARENA_MARK/JS_ARENA_RELEASE during each string read.
Summary: Optimizing atomization of string decoder in JS_XDRScript → XDR: faster decoding of atoms
Attached patch JS_XDRAtom (obsolete) — Splinter Review
Attachment #207252 - Attachment is obsolete: true
Attachment #209755 - Flags: review?(brendan)
Attachment #207252 - Flags: review?(brendan)
The new version, besides JS_XDRAtom, adds two library-private functions, js_XDRCStringAtom and js_XDRStringAtom, which encode/decode C-string and string atoms only. In this way fun_xdrObject and js_XDRObject avoid calling malloc/free just to atomize C chars.

It saves an extra 4000 malloc/free calls on browser startup, in addition to the 6000 already saved by the previous version.
Attachment #209755 - Attachment is obsolete: true
Attachment #209822 - Flags: review?(brendan)
Attachment #209755 - Flags: review?(brendan)
Comment on attachment 209822 [details] [diff] [review]
JS_XDRAtom, js_XDRCStringAtom, js_XDRStringAtom

>+        } else {
>+            atoms = (JSAtom **)JS_malloc(cx, (size_t)natoms * sizeof *atoms);

Nit: space after cast, before JS_malloc(...), to match prevailing style.

>+        /*
>+         * Assert that when decoding the read index is valid and points to

Nit: "Assert when decoding that ..." or "Assert that, when decoding, ...."

>+extern JS_PUBLIC_API(JSBool)
>+JS_XDRAtom(JSXDRState *xdr, JSAtom **atomp)

Don't add this to the public API -- just make it library-private, js_XDRAtom, for now.

>@@ -1174,13 +1174,12 @@ fun_xdrObject(JSXDRState *xdr, JSObject 
> {
>     JSContext *cx;
>     JSFunction *fun;
>-    JSString *atomstr;
>+    uint32 nullAtom;
>     uint32 flagsword;           /* originally only flags was JS_XDRUint8'd */
>     uint16 extraUnused;         /* variable for no longer used field */
>-    char *propname;
>+    JSAtom *propatom;

propAtom, to match newer style that is starting to prevail here ;-).

>+    /* XXX only if this was a top-level function! */

This is a bogus old XXX comment, please remove it.

>-                /* XXX lossy conversion, need new XDR version for ECMAv3 */
>-                propname = JS_GetStringBytes(ATOM_TO_STRING(JSID_TO_ATOM(sprop->id)));
>-                if (!propname ||
>-                    !JS_XDRUint32(xdr, &type) ||
>+                propatom = JSID_TO_ATOM(sprop->id);

OTOH, the "XXX lossy conversion" comment here is valid and should be kept.  Better yet, file a bug and cite its number in a FIXME: comment (which should replace any XXX-commented problem that is worth fixing).

> static JSBool
>-GetClassPrototype(JSContext *cx, JSObject *scope, const char *name,
>+GetClassPrototype(JSContext *cx, JSObject *scope, JSAtom *nameAtom,
>                   JSObject **protop);

Seems like this pushes js_Atomize calls into three callers -- worth commoning somehow, e.g. with two params only one of which may be null?  Else another static helper subroutine?

>+    JSAtom *classatom;

classAtom, prevailing new style (classNameAtom?  Too long.  Unify elsewhere on classAtom).

>+        if (!proto && !GetClassPrototype(cx, parent,
>+                                         cx->runtime->atomState.ObjectAtom,
>+                                         &proto))
>             return NULL;

Nit: brace then part even if single-line, if condition is multiline.  Also break after && for less over-indented overflow lines.

>@@ -4058,45 +4076,45 @@ js_XDRObject(JSXDRState *xdr, JSObject *
> {
>     JSContext *cx;
>     JSClass *clasp;
>-    const char *className;
>+    JSAtom *classNameAtom;
>     uint32 classId, classDef;
>-    JSBool ok;
>     JSObject *proto;
>     cx = xdr->cx;
>     if (xdr->mode == JSXDR_ENCODE) {
>+        /*
>+         * XXX: faster way to get already existing classNameAtom ?
>+         */

This XXX is good, but it belongs in the common GetClassPrototype layer that atomizes its name param if its atom param is non-null, per above suggestion to unify js_Atomize calls.

I long ago decided not to try to avoid re-atomizing class names, but we may revisit this soon, at least for the standard classes.  See bug .

>     if (xdr->mode != JSXDR_ENCODE) {

Nit: existing code, I know, but xdr->mode == JSXDR_DECODE is better I think.

I'll r+ a new patch that addresses these small issues.  Thanks for doing this!

(In reply to comment #9)
> >+        /*
> >+         * XXX: faster way to get already existing classNameAtom ?
> >+         */
> This XXX is good, but it belongs in the common GetClassPrototype layer that
> atomizes its name param if its atom param is non-null

I pretty obviously must have meant "its atom param is null" here.

Attached patch Reflecting nits (obsolete) — Splinter Review
This version addresses the above nits, except that instead of moving the name-atomization code inside GetClassPrototype I moved it outside js_GetClassPrototype. The latter is used only by js_ErrorToException, and I added the following comment there:
    /*
     * FIXME: Can runtime->atomState->lazy.nameErrorAtom be used here instead
     * of atomizing name?
     */
    errAtom = js_Atomize(cx, exceptions[exn].name,
                         strlen(exceptions[exn].name), 0);

One possibility to address that FIXME is to move JSAtomState.lazy.nameErrorAtom out of JSAtomState.lazy, as any thrown exception triggers atomization of all exception names. But that is for another bug.
Attachment #209822 - Attachment is obsolete: true
Attachment #209822 - Flags: review?(brendan)
Attachment #210135 - Flags: review?(brendan)
Comment on attachment 210135 [details] [diff] [review]
Reflecting nits

These are really minor, so check in if you like.  I'm more interested in squeezing out useless layers of functions, and your thoughts on that.

s/classatom/classAtom/g in js_NewObject.

Given the js_GetClassPrototype change, I don't think the benefit of subroutining GetClassPrototype is worth it.  How about exposing that JSObject *scope parameter to all callers of js_GetClassPrototype?

I see net of three added js_Atomize calls, two passing clasp->name -- that to me is what the XXX (or now the FIXME comment in jsexn.c) should be about, at least for standard class names (which have atoms).

Attachment #210135 - Flags: review?(brendan) → review+
Attached patch Less indirections (obsolete) — Splinter Review
This version of the patch exposes js_ConstructObject, which takes the scope argument, in jsobj.h to avoid useless indirection, and consistently uses className, not classAtom, to denote the atom representing a class name.

This patch still contains extra js_Atomize calls, but I will address this in bug 325724.
Attachment #210135 - Attachment is obsolete: true
Attachment #210695 - Flags: review?
Blocks: 325724
Attachment #210695 - Attachment is obsolete: true
Attachment #211045 - Flags: review?(brendan)
Attachment #210695 - Flags: review?
Comment on attachment 211045 [details] [diff] [review]
Patch update due to CVS HEAD changes

Sorry for the delay, r=me.  Thanks for doing this, can't wait to see Ts wins on!

Attachment #211045 - Flags: review?(brendan) → review+
I committed the last patch.
Closed: 15 years ago
Resolution: --- → FIXED
(In reply to comment #15)
> (From update of attachment 211045 [details] [diff] [review] [edit])
> Sorry for the delay, r=me.  Thanks for doing this, can't wait to see Ts wins on

The patch decreased the startup time of Firefox on the Windows pacifica box from 484ms to 468ms, or by 3.3%.