Closed
Bug 386265
Opened 17 years ago
Closed 17 years ago
Using double-hashing for atoms.
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: igor, Assigned: igor)
References
Details
Attachments
(5 files, 13 obsolete files)
37.31 KB,
patch
|
Details | Diff | Splinter Review | |
45.91 KB,
text/plain
|
Details | |
4.04 KB,
patch
|
Details | Diff | Splinter Review | |
87.62 KB,
patch
|
brendan
:
review+
brendan
:
approval1.9+
|
Details | Diff | Splinter Review |
13.24 KB,
patch
|
Details | Diff | Splinter Review |
During the browser startup until the default page is shown the JS engine allocates about 15000 atoms for strings. Running GMail brings that number to 26000. Currently each atom means an extra malloc call since the engine uses chained hash table and represents atoms as hash table entries. It would be nice to avoid these 15K malloc calls and switch the table to double hashing which is also more cache friendly. The basic idea is to use jsval itself as a property id with string and double properties interned through the hashing. The nontrivial part of this schema is how to deal with hidden properties. Since hidden properties are strings, a possible solution for the problem is to tag them as booleans with an added benefit of very simple conversion between hidden and normal properties.
Comment 1•17 years ago
|
||
You can't use double-hashing for atoms if scripts point to atoms, because addresses are not stable across growth and shrinkage of the double hashtable. Using jsval and getting rid of jsid is a fine idea, separate bug though. It may even be on file already (I don't remember, can't find it at a glance). /be
Comment 2•17 years ago
|
||
We should also be sure we are chasing real costs. Counting mallocs is not enough: does malloc size-classify and handle small objects well (with nearly zero space overhead)? Does it use thread-local size-classified freelists to avoid locking? I don't know, but I'd hate to struggle for double hashing with address invariance, only to find it didn't actually save appreciable space or time. /be
Assignee | ||
Comment 3•17 years ago
|
||
(In reply to comment #1) > You can't use double-hashing for atoms if scripts point to atoms, because > addresses are not stable across growth and shrinkage of the double hashtable. The idea to use jsval both for atoms and jsid and replace atomization by JSString* and double* interning. Scripts will point to this interned jsval providing uniqueness. Since GC does not move things, this will be a stable reference. Currently the size of JSAtom* is 6 words. AFAICS with intrened atoms that can be reduced to just JSDHashEntryStub or 2 words since for interned things one can use GC flags as a replacement for marked,pinned and locked flags and JSATom->number can be replaced by a new compilation-time-only structure.
Comment 4•17 years ago
|
||
I see -- this does seem like the one bug to do away with jsid and use JSDHashTable for atoms then. Good idea! /be
Comment 5•17 years ago
|
||
Crypto-booleans are not necessary for hidden strings. All you need is advice when looking up and adding them to xor that subspace pattern with the key-hash. And we have that advice, in the form of the internal js_*HiddenProperty APIs. In the GC heap, there can be many strings named, e.g. "x". One will be hashed if 'x' is used as a local or formal, in the hidden subspace. Others may be hashed for the usual reasons (string literal "x", runtime computed property name "x", etc.). /be
Assignee | ||
Comment 6•17 years ago
|
||
This is just a backup of something that just compiles.
Assignee | ||
Comment 7•17 years ago
|
||
Here is yet another backup.
Attachment #271651 -
Attachment is obsolete: true
Assignee | ||
Comment 8•17 years ago
|
||
Here is a version that passes the test suite for jsshell.
Attachment #271759 -
Attachment is obsolete: true
Assignee | ||
Comment 9•17 years ago
|
||
The patch that passes jsshell test suite but hangs the browser during the startup.
Attachment #271888 -
Attachment is obsolete: true
Assignee | ||
Comment 11•17 years ago
|
||
Backup of working patch. The patch requires the patch from bug 387481 comment 6 to be applied first.
Attachment #272614 -
Attachment is obsolete: true
Assignee | ||
Comment 12•17 years ago
|
||
Resolving the bug would remove 15K of malloc calls on the browser startup.
Flags: blocking1.9?
Assignee | ||
Comment 14•17 years ago
|
||
Here is the first CVS diff. I will test it more before asking for a review.
Attachment #275516 -
Attachment is obsolete: true
Assignee | ||
Updated•17 years ago
|
Attachment #275573 -
Flags: review?(brendan)
Assignee | ||
Comment 15•17 years ago
|
||
Comments on the patch: 1. Patch makes JSAtom* == jsval with uniqueness of JSString* and jsdouble* ensured via hashtables. The patch uses separated tables for double and string values to minimize the number of checks during hash lookups. During GC the hash entries are purged if the corresponding GC thing would be finalized. A consequence of this is that the entry remains in the hashtable as its GC thing is live even if it is no longer used as an atom by life things. But this should not be a problem as in practice once a string is atomized, it is continued to be used as an atom. 2. jsid becomes almost jsval. The exception is that for hidden atoms patch tags the corresponding JSString* as a boolean. I considered alternatives, but they require an extra hashtable lookup to hide/unhide atoms with more complex code. Such pseudo-boolean tagging should not cause any problems as code continues to treat jsid and jsval differently and the details of id->value mapping are hidden behind a set of macros. 3. To pin/intern atoms the patch always uses js_LockGCThing to minimize tracing of locked atoms.
Status: NEW → ASSIGNED
Assignee | ||
Comment 16•17 years ago
|
||
Comment on attachment 275573 [details] [diff] [review] implementation v1 It turned out that browser's code calls JS_InternString repeatedly for the same string. This means that with the patch js_LockedGCThing will be called on each such invocation. It in turn triggers creation of a hash entry for locked strings and may even overflow the lock counter. So a new patch without this deficiency is comming.
Attachment #275573 -
Attachment is obsolete: true
Attachment #275573 -
Flags: review?(brendan)
Assignee | ||
Comment 17•17 years ago
|
||
The new patch resorts to storing ATOM_PINNED and ATOM_INTERNED flags in the hashtable so interning already atomized strings just consumes cycles.
Attachment #275822 -
Flags: review?(brendan)
Assignee | ||
Comment 18•17 years ago
|
||
Fixing js_DumpAtoms bug in the previous patch.
Attachment #275822 -
Attachment is obsolete: true
Attachment #275828 -
Flags: review?(brendan)
Attachment #275822 -
Flags: review?(brendan)
Comment 19•17 years ago
|
||
jst, see comment 16. Igor, which strings were those? Any stack backtrace samples? /be
Assignee | ||
Comment 20•17 years ago
|
||
The new version also fixes that bug js isapi.c when I used GCX_ instead of JSTRACE_ for xml things. Since the patch changes JSTRACE_ constants, GCX_ no longer matches JSTRACE_.
Attachment #275828 -
Attachment is obsolete: true
Attachment #275832 -
Flags: review?(brendan)
Attachment #275828 -
Flags: review?(brendan)
Comment 21•17 years ago
|
||
(In reply to comment #16) > (From update of attachment 275573 [details] [diff] [review]) > It turned out that browser's code calls JS_InternString repeatedly for the same > string. Do you have a stack for these calls, or even the string value being interned by chance?
Assignee | ||
Comment 22•17 years ago
|
||
(In reply to comment #21) > (In reply to comment #16) > > (From update of attachment 275573 [details] [diff] [review] [details]) > > It turned out that browser's code calls JS_InternString repeatedly for the same > > string. > > Do you have a stack for these calls, or even the string value being interned by > chance? I will try to get a details on that.
Assignee | ||
Comment 23•17 years ago
|
||
More stats about re-interning. After the browser startup until the first window is shown: *** lockReals/lockTotals=546/1763, 31.0% meaning that just 31% of intern calls were usefull. After running GMail for a while the stats becomes: *** lockReals/lockTotals=1659/11279, 14.7% I will try to build a distribution per string.
Comment 24•17 years ago
|
||
Comment on attachment 275832 [details] [diff] [review] v4 > JS_PUBLIC_API(JSBool) > JS_GetObjectId(JSContext *cx, JSObject *obj, jsid *idp) > { >- JS_ASSERT(((jsid)obj & JSID_TAGMASK) == 0); Was this assertion just unnecessary (I forget why I wrote it), or now wrong? >+/* >+ * We use JSDHashEntryStub for hash table entries to store JSString * and >+ * jsdouble * pointers. To support pinned and interned atoms, we use the >+ * lowest bits of the key to store ATOM_PINNED and ATOM_INTERNED flags. >+ */ >+#define ATOM_KEY_FLAG_MASK (ATOM_PINNED | ATOM_INTERNED) >+JS_STATIC_ASSERT(ATOM_KEY_FLAG_MASK < JSVAL_ALIGN); Nit: I still prefer a blank line separating #define and any related JS_STATIC_ASSERT. >+#define KEY_TO_DOUBLE(key) ((jsdouble *)(key)) Nobody pins or interns doubles, ok. Assert this elsewhere? >+#define KEY_TO_STRING(key) ((JSString *)((jsuword)(key) & \ >+ ~ATOM_KEY_FLAG_MASK)) Nit: indent the right operand of & to underhang the left (which is (jsuword)(key) in this case). Should we const-ipate harder and avoid casting away const in GET_ENTRY_KEY? >+MatchString(JSDHashTable *table, const JSDHashEntryHdr *hdr, const void *key) > { >+ JSString *str1, *str2; >+ >+ JS_ASSERT(IS_STRING_TABLE(table)); >+ str1 = KEY_TO_STRING(GET_ENTRY_KEY(hdr)); >+ >+ /* >+ * str1 is null after we added entry and release the lock but before we s/added entry/add hdr's entry/ >+ * initialize keybits with newly constructed GC thing. We always return s/keybits/its key/ maybe? >+ * false for such entries so JS_DHashTableOperate never finds them. We >+ * clean them during the sweeping stage of GC. s/sweeping stage of GC/GC's sweep phase/ >+ * >+ * It means that with congested lock we may end up adding 2 entries until s/congested/a contested/ s/2/two/ >+#define JS_STRING_HASH_COUNT 1024 >+#define JS_DOUBLE_HASH_COUNT 64 Sorry if you already said and I missed it -- are these based on measurements? >+ /* >+ * Any entry that remains at this point must be initialized as the last GC Nit: comma before "as the last GC". >+ trc = (JSTracer *)arg; >+ if (IS_DOUBLE_TABLE(table)) { >+ JS_SET_TRACING_INDEX(trc, "locked_double_atom", (size_t)number); >+ traceKind = JSTRACE_DOUBLE; >+ } else { >+ JS_ASSERT(IS_STRING_TABLE(table)); >+ JS_SET_TRACING_INDEX(trc, "locked_atom", (size_t)number); Why not "locked_string_atom"? Symmetry makes things line up more, usually a good thing. >+ /* Remove uninitialized entries or entries with about to GC things. */ s/about to GC things/things about to be GC'ed/ >@@ -1375,17 +1353,17 @@ fun_xdrObject(JSXDRState *xdr, JSObject > sprop = spvec[i]; > JS_ASSERT(sprop->flags & SPROP_HAS_SHORTID); > type = (i < fun->nargs) > ? JSXDR_FUNARG > : (sprop->attrs & JSPROP_READONLY) > ? JSXDR_FUNCONST > : JSXDR_FUNVAR; > userid = INT_TO_JSVAL(sprop->shortid); >- propAtom = JSID_TO_ATOM(sprop->id); >+ propAtom = JSID_TO_ATOM(JSID_UNHIDE_NAME(sprop->id)); > if (!JS_XDRUint32(xdr, &type) || > !JS_XDRUint32(xdr, &userid) || > !js_XDRCStringAtom(xdr, &propAtom)) { Something's wrong here, propAtom is still a JSAtom *, and you didn't patch js_XDRCStringAtom AFAICT. But JSID_TO_ATOM is: > #define JSID_TO_ATOM(id) ((JSAtom *)(id)) which can't be right any longer, but this is masked in most cases by ATOM_KEY's (jsval) cast. What besides XUL FastLoad might actually test this code? /be
Attachment #275832 -
Flags: review?(brendan) → review-
Comment 25•17 years ago
|
||
Forgotten nit: + * str is not necessary GC thing here. and similar comments should say "necessarily a GC thing". /be
Assignee | ||
Comment 26•17 years ago
|
||
js_InternString stats after the browser startup. Note that stats in comment 23 were wrong, there lockReals were excluding strings that became interned while already atomized. These stats are related just for js_InternString calls. Here are the strings with the most frequent number of js_InternCalls: 279,QueryInterface 44,name 33,nodeValue 32,prefix 27,title 26,toString 26,data 20,id 19,type 18,removeChild 18,normalize 18,dir 17,previousSibling 17,nextSibling 17,lastChild 17,firstChild 17,attributes 16,TEXT_NODE 16,replaceChild 16,PROCESSING_INSTRUCTION_NODE 16,parentNode 16,ownerDocument 16,NOTATION_NODE 16,nodeType 16,nodeName 16,namespaceURI 16,location 16,localName 16,isSupported 16,insertBefore 16,hasChildNodes 16,hasAttributes 16,ENTITY_REFERENCE_NODE 16,ENTITY_NODE 16,ELEMENT_NODE 16,DOCUMENT_TYPE_NODE 16,DOCUMENT_NODE 16,DOCUMENT_FRAGMENT_NODE 16,COMMENT_NODE 16,cloneNode 16,className 16,childNodes 16,CDATA_SECTION_NODE 16,ATTRIBUTE_NODE 16,appendChild entries / number of intern calls=2313/5541, 41.7%
Assignee | ||
Comment 27•17 years ago
|
||
Here the stats after using gmail and bugzilla. The highlights are: number of intern calls, string 763,QueryInterface 111,name 83,title 78,data 71,nodeValue 70,prefix 69,toString 56,loadType 56,layoutHistoryState 54,sticky 54,id 52,dir 50,className 48,lang 45,contentType 44,location 39,type 39,result 39,removeChild 39,message 39,init 39,filename 39,contentViewer 39,close 38,parent 38,lineNumber 38,inner 38,columnNumber 37,normalize 36,previousSibling 36,nextSibling 36,marginWidth 36,marginHeight 36,lastChild 36,firstChild 36,attributes 35,TEXT_NODE 35,replaceChild 35,PROCESSING_INSTRUCTION_NODE 35,parentNode 35,ownerDocument 35,open 35,NOTATION_NODE entries / number of intern calls=2836/14703, 19.3%
Assignee | ||
Comment 28•17 years ago
|
||
For references this is a patch on top of the patch from comment 20 to produce the intern stats. The files that are attached comes after sorting the collected stats. This is very ugly one shot code.
Comment 29•17 years ago
|
||
Ah, this is XPConnect's interning. It interns used names once per scope (or maybe even once per "type" in every scope). Not sure there's a lot we can, or would want to do about that.
Assignee | ||
Comment 30•17 years ago
|
||
(In reply to comment #24) > >@@ -1375,17 +1353,17 @@ fun_xdrObject(JSXDRState *xdr, JSObject > > sprop = spvec[i]; > > JS_ASSERT(sprop->flags & SPROP_HAS_SHORTID); > > type = (i < fun->nargs) > > ? JSXDR_FUNARG > > : (sprop->attrs & JSPROP_READONLY) > > ? JSXDR_FUNCONST > > : JSXDR_FUNVAR; > > userid = INT_TO_JSVAL(sprop->shortid); > >- propAtom = JSID_TO_ATOM(sprop->id); > >+ propAtom = JSID_TO_ATOM(JSID_UNHIDE_NAME(sprop->id)); > > if (!JS_XDRUint32(xdr, &type) || > > !JS_XDRUint32(xdr, &userid) || > > !js_XDRCStringAtom(xdr, &propAtom)) { > > Something's wrong here, propAtom is still a JSAtom *... I had to make it clear that with the patch it is the id that now can be hidden, not the atom. This allows to avoid an extra hash entry in the atom table for hidden properties for the price of using pseudo-booleans for hidden names. But this is OK since id belongs to a different type and namespace than jsval. Thus the fact that with the patch it now mostly coincides with the corresponding jsval does not matter. This means that to get the atom that is behind the id one has to know if the id is hidden or not and, if so, use JSID_TO_ATOM(JSID_UNHIDE_NAME(id)), not JSID_TO_ATOM((id)). This is what the patch does in fun_xdrObject when encoding. There corresponding JSID_HIDE_NAME during decoding happens when fun_xdrObject calls js_AddHiddenProperty: if (!js_AddHiddenProperty(cx, fun->object, ATOM_TO_JSID(propAtom), getter, setter, SPROP_INVALID_SLOT, attrs | JSPROP_SHARED, dupflag | SPROP_HAS_SHORTID, JSVAL_TO_INT(userid))) { goto bad; }
Assignee | ||
Comment 31•17 years ago
|
||
(In reply to comment #24) > What besides XUL FastLoad might actually test this code? It was exactly the XUL loading that pointed to missed JSID_UNHIDE_NAME in JSID_TO_ATOM(JSID_UNHIDE_NAME(sprop->id)) in some early version of the patch.
Assignee | ||
Comment 32•17 years ago
|
||
In the new version, besides addressing the nits and adding the comments, I added explicit JSAtomHashEntry to make code clearer. The patch also does not bother to optimize access to double keys in JSAtomHashEntry and always clears the mask for simpler code.
Attachment #275832 -
Attachment is obsolete: true
Attachment #275955 -
Flags: review?(brendan)
Comment 33•17 years ago
|
||
Comment on attachment 275955 [details] [diff] [review] v5 >+ * JSAtomState.doubleAtoms and JSAtomState.stringAtoms hashtable entry. To >+ * support pinned and interned string atoms, we use the lowest bits of "the" after "lowest bits of" -- makes the comment look less ragged-right, to boot! >+js_locked_atom_tracer(JSDHashTable *table, JSDHashEntryHdr *hdr, >+ uint32 number, void *arg) Overflow line is overindented two spaces. >+ * To put a property into the hidden subspace we re-tag JSString * behind >+ * property's atom as JSVAL_BOOLEAN to get a different id. js_TraceId must >+ * properly trace such pseudo booleans to ensure GC safety. "pseudo-booleans". >-intN >-js_CompareStrings(JSString *str1, JSString *str2) Did this really have to move? r=me with these addressed. /be
Attachment #275955 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 34•17 years ago
|
||
The new version, besides addressing the nits, fixes various places with code like "switch(id_kind)" to check for hidden id. This was mostly in the dumping code. I kept that movement of js_EqualString in the patch so it is closer to js_HashString and one can spot the comments about non-GC string instantly for both functions.
Attachment #275955 -
Attachment is obsolete: true
Attachment #276250 -
Flags: review?(brendan)
Attachment #276250 -
Flags: approval1.9?
Assignee | ||
Comment 35•17 years ago
|
||
Assignee | ||
Comment 36•17 years ago
|
||
fixing bogus line added to SrcNotes from js.c by v6
Attachment #276250 -
Attachment is obsolete: true
Attachment #276251 -
Attachment is obsolete: true
Attachment #276252 -
Flags: review?(brendan)
Attachment #276252 -
Flags: approval1.9?
Attachment #276250 -
Flags: review?(brendan)
Attachment #276250 -
Flags: approval1.9?
Assignee | ||
Comment 37•17 years ago
|
||
Comment 38•17 years ago
|
||
Comment on attachment 276252 [details] [diff] [review] v7 Yeah, hash and equals should be next to each other. r/a=me, thanks again -- great patch. /be
Attachment #276252 -
Flags: review?(brendan)
Attachment #276252 -
Flags: review+
Attachment #276252 -
Flags: approval1.9?
Attachment #276252 -
Flags: approval1.9+
Assignee | ||
Comment 39•17 years ago
|
||
I checked in the patch from comment 36 to the trunk: http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&branch=HEAD&cvsroot=%2Fcvsroot&date=explicit&mindate=1186863900&maxdate=1186863961&who=igor%mir2.org
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 40•17 years ago
|
||
I have not included from dom/src/base/nsJSEnvironment.cpp in the last commit. Now everything from the patch is committed: http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&branch=HEAD&cvsroot=%2Fcvsroot&date=explicit&mindate=1186864740&maxdate=1186864861&who=igor%mir2.org
Comment 41•17 years ago
|
||
JS_IS_VALID_TRACE_KIND should probably be changed for the case were JS_HAS_XML_SUPPORT is not defined (see http://mxr.mozilla.org/seamonkey/source/js/src/jsgc.h#182).
Updated•17 years ago
|
Flags: in-testsuite-
Assignee | ||
Comment 42•17 years ago
|
||
Clearing blocking 1.9 ? flag as the bug was fixed based on 1.9 approval for the checked-in patch.
Flags: blocking1.9?
You need to log in
before you can comment on or make changes to this bug.
Description
•