Closed
Bug 331456
Opened 18 years ago
Closed 18 years ago
Per-runtime deflated string cache
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: igor, Assigned: igor)
Details
Attachments
(1 file, 4 obsolete files)
14.39 KB,
patch
|
Details | Diff | Splinter Review |
Currently the cache of deflated strings is global is shared between runtimes. For embeddings with multiple runtimes and threads this increases a lock congestion without benefits of sharing the deflated characters as strings are cached based on their address, no context. See comment 11 in bug 294499#11 for details. Thus the idea is to use one cache per runtime to address this.
Assignee | ||
Comment 1•18 years ago
|
||
Assignee: general → igor.bukanov
Status: NEW → ASSIGNED
Assignee | ||
Updated•18 years ago
|
Attachment #216008 -
Flags: review?(brendan)
Comment 2•18 years ago
|
||
Comment on attachment 216008 [details] [diff] [review] Implementation >+++ src/jscntxt.h 2006-03-23 13:57:18.000000000 +0100 >@@ -134,6 +134,13 @@ > jsdouble *jsNegativeInfinity; > jsdouble *jsPositiveInfinity; > >+#ifdef JS_THREADSAFE >+ JSLock *deflatedStringCacheLock; >+#endif >+ JSHashTable *deflatedStringCache; >+#ifdef DEBUG >+ uint32 deflatedStringCacheBytes; >+#endif Empty line here please (before comment, generally). > /* Empty string held for use by this runtime's contexts. */ > JSString *emptyString; > >Index: src/jsgc.c >=================================================================== >--- src.orig/jsgc.c 2006-03-23 13:57:03.000000000 +0100 >+++ src/jsgc.c 2006-03-23 13:57:18.000000000 +0100 >@@ -312,6 +312,22 @@ > return (uint8 *)(pageAddress - offsetInArena + thingIndex); > } > >+JSRuntime* >+js_GetGCStringRuntime(JSString *str) >+{ >+ JSGCPageInfo *pi; >+ JSGCArenaList *list; >+ >+ pi = (JSGCPageInfo *) ((jsuword)str & ~GC_PAGE_MASK); Time to macro-ize the right-hand side of = here, for cleaner code at all the places such an expression occurs in jsgc.c? >+ list = PAGE_TO_ARENA(pi)->list; >+ >+ JS_ASSERT(list->thingSize == sizeof(JSGCThing)); >+ JS_ASSERT(GC_FREELIST_INDEX(sizeof(JSGCThing)) == 0); >+ >+ return (JSRuntime *)((unsigned char*)list No need for unsigned, and space before * is more consistent style. >+ - offsetof(JSRuntime, gcArenaList)); >+} >+ > JSBool > js_IsAboutToBeFinalized(JSContext *cx, void *thing) > { >@@ -2471,21 +2434,40 @@ > JSAtom *atom; > > rt = cx->runtime; >- JS_ASSERT(!rt->emptyString); >+ >+ /* Initialize string cache */ >+#ifdef JS_THREADSAFE >+ JS_ASSERT(!rt->deflatedStringCacheLock); >+ rt->deflatedStringCacheLock = JS_NEW_LOCK(); >+ if (!rt->deflatedStringCacheLock) >+ goto bad; No need for goto bad here, save an if-test at that label by returning false. /be
Assignee | ||
Comment 3•18 years ago
|
||
(In reply to comment #2) > >+ return (JSRuntime *)((unsigned char*)list > > No need for unsigned I used "unsigned char *" and not "char *" since strictly speaking according to the C standard an arbitrray pointer could only be casted to "unsigned char *" while casting to "char*" was not defined. Is it OK if I replace that by uint8 *?
Comment 4•18 years ago
|
||
Good grief, does ISO C really say that? What pedantry. It does not square with the evolution of C on real architectures, which I witnessed a lot of, or the plain physical sense of the types and the pointer arithmetic. Sure, uint8 * -- there are other such naughty char * casts, however, and if you aren't going to fix them I don't see the point in being different. I've never seen a warning on this. /be
Assignee | ||
Comment 5•18 years ago
|
||
(In reply to comment #4) > does ISO C really say that? Sorry for spreading FUD about ISO C. Casting to signed char* has the same restrictions as casting to unsigned char*. signed char* is only problematic when one dereference it: according to the standard an implementation is not required to use two's-compliment form for signed integers. So possible bit patterns for signed char may not cover the whole byte and accessing a pattern that is outside the defined range is an undefined behavior. Still IMO using unsigned char or byte or uint8 is better as it emphasizes that raw memory is accessed and not a C string.
Assignee | ||
Comment 6•18 years ago
|
||
Attachment #216008 -
Attachment is obsolete: true
Attachment #216217 -
Flags: review?(brendan)
Attachment #216008 -
Flags: review?(brendan)
Comment 7•18 years ago
|
||
Comment on attachment 216217 [details] [diff] [review] Implementation v2 (addressing the nits) > #define PAGE_TO_ARENA(pi) \ > ((JSGCArena *)((jsuword)(pi) - (pi)->offsetInArena \ > - offsetof(JSGCArena, base))) > > #define PAGE_INDEX(pi) \ > ((size_t)((pi)->offsetInArena >> GC_PAGE_SHIFT)) > >+#define GC_THING_PAGE(thing) \ >+ ((JSGCPageInfo *)((jsuword)(thing) & ~GC_PAGE_MASK)) More clear and consistent name would be THING_TO_PAGE. I'd probably have burdened all these PAGE names with _INFO after PAGE, but that's unnecessary. Still, THING_TO_PAGE seems better than GC_THING_PAGE. > JSBool > js_InitRuntimeStringState(JSContext *cx) > { > JSRuntime *rt; > JSString *empty; > JSAtom *atom; > > rt = cx->runtime; >- JS_ASSERT(!rt->emptyString); >+ >+ /* Initialize string cache */ >+#ifdef JS_THREADSAFE >+ JS_ASSERT(!rt->deflatedStringCacheLock); >+ rt->deflatedStringCacheLock = JS_NEW_LOCK(); >+ if (!rt->deflatedStringCacheLock) >+ return JS_FALSE; >+#endif > > /* Make a permanently locked empty string. */ >+ JS_ASSERT(!rt->emptyString); > empty = js_NewStringCopyN(cx, js_empty_ucstr, 0, GCF_LOCK); > if (!empty) >- return JS_FALSE; >+ goto bad; > > /* Atomize it for scripts that use '' + x to convert x to string. */ > atom = js_AtomizeString(cx, empty, ATOM_PINNED); > if (!atom) >- return JS_FALSE; >+ goto bad; > > rt->emptyString = empty; > rt->atomState.emptyAtom = atom; >+ > return JS_TRUE; >+ >+ bad: >+#ifdef JS_THREADSAFE >+ if (rt->deflatedStringCacheLock) { No need for if-test here. >+ JS_DESTROY_LOCK(rt->deflatedStringCacheLock); >+ rt->deflatedStringCacheLock = NULL; >+ } >+#endif >+ return JS_FALSE; >+ > } About char * vs. uint8 * -- I'm fine either way in SpiderMonkey. In C since the odd-ball DEC systems of the 70s died, char * has meant byte pointer. And even on my beloved DECSystem 2060, where char was int was a 36-bit word (!), you couldn't address at a finer grain, so char * was fine for arbitrary pointer type. Then came void *, but it doesn't work if you want address arithmetic. C99 will save us, some year (maybe this year?). /be
Assignee | ||
Comment 8•18 years ago
|
||
(In reply to comment #7) > In C since the odd-ball DEC systems of the 70s died, char * has meant byte pointer. Friend of mine who works in a company that provides ticket booking and reservation for few airlines had to support 36-bit C code at least in 2003. Next time we meet I ask him about the hardware. > even on my beloved DECSystem 2060, where char was int was a 36-bit word (!), > you couldn't address at a finer grain, so char * was fine for arbitrary pointer > type. > > Then came void *, but it doesn't work if you want address arithmetic. > > C99 will save us, some year (maybe this year?). > > /be >
Assignee | ||
Comment 9•18 years ago
|
||
The previous version of the patch had a bad bug: I forgot to call js_GetGCStringRuntime in js_SetStringBytes. GCC was screaming about uninitialized rt but I missed the warning in the make output :(. So here is a patch version that changed js_SetStringBytes to take JSContext* as argument as the context is available in places where the function is called. In addition js_GetStringChars is changes to take JSRuntime* parameter so the only place where js_GetGCStringRuntime is called is JS_GetStringBytes. It would be even cleaner to add a version of JS_GetStringBytes that takes JSContext* as argument and change JS_GetStringBytes to create a temporary context to pass as argument to js_GetStringChars and fix the following comment: /* * May be called with null cx by js_GetStringBytes, see below. */ char * js_DeflateString(JSContext *cx, const jschar *chars, size_t length) But that is for another bug.
Attachment #216217 -
Attachment is obsolete: true
Attachment #216255 -
Flags: review?(brendan)
Attachment #216217 -
Flags: review?(brendan)
Comment 10•18 years ago
|
||
(In reply to comment #9) > It would be even cleaner to add a version of JS_GetStringBytes that takes > JSContext* as argument and change JS_GetStringBytes to create a temporary > context to pass as argument to js_GetStringChars and fix the following comment: > > /* > * May be called with null cx by js_GetStringBytes, see below. > */ > char * > js_DeflateString(JSContext *cx, const jschar *chars, size_t length) > > But that is for another bug. > Please don't -- JS_GetStringBytes is called a lot. It should not have to create and destroy a context just to avoid a null-test on cx under js_DeflateString. The branch penalty is small with a short then part. /be
Comment 11•18 years ago
|
||
(In reply to comment #9) > It would be even cleaner to add a version of JS_GetStringBytes that takes > JSContext* as argument No point in this API either, really -- lots of sunk costs to un-sink, and bigger fish to fry elsewhere. Plus, infallible and cx-less is better for callers. Most should not need to test and propagate OOM. OOM checking is a fruitless distortion to much of our code, on modern VM-based OSes. For handhelds it could matter, but should be handled in a more centralized way. /be
Comment 12•18 years ago
|
||
Comment on attachment 216255 [details] [diff] [review] Impementation v3 >@@ -4254,17 +4251,17 @@ > /* Free chars (but not bytes, which caller frees on error) if we fail. */ > str = js_NewString(cx, chars, length, 0); > if (!str) { > JS_free(cx, chars); > return NULL; > } > > /* Hand off bytes to the deflated string cache, if possible. */ >- if (!js_SetStringBytes(str, bytes, nbytes)) >+ if (!js_SetStringBytes(cx, str, bytes, nbytes)) js_SetStringBytes should take an rt, not a cx that it uses only to get cx->runtime -- this restores symmetry with js_GetStringBytes, confirming the parameterization matching the data dependency. > JS_GetStringBytes(JSString *str) > { >+ JSRuntime *rt; > char *bytes; > >- bytes = js_GetStringBytes(str); >+ rt = js_GetGCStringRuntime(str); >+ bytes = js_GetStringBytes(rt, str); Single-use variable to avoid nested function call expression seems not worth its source-code weight here. >+extern void >+js_PurgeDeflatedStringCache(JSRuntime *rt, JSString *str); >+ Since you're moving this, perhaps it belongs down with js_[GS]etStringBytes. /be
Assignee | ||
Comment 13•18 years ago
|
||
Attachment #216255 -
Attachment is obsolete: true
Attachment #216260 -
Flags: review?(brendan)
Attachment #216255 -
Flags: review?(brendan)
Comment 14•18 years ago
|
||
Comment on attachment 216260 [details] [diff] [review] Impementation v4 Thanks, sorry for the patch iteration. Glad you caught that warning -- I'm not test-building, just reading (too quickly, obviously). /be
Attachment #216260 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 15•18 years ago
|
||
This is the version to commit.
Attachment #216260 -
Attachment is obsolete: true
Assignee | ||
Comment 16•18 years ago
|
||
I committed the patch from attachment 216261 [details] [diff] [review].
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Updated•18 years ago
|
Flags: in-testsuite-
Comment 17•18 years ago
|
||
This bug introduced a leak (caught by valgrind while running the shell), because we finish the string runtime state before running the last GC, so we weren't purging remaining strings from the string cache and freeing their bytes. This checkin fixed the leak: Checking in jsstr.c; /cvsroot/mozilla/js/src/jsstr.c,v <-- jsstr.c new revision: 3.128.2.4; previous revision: 3.128.2.3 done Checking in jsstr.h; /cvsroot/mozilla/js/src/jsstr.h,v <-- jsstr.h new revision: 3.30.4.5; previous revision: 3.30.4.4 done Checking in jscntxt.c; /cvsroot/mozilla/js/src/jscntxt.c,v <-- jscntxt.c new revision: 3.83.2.5; previous revision: 3.83.2.4 done
You need to log in
before you can comment on or make changes to this bug.
Description
•